The backbone of this application is its agents and their interactions. Overall, we had two types of agents:
- User Agents: Agents attached to every user. Primarily tasked with translating incoming messages into the user’s preferred language
- Aya Agents: Various agents related to Aya, each with its own specific role/job
User Agents
The UserAgent class is used to define an agent associated with every user that is part of the chat room. Some of the functions implemented by the UserAgent class:
1. Translate incoming messages into the user’s preferred language
2. Activate/Invoke graph when a user sends a message
3. Maintain a chat history to help provide context for the translation task, allowing for ‘context-aware’ translation
from langchain_core.messages import AIMessage, BaseMessage, HumanMessage
from langchain_core.prompts import ChatPromptTemplate

class UserAgent(object):

    def __init__(self, llm, userid, user_language):
        self.llm = llm
        self.userid = userid
        self.user_language = user_language
        self.chat_history = []

        prompt = ChatPromptTemplate.from_template(USER_SYSTEM_PROMPT2)
        self.chain = prompt | llm

    def set_graph(self, graph):
        self.graph = graph

    def send_text(self, text: str, debug=False):
        message = ChatMessage(message=HumanMessage(content=text), sender=self.userid)
        inputs = {"messages": [message]}
        output = self.graph.invoke(inputs, debug=debug)
        return output

    def display_chat_history(self, content_only=False):
        for i in self.chat_history:
            if content_only:
                print(f"{i.sender} : {i.content}")
            else:
                print(i)

    def invoke(self, message: BaseMessage) -> AIMessage:
        output = self.chain.invoke({'message': message.content, 'user_language': self.user_language})
        return output
For the most part, the implementation of UserAgent is pretty standard LangChain/LangGraph code:
- Define a LangChain chain (a prompt template + LLM) that is responsible for doing the actual translation.
- Define a send_text function that is used to invoke the graph whenever a user wants to send a new message

For the most part, the performance of this agent depends on the translation quality of the LLM, as translation is the primary objective of this agent. And LLM performance can vary significantly for translation, especially depending on the languages involved. Certain low-resource languages don’t have good representation in the training data of some models, and this does affect the translation quality for those languages.
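To illustrate the ‘context-aware’ aspect, a translation prompt along these lines could be used. This is a hypothetical sketch; the actual USER_SYSTEM_PROMPT2 template is not reproduced here.

```python
# Hypothetical sketch of a context-aware translation prompt; the real
# USER_SYSTEM_PROMPT2 used by UserAgent may be worded differently.
TRANSLATE_TEMPLATE = (
    "You are translating messages in a group chat.\n"
    "Recent messages, for context:\n{chat_history}\n\n"
    "Translate the following message into {user_language}, keeping names "
    "and technical terms unchanged:\n{message}"
)

def build_translation_prompt(chat_history, user_language, message):
    # Fills the template the same way ChatPromptTemplate.from_template would
    return TRANSLATE_TEMPLATE.format(
        chat_history="\n".join(chat_history),
        user_language=user_language,
        message=message,
    )
```

Passing the recent history alongside the message is what lets the LLM resolve pronouns and follow-ups correctly rather than translating each message in isolation.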
Aya Agents
For Aya, we have a system of separate agents that all contribute towards the overall assistant. Specifically, we have:
- AyaSupervisor : Control agent that supervises the operation of the other Aya agents.
- AyaQuery : Agent for running RAG based query answering
- AyaSummarizer : Agent for generating chat summaries and doing task identification
- AyaTranslator: Agent for translating messages to English
class AyaTranslator(object):

    def __init__(self, llm) -> None:
        self.llm = llm
        prompt = ChatPromptTemplate.from_template(AYA_TRANSLATE_PROMPT)
        self.chain = prompt | llm

    def invoke(self, message: str) -> AIMessage:
        output = self.chain.invoke({'message': message})
        return output
class AyaQuery(object):

    def __init__(self, llm, store, retriever) -> None:
        self.llm = llm
        self.retriever = retriever
        self.store = store
        qa_prompt = ChatPromptTemplate.from_template(AYA_AGENT_PROMPT)
        self.chain = qa_prompt | llm

    def invoke(self, query: str) -> AIMessage:
        context = format_docs(self.retriever.invoke(query))
        rag_output = self.chain.invoke({'query': query, 'context': context})
        return rag_output
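The format_docs helper used above is not shown in the snippet; it is assumed to flatten the retrieved documents into a single context string for the RAG prompt, along these lines:

```python
def format_docs(docs):
    # Join retrieved documents into one context string for the RAG prompt.
    # Assumes each doc exposes `page_content`, as LangChain Document objects do.
    return "\n\n".join(doc.page_content for doc in docs)
```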
class AyaSupervisor(object):

    def __init__(self, llm):
        prompt = ChatPromptTemplate.from_template(AYA_SUPERVISOR_PROMPT)
        self.chain = prompt | llm

    def invoke(self, message: str) -> str:
        output = self.chain.invoke(message)
        return output.content
class AyaSummarizer(object):

    def __init__(self, llm):
        message_length_prompt = ChatPromptTemplate.from_template(AYA_SUMMARIZE_LENGTH_PROMPT)
        self.length_chain = message_length_prompt | llm
        prompt = ChatPromptTemplate.from_template(AYA_SUMMARIZER_PROMPT)
        self.chain = prompt | llm

    def invoke(self, message: str, agent: UserAgent) -> str:
        # Step 1: ask the LLM how many messages to summarize (0 means all)
        length = self.length_chain.invoke(message)
        try:
            length = int(length.content.strip())
        except ValueError:
            length = 0

        chat_history = agent.chat_history
        if length == 0:
            messages_to_summarize = [m.content for m in chat_history]
        else:
            messages_to_summarize = [m.content for m in chat_history[:length]]

        # Step 2: summarize the collated messages and extract action items
        messages_to_summarize = "\n".join(messages_to_summarize)
        output = self.chain.invoke(messages_to_summarize)
        return output.content
Most of these agents have a similar structure, primarily consisting of a LangChain chain: a custom prompt and an LLM. Exceptions include the AyaQuery agent, which has an additional vector database retriever to implement RAG, and the AyaSummarizer, which has multiple LLM functions implemented within it.
Design considerations
Role of AyaSupervisor Agent: In the design of the graph, we had a fixed edge going from the Supervisor node to the user nodes. This meant that all messages that reached the Supervisor node were pushed to the user nodes themselves. Therefore, in cases where Aya was being addressed, we needed to make sure that only a single final output from Aya was pushed to the users. We didn’t want intermediate messages, if any, to reach the users. Therefore, we had the AyaSupervisor agent act as the single point of contact for the Aya agents. This agent was primarily responsible for interpreting the intent of the incoming message, directing the message to the appropriate task-specific agent, and then outputting the final message to be shared with the users.
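Stripped of the LLM call, the supervisor’s control flow can be sketched as follows, where classify and the agent callables are stand-ins for the prompt-driven pieces:

```python
def supervise(message, classify, agents):
    # Route the message to one task-specific agent and return only its
    # final output, so no intermediate messages reach the user nodes.
    intent = classify(message)          # e.g. 'query' or 'summarize'
    final_output = agents[intent](message)
    return final_output                 # the single message pushed to users
```

Whatever the task-specific agent produces internally, only the returned value travels along the fixed Supervisor-to-user edge.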
Design of AyaSummarizer: The AyaSummarizer agent is slightly more complex compared to the other Aya agents, as it carries out a two-step process. In the first step, the agent determines the number of messages that need to be summarized, which is an LLM call with its own prompt. In the second step, once we know the number of messages to summarize, we collate the required messages and pass them to the LLM to generate the actual summary. In addition to the summary, in this step itself, the LLM also identifies any action items that were present in the messages and lists them out separately.
So broadly there were three tasks: determining the length of the messages to be summarized, summarizing the messages, and identifying action items. However, given that the first task was proving a bit difficult for the LLM without any explicit examples, I made the choice to have this be a separate LLM call, and then combine the last two tasks into their own LLM call.
It may be possible to eliminate the extra LLM call and combine all three tasks in a single call. Potential options include:
- Providing very detailed examples that cover all three tasks in a single step
- Generating lots of examples to actually finetune an LLM to be able to perform well on this task
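As a sketch of the first option, a single combined prompt might look like the following. The wording is hypothetical; the application actually splits this into two separate LLM calls.

```python
# Hypothetical combined prompt for the single-call alternative; not the
# prompt used in the application, which performs two separate LLM calls.
COMBINED_SUMMARY_PROMPT = (
    "You will receive a chat transcript.\n"
    "1. Decide how many of the most recent messages are relevant; "
    "answer 'all' if unsure.\n"
    "2. Summarize those messages in two or three sentences.\n"
    "3. List any action items, one per line, or 'none'.\n\n"
    "Transcript:\n{messages}"
)
```

The difficulty noted above is that step 1 tends to need worked examples in the prompt before the LLM handles it reliably, which is what motivated keeping it as its own call.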
Role of AyaTranslator: One of the goals with respect to Aya was to make it a multilingual AI assistant that can communicate in the user’s preferred language. However, it could be difficult to handle different languages internally within the Aya agents. Specifically, if the Aya agent’s prompt is in English and the user message is in a different language, it could potentially create issues. In order to avoid such situations, as a filtering step, we translated any incoming user messages to Aya into English. As a result, all of the internal work within the Aya group of agents was done in English, including the output. We didn’t need to translate the Aya output back to the original language, because when the message reaches the users, the User agents handle translating the message into their respective assigned languages.
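The overall flow can be sketched with stand-in callables for the translation and agent steps:

```python
def handle_aya_message(message, to_english, aya_agents, user_translators):
    # Filtering step: normalise the incoming message to English first,
    # run the Aya agents entirely in English, then let each UserAgent
    # translate the English output into its user's assigned language.
    english_message = to_english(message)          # AyaTranslator's role
    english_reply = aya_agents(english_message)    # all internal work in English
    return {uid: translate(english_reply)
            for uid, translate in user_translators.items()}
```

Because the outbound translation is already each UserAgent’s job, the Aya side never needs to know which languages the recipients speak.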