Multi AI Agent Systems 101


Initially, when ChatGPT just appeared, we used simple prompts to get answers to our questions. Then, we encountered issues with hallucinations and started using RAG (Retrieval Augmented Generation) to supply more context to LLMs. After that, we began experimenting with AI agents, where LLMs act as a reasoning engine and can decide what to do next, which tools to use, and when to return the final answer.

The next evolutionary step is to create teams of such agents that can collaborate with each other. This approach is logical because it mirrors human interactions. We work in teams, where each member has a particular role:

  • The product manager proposes the next project to work on.
  • The designer creates its look and feel.
  • The software engineer develops the solution.
  • The analyst examines the data to make sure it performs as expected and identifies ways to improve the product for customers.

Similarly, we can create a team of AI agents, each specialising in one domain. They can collaborate and reach a final conclusion together. Just as specialisation enhances performance in real life, it could also benefit the performance of AI agents.

Another advantage of this approach is increased flexibility. Each agent can operate with its own prompt, set of tools and even LLM. For instance, we can use different models for different parts of our system: GPT-4 for the agent that needs more reasoning and GPT-3.5 for the one that does only simple extraction. We can even fine-tune a model for small, specific tasks and use it in our crew of agents.

The potential drawbacks of this approach are time and cost. Multiple interactions and information sharing between agents require more calls to the LLM and consume additional tokens. This could result in longer wait times and increased expenses.

There are several frameworks available for multi-agent systems today.
Here are some of the most popular ones:

  • AutoGen: Developed by Microsoft, AutoGen uses a conversational approach and was one of the earliest frameworks for multi-agent systems.
  • LangGraph: While not strictly a multi-agent framework, LangGraph allows defining complex interactions between actors using a graph structure, so it can also be adapted to create multi-agent systems.
  • CrewAI: Positioned as a high-level framework, CrewAI facilitates the creation of “crews” consisting of role-playing agents capable of collaborating in various ways.

I’ve decided to start experimenting with multi-agent frameworks from CrewAI since it’s widely popular and user-friendly, so it looks like a good choice to start with.

In this article, I’ll walk you through how to use CrewAI. As analysts, we’re the domain experts responsible for documenting various data sources and addressing related questions. We’ll explore how to automate these tasks using multi-agent frameworks.

Let’s start with setting up the environment. First, we need to install the main CrewAI package and an extension to work with tools.

pip install crewai
pip install 'crewai[tools]'

CrewAI was developed to work primarily with the OpenAI API, but I’d also like to try it with a local model. According to the ChatBot Arena Leaderboard, the best model you can run on your laptop is Llama 3 (8b parameters), so it will be the most feasible option for our use case.

We can access Llama models using Ollama. Installation is pretty straightforward: download Ollama from the website and go through the installation process. That’s it.

Now, you can test the model in the CLI by running the following command.

ollama run llama3


Let’s create a custom Ollama model to use later in CrewAI.

We’ll start with a ModelFile (documentation). I only specified the base model (llama3), temperature and stop sequence. However, you might add more options; for example, you can define the system message using the SYSTEM keyword.

FROM llama3

# set parameters
PARAMETER temperature 0.5
PARAMETER stop Result
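For instance, here’s a variant of the same ModelFile with a system message added — the SYSTEM line below is my own illustrative addition, not part of the setup we’ll use:

```
FROM llama3

SYSTEM You are a helpful assistant for data analysts.

# set parameters
PARAMETER temperature 0.5
PARAMETER stop Result
```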

I’ve saved it into a file named Llama3ModelFile.

Let’s create a bash script to load the base model for Ollama and create the custom model we defined in the ModelFile.

#!/bin/zsh

# define variables
model_name="llama3"
custom_model_name="crewai-llama3"

# load the base model
ollama pull $model_name

# create the custom model from the ModelFile
ollama create $custom_model_name -f ./Llama3ModelFile

Let’s execute this file.

chmod +x ./llama3_setup.sh
./llama3_setup.sh

You can find both files on GitHub: Llama3ModelFile and llama3_setup.sh

We need to initialise the following environment variables to use the local Llama model with CrewAI.

import os

os.environ["OPENAI_API_BASE"] = 'http://localhost:11434/v1'
os.environ["OPENAI_MODEL_NAME"] = 'crewai-llama3' # custom_model_name from the bash script
os.environ["OPENAI_API_KEY"] = "NA"

We’ve finished the setup and are ready to continue our journey.

As analysts, we often play the role of subject matter experts for data and some data-related tools. In my previous team, we used to have a channel with almost 1K participants, where we were answering a lot of questions about our data and the ClickHouse database we used as storage. It took us quite a lot of time to manage this channel. It would be interesting to see whether such tasks can be automated with LLMs.

For this example, I’ll use the ClickHouse database. If you’re interested, you can learn more about ClickHouse and how to set it up locally in my previous article. However, we won’t use any ClickHouse-specific features, so feel free to stick with the database you know.

I’ve created a fairly simple data model to work with. There are only two tables in our DWH (Data Warehouse): ecommerce_db.users and ecommerce_db.sessions. As you might guess, the first table contains information about the users of our service.

The ecommerce_db.sessions table stores details about user sessions.

Regarding data source management, analysts typically handle tasks like writing and updating documentation and answering questions about this data. So, we’ll use an LLM to write documentation for the tables in the database and teach it to answer questions about data or ClickHouse.

But before moving on to the implementation, let’s learn more about the CrewAI framework and its core concepts.

The cornerstone of a multi-agent framework is the agent concept. In CrewAI, agents are powered by role-playing. Role-playing is a tactic where you ask an agent to adopt a persona and behave like a top-notch backend engineer or a helpful customer support agent. So, when creating a CrewAI agent, you need to specify each agent’s role, goal, and backstory so that the LLM knows enough to play this role.

The agents’ capabilities are limited without tools (functions that agents can execute and get results from). With CrewAI, you can use one of the predefined tools (for example, to search the Internet, parse a website, or do RAG on a document), create a custom tool yourself or use LangChain tools. So, it’s pretty easy to create a powerful agent.

Let’s move on from agents to the work they’re doing. Agents work on tasks (specific assignments). For each task, we need to define a description, expected output (definition of done), set of available tools and assigned agent. I really like that these frameworks follow managerial best practices, like a clear definition of done for the tasks.

The next question is how to define the execution order for tasks: which one to work on first, which ones can run in parallel, etc. CrewAI implements processes to orchestrate the tasks. It provides a couple of options:

  • Sequential — the most straightforward approach, where tasks are called one after another.
  • Hierarchical — when there’s a manager (specified as an LLM model) that creates and delegates tasks to the agents.

Also, CrewAI is working on a consensual process. In such a process, agents will be able to make decisions collaboratively with a democratic approach.

There are other levers you can use to tweak the process of task execution:

  • You can mark tasks as “asynchronous”; then they will be executed in parallel, so you will be able to get an answer faster.
  • You can use the “human input” flag on a task; then the agent will ask for human approval before finalising the output of this task. It can help you add oversight to the process.

We’ve defined all the primary building blocks and can discuss the holy grail of CrewAI — the crew concept. The crew represents the team of agents and the set of tasks they will be working on. The approach to collaboration (the processes we discussed above) is also defined at the crew level.

Also, we can set up the memory for a crew. Memory is crucial for efficient collaboration between the agents. CrewAI supports three levels of memory:

  • Short-term memory stores information related to the current execution. It helps agents to work together on the current task.
  • Long-term memory is data about previous executions stored in a local database. This type of memory allows agents to learn from earlier iterations and improve over time.
  • Entity memory captures and structures information about entities (like personas, cities, etc.)

Right now, you can only turn on all types of memory for a crew, without any further customisation. However, it doesn’t work with the Llama models.

We’ve learned enough about the CrewAI framework, so it’s time to start using this knowledge in practice.

Let’s start with a simple task: putting together the documentation for our DWH. As we discussed before, there are two tables in our DWH, and I would like to create a detailed description for them using LLMs.

First approach

First, we need to think about the team structure. Consider this as a typical managerial task: who would you hire for such a job?

I’d break this task into two parts: retrieving data from a database and writing documentation. So, we need a database specialist and a technical writer. The database specialist needs access to the database, while the writer won’t need any special tools.

Now, we have a high-level plan. Let’s create the agents.

For each agent, I’ve specified the role, goal and backstory. I’ve tried my best to provide agents with all the needed context.

from crewai import Agent

database_specialist_agent = Agent(
    role = "Database specialist",
    goal = "Provide data to answer business questions using SQL",
    backstory = '''You are an expert in SQL, so you can help the team
    to gather the needed data to power their decisions.
    You are very accurate and take into account all the nuances in data.''',
    allow_delegation = False,
    verbose = True
)

tech_writer_agent = Agent(
    role = "Technical writer",
    goal = '''Write engaging and factually accurate technical documentation
    for data sources or tools''',
    backstory = '''
    You are an expert in both technology and communications, so you can easily
    explain even sophisticated concepts.
    You base your work on the factual information provided by your colleagues.
    Your texts are concise and can be easily understood by a wide audience.
    You use a professional but somewhat informal style in your communication.
    ''',
    allow_delegation = False,
    verbose = True
)

We’ll use a simple sequential process, so there’s no need for agents to delegate tasks to each other. That’s why I specified allow_delegation = False.

The next step is setting the tasks for the agents. But before moving to them, we need to create a custom tool to connect to the database.

First, I put together a function to execute ClickHouse queries using the HTTP API.

import requests

CH_HOST = 'http://localhost:8123' # default address

def get_clickhouse_data(query, host = CH_HOST, connection_timeout = 1500):
    r = requests.post(host, params = {'query': query},
        timeout = connection_timeout)
    if r.status_code == 200:
        return r.text
    else:
        return 'Database returned the following error:\n' + r.text

When working with LLM agents, it’s important to make tools fault-tolerant. For example, if the database returns an error (status_code != 200), my code won’t throw an exception. Instead, it will return the error description to the LLM so it can try to resolve the issue.
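The same pattern can be distilled into a tiny, database-free sketch (the helper names here are hypothetical, just for illustration): instead of raising, the wrapper returns the error text so the agent can read it and retry.

```python
# Minimal sketch of the fault-tolerant tool pattern: errors come back
# as text for the LLM to reason about, instead of crashing the run.
def run_tool_safely(tool_fn, *args):
    try:
        return tool_fn(*args)
    except Exception as e:
        return 'Tool returned the following error:\n' + str(e)

def broken_tool(query):
    # stand-in for a failing database call
    raise ValueError('Syntax error: failed at position 1')

print(run_tool_safely(broken_tool, 'select 1'))
# Tool returned the following error:
# Syntax error: failed at position 1
```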

To create a CrewAI custom tool, we need to derive our class from crewai_tools.BaseTool, implement the _run method and then create an instance of this class.

from crewai_tools import BaseTool

class DatabaseQuery(BaseTool):
    name: str = "Database Query"
    description: str = "Returns the result of SQL query execution"

    def _run(self, sql_query: str) -> str:
        # pass the query to the database and return the raw response
        return get_clickhouse_data(sql_query)

database_query_tool = DatabaseQuery()

Now, we can set the tasks for the agents. Again, providing clear instructions and all the context to the LLM is crucial.

from crewai import Task

table_description_task = Task(
    description = '''Provide the comprehensive overview for the data
    in table {table}, so that it's easy to understand the structure
    of the data. This task is crucial to put together the documentation
    for our database''',
    expected_output = '''The comprehensive overview of {table} in the md format.
    Include 2 sections: columns (list of columns with their types)
    and examples (the first 30 rows from the table).''',
    tools = [database_query_tool],
    agent = database_specialist_agent
)

table_documentation_task = Task(
    description = '''Using the provided information about the table,
    put together the detailed documentation for this table so that
    people can use it in practice''',
    expected_output = '''Well-written detailed documentation describing
    the data scheme for the table {table} in markdown format,
    that gives the table overview in 1-2 sentences and then
    describes each column. Structure the columns description
    as a markdown table with column name, type and description.''',
    tools = [],
    output_file = "table_documentation.md",
    agent = tech_writer_agent
)

You might have noticed that I’ve used the {table} placeholder in the tasks’ descriptions. We’ll pass table as an input variable when executing the crew, and this value will be inserted into all placeholders.
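Under the hood, this interpolation works much like Python’s str.format — a quick illustration (the template string here is shortened for brevity):

```python
# The crew inputs are substituted into every {table} placeholder.
description_template = ('Provide the comprehensive overview for the data '
                        'in table {table}.')
print(description_template.format(table='ecommerce_db.users'))
# Provide the comprehensive overview for the data in table ecommerce_db.users.
```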

Also, I’ve specified the output file for the table documentation task to save the result locally.

We have everything we need. Now, it’s time to create a crew and execute the process, specifying the table we’re interested in. Let’s try it with the users table.

from crewai import Crew

crew = Crew(
    agents = [database_specialist_agent, tech_writer_agent],
    tasks = [table_description_task, table_documentation_task],
    verbose = 2
)

result = crew.kickoff({'table': 'ecommerce_db.users'})

It’s an exciting moment, and I’m really looking forward to seeing the result. Don’t worry if execution takes some time: agents make multiple LLM calls, so it’s perfectly normal for it to take a few minutes. It took 2.5 minutes on my laptop.

We asked the LLM to return the documentation in markdown format. We can use the following code to see the formatted result in a Jupyter Notebook.

from IPython.display import Markdown
Markdown(result)

At first glance, it looks great. We’ve got a valid markdown file describing the users table.

But wait, it’s incorrect. Let’s see what data we have in our table.

The columns listed in the documentation are completely different from what we have in the database. It’s a case of LLM hallucinations.

We set verbose = 2 to get detailed logs from CrewAI. Let’s read through the execution logs to identify the root cause of the problem.

First, the database specialist couldn’t query the database due to complications with quotes.

The specialist didn’t manage to resolve this problem. Finally, this chain was terminated by CrewAI with the following output: Agent stopped due to iteration limit or time limit.

This means the technical writer didn’t receive any factual information about the data. However, the agent continued and produced completely fake results. That’s how we ended up with incorrect documentation.

Fixing the problems

Although our first iteration wasn’t successful, we’ve learned a lot. We have (at least) two areas for improvement:

  • Our database tool is too difficult for the model, and the agent struggles to use it. We can make the tool more tolerant by removing quotes from the beginning and end of the queries. This solution is not ideal since valid SQL can end with a quote, but let’s try it.
  • Our technical writer isn’t basing its output on the input from the database specialist. We need to tweak the prompt to highlight the importance of providing only factual information.

So, let’s try to fix these problems. First, we’ll fix the tool — we can use strip to get rid of quotes.

CH_HOST = 'http://localhost:8123' # default address

def get_clickhouse_data(query, host = CH_HOST, connection_timeout = 1500):
    r = requests.post(host, params = {'query': query.strip('"').strip("'")},
        timeout = connection_timeout)
    if r.status_code == 200:
        return r.text
    else:
        return 'Database returned the following error:\n' + r.text

Then, it’s time to update the prompt. I’ve included statements emphasising the importance of sticking to the facts in both the agent and task definitions.


tech_writer_agent = Agent(
    role = "Technical writer",
    goal = '''Write engaging and factually accurate technical documentation
    for data sources or tools''',
    backstory = '''
    You are an expert in both technology and communications, so you
    can easily explain even sophisticated concepts.
    Your texts are concise and can be easily understood by a wide audience.
    You use a professional but somewhat informal style in your communication.
    You base your work on the factual information provided by your colleagues.
    You stick to the facts in the documentation and use ONLY
    information provided by the colleagues, not adding anything.''',
    allow_delegation = False,
    verbose = True
)

table_documentation_task = Task(
    description = '''Using the provided information about the table,
    put together the detailed documentation for this table so that
    people can use it in practice''',
    expected_output = '''Well-written detailed documentation describing
    the data scheme for the table {table} in markdown format,
    that gives the table overview in 1-2 sentences and then
    describes each column. Structure the columns description
    as a markdown table with column name, type and description.
    The documentation is based ONLY on the information provided
    by the database specialist without any additions.''',
    tools = [],
    output_file = "table_documentation.md",
    agent = tech_writer_agent
)

Let’s execute our crew once again and see the results.

We’ve achieved a somewhat better result. Our database specialist was able to execute queries and look at the data, which is a big win for us. Moreover, we can see all the relevant fields in the result table, though there are a few extraneous fields as well. So, it’s still not entirely correct.

I looked through the CrewAI execution log once again to figure out what went wrong. The issue lies in getting the list of columns. There’s no filter by database, so the query returns some unrelated columns that appear in the result.

SELECT column_name 
FROM information_schema.columns
WHERE table_name = 'users'
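A version of this query with a database filter (assuming the standard information_schema columns; adjust to your database if needed) would avoid pulling unrelated columns:

```sql
SELECT column_name
FROM information_schema.columns
WHERE table_schema = 'ecommerce_db'
  AND table_name = 'users'
```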

Also, after looking at multiple attempts, I noticed that the database specialist, from time to time, executes a select * from query. It might cause issues in production, as it would generate a lot of data and send it to the LLM.
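One possible guardrail against that (my own sketch, not something CrewAI provides out of the box) is to enforce a row limit inside the tool before the query reaches the database:

```python
def enforce_limit(query: str, max_rows: int = 1000) -> str:
    # Append a LIMIT clause if the query doesn't already have one,
    # so an agent's `select * from table` can't pull the whole table.
    if 'limit' not in query.lower():
        return query.rstrip('; \n') + f' LIMIT {max_rows}'
    return query

print(enforce_limit('select * from ecommerce_db.users'))
# select * from ecommerce_db.users LIMIT 1000
```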

More specialised tools

We can provide our agent with more specialised tools to improve our solution. Currently, the agent has a tool to execute any SQL query, which is flexible and powerful but prone to errors. We can create more focused tools, such as getting the table structure and top-N rows from the table. Hopefully, this will reduce the number of mistakes.

class TableStructure(BaseTool):
    name: str = "Table structure"
    description: str = "Returns the list of columns and their types"

    def _run(self, table: str) -> str:
        table = table.strip('"').strip("'")
        return get_clickhouse_data(
            'describe {table} format TabSeparatedWithNames'
                .format(table = table)
        )

class TableExamples(BaseTool):
    name: str = "Table examples"
    description: str = "Returns the first N rows from the table"

    def _run(self, table: str, n: int = 30) -> str:
        table = table.strip('"').strip("'")
        return get_clickhouse_data(
            'select * from {table} limit {n} format TabSeparatedWithNames'
                .format(table = table, n = n)
        )

table_structure_tool = TableStructure()
table_examples_tool = TableExamples()

Now, we need to specify these tools in the task and re-run our script. After the first attempt, I got the following output from the technical writer.

Task output: This final answer provides a detailed and factual description
of the ecommerce_db.users table structure, including column names, types,
and descriptions. The documentation adheres to the provided information
from the database specialist without any additions or modifications.

More focused tools helped the database specialist retrieve the correct table information. However, even though the writer had all the necessary information, we didn’t get the expected result.

As we know, LLMs are probabilistic, so I gave it another try. And hooray, this time, the result was pretty good.

It’s not perfect: it still includes some irrelevant comments and lacks an overall description of the table. However, providing more specialised tools has definitely paid off. It also helped to prevent issues when the agent tried to load all the data from the table.

Quality assurance specialist

We’ve achieved pretty good results, but let’s see if we can improve them further. A common practice in multi-agent setups is quality assurance, which adds a final review stage before finalising the results.

Let’s create a new agent — a Quality Assurance specialist, who will be in charge of the review.

qa_specialist_agent = Agent(
    role = "Quality Assurance specialist",
    goal = """Ensure the highest quality of the documentation we provide
    (that it's correct and easy to understand)""",
    backstory = '''
    You work as a Quality Assurance specialist, checking the work
    from the technical writer and ensuring that it's in line
    with our highest standards.
    You need to check that the technical writer provides full, complete
    answers and makes no assumptions.
    Also, you need to make sure that the documentation addresses
    all the questions and is easy to understand.
    ''',
    allow_delegation = False,
    verbose = True
)

Now, it’s time to describe the review task. I’ve used the context parameter to specify that this task requires outputs from both table_description_task and table_documentation_task.

qa_review_task = Task(
    description = '''
    Review the draft documentation provided by the technical writer.
    Ensure that the documentation fully answers all the questions:
    the purpose of the table and its structure in the form of a table.
    Make sure that the documentation is consistent with the information
    provided by the database specialist.
    Double-check that there are no irrelevant comments in the final version
    of the documentation.
    ''',
    expected_output = '''
    The final version of the documentation in markdown format
    that can be published.
    The documentation should fully address all the questions, be consistent
    and follow our professional but informal tone of voice.
    ''',
    tools = [],
    context = [table_description_task, table_documentation_task],
    output_file = "checked_table_documentation.md",
    agent = qa_specialist_agent
)

Let’s update our crew and run it.

full_crew = Crew(
    agents = [database_specialist_agent, tech_writer_agent, qa_specialist_agent],
    tasks = [table_description_task, table_documentation_task, qa_review_task],
    verbose = 2,
    memory = False # doesn't work with Llama
)

full_result = full_crew.kickoff({'table': 'ecommerce_db.users'})

We now have more structured and detailed documentation thanks to the addition of the QA stage.

Delegation

With the addition of the QA specialist, it would be interesting to test the delegation mechanism. The QA specialist agent might have questions or requests that it could delegate to other agents.

I tried using delegation with Llama 3, but it didn’t go well. Llama 3 struggled to call the co-worker tool correctly: it couldn’t specify the correct co-worker’s name.

We achieved pretty good results with a local model that can run on any laptop, but now it’s time to switch gears and use a much more powerful model — GPT-4o.

To do it, we just need to update the following environment variables.

os.environ["OPENAI_MODEL_NAME"] = 'gpt-4o'  
os.environ["OPENAI_API_KEY"] = config['OPENAI_API_KEY'] # your OpenAI key

To switch on delegation, we should specify allow_delegation = True for the QA specialist agent.

Also, we can use the handy memory functionality for OpenAI models, as mentioned above. Memory will allow our agents to share their knowledge with each other during execution and leverage long-term memory to get information from previous executions.

full_crew = Crew(
    agents = [database_specialist_agent, tech_writer_agent, qa_specialist_agent],
    tasks = [table_description_task, table_documentation_task, qa_review_task],
    verbose = 2,
    memory = True
)

full_result = full_crew.kickoff({'table': 'ecommerce_db.users'})

Let’s look at the CrewAI execution logs to understand how delegation works. Here are all the logs for the QA specialist. We can see that it reached out to the database specialist to double-check the information.

Thought: I need to review the provided draft documentation to ensure it
fully answers all the questions and is complete, correct, and easy
to understand. Specifically, I need to make sure that it explains the purpose
of the table, its structure, and includes relevant examples without
any irrelevant comments.

First, I will compare the two drafts to identify any discrepancies and
missing information. Then, I will verify the accuracy and completeness of
the information provided.

**Comparison:**
1. **Introduction:**
- First draft: Includes an introduction and examples.
- Second draft: Adds a more detailed introduction about the purpose and importance of the table.

2. **Columns:**
- Both drafts have the same column structure and descriptions.

3. **Additional Information:**
- Both drafts include similar additional information, but the second draft is more concise.

4. **Examples:**
- The first draft includes examples, which are missing in the second draft.

5. **Final Checks:**
- Ensure there are no irrelevant comments.
- Verify consistency with information provided by the database specialist.

Action: Ask question to co-worker
Action Input: user_id

<-- omitted logs from the database specialist -->

Based on the provided context and examples for the `ecommerce_db.users` table,
the column descriptions and examples appear accurate. Here are the details:
- `user_id`: This column contains a unique identifier for each user.
The provided examples (e.g., 1000001, 1000002) align with this description
as they are unique numeric identifiers.

- `country`: This column specifies the country of the user. The provided
examples (e.g., United Kingdom, France, Germany, Netherlands) are
all valid country names.

- `is_active`: This column indicates whether the user is active (1) or not (0). The provided examples (e.g., 0, 1) correctly represent this binary status.

- `age`: This column shows the age of the user. The provided examples (e.g., 70,
87, 88, 25, 48, 78, 65, 31, 66, 73) are all valid ages.

Therefore, the column descriptions and examples provided for the
`ecommerce_db.users` table are accurate and meet the expected criteria.

Thought: Based on the observation from the database specialist,
the column descriptions and examples provided for the `ecommerce_db.users`
table are accurate. Now, I will consolidate the best parts from both drafts,
ensuring the documentation is complete, clear, and free of irrelevant comments.

Action: Finalize the documentation in markdown format, incorporating
the detailed introduction, examples, and ensuring the tone is professional
but informal.

When I tried delegation for the first time, I didn’t enable memory, which led to incorrect results. The data specialist and the technical writer initially returned the correct information. However, when the QA specialist came back with follow-up questions, they started to hallucinate. So, it looks like delegation works better when memory is enabled.

Here’s the final output from GPT-4o. The result looks pretty nice now. We can definitely use LLMs to automate documentation.

So, the first task has been solved!

I used the same script to generate documentation for the ecommerce_db.sessions table as well. It will be handy for our next task. So, let’s not waste any time and move on.

Our next task is answering questions based on the documentation, since it’s a common task for many data analysts (and other specialists).

We’ll start simple and create just two agents:

  • The documentation support specialist will be answering questions based on the docs,
  • The support QA agent will review the answer before sharing it with the customer.

We need to empower the documentation specialist with a couple of tools that will allow them to see all the files stored in the directory and read the files. It’s pretty straightforward, since CrewAI has already implemented such tools.

from crewai_tools import DirectoryReadTool, FileReadTool

documentation_directory_tool = DirectoryReadTool(
    directory = '~/crewai_project/ecommerce_documentation')

base_file_read_tool = FileReadTool()

However, since Llama 3 keeps struggling with quotes when calling tools, I had to create a custom tool on top of the FileReadTool to overcome this issue.

from crewai_tools import BaseTool

class FileReadToolUPD(BaseTool):
    name: str = "Read a file's content"
    description: str = "A tool that can be used to read a file's content."

    def _run(self, file_path: str) -> str:
        # strip quotes before delegating to the base tool
        return base_file_read_tool._run(file_path = file_path.strip('"').strip("'"))

file_read_tool = FileReadToolUPD()

Next, as we did before, we need to create agents, tasks and a crew.

data_support_agent = Agent(
    role = "Senior Data Support Agent",
    goal = "Be the most helpful support for your colleagues",
    backstory = '''You work as a support for data-related questions
    in the company.
    Even though you're a big expert in our data warehouse, you double-check
    all the facts in the documentation.
    Our documentation is fully up-to-date, so you can fully rely on it
    when answering questions (you don't need to check the actual data
    in the database).
    Your work is very important for the team's success. However, remember
    that examples of table rows don't show all the possible values.
    You need to ensure that you provide the best possible support: answering
    all the questions, making no assumptions and sharing only the factual data.
    Be creative and try your best to solve the customer's problem.
    ''',
    allow_delegation = False,
    verbose = True
)

qa_support_agent = Agent(
    role = "Support Quality Assurance Agent",
    goal = """Ensure the highest quality of the answers we provide
    to the customers""",
    backstory = '''You work as a Quality Assurance specialist, checking the work
    from support agents and ensuring that it's in line with our highest standards.
    You need to check that the agent provides full, complete answers
    and makes no assumptions.
    Also, you need to make sure that the documentation addresses all
    the questions and is easy to understand.
    ''',
    allow_delegation = False,
    verbose = True
)

````python
draft_data_answer = Task(
    description='''A very important customer {customer} reached out to you
with the following question:
```
{query}
```

Your task is to provide the best answer to all the points in the question,
using all the available information and not making any assumptions.
If you don't have enough information to answer the question, just say
that you don't know.''',
    expected_output='''The detailed, informative answer to the customer's
question that addresses all the points mentioned.
Make sure that the answer is complete and sticks to facts
(without any additional information not based on the factual data).''',
    tools=[documentation_directory_tool, file_read_tool],
    agent=data_support_agent
)
````

```python
answer_review = Task(
    description='''
Review the draft answer provided by the support agent.
Ensure that it fully answers all the questions mentioned
in the initial inquiry.
Make sure that the answer is consistent and doesn't include any assumptions.
''',
    expected_output='''
The final version of the answer in markdown format that can be shared
with the customer.
The answer should fully address all the questions, be consistent
and follow our professional but informal tone of voice.
We are a very chill and friendly company, so don't forget to include
all the polite phrases.
''',
    tools=[],
    agent=qa_support_agent
)
```

```python
qna_crew = Crew(
    agents=[data_support_agent, qa_support_agent],
    tasks=[draft_data_answer, answer_review],
    verbose=2,
    memory=False  # doesn't work with Llama
)
```

Let’s see how it works in practice.

```python
result = qna_crew.kickoff(
    {'customer': "Max",
     'query': """Hi team, I hope you're doing well. I need to find
the numbers before our CEO presentation tomorrow, so I will really
appreciate your help.
I need to calculate the number of sessions from our Windows users in 2023.
I've tried to find the table with such data in our data warehouse,
but wasn't able to.
Do you have any ideas whether we store the needed data somewhere,
so that I can query it? """
    }
)
```

We’ve got a polite, practical and helpful answer in return. That’s really great.

**Hello Max,**

Thank you for reaching out with your question! I'm happy to help you
find the number of sessions from Windows users in 2023.
After reviewing our documentation, I found that we do store data
related to sessions and users in our ecommerce database, specifically in
the `ecommerce_db.sessions` table.

To answer your question, I can provide you with a step-by-step guide
on how to query this table using SQL. First, you can use the `session_id`
column along with the `os` column, filtering for "Windows", and
the `action_date` column, filtering for dates in 2023.
Then, you can group the results by `os` using the `GROUP BY` clause
to count the number of sessions that meet these conditions.

Here's a sample SQL query that should give you the desired output:

```sql
SELECT COUNT(*)
FROM ecommerce_db.sessions
WHERE os = 'Windows'
AND action_date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY os;
```

This query will return the total number of sessions from Windows
users in 2023. I hope this helps! If you have any further questions or
need more assistance, please don't hesitate to ask.

Let’s complicate the task a bit. Suppose we can get not only questions about our data but also questions about our tool (ClickHouse). So, we will have another agent in the crew: ClickHouse Guru. To give our CH agent some knowledge, I will share a documentation website with it.

```python
from crewai_tools import ScrapeWebsiteTool, WebsiteSearchTool

ch_documentation_tool = ScrapeWebsiteTool(
    'https://clickhouse.com/docs/en/guides/creating-tables')
```

If you need to work with a lengthy document, you might try using RAG (Retrieval Augmented Generation) via WebsiteSearchTool. It will calculate embeddings and store them locally in ChromaDB. In our case, we'll stick to a simple website scraper tool.
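As a sketch, the RAG-based variant could be set up like this (the `website` keyword argument follows the crewai_tools documentation, but signatures may differ between versions, and the tool needs an embedding provider configured):

```python
from crewai_tools import WebsiteSearchTool

# RAG-based alternative: embeds the page content into a local ChromaDB
# index and retrieves only the chunks relevant to the agent's query.
ch_documentation_search_tool = WebsiteSearchTool(
    website='https://clickhouse.com/docs/en/guides/creating-tables')
```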

Now that we have two subject matter experts, we need to decide who will be working on the questions. So, it’s time to use a hierarchical process and add a manager to orchestrate all the tasks.

CrewAI provides the manager implementation, so we only need to specify the LLM model. I’ve picked GPT-4o.

```python
from langchain_openai import ChatOpenAI
from crewai import Process

complext_qna_crew = Crew(
    agents=[ch_support_agent, data_support_agent, qa_support_agent],
    tasks=[draft_ch_answer, draft_data_answer, answer_review],
    verbose=2,
    manager_llm=ChatOpenAI(model='gpt-4o', temperature=0),
    process=Process.hierarchical,
    memory=False
)
```

At this point, I had to switch from Llama 3 back to OpenAI models to run the hierarchical process, since it hasn’t worked for me with Llama (similar to this issue).
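For reference, here is a hedged sketch of how an individual agent can be pinned to a specific model: CrewAI agents accept an `llm` argument, so the switch looks roughly like this. The role, goal and backstory below are illustrative placeholders, not the article's actual ClickHouse agent definition, and argument handling may vary across CrewAI versions.

```python
from langchain_openai import ChatOpenAI
from crewai import Agent

# Illustrative only: give one agent an OpenAI model instead of
# a local Llama endpoint by passing `llm` explicitly.
ch_support_agent = Agent(
    role="ClickHouse Guru",
    goal="Answer ClickHouse-related questions based on the documentation",
    backstory="You are an expert in ClickHouse and enjoy helping colleagues.",
    llm=ChatOpenAI(model='gpt-4o', temperature=0),
    allow_delegation=False,
    verbose=True
)
```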

Now, we can try our new crew with different types of questions (either related to our data or to the ClickHouse database).

```python
ch_result = complext_qna_crew.kickoff(
    {'customer': "Maria",
     'query': """Good morning, team. I'm using ClickHouse to calculate
the number of customers.
Could you please remind me whether there's an option to add totals
in ClickHouse?"""
    }
)
```

```python
doc_result = complext_qna_crew.kickoff(
    {'customer': "Max",
     'query': """Hi team, I hope you're doing well. I need to find
the numbers before our CEO presentation tomorrow, so I will really
appreciate your help.
I need to calculate the number of sessions from our Windows users
in 2023. I've tried to find the table with such data
in our data warehouse, but wasn't able to.
Do you have any ideas whether we store the needed data somewhere,
so that I can query it? """
    }
)
```

If we look at the final answers and logs (I’ve omitted them here since they’re quite lengthy, but you can find them and full logs on GitHub), we’ll see that the manager was able to orchestrate correctly and delegate tasks to co-workers with relevant knowledge to address the customer’s question. For the first (ClickHouse-related) question, we got a detailed answer with examples and possible implications of using the WITH TOTALS functionality. For the data-related question, models returned roughly the same information as we’ve seen above.

So, we’ve built a crew that can answer various types of questions based on the documentation, whether from a local file or a website. I think it’s an excellent result.

You can find all the code on GitHub.

In this article, we’ve explored using the CrewAI multi-agent framework to create a solution for writing documentation based on tables and answering related questions.

Given the extensive functionality we’ve utilised, it’s time to summarise the strengths and weaknesses of this framework.

Overall, I find CrewAI to be an incredibly useful framework for multi-agent systems:

  • It’s straightforward, and you can build your first prototype quickly.
  • Its flexibility allows it to solve quite sophisticated business problems.
  • It encourages good practices like role-playing.
  • It provides many handy tools out of the box, such as RAG and a website parser.
  • The support of different types of memory enhances the agents’ collaboration.
  • Built-in guardrails help prevent agents from getting stuck in repetitive loops.

However, there are areas that could be improved:

  • While the framework is simple and easy to use, it’s not very customisable. For example, you currently can’t create your own LLM manager to orchestrate the processes.
  • Sometimes, it’s quite challenging to get the full detailed information from the documentation. For example, it’s clear that CrewAI implemented some guardrails to prevent repetitive function calls, but the documentation doesn’t fully explain how it works.
  • Another improvement area is transparency. I like to understand how frameworks work under the hood. For example, in Langchain, you can use `langchain.debug = True` to see all the LLM calls. However, I haven’t figured out how to get the same level of detail with CrewAI.
  • Full support for local models would be a great addition, since the current implementation either lacks some features or is difficult to get working properly.

The domain and tools for LLMs are evolving rapidly, so I’m hopeful that we’ll see a lot of progress in the near future.

Thank you a lot for reading this article. I hope it was insightful for you. If you have any follow-up questions or comments, please leave them in the comments section.

This article is inspired by the “Multi AI Agent Systems with CrewAI” short course from DeepLearning.AI.

Data & Product Analytics Lead at Wise | ClickHouse Evangelist
