How to Build a Powerful Deep Research System


Deep research is a popular feature you can turn on in apps such as ChatGPT and Google Gemini. It lets users ask a question as usual, while the application spends a longer time properly researching the question, coming up with a better answer than a normal LLM response.

You can also apply this to your own collection of documents. For example, suppose you have hundreds of documents of internal company information; you can create a deep research system that takes in user questions, scans all the available (internal) documents, and comes up with a good answer based on that information.

This infographic highlights the main contents of this article. I'll discuss in which situations you should build a deep research system, and in which situations simpler approaches like RAG or keyword search are more suitable. I'll then discuss how to build a deep research system, including gathering data, creating tools, and putting it all together with an orchestrator LLM and subagents. Image by ChatGPT.


Why build a deep research system?

The first question you might ask yourself is:

Why do I need a deep research system?

This is a fair question, because there are alternatives that are viable in many situations:

  • Feed all data into an LLM
  • RAG
  • Keyword search

If you can get away with one of these simpler systems, you should almost always do so. The by far easiest approach is to simply feed all the data into an LLM. If your information is contained in fewer than 1 million tokens, this is definitely a good option.
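If you're unsure whether your data fits, you can estimate the token count up front. Below is a minimal sketch using tiktoken; the documents folder and the cl100k_base encoding are assumptions, so pick the encoding that matches your model:

import tiktoken
from pathlib import Path

# Count tokens across a folder of text files to check whether the
# whole corpus fits in a single long-context window.
encoding = tiktoken.get_encoding("cl100k_base")

total_tokens = sum(
    len(encoding.encode(path.read_text(errors="ignore")))
    for path in Path("documents").glob("*.txt")
)

print(f"Total tokens: {total_tokens:,}")
if total_tokens < 1_000_000:
    print("Corpus fits in a long-context model; consider skipping deep research.")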

Moreover, if traditional RAG works well, or you can find relevant information with a keyword search, you should also choose those options. Sometimes, however, neither of those solutions is powerful enough to solve your problem. Perhaps you have to deeply analyze many sources, and chunk retrieval based on similarity (RAG) isn't adequate. Or you can't use keyword search because you're not familiar enough with the dataset to know which keywords to use. In that case, you should consider building a deep research system.

How to build a deep research system

You can naturally use the deep research system from providers such as OpenAI, which offers a Deep Research API. This can be a good alternative if you want to keep things simple. In this article, however, I'll discuss in more detail how a deep research system is built up, and why it's useful. Anthropic wrote a great article on their multi-agent research system (which is deep research), which I recommend reading to understand more details about the topic.

Gathering and indexing information

The first step for any information-finding system is to gather all your information in one place. Perhaps you have information in apps like:

  • Google Drive
  • Notion
  • Salesforce

You then either have to gather this information in one place (convert it all to PDFs, for example, and store them in the same folder), or you can connect to these apps, as ChatGPT has done in its application.

After gathering the information, we have to index it to make it easily accessible. The two main indices you should create are:

  • Keyword search index: for example, BM25
  • Vector similarity index: chunk up your text, embed it, and store it in a vector DB like Pinecone
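
To make this concrete, here is a minimal sketch of both indices, using the rank_bm25 library for the keyword side and naive fixed-size chunking for the vector side. The toy documents are placeholders; in practice you would embed each chunk and upsert it into a vector DB like Pinecone:

from rank_bm25 import BM25Okapi

# Toy corpus standing in for your internal documents.
documents = {
    "q3_sales.txt": "Q3 sales grew 12 percent, driven by the EMEA region.",
    "roadmap.txt": "The 2025 roadmap prioritizes the new billing platform.",
}

# Keyword index: BM25 over whitespace-tokenized text.
file_names = list(documents)
bm25 = BM25Okapi([documents[name].lower().split() for name in file_names])

def keyword_lookup(query: str, k: int = 3) -> list[str]:
    """Return the file names of the top-k BM25 matches."""
    scores = bm25.get_scores(query.lower().split())
    ranked = sorted(zip(scores, file_names), reverse=True)
    return [name for _, name in ranked[:k]]

def chunk(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Naive fixed-size character chunks with overlap; each chunk would
    then be embedded and stored in the vector DB."""
    step = size - overlap
    return [text[i : i + size] for i in range(0, len(text), step)]

print(keyword_lookup("sales EMEA"))  # 'q3_sales.txt' ranks first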

This makes the information easily accessible from the tools I'll describe in the next section.

Tools

The agents we'll be using later on need tools to fetch relevant information. You should therefore make a series of functions that make it easy for the LLM to fetch the relevant information. For example, if the user asks for a sales report, the LLM might want to run a keyword search for it and analyze the retrieved documents. These tools can look like this:

@tool
def keyword_search(query: str) -> str:
    """
    Search for keywords in the document corpus.
    """
    # query the underlying keyword index (e.g. BM25); this must be a
    # separate function, not the tool itself, to avoid infinite recursion
    results = keyword_index.search(query)

    # format the results so they are easy for the LLM to read
    formatted_results = "\n".join(
        f"{result['file_name']}: {result['content']}" for result in results
    )

    return formatted_results


@tool
def vector_search(query: str) -> str:
    """
    Embed the query and search for similar chunks in the document corpus.
    """
    vector = embed(query)
    # query the underlying vector index (e.g. Pinecone), again a
    # separate function from this tool
    results = vector_index.search(vector)

    # format the results so they are easy for the LLM to read
    formatted_results = "\n".join(
        f"{result['file_name']}: {result['content']}" for result in results
    )

    return formatted_results

You can also give the agent access to other functions, such as:

  • Web search
  • Filename only search

and other potentially relevant functions.
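
For instance, a filename-only search tool could follow the same pattern as the tools above (the file_names list here is a hypothetical in-memory list of your documents' names, like the one built in the indexing sketch earlier):

import fnmatch

@tool
def filename_search(pattern: str) -> str:
    """
    Find documents whose file name matches a glob pattern, e.g. "*sales*".
    Useful when the user refers to a document by name.
    """
    # file_names is assumed to hold every document name in the corpus
    matches = [
        name for name in file_names
        if fnmatch.fnmatch(name.lower(), pattern.lower())
    ]
    return "\n".join(matches) or "No matching files."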

Putting it all together

A deep research system typically consists of an orchestrator agent and many subagents. The approach is as follows:

  • An orchestrator agent receives the user query and plans which approaches to take
  • Many subagents are dispatched to fetch relevant information and feed summarized findings back to the orchestrator
  • The orchestrator determines whether it has enough information to answer the user query. If not, we return to the previous step; if yes, we proceed to the final step
  • The orchestrator puts all the information together and provides the user with an answer

This figure highlights the deep research system I discussed. The user query goes to an orchestrator agent, which processes it and sends subagents to fetch information from the document corpus. The orchestrator then determines whether it has enough information to answer the user query. If not, it fetches more information; if it does, it generates a response for the user. Image by the author.
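
In code, this loop could look roughly like the sketch below. The llm() and run_subagent() helpers are hypothetical stand-ins for calls to your model provider and your tool-equipped subagents:

MAX_ROUNDS = 5  # safety cap so the research loop always terminates

def deep_research(user_query: str) -> str:
    findings: list[str] = []

    for _ in range(MAX_ROUNDS):
        # The orchestrator plans the next research tasks, given
        # everything the subagents have found so far.
        plan = llm(
            f"Query: {user_query}\nFindings so far: {findings}\n"
            "List the next research tasks, one per line, "
            "or reply DONE if you have enough information."
        )
        if plan.strip() == "DONE":
            break

        # Subagents fetch and summarize information (sequential here
        # for simplicity; in practice they can run in parallel).
        for task in plan.splitlines():
            findings.append(
                run_subagent(task, tools=[keyword_search, vector_search])
            )

    # Finally, the orchestrator aggregates everything into an answer.
    return llm(f"Answer this query: {user_query}\nUsing these findings: {findings}")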

Moreover, you can also have the system ask a clarifying question, if the user's query is vague, or simply to narrow down its scope. You've probably experienced this if you've used a deep research system from a frontier lab, where the system almost always starts by asking a clarifying question.
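
Such a step could slot in just before the research loop starts. Again, this is only a sketch, where llm() and ask_user() are hypothetical helpers:

def maybe_clarify(user_query: str) -> str:
    """Ask one clarifying question up front if the query is vague."""
    reply = llm(
        f"The user asked: {user_query}\n"
        "If the request is vague, ask one short clarifying question. "
        "Otherwise reply OK."
    )
    if reply.strip() != "OK":
        # ask_user() is a placeholder for however your app collects input
        user_query += "\nClarification: " + ask_user(reply)
    return user_query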

Often, the orchestrator is a larger/more capable model, for example Claude Opus or GPT-5 with high reasoning effort, while the subagents are typically smaller models, such as GPT-4.1 or Claude Sonnet.

The main advantage of this approach (over traditional RAG, especially) is that you allow the system to scan and analyze more information, lowering the chance of missing information that's relevant to answering the user query. Having to scan more documents also typically makes the system slower, so this is naturally a trade-off between time and quality of responses.

Conclusion

In this article, I have discussed how to build a deep research system. I first covered the motivation for building such a system, and in which scenarios you should instead focus on building simpler systems, such as RAG or keyword search. I then discussed the foundation of a deep research system, which essentially takes in a user query, plans how to answer it, sends subagents to fetch relevant information, aggregates that information, and responds to the user.

👉 Find me on socials:

🧑‍💻 Get in contact

🔗 LinkedIn

🐦 X / Twitter

✍️ Medium

