LangGraph 101: Let's Build A Deep Research Agent

Building AI agents that truly work in practice is no simple task.

You need to figure out how to orchestrate the multi-step workflow, keep track of the agents' states, implement essential guardrails, and monitor decision processes as they occur.

Fortunately, LangGraph addresses exactly those pain points for you.

Recently, Google demonstrated this perfectly by open-sourcing a full-stack implementation of a Deep Research Agent built with LangGraph and Gemini (under the Apache-2.0 license).

This isn't a toy implementation: the agent can not only search, but also dynamically evaluate the results to decide whether more information is needed and, if so, run further searches. This iterative workflow is exactly the kind of thing where LangGraph really shines.

So, if you want to learn how LangGraph works in practice, what better place to start than a real, working agent like this?

Here's our game plan for this tutorial post: we'll adopt a "problem-driven" learning approach. Instead of starting with lengthy, abstract concepts, we'll jump right into the code and examine Google's implementation. After that, we'll connect each piece back to the core concepts of LangGraph.

By the end, you'll not only have a working research agent but also enough LangGraph knowledge to build whatever comes next.

Here is the visual roadmap for this post:

Figure 1. Table of Contents for this post. (Image by author)

1. The Big Picture — Modeling the Workflow with Graphs, Nodes, and Edges

🎯 The problem

In this case study, we'll build something exciting: an LLM-based, research-augmented agent, a minimal replication of the deep-research features you've already seen in ChatGPT, Gemini, Claude, or Perplexity. That's what we're aiming for here.

Specifically, our agent will work like this: given a user question, it generates a set of search queries, searches the web, reflects on whether the gathered information is sufficient, runs follow-up searches if needed, and finally composes a cited answer.

First things first, let's sketch out a high-level flowchart so that we're clear about what we're building here:

Figure 2. High-level flowchart. (Image by author)

💡LangGraph’s solution

Now, how should we model this workflow in LangGraph? Well, as the name suggests, LangGraph uses graph representations. Okay, but why use graphs?

The short answer is this: graphs are great for modeling complex, stateful flows, such as the application we aim to build here. When you have branching decisions, loops that need to circle back, and all the other messy realities that real-world agentic workflows throw at you, graphs give you one of the most natural ways to represent them.

Technically, a graph consists of nodes and edges. In LangGraph's world, nodes are individual processing steps within the workflow, and edges define transitions between steps, that is, how control and state flow through the system.

> Let’s see some code!

In LangGraph, the translation from flowchart to code is straightforward. Let's look at agent/graph.py from the Google repository to see how this is done.

The first step is to create the graph itself:

from langgraph.graph import StateGraph
from agent.state import (
    OverallState,
    QueryGenerationState,
    ReflectionState,
    WebSearchState,
)
from agent.configuration import Configuration

# Create our Agent Graph
builder = StateGraph(OverallState, config_schema=Configuration)

Here, StateGraph is LangGraph's builder class for a state-aware graph. It accepts an OverallState class that defines what information can move between nodes (this is the agent-memory part we will discuss in the next section), and a Configuration class that defines runtime-tunable parameters, such as which LLM to call at individual steps, the number of initial queries to generate, and so on. More details on both will follow in the next sections.

Once we have the graph container, we can add nodes to it:

# Define the nodes we are going to cycle between
builder.add_node("generate_query", generate_query)
builder.add_node("web_research", web_research)
builder.add_node("reflection", reflection)
builder.add_node("finalize_answer", finalize_answer)

The add_node() method takes the node's name as its first argument and, as its second argument, the callable that is executed when the node runs.

In general, this callable can be a plain function, an async function, a LangChain Runnable, or even another compiled StateGraph.

In our specific case:

  • generate_query generates search queries based on the user's question.
  • web_research performs web research using the native Google Search API tool.
  • reflection identifies knowledge gaps and generates potential follow-up queries.
  • finalize_answer finalizes the research summary.

We will examine the detailed implementation of these functions later.

Okay, now that we have the nodes defined, the next step is to add edges to connect them and define the execution order:

from langgraph.graph import START, END

# Set the entrypoint as `generate_query`
# This means this node is the first one called
builder.add_edge(START, "generate_query")

# Add conditional edge to proceed with search queries in a parallel branch
builder.add_conditional_edges(
    "generate_query", continue_to_web_research, ["web_research"]
)

# Reflect on the web research
builder.add_edge("web_research", "reflection")

# Evaluate the research
builder.add_conditional_edges(
    "reflection", evaluate_research, ["web_research", "finalize_answer"]
)

# Finalize the answer
builder.add_edge("finalize_answer", END)

A few things are worth pointing out here:

  • Notice how the node names we defined earlier (e.g., "generate_query", "web_research", etc.) now come in handy: we can reference them directly in our edge definitions.
  • We see that two kinds of edges are used: static edges and conditional edges.
  • When builder.add_edge() is used, a direct, unconditional connection between two nodes is created. In our case, builder.add_edge("web_research", "reflection") basically means that after web research is completed, the flow moves on to the reflection step.
  • On the other hand, when builder.add_conditional_edges() is used, the flow may jump to different branches at runtime. We need three key arguments when creating a conditional edge: the source node, a routing function, and a list of possible destination nodes. The routing function examines the current state and returns the name of the next node to visit. For example, the evaluate_research() function determines whether the agent needs more research (in that case, go to the "web_research" node) or whether the information is already sufficient for the agent to finalize the answer (go to the "finalize_answer" node).

  • We also notice two special nodes: START and END. These are LangGraph's built-in entry and exit points. Every graph needs exactly one starting point (where execution begins), but can have multiple end points (where execution terminates).

Finally, it's time to put everything together and compile the graph into an executable agent:

graph = builder.compile(name="pro-search-agent")

And that's it! We've successfully translated our flowchart into a LangGraph implementation.
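As a quick sanity check, the compiled graph can be invoked like any other LangChain runnable. Below is a minimal sketch, assuming the state keys from the OverallState schema we will meet in the next section and the runtime settings defined by the repo's Configuration class:

from langchain_core.messages import HumanMessage

# Minimal sketch: run the compiled agent end to end.
result = graph.invoke(
    {"messages": [HumanMessage(content="What is LangGraph used for?")]},
    config={"configurable": {"max_research_loops": 2}},
)

# The finalized, citation-annotated answer is the last message in the state.
print(result["messages"][-1].content)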

🎁 Bonus Read: Why Do Graphs Truly Shine?

Beyond being a natural fit for nonlinear workflows, LangGraph's node/edge/graph representation brings several additional practical advantages that make building and managing agents easier in the real world:

  • Fine-grained control & observability. Because every node/edge has its own identity, you can easily checkpoint your progress and look under the hood when something unexpected happens. This makes debugging and evaluation easy.
  • Modularity & reuse. You can bundle individual steps into reusable components, just like Lego bricks. Talk about software best practices in action.
  • Parallel paths. When parts of your workflow are independent, graphs easily allow them to run concurrently. Obviously, this helps address latency issues and makes your system more robust to faults, which is especially critical when your pipelines are complex.
  • Easily visualizable. Whether it's for debugging or presenting the approach, it's always nice to be able to see the workflow logic. Graphs are just natural for visualization.

📌Key takeaways

Let's recap what we've covered in this foundational section:

  • LangGraph uses graphs to describe the agentic workflow, as graphs elegantly handle branching, looping, and other nonlinear procedures.
  • In LangGraph, nodes represent processing steps and edges define transitions between steps.
  • LangGraph implements two kinds of edges: static edges and conditional edges. When you have fixed transitions between nodes, use static edges. If the transition may change at runtime based on a dynamic decision, use conditional edges.
  • Building a graph in LangGraph is simple: you first create a StateGraph, then add nodes (with their functions) and connect them with edges. Finally, you compile the graph. Done!
Figure 3. Building an agentic graph in LangGraph. (Image by author)

Now that we understand the basic structure, you're probably wondering: how does information flow between these nodes? This brings us to one of LangGraph's most important concepts: state management.

Let’s check that out.


2. The Agent’s Memory — How Nodes Share Information with State

Figure 4. The current progress. (Image by author)

🎯 The problem

As our agent walks through the graph we defined earlier, it needs to keep track of what it has generated and learned. For example:

  • The original question from the user.
  • The list of search queries it has generated.
  • The content it has retrieved from the web.
  • Its own internal reflections about whether the gathered information is sufficient.
  • The final, polished answer.

So, how should we maintain that information so that our nodes don't work in isolation but instead collaborate and build upon one another's work?

💡 LangGraph’s solution

The LangGraph way of solving this problem is to introduce a central state object, a shared whiteboard that every node in the graph can read from and write to.

Here's how it works:

  • When a node is executed, it receives the current state of the graph.
  • The node performs its task (e.g., calls an LLM, runs a tool) using information from the state.
  • The node then returns a dictionary containing only the parts of the state it wants to update or add.
  • LangGraph then takes this output and automatically merges it into the main state object before passing it to the next node.

Because the state passing and merging are handled at the framework level by LangGraph, individual nodes don't need to worry about how to access or update shared data. They only need to focus on their specific task logic.

Also, this pattern makes your agent workflows highly modular. You can easily add, remove, or reorder nodes without breaking the state flow.

> Let’s see some code!

Remember this line from the last section?

# Create our Agent Graph
builder = StateGraph(OverallState, config_schema=Configuration)

We mentioned that OverallState defines the agent's memory, but we haven't yet shown how exactly it's implemented. Now it's time to open the black box.

In the repo, OverallState is defined in agent/state.py:

from typing import TypedDict, Annotated, List
from langgraph.graph.message import add_messages
import operator

class OverallState(TypedDict):
    messages: Annotated[list, add_messages]
    search_query: Annotated[list, operator.add]
    web_research_result: Annotated[list, operator.add]
    sources_gathered: Annotated[list, operator.add]
    initial_search_query_count: int
    max_research_loops: int
    research_loop_count: int
    reasoning_model: str

Essentially, we can see that the so-called state is a TypedDict that serves as a contract. It defines every field your workflow cares about and how those fields should be merged when multiple nodes write to them. Let's break that down:

  • Field purposes: messages stores the conversation history; search_query, web_research_result, and sources_gathered track the agent's research process. The other fields control agent behavior by setting limits and tracking progress.
  • The Annotated pattern: Some fields use Annotated[list, add_messages] or Annotated[list, operator.add]. This tells LangGraph how to merge updates when multiple nodes modify the same field. Specifically, add_messages is LangGraph's built-in reducer for intelligently merging conversation messages, while operator.add concatenates lists when nodes add new items.
  • Merge behavior: Fields like research_loop_count: int simply replace the old value when updated. Annotated fields, on the other hand, are accumulative: they build up over time as different nodes append information to them (see the sketch after this list).
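To make the merge behavior concrete, here is a minimal, self-contained sketch. It is a hypothetical toy graph (not part of the repo) that shows how an operator.add field accumulates values across nodes while a plain int field is simply overwritten:

import operator
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph, START, END

class DemoState(TypedDict):
    # Updates to `items` are concatenated onto the existing list.
    items: Annotated[list, operator.add]
    # Plain fields are simply overwritten by the latest update.
    counter: int

def node_a(state: DemoState):
    return {"items": ["from_a"], "counter": 1}

def node_b(state: DemoState):
    return {"items": ["from_b"], "counter": 2}

demo_builder = StateGraph(DemoState)
demo_builder.add_node("node_a", node_a)
demo_builder.add_node("node_b", node_b)
demo_builder.add_edge(START, "node_a")
demo_builder.add_edge("node_a", "node_b")
demo_builder.add_edge("node_b", END)
demo = demo_builder.compile()

print(demo.invoke({"items": [], "counter": 0}))
# {'items': ['from_a', 'from_b'], 'counter': 2}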

While OverallState serves as the global memory, it is often better to also define smaller, node-specific states that act as a clear "API contract" for what a node needs and produces. After all, a specific node usually neither requires all the information in OverallState nor modifies all of its content.

This is exactly what Google's implementation does.

In agent/state.py, besides OverallState, three other states are also defined:

class ReflectionState(TypedDict):
    is_sufficient: bool
    knowledge_gap: str
    follow_up_queries: Annotated[list, operator.add]
    research_loop_count: int
    number_of_ran_queries: int

class QueryGenerationState(TypedDict):
    query_list: list[Query]

class WebSearchState(TypedDict):
    search_query: str
    id: str

These states are used by the nodes in the following way (agent/graph.py):

from agent.state import (
    OverallState,
    QueryGenerationState,
    ReflectionState,
    WebSearchState,
)

def generate_query(
    state: OverallState, 
    config: RunnableConfig
) -> QueryGenerationState:
    # ...Some logic to generate search queries...
    return {"query_list": result.query}

def continue_to_web_research(
    state: QueryGenerationState
):
    # ...Some logic to send out multiple search queries...

def web_research(
    state: WebSearchState, 
    config: RunnableConfig
) -> OverallState:
    # ...Some logic to perform web research...
    return {
        "sources_gathered": sources_gathered,
        "search_query": [state["search_query"]],
        "web_research_result": [modified_text],
    }

def reflection(
    state: OverallState, 
    config: RunnableConfig
) -> ReflectionState:
    # ...Some logic to reflect on the results...
    return {
        "is_sufficient": result.is_sufficient,
        "knowledge_gap": result.knowledge_gap,
        "follow_up_queries": result.follow_up_queries,
        "research_loop_count": state["research_loop_count"],
        "number_of_ran_queries": len(state["search_query"]),
    }

def evaluate_research(
    state: ReflectionState,
    config: RunnableConfig,
) -> OverallState:
    # ...Some logic to determine the next step in the research flow...

def finalize_answer(
    state: OverallState, 
    config: RunnableConfig) -> OverallState:
    # ...Some logic to finalize the research summary...

    return {
        "messages": [AIMessage(content=result.content)],
        "sources_gathered": unique_sources,
    }

Take the reflection node as an example: it reads from the OverallState but returns a dictionary that matches the ReflectionState contract. Afterward, LangGraph handles the job of merging those values into the main OverallState, making them available to the next nodes in the graph.

🎁 Bonus Read: Where Did My State Go?

A common point of confusion when working with LangGraph is how OverallState and these smaller, node-specific states interact. Let's clear that up here.

The crucial mental model to have is this: there is only one state dictionary at runtime, the OverallState.

Node-specific TypedDicts are not extra runtime data stores. Instead, they are just typed "views" onto the one underlying dictionary (OverallState) that temporarily zoom in on the parts a node should see or produce. They exist so that the type checker and the LangGraph runtime can enforce clear contracts.

Figure 5. A fast comparison of the 2 state types. (Image by Writer)

Before a node runs, LangGraph can use its type hints to create a “slice” of the OverallState containing only the inputs that the node needs.

The node runs its logic and returns its small, specific output dictionary (e.g., a ReflectionState dict).

LangGraph takes the returned dictionary and merges it into the OverallState (conceptually, OverallState.update(return_dict)). If any keys were defined with a reducer (like operator.add), that logic is applied instead of a plain overwrite. The updated OverallState is then passed to the next node.

So why has LangGraph embraced this two-level state definition? Besides enforcing a clear contract for each node and making node operations self-documenting, there are two other advantages worth mentioning:

  • Drop-in reusability: Because a node only advertises the small slice of state it needs and produces, it becomes a modular, plug-and-play component. For example, a generate_query node that only needs {user_query} from the state and outputs {queries} can be dropped into another, completely different graph, as long as that graph's OverallState can provide a user_query. If the node were coded against the entire global state (i.e., typed with OverallState for both its input and output), you could easily break the workflow just by renaming an unrelated key. This modularity is quite essential for building complex systems.
  • Efficiency in parallel flows: Imagine our agent needs to run 10 web searches concurrently. If we use a node-specific state as a small payload, we only need to send the search query to each parallel branch. That is far more efficient than sending a copy of the entire agent memory (remember, the full chat history is also stored in OverallState!) to all ten branches. This way, we can dramatically cut down on memory and serialization overhead.

So what does this mean for us in practice?

  •  Declare in OverallState every key that needs to persist or be visible to multiple nodes.
  •  Make the node-specific states as small as possible. They should contain only the fields that the node is responsible for producing.
  •  Every key you write must be declared in some state schema; otherwise, LangGraph raises InvalidUpdateError when the node tries to write it.

📌Key takeaways

Let's recap what we've covered in this section:

  • LangGraph maintains state at two levels: at the global level, the OverallState object serves as the central memory; at the individual node level, small TypedDict-based schemas describe node-specific inputs/outputs. This keeps state management clean and organized.
  • After each step, nodes return minimal output dicts, which are then merged back into the central memory (OverallState). This merging is done according to your custom rules (e.g., operator.add for lists).
  • Nodes are self-contained and modular. You can easily reuse them like building blocks to create new workflows.
Figure 6. Key points to remember in LangGraph state management. (Image by author)

Now we've understood the graph's structure and how state flows through it, but what happens inside each node? Let's now turn to the node operations.


3. Node Operations — Where The Real Work Happens

Figure 7. The current progress. (Image by author)

Our graph can route messages and hold state, but inside each node we still need to:

  • Ensure the LLM outputs the right format.
  • Call external APIs.
  • Run multiple searches in parallel.
  • Determine when to stop the loop.

Luckily, LangGraph has your back with several solid approaches for tackling these challenges. Let's meet them one by one, each through a slice of our working codebase.

3.1 Structured output

🎯 The problem

Getting an LLM to return a JSON object is easy, but parsing free-text JSON is unreliable in practice. As soon as the LLM uses a different phrasing, adds unexpected formatting, or changes the key order, our workflow can easily go off the rails. In short, we need guaranteed, validatable output structures at each processing step.

💡 LangGraph’s solution

We constrain the LLM to generate output that conforms to a predefined schema. This can be done by attaching a Pydantic schema to the LLM call using llm.with_structured_output(), a helper method provided by every LangChain chat-model wrapper (e.g., ChatGoogleGenerativeAI, ChatOpenAI, etc.).

> Let’s see some code!

Let's look at the generate_query node, whose job is to create a list of search queries. Since we want this list to be a clean Python object, not a messy string, for the next node to parse, it is a good idea to enforce the output schema with SearchQueryList (defined in agent/tools_and_schemas.py):

from typing import List
from pydantic import BaseModel, Field

class SearchQueryList(BaseModel):
    query: List[str] = Field(
        description="An inventory of search queries for use for web research."
    )
    rationale: str = Field(
        description="A temporary explanation of why these queries are relevant to the research topic."
    )

And here is how this schema is used in the generate_query node:

from langchain_google_genai import ChatGoogleGenerativeAI
from agent.prompts import (
    get_current_date,
    query_writer_instructions,
)

def generate_query(
    state: OverallState, 
    config: RunnableConfig
) -> QueryGenerationState:
    """LangGraph node that generates a search queries 
       based on the User's query.

    Uses Gemini 2.0 Flash to create an optimized search 
    query for web research based on the User's query.

    Args:
        state: Current graph state containing the User's query
        config: Configuration for the runnable, including LLM 
                provider settings

    Returns:
        Dictionary with state update, including search_query key 
        containing the generated query
    """
    configurable = Configuration.from_runnable_config(config)

    # check for custom initial search query count
    if state.get("initial_search_query_count") is None:
        state["initial_search_query_count"] = configurable.number_of_initial_queries

    # init Gemini 2.0 Flash
    llm = ChatGoogleGenerativeAI(
        model=configurable.query_generator_model,
        temperature=1.0,
        max_retries=2,
        api_key=os.getenv("GEMINI_API_KEY"),
    )
    structured_llm = llm.with_structured_output(SearchQueryList)

    # Format the prompt
    current_date = get_current_date()
    formatted_prompt = query_writer_instructions.format(
        current_date=current_date,
        research_topic=get_research_topic(state["messages"]),
        number_queries=state["initial_search_query_count"],
    )
    # Generate the search queries
    result = structured_llm.invoke(formatted_prompt)
    return {"query_list": result.query}

Here, llm.with_structured_output(SearchQueryList) wraps the Gemini model with LangChain's structured-output helper. Under the hood, it uses the model's preferred structured-output feature (JSON mode for Gemini 2.0 Flash) and automatically parses the reply into a SearchQueryList Pydantic instance, so result is already validated Python data.

It's also interesting to check out the system prompt Google used for this node:

query_writer_instructions = """Your goal is to generate sophisticated and 
diverse web search queries. These queries are intended for a complicated 
automated web research tool able to analyzing complex results, following 
links, and synthesizing information.

Instructions:
- Always prefer a single search query, only add another query if the original 
  question requests multiple aspects or elements and one query is not enough.
- Each query should focus on one specific aspect of the original question.
- Don't produce more than {number_queries} queries.
- Queries should be diverse, if the topic is broad, generate more than 1 query.
- Don't generate multiple similar queries, 1 is enough.
- Query should ensure that the most current information is gathered. 
  The current date is {current_date}.

Format: 
- Format your response as a JSON object with ALL three of these exact keys:
   - "rationale": Brief explanation of why these queries are relevant
   - "query": A list of search queries

Example:

Topic: What revenue grew more last year apple stock or the number of people 
buying an iphone
```json
{{
    "rationale": "To reply this comparative growth query accurately, 
we want specific data points on Apple's stock performance and iPhone sales 
metrics. These queries goal the precise financial information needed: 
company revenue trends, product-specific unit sales figures, and stock price 
movement over the identical fiscal period for direct comparison.",
    "query": ["Apple total revenue growth fiscal year 2024", "iPhone unit 
sales growth fiscal year 2024", "Apple stock price growth fiscal year 2024"],
}}
```

Context: {research_topic}"""

We see some prompt engineering best practices in action, like defining the model's role, specifying constraints, providing an example for illustration, etc.

3.2 Tool calling

🎯 The problem

For our research agent to succeed, it needs up-to-date information from the web. To get that, it needs a "tool" to search the web.

💡 LangGraph’s solution

Nodes can execute tools. These can be native LLM tool-calling features (like in Gemini) or tools integrated through LangChain's tool abstractions (see the sketch at the end of this subsection). Once the tool-calling results are gathered, they can be placed back into the agent's state.

> Let’s see some code!

For the tool-calling usage pattern, let's look at the web_research node. This node uses Gemini's native tool-calling feature to perform Google searches. Notice how the tool is specified directly in the model's configuration.

from langchain_google_genai import ChatGoogleGenerativeAI
from agent.prompts import (
    web_searcher_instructions,
)
from agent.utils import (
    get_citations,
    insert_citation_markers,
    resolve_urls,
)

def web_research(
    state: WebSearchState, 
    config: RunnableConfig
) -> OverallState:
    """LangGraph node that performs web research using the native Google 
       Search API tool.

    Executes a web search using the native Google Search API tool in 
    combination with Gemini 2.0 Flash.

    Args:
        state: Current graph state containing the search query and 
               research loop count
        config: Configuration for the runnable, including search API settings

    Returns:
        Dictionary with state update, including sources_gathered, 
        research_loop_count, and web_research_results
    """
    # Configure
    configurable = Configuration.from_runnable_config(config)
    formatted_prompt = web_searcher_instructions.format(
        current_date=get_current_date(),
        research_topic=state["search_query"],
    )

    # Uses the google genai client because the langchain client doesn't 
    # return grounding metadata
    response = genai_client.models.generate_content(
        model=configurable.query_generator_model,
        contents=formatted_prompt,
        config={
            "tools": [{"google_search": {}}],
            "temperature": 0,
        },
    )
    # resolve the urls to short urls for saving tokens and time
    resolved_urls = resolve_urls(
        response.candidates[0].grounding_metadata.grounding_chunks, state["id"]
    )
    # Gets the citations and adds them to the generated text
    citations = get_citations(response, resolved_urls)
    modified_text = insert_citation_markers(response.text, citations)
    sources_gathered = [item for citation in citations for item in citation["segments"]]

    return {
        "sources_gathered": sources_gathered,
        "search_query": [state["search_query"]],
        "web_research_result": [modified_text],
    }

The LLM sees the Google Search tool and understands that it can use it to fulfill the prompt. A key benefit of this native integration is the grounding_metadata returned with the response. That metadata contains grounding information: essentially, snippets of the answer paired with the URLs that justified them. This basically gives us citations for free.
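The repo sticks to Gemini's native search tool, but as mentioned above, tools can also be integrated through LangChain's tool abstractions. Here is a minimal sketch of that alternative pattern with bind_tools; it is not from the repo, and search_web is a hypothetical placeholder tool:

from langchain_core.tools import tool
from langchain_google_genai import ChatGoogleGenerativeAI

@tool
def search_web(query: str) -> str:
    """Hypothetical search tool; replace with a real search client."""
    return f"Top results for: {query}"

llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")
llm_with_tools = llm.bind_tools([search_web])

# The model may respond with tool_calls that your node then executes,
# placing the results back into the agent's state.
ai_msg = llm_with_tools.invoke("Find recent news about LangGraph.")
for call in ai_msg.tool_calls:
    print(call["name"], call["args"])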

3.3 Conditional routing

🎯 The problem

After the initial research, how does the agent know whether to stop or continue? We need a control mechanism to create a research loop that can terminate itself.

💡 LangGraph’s solution

Conditional routing is handled by a special kind of node: instead of returning state, this node returns the name of the node to visit next. Effectively, it implements a routing function that inspects the current state and decides how to direct traffic inside the graph.

> Let’s see some code!

The evaluate_research node is our agent's decision-maker. It checks the is_sufficient flag set by the reflection node and compares the current research_loop_count value against a pre-configured maximum threshold.

def evaluate_research(
    state: ReflectionState,
    config: RunnableConfig,
) -> OverallState:
    """LangGraph routing function that determines the following step within the 
       research flow.

    Controls the research loop by deciding whether to continue gathering 
    information or to finalize the summary based on the configured maximum 
    number of research loops.

    Args:
        state: Current graph state containing the research loop count
        config: Configuration for the runnable, including max_research_loops 
                setting

    Returns:
        String literal indicating the next node to visit 
        ("web_research" or "finalize_answer")
    """
    configurable = Configuration.from_runnable_config(config)
    max_research_loops = (
        state.get("max_research_loops")
        if state.get("max_research_loops") just isn't None
        else configurable.max_research_loops
    )
    if state["is_sufficient"] or state["research_loop_count"] >= max_research_loops:
        return "finalize_answer"
    else:
        return [
            Send(
                "web_research",
                {
                    "search_query": follow_up_query,
                    "id": state["number_of_ran_queries"] + int(idx),
                },
            )
            for idx, follow_up_query in enumerate(state["follow_up_queries"])
        ]

If the condition to stop is met, it returns the string "finalize_answer", and LangGraph proceeds to that node. If not, it returns a new list of Send objects containing the follow_up_queries, which spins up another parallel wave of web research, continuing the loop.

A Send object… what is it, then?

Well, it’s LangGraph’s way of triggering parallel execution. Let’s turn to that now.

3.4 Parallel processing

🎯 The problem

To answer the user's question as comprehensively as possible, we may need our generate_query node to produce multiple search queries. However, we don't want to run those searches one after another, as that would be slow and inefficient. What we want is to execute the web searches for all queries concurrently.

💡 LangGraph’s solution

To trigger parallel execution, a node can return a list of Send objects. Send is a special directive that tells the LangGraph scheduler to dispatch these tasks to the specified node (e.g., "web_research") concurrently, each with its own piece of state.

> Let’s see some code!

To enable parallel search, Google's implementation introduces the continue_to_web_research node to act as a dispatcher. It takes the query_list from the state and creates a separate Send task for each query.

from langgraph.types import Send

def continue_to_web_research(
    state: QueryGenerationState
):
    """LangGraph node that sends the search queries to the online research node.
    That is used to spawn n variety of web research nodes, one for every 
    search query.
    """
    return [
        Send("web_research", {"search_query": search_query, "id": int(idx)})
        for idx, search_query in enumerate(state["query_list"])
    ]

And that's all the code you need. The magic lives in what happens after this node returns.

When LangGraph receives this list, it's smart enough not to simply loop through it. Instead, it triggers a fan-out/fan-in process under the hood to handle things concurrently:

First of all, each Send object carries only the tiny payload you gave it ({"search_query": ..., "id": ...}), not the entire OverallState. The goal here is fast serialization.

Then, the graph scheduler spins off a separate task for each item in the list. This concurrency happens automatically; as the workflow builder, you don't need to worry about writing async def or managing a thread pool.

Finally, after all the parallel web_research branches are completed, their individually returned dictionaries are automatically merged back into the main OverallState. Remember the Annotated[list, operator.add] we discussed at the beginning? Now it becomes crucial: fields defined with this kind of reducer, like sources_gathered, will have their results concatenated into a single list.

You might ask: what happens if one of the parallel searches fails or times out? This is exactly why a custom id is added to each Send payload. This ID flows directly into the trace logs, allowing you to pinpoint and debug the exact branch that failed.

If you remember from earlier, we have the following lines in our graph definition:

# Add conditional edge to proceed with search queries in a parallel branch
builder.add_conditional_edges(
    "generate_query", continue_to_web_research, ["web_research"]
)

You might be wondering: why do we need to declare the continue_to_web_research node as part of a conditional edge?

The crucial thing to realize is this: continue_to_web_research isn't just another step in the pipeline; it's a routing function.

The generate_query node can return a single query (when the user asks something trivial) or twenty. A static edge would force the workflow to invoke web_research exactly once, even when there's nothing to do. By implementing continue_to_web_research as a conditional edge, it decides at runtime whether to dispatch at all and, thanks to Send, how many parallel branches to spawn. If continue_to_web_research returns an empty list, LangGraph simply doesn't follow the edge, saving the round-trip to the search API.

Finally, this is again a software engineering best practice in action: generate_query focuses on generating queries, continue_to_web_research on routing, and web_research on execution: a clean separation of concerns.

3.5 Configuration management

🎯 The problem

For nodes to properly do their jobs, they need to know, for example:

  • Which LLM to use, and with what parameter settings (e.g., temperature)?
  • How many initial search queries should be generated?
  • What is the cap on total research loops and on per-run concurrency?
  • And many others…

In short, we need a clean, centralized way to manage these settings without cluttering our core logic.

💡 LangGraph’s Solution

LangGraph solves this by passing a single, standardized config into every node that needs it. This object acts as a universal container for run-specific settings.

Inside the node, a custom, typed helper class is used to intelligently parse this config object. This helper class implements a clear hierarchy for fetching values:

  • It first checks for an environment variable named after the setting (the upper-cased field name).
  • If no environment variable is set, it falls back to the overrides passed in the config object for the current run.
  • If still not found, it uses the defaults defined directly on this helper class.

> Let’s see some code!

Let's look at the implementation of the reflection node to see it in action.

def reflection(
    state: OverallState, 
    config: RunnableConfig
) -> ReflectionState:
    """LangGraph node that identifies knowledge gaps and generates 
      potential follow-up queries.

    Analyzes the current summary to identify areas for further research 
    and generates potential follow-up queries. Uses structured output to 
    extract the follow-up queries in JSON format.

    Args:
        state: Current graph state containing the running summary and 
               research topic
        config: Configuration for the runnable, including LLM provider 
                settings

    Returns:
        Dictionary with state update, including search_query key containing 
        the generated follow-up query
    """
    configurable = Configuration.from_runnable_config(config)
    # Increment the research loop count and get the reasoning model
    state["research_loop_count"] = state.get("research_loop_count", 0) + 1
    reasoning_model = state.get("reasoning_model") or configurable.reasoning_model

    # Format the prompt
    current_date = get_current_date()
    formatted_prompt = reflection_instructions.format(
        current_date=current_date,
        research_topic=get_research_topic(state["messages"]),
        summaries="nn---nn".join(state["web_research_result"]),
    )
    # init Reasoning Model
    llm = ChatGoogleGenerativeAI(
        model=reasoning_model,
        temperature=1.0,
        max_retries=2,
        api_key=os.getenv("GEMINI_API_KEY"),
    )
    result = llm.with_structured_output(Reflection).invoke(formatted_prompt)

    return {
        "is_sufficient": result.is_sufficient,
        "knowledge_gap": result.knowledge_gap,
        "follow_up_queries": result.follow_up_queries,
        "research_loop_count": state["research_loop_count"],
        "number_of_ran_queries": len(state["search_query"]),
    }

Only one line of boilerplate is needed in the node:

configurable = Configuration.from_runnable_config(config)

There are quite a few config-ish terms floating around (Configuration, config, configurable). Let's unpack them one by one, starting with Configuration:

import os
from pydantic import BaseModel, Field
from typing import Any, Optional

from langchain_core.runnables import RunnableConfig

class Configuration(BaseModel):
    """The configuration for the agent."""

    query_generator_model: str = Field(
        default="gemini-2.0-flash",
        metadata={
            "description": "The name of the language model to make use of for the agent's query generation."
        },
    )

    reflection_model: str = Field(
        default="gemini-2.5-flash-preview-04-17",
        metadata={
            "description": "The name of the language model to make use of for the agent's reflection."
        },
    )

    answer_model: str = Field(
        default="gemini-2.5-pro-preview-05-06",
        metadata={
            "description": "The name of the language model to make use of for the agent's answer."
        },
    )

    number_of_initial_queries: int = Field(
        default=3,
        metadata={"description": "The variety of initial search queries to generate."},
    )

    max_research_loops: int = Field(
        default=2,
        metadata={"description": "The utmost variety of research loops to perform."},
    )

    @classmethod
    def from_runnable_config(
        cls, config: Optional[RunnableConfig] = None
    ) -> "Configuration":
        """Create a Configuration instance from a RunnableConfig."""
        configurable = (
            config["configurable"] if config and "configurable" in config else {}
        )

        # Get raw values from environment or config
        raw_values: dict[str, Any] = {
            name: os.environ.get(name.upper(), configurable.get(name))
            for name in cls.model_fields.keys()
        }

        # Filter out None values
        values = {k: v for k, v in raw_values.items() if v is not None}

        return cls(**values)

This is the custom helper class we mentioned earlier. You can see that Pydantic is heavily used to define all the parameters for the agent. One thing to notice is that this class also defines an alternative constructor, from_runnable_config(). It creates a Configuration instance by pulling values from different sources while enforcing the override hierarchy we discussed in "💡 LangGraph's Solution" above.

config is the input to the from_runnable_config() method. Technically, it's a RunnableConfig, but it's really just a dictionary with optional metadata. In LangGraph, it's mainly used as a structured way to carry contextual information through the graph. For example, it can carry things like tags, tracing options, and, most importantly, a nested dictionary of overrides under the "configurable" key.

Finally, by calling the following in every node:

configurable = Configuration.from_runnable_config(config)

we create an instance of the Configuration class by combining data from three sources: environment variables, the values under config["configurable"], and finally the class defaults, in that order of precedence. So configurable is a fully initialized, ready-to-use object that gives the node access to all relevant settings, as in this line from the reflection node:

reasoning_model = state.get("reasoning_model") or configurable.reasoning_model

To recap: Configuration is the definition, config is the runtime input, and configurable is the result, i.e., the parsed configuration object your node uses.
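To see this precedence in action, here is a quick sketch built on the Configuration class above. The printed values assume the class defaults shown earlier and that no relevant environment variables are set beforehand:

import os

from agent.configuration import Configuration

# 1) Class defaults only
print(Configuration.from_runnable_config().max_research_loops)        # 2

# 2) Per-run override passed under the "configurable" key
cfg = {"configurable": {"max_research_loops": 5}}
print(Configuration.from_runnable_config(cfg).max_research_loops)     # 5

# 3) An environment variable (upper-cased field name) takes precedence over both
os.environ["MAX_RESEARCH_LOOPS"] = "1"
print(Configuration.from_runnable_config(cfg).max_research_loops)     # 1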

🎁 Bonus Read: What Didn’t We Cover?

LangGraph has a lot more to offer than what we can cover in this tutorial. As you build more complex agents, you'll probably find yourself asking questions like these:

1. Can I make my application more responsive?

LangGraph supports streaming, so you can output results token by token for a real-time user experience.
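For example, here is a minimal sketch of streaming node-level updates from our compiled graph (the question string is just a placeholder):

from langchain_core.messages import HumanMessage

# Emit state updates as each node finishes instead of waiting for the final answer.
for chunk in graph.stream(
    {"messages": [HumanMessage(content="What is LangGraph?")]},
    stream_mode="updates",
):
    print(chunk)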

2. What happens when an API call fails?

LangGraph implements retry and fallback mechanisms to handle errors.

3. How to avoid re-running expensive computations?

If some of your nodes need to perform expensive processing, you can use LangGraph's caching mechanism to cache node outputs. LangGraph also supports checkpoints, which let you save your graph's state and pick up where you left off. This is especially important if you have a long-running process that you want to pause and resume later.
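As a minimal sketch (not part of the repo), checkpointing can be enabled at compile time; the thread_id value below is just a placeholder:

from langchain_core.messages import HumanMessage
from langgraph.checkpoint.memory import MemorySaver

# Compile the same builder with an in-memory checkpointer so state is saved per thread.
graph_with_memory = builder.compile(
    name="pro-search-agent", checkpointer=MemorySaver()
)

# Invocations that share a thread_id resume from the last saved checkpoint.
thread = {"configurable": {"thread_id": "research-session-1"}}
graph_with_memory.invoke(
    {"messages": [HumanMessage(content="Research LangGraph checkpointing.")]},
    config=thread,
)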

4. Can I implement human-in-the-loop workflows?

Yes. LangGraph has built-in support for human-in-the-loop workflows. This allows you to pause the graph and wait for user input or approval before proceeding.

5. How can I trace my agent’s behavior?

LangGraph integrates natively with LangSmith, which provides detailed traces and observability into your agent’s behaviors with minimal setup.

6. How can my agent automatically discover and use new tools?

LangGraph supports MCP (Model Context Protocol) integrations. This enables it to auto-discover and use tools that follow this open standard.

Take a look at the LangGraph official docs for more details.

📌Key takeaways

Let's recap what we've covered in this section:

  • Structured output: Use .with_structured_output to force the LLM's response to fit a specific structure you define. This way, you reliably get clean, validated data that your downstream steps can easily parse.
  • Tool calling: You can embed tools in the model calls so that the agent can interact with the outside world.
  • Conditional routing: This is how you build "choose your own adventure" logic. A node can decide where to go next simply by returning the name of the next node. This way, you can dynamically create loops and decision points, making your agent's workflow much more intelligent.
  • Parallel processing: LangGraph allows you to trigger multiple steps to run at the same time. All the heavy lifting of fanning out the jobs and fanning back in to collect the results is handled automatically by LangGraph.
  • Configuration management: Instead of scattering settings throughout your code, you can use a dedicated Configuration class to manage runtime settings, environment variables, defaults, etc., in one clean, central place.
Figure 8. Various elements of enhancing LLM agent capabilities. (Image by author)

4. Conclusions

We have covered a lot of ground in this post! Now that we've seen how LangGraph's core concepts come together to build a real-world research agent, let's conclude our journey with a few key takeaways:

  • Graphs naturally describe agentic workflows. Real-world workflows involve loops, branches, and dynamic decisions. LangGraph's graph-based architecture (nodes, edges, and state) provides a clean and intuitive way to represent and manage this complexity.
  • State is the agent's memory. The central OverallState object is a shared whiteboard that every node in the graph can read from and write to. Together with the node-specific state schemas, it forms the agent's memory system.
  • Nodes are modular, reusable components. In LangGraph, it is best to build nodes with clear responsibilities, e.g., generating queries, calling tools, or routing logic. This makes the agentic system easier to test, maintain, and extend.
  • Control is in your hands. In LangGraph, you can direct the logical flow with conditional edges, enforce data reliability with structured outputs, use centralized configuration to tune parameters globally, and use Send to achieve parallel execution of tasks. Together, these give you the power to build smart, efficient, and reliable agents.

Now, with all this LangGraph knowledge in hand, what do you want to build next?
