Constructing a LangGraph Agent from Scratch

The term “AI agent” is arguably the most popular term in tech right now. Agents emerged after the LLM hype, when people realized that even the newest LLMs, impressive as they are, only perform well on tasks they have been explicitly trained for. In that sense, plain LLMs have no tools that would allow them to do anything outside their scope of information.

RAG

To handle this, Retrieval-Augmented Generation (RAG) was later introduced: it retrieves additional context from external data sources and injects it into the prompt, so the LLM becomes aware of more context. We can roughly say that RAG made the LLM more knowledgeable, but for more complex problems, the LLM + RAG approach still failed when the answer path was not known in advance.

Figure: a RAG pipeline.

Agents

Agents are a remarkable concept built around LLMs, introducing state, decision-making, and memory. An agent can be thought of as an LLM equipped with a set of predefined tools that it can call to analyze intermediate results and store them in memory before producing the final answer.

LangGraph

LangGraph is a popular framework for creating agents. As the name suggests, agents are constructed as graphs with nodes and edges.

Nodes represent the processing steps that update the agent’s state, which evolves over time. Edges define the control flow by specifying transition rules and conditions between nodes.

To better understand LangGraph in practice, we will go through a detailed example. While LangGraph might seem too verbose for the problem below, it pays off much more on complex problems with large graphs.

First, we need to install the necessary libraries.

langgraph==1.0.5
langchain-community==0.4.1
jupyter==1.1.1
notebook==7.5.1
langchain[openai]
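
Assuming these pinned versions live in a requirements.txt file (the file name is our assumption; any name passed to pip works), everything can be installed with a single command:

pip install -r requirements.txt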

Then we import the necessary modules.

import os
from dotenv import load_dotenv
import json
from pydantic import BaseModel
from typing import Optional, List
from langgraph.graph import StateGraph, START, END
from langchain_core.messages import HumanMessage
from langchain.chat_models import init_chat_model
from langchain.tools import tool
from IPython.display import Image, display

We also need to create a .env file and add an OPENAI_API_KEY there:

OPENAI_API_KEY=...

Then, with load_dotenv(), we can load the environment variables into the system.

load_dotenv()
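
If the key was picked up correctly, os.getenv() will now return it. Below is a minimal sanity check (our addition, assuming the variable is named exactly OPENAI_API_KEY as above):

assert os.getenv('OPENAI_API_KEY'), 'OPENAI_API_KEY is not set'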

Extra functionality

The function below will help us visually display the constructed graphs.

def display_graph(graph):
    return display(Image(graph.get_graph().draw_mermaid_png()))

Agent

Let us initialize an LLM based on GPT-5-nano using a simple command:

llm = init_chat_model("openai:gpt-5-nano")
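
As a quick smoke test (a hypothetical call of our own that assumes the API key is valid and consumes a few tokens), we can send the model a trivial message:

print(llm.invoke([HumanMessage(content='Reply with one word: ready')]).content)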

State

In our example, we will construct an agent capable of answering questions about soccer. Its thought process will be based on retrieved statistics about players.

To do that, we need to define a state. In our case, it will be an entity containing all the information the LLM needs about a player. To define a state, we need to write a class that inherits from pydantic's BaseModel:

class PlayerState(BaseModel):
    query: str
    selected_tools: Optional[List[str]] = None
    name: Optional[str] = None
    club: Optional[str] = None
    country: Optional[str] = None
    number: Optional[int] = None
    rating: Optional[int] = None
    goals: Optional[List[int]] = None
    minutes_played: Optional[List[int]] = None
    summary: Optional[str] = None

When moving between LangGraph nodes, each node takes the current PlayerState instance as input and returns the updates to apply to it. Our task is to define exactly how that state is processed.
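
As a small illustration (hypothetical usage, not part of the original example): since every field except query is optional, we can create a state from a bare query and let the graph nodes fill in the rest.

state = PlayerState(query='How good is Haaland?')
print(state.name)   # None: not extracted yet
print(state.query)  # How good is Haaland?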

Tools

First, we will define some of the tools the agent can use. A tool can be roughly thought of as an additional function that the agent can call to retrieve the information needed to answer a user’s query.

To define a tool, we need to write a function with the @tool decorator.

To keep our examples simple, we will use mock data instead of real data retrieved from external sources, as is usually the case in production applications.

The first tool returns information about a player’s club and country by name.

@tool
def fetch_player_information_tool(name: str):
    """Incorporates information concerning the football club of a player and its country"""
    data = {
        'Haaland': {
            'club': 'Manchester City',
            'country': 'Norway'
        },
        'Kane': {
            'club': 'Bayern',
            'country': 'England'
        },
        'Lautaro': {
            'club': 'Inter',
            'country': 'Argentina'
        },
        'Ronaldo': {
            'club': 'Al-Nassr',
            'country': 'Portugal'
        }
    }
    if name in data:
        print(f"Returning player information: {data[name]}")
        return data[name]
    else:
        return {
            'club': 'unknown',
            'country': 'unknown'
        }

def fetch_player_information(state: PlayerState):
    return fetch_player_information_tool.invoke({'name': state.name})

You might be asking why we wrap a tool inside another function, which seems like over-engineering. In fact, these two functions have different responsibilities.

The fetch_player_information() function takes a state as a parameter and is compatible with the LangGraph framework. It extracts the name field and calls the tool, which operates at the parameter level.

This provides a clear separation of concerns and allows the same tool to be easily reused across multiple graph nodes.
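
To make the distinction concrete, here is how the two levels could be called in isolation (hypothetical direct calls; inside the graph, LangGraph invokes the wrapper for us):

# Parameter level: the tool itself
fetch_player_information_tool.invoke({'name': 'Kane'})
# -> {'club': 'Bayern', 'country': 'England'}

# State level: the graph node wrapper
fetch_player_information(PlayerState(query='', name='Kane'))
# -> {'club': 'Bayern', 'country': 'England'}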

Then we have a similar function that retrieves a player’s jersey number:

@tool
def fetch_player_jersey_number_tool(name: str):
    "Returns player jersey number"
    data = {
        'Haaland': 9,
        'Kane': 9,
        'Lautaro': 10,
        'Ronaldo': 7
    }
    if name in data:
        print(f"Returning player number: {data[name]}")
        return {'number': data[name]}
    else:
        return {'number': 0}

def fetch_player_jersey_number(state: PlayerState):
    return fetch_player_jersey_number_tool.invoke({'name': state.name})

For the third tool, we will fetch the player’s FIFA rating:

@tool
def fetch_player_rating_tool(name: str):
    "Returns player rating within the FIFA"
    data = {
        'Haaland': 92,
        'Kane': 89,
        'Lautaro': 88,
        'Ronaldo': 90
    }
    if name in data:
        print(f"Returning rating data: {data[name]}")
        return {'rating': data[name]}
    else:
        return {'rating': 0}

def fetch_player_rating(state: PlayerState):
    return fetch_player_rating_tool.invoke({'name': state.name})

Now, let us write several more graph node functions that retrieve external data. We will not label them as tools this time, which means the agent will not decide whether to call them; they always run.

def retrieve_goals(state: PlayerState):
    name = state.name
    data = {
        'Haaland': [25, 40, 28, 33, 36],
        'Kane': [33, 37, 41, 38, 29],
        'Lautaro': [19, 25, 27, 24, 25],
        'Ronaldo': [27, 32, 28, 30, 36]
    }
    if name in data:
        return {'goals': data[name]}
    else:
        return {'goals': [0]}
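
Note that a node returns only the fields it updates rather than a full PlayerState; LangGraph merges this partial dictionary into the current state. For example (a hypothetical direct call):

retrieve_goals(PlayerState(query='', name='Lautaro'))
# -> {'goals': [19, 25, 27, 24, 25]}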

Here is a graph node that retrieves the number of minutes played over the last several seasons.

def retrieve_minutes_played(state: PlayerState):
    name = state.name
    data = {
        'Haaland': [2108, 3102, 3156, 2617, 2758],
        'Kane': [2924, 2850, 3133, 2784, 2680],
        'Lautaro': [2445, 2498, 2519, 2773],
        'Ronaldo': [3001, 2560, 2804, 2487, 2771]
    }
    if name in data:
        return {'minutes_played': data[name]}
    else:
        return {'minutes_played': [0]}

Below is a node that extracts a player’s name from a user query.

def extract_name(state: PlayerState):
    query = state.query
    prompt = f"""
You're a football name extractor assistant.
Your goal is to only extract a surname of a footballer in the next query.
User query: {query}
You've to only output a string containing one word - footballer surname.
    """
    response = llm.invoke([HumanMessage(content=prompt)]).content
    print(f"Player name: ", response)
    return {'name': response}
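
We can sanity-check this node in isolation as well (a hypothetical call that requires a valid API key; the exact output depends on the model):

extract_name(PlayerState(query='How many goals did Kane score?'))
# -> {'name': 'Kane'} (if the model extracts the surname correctly)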

Now is when things get interesting. Do you remember the three tools we defined above? Thanks to them, we can now create a planner that asks the LLM to choose which tools to call based on the context:

def planner(state: PlayerState):
    query = state.query
    prompt = f"""
You're a football player summary assistant.
You've the next tools available: ['fetch_player_jersey_number', 'fetch_player_information', 'fetch_player_rating']
User query: {query}
Resolve which tools are required to reply.
Return a JSON list of tool names, e.g. ['fetch_player_jersey_number', 'fetch_rating']
    """
    response = llm.invoke([HumanMessage(content=prompt)]).content
    try:
        selected_tools = json.loads(response)
    except:
        selected_tools = []
    return {'selected_tools': selected_tools}
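
One fragile spot here is calling json.loads() on raw LLM output: the model may wrap the list in prose or produce invalid JSON. A slightly hardened variant (a sketch of our own; parse_tool_list and KNOWN_TOOLS are hypothetical names, not part of the original example) also validates the result against the known tool names:

KNOWN_TOOLS = {'fetch_player_jersey_number', 'fetch_player_information', 'fetch_player_rating'}

def parse_tool_list(response: str) -> List[str]:
    """Parse the LLM's answer, keeping only recognized tool names."""
    try:
        parsed = json.loads(response)
    except json.JSONDecodeError:
        return []
    if not isinstance(parsed, list):
        return []
    return [name for name in parsed if name in KNOWN_TOOLS]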

In our case, we will ask the agent to create a summary of a soccer player. It will decide on its own which tools to call to retrieve additional data. The docstrings under the tools play a crucial role: they provide the agent with additional context about what each tool does.

Below is our final graph node, which takes the fields retrieved in the previous steps and calls the LLM to generate the final summary.

def write_summary(state: PlayerState):
    query = state.query
    data = {
        'name': state.name,
        'country': state.country,
        'number': state.number,
        'rating': state.rating,
        'goals': state.goals,
        'minutes_played': state.minutes_played,
    }
    prompt = f"""
You're a football reporter assistant.
Given the next data and statistics of the football player, you should have to create a markdown summary of that player.
Player data:
{json.dumps(data, indent=4)}
The markdown summary has to incorporate the next information:

- Player full name (if only first name or last name is provided, attempt to guess the total name)
- Player country (also add flag emoji)
- Player number (also add the number within the emoji(-s) form)
- FIFA rating
- Total variety of goals in last 3 seasons
- Average variety of minutes required to attain one goal
- Response to the user query: {query}
    """
    response = llm.invoke([HumanMessage(content=prompt)]).content
    return {"summary": response}

Graph construction

We now have all the elements to construct a graph. First, we initialize the graph using the StateGraph constructor. Then, we add nodes to the graph one by one using the add_node() method. It takes two parameters: a string used to assign a name to the node, and the callable associated with the node, which takes the graph state as its only parameter.

graph_builder = StateGraph(PlayerState)
graph_builder.add_node('extract_name', extract_name)
graph_builder.add_node('planner', planner)
graph_builder.add_node('fetch_player_jersey_number', fetch_player_jersey_number)
graph_builder.add_node('fetch_player_information', fetch_player_information)
graph_builder.add_node('fetch_player_rating', fetch_player_rating)
graph_builder.add_node('retrieve_goals', retrieve_goals)
graph_builder.add_node('retrieve_minutes_played', retrieve_minutes_played)
graph_builder.add_node('write_summary', write_summary)

Right now, our graph consists only of nodes. We need to add edges to it. Edges in LangGraph are directed and are added via the add_edge() method, specifying the names of the start and end nodes.

The only thing we need to pay attention to is the planner, which behaves slightly differently from the other nodes. As shown above, it returns the selected_tools field, which can contain from zero to three node names.

For that, we need to use the add_conditional_edges() method, which takes three parameters:

  • The planner node name;
  • A callable that takes the graph state and returns a list of strings with the names of the nodes to call next;
  • A dictionary mapping the strings from the second parameter to node names.

In our case, we define the route_tools() function to simply return the state.selected_tools field produced by the planner.

def route_tools(state: PlayerState):
    return state.selected_tools or []
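
For example (a hypothetical direct call), if the planner selected two tools, route_tools() hands both names back to LangGraph, which then runs both branches:

state = PlayerState(query='', selected_tools=['fetch_player_rating', 'fetch_player_information'])
route_tools(state)  # -> ['fetch_player_rating', 'fetch_player_information']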

Then we can add the edges:

graph_builder.add_edge(START, 'extract_name')
graph_builder.add_edge('extract_name', 'planner')
graph_builder.add_conditional_edges(
    'planner',
    route_tools,
    {
        'fetch_player_jersey_number': 'fetch_player_jersey_number',
        'fetch_player_information': 'fetch_player_information',
        'fetch_player_rating': 'fetch_player_rating'
    }
)
graph_builder.add_edge('fetch_player_jersey_number', 'retrieve_goals')
graph_builder.add_edge('fetch_player_information', 'retrieve_goals')
graph_builder.add_edge('fetch_player_rating', 'retrieve_goals')
graph_builder.add_edge('retrieve_goals', 'retrieve_minutes_played')
graph_builder.add_edge('retrieve_minutes_played', 'write_summary')
graph_builder.add_edge('write_summary', END)

START and END are LangGraph constants used to define the graph’s start and end points.

The last step is to compile the graph. We can optionally visualize it using the helper function defined above.

graph = graph_builder.compile()
display_graph(graph)

Figure: the compiled graph diagram rendered by display_graph().

Example

We are now finally able to use our graph! To do so, we call the invoke() method and pass a dictionary containing the query field with a custom user query:

result = graph.invoke({
    'query': 'Will Haaland be able to win the FIFA World Cup for Norway in 2026 based on his recent performance and stats?'
})
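
The result is a plain dictionary mirroring the PlayerState fields, so we can print just the generated summary:

print(result['summary'])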

And here is an example result we can obtain!

{'query': 'Will Haaland be able to win the FIFA World Cup for Norway in 2026 based on his recent performance and stats?',
 'selected_tools': ['fetch_player_information', 'fetch_player_rating'],
 'name': 'Haaland',
 'club': 'Manchester City',
 'country': 'Norway',
 'rating': 92,
 'goals': [25, 40, 28, 33, 36],
 'minutes_played': [2108, 3102, 3156, 2617, 2758],
 'summary': '- Full name: Erling Haaland\n- Country: Norway 🇳🇴\n- Number: N/A\n- FIFA rating: 92\n- Total goals in last 3 seasons: 97 (28 + 33 + 36)\n- Average minutes per goal (last 3 seasons): 87.95\n- Will Haaland win the FIFA World Cup for Norway in 2026 based on recent performance and stats?\n  - Short answer: Not guaranteed. Haaland remains among the world’s top forwards (92 rating, elite goal output), and he could be a key factor for Norway. However, World Cup success is a team achievement depending on Norway’s overall squad quality, depth, tactics, injuries, and tournament context. Based on statistics alone, he strengthens Norway’s chances, but a World Cup title in 2026 cannot be predicted with certainty.'}

A cool thing is that we can observe the entire state of the graph and analyze which tools the agent chose to generate the final answer. The final summary looks great!

Conclusion

In this article, we have examined AI agents, which have opened a new chapter for LLMs. Equipped with tools, state, and decision-making, LLMs now have much greater potential to solve complex tasks.

The example in this article introduced us to LangGraph, one of the most popular frameworks for constructing agents. Its simplicity and elegance allow us to build complex decision chains. While LangGraph may seem like overkill for our simple example, it becomes extremely useful for larger projects where states and graph structures are much more complex.
