like OpenAI’s GPT-5.4 and Anthropic’s Opus 4.6 have demonstrated outstanding capabilities in executing long-running agentic tasks.
Consequently, we see increased use of LLM agents across individual and enterprise settings to perform complex tasks, such as running financial analyses, building apps, and conducting extensive research.
These agents, whether a part of a highly autonomous setup or a pre-defined workflow, can execute multi-step tasks using tools to realize goals with minimal human oversight.
Nonetheless, ‘minimal’ doesn’t mean zero human oversight.
On the contrary, human review remains vital due to LLMs’ inherent probabilistic nature and the potential for errors.
These errors can propagate and compound along the workflow, especially when we chain multiple agentic components together.
You may have noticed the impressive progress agents have made in the coding domain. The reason is that code is relatively easy to verify (i.e., it either runs or fails, and feedback is visible immediately).
But in areas like content creation, research, or decision-making, correctness is often subjective and harder to evaluate automatically.
That is why human-in-the-loop (HITL) design remains critical.
In this article, we will walk through how to use LangGraph to set up a human-in-the-loop agentic workflow for content generation and publication on Bluesky.
Contents
(1) Primer to LangGraph
LangGraph (a part of the LangChain ecosystem) is a low-level agent orchestration framework and runtime for constructing agentic workflows.
It is my go-to framework given its high degree of control and customizability, which is important for production-grade solutions.
While LangChain offers a middleware object (HumanInTheLoopMiddleware) to easily get started with human oversight in agent calls, it operates at a high level of abstraction that masks the underlying mechanics.
LangGraph, by contrast, does not abstract away the prompts or architecture, giving us the finer degree of control that we need. It explicitly lets us define:
- How data flows between steps
- Where decisions and code executions occur
- Where human intervention is required
We will therefore use LangGraph to demonstrate the HITL concept within an agentic workflow.
It’s also helpful to differentiate between agentic workflows and autonomous AI agents.
Agentic workflows have predetermined paths and are designed to execute in a defined order, with LLMs and/or agents integrated into one or more components. In contrast, AI agents autonomously plan, execute, and iterate towards a goal.
In this article, we focus on agentic workflows, in which we deliberately insert human checkpoints into a pre-defined flow.
(2) Example Workflow
For our example, we will construct a social media content generation workflow as follows:

- User enters a subject of interest (e.g., “”).
- The web search node uses the Tavily tool to search online for articles matching the topic.
- The top search result is selected and fed into an LLM in the content-creation node to generate a social media post.
- In the review node, there are two human review checkpoints:
(i) Present generated content for humans to approve, reject, or edit;
(ii) Upon approval, the workflow triggers the Bluesky API tool and requests final confirmation before posting it online.
Here’s what it looks like when run from the terminal:

And here is the live post on my Bluesky profile:

Bluesky is a social platform similar to Twitter (X); it was chosen for this demo because its API is much easier to access and use.
(3) Key Concepts
The core mechanism behind the HITL setup in LangGraph is the concept of interrupts.
Interrupts (using interrupt() and Command in LangGraph) enable us to pause graph execution at specific points, display certain information to the human, and await their input before resuming the workflow.
Command is a flexible object that lets us update the graph state (update), specify the next node to execute (goto), or capture the value with which to resume graph execution (resume).
Here’s what the flow looks like:
(1) Upon reaching the interrupt() function, execution pauses, and the payload passed into it is shown to the user. The payload passed to interrupt() should typically be a JSON-serializable object or a string, e.g.,
decision = interrupt("Should we get KFC for lunch?") # String shown to user
(2) After the user responds, we pass the response value back to the graph to resume execution. This involves using Command and its resume parameter as part of re-invoking the graph:
if human_response == "yes":
    return graph.invoke(Command(resume="KFC"), config=config)  # same thread_id config as the first invocation
else:
    return graph.invoke(Command(resume="McDonalds"), config=config)
(3) The response value passed to resume is returned in the decision variable, which the node uses for the rest of its execution and the subsequent graph flow:
if decision == "KFC":
    return Command(goto="kfc_order_node", update={"lunch_choice": "KFC"})
else:
    return Command(goto="mcd_order_node", update={"lunch_choice": "McDonalds"})
Interrupts are dynamic and can be placed anywhere in the code, unlike static breakpoints, which are fixed before or after specific nodes.
That said, we typically place interrupts either inside the nodes or inside the tools called during graph execution.
Finally, let’s talk about checkpointers. When a workflow pauses at an interrupt, we need a way to save its current state so it can resume later.
We therefore need a checkpointer to persist the state so that it is not lost during the interrupt pause. Think of a checkpoint as a snapshot of the graph state at a given point in time.
For development, it is acceptable to save the state in memory with the InMemorySaver checkpointer.
For production, it is better to use stores like Postgres or Redis. With that in mind, we will use the SQLite checkpointer in this example instead of an in-memory store.
To ensure the graph resumes exactly at the point where the interrupt occurred, we need to pass and use the same thread ID.
The thread ID is passed into config on each graph invocation so that LangGraph knows which state to resume from after the interrupt.
Now that we have covered the concepts of interrupts, Command, checkpoints, and threads, let’s get into the code walkthrough.
(4) Code Walkthrough
(4.1) Initial Setup
We start by installing the required dependencies and generating API keys for Bluesky, OpenAI, LangChain, LangGraph, and Tavily.
# requirements.txt
langchain-openai>=1.1.9
langgraph>=1.0.8
langgraph-checkpoint-sqlite>=3.0.3
openai>=2.20.0
tavily-python>=0.7.21
# env.example
export OPENAI_API_KEY=your_openai_api_key
export TAVILY_API_KEY=your_tavily_api_key
export BLUESKY_HANDLE=yourname.bsky.social
export BLUESKY_APP_PASSWORD=your_bluesky_app_password
(4.2) Define State
We set up the State, which is the shared, structured data object serving as the graph’s central memory. It includes fields that capture key information, like post content and approval status.
The post_data key is where the generated post content will be stored.
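A minimal sketch of such a State schema is shown below; only post_data is named in the text, so the other fields are illustrative assumptions:

```python
from typing import Optional, TypedDict

class State(TypedDict):
    topic: str                       # user's topic of interest
    search_results: list             # articles returned by the web search
    post_data: Optional[dict]        # generated post content awaiting review
    approval_status: Optional[str]   # outcome of the human review
```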
(4.3) Interrupt at node level
We mentioned earlier that interrupts can occur at the node level or inside tool calls. Let us see how the former works by setting up the human review node.
The aim of the review node is to pause execution and present the draft content to the user for review.
Here we see interrupt() in action (lines 8 to 13), where graph execution pauses in the first section of the node function.
The details key passed into interrupt() contains the generated content, while the action key triggers a handler function (handle_content_interrupt()) to support the review:
The generated content is printed in the terminal for the user to view, and they can approve it as-is, reject it outright, or edit it directly in the terminal before approving.
Based on the choice, the handler function returns one of three values:
- True (approved),
- False (rejected), or
- a string containing the user-edited content (edited).
This return value is passed back to the review node using graph.invoke(Command(resume=...)), which resumes execution from where interrupt() was called (line 15) and determines which node to go to next: approve, reject, or edit content and proceed to approve.
(4.4) Interrupt at Tool level
Interrupts may also be defined on the tool call level. That is demonstrated in the following human review checkpoint within the approve node before the content is published online on Bluesky.
As a substitute of placing interrupt() inside a node, we place it inside the publish_post tool that creates posts via the Bluesky API:
Just as at the node level, we call a handler function (handle_publish_interrupt) to capture the human decision:
The return value from this review step is either:
{"action": "confirm"} or {"action": "cancel"}.
The latter part of the code (i.e., from line 19) in the publish_post tool uses this return value to determine whether to proceed with publishing the post on Bluesky.
(4.5) Setup Graph with Checkpointer
Next, we connect the nodes in a graph for compilation and introduce a SQLite checkpointer to capture snapshots of the state at each interrupt.
Note that we pass check_same_thread=False when creating the SQLite connection, so the checkpointer can be used safely across the threads involved in graph execution.
(4.6) Setup Full Workflow with Config
With the graph ready, we now place it into a workflow that kickstarts the content generation pipeline.
This workflow includes configuring a thread ID, which is passed to every graph.invoke() call. This ID is the link that ties the invocations together, so that the graph pauses at an interrupt and resumes from where it left off.
You might have noticed the __interrupt__ key in the code above. It is simply a special key that LangGraph adds to the result every time an interrupt() is hit.
In other words, it is the signal indicating that graph execution has paused and is waiting for human input before continuing.
By checking __interrupt__ in a while loop, we keep testing whether an interrupt is still pending. Once the interrupt is resolved, the key disappears and the while loop exits.
With the workflow complete, we will run it like this:
run_hitl_workflow(query="latest news about Anthropic")
(5) Best Practices of Interrupts
While interrupts are powerful in enabling HITL workflows, they can be disruptive if used incorrectly.
As such, I recommend reading the LangGraph documentation. Here are some practical rules to keep in mind:
- Don’t wrap interrupt calls in try/except blocks, or they may not pause execution properly
- Keep interrupt calls in the same order each time, and don’t skip or rearrange them
- Only pass JSON-safe values into interrupts and avoid complex objects
- Ensure that any code before an interrupt can safely run multiple times (i.e., is idempotent), or move it after the interrupt
For instance, I faced an issue in the web search node, where I placed an interrupt right after the Tavily search. The intention was to pause and allow users to review the search results for content generation.
But because interrupts work by rerunning the node they were called from, the node simply reran the web search and passed along a different set of search results from those I had approved earlier.
Therefore, interrupts work best as a gate before an action; if we use them after a non-deterministic step (like search), we need to persist the result first, or we risk getting something different on resume.
Wrapping It Up
Human review can seem like a bottleneck in agentic tasks, but it remains critical, especially in domains where outcomes are subjective or hard to verify.
LangGraph makes it straightforward to construct HITL workflows with interrupts and checkpointing.
The challenge, then, is deciding where to place those human decision points to strike a good balance between oversight and efficiency.
