Needing to step in and guide your AI agent in the middle of a workflow is a common pain point. If you have built your own agentic applications, you’ve most likely already seen this happen.
While LLMs nowadays are incredibly capable, they’re still not quite ready to run fully autonomously in a complex workflow. For any practical agentic application, human input is still very much needed for making critical decisions and necessary course corrections.
This is where human-in-the-loop patterns come in. And the good news is, you can easily implement them in LangGraph.
In my previous post (LangGraph 101: Let’s build a deep research agent), we thoroughly explained the core concepts of LangGraph and walked through in detail how to use LangGraph to build a practical deep research agent. We showed how the research agent can autonomously search, evaluate results, and iterate until it finds sufficient evidence to reach a comprehensive answer.
One loose end from that post is that the agent ran completely autonomously from start to finish. There was no entry point for human guidance or feedback.
Let’s fix that in this tutorial!
So, here’s our game plan: we’ll take the same research agent and enhance it with human-in-the-loop functionalities. You’ll see exactly how to implement checkpoints that allow human feedback to make your agents more reliable and trustworthy.
If you’re new to LangGraph or want a refresher on the core LangGraph concepts, I highly encourage you to check out my previous post. I’ll try to make the current post self-contained, but I may skip some explanations for the sake of space. You can find more detailed descriptions in my previous post.
1. Problem Statement
In this post, we build upon the deep research agent from the previous post and add human-in-the-loop checkpoints so that the user can review the agent’s decisions and provide feedback.
As a quick reminder, our deep research agent works like this:
It takes in a user query, autonomously searches the web, examines the search results it obtains, and then decides if enough information has been found. If so, it proceeds to create a well-crafted mini-report with proper citations; otherwise, it circles back to dig deeper with more searches.
The illustration below shows the delta we’re building: the left depicts the workflow of the original deep research agent, the right represents the same agentic workflow but with human-in-the-loop augmentation.
Notice that we’ve added two human-in-the-loop checkpoints in the enhanced workflow on the right:
- Checkpoint 1 is introduced right after the agent generates its initial search queries. The objective here is to allow the user to review and refine the search strategy before any web searches start.
- Checkpoint 2 happens during the iterative search loop. This is when the agent decides if it needs more information, i.e., whether to conduct more searches. Adding a checkpoint here gives the user the opportunity to look at what the agent has found so far, determine whether sufficient information has indeed been gathered, and, if not, decide what further search queries to use.
Simply by adding those two checkpoints, we effectively transform a fully autonomous agentic workflow into an LLM-human collaborative one. The agent still handles the heavy lifting, e.g., generating queries, searching, synthesizing results, and proposing further queries, but the user now has intervention points to weave in their judgment.
This is a human-in-the-loop research workflow in action.
2. Mental Model: Graphs, Edges, Nodes, and Human-in-the-Loop
Let’s first establish a solid mental model before checking out the code. We’ll briefly discuss LangGraph’s core and human-in-the-loop mechanism. For a more thorough discussion on LangGraph in general, please refer to LangGraph 101: Let’s build a deep research agent.
2.1 Workflow Representation
LangGraph represents workflows as directed graphs. Each step in your agent’s workflow becomes a node. Essentially, a node is a function where all the actual work is done. To link the nodes, LangGraph uses edges, which basically define how the workflow moves from one step to the next.
Specific to our research agent, nodes would be those boxes in Figure 1, handling tasks such as “generate search queries,” “search the web,” or “reflect on results.” Edges are the arrows, determining the flow such as whether to continue searching or generate the final answer.
2.2 State Management
Now, as our research agent moves through different nodes, it needs to keep track of things it has learned and generated. LangGraph realizes this functionality by maintaining a central state object, which you can think of as a shared whiteboard that every node in the graph can look at and write on.
This way, each node can receive the current state, do its work, and return only the parts it wants to update. LangGraph would then automatically merge these updates into the main state, before passing it to the next node.
This approach allows LangGraph to handle all the state management at the framework level, so that individual nodes only need to focus on their specific tasks. It makes workflows highly modular—you can easily add, remove, or reorder nodes without breaking the state flow.
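To make this concrete, here is a toy example (not part of our research agent) showing the pattern: a TypedDict state acts as the shared whiteboard, and a node returns only the keys it wants to update, which LangGraph merges back into the state.
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

# A toy state: the shared "whiteboard" every node can read and write
class ToyState(TypedDict):
    topic: str
    notes: list

def take_notes(state: ToyState) -> dict:
    # Read from the current state, return only the keys we want to update
    return {"notes": state["notes"] + [f"note about {state['topic']}"]}

builder = StateGraph(ToyState)
builder.add_node("take_notes", take_notes)
builder.add_edge(START, "take_notes")
builder.add_edge("take_notes", END)
graph = builder.compile()

print(graph.invoke({"topic": "quantum computing", "notes": []}))
# {'topic': 'quantum computing', 'notes': ['note about quantum computing']}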
2.3 Human-in-the-Loop
Now, let’s talk about human-in-the-loop. In LangGraph, this is achieved by introducing an interruption mechanism. Here is how this pattern works:
- Inside a node, you insert a checkpoint. When the graph execution reaches this designated checkpoint, LangGraph would pause the workflow and present relevant information to the human.
- The human can then review this information and decide whether to edit/approve what the agent suggests.
- Once the human provides the input, the workflow resumes the graph run (identified by an ID) exactly from the same node. The node restarts from the top, but when it reaches the inserted checkpoint, it fetches the human’s input instead of pausing. The graph execution continues from there.
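Here is a minimal, self-contained sketch of this pause-and-resume pattern, separate from our research agent (the checkpointer and thread ID pieces are explained in Sections 3 and 4):
from typing import TypedDict
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.graph import StateGraph, START, END
from langgraph.types import Command, interrupt

class State(TypedDict):
    plan: str

def review_plan(state: State) -> dict:
    # Pause here and surface the draft plan to the human; on resume,
    # interrupt() returns whatever was passed via Command(resume=...)
    human = interrupt({"draft": state["plan"]})
    return {"plan": human["plan"]}

builder = StateGraph(State)
builder.add_node("review_plan", review_plan)
builder.add_edge(START, "review_plan")
builder.add_edge("review_plan", END)
graph = builder.compile(checkpointer=InMemorySaver())

config = {"configurable": {"thread_id": "demo"}}
graph.invoke({"plan": "draft plan"}, config=config)                   # pauses at interrupt()
graph.invoke(Command(resume={"plan": "edited plan"}), config=config)  # resumes with human input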
With this conceptual foundation in place, let’s see how to translate this human-in-the-loop augmented deep research agent into an actual implementation.
3. From Concept to Code
In this post, we’ll build upon Google’s open-source implementation built with LangGraph and Gemini (Apache-2.0 licensed). It is a full-stack implementation, but for now, we’ll only focus on the backend logic (the backend/src/agent/ directory) where the research agent is defined.
Once you have forked the repo, you’ll see the following key files:
- configuration.py: defines the Configuration class that manages all configurable parameters for the research agent.
- graph.py: the main orchestration file that defines the LangGraph workflow. We’ll mainly work with this file.
- prompts.py: contains all the prompt templates used by different nodes.
- state.py: defines the TypedDict classes that represent the state passed between graph nodes.
- tools_and_schemas.py: defines Pydantic models for LLMs to produce structured outputs.
- utils.py: utility functions for processing searched data, e.g., extract & format URLs, add citations, etc.
Let’s start with graph.py and work from there.
3.1 Workflow
As a reminder, we aim to augment the existing deep research agent with human-in-the-loop verifications. Earlier, we mentioned that we want to add two checkpoints. In the flowchart below, you can see that two new nodes will be added to the existing workflow.

In LangGraph, the translation from flowchart to code is straightforward. Let’s start with creating the graph itself:
from langgraph.graph import StateGraph
from agent.state import (
OverallState,
QueryGenerationState,
ReflectionState,
WebSearchState,
)
from agent.configuration import Configuration
# Create our Agent Graph
builder = StateGraph(OverallState, config_schema=Configuration)
Here, we use StateGraph to define a state-aware graph. It accepts an OverallState class that defines what information can move between nodes, and a Configuration class that defines runtime-tunable parameters.
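For reference, here is a simplified sketch in the spirit of the repo’s state.py (the actual OverallState carries a few more fields, but the idea is the same): annotated fields use reducers to accumulate results across nodes and parallel branches, while plain fields are simply overwritten.
import operator
from typing import Annotated, TypedDict
from langgraph.graph.message import add_messages

class OverallState(TypedDict):
    # Conversation history; add_messages appends new messages instead of overwriting
    messages: Annotated[list, add_messages]
    # Accumulated across (possibly parallel) nodes via operator.add
    search_query: Annotated[list, operator.add]
    web_research_result: Annotated[list, operator.add]
    sources_gathered: Annotated[list, operator.add]
    # Plain scalars: the latest update wins
    initial_search_query_count: int
    max_research_loops: int
    research_loop_count: int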
Once we have the graph container, we can add nodes to it:
# Define the nodes we will cycle between
builder.add_node("generate_query", generate_query)
builder.add_node("web_research", web_research)
builder.add_node("reflection", reflection)
builder.add_node("finalize_answer", finalize_answer)
# New human-in-the-loop nodes
builder.add_node("review_initial_queries", review_initial_queries)
builder.add_node("review_follow_up_plan", review_follow_up_plan)
The add_node() method takes the first argument as the node’s name and the second argument as the function that will get executed when the node runs. Note that we have added two new human-in-the-loop nodes compared to the original implementation.
If you cross-compare the node names with the flowchart in Figure 2, you would see that essentially, we have one node reserved for every step. Later, we will examine the detailed implementation of those functions one by one.
Ok, now that we have the nodes defined, let’s add edges to connect them and define execution order:
from langgraph.graph import START, END
# Set the entrypoint as `generate_query`
# This means that this node is the first one called
builder.add_edge(START, "generate_query")
# Checkpoint #1
builder.add_edge("generate_query", "review_initial_queries")
# Add conditional edge to continue with search queries in a parallel branch
builder.add_conditional_edges(
"review_initial_queries", continue_to_web_research, ["web_research"]
)
# Reflect on the web research
builder.add_edge("web_research", "reflection")
# Checkpoint #2
builder.add_edge("reflection", "review_follow_up_plan")
# Evaluate the research
builder.add_conditional_edges(
"review_follow_up_plan", evaluate_research, ["web_research", "finalize_answer"]
)
# Finalize the answer
builder.add_edge("finalize_answer", END)
Note that we have wired the two human-in-the-loop checkpoints directly into the workflow:
- Checkpoint 1: after the generate_query node, the initial search queries are routed to review_initial_queries. Here, humans can review and edit/approve the proposed search queries before any web searches begin.
- Checkpoint 2: after the reflection node, the produced assessment, including the sufficiency flag and (if any) the proposed follow-up search queries, is routed to review_follow_up_plan. Here, humans can evaluate whether the assessment is accurate and adjust the follow-up plan accordingly.
The routing functions, i.e., continue_to_web_research and evaluate_research, handle the routing logic based on human decisions at these checkpoints.
A quick note on builder.add_conditional_edges(): it’s used to add conditional edges so that the flow may jump to different branches at runtime. It requires three key arguments: the source node, a routing function, and a list of possible destination nodes. The routing function examines the current state and returns the name of the next node to visit. continue_to_web_research is special here, as it doesn’t actually perform “decision-making” but rather enables parallel searching if there are multiple queries generated (or suggested by the human) in the first step. We’ll see its implementation later.
Finally, we put everything together and compile the graph into an executable agent:
from langgraph.checkpoint.memory import InMemorySaver
checkpointer = InMemorySaver()
graph = builder.compile(name="pro-search-agent", checkpointer=checkpointer)
Note that we have added a checkpointer object here, which is crucial for achieving human-in-the-loop functionality.
When your graph execution gets interrupted, LangGraph needs to dump the current state of the graph somewhere. That state includes things like all the work done so far, the data collected, and, of course, exactly where the execution paused. All this information is needed for the graph to resume seamlessly when human input is provided.
To save this “snapshot”, we have a couple of options. For development and testing purposes, InMemorySaver is a perfect option. It simply stores the graph state in memory, making it fast and simple to work with.
For production deployment, however, you’ll want to use something more sophisticated. For those cases, a proper database-backed checkpointer like PostgresSaver or SqliteSaver would be good options.
LangGraph abstracts this away, so switching from development to production requires only changing this one line of code—the rest of your graph logic remains unchanged. For now, we’ll just stick with the in-memory persistence.
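As a quick sketch of what that one-line swap might look like (assuming the langgraph-checkpoint-sqlite package is installed):
import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# Persist checkpoints to a local SQLite file instead of keeping them in memory
conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
checkpointer = SqliteSaver(conn)

graph = builder.compile(name="pro-search-agent", checkpointer=checkpointer)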
Next up, we’ll take a closer look at individual nodes and see what actions they take.
For the nodes that existed in the original implementation, I’ll keep the discussion brief since I have already covered them in detail in my previous post. In this post, our main focus will be on the two new human-in-the-loop nodes and how they implement the interrupt patterns we mentioned earlier.
3.2 LLM Models
Most of the nodes in our deep research agent are powered by LLMs. In configuration.py, we define the following Gemini models to drive them:
class Configuration(BaseModel):
"""The configuration for the agent."""
query_generator_model: str = Field(
default="gemini-2.5-flash",
metadata={
"description": "The name of the language model to use for the agent's query generation."
},
)
reflection_model: str = Field(
default="gemini-2.5-flash",
metadata={
"description": "The name of the language model to use for the agent's reflection."
},
)
answer_model: str = Field(
default="gemini-2.5-pro",
metadata={
"description": "The name of the language model to use for the agent's answer."
},
)
Note that these may differ from the original implementation; I recommend the Gemini 2.5 series models.
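Since Configuration.from_runnable_config() reads values from config["configurable"], you can also override these defaults per run at invoke time without touching the code. A quick sketch (the thread_id part is covered in Section 4):
config = {
    "configurable": {
        "thread_id": "session_1",              # needed for checkpointing (see Section 4)
        "query_generator_model": "gemini-2.5-flash",
        "reflection_model": "gemini-2.5-flash",
        "answer_model": "gemini-2.5-pro",
        "number_of_initial_queries": 3,
        "max_research_loops": 2,
    }
}
result = graph.invoke({"messages": [{"role": "user", "content": "..."}]}, config=config)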
3.3 Node #1: Generate Queries
The generate_query node is used to generate the initial search queries based on the user’s question. Here is how this node is implemented:
import os

from langchain_core.runnables import RunnableConfig
from langchain_google_genai import ChatGoogleGenerativeAI
from agent.prompts import (
get_current_date,
query_writer_instructions,
)
from agent.tools_and_schemas import SearchQueryList
from agent.utils import get_research_topic
def generate_query(
state: OverallState,
config: RunnableConfig
) -> QueryGenerationState:
"""LangGraph node that generates a search queries
based on the User's question.
Args:
state: Current graph state containing the User's question
config: Configuration for the runnable, including LLM
provider settings
Returns:
Dictionary with state update, including search_query key
containing the generated query
"""
configurable = Configuration.from_runnable_config(config)
# check for custom initial search query count
if state.get("initial_search_query_count") is None:
state["initial_search_query_count"] = configurable.number_of_initial_queries
# init Gemini model
llm = ChatGoogleGenerativeAI(
model=configurable.query_generator_model,
temperature=1.0,
max_retries=2,
api_key=os.getenv("GEMINI_API_KEY"),
)
structured_llm = llm.with_structured_output(SearchQueryList)
# Format the prompt
current_date = get_current_date()
formatted_prompt = query_writer_instructions.format(
current_date=current_date,
research_topic=get_research_topic(state["messages"]),
number_queries=state["initial_search_query_count"],
)
# Generate the search queries
result = structured_llm.invoke(formatted_prompt)
return {"query_list": result.query}
The structure of the LLM’s output is enforced using the SearchQueryList schema:
from typing import List
from pydantic import BaseModel, Field
class SearchQueryList(BaseModel):
query: List[str] = Field(
description="A list of search queries to be used for web research."
)
rationale: str = Field(
description="A brief explanation of why these queries are relevant to the research topic."
)
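Thanks to with_structured_output(SearchQueryList), the model call returns a validated SearchQueryList instance rather than free-form text. For illustration (the values here are made up), the object we get back looks like this:
# Illustrative: the shape of what structured_llm.invoke() returns
example = SearchQueryList(
    query=[
        "quantum computing breakthroughs 2025",
        "quantum error correction progress 2025",
    ],
    rationale="These queries cover recent hardware and algorithmic advances.",
)
print(example.query)      # a plain Python list we can hand to the next node
print(example.rationale)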
3.4 Node #2: Review Initial Queries
This is our first checkpoint. The idea here is that the user can review the initial queries proposed by the LLM and decide if they want to edit/approve the LLM’s output. Here is how we can implement it:
from langgraph.types import interrupt
def review_initial_queries(state: QueryGenerationState) -> QueryGenerationState:
# Retrieve LLM's proposals
suggested = state["query_list"]
# Interruption mechanism
human = interrupt({
"kind": "review_initial_queries",
"suggested": suggested,
"instructions": "Approve as-is, or return queries=[...]"
})
final_queries = human["queries"]
# Limit the total number of queries
cap = state.get("initial_search_query_count")
if cap:
final_queries = final_queries[:cap]
return {"query_list": final_queries}
Let’s break down what’s happening in this checkpoint node:
- First, we extract the search queries proposed by the previous generate_query node. These queries are what the human wants to review.
- The interrupt() function is where the magic happens. When the node execution hits this function, the entire graph is paused and the payload is presented to the human. The payload is defined in the dictionary passed to interrupt(). As shown in the code, there are three fields: kind, which identifies the semantics associated with this checkpoint; suggested, which contains the list of the LLM’s proposed search queries; and instructions, a simple text that gives guidance on what the human should do. Of course, the payload passed to interrupt() can be any dictionary structure you want; it’s mainly a UI/UX concern.
- At this point, your application’s frontend can display this content to the user. I’ll show you how to interact with it in the demo section later.
- When the human provides their feedback, the graph resumes execution. A key thing to note is that the interrupt() call now returns the human’s input instead of pausing. The human feedback needs to provide a queries field that contains the approved list of search queries; that’s what the review_initial_queries node expects.
- Finally, we apply the configured limits to prevent excessive searches.
That’s it! Present the LLM’s proposal, pause, incorporate human feedback, and resume. That’s the foundation of all human-in-the-loop nodes in LangGraph.
3.5 Parallel Web Searches
After the human approves the initial queries, we route them to the web research node. This is achieved via the following routing function:
from langgraph.types import Send

def continue_to_web_research(state: QueryGenerationState):
"""LangGraph node that sends the search queries to the web research node.
This is used to spawn n number of web research nodes, one for each search query.
"""
return [
Send("web_research", {"search_query": search_query, "id": int(idx)})
for idx, search_query in enumerate(state["query_list"])
]
This function takes the approved query list and creates parallel web_research tasks, one for each query. Using LangGraph’s Send mechanism, we can launch multiple web searches concurrently.
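To make the fan-out concrete, here is roughly what the routing function produces for a two-query list (illustrative queries); LangGraph then runs one web_research branch per Send, each receiving its small payload as that branch’s WebSearchState:
from langgraph.types import Send

# Illustrative: what continue_to_web_research returns for two approved queries
state = {"query_list": ["quantum error correction 2025", "quantum hardware roadmap"]}
sends = [
    Send("web_research", {"search_query": q, "id": int(i)})
    for i, q in enumerate(state["query_list"])
]
# -> [Send("web_research", {"search_query": ..., "id": 0}),
#     Send("web_research", {"search_query": ..., "id": 1})]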
3.6 Node #3: Web Research
This is where the actual web searching happens:
from google.genai import Client
from agent.prompts import web_searcher_instructions
from agent.utils import get_citations, insert_citation_markers, resolve_urls

# Google GenAI client for grounded search (the LangChain client doesn't return grounding metadata)
genai_client = Client(api_key=os.getenv("GEMINI_API_KEY"))

def web_research(state: WebSearchState, config: RunnableConfig) -> OverallState:
"""LangGraph node that performs web research using the native Google Search API tool.
Executes a web search using the native Google Search API tool in combination with Gemini 2.0 Flash.
Args:
state: Current graph state containing the search query and research loop count
config: Configuration for the runnable, including search API settings
Returns:
Dictionary with state update, including sources_gathered, research_loop_count, and web_research_results
"""
# Configure
configurable = Configuration.from_runnable_config(config)
formatted_prompt = web_searcher_instructions.format(
current_date=get_current_date(),
research_topic=state["search_query"],
)
# Uses the google genai client as the langchain client doesn't return grounding metadata
response = genai_client.models.generate_content(
model=configurable.query_generator_model,
contents=formatted_prompt,
config={
"tools": [{"google_search": {}}],
"temperature": 0,
},
)
# resolve the urls to short urls for saving tokens and time
gm = getattr(response.candidates[0], "grounding_metadata", None)
chunks = getattr(gm, "grounding_chunks", None) if gm is not None else None
resolved_urls = resolve_urls(chunks or [], state["id"])
# Gets the citations and adds them to the generated text
citations = get_citations(response, resolved_urls) if resolved_urls else []
modified_text = insert_citation_markers(response.text, citations)
sources_gathered = [item for citation in citations for item in citation["segments"]]
return {
"sources_gathered": sources_gathered,
"search_query": [state["search_query"]],
"web_research_result": [modified_text],
}
The code is mostly self-explanatory. We first configure the search, then call Google’s Search API via Gemini with search tools enabled. Once we obtain the search results, we extract URLs, resolve citations, and then format the search results with proper citation markers. Finally, we update the state with gathered sources and formatted search results.
Note that we have hardened the URL resolution and citation retrieval against scenarios where the search results do not return any grounding data. Therefore, you will see that the implementation for getting the citations and adding them to the generated text differs slightly from the original version. We have also implemented an updated version of the resolve_urls function:
def resolve_urls(urls_to_resolve, id):
"""
Create a map from original URL -> short URL.
Accepts None or empty; returns {} in that case.
"""
if not urls_to_resolve:
return {}
prefix = f"https://vertexaisearch.cloud.google.com/id/"
urls = []
for site in urls_to_resolve:
uri = None
try:
web = getattr(site, "web", None)
uri = getattr(web, "uri", None) if web is not None else None
except Exception:
uri = None
if uri:
urls.append(uri)
if not urls:
return {}
index_by_url = {}
for i, u in enumerate(urls):
index_by_url.setdefault(u, i)
# Build stable short links
resolved_map = {u: f"{prefix}{id}/{index_by_url[u]}" for u in index_by_url}
return resolved_map
This updated version can be used as a drop-in replacement for the original resolve_urls function, as the original one does not handle edge cases properly.
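Here is a quick illustration of its behavior, using SimpleNamespace objects to stand in for the real grounding chunks:
from types import SimpleNamespace

chunks = [
    SimpleNamespace(web=SimpleNamespace(uri="https://example.com/a")),
    SimpleNamespace(web=SimpleNamespace(uri="https://example.com/b")),
    SimpleNamespace(web=None),  # malformed chunk: safely skipped
]

print(resolve_urls(chunks, id=0))
# {'https://example.com/a': 'https://vertexaisearch.cloud.google.com/id/0/0',
#  'https://example.com/b': 'https://vertexaisearch.cloud.google.com/id/0/1'}

print(resolve_urls(None, id=0))
# {}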
3.7 Node #4: Reflection
The reflection node analyzes the gathered web research results to determine if more information is needed.
from agent.prompts import reflection_instructions
from agent.tools_and_schemas import Reflection

def reflection(state: OverallState, config: RunnableConfig) -> ReflectionState:
"""LangGraph node that identifies knowledge gaps and generates potential follow-up queries.
Analyzes the current summary to identify areas for further research and generates
potential follow-up queries. Uses structured output to extract
the follow-up query in JSON format.
Args:
state: Current graph state containing the running summary and research topic
config: Configuration for the runnable, including LLM provider settings
Returns:
Dictionary with state update, including search_query key containing the generated follow-up query
"""
configurable = Configuration.from_runnable_config(config)
# Increment the research loop count and get the reasoning model
state["research_loop_count"] = state.get("research_loop_count", 0) + 1
reflection_model = state.get("reflection_model") or configurable.reflection_model
# Format the prompt
current_date = get_current_date()
formatted_prompt = reflection_instructions.format(
current_date=current_date,
research_topic=get_research_topic(state["messages"]),
summaries="\n\n---\n\n".join(state["web_research_result"]),
)
# init Reasoning Model
llm = ChatGoogleGenerativeAI(
model=reflection_model,
temperature=1.0,
max_retries=2,
api_key=os.getenv("GEMINI_API_KEY"),
)
result = llm.with_structured_output(Reflection).invoke(formatted_prompt)
return {
"is_sufficient": result.is_sufficient,
"knowledge_gap": result.knowledge_gap,
"follow_up_queries": result.follow_up_queries,
"research_loop_count": state["research_loop_count"],
"number_of_ran_queries": len(state["search_query"]),
}
This analysis feeds directly into our second human-in-the-loop checkpoint.
Note that we also update the ReflectionState schema in the state.py file:
class ReflectionState(TypedDict):
is_sufficient: bool
knowledge_gap: str
follow_up_queries: list
research_loop_count: int
number_of_ran_queries: int
Instead of using an additive reducer, we use a plain list for follow_up_queries so that human input can directly overwrite what the LLM has proposed.
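In a nutshell, the difference looks like this (a sketch, not the repo’s actual code): with an additive reducer, the checkpoint node’s update would be appended to the LLM’s proposal; with a plain field, the human-approved list simply replaces it.
import operator
from typing import Annotated, TypedDict

class WithReducer(TypedDict):
    # Updates are merged (old + new), so a human edit would be appended to the LLM's list
    follow_up_queries: Annotated[list, operator.add]

class PlainField(TypedDict):
    # Updates overwrite, so the human-approved list replaces the LLM's proposal
    follow_up_queries: list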
3.8 Node #5: Review Follow-Up Plan
The purpose of this checkpoint is to allow humans to validate the LLM’s assessment and decide whether to continue researching:
def review_follow_up_plan(state: ReflectionState) -> ReflectionState:
human = interrupt({
"kind": "review_follow_up_plan",
"is_sufficient": state["is_sufficient"],
"knowledge_gap": state["knowledge_gap"],
"suggested": state["follow_up_queries"],
"instructions": (
"To complete research: {'is_sufficient': true}\n"
"To continue with modified queries: {'follow_up_queries': [...], 'knowledge_gap': '...'}\n"
"To add/modify queries only: {'follow_up_queries': [...]}"
),
})
if human.get("is_sufficient", False) is True:
return {
"is_sufficient": True,
"knowledge_gap": state["knowledge_gap"],
"follow_up_queries": state["follow_up_queries"],
}
return {
"is_sufficient": False,
"knowledge_gap": human.get("knowledge_gap", state["knowledge_gap"]),
"follow_up_queries": human["follow_up_queries"],
}
Following the same pattern, we first design the payload that will be shown to the human. This payload includes the kind of this interruption, a binary flag indicating whether the research is sufficient, the knowledge gap identified by the LLM, the follow-up queries suggested by the LLM, and a short tip on what feedback the human should provide.
Upon examination, the human can directly declare that the research is sufficient. Alternatively, the human can keep the sufficiency flag as False and edit/approve what the reflection node’s LLM has proposed.
Either way, the results will be sent to the research evaluation function, which will route to the corresponding next node.
3.9 Routing Logic: Continue or Finalize
After the human review, this routing function will determine the next step:
def evaluate_research(
state: ReflectionState,
config: RunnableConfig,
) -> OverallState:
"""LangGraph routing function that determines the next step in the research flow.
Controls the research loop by deciding whether to continue gathering information
or to finalize the summary based on the configured maximum number of research loops.
Args:
state: Current graph state containing the research loop count
config: Configuration for the runnable, including max_research_loops setting
Returns:
String literal indicating the next node to visit ("web_research" or "finalize_answer")
"""
configurable = Configuration.from_runnable_config(config)
max_research_loops = (
state.get("max_research_loops")
if state.get("max_research_loops") is not None
else configurable.max_research_loops
)
if state["is_sufficient"] or state["research_loop_count"] >= max_research_loops:
return "finalize_answer"
else:
return [
Send(
"web_research",
{
"search_query": follow_up_query,
"id": state["number_of_ran_queries"] + int(idx),
},
)
for idx, follow_up_query in enumerate(state["follow_up_queries"])
]
If the human concludes that the research is sufficient or we’ve already reached the maximum research loop limit, this function will route to finalize_answer. Otherwise, it will spawn new web research tasks (in parallel) using the human-approved follow-up queries.
3.10 Node #6: Finalize Answer
This is the final node of our graph, which synthesizes all the gathered information into a comprehensive answer with proper citations:
from langchain_core.messages import AIMessage
from agent.prompts import answer_instructions

def finalize_answer(state: OverallState, config: RunnableConfig):
"""LangGraph node that finalizes the research summary.
Prepares the final output by deduplicating and formatting sources, then
combining them with the running summary to create a well-structured
research report with proper citations.
Args:
state: Current graph state containing the running summary and sources gathered
Returns:
Dictionary with state update, including running_summary key containing the formatted final summary with sources
"""
configurable = Configuration.from_runnable_config(config)
answer_model = state.get("answer_model") or configurable.answer_model
# Format the prompt
current_date = get_current_date()
formatted_prompt = answer_instructions.format(
current_date=current_date,
research_topic=get_research_topic(state["messages"]),
summaries="\n---\n\n".join(state["web_research_result"]),
)
# init Reasoning Model, default to Gemini 2.5 Pro
llm = ChatGoogleGenerativeAI(
model=answer_model,
temperature=0,
max_retries=2,
api_key=os.getenv("GEMINI_API_KEY"),
)
result = llm.invoke(formatted_prompt)
# Replace the short urls with the original urls and add all used urls to the sources_gathered
unique_sources = []
for source in state["sources_gathered"]:
if source["short_url"] in result.content:
result.content = result.content.replace(
source["short_url"], source["value"]
)
unique_sources.append(source)
return {
"messages": [AIMessage(content=result.content)],
"sources_gathered": unique_sources,
}
With this, our human-in-the-loop research workflow is now complete.
4. Running the Agent: Handling Interrupts and Resumptions
In this section, let’s take our newly enhanced deep research agent for a ride! We’ll walk through a complete interaction in Jupyter Notebook where a human guides the research process at both checkpoints.
To run our current agent, you need to obtain a Gemini API key, which you can get from Google AI Studio. Once you have the key, remember to create the .env file and paste it in: GEMINI_API_KEY="your_actual_api_key_here".
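If you are running the agent from a notebook and the key isn’t picked up automatically, here is a small sketch using python-dotenv to load it yourself:
import os
from dotenv import load_dotenv

load_dotenv()  # reads GEMINI_API_KEY from the .env file into the environment
assert os.getenv("GEMINI_API_KEY"), "GEMINI_API_KEY is not set"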
4.1 Starting the Research
As an example, in the first cell, let’s ask the agent about quantum computing developments:
from agent import graph
from langgraph.types import Command
config = {"configurable": {"thread_id": "session_1"}}
Q = "What are the latest developments in quantum computing?"
result = graph.invoke({"messages": [{"role": "user", "content": Q}]}, config=config)
Note that we have supplied a thread ID in the configuration. This is in fact a crucial piece for achieving human-in-the-loop workflows: internally, LangGraph uses this ID to persist the state, so that when we later provide human input, it knows which execution to resume.
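As a side note, because the checkpointer persists the state under this thread ID, you can inspect the saved snapshot at any time, for example to see which node the graph is currently paused at:
snapshot = graph.get_state(config)
print(snapshot.next)           # e.g. ('review_initial_queries',) while paused at checkpoint #1
print(list(snapshot.values))   # the state keys saved so far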
4.2 Checkpoint #1: Review Initial Queries
After running the first cell, the graph executes until it hits our first checkpoint. If you print the results in the next cell:
result
You would see something like this:
{'messages': [HumanMessage(content='What are the latest developments in quantum computing?', additional_kwargs={}, response_metadata={}, id='68beb541-aedb-4393-bb12-a7f1a22cb4f7')],
'search_query': [],
'web_research_result': [],
'sources_gathered': [],
'approved_initial_queries': [],
'approved_followup_queries': [],
'__interrupt__': [Interrupt(value={'kind': 'review_initial_queries', 'suggested': ['quantum computing breakthroughs 2024 2025', 'quantum computing hardware developments 2024 2025', 'quantum algorithms and software advancements 2024 2025'], 'instructions': 'Approve as-is, or return queries=[...]'}, id='4c23dab27cc98fa0789c61ca14aa6425')]}
Notice that a new key, __interrupt__, has been created; it contains the payload sent back for the human to review. The keys of the returned payload are exactly the ones we defined in the node.
Now, as a user, we can proceed to edit/approve the search queries. For now, let’s say we are happy with the LLM’s suggestions, so we can simply accept them. This can be achieved by sending the LLM’s suggestions right back to the node:
# Human input
human_edit = {"queries": result["__interrupt__"][0].value["suggested"]}
# Resume the graph
result = graph.invoke(Command(resume=human_edit), config=config)
Running this cell would take a bit of time, as the graph will launch the searches and synthesize the research results. Afterward, the reflection node would review the results and propose follow-up queries.
4.3 Checkpoint #2: Review Follow-Up Queries
In a new cell, if we now run:
result["__interrupt__"][0].value
You would see the payload with the keys defined in the corresponding node:
{'kind': 'review_follow_up_plan',
'is_sufficient': False,
'knowledge_gap': 'The summaries provide high-level progress in quantum error correction (QEC) but lack specific technical details about the various types of quantum error-correcting codes being developed and how these codes are being implemented and adapted for different qubit modalities (e.g., superconducting, trapped-ion, neutral atom, photonic, topological). A deeper understanding of the underlying error correction schemes and their practical realization would provide more technical depth.',
'suggested': ['What are the different types of quantum error-correcting codes currently being developed and implemented (e.g., surface codes, topological codes, etc.), and what are the specific technical challenges and strategies for their realization in various quantum computing hardware modalities such as superconducting, trapped-ion, neutral atom, photonic, and topological qubits?'],
'instructions': "To complete research: {'is_sufficient': true}\nTo continue with modified queries: {'follow_up_queries': [...], 'knowledge_gap': '...'}\nTo add/modify queries only: {'follow_up_queries': [...]}"}
Let’s say we agree with what the LLM has proposed, but we also want to add a new search query:
human_edit = {
"follow_up_queries": [
result["__interrupt__"][0].value["suggested"][0],
'fault-tolerant quantum computing demonstrations IBM Google IonQ PsiQuantum 2024 2025'
]
}
result = graph.invoke(Command(resume=human_edit), config=config)
We resume the graph again, and that’s it: this is how you interact with a human-in-the-loop agent.
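Whenever the run reaches finalize_answer (i.e., after the human marks the research sufficient or the loop limit is hit), the final state holds the synthesized report and its sources, which we can print directly:
# The final report is the last AI message in the state
print(result["messages"][-1].content)

# The deduplicated sources cited in the report
for source in result["sources_gathered"]:
    print(source["value"])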
5. Conclusion
In this post, we’ve successfully augmented our deep research agent with human-in-the-loop functionalities. Instead of running fully autonomously, the agent now has a built-in mechanism that keeps it from going off-track while we still enjoy the efficiency of automated research.
Technically, this is achieved by using LangGraph’s interrupt() mechanism within carefully chosen nodes. A good mental model: the node hits “pause,” you edit or approve, you press “play,” the node restarts with your input, and the workflow moves on. All of this happens without disrupting the underlying graph structure.
Now that you have all this knowledge, are you ready to build the next human-AI collaborative workflow?
