LangGraph for agents
LangGraph models agents as explicit graphs: nodes mutate shared state, and edges encode control flow, including loops that simple chains cannot express cleanly. Checkpointing enables resume, replay, human interrupts, and durable execution for long-running workflows.
Prerequisites: messages + tools (LangChain for agents).
Process — compiled graph lifecycle
Reducer example: add_messages merges message lists deterministically rather than overwriting.
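The merge-by-id semantics can be sketched in plain Python. This is a simplified illustration of the behavior, not LangGraph's actual `add_messages` implementation; `merge_messages` is a hypothetical name.

```python
# Simplified sketch of add_messages-style merge semantics: new messages
# append, while a message whose id matches an existing one replaces it
# instead of duplicating. Not LangGraph's real implementation.
def merge_messages(existing: list[dict], updates: list[dict]) -> list[dict]:
    merged = list(existing)
    index_by_id = {m["id"]: i for i, m in enumerate(merged)}
    for msg in updates:
        if msg["id"] in index_by_id:
            merged[index_by_id[msg["id"]]] = msg  # deterministic replace
        else:
            merged.append(msg)                    # deterministic append
    return merged

state = [{"id": "1", "content": "hi"}]
state = merge_messages(state, [{"id": "2", "content": "tool result"}])
state = merge_messages(state, [{"id": "1", "content": "hi (edited)"}])
# two messages total; id "1" was replaced, not duplicated
```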
State-to-state graph anatomy
Why interviewers like this: it makes failure handling visible. You can point to where evidence is verified, where tools are gated, and where a human can interrupt before an irreversible transition.
Classic ReAct loop as a graph shape
Termination = router returns the END path when the latest AI message has no tool_calls.
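The termination rule above can be sketched as a tiny router function. This is a hedged illustration, not LangGraph's implementation: `route` and `FakeAIMessage` are hypothetical names, and in real graphs the prebuilt `tools_condition` plays this role.

```python
# Sketch of the ReAct termination router. FakeAIMessage stands in for an
# AI message object; LangGraph's prebuilt tools_condition does this in practice.
END = "__end__"

class FakeAIMessage:
    def __init__(self, tool_calls=None):
        self.tool_calls = tool_calls or []

def route(messages: list) -> str:
    last = messages[-1]
    # Pending tool_calls -> loop back to the tool node; none -> terminate.
    return "tools" if getattr(last, "tool_calls", None) else END

print(route([FakeAIMessage(tool_calls=[{"name": "kb_lookup"}])]))  # tools
print(route([FakeAIMessage()]))  # __end__
```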
Pattern A — preset ReAct (create_react_agent)
Fastest sane loop; swaps model & tool list cleanly.
```python
from langchain_core.messages import HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent

@tool
def kb_lookup(query: str) -> str:
    """Hybrid search stub—replace with ACL-aware retrieval."""
    return f"[stub]{query}: policy snippet..."

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
graph = create_react_agent(model=model, tools=[kb_lookup])

state = graph.invoke(
    {"messages": [HumanMessage("Summarize PII logging stance from KB")]},
    config={"configurable": {"thread_id": "tenant-42-thread-9"}},
)
print(state["messages"][-1].content)
```

`thread_id` pairs with eventual checkpointing for conversational memory & recovery.
Pattern B — explicit router edges
Customize branching (retrieve → rerank gate → reply) while sharing one state shape.
```python
from typing import Annotated

from langchain_core.messages import AnyMessage, HumanMessage
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.graph import END, START, StateGraph
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition
from typing_extensions import TypedDict

class AgentState(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]

@tool
def add_ints(a: int, b: int) -> int:
    """Add two integers."""
    return a + b

llm_tools = ChatOpenAI(model="gpt-4o-mini", temperature=0).bind_tools([add_ints])
tool_runner = ToolNode([add_ints])

def call_model(state: AgentState):
    return {"messages": [llm_tools.invoke(state["messages"])]}

builder = StateGraph(AgentState)
builder.add_node("agent", call_model)
builder.add_node("tools", tool_runner)
builder.add_edge(START, "agent")
builder.add_conditional_edges(
    "agent",
    tools_condition,
    {"tools": "tools", END: END},
)
builder.add_edge("tools", "agent")
compiled = builder.compile()

out = compiled.invoke(
    {"messages": [HumanMessage("Use tools: compute 19+23")]},
)
```

Add checkpointing:

```python
from langgraph.checkpoint.memory import MemorySaver

compiled_ckpt = builder.compile(checkpointer=MemorySaver())
compiled_ckpt.invoke(..., config={"configurable": {"thread_id": "abc"}})
```

(MemorySaver is illustrative; use a Postgres/SQL-backed store for durable deployment.)
Production notes
| Concern | LangGraph practice |
|---|---|
| Durability | Use a real checkpointer for production so interrupted runs can resume from the last saved state. |
| Human-in-the-loop | Pause at approval nodes before destructive, financial, legal, or external-message actions. |
| Replay | Re-run from checkpoints to debug why a route or tool call happened. |
| State design | Keep state typed and small: messages, task facts, counters, approvals, evidence ids, and tool results. |
| Determinism | Keep node side effects isolated; do not hide network writes inside "planning" nodes. |
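The human-in-the-loop and durability rows above share one mechanic: save state before a gated node, return control, and resume from the checkpoint after approval. A minimal pure-Python sketch of that pause/resume loop, with entirely hypothetical names (`run`, `saver`; this is not the LangGraph API):

```python
# Toy linear "graph": pause before any node listed in interrupt_before,
# persist a checkpoint keyed by thread_id, and resume from it on re-run.
def run(nodes, state, saver, thread_id, interrupt_before=()):
    ckpt = saver.get(thread_id, {"step": 0, "state": state})
    i, state = ckpt["step"], ckpt["state"]
    while i < len(nodes):
        name, fn = nodes[i]
        if name in interrupt_before and not ckpt.get("resumed"):
            # Save and hand control to a human before the gated node runs.
            saver[thread_id] = {"step": i, "state": state, "resumed": True}
            return ("paused", state)
        ckpt["resumed"] = False          # only skip the gate once, on resume
        state = fn(state)
        i += 1
        saver[thread_id] = {"step": i, "state": state}
    return ("done", state)

nodes = [
    ("plan", lambda s: s + ["planned"]),
    ("tools", lambda s: s + ["tool ran"]),   # gated: destructive action
    ("reply", lambda s: s + ["replied"]),
]
saver: dict = {}
status1, _ = run(nodes, [], saver, "t1", interrupt_before={"tools"})
# ... operator approves out-of-band ...
status2, final = run(nodes, [], saver, "t1", interrupt_before={"tools"})
```

The second `run` call resumes from the saved step rather than replaying earlier nodes, which is the essence of durable execution.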
Interview questions — LangGraph
1. LangGraph vs a simple while-loop around chat completions?
- A graph gives explicit node boundaries, reproducible checkpoints, concurrency patterns, and streaming of partial states.
2. Thread vs checkpointer?
- Thread: the logical continuity key (`thread_id`) shared across invocations.
- Checkpointer: saves intermediate state checkpoints keyed by `(thread_id, step)`.
3. How encode human approvals?
- Interrupt nodes pause before irreversible transitions; resume with operator decision—maps to approvals & compliance audits.
4. Where does the max-steps guard live best?
- In a router wrapper counting iterations, or a reducer tracking a `step` counter in state; refuse additional tool hops once the budget is spent.
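A minimal sketch of the router-wrapper option, with illustrative names (`make_guarded_router`, `MAX_TOOL_HOPS`; not a LangGraph API):

```python
# A router wrapper that enforces a tool-hop budget: after MAX_TOOL_HOPS
# routes to "tools", it forces the END path regardless of the inner router.
END = "__end__"
MAX_TOOL_HOPS = 5

def make_guarded_router(inner_router):
    def guarded(state: dict) -> str:
        if state.get("step", 0) >= MAX_TOOL_HOPS:
            return END                              # refuse further tool hops
        route = inner_router(state)
        if route == "tools":
            state["step"] = state.get("step", 0) + 1
        return route
    return guarded

# A pathological inner router that always wants another tool hop:
router = make_guarded_router(lambda state: "tools")
state = {"step": 0}
routes = [router(state) for _ in range(7)]
# first 5 calls route to "tools", then the guard returns END
```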
5. Debugging infinite tool oscillation?
- Check whether the last two AI tool calls are equivalent; widen schema constraints where validation failures drive retries; escalate on a duplicate pattern.
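The duplicate-call check can be as simple as comparing the last two tool calls. `is_oscillating` below is a hypothetical helper for illustration, not part of LangGraph:

```python
# Detect the oscillation signature: the same tool called twice in a row
# with identical arguments, suggesting the loop is stuck.
def is_oscillating(tool_calls: list[dict]) -> bool:
    if len(tool_calls) < 2:
        return False
    a, b = tool_calls[-2], tool_calls[-1]
    return a["name"] == b["name"] and a["args"] == b["args"]

stuck = [
    {"name": "kb_lookup", "args": {"query": "pii policy"}},
    {"name": "kb_lookup", "args": {"query": "pii policy"}},
]
progressing = [
    {"name": "kb_lookup", "args": {"query": "pii policy"}},
    {"name": "kb_lookup", "args": {"query": "pii logging"}},
]
print(is_oscillating(stuck))        # True
print(is_oscillating(progressing))  # False
```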
6. Why is durable execution useful for agents?
- A long run can pause for approval, survive failures, and resume without replaying every prior model/tool step.
7. What belongs in graph state vs external storage?
- State carries workflow facts and references. Large documents, secrets, raw files, and long-term records belong in storage with access control.
8. Where should tool authorization live?
- In the tool gateway or tool implementation, not only in the graph prompt. The graph can route to a guard node, but the server still enforces identity and scope.
Next
Observe runs in LangSmith · Harden rollout in Agentic production.