v0.5.0 · 7 min read

DeepAgents Architecture

Batteries-included agent harness built on LangGraph — planning, filesystem, sub-agents, and context management out of the box.

Tags: langgraph, agents, middleware, sub-agents, context-management, python-sdk

System Layers

Clients & Interfaces
Deep Agents CLI — Textual TUI, headless mode
ACP (Zed) — Agent Client Protocol
LangGraph Studio — visual debugging UI
Python SDK — create_deep_agent()
Core Engine — create_deep_agent()
LangGraph CompiledStateGraph — create_agent() plus the middleware stack
System Prompt — BASE_AGENT_PROMPT plus custom additions
LLM (default: Claude Sonnet 4.6) — provider-agnostic via init_chat_model
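A minimal sketch of wiring up the core engine. The keyword names (`model`, `system_prompt`) and the model string are assumptions based on this page, not a verbatim API reference; check the SDK docs for your installed version.

```python
# Hypothetical usage sketch of the core factory; requires the deepagents
# package, so the import is deferred into the builder function.
def build_agent():
    from deepagents import create_deep_agent

    return create_deep_agent(
        # "provider:model" string, resolved via init_chat_model (assumption)
        model="anthropic:claude-sonnet-4-6",
        system_prompt="You are a careful research assistant.",
    )

# agent = build_agent()
# result = agent.invoke({"messages": [{"role": "user", "content": "Plan a refactor."}]})
```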
Middleware Pipeline (ordered)
TodoListMiddleware — write_todos planning
SkillsMiddleware — progressive disclosure of skills
FilesystemMiddleware — read/write/edit/ls/glob/grep tools
SubAgentMiddleware — task() tool delegation
SummarizationMiddleware — auto-compaction on token overflow
PatchToolCallsMiddleware — dangling tool-call repair
AsyncSubAgentMiddleware — remote LangGraph tasks
AnthropicPromptCachingMiddleware — cache-prefix optimization
MemoryMiddleware — AGENTS.md injection
HumanInTheLoopMiddleware — interrupt_on tool approval
Tools (LLM-callable)
write_todos — task breakdown & tracking
read_file — paginated reads with line numbers
write_file — create new files
edit_file — exact string replacement
ls / glob / grep — directory navigation & search
execute — shell commands (sandboxed)
task — delegate to sub-agents
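The edit_file tool's "exact string replacement" contract can be illustrated with a small self-contained sketch. This shows the semantics only (fail loudly when the target string is missing or ambiguous), not the library's actual implementation.

```python
def edit_exact(content: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if it occurs exactly once in `content`."""
    count = content.count(old)
    if count == 0:
        raise ValueError("old string not found")
    if count > 1:
        raise ValueError(f"old string is ambiguous ({count} matches)")
    return content.replace(old, new, 1)
```

Requiring a unique match is what makes the tool safe for an LLM to call: the model must quote enough surrounding context to pin down one location before an edit is applied.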
Backends (Pluggable Storage & Execution)
StateBackend — ephemeral, lives in LangGraph state
FilesystemBackend — local disk I/O
StoreBackend — persistent via BaseStore
LangSmithSandbox — remote sandboxed execution
LocalShellBackend — local shell + filesystem
CompositeBackend — path-prefix routing
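CompositeBackend's path-prefix routing can be sketched in a few lines. The route table and backend names below are simplified stand-ins for illustration, not the library's real types; the key idea is that the longest matching prefix wins and everything else falls through to a default backend.

```python
def route(prefixes: dict[str, str], default: str, path: str) -> str:
    """Pick the backend whose registered prefix matches `path`.

    Longest matching prefix wins; unmatched paths go to `default`.
    """
    matches = [p for p in prefixes if path.startswith(p)]
    if not matches:
        return default
    return prefixes[max(matches, key=len)]

# Hypothetical route table: persistent memories vs ephemeral scratch space.
routes = {"/memories/": "store", "/scratch/": "state"}
```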
External Services & Integrations
Anthropic Claude — default provider
OpenAI GPT — Responses API / Chat
Google Gemini — via langchain-google-genai
LangSmith — tracing, evals, deployments
MCP Servers — via langchain-mcp-adapters
Partner Sandboxes — Daytona, Modal, RunLoop, QuickJS

Core Flow — Agent Invocation Lifecycle

1. User invokes agent — create_deep_agent() assembles the middleware stack, tools, and system prompt into a CompiledStateGraph.
2. Middleware intercepts — each middleware's wrap_model_call() fires before every LLM request: it injects system context, filters tools, and manages state.
3. LLM reasoning — the model receives the system prompt, conversation history, and available tools, then emits text and/or tool_calls.
4. Tool execution — LangGraph's tool node dispatches calls to read_file, execute, task, etc. against the resolved backend.
5. Sub-agent delegation — the task tool spawns a child agent with its own middleware stack, isolated context, and shared backend.
6. Context management — SummarizationMiddleware monitors token usage; when the threshold is exceeded, it offloads history and injects a summary.
7. Loop or return — if tool calls remain, control loops back to step 2; otherwise the final response is returned. Recursion limit: 1000 steps.
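The steps above can be condensed into a control-flow sketch. The real loop is a LangGraph state machine, so this toy `run_agent` only illustrates steps 3 and 7: call the model, dispatch any tool calls, and stop when none remain or the recursion limit is hit.

```python
# Simplified simulation of the invocation loop. `llm_step` stands in for one
# model call plus middleware; it returns the updated state and any tool calls.
def run_agent(llm_step, recursion_limit: int = 1000):
    state = {"messages": []}
    for _ in range(recursion_limit):
        state, tool_calls = llm_step(state)
        if not tool_calls:
            return state  # final response: no pending tool calls
        # ...dispatch tool_calls against the resolved backend here...
    raise RuntimeError("recursion limit exceeded")
```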

Agent Coordination Model

Synchronous Sub-Agents
Declared via SubAgent TypedDict (name, description, system_prompt)
Invoked through the task tool by name
Each sub-agent gets its own create_agent() graph with full middleware stack
A default general-purpose sub-agent is added automatically
Supports CompiledSubAgent for pre-built runnables
Sub-agents inherit parent tools unless overridden
Human-in-the-loop via interrupt_on per sub-agent
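A sub-agent declaration, per the fields listed above, is just a small typed dict. The TypedDict below is a sketch mirroring those fields; the real type ships with the deepagents package, and the example values are hypothetical.

```python
from typing import TypedDict

# Sketch of the SubAgent spec (name, description, system_prompt per this page).
class SubAgent(TypedDict, total=False):
    name: str
    description: str
    system_prompt: str

# Hypothetical sub-agent the parent would invoke via task(name="researcher", ...).
researcher: SubAgent = {
    "name": "researcher",
    "description": "Deep-dives into one topic and reports back concisely.",
    "system_prompt": "You research thoroughly and cite your sources.",
}
```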
Async Sub-Agents (Remote)
Declared via AsyncSubAgent TypedDict (name, graph_id, url)
Launched on remote LangGraph deployments via SDK
Non-blocking: returns task ID immediately
Main agent can monitor, update, or cancel tasks
Auth via the LANGGRAPH_API_KEY environment variable
Routed through AsyncSubAgentMiddleware
Ideal for long-running background research
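The async variant follows the same pattern but points at a remote deployment. Again, this TypedDict is a sketch of the fields listed above, and the graph ID and URL are hypothetical placeholders.

```python
from typing import TypedDict

# Sketch of the AsyncSubAgent spec (name, graph_id, url per this page).
class AsyncSubAgent(TypedDict, total=False):
    name: str
    graph_id: str
    url: str

background_researcher: AsyncSubAgent = {
    "name": "background-researcher",
    "graph_id": "deep_research",             # hypothetical deployed graph ID
    "url": "https://example.langgraph.app",  # hypothetical deployment URL
}
```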

Key Subsystem — Middleware Pipeline Architecture

deepagents/middleware/
├── __init__.py               ← Exports all middleware classes
├── filesystem.py             ← File tools + backend resolution (FilesystemMiddleware)
├── subagents.py              ← SubAgent specs + task tool (SubAgentMiddleware)
├── async_subagents.py        ← Remote async task orchestration
├── summarization.py          ← Token-aware auto-compaction + offload
├── memory.py                 ← AGENTS.md loading & system prompt injection
├── skills.py                 ← Progressive skill discovery from SKILL.md frontmatter
├── patch_tool_calls.py       ← Repairs dangling tool calls in message history
└── _utils.py                 ← append_to_system_message helper
wrap_model_call() — intercepts every LLM request: injects tools, filters capabilities, and transforms the system prompt.
before_agent() — runs once at agent start: patches state, loads memory and skills, and initializes backends.
Tool Interception — middleware can add, remove, or filter tools per call; the execute tool is removed when no SandboxBackend is available.
BackendFactory — resolves the backend lazily via a factory: Callable[[ToolRuntime], BackendProtocol].
Prompt Caching — AnthropicPromptCachingMiddleware sits near the end of the stack to maximize the stable prompt prefix and cache hits.
Ordering Matters — Todo → Skills → Filesystem → SubAgents → Summarization → Patch → Cache → Memory → HITL.

Data & State Model

AgentState
messages: list[AnyMessage], files: dict[str, FileData], todo_list: list[TodoItem], summarization_events: list
FileData
content: str (utf-8 or base64), encoding: utf-8 | base64, created_at: ISO 8601, modified_at: ISO 8601
SubAgent (TypedDict)
name: str, description: str, system_prompt: str, tools?: Sequence[BaseTool], model?: str | BaseChatModel, middleware?: list[AgentMiddleware]
AsyncSubAgent (TypedDict)
name: str, description: str, graph_id: str, url?: str, headers?: dict[str, str]
BackendProtocol (ABC)
ls(path) → LsResult, read(path, offset, limit) → ReadResult, write(path, content) → WriteResult, edit(path, old, new) → EditResult, grep(pattern) → GrepResult, glob(pattern) → GlobResult
SandboxBackendProtocol
extends BackendProtocol, execute(cmd, timeout?) → ExecuteResponse, upload_files(list[tuple]), download_files(list[str]), id: str (unique sandbox ID)
SkillMetadata
name: str (max 64 chars), description: str (max 1024), path: str (SKILL.md path), license?: str, allowed_tools?: list[str]
ExecuteResponse
output: str (stdout + stderr), exit_code: int | None, truncated: bool
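The backend contract above is structural, so any object with the right methods qualifies. Below is a trimmed sketch: method names follow this page, but the real result types (LsResult, ReadResult, ...) are replaced with plain Python values, and InMemoryBackend is only an illustration in the spirit of StateBackend.

```python
from typing import Protocol

class Backend(Protocol):
    """Trimmed structural sketch of BackendProtocol (three of six methods)."""
    def ls(self, path: str) -> list[str]: ...
    def read(self, path: str, offset: int = 0, limit: int = 100) -> str: ...
    def write(self, path: str, content: str) -> None: ...

class InMemoryBackend:
    """Ephemeral dict-backed store, in the spirit of StateBackend."""
    def __init__(self) -> None:
        self.files: dict[str, str] = {}

    def ls(self, path: str) -> list[str]:
        return [p for p in self.files if p.startswith(path)]

    def read(self, path: str, offset: int = 0, limit: int = 100) -> str:
        # Paginated read: mirrors read_file's offset/limit semantics.
        lines = self.files[path].splitlines()[offset:offset + limit]
        return "\n".join(lines)

    def write(self, path: str, content: str) -> None:
        self.files[path] = content
```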

Package / Directory Map

deepagents/                              Monorepo root (MIT, Python 3.11+)
├── libs/deepagents/                     Core SDK — create_deep_agent(), middleware, backends
│   ├── deepagents/graph.py              Main entry: create_deep_agent() factory
│   ├── deepagents/middleware/           10 middleware classes (filesystem, subagents, summarization, etc.)
│   ├── deepagents/backends/             6 backend implementations (state, filesystem, store, sandbox, etc.)
│   └── deepagents/_models.py            Model resolution: "provider:model" → BaseChatModel
├── libs/cli/                            Terminal agent (Textual TUI + headless mode)
│   ├── deepagents_cli/agent.py          CLI agent creation with LocalShellBackend
│   ├── deepagents_cli/app.py            Main Textual application
│   └── deepagents_cli/tools.py          CLI-specific tools (web search, etc.)
├── libs/acp/                            Agent Client Protocol for Zed editor
├── libs/evals/                          Evaluation suite & Harbor integration
│   ├── deepagents_evals/                Category tagging, radar charts
│   └── deepagents_harbor/               LangSmith backend for eval runs
├── libs/partners/                       Sandbox integrations
│   ├── daytona/                         Daytona cloud sandbox
│   ├── modal/                           Modal serverless sandbox
│   ├── runloop/                         RunLoop sandbox
│   └── quickjs/                         QuickJS lightweight runtime
├── examples/                            Reference agents (deep_research, text-to-sql, etc.)
└── .github/                             CI/CD workflows, release-please config
The Middleware-First Architecture

Deep Agents' power comes from its ordered middleware pipeline, which intercepts every LLM call to dynamically inject tools, manage context windows, and orchestrate sub-agents. A pluggable backend protocol then abstracts storage and execution, letting the same agent run against ephemeral state, local filesystems, or remote sandboxes without changing a single line of agent logic.