v0.5.0 · 7 min read

DeepAgents Architecture

Batteries-included agent harness built on LangGraph — planning, filesystem, sub-agents, and context management out of the box.

Tags: langgraph, agents, middleware, sub-agents, context-management, python-sdk

System Layers

Clients & Interfaces
Deep Agents CLI — Textual TUI, headless mode
ACP (Zed) — Agent Client Protocol
LangGraph Studio — visual debugging UI
Python SDK — create_deep_agent()
Core Engine — create_deep_agent()
LangGraph CompiledStateGraph — create_agent() plus the middleware stack
System Prompt — BASE_AGENT_PROMPT plus custom additions
LLM (default: Claude Sonnet 4.6) — provider-agnostic via init_chat_model
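A minimal sketch of wiring up the core engine. The keyword names (`model`, `system_prompt`) and the model string are assumptions based on this page, not a verbatim API reference; check the SDK docs for your installed version.

```python
# Hypothetical usage sketch of the core factory; requires the deepagents
# package, so the import is deferred into the builder function.
def build_agent():
    from deepagents import create_deep_agent

    return create_deep_agent(
        # "provider:model" string, resolved via init_chat_model (assumption)
        model="anthropic:claude-sonnet-4-6",
        system_prompt="You are a careful research assistant.",
    )

# agent = build_agent()
# result = agent.invoke({"messages": [{"role": "user", "content": "Plan a refactor."}]})
```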
Middleware Pipeline (ordered)
TodoListMiddleware — write_todos planning
SkillsMiddleware — progressive disclosure of skills
FilesystemMiddleware — read/write/edit/ls/glob/grep tools
SubAgentMiddleware — task() tool delegation
SummarizationMiddleware — auto-compaction on token overflow
PatchToolCallsMiddleware — dangling tool-call repair
AsyncSubAgentMiddleware — remote LangGraph tasks
AnthropicPromptCachingMiddleware — cache-prefix optimization
MemoryMiddleware — AGENTS.md injection
HumanInTheLoopMiddleware — interrupt_on tool approval
Tools (LLM-callable)
write_todos — task breakdown & tracking
read_file — paginated reads with line numbers
write_file — create new files
edit_file — exact string replacement
ls / glob / grep — directory navigation & search
execute — shell commands (sandboxed)
task — delegate to sub-agents
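The edit_file tool's "exact string replacement" contract can be illustrated with a small self-contained sketch. This shows the semantics only (fail loudly when the target string is missing or ambiguous), not the library's actual implementation.

```python
def edit_exact(content: str, old: str, new: str) -> str:
    """Replace `old` with `new` only if it occurs exactly once in `content`."""
    count = content.count(old)
    if count == 0:
        raise ValueError("old string not found")
    if count > 1:
        raise ValueError(f"old string is ambiguous ({count} matches)")
    return content.replace(old, new, 1)
```

Requiring a unique match is what makes the tool safe for an LLM to call: the model must quote enough surrounding context to pin down one location before an edit is applied.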
Backends (Pluggable Storage & Execution)
StateBackend — ephemeral, lives in LangGraph state
FilesystemBackend — local disk I/O
StoreBackend — persistent via BaseStore
LangSmithSandbox — remote sandboxed execution
LocalShellBackend — local shell + filesystem
CompositeBackend — path-prefix routing
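CompositeBackend's path-prefix routing can be sketched in a few lines. The route table and backend names below are simplified stand-ins for illustration, not the library's real types; the key idea is that the longest matching prefix wins and everything else falls through to a default backend.

```python
def route(prefixes: dict[str, str], default: str, path: str) -> str:
    """Pick the backend whose registered prefix matches `path`.

    Longest matching prefix wins; unmatched paths go to `default`.
    """
    matches = [p for p in prefixes if path.startswith(p)]
    if not matches:
        return default
    return prefixes[max(matches, key=len)]

# Hypothetical route table: persistent memories vs ephemeral scratch space.
routes = {"/memories/": "store", "/scratch/": "state"}
```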
External Services & Integrations
Anthropic Claude — default provider
OpenAI GPT — Responses API / Chat
Google Gemini — via langchain-google-genai
LangSmith — tracing, evals, deployments
MCP Servers — via langchain-mcp-adapters
Partner Sandboxes — Daytona, Modal, RunLoop, QuickJS

Core Flow — Agent Invocation Lifecycle

1. User invokes agent — create_deep_agent() assembles the middleware stack, tools, and system prompt into a CompiledStateGraph.
2. Middleware intercepts — each middleware's wrap_model_call() fires before every LLM request: it injects system context, filters tools, and manages state.
3. LLM reasoning — the model receives the system prompt, conversation history, and available tools, then emits text and/or tool_calls.
4. Tool execution — LangGraph's tool node dispatches calls to read_file, execute, task, etc. against the resolved backend.
5. Sub-agent delegation — the task tool spawns a child agent with its own middleware stack, isolated context, and shared backend.
6. Context management — SummarizationMiddleware monitors token usage; when the threshold is exceeded, it offloads history and injects a summary.
7. Loop or return — if tool calls remain, control loops back to step 2; otherwise the final response is returned. Recursion limit: 1000 steps.
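The steps above can be condensed into a control-flow sketch. The real loop is a LangGraph state machine, so this toy `run_agent` only illustrates steps 3 and 7: call the model, dispatch any tool calls, and stop when none remain or the recursion limit is hit.

```python
# Simplified simulation of the invocation loop. `llm_step` stands in for one
# model call plus middleware; it returns the updated state and any tool calls.
def run_agent(llm_step, recursion_limit: int = 1000):
    state = {"messages": []}
    for _ in range(recursion_limit):
        state, tool_calls = llm_step(state)
        if not tool_calls:
            return state  # final response: no pending tool calls
        # ...dispatch tool_calls against the resolved backend here...
    raise RuntimeError("recursion limit exceeded")
```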

Agent Coordination Model

Synchronous Sub-Agents
Declared via SubAgent TypedDict (name, description, system_prompt)
Invoked through the task tool by name
Each sub-agent gets its own create_agent() graph with full middleware stack
A default general-purpose sub-agent is added automatically
Supports CompiledSubAgent for pre-built runnables
Sub-agents inherit parent tools unless overridden
Human-in-the-loop via interrupt_on per sub-agent
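A sub-agent declaration, per the fields listed above, is just a small typed dict. The TypedDict below is a sketch mirroring those fields; the real type ships with the deepagents package, and the example values are hypothetical.

```python
from typing import TypedDict

# Sketch of the SubAgent spec (name, description, system_prompt per this page).
class SubAgent(TypedDict, total=False):
    name: str
    description: str
    system_prompt: str

# Hypothetical sub-agent the parent would invoke via task(name="researcher", ...).
researcher: SubAgent = {
    "name": "researcher",
    "description": "Deep-dives into one topic and reports back concisely.",
    "system_prompt": "You research thoroughly and cite your sources.",
}
```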
Async Sub-Agents (Remote)
Declared via AsyncSubAgent TypedDict (name, graph_id, url)
Launched on remote LangGraph deployments via SDK
Non-blocking: returns task ID immediately
Main agent can monitor, update, or cancel tasks
Auth via the LANGGRAPH_API_KEY environment variable
Routed through AsyncSubAgentMiddleware
Ideal for long-running background research
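The async variant follows the same pattern but points at a remote deployment. Again, this TypedDict is a sketch of the fields listed above, and the graph ID and URL are hypothetical placeholders.

```python
from typing import TypedDict

# Sketch of the AsyncSubAgent spec (name, graph_id, url per this page).
class AsyncSubAgent(TypedDict, total=False):
    name: str
    graph_id: str
    url: str

background_researcher: AsyncSubAgent = {
    "name": "background-researcher",
    "graph_id": "deep_research",             # hypothetical deployed graph ID
    "url": "https://example.langgraph.app",  # hypothetical deployment URL
}
```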

Key Subsystem — Middleware Pipeline Architecture

deepagents/middleware/
├── __init__.py               ← Exports all middleware classes
├── filesystem.py             ← File tools + backend resolution (FilesystemMiddleware)
├── subagents.py              ← SubAgent specs + task tool (SubAgentMiddleware)
├── async_subagents.py        ← Remote async task orchestration
├── summarization.py          ← Token-aware auto-compaction + offload
├── memory.py                 ← AGENTS.md loading & system prompt injection
├── skills.py                 ← Progressive skill discovery from SKILL.md frontmatter
├── patch_tool_calls.py       ← Repairs dangling tool calls in message history
└── _utils.py                 ← append_to_system_message helper
wrap_model_call() — intercepts every LLM request: injects tools, filters capabilities, and transforms the system prompt.
before_agent() — runs once at agent start: patches state, loads memory and skills, and initializes backends.
Tool Interception — middleware can add, remove, or filter tools per call; the execute tool is removed when no SandboxBackend is available.
BackendFactory — resolves the backend lazily via a factory: Callable[[ToolRuntime], BackendProtocol].
Prompt Caching — AnthropicPromptCachingMiddleware sits near the end of the stack to maximize the stable prompt prefix and cache hits.
Ordering Matters — Todo → Skills → Filesystem → SubAgents → Summarization → Patch → Cache → Memory → HITL.

Data & State Model

AgentState
messages: list[AnyMessage], files: dict[str, FileData], todo_list: list[TodoItem], summarization_events: list
FileData
content: str (utf-8 or base64), encoding: utf-8 | base64, created_at: ISO 8601, modified_at: ISO 8601
SubAgent (TypedDict)
name: str, description: str, system_prompt: str, tools?: Sequence[BaseTool], model?: str | BaseChatModel, middleware?: list[AgentMiddleware]
AsyncSubAgent (TypedDict)
name: str, description: str, graph_id: str, url?: str, headers?: dict[str, str]
BackendProtocol (ABC)
ls(path) → LsResult, read(path, offset, limit) → ReadResult, write(path, content) → WriteResult, edit(path, old, new) → EditResult, grep(pattern) → GrepResult, glob(pattern) → GlobResult
SandboxBackendProtocol
extends BackendProtocol, execute(cmd, timeout?) → ExecuteResponse, upload_files(list[tuple]), download_files(list[str]), id: str (unique sandbox ID)
SkillMetadata
name: str (max 64 chars), description: str (max 1024), path: str (SKILL.md path), license?: str, allowed_tools?: list[str]
ExecuteResponse
output: str (stdout + stderr), exit_code: int | None, truncated: bool
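The backend contract above is structural, so any object with the right methods qualifies. Below is a trimmed sketch: method names follow this page, but the real result types (LsResult, ReadResult, ...) are replaced with plain Python values, and InMemoryBackend is only an illustration in the spirit of StateBackend.

```python
from typing import Protocol

class Backend(Protocol):
    """Trimmed structural sketch of BackendProtocol (three of six methods)."""
    def ls(self, path: str) -> list[str]: ...
    def read(self, path: str, offset: int = 0, limit: int = 100) -> str: ...
    def write(self, path: str, content: str) -> None: ...

class InMemoryBackend:
    """Ephemeral dict-backed store, in the spirit of StateBackend."""
    def __init__(self) -> None:
        self.files: dict[str, str] = {}

    def ls(self, path: str) -> list[str]:
        return [p for p in self.files if p.startswith(path)]

    def read(self, path: str, offset: int = 0, limit: int = 100) -> str:
        # Paginated read: mirrors read_file's offset/limit semantics.
        lines = self.files[path].splitlines()[offset:offset + limit]
        return "\n".join(lines)

    def write(self, path: str, content: str) -> None:
        self.files[path] = content
```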

Package / Directory Map

deepagents/                              Monorepo root (MIT, Python 3.11+)
├── libs/deepagents/                     Core SDK — create_deep_agent(), middleware, backends
│   ├── deepagents/graph.py              Main entry: create_deep_agent() factory
│   ├── deepagents/middleware/           10 middleware classes (filesystem, subagents, summarization, etc.)
│   ├── deepagents/backends/             6 backend implementations (state, filesystem, store, sandbox, etc.)
│   └── deepagents/_models.py            Model resolution: "provider:model" → BaseChatModel
├── libs/cli/                            Terminal agent (Textual TUI + headless mode)
│   ├── deepagents_cli/agent.py          CLI agent creation with LocalShellBackend
│   ├── deepagents_cli/app.py            Main Textual application
│   └── deepagents_cli/tools.py          CLI-specific tools (web search, etc.)
├── libs/acp/                            Agent Client Protocol for Zed editor
├── libs/evals/                          Evaluation suite & Harbor integration
│   ├── deepagents_evals/                Category tagging, radar charts
│   └── deepagents_harbor/               LangSmith backend for eval runs
├── libs/partners/                       Sandbox integrations
│   ├── daytona/                         Daytona cloud sandbox
│   ├── modal/                           Modal serverless sandbox
│   ├── runloop/                         RunLoop sandbox
│   └── quickjs/                         QuickJS lightweight runtime
├── examples/                            Reference agents (deep_research, text-to-sql, etc.)
└── .github/                             CI/CD workflows, release-please config
The Middleware-First Architecture

Deep Agents' power comes from its ordered middleware pipeline, which intercepts every LLM call to dynamically inject tools, manage context windows, and orchestrate sub-agents. A pluggable backend protocol then abstracts storage and execution, letting the same agent run against ephemeral state, local filesystems, or remote sandboxes without changing a single line of agent logic.