## Agent API
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| agent.run() | agent.run(prompt, deps=x) | ✅ | Agents | agent.run(prompt, { deps: x }) |
| agent.run_stream() | agent.run_stream(prompt) | ✅ | Streaming | agent.stream(prompt) |
| Agent name | agent.name | ✅ | Agents | name on AgentOptions |
| System prompt (static) | system_prompt="..." | ✅ | Agents | systemPrompt: "..." |
| System prompt (dynamic) | @agent.system_prompt decorator | ✅ | Agents | agent.addSystemPrompt(fn) or systemPrompt: [fn] |
| Tools | @agent.tool / tools=[...] | ✅ | Tools | agent.addTool(tool({...})) |
| Structured output | result_type: BaseModel | ✅ | Structured Output | outputSchema: z.object({...}) |
| Result validators | @agent.result_validator | ✅ | Result Validators | agent.addResultValidator(fn) |
| Max retries | max_retries / max_result_retries | ✅ | Agents | maxRetries on AgentOptions |
| Max turns | max_turns | ✅ | Agents | maxTurns on AgentOptions |
| Message history | message_history= | ✅ | Message History | { messageHistory: [...] } on run() |
| Metadata tagging | metadata= on run | ✅ | Agents | { metadata: {...} } on run()/stream() - accessible via ctx.metadata |
| Agent.override() | Context manager swapping model/deps/toolsets | ✅ | Testing | agent.override({ model, tools, ... }).run(prompt) |
| Event-stream run | agent.run_stream_events() | ✅ | Streaming | agent.runStreamEvents(prompt) - async iterable of typed AgentStreamEvent objects |
| End strategy | end_strategy | ✅ | Agents | endStrategy: 'early' \| 'exhaustive' on AgentOptions/RunOptions |
| Max concurrency | max_concurrency | ✅ | Agents | maxConcurrency on AgentOptions - semaphore-based cap on concurrent tool executions |
| instructions field | @agent.instructions decorator | ✅ | Agents | instructions on AgentOptions/RunOptions; re-injected each turn, not stored in message history |
| Model-specific settings | model_settings= on run() | ✅ | Models | modelSettings: { temperature, maxTokens, ... } on AgentOptions or RunOptions |
| Sync run | agent.run_sync() | ❌ | - | Deno is async-native - not applicable |
| Node-level iteration | agent.iter() / AgentRun | ❌ | - | Not applicable - use runStreamEvents() for step-by-step observation in async TypeScript |
| Last run messages | agent.last_run_messages | ❌ | - | Removed from Pydantic AI; superseded by result.newMessages |
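The rows above map Python keyword arguments onto TypeScript option objects. A minimal sketch of those shapes, with field names taken from the Notes column (assumptions, not the library's canonical type definitions):

```typescript
// Sketch of the option shapes implied by the table; field names follow
// the Notes column, the real types are richer.
type EndStrategy = "early" | "exhaustive";

interface AgentOptions<TDeps> {
  name?: string;
  systemPrompt?: string | Array<(ctx: { deps: TDeps }) => string>;
  maxRetries?: number;
  maxTurns?: number;
  endStrategy?: EndStrategy;
  maxConcurrency?: number;
}

interface RunOptions<TDeps> {
  deps?: TDeps;
  messageHistory?: unknown[];
  metadata?: Record<string, unknown>;
}

// The Python call agent.run(prompt, deps=x) becomes
// agent.run(prompt, { deps: x, metadata: {...} }) in TypeScript.
const runOpts: RunOptions<{ userId: string }> = {
  deps: { userId: "u1" },
  metadata: { requestId: "r42" },
};
```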
## Tools
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Tools with context | @agent.tool | ✅ | Tools | tool({ execute: (ctx, args) => ... }) |
| Tool maxRetries | retries= on @agent.tool | ✅ | Tools | maxRetries on ToolDefinition |
| Plain tools (no ctx) | @agent.tool_plain | ✅ | Tools | plainTool({ name, description, parameters, execute }) |
| Tool prepare method | prepare= on Tool class | ✅ | Tools | prepare: (ctx) => tool \| null on ToolDefinition |
| args_validator | args_validator= on tool | ✅ | Tools | argsValidator: (args) => void on ToolDefinition |
| Tool.from_schema() | Build tool from raw JSON schema | ✅ | Tools | fromSchema({ name, description, jsonSchema, execute }) |
| Multi-modal returns | Return images / audio / binary from tools | ✅ | Multi-Modal | BinaryContent / BinaryImage - returned from execute, auto-converted |
| UploadedFile support | UploadedFile for provider file uploads | ✅ | Multi-Modal | UploadedFile type + uploadedFileSchema for tool parameters |
| Tool result metadata | Attach metadata keyed by tool_call_id | ✅ | Tools | ctx.attachMetadata(toolCallId, meta) - exposed on result.toolMetadata |
| Output functions | Final-action tools (no model feedback loop) | ✅ | Structured Output | outputTool({ ... }) - sets isOutput: true, ends run on call |
| Sequential execution | sequential=True on tool | ✅ | Tools | sequential: true on ToolDefinition - acquires mutex before executing |
| Deferred tools | Tools requiring human approval before execution | ✅ | Human-in-the-Loop | requiresApproval: true on ToolDefinition - see Deferred Tools section |
| MCP server tools | Connect external MCP servers as tool providers | ✅ | MCP | MCPToolset wraps any MCPClient - see MCP section |
| Docstring extraction | Auto-doc from Python docstrings | ❌ | - | No runtime equivalent in TypeScript - use description field explicitly |
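To make the maxRetries and argsValidator rows concrete, here is a minimal sketch of a per-tool retry loop. ToolDefinition and callWithRetries are reconstructions for illustration, not the library's actual implementation:

```typescript
// Sketch: a tool with maxRetries is re-invoked after a failure, up to
// the configured number of extra attempts.
interface ToolDefinition<TArgs, TResult> {
  name: string;
  description: string;
  maxRetries?: number;
  argsValidator?: (args: TArgs) => void; // throw to reject the arguments
  execute: (args: TArgs) => Promise<TResult>;
}

async function callWithRetries<TArgs, TResult>(
  tool: ToolDefinition<TArgs, TResult>,
  args: TArgs,
): Promise<TResult> {
  const attempts = (tool.maxRetries ?? 1) + 1;
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      tool.argsValidator?.(args);
      return await tool.execute(args);
    } catch (err) {
      // In the real agent loop the error text is fed back to the model
      // so it can correct its next call; here we simply retry.
      lastError = err;
    }
  }
  throw lastError;
}
```

In the library itself the retry is driven by the model loop rather than a blind re-invocation, but the counting behaviour is the same.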
## Toolsets
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| FunctionToolset | Group locally defined function tools | ✅ | Toolsets | new FunctionToolset([tool1, tool2]) |
| CombinedToolset | Merge multiple toolsets into one | ✅ | Toolsets | new CombinedToolset(ts1, ts2) |
| FilteredToolset | Filter a toolset based on context | ✅ | Toolsets | new FilteredToolset(ts, (ctx) => boolean) |
| PrefixedToolset | Add prefix to tool names | ✅ | Toolsets | new PrefixedToolset(ts, "prefix_") |
| RenamedToolset | Map new names onto existing tools | ✅ | Toolsets | new RenamedToolset(ts, { old: "new" }) |
| Toolset reuse | Share toolsets across agents | ✅ | Toolsets | Toolset is a plain interface - pass the same instance to multiple agents |
| Runtime swap | Replace toolsets during testing | ✅ | Testing | agent.override({ toolsets: [...] }).run(prompt) |
| PreparedToolset | Modify entire tool list before each step | ✅ | Toolsets | new PreparedToolset(inner, (ctx, tools) => tools) - dynamic per-turn |
| ApprovalRequiredToolset | Enforce human approval on a toolset | ✅ | Human-in-the-Loop | new ApprovalRequiredToolset(inner) - all tools get requiresApproval |
| WrapperToolset | Custom execution behaviour around a toolset | ✅ | Toolsets | class MyWrapper extends WrapperToolset { callTool(...) { ... } } |
| ExternalToolset | Deferred execution outside agent process | ✅ | Human-in-the-Loop | new ExternalToolset([{ name, description, jsonSchema }]) - schema-only |
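Because Toolset is a plain interface, the combinators above compose freely. A sketch over a deliberately minimal interface (the library's real interface carries descriptions, schemas, and context):

```typescript
// Minimal Toolset interface and two of the combinators from the table,
// reconstructed for illustration.
interface SimpleTool {
  name: string;
  execute: (args: unknown) => Promise<unknown>;
}
interface Toolset {
  tools(): SimpleTool[];
}

class FunctionToolset implements Toolset {
  constructor(private readonly list: SimpleTool[]) {}
  tools() { return this.list; }
}

// PrefixedToolset: rename on the way out, leave the inner set untouched.
class PrefixedToolset implements Toolset {
  constructor(private readonly inner: Toolset, private readonly prefix: string) {}
  tools() {
    return this.inner.tools().map((t) => ({ ...t, name: this.prefix + t.name }));
  }
}

// CombinedToolset: flatten several toolsets into one tool list.
class CombinedToolset implements Toolset {
  private readonly sets: Toolset[];
  constructor(...sets: Toolset[]) { this.sets = sets; }
  tools() { return this.sets.flatMap((s) => s.tools()); }
}
```

Sharing a toolset across agents is then just passing the same instance to both, as the reuse row notes.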
## Deferred tools (human-in-the-loop & external execution)
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| requires_approval=True | Mark a tool as approval-required | ✅ | Human-in-the-Loop | requiresApproval: true on ToolDefinition or tool() options |
| ApprovalRequired exception | Pause agent, surface pending calls to caller | ✅ | Human-in-the-Loop | ApprovalRequiredError - catch it, inspect .requests, resume with results |
| DeferredToolRequests | Container of pending tool calls needing approval | ✅ | Human-in-the-Loop | DeferredToolRequests class with .requests array |
| DeferredToolResults | Provide approved (or overridden) results | ✅ | Human-in-the-Loop | agent.resume(deferred, { results: [...] }) or run(..., { deferredResults }) |
| Argument override on resume | Modify args during approval before execution | ✅ | Human-in-the-Loop | argsOverride field on DeferredToolResult |
| CallDeferred exception | Defer a tool call to an external process | ✅ | Human-in-the-Loop | ExternalToolset throws ApprovalRequiredError for all tools |
| ExternalToolset | Accept raw JSON schema tools for deferred calls | ✅ | Toolsets | new ExternalToolset([{ name, description, jsonSchema }]) |
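The pause/resume flow in this table can be sketched as a single agent step: approved calls execute (optionally with overridden args), unapproved ones throw with the pending requests attached. The class and field names follow the table; the mechanics are a reconstruction:

```typescript
interface DeferredRequest {
  toolCallId: string;
  name: string;
  args: Record<string, unknown>;
}
interface DeferredResult {
  toolCallId: string;
  approved: boolean;
  argsOverride?: Record<string, unknown>; // modify args during approval
}

class ApprovalRequiredError extends Error {
  constructor(public readonly requests: DeferredRequest[]) {
    super(`${requests.length} tool call(s) need approval`);
  }
}

// One agent step: throw if any approval-required call is still pending,
// otherwise execute every call, honouring argsOverride.
async function step(
  calls: DeferredRequest[],
  tools: Record<string, {
    requiresApproval?: boolean;
    execute: (args: Record<string, unknown>) => Promise<unknown>;
  }>,
  approvals: Map<string, DeferredResult> = new Map(),
): Promise<unknown[]> {
  const pending = calls.filter((c) =>
    tools[c.name].requiresApproval && !approvals.get(c.toolCallId)?.approved
  );
  if (pending.length > 0) throw new ApprovalRequiredError(pending);
  return Promise.all(calls.map((c) =>
    tools[c.name].execute(approvals.get(c.toolCallId)?.argsOverride ?? c.args)
  ));
}
```

The caller catches ApprovalRequiredError, shows .requests to a human, and re-runs the step with the collected DeferredResults.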
## Output & Structured Results
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Single schema output | result_type: BaseModel | ✅ | Structured Output | outputSchema: z.object({...}) via final_result tool |
| Result validators | @agent.result_validator | ✅ | Result Validators | addResultValidator(fn) - throw to retry |
| result.all_messages() | Full message history | ✅ | Message History | result.messages (full) + result.newMessages (this run) |
| result.new_messages() | Messages added in this run only | ✅ | Message History | result.newMessages on RunResult and StreamResult |
| @agent.output_validator | Validate output post-parse | ✅ | Result Validators | Covered by addResultValidator |
| Union output types | output_type=[TypeA, TypeB] | ✅ | Structured Output | outputSchema: [schemaA, schemaB] - registers final_result_0, _1… |
| Native structured output | NativeOutput marker class | ✅ | Structured Output | outputMode: 'native' - uses AI SDK Output.object() / JSON mode |
| Prompted output mode | PromptedOutput marker class | ✅ | Structured Output | outputMode: 'prompted' - schema injected into system prompt |
| Streaming structured output | Partial validation as output streams | ✅ | Streaming | result.partialOutput async iterable on StreamResult |
| Message serialization | ModelMessagesTypeAdapter | ✅ | Message History | serializeMessages(msgs) / deserializeMessages(json) |
| Disable schema prompt | template=False on output marker | ✅ | Structured Output | outputTemplate: false on AgentOptions |
| BinaryImage output | Generate images as output type | ✅ | Multi-Modal | outputSchema: BINARY_IMAGE_OUTPUT - first tool result with image/* MIME type becomes the run output as BinaryContent |
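The union-output row says a list of schemas registers numbered final_result tools. A sketch of that registration scheme, with a stand-in matcher in place of real Zod parsing (the tool names follow the Notes column; the rest is an assumption):

```typescript
// Stand-in for a Zod schema: a description plus a membership test.
interface OutputSpec {
  describe: string;
  matches: (v: unknown) => boolean;
}

// One schema -> a single final_result tool; a union of schemas ->
// final_result_0, final_result_1, ... as the table describes.
function registerOutputTools(schemas: OutputSpec[]): Map<string, OutputSpec> {
  const tools = new Map<string, OutputSpec>();
  if (schemas.length === 1) {
    tools.set("final_result", schemas[0]);
  } else {
    schemas.forEach((s, i) => tools.set(`final_result_${i}`, s));
  }
  return tools;
}
```

Whichever registered tool the model calls determines which member of the union the run's typed output is parsed as.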
## Message history
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Pass history to next run | message_history=result.all_messages() | ✅ | Message History | { messageHistory: result.messages } |
| new_messages() | Slice of messages from current run only | ✅ | Message History | result.newMessages on RunResult and StreamResult |
| Cross-model compatibility | Messages work across providers | ✅ | Message History | AI SDK CoreMessage is provider-agnostic |
| History processors | history_processors=[...] | ✅ | Message History | historyProcessors: [trimHistoryProcessor(n), ...] on AgentOptions |
| Message serialization | JSON roundtrip via ModelMessagesTypeAdapter | ✅ | Message History | serializeMessages() / deserializeMessages() |
| Token-aware trimming | Keep last N messages by token count | ✅ | Message History | tokenTrimHistoryProcessor(maxTokens, tokenCounter?) |
| LLM-based summarization | Summarize old turns via a model call | ✅ | Message History | summarizeHistoryProcessor(model, { maxMessages?, summarizePrompt? }) |
| Privacy filtering | Strip sensitive fields before model call | ✅ | Message History | privacyFilterProcessor(rules) - regex + field-path redaction |
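History processors are message-to-message transforms applied in order before each model call. A sketch of the pipeline and a trim processor; the processor's behaviour here (keep the last n messages) is an assumption drawn from its name:

```typescript
// Simplified message shape and the processor pipeline.
interface Msg {
  role: "system" | "user" | "assistant";
  content: string;
}
type HistoryProcessor = (msgs: Msg[]) => Msg[];

// Keep only the last n messages (hypothetical reconstruction of
// trimHistoryProcessor from the table).
const trimHistoryProcessor = (n: number): HistoryProcessor =>
  (msgs) => msgs.slice(Math.max(0, msgs.length - n));

// Processors compose left to right, as in historyProcessors: [...].
function applyProcessors(msgs: Msg[], processors: HistoryProcessor[]): Msg[] {
  return processors.reduce((acc, p) => p(acc), msgs);
}
```

tokenTrimHistoryProcessor and summarizeHistoryProcessor fit the same signature; only the transform differs.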
## Dependencies
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Typed deps | RunContext[MyDeps] | ✅ | Dependencies | Agent<MyDeps, TOutput> - deps typed via generic parameter |
| Deps in tools | ctx: RunContext[MyDeps] in tool | ✅ | Dependencies | ctx.deps in execute(ctx, args) |
| Deps in system prompts | @agent.system_prompt with context | ✅ | Dependencies | agent.addSystemPrompt((ctx) => ...) |
| Deps in result validators | @agent.result_validator with context | ✅ | Dependencies | agent.addResultValidator((ctx, result) => ...) |
| RunContext accessors | .deps, .usage, .metadata | ✅ | Dependencies | Full RunContext<TDeps> type with all accessors |
| Override deps in tests | Pass fake deps via agent.override() | ✅ | Testing | agent.override({ deps: fakeDeps }).run(prompt) |
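The generic-parameter approach above replaces Python's RunContext[MyDeps] annotation: deps are typed once on the agent and flow to every tool, prompt, and validator. A minimal sketch (RunContext is simplified to just .deps):

```typescript
// Deps typed via a generic parameter, as in Agent<MyDeps, TOutput>.
interface RunContext<TDeps> {
  deps: TDeps;
}

interface GreeterDeps {
  greet: (name: string) => string;
}

// A dynamic system prompt reading from typed deps, mirroring
// agent.addSystemPrompt((ctx) => ...).
const systemPrompt = (ctx: RunContext<GreeterDeps>) =>
  `You can greet users, e.g. ${ctx.deps.greet("Ada")}`;
```

Overriding deps in tests is then just constructing a RunContext with a fake implementation of the same interface.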
## Usage & Limits

## Errors
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| UserError | Raised for bad agent config | ✅ | Error Handling | |
| UnexpectedModelBehavior | Malformed model response | ✅ | Error Handling | |
| MaxRetriesExceeded | Too many tool/output retries | ✅ | Error Handling | |
| MaxTurnsReached | Hit maxTurns cap | ✅ | Error Handling | |
| UsageLimitExceeded | Hit a UsageLimits cap | ✅ | Results | |
| ApprovalRequiredError | Tool requires human approval | ✅ | Human-in-the-Loop | |
| ModelRequestsDisabledError | Thrown when setAllowModelRequests(false) is active | ✅ | Testing | |
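A plausible shape for these error classes is a shared base, so callers can catch broadly or narrowly. The common AgentError base is an assumption for illustration; the class names come from the table:

```typescript
// Hypothetical common base for the error classes listed above.
class AgentError extends Error {}

class MaxTurnsReached extends AgentError {
  constructor(public readonly maxTurns: number) {
    super(`run exceeded ${maxTurns} turns`);
  }
}

class UsageLimitExceeded extends AgentError {}
```

With a hierarchy like this, `catch (e) { if (e instanceof AgentError) ... }` handles any agent failure, while `e instanceof MaxTurnsReached` targets one cause.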
## MCP (Model Context Protocol)
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| MCPServerStdio | Subprocess stdio transport | ✅ | MCP | MCPStdioClient using @modelcontextprotocol/sdk |
| MCPServerStreamableHTTP | HTTP Streamable transport | ✅ | MCP | MCPHttpClient using StreamableHTTPClientTransport |
| MCPServerSSE | Server-Sent Events transport (deprecated) | ❌ | - | Prefer MCPHttpClient (StreamableHTTP) |
| Dynamic tool discovery | Auto-convert MCP tools to Pydantic AI tools | ✅ | MCP | MCPToolset.tools() fetches and converts MCP tools automatically |
| Elicitation support | MCP server can request structured input | ✅ | MCP | elicitationCallback option on MCPToolset |
| Server instructions | Access MCP server instructions post-connect | ✅ | MCP | mcpToolset.getServerInstructions() |
| Tool caching | Cache discovered tools with invalidation | ✅ | MCP | toolCacheTtlMs option on MCPToolset (default 60 s) |
| Multi-server support | Mount multiple MCP servers simultaneously | ✅ | MCP | MCPManager - add servers, call .connect(), use as a Toolset |
| Config file loading | Load MCP config with env variable references | ✅ | MCP | loadMCPConfig(path) - supports ${ENV_VAR} interpolation |
## Testing
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Mock model | MockLanguageModelV1 (from ai/test) | ✅ | Testing | Equivalent to Pydantic AI’s TestModel |
| Multi-turn mock | mockValues(...) | ✅ | Testing | Cycle through responses across turns |
| Stream mock | convertArrayToReadableStream | ✅ | Testing | Build mock stream chunks |
| Agent.override() | Swap model/deps/toolsets in tests without modifying app code | ✅ | Testing | agent.override({ model: mockModel }).run(prompt) |
| capture_run_messages() | Context manager to inspect all model request/response objects | ✅ | Testing | captureRunMessages(() => agent.run(...)) returns messages[][] |
| ALLOW_MODEL_REQUESTS=False | Global flag to prevent accidental real API calls | ✅ | Testing | setAllowModelRequests(false) - throws ModelRequestsDisabledError |
| TestModel | Auto-generates valid structured data from schema, calls all tools | ✅ | Testing | new TestModel() / createTestModel({ outputSchema }) - schema-aware |
| FunctionModel | Custom function drives model responses | ✅ | Testing | new FunctionModel((params) => result) - full control per turn |
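The FunctionModel and mockValues rows share one idea: a plain function decides each turn's response. A sketch with heavily simplified interfaces (the real model double speaks the full model protocol, not strings):

```typescript
// A function drives the response; it sees the prompt and the turn index.
type ModelFn = (params: { prompt: string; turn: number }) => string;

class FunctionModel {
  private turn = 0;
  constructor(private readonly fn: ModelFn) {}
  respond(prompt: string): string {
    return this.fn({ prompt, turn: this.turn++ });
  }
}

// mockValues(...) cycles through canned responses across turns.
const mockValues = (...values: string[]): ModelFn =>
  ({ turn }) => values[turn % values.length];
```

Pairing this with agent.override({ model: ... }) keeps application code untouched in tests.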
## Multi-Agent
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Agent-as-tool | Tool that calls child.run(usage=ctx.usage) internally | ✅ | Multi-Agent | Pattern: tool({ execute: async (ctx, { prompt }) => { const r = await child.run(prompt, { deps: ctx.deps }); ... } }) |
| Usage aggregation | Pass usage=ctx.usage to sub-agent to merge costs | ✅ | Multi-Agent | Manually add sub-agent usage to ctx.usage inside the tool |
| Programmatic hand-off | App code dispatches agents sequentially | ✅ | Multi-Agent | Documented pattern |
| pydantic_graph - FSM | Typed state machine with BaseNode | ✅ | Graph | Graph, BaseNode, GraphRun |
| Graph state persistence | SimpleStatePersistence, FileStatePersistence | ✅ | Graph | MemoryStatePersistence, FileStatePersistence - pause/resume across restarts |
| Graph visualization | Mermaid diagram generation | ✅ | Graph | toMermaid(graph, nodes) returns Mermaid flowchart string |
| Graph.iter() / .next() | Manual stepping through graph nodes | ✅ | Graph | graph.runIter(state, startNode) returns GraphRun with .next() method |
| A2A protocol | agent.to_a2a() - expose agent as ASGI A2A server | ✅ | - (docs coming soon) | new A2AAdapter(agent, opts) - JSON-RPC handler with tasks/send, tasks/get, tasks/cancel, agent card at /.well-known/agent.json |
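The agent-as-tool row with usage aggregation can be sketched as follows: the parent's tool runs a child agent and folds the child's token usage into the parent context. The child here is a plain async function standing in for a real Agent:

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

const addUsage = (into: Usage, from: Usage) => {
  into.inputTokens += from.inputTokens;
  into.outputTokens += from.outputTokens;
};

// Stand-in for child.run(prompt): returns output plus its own usage.
type ChildAgent = (prompt: string) => Promise<{ output: string; usage: Usage }>;

// The parent tool's execute body: delegate, then merge costs so the
// parent run's usage reflects the whole tree.
async function delegateTool(
  ctx: { usage: Usage },
  prompt: string,
  child: ChildAgent,
): Promise<string> {
  const r = await child(prompt);
  addUsage(ctx.usage, r.usage);
  return r.output;
}
```

This mirrors the Python pattern of passing usage=ctx.usage into the sub-agent, except the merge is explicit in the tool body.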
## Observability
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Logfire integration | Auto-traces runs, turns, and tool calls | ❌ | - | Not applicable (Logfire is Python-only) |
| OpenTelemetry support | OTel Gen-AI semantic conventions | ✅ | OpenTelemetry | instrumentAgent(agent, opts) - uses AI SDK experimental_telemetry |
| Run-level spans | Structured spans per run with metadata | ✅ | OpenTelemetry | AI SDK auto-creates run spans when telemetry is enabled |
| Tool-level spans | Span per tool call with args and result | ✅ | OpenTelemetry | AI SDK auto-creates tool spans when telemetry is enabled |
| HTTPX instrumentation | Capture raw HTTP request/response | ❌ | - | Not applicable (no HTTPX in Deno/Node) |
| Custom TracerProvider | Bring your own OTel tracer | ✅ | OpenTelemetry | Pass tracer in TelemetrySettings via instrumentAgent or modelSettings |
| Content exclusion | Strip prompt/response from spans | ✅ | OpenTelemetry | excludeContent: true on InstrumentationOptions → recordInputs/Outputs: false |
## Evaluation framework (Pydantic Evals)
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Datasets & Cases | Dataset, Case - typed test scenarios | ✅ | Evals | Dataset.fromArray(), Dataset.fromJSON(), Dataset.fromFile() |
| Built-in evaluators | Exact match, type validation | ✅ | Evals | equalsExpected(), equals(), contains(), isInstance(), isValidSchema(), maxDuration(), hasMatchingSpan(), custom() |
| LLM-as-judge | LLM-based evaluators for subjective qualities | ✅ | Evals | llmJudge({ rubric, model }) + helpers judgeOutput(), judgeInputOutput(), etc. |
| Custom evaluators | Domain-specific scoring functions | ✅ | Evals | custom(name, fn) factory or implement Evaluator interface directly |
| Report-level evaluators | Confusion matrix, precision/recall, ROC AUC, KS | ✅ | Evals | confusionMatrix(), precisionRecall(), rocAuc(), kolmogorovSmirnov() |
| Span-based evaluation | Score runs via OTel trace spans | ✅ | Evals | SpanTree, hasMatchingSpan() - build SpanTree.fromSpanData() from captured spans |
| Experiments | Run and compare datasets across model/prompt combos | ✅ | Evals | dataset.evaluate(task, opts) or runExperiment({ dataset, task, evaluators }) |
| Logfire integration | Visualize eval results in Logfire | ❌ | - | Not applicable (Logfire is Python-only) |
| Async + concurrency | Configurable concurrency and retries for evals | ✅ | Evals | maxConcurrency + maxRetries on EvaluateOptions; Semaphore-based |
| Dataset generation | LLM-generated test cases from Zod schemas | ✅ | Evals | generateDataset({ model, inputSchema, nExamples }) |
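The report-level evaluators row names precision/recall among others; the underlying computation over boolean (expected, predicted) pairs is standard, and the function name follows the table:

```typescript
// Precision and recall over (expected, predicted) boolean pairs.
// precision = TP / (TP + FP), recall = TP / (TP + FN).
function precisionRecall(pairs: Array<[expected: boolean, predicted: boolean]>) {
  let tp = 0, fp = 0, fn = 0;
  for (const [expected, predicted] of pairs) {
    if (predicted && expected) tp++;        // true positive
    else if (predicted && !expected) fp++;  // false positive
    else if (!predicted && expected) fn++;  // false negative
  }
  return {
    precision: tp + fp === 0 ? 0 : tp / (tp + fp),
    recall: tp + fn === 0 ? 0 : tp / (tp + fn),
  };
}
```

The confusion-matrix, ROC AUC, and Kolmogorov-Smirnov evaluators operate over the same per-case results, just with different aggregations.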
## Durable execution
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Temporal integration | TemporalAgent - offloads model/tool calls to activities | ✅ | Temporal | TemporalAgent + MockTemporalAgent - requires Node.js for Temporal worker |
| DBOS integration | Postgres-backed state checkpointing | ❌ | - | Not implemented (skipped by design) |
| Prefect integration | Transactional task semantics with cache keys | ❌ | - | Not implemented (skipped by design) |
## AG-UI Protocol
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| AG-UI event streaming | AGUIAdapter.run_stream() - agent-to-UI events | ✅ | AG-UI | AGUIAdapter.handleRequest(input) returns SSE Response |
| Follow-up messaging | Continue conversation after tool call results | ✅ | AG-UI | input.messages history passed as messageHistory automatically |
| Structured event types | Typed event payloads for all agent actions | ✅ | AG-UI | AGUIEvent discriminated union with 16 event variants |
## Multi-Modal Support
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Image input to tools | Pass images into tool parameters | ✅ | Multi-Modal | binaryContentSchema in tool parameters; BinaryContent in execute |
| Audio / video input | Audio and video as tool parameters | ✅ | Multi-Modal | BinaryContent with audio/video MIME types; isAudioContent() type guard |
| Document input | PDFs and documents as tool parameters | ✅ | Multi-Modal | BinaryContent with application/pdf etc.; isDocumentContent() guard |
| UploadedFile | File reference for provider file uploads | ✅ | Multi-Modal | UploadedFile type + uploadedFileSchema + uploadedFileToToolResult() |
| BinaryImage output | Agent returns a generated image | ✅ | Multi-Modal | outputSchema: BINARY_IMAGE_OUTPUT - agent returns BinaryContent when a tool produces an image/* result |
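The type guards named in this table (isAudioContent, isDocumentContent, and friends) plausibly dispatch on the BinaryContent MIME type. A sketch; the real guards may check more than the prefix:

```typescript
// Simplified BinaryContent and the MIME-based guards from the table.
interface BinaryContent {
  data: Uint8Array;
  mimeType: string;
}

const isImageContent = (c: BinaryContent) => c.mimeType.startsWith("image/");
const isAudioContent = (c: BinaryContent) => c.mimeType.startsWith("audio/");
const isDocumentContent = (c: BinaryContent) =>
  c.mimeType === "application/pdf" || c.mimeType.startsWith("text/");
```

These guards let a single BinaryContent type cover images, audio, video, and documents while tools branch on the concrete kind.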