## Agent API
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| agent.run() | agent.run(prompt, deps=x) | ✅ | Agents | agent.run(prompt, { deps: x }) |
| agent.run_stream() | agent.run_stream(prompt) | ✅ | Streaming | agent.stream(prompt) |
| Agent name | agent.name | ✅ | Agents | name on AgentOptions |
| System prompt (static) | system_prompt="..." | ✅ | Agents | systemPrompt: "..." |
| System prompt (dynamic) | @agent.system_prompt decorator | ✅ | Agents | agent.addSystemPrompt(fn) or systemPrompt: [fn] |
| Tools | @agent.tool / tools=[...] | ✅ | Tools | agent.addTool(tool({...})) |
| Structured output | result_type: BaseModel | ✅ | Structured Output | outputSchema: z.object({...}) |
| Result validators | @agent.result_validator | ✅ | Result Validators | agent.addResultValidator(fn) |
| Max retries | max_retries / max_result_retries | ✅ | Agents | maxRetries on AgentOptions |
| Max turns | max_turns | ✅ | Agents | maxTurns on AgentOptions |
| Message history | message_history= | ✅ | Message History | { messageHistory: [...] } on run() |
| Metadata tagging | metadata= on run | ✅ | Agents | { metadata: {...} } on run()/stream() - accessible via ctx.metadata |
| Agent.override() | Context manager swapping model/deps/toolsets | ✅ | Testing | agent.override({ model, tools, ... }).run(prompt) |
| Event-stream run | agent.run_stream_events() | ✅ | Streaming | agent.runStreamEvents(prompt) - async iterable of typed AgentStreamEvent objects |
| End strategy | end_strategy | ✅ | Agents | endStrategy: 'early' \| 'exhaustive' on AgentOptions/RunOptions |
| Max concurrency | max_concurrency | ✅ | Agents | maxConcurrency on AgentOptions - semaphore-based cap on concurrent tool executions |
| instructions field | @agent.instructions decorator | ✅ | Agents | instructions on AgentOptions/RunOptions; re-injected each turn, not stored in message history |
| Model-specific settings | model_settings= on run() | ✅ | Models | modelSettings: { temperature, maxTokens, ... } on AgentOptions or RunOptions |
| Sync run | agent.run_sync() | ❌ | - | Deno is async-native - not applicable |
| Node-level iteration | agent.iter() / AgentRun | ❌ | - | Not applicable - use runStreamEvents() for step-by-step observation in async TypeScript |
| Last run messages | agent.last_run_messages | ❌ | - | Removed from Pydantic AI; superseded by result.newMessages |
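The rows above map Python keyword arguments onto TypeScript option objects. A minimal sketch of those shapes, with field names taken from the Notes column (assumptions, not the library's canonical type definitions):

```typescript
// Sketch of the option shapes implied by the table; field names follow
// the Notes column, the real types are richer.
type EndStrategy = "early" | "exhaustive";

interface AgentOptions<TDeps> {
  name?: string;
  systemPrompt?: string | Array<(ctx: { deps: TDeps }) => string>;
  maxRetries?: number;
  maxTurns?: number;
  endStrategy?: EndStrategy;
  maxConcurrency?: number;
}

interface RunOptions<TDeps> {
  deps?: TDeps;
  messageHistory?: unknown[];
  metadata?: Record<string, unknown>;
}

// The Python call agent.run(prompt, deps=x) becomes
// agent.run(prompt, { deps: x, metadata: {...} }) in TypeScript.
const runOpts: RunOptions<{ userId: string }> = {
  deps: { userId: "u1" },
  metadata: { requestId: "r42" },
};
```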
## Tools
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Tools with context | @agent.tool | ✅ | Tools | tool({ execute: (ctx, args) => ... }) |
| Tool maxRetries | retries= on @agent.tool | ✅ | Tools | maxRetries on ToolDefinition |
| Plain tools (no ctx) | @agent.tool_plain | ✅ | Tools | plainTool({ name, description, parameters, execute }) |
| Tool prepare method | prepare= on Tool class | ✅ | Tools | prepare: (ctx) => tool \| null on ToolDefinition |
| args_validator | args_validator= on tool | ✅ | Tools | argsValidator: (args) => void on ToolDefinition |
| Tool.from_schema() | Build tool from raw JSON schema | ✅ | Tools | fromSchema({ name, description, jsonSchema, execute }) |
| Multi-modal returns | Return images / audio / binary from tools | ✅ | Multi-Modal | BinaryContent / BinaryImage - returned from execute, auto-converted |
| UploadedFile support | UploadedFile for provider file uploads | ✅ | Multi-Modal | UploadedFile type + uploadedFileSchema for tool parameters |
| Tool result metadata | Attach metadata keyed by tool_call_id | ✅ | Tools | ctx.attachMetadata(toolCallId, meta) - exposed on result.toolMetadata |
| Output functions | Final-action tools (no model feedback loop) | ✅ | Structured Output | outputTool({ ... }) - sets isOutput: true, ends run on call |
| Sequential execution | sequential=True on tool | ✅ | Tools | sequential: true on ToolDefinition - acquires mutex before executing |
| Deferred tools | Tools requiring human approval before execution | ✅ | Human-in-the-Loop | requiresApproval: true on ToolDefinition - see Deferred Tools section |
| MCP server tools | Connect external MCP servers as tool providers | ✅ | MCP | MCPToolset wraps any MCPClient - see MCP section |
| Docstring extraction | Auto-doc from Python docstrings | ❌ | - | No runtime equivalent in TypeScript - use description field explicitly |
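To make the maxRetries and argsValidator rows concrete, here is a minimal sketch of a per-tool retry loop. ToolDefinition and callWithRetries are reconstructions for illustration, not the library's actual implementation:

```typescript
// Sketch: a tool with maxRetries is re-invoked after a failure, up to
// the configured number of extra attempts.
interface ToolDefinition<TArgs, TResult> {
  name: string;
  description: string;
  maxRetries?: number;
  argsValidator?: (args: TArgs) => void; // throw to reject the arguments
  execute: (args: TArgs) => Promise<TResult>;
}

async function callWithRetries<TArgs, TResult>(
  tool: ToolDefinition<TArgs, TResult>,
  args: TArgs,
): Promise<TResult> {
  const attempts = (tool.maxRetries ?? 1) + 1;
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      tool.argsValidator?.(args);
      return await tool.execute(args);
    } catch (err) {
      // In the real agent loop the error text is fed back to the model
      // so it can correct its next call; here we simply retry.
      lastError = err;
    }
  }
  throw lastError;
}
```

In the library itself the retry is driven by the model loop rather than a blind re-invocation, but the counting behaviour is the same.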
## Toolsets
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| FunctionToolset | Group locally defined function tools | ✅ | Toolsets | new FunctionToolset([tool1, tool2]) |
| CombinedToolset | Merge multiple toolsets into one | ✅ | Toolsets | new CombinedToolset(ts1, ts2) |
| FilteredToolset | Filter a toolset based on context | ✅ | Toolsets | new FilteredToolset(ts, (ctx) => boolean) |
| PrefixedToolset | Add prefix to tool names | ✅ | Toolsets | new PrefixedToolset(ts, "prefix_") |
| RenamedToolset | Map new names onto existing tools | ✅ | Toolsets | new RenamedToolset(ts, { old: "new" }) |
| Toolset reuse | Share toolsets across agents | ✅ | Toolsets | Toolset is a plain interface - pass the same instance to multiple agents |
| Runtime swap | Replace toolsets during testing | ✅ | Testing | agent.override({ toolsets: [...] }).run(prompt) |
| PreparedToolset | Modify entire tool list before each step | ✅ | Toolsets | new PreparedToolset(inner, (ctx, tools) => tools) - dynamic per-turn |
| ApprovalRequiredToolset | Enforce human approval on a toolset | ✅ | Human-in-the-Loop | new ApprovalRequiredToolset(inner) - all tools get requiresApproval |
| WrapperToolset | Custom execution behaviour around a toolset | ✅ | Toolsets | class MyWrapper extends WrapperToolset { callTool(...) { ... } } |
| ExternalToolset | Deferred execution outside agent process | ✅ | Human-in-the-Loop | new ExternalToolset([{ name, description, jsonSchema }]) - schema-only |
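Because Toolset is a plain interface, the combinators above compose freely. A sketch over a deliberately minimal interface (the library's real interface carries descriptions, schemas, and context):

```typescript
// Minimal Toolset interface and two of the combinators from the table,
// reconstructed for illustration.
interface SimpleTool {
  name: string;
  execute: (args: unknown) => Promise<unknown>;
}
interface Toolset {
  tools(): SimpleTool[];
}

class FunctionToolset implements Toolset {
  constructor(private readonly list: SimpleTool[]) {}
  tools() { return this.list; }
}

// PrefixedToolset: rename on the way out, leave the inner set untouched.
class PrefixedToolset implements Toolset {
  constructor(private readonly inner: Toolset, private readonly prefix: string) {}
  tools() {
    return this.inner.tools().map((t) => ({ ...t, name: this.prefix + t.name }));
  }
}

// CombinedToolset: flatten several toolsets into one tool list.
class CombinedToolset implements Toolset {
  private readonly sets: Toolset[];
  constructor(...sets: Toolset[]) { this.sets = sets; }
  tools() { return this.sets.flatMap((s) => s.tools()); }
}
```

Sharing a toolset across agents is then just passing the same instance to both, as the reuse row notes.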
## Deferred tools (human-in-the-loop & external execution)
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| requires_approval=True | Mark a tool as approval-required | ✅ | Human-in-the-Loop | requiresApproval: true on ToolDefinition or tool() options |
| ApprovalRequired exception | Pause agent, surface pending calls to caller | ✅ | Human-in-the-Loop | ApprovalRequiredError - catch it, inspect .requests, resume with results |
| DeferredToolRequests | Container of pending tool calls needing approval | ✅ | Human-in-the-Loop | DeferredToolRequests class with .requests array |
| DeferredToolResults | Provide approved (or overridden) results | ✅ | Human-in-the-Loop | agent.resume(deferred, { results: [...] }) or run(..., { deferredResults }) |
| Argument override on resume | Modify args during approval before execution | ✅ | Human-in-the-Loop | argsOverride field on DeferredToolResult |
| CallDeferred exception | Defer a tool call to an external process | ✅ | Human-in-the-Loop | ExternalToolset throws ApprovalRequiredError for all tools |
| ExternalToolset | Accept raw JSON schema tools for deferred calls | ✅ | Toolsets | new ExternalToolset([{ name, description, jsonSchema }]) |
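The pause/resume flow in this table can be sketched as a single agent step: approved calls execute (optionally with overridden args), unapproved ones throw with the pending requests attached. The class and field names follow the table; the mechanics are a reconstruction:

```typescript
interface DeferredRequest {
  toolCallId: string;
  name: string;
  args: Record<string, unknown>;
}
interface DeferredResult {
  toolCallId: string;
  approved: boolean;
  argsOverride?: Record<string, unknown>; // modify args during approval
}

class ApprovalRequiredError extends Error {
  constructor(public readonly requests: DeferredRequest[]) {
    super(`${requests.length} tool call(s) need approval`);
  }
}

// One agent step: throw if any approval-required call is still pending,
// otherwise execute every call, honouring argsOverride.
async function step(
  calls: DeferredRequest[],
  tools: Record<string, {
    requiresApproval?: boolean;
    execute: (args: Record<string, unknown>) => Promise<unknown>;
  }>,
  approvals: Map<string, DeferredResult> = new Map(),
): Promise<unknown[]> {
  const pending = calls.filter((c) =>
    tools[c.name].requiresApproval && !approvals.get(c.toolCallId)?.approved
  );
  if (pending.length > 0) throw new ApprovalRequiredError(pending);
  return Promise.all(calls.map((c) =>
    tools[c.name].execute(approvals.get(c.toolCallId)?.argsOverride ?? c.args)
  ));
}
```

The caller catches ApprovalRequiredError, shows .requests to a human, and re-runs the step with the collected DeferredResults.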
## Output & Structured Results
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Single schema output | result_type: BaseModel | ✅ | Structured Output | outputSchema: z.object({...}) via final_result tool |
| Result validators | @agent.result_validator | ✅ | Result Validators | addResultValidator(fn) - throw to retry |
| result.all_messages() | Full message history | ✅ | Message History | result.messages (full) + result.newMessages (this run) |
| result.new_messages() | Messages added in this run only | ✅ | Message History | result.newMessages on RunResult and StreamResult |
| @agent.output_validator | Validate output post-parse | ✅ | Result Validators | Covered by addResultValidator |
| Union output types | output_type=[TypeA, TypeB] | ✅ | Structured Output | outputSchema: [schemaA, schemaB] - registers final_result_0, _1… |
| Native structured output | NativeOutput marker class | ✅ | Structured Output | outputMode: 'native' - uses AI SDK Output.object() / JSON mode |
| Prompted output mode | PromptedOutput marker class | ✅ | Structured Output | outputMode: 'prompted' - schema injected into system prompt |
| Streaming structured output | Partial validation as output streams | ✅ | Streaming | result.partialOutput async iterable on StreamResult |
| Message serialization | ModelMessagesTypeAdapter | ✅ | Message History | serializeMessages(msgs) / deserializeMessages(json) |
| Disable schema prompt | template=False on output marker | ✅ | Structured Output | outputTemplate: false on AgentOptions |
| BinaryImage output | Generate images as output type | ✅ | Multi-Modal | outputSchema: BINARY_IMAGE_OUTPUT - first tool result with image/* MIME type becomes the run output as BinaryContent |
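The union-output row says a list of schemas registers numbered final_result tools. A sketch of that registration scheme, with a stand-in matcher in place of real Zod parsing (the tool names follow the Notes column; the rest is an assumption):

```typescript
// Stand-in for a Zod schema: a description plus a membership test.
interface OutputSpec {
  describe: string;
  matches: (v: unknown) => boolean;
}

// One schema -> a single final_result tool; a union of schemas ->
// final_result_0, final_result_1, ... as the table describes.
function registerOutputTools(schemas: OutputSpec[]): Map<string, OutputSpec> {
  const tools = new Map<string, OutputSpec>();
  if (schemas.length === 1) {
    tools.set("final_result", schemas[0]);
  } else {
    schemas.forEach((s, i) => tools.set(`final_result_${i}`, s));
  }
  return tools;
}
```

Whichever registered tool the model calls determines which member of the union the run's typed output is parsed as.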
## Message history
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Pass history to next run | message_history=result.all_messages() | ✅ | Message History | { messageHistory: result.messages } |
| new_messages() | Slice of messages from current run only | ✅ | Message History | result.newMessages on RunResult and StreamResult |
| Cross-model compatibility | Messages work across providers | ✅ | Message History | AI SDK CoreMessage is provider-agnostic |
| History processors | history_processors=[...] | ✅ | Message History | historyProcessors: [trimHistoryProcessor(n), ...] on AgentOptions |
| Message serialization | JSON roundtrip via ModelMessagesTypeAdapter | ✅ | Message History | serializeMessages() / deserializeMessages() |
| Token-aware trimming | Keep last N messages by token count | ✅ | Message History | tokenTrimHistoryProcessor(maxTokens, tokenCounter?) |
| LLM-based summarization | Summarize old turns via a model call | ✅ | Message History | summarizeHistoryProcessor(model, { maxMessages?, summarizePrompt? }) |
| Privacy filtering | Strip sensitive fields before model call | ✅ | Message History | privacyFilterProcessor(rules) - regex + field-path redaction |
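History processors are message-to-message transforms applied in order before each model call. A sketch of the pipeline and a trim processor; the processor's behaviour here (keep the last n messages) is an assumption drawn from its name:

```typescript
// Simplified message shape and the processor pipeline.
interface Msg {
  role: "system" | "user" | "assistant";
  content: string;
}
type HistoryProcessor = (msgs: Msg[]) => Msg[];

// Keep only the last n messages (hypothetical reconstruction of
// trimHistoryProcessor from the table).
const trimHistoryProcessor = (n: number): HistoryProcessor =>
  (msgs) => msgs.slice(Math.max(0, msgs.length - n));

// Processors compose left to right, as in historyProcessors: [...].
function applyProcessors(msgs: Msg[], processors: HistoryProcessor[]): Msg[] {
  return processors.reduce((acc, p) => p(acc), msgs);
}
```

tokenTrimHistoryProcessor and summarizeHistoryProcessor fit the same signature; only the transform differs.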
## Dependencies
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Typed deps | RunContext[MyDeps] | ✅ | Dependencies | Agent<MyDeps, TOutput> - deps typed via generic parameter |
| Deps in tools | ctx: RunContext[MyDeps] in tool | ✅ | Dependencies | ctx.deps in execute(ctx, args) |
| Deps in system prompts | @agent.system_prompt with context | ✅ | Dependencies | agent.addSystemPrompt((ctx) => ...) |
| Deps in result validators | @agent.result_validator with context | ✅ | Dependencies | agent.addResultValidator((ctx, result) => ...) |
| RunContext accessors | .deps, .usage, .metadata | ✅ | Dependencies | Full RunContext<TDeps> type with all accessors |
| Override deps in tests | Pass fake deps via agent.override() | ✅ | Testing | agent.override({ deps: fakeDeps }).run(prompt) |
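The generic-parameter approach above replaces Python's RunContext[MyDeps] annotation: deps are typed once on the agent and flow to every tool, prompt, and validator. A minimal sketch (RunContext is simplified to just .deps):

```typescript
// Deps typed via a generic parameter, as in Agent<MyDeps, TOutput>.
interface RunContext<TDeps> {
  deps: TDeps;
}

interface GreeterDeps {
  greet: (name: string) => string;
}

// A dynamic system prompt reading from typed deps, mirroring
// agent.addSystemPrompt((ctx) => ...).
const systemPrompt = (ctx: RunContext<GreeterDeps>) =>
  `You can greet users, e.g. ${ctx.deps.greet("Ada")}`;
```

Overriding deps in tests is then just constructing a RunContext with a fake implementation of the same interface.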
## Usage & Limits

## Errors
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| UserError | Raised for bad agent config | ✅ | Error Handling | |
| UnexpectedModelBehavior | Malformed model response | ✅ | Error Handling | |
| MaxRetriesExceeded | Too many tool/output retries | ✅ | Error Handling | |
| MaxTurnsReached | Hit maxTurns cap | ✅ | Error Handling | |
| UsageLimitExceeded | Hit a UsageLimits cap | ✅ | Results | |
| ApprovalRequiredError | Tool requires human approval | ✅ | Human-in-the-Loop | |
| ModelRequestsDisabledError | Thrown when setAllowModelRequests(false) is active | ✅ | Testing | |
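A plausible shape for these error classes is a shared base, so callers can catch broadly or narrowly. The common AgentError base is an assumption for illustration; the class names come from the table:

```typescript
// Hypothetical common base for the error classes listed above.
class AgentError extends Error {}

class MaxTurnsReached extends AgentError {
  constructor(public readonly maxTurns: number) {
    super(`run exceeded ${maxTurns} turns`);
  }
}

class UsageLimitExceeded extends AgentError {}
```

With a hierarchy like this, `catch (e) { if (e instanceof AgentError) ... }` handles any agent failure, while `e instanceof MaxTurnsReached` targets one cause.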
## MCP (Model Context Protocol)
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| MCPServerStdio | Subprocess stdio transport | ✅ | MCP | MCPStdioClient using @modelcontextprotocol/sdk |
| MCPServerStreamableHTTP | HTTP Streamable transport | ✅ | MCP | MCPHttpClient using StreamableHTTPClientTransport |
| MCPServerSSE | Server-Sent Events transport (deprecated) | ❌ | - | Prefer MCPHttpClient (StreamableHTTP) |
| Dynamic tool discovery | Auto-convert MCP tools to Pydantic AI tools | ✅ | MCP | MCPToolset.tools() fetches and converts MCP tools automatically |
| Elicitation support | MCP server can request structured input | ✅ | MCP | elicitationCallback option on MCPToolset |
| Server instructions | Access MCP server instructions post-connect | ✅ | MCP | mcpToolset.getServerInstructions() |
| Tool caching | Cache discovered tools with invalidation | ✅ | MCP | toolCacheTtlMs option on MCPToolset (default 60 s) |
| Multi-server support | Mount multiple MCP servers simultaneously | ✅ | MCP | MCPManager - add servers, call .connect(), use as a Toolset |
| Config file loading | Load MCP config with env variable references | ✅ | MCP | loadMCPConfig(path) - supports ${ENV_VAR} interpolation |
## Testing
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Mock model | MockLanguageModelV1 (from ai/test) | ✅ | Testing | Equivalent to Pydantic AI’s TestModel |
| Multi-turn mock | mockValues(...) | ✅ | Testing | Cycle through responses across turns |
| Stream mock | convertArrayToReadableStream | ✅ | Testing | Build mock stream chunks |
| Agent.override() | Swap model/deps/toolsets in tests without modifying app code | ✅ | Testing | agent.override({ model: mockModel }).run(prompt) |
| capture_run_messages() | Context manager to inspect all model request/response objects | ✅ | Testing | captureRunMessages(() => agent.run(...)) returns messages[][] |
| ALLOW_MODEL_REQUESTS=False | Global flag to prevent accidental real API calls | ✅ | Testing | setAllowModelRequests(false) - throws ModelRequestsDisabledError |
| TestModel | Auto-generates valid structured data from schema, calls all tools | ✅ | Testing | new TestModel() / createTestModel({ outputSchema }) - schema-aware |
| FunctionModel | Custom function drives model responses | ✅ | Testing | new FunctionModel((params) => result) - full control per turn |
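The FunctionModel and mockValues rows share one idea: a plain function decides each turn's response. A sketch with heavily simplified interfaces (the real model double speaks the full model protocol, not strings):

```typescript
// A function drives the response; it sees the prompt and the turn index.
type ModelFn = (params: { prompt: string; turn: number }) => string;

class FunctionModel {
  private turn = 0;
  constructor(private readonly fn: ModelFn) {}
  respond(prompt: string): string {
    return this.fn({ prompt, turn: this.turn++ });
  }
}

// mockValues(...) cycles through canned responses across turns.
const mockValues = (...values: string[]): ModelFn =>
  ({ turn }) => values[turn % values.length];
```

Pairing this with agent.override({ model: ... }) keeps application code untouched in tests.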
## Multi-Agent
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Agent-as-tool | Tool that calls child.run(usage=ctx.usage) internally | ✅ | Multi-Agent | Pattern: tool({ execute: async (ctx, { prompt }) => { const r = await child.run(prompt, { deps: ctx.deps }); ... } }) |
| Usage aggregation | Pass usage=ctx.usage to sub-agent to merge costs | ✅ | Multi-Agent | Manually add sub-agent usage to ctx.usage inside the tool |
| Programmatic hand-off | App code dispatches agents sequentially | ✅ | Multi-Agent | Documented pattern |
| pydantic_graph - FSM | Typed state machine with BaseNode | ✅ | Graph | Graph, BaseNode, GraphRun |
| Graph state persistence | SimpleStatePersistence, FileStatePersistence | ✅ | Graph | MemoryStatePersistence, FileStatePersistence - pause/resume across restarts |
| Graph visualization | Mermaid diagram generation | ✅ | Graph | toMermaid(graph, nodes) returns Mermaid flowchart string |
| Graph.iter() / .next() | Manual stepping through graph nodes | ✅ | Graph | graph.runIter(state, startNode) returns GraphRun with .next() method |
| A2A protocol | agent.to_a2a() - expose agent as ASGI A2A server | ✅ | - (docs coming soon) | new A2AAdapter(agent, opts) - JSON-RPC handler with tasks/send, tasks/get, tasks/cancel, agent card at /.well-known/agent.json |
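The agent-as-tool row with usage aggregation can be sketched as follows: the parent's tool runs a child agent and folds the child's token usage into the parent context. The child here is a plain async function standing in for a real Agent:

```typescript
interface Usage {
  inputTokens: number;
  outputTokens: number;
}

const addUsage = (into: Usage, from: Usage) => {
  into.inputTokens += from.inputTokens;
  into.outputTokens += from.outputTokens;
};

// Stand-in for child.run(prompt): returns output plus its own usage.
type ChildAgent = (prompt: string) => Promise<{ output: string; usage: Usage }>;

// The parent tool's execute body: delegate, then merge costs so the
// parent run's usage reflects the whole tree.
async function delegateTool(
  ctx: { usage: Usage },
  prompt: string,
  child: ChildAgent,
): Promise<string> {
  const r = await child(prompt);
  addUsage(ctx.usage, r.usage);
  return r.output;
}
```

This mirrors the Python pattern of passing usage=ctx.usage into the sub-agent, except the merge is explicit in the tool body.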
## Observability
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Logfire integration | Auto-traces runs, turns, and tool calls | ❌ | - | Not applicable (Logfire is Python-only) |
| OpenTelemetry support | OTel Gen-AI semantic conventions | ✅ | OpenTelemetry | instrumentAgent(agent, opts) - uses AI SDK experimental_telemetry |
| Run-level spans | Structured spans per run with metadata | ✅ | OpenTelemetry | AI SDK auto-creates run spans when telemetry is enabled |
| Tool-level spans | Span per tool call with args and result | ✅ | OpenTelemetry | AI SDK auto-creates tool spans when telemetry is enabled |
| HTTPX instrumentation | Capture raw HTTP request/response | ❌ | - | Not applicable (no HTTPX in Deno/Node) |
| Custom TracerProvider | Bring your own OTel tracer | ✅ | OpenTelemetry | Pass tracer in TelemetrySettings via instrumentAgent or modelSettings |
| Content exclusion | Strip prompt/response from spans | ✅ | OpenTelemetry | excludeContent: true on InstrumentationOptions → recordInputs/Outputs: false |
## Evaluation framework (Pydantic Evals)
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Datasets & Cases | Dataset, Case - typed test scenarios | ✅ | Evals | Dataset.fromArray(), Dataset.fromJSON(), Dataset.fromFile() |
| Built-in evaluators | Exact match, type validation | ✅ | Evals | equalsExpected(), equals(), contains(), isInstance(), isValidSchema(), maxDuration(), hasMatchingSpan(), custom() |
| LLM-as-judge | LLM-based evaluators for subjective qualities | ✅ | Evals | llmJudge({ rubric, model }) + helpers judgeOutput(), judgeInputOutput(), etc. |
| Custom evaluators | Domain-specific scoring functions | ✅ | Evals | custom(name, fn) factory or implement Evaluator interface directly |
| Report-level evaluators | Confusion matrix, precision/recall, ROC AUC, KS | ✅ | Evals | confusionMatrix(), precisionRecall(), rocAuc(), kolmogorovSmirnov() |
| Span-based evaluation | Score runs via OTel trace spans | ✅ | Evals | SpanTree, hasMatchingSpan() - build SpanTree.fromSpanData() from captured spans |
| Experiments | Run and compare datasets across model/prompt combos | ✅ | Evals | dataset.evaluate(task, opts) or runExperiment({ dataset, task, evaluators }) |
| Logfire integration | Visualize eval results in Logfire | ❌ | - | Not applicable (Logfire is Python-only) |
| Async + concurrency | Configurable concurrency and retries for evals | ✅ | Evals | maxConcurrency + maxRetries on EvaluateOptions; Semaphore-based |
| Dataset generation | LLM-generated test cases from Zod schemas | ✅ | Evals | generateDataset({ model, inputSchema, nExamples }) |
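The report-level evaluators row names precision/recall among others; the underlying computation over boolean (expected, predicted) pairs is standard, and the function name follows the table:

```typescript
// Precision and recall over (expected, predicted) boolean pairs.
// precision = TP / (TP + FP), recall = TP / (TP + FN).
function precisionRecall(pairs: Array<[expected: boolean, predicted: boolean]>) {
  let tp = 0, fp = 0, fn = 0;
  for (const [expected, predicted] of pairs) {
    if (predicted && expected) tp++;        // true positive
    else if (predicted && !expected) fp++;  // false positive
    else if (!predicted && expected) fn++;  // false negative
  }
  return {
    precision: tp + fp === 0 ? 0 : tp / (tp + fp),
    recall: tp + fn === 0 ? 0 : tp / (tp + fn),
  };
}
```

The confusion-matrix, ROC AUC, and Kolmogorov-Smirnov evaluators operate over the same per-case results, just with different aggregations.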
## Durable execution
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Temporal integration | TemporalAgent - offloads model/tool calls to activities | ✅ | Temporal | TemporalAgent + MockTemporalAgent - requires Node.js for Temporal worker |
| DBOS integration | Postgres-backed state checkpointing | ❌ | - | Not implemented (skipped by design) |
| Prefect integration | Transactional task semantics with cache keys | ❌ | - | Not implemented (skipped by design) |
## AG-UI Protocol
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| AG-UI event streaming | AGUIAdapter.run_stream() - agent-to-UI events | ✅ | AG-UI | AGUIAdapter.handleRequest(input) returns SSE Response |
| Follow-up messaging | Continue conversation after tool call results | ✅ | AG-UI | input.messages history passed as messageHistory automatically |
| Structured event types | Typed event payloads for all agent actions | ✅ | AG-UI | AGUIEvent discriminated union with 16 event variants |
## Multi-Modal Support
| Feature | Pydantic AI | Status | Docs | Notes |
|---|---|---|---|---|
| Image input to tools | Pass images into tool parameters | ✅ | Multi-Modal | binaryContentSchema in tool parameters; BinaryContent in execute |
| Audio / video input | Audio and video as tool parameters | ✅ | Multi-Modal | BinaryContent with audio/video MIME types; isAudioContent() type guard |
| Document input | PDFs and documents as tool parameters | ✅ | Multi-Modal | BinaryContent with application/pdf etc.; isDocumentContent() guard |
| UploadedFile | File reference for provider file uploads | ✅ | Multi-Modal | UploadedFile type + uploadedFileSchema + uploadedFileToToolResult() |
| BinaryImage output | Agent returns a generated image | ✅ | Multi-Modal | outputSchema: BINARY_IMAGE_OUTPUT - agent returns BinaryContent when a tool produces an image/* result |
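The type guards named in this table (isAudioContent, isDocumentContent, and friends) plausibly dispatch on the BinaryContent MIME type. A sketch; the real guards may check more than the prefix:

```typescript
// Simplified BinaryContent and the MIME-based guards from the table.
interface BinaryContent {
  data: Uint8Array;
  mimeType: string;
}

const isImageContent = (c: BinaryContent) => c.mimeType.startsWith("image/");
const isAudioContent = (c: BinaryContent) => c.mimeType.startsWith("audio/");
const isDocumentContent = (c: BinaryContent) =>
  c.mimeType === "application/pdf" || c.mimeType.startsWith("text/");
```

These guards let a single BinaryContent type cover images, audio, video, and documents while tools branch on the concrete kind.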