Usage Limits - Vibes Agent SDK

UsageLimits lets you set hard caps on how much an agent may consume during a single run. When cumulative usage reaches or exceeds a limit, the SDK throws a UsageLimitError before the next model request, stopping the run cleanly.

The `UsageLimits` interface

interface UsageLimits {
  /** Maximum number of model requests (turns). */
  maxRequests?: number;
  /** Maximum input tokens consumed. */
  maxInputTokens?: number;
  /** Maximum output tokens generated. */
  maxOutputTokens?: number;
  /** Maximum total tokens (input + output). */
  maxTotalTokens?: number;
}

All fields are optional. Set only the limits you care about — omitted fields are uncapped.

When limits are checked

The SDK checks usage limits before each model request in the turn loop. This means:

- The check runs after tool results are appended but before sending the next prompt.
- If the current usage already meets or exceeds a limit, UsageLimitError is thrown immediately.
- A run that stays within limits for its entire lifetime never throws this error.
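
The check described above can be pictured roughly as follows. This is a minimal sketch, not the SDK's actual implementation: `Usage`, `SketchUsageLimitError`, and `checkUsageLimits` are hypothetical local names, and the order in which the four limits are tested is an assumption. The "meets or exceeds" comparison and the message format come from this page.

```typescript
// Local stand-ins for illustration only; the real SDK types may differ.
interface UsageLimits {
  maxRequests?: number;
  maxInputTokens?: number;
  maxOutputTokens?: number;
  maxTotalTokens?: number;
}

interface Usage {
  requests: number;
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

type LimitKind = "requests" | "inputTokens" | "outputTokens" | "totalTokens";

// Hypothetical error class mirroring the documented properties.
class SketchUsageLimitError extends Error {
  limitKind: LimitKind;
  current: number;
  limit: number;

  constructor(limitKind: LimitKind, current: number, limit: number) {
    super(`Usage limit exceeded: ${limitKind} reached ${current} (limit: ${limit})`);
    this.name = "SketchUsageLimitError";
    this.limitKind = limitKind;
    this.current = current;
    this.limit = limit;
  }
}

// Would run before each model request: throws as soon as any configured
// cap is met or exceeded. Omitted caps are skipped (uncapped).
function checkUsageLimits(usage: Usage, limits: UsageLimits): void {
  const checks: Array<[LimitKind, number, number | undefined]> = [
    ["requests", usage.requests, limits.maxRequests],
    ["inputTokens", usage.inputTokens, limits.maxInputTokens],
    ["outputTokens", usage.outputTokens, limits.maxOutputTokens],
    ["totalTokens", usage.totalTokens, limits.maxTotalTokens],
  ];
  for (const [kind, current, limit] of checks) {
    if (limit !== undefined && current >= limit) {
      throw new SketchUsageLimitError(kind, current, limit);
    }
  }
}
```

Because the comparison is "greater than or equal", a run with maxRequests: 5 completes five model calls and is stopped when it tries to start a sixth.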

Agent-level limits

Set usageLimits on the Agent constructor to apply the same caps to every run of that agent:

import { Agent } from "jsr:@vibesjs/sdk";
import { anthropic } from "@ai-sdk/anthropic";

const agent = new Agent({
  model: anthropic("claude-sonnet-4-6"),
  systemPrompt: "You are a helpful assistant.",
  usageLimits: {
    maxRequests: 5,        // at most 5 model calls per run
    maxTotalTokens: 10_000, // at most 10 000 tokens in + out combined
  },
});

Per-run limits

Pass usageLimits to agent.run() (or agent.stream()) to override or supplement the agent-level limits for a single run. Per-run limits take precedence.

const result = await agent.run("Summarise this document", {
  usageLimits: {
    maxInputTokens: 4_000,
    maxOutputTokens: 1_000,
  },
});

Per-run limits are useful when the same agent is used for both short and long tasks. Set conservative agent-level defaults and relax them selectively for expensive operations.
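
One way to read "override or supplement" is a field-by-field merge in which a per-run value wins whenever it is set. The sketch below illustrates that reading only; `resolveLimits` is a hypothetical helper, and the SDK's real merge behaviour is not specified on this page.

```typescript
// Local copy of the interface so the sketch is self-contained.
interface UsageLimits {
  maxRequests?: number;
  maxInputTokens?: number;
  maxOutputTokens?: number;
  maxTotalTokens?: number;
}

// Assumed semantics: start from the agent-level caps, then let every
// per-run field that is actually set replace or add to them.
function resolveLimits(
  agentLimits: UsageLimits = {},
  runLimits: UsageLimits = {},
): UsageLimits {
  const merged: UsageLimits = { ...agentLimits };
  for (const key of Object.keys(runLimits) as Array<keyof UsageLimits>) {
    const value = runLimits[key];
    // Skip explicit `undefined` so it cannot silently erase an agent-level cap.
    if (value !== undefined) {
      merged[key] = value;
    }
  }
  return merged;
}

// Agent-level defaults from the earlier example, plus the per-run caps above.
const effective = resolveLimits(
  { maxRequests: 5, maxTotalTokens: 10_000 },
  { maxInputTokens: 4_000, maxOutputTokens: 1_000 },
);
```

Under this assumption the run would be subject to all four caps at once: the two agent-level ones and the two supplied per run.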

Handling `UsageLimitError`

Import and catch UsageLimitError to respond gracefully when a limit is hit:

import { Agent, UsageLimitError } from "jsr:@vibesjs/sdk";

try {
  const result = await agent.run(userMessage);
  console.log(result.output);
} catch (err) {
  if (err instanceof UsageLimitError) {
    console.error(
      `Run stopped: ${err.limitKind} reached ${err.current} (limit: ${err.limit})`
    );
    // e.g. return a partial result, notify the user, or log for billing
  } else {
    throw err;
  }
}

`UsageLimitError` properties

| Property | Type | Description |
| --- | --- | --- |
| `limitKind` | `"requests" \| "inputTokens" \| "outputTokens" \| "totalTokens"` | Which limit was exceeded |
| `current` | `number` | The usage value at the point of failure |
| `limit` | `number` | The configured cap that was hit |
| `message` | `string` | Human-readable description, e.g. `"Usage limit exceeded: totalTokens reached 10000 (limit: 10000)"` |
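
These properties are enough to build a friendlier, user-facing message than the default one. The following is a small sketch: `UsageLimitInfo`, `limitLabels`, and `describeLimit` are hypothetical names, and the structural type stands in for the real error class so the snippet runs without the SDK.

```typescript
type UsageLimitKind = "requests" | "inputTokens" | "outputTokens" | "totalTokens";

// Structural stand-in for the documented UsageLimitError properties.
interface UsageLimitInfo {
  limitKind: UsageLimitKind;
  current: number;
  limit: number;
}

// Plain-language label for each kind of limit.
const limitLabels: Record<UsageLimitKind, string> = {
  requests: "model requests",
  inputTokens: "input tokens",
  outputTokens: "output tokens",
  totalTokens: "total tokens",
};

// Turns the error's fields into a sentence suitable for end users.
function describeLimit(info: UsageLimitInfo): string {
  return `Stopped after ${info.current} ${limitLabels[info.limitKind]} (cap: ${info.limit}).`;
}
```

Inside a catch block, an error that passed the instanceof UsageLimitError check could be handed to describeLimit directly, since it carries the same three fields.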

Combining with `maxTurns`

usageLimits.maxRequests and maxTurns both cap the number of model calls, but they are distinct:

| Setting | Throws | Checked |
| --- | --- | --- |
| `maxTurns` | `MaxTurnsError` | After the turn loop; stops the agent when the turn count is reached |
| `usageLimits.maxRequests` | `UsageLimitError` | Before each model request; stops the agent when the request count is met |

Use maxTurns as a structural safety net and usageLimits.maxRequests when you want to track requests against a quota.

Accessing usage inside a run

The current cumulative usage is available on RunContext inside tools and result validators via ctx.usage:

import { tool } from "jsr:@vibesjs/sdk";
import { z } from "zod";

const checkBudget = tool({
  name: "check_budget",
  description: "Report current token usage",
  parameters: z.object({}),
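  execute: async (ctx) => {
    // ctx.usage is cumulative for the current run; inputTokens and
    // outputTokens are available alongside the two fields used here.
    const { totalTokens, requests } = ctx.usage;
    return `Used ${totalTokens} tokens across ${requests} requests so far.`;
  },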
});
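
A tool can go one step further and report how much budget is left, given the usage counters and the limits in force. This is a sketch under assumptions: `UsageSnapshot` and `remainingBudget` are hypothetical names, and the limits are passed in explicitly because this page does not say whether RunContext exposes them.

```typescript
// Local copies of the shapes described on this page.
interface UsageLimits {
  maxRequests?: number;
  maxInputTokens?: number;
  maxOutputTokens?: number;
  maxTotalTokens?: number;
}

interface UsageSnapshot {
  requests: number;
  inputTokens: number;
  outputTokens: number;
  totalTokens: number;
}

// Remaining headroom per counter. Uncapped counters report Infinity,
// and a counter already past its cap reports 0 rather than a negative number.
function remainingBudget(usage: UsageSnapshot, limits: UsageLimits): UsageSnapshot {
  const left = (cap: number | undefined, used: number): number =>
    cap === undefined ? Infinity : Math.max(0, cap - used);
  return {
    requests: left(limits.maxRequests, usage.requests),
    inputTokens: left(limits.maxInputTokens, usage.inputTokens),
    outputTokens: left(limits.maxOutputTokens, usage.outputTokens),
    totalTokens: left(limits.maxTotalTokens, usage.totalTokens),
  };
}
```

Returning these numbers from a tool gives the model a chance to wrap up before a cap stops the run, though whether it acts on them depends on the prompt.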

Agents

Full Agent constructor options including maxTurns

Troubleshooting

Understanding and catching UsageLimitError
