# Actions Source: https://trigger.dev/docs/ai-chat/actions Custom commands sent from the frontend that mutate chat state without consuming a turn — undo, rollback, edit, regenerate. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. ## Overview Custom actions let the frontend send structured commands (undo, rollback, edit, regenerate) that modify the conversation state. **Actions are not turns**: they fire `hydrateMessages` (if set) and `onAction` only. No turn lifecycle hooks (`onTurnStart` / `prepareMessages` / `onBeforeTurnComplete` / `onTurnComplete`), no `run()`, no turn-counter increment. The trace span is named `chat action`. Actions wake the agent from suspension the same way a new message does, run their handler against the latest accumulator state, and emit a `trigger:turn-complete` chunk so the frontend's `useChat` knows the action has been applied. 
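The distinction between an action and a turn can be pictured with a small toy model. This is only a sketch of the behaviour described above, not SDK code: `Incoming`, `AgentTrace`, and `dispatch` are hypothetical names, and the message branch lists a simplified subset of the turn hooks.

```typescript
// Toy model (hypothetical names, not the SDK): which hooks fire for an
// action versus a normal message, and what happens to the turn counter.
type Incoming =
  | { kind: "message"; text: string }
  | { kind: "action"; action: { type: string } };

interface AgentTrace {
  turn: number; // turn counter
  fired: string[]; // hook names, in firing order
}

function dispatch(event: Incoming, trace: AgentTrace, hasHydrate: boolean): void {
  // hydrateMessages runs for both kinds of event, but only when configured.
  if (hasHydrate) trace.fired.push("hydrateMessages");

  if (event.kind === "action") {
    // Actions stop here: no turn hooks, no run(), counter unchanged.
    trace.fired.push("onAction");
    return;
  }

  // Messages run a full turn (simplified subset of the real hook list).
  trace.turn += 1;
  trace.fired.push("onTurnStart", "run", "onBeforeTurnComplete", "onTurnComplete");
}

const trace: AgentTrace = { turn: 0, fired: [] };
dispatch({ kind: "message", text: "Hi" }, trace, true);
dispatch({ kind: "action", action: { type: "undo" } }, trace, true);

console.log(trace.turn); // 1
console.log(trace.fired.slice(-2).join(" -> ")); // hydrateMessages -> onAction
```

The counter stays at 1 after the action, which is the property the frontend relies on when it treats an undo or rollback as a state change rather than a new exchange.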
## Defining an action handler Define an `actionSchema` for validation and an `onAction` handler that uses [`chat.history`](/docs/ai-chat/backend#chat-history) to modify state: ```ts theme={"theme":"css-variables"} import { z } from "zod"; export const myChat = chat.agent({ id: "my-chat", actionSchema: z.discriminatedUnion("type", [ z.object({ type: z.literal("undo") }), z.object({ type: z.literal("rollback"), targetMessageId: z.string() }), z.object({ type: z.literal("edit"), messageId: z.string(), text: z.string() }), ]), onAction: async ({ action }) => { switch (action.type) { case "undo": chat.history.slice(0, -2); // Remove last user + assistant exchange break; case "rollback": chat.history.rollbackTo(action.targetMessageId); break; case "edit": chat.history.replace(action.messageId, { id: action.messageId, role: "user", parts: [{ type: "text", text: action.text }], }); break; } // returning void → side-effect-only, no model call }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` **Lifecycle flow:** Wake → parse action against `actionSchema` → `hydrateMessages` (if set) → **`onAction`** → apply `chat.history` mutations → emit `trigger:turn-complete` → wait for next message. ## Returning a model response from an action `onAction` can return a `StreamTextResult`, `string`, or `UIMessage` to produce a response. The returned stream is auto-piped to the frontend just like a normal turn, but the rest of the turn machinery (`onTurnStart`, `onTurnComplete`, etc.) still does not fire. 
```ts theme={"theme":"css-variables"} onAction: async ({ action, messages }) => { if (action.type === "regenerate") { chat.history.slice(0, -1); // drop the last assistant return streamText({ model: anthropic("claude-sonnet-4-5"), messages, stopWhen: stepCountIs(15), }); } // other actions return void → side-effect only } ``` This is useful for actions that both mutate state and want a fresh model response (regenerate-from-here, retry-with-different-style). Persistence is your responsibility inside `onAction` itself; you have access to the streamed response object. ## Gating actions on HITL state If you have a [human-in-the-loop](/docs/ai-chat/patterns/human-in-the-loop) tool waiting on `addToolOutput`, you usually want to refuse competing actions like `regenerate` until the answer arrives. [`chat.history.getPendingToolCalls()`](/docs/ai-chat/backend#chat-history) gives you exactly that signal: ```ts theme={"theme":"css-variables"} onAction: async ({ action, messages, signal }) => { if (action.type === "regenerate") { if (chat.history.getPendingToolCalls().length > 0) return; // gated chat.history.slice(0, -1); return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); } }, ``` ## Sending actions from the frontend ```ts theme={"theme":"css-variables"} // Browser — TriggerChatTransport const stream = await transport.sendAction(chatId, { type: "undo" }); // Server — AgentChat const stream = await agentChat.sendAction({ type: "rollback", targetMessageId: "msg-3" }); ``` The action payload is validated against `actionSchema` on the backend; invalid actions throw and surface as a stream error. The `action` parameter in `onAction` is fully typed from the schema. For silent state changes that should never appear as a turn (e.g. injecting background context), use [`chat.inject()`](/docs/ai-chat/background-injection) instead. Actions are explicit user-driven mutations; injections are agent-side context updates. 
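The history mutations used by the handlers on this page (`slice`, `rollbackTo`, `replace`) can be modelled on a plain array to see what each one keeps. The sketch below is an illustration only and does not call the SDK: the `Msg` type and the three helper functions are hypothetical, and the behaviour for an unknown message id is an assumption.

```typescript
// Toy model of the history operations on a plain array; NOT the SDK.
type Role = "user" | "assistant";

interface Msg {
  id: string;
  role: Role;
  text: string;
}

// Mirrors slice(0, -2): keep everything except the last two messages.
function undoLastExchange(history: readonly Msg[]): Msg[] {
  return history.slice(0, -2);
}

// Mirrors rollbackTo: keep messages up to and including `id`.
// Assumption: an unknown id leaves the history unchanged.
function rollbackTo(history: readonly Msg[], id: string): Msg[] {
  const index = history.findIndex((m) => m.id === id);
  return index === -1 ? [...history] : history.slice(0, index + 1);
}

// Mirrors replace: swap one message, keeping its position in the list.
function replaceMessage(history: readonly Msg[], id: string, next: Msg): Msg[] {
  return history.map((m) => (m.id === id ? next : m));
}

const history: Msg[] = [
  { id: "msg-1", role: "user", text: "Hi" },
  { id: "msg-2", role: "assistant", text: "Hello!" },
  { id: "msg-3", role: "user", text: "Summarize this" },
  { id: "msg-4", role: "assistant", text: "Here is a summary" },
];

console.log(undoLastExchange(history).map((m) => m.id).join(", ")); // msg-1, msg-2
console.log(rollbackTo(history, "msg-3").map((m) => m.id).join(", ")); // msg-1, msg-2, msg-3
```

Note that rolling back to a user message, as in the second call, leaves the history ending on an unanswered user message, which is why a rollback or edit is often paired with an action that returns a fresh model response.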
## See also

* [`chat.history`](/docs/ai-chat/backend#chat-history) — the imperative API actions use to mutate state
* [Sending actions from the frontend](/docs/ai-chat/frontend#sending-actions) — `transport.sendAction` ergonomics
* [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) — fires before `onAction` when set
* [Branching conversations](/docs/ai-chat/patterns/branching-conversations) — pairs action handlers with backend-controlled history
* [Human-in-the-loop](/docs/ai-chat/patterns/human-in-the-loop) — gating fresh actions while a tool is waiting

# Anatomy of an agent
Source: https://trigger.dev/docs/ai-chat/anatomy

The moving parts of a chat agent — the agent task, the session, the frontend transport — and which page covers each.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

**A chat agent is three parts: a long-lived agent task that runs the turn loop, a durable Session carrying messages in and the response stream out, and a frontend transport that plugs the session into `useChat`.** The pages in this section each own one part of that picture. This page is the map — if you'd rather read mechanics end to end, skip to [How it works](/docs/ai-chat/how-it-works).

```mermaid theme={"theme":"css-variables"}
flowchart LR
  FE["Frontend<br/>useChat + transport"] -- "user messages" --> IN([Session .in])
  IN --> AGENT["Agent task<br/>turn loop + hooks"]
  AGENT --> OUT([Session .out])
  OUT -- "streamed response" --> FE
```

Everything below maps onto one annotated agent:

```ts trigger/my-agent.ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

export const myAgent = chat.agent({
  id: "my-agent",

  // Tools declared on the config survive history re-conversion
  // across turns — see Tools.
  tools: { searchDocs },

  // Hooks fire around each turn: validation, persistence,
  // post-turn work — see Lifecycle hooks.
  onTurnComplete: async ({ responseMessage }) => {
    await db.messages.save(responseMessage);
  },

  // The turn loop. Messages arrive accumulated; you stream back.
  // Options, levels, and alternatives — see Backend.
  run: async ({ messages, tools, signal }) =>
    streamText({
      ...chat.toStreamTextOptions({ tools }),
      model: anthropic("claude-sonnet-4-5"),
      messages,
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    }),
});
```

The frontend side is one hook — `useTriggerChatTransport` connects `useChat` to the agent's session, no API routes ([Frontend](/docs/ai-chat/frontend)). Underneath, the conversation lives on a [Session](/docs/ai-chat/sessions): a pair of durable streams keyed on your `chatId` that survives refreshes, deploys, and run boundaries.
## Where each part is covered | Part | Page | | ----------------------------------------------------- | ------------------------------------------- | | `chat.agent()` options, the turn loop, piping | [Backend](/docs/ai-chat/backend) | | Hooks around each turn (`onTurnComplete`, hydration) | [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) | | Declaring tools, typed payloads, `toModelOutput` | [Tools](/docs/ai-chat/tools) | | `useChat` wiring, tokens, starting sessions | [Frontend](/docs/ai-chat/frontend) | | Driving a chat from your server instead of a browser | [Server-side chat](/docs/ai-chat/server-chat) | | The durable substrate under every agent | [Sessions](/docs/ai-chat/sessions) | | Per-run typed state inside the loop | [chat.local](/docs/ai-chat/chat-local) | | Type-safe payloads, client data, and messages | [Types](/docs/ai-chat/types) | | Building without the managed lifecycle | [Custom agents](/docs/ai-chat/custom-agents) | | End-to-end mechanics: what survives a refresh and why | [How it works](/docs/ai-chat/how-it-works) | Beyond this section: [Features](/docs/ai-chat/fast-starts) covers opt-in capabilities (Head Start, compaction, steering, actions), and [Patterns](/docs/ai-chat/patterns/sub-agents) covers production recipes (sub-agents, HITL approvals, persistence, recovery). # Backend Source: https://trigger.dev/docs/ai-chat/backend Three approaches to building your chat backend — chat.agent(), session iterator, or raw task primitives. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. There are three abstraction levels for a chat backend. 
All three speak the same wire protocol, so the [frontend transport](/docs/ai-chat/frontend) works unchanged whichever you pick. | Capability | `chat.agent()` | `chat.createSession()` | Raw primitives | | ------------------------------------- | -------------- | ------------------------------------------------------------------------------ | -------------- | | Turn loop, stop signals, accumulation | Managed | Managed | You write it | | Lifecycle hooks | Yes | No — inline code per turn | No | | Continuation recovery on new runs | Automatic | [Manual seeding](/docs/ai-chat/custom-agents#continuation-runs-and-history-seeding) | Manual seeding | | Compaction / steering | Built-in | Built-in | Manual | | Head Start, actions, tool approvals | Yes | No | No | | Custom stream conversion | No | Limited | Full control | | Agent dashboard visibility | Yes | Yes (via `customAgent`) | Yes | The raw-primitives column assumes [`chat.customAgent()`](/docs/ai-chat/custom-agents) as the wrapper, which is what makes the task visible to the agent dashboard. Start with `chat.agent()`. Drop to `chat.createSession()` when you want to own the per-turn code (model routing, persistence, custom telemetry) without rebuilding the turn loop. Drop to raw primitives only when you need full control over stream conversion or a custom protocol. ## chat.agent() The highest-level approach. Handles message accumulation, stop signals, turn lifecycle, and auto-piping automatically. 
### Simple: return a StreamTextResult

Return the `streamText` result from `run` and it's automatically piped to the frontend:

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

export const simpleChat = chat.agent({
  id: "simple-chat",
  run: async ({ messages, signal }) => {
    return streamText({
      ...chat.toStreamTextOptions(), // prepareStep, system, telemetry (see note below)
      model: anthropic("claude-sonnet-4-5"),
      system: "You are a helpful assistant.",
      messages,
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    });
  },
});
```

**Always spread `chat.toStreamTextOptions()` first** (as above) so your explicit overrides win. It wires up the `prepareStep` callback behind [compaction](/docs/ai-chat/compaction), [steering](/docs/ai-chat/pending-messages), and [background injection](/docs/ai-chat/background-injection), all of which silently no-op without it, and injects the system prompt from `chat.prompt()`, the resolved model (when you pass a `registry`), and telemetry metadata. Examples below keep the spread implicit for brevity, so include it in real code.

### Using chat.pipe() for complex flows

For complex agent flows where `streamText` is called deep inside your code, use `chat.pipe()`. It works from **anywhere inside a task** — even nested function calls.

```ts trigger/agent-chat.ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import type { ModelMessage } from "ai";

export const agentChat = chat.agent({
  id: "agent-chat",
  run: async ({ messages }) => {
    // Don't return anything — chat.pipe is called inside
    await runAgentLoop(messages);
  },
});

async function runAgentLoop(messages: ModelMessage[]) {
  // ... agent logic, tool calls, etc.
  const result = streamText({
    model: anthropic("claude-sonnet-4-5"),
    messages,
    stopWhen: stepCountIs(15),
  });

  // Pipe from anywhere — no need to return it
  await chat.pipe(result);
}
```

### Custom data parts

Add custom `data-*` parts to the assistant's response message via `chat.response.write()` (from `run()`) or the `writer` parameter in lifecycle hooks. Non-transient `data-*` chunks are automatically added to `responseMessage.parts` and surface in `onTurnComplete` for persistence:

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  onBeforeTurnComplete: async ({ writer, turn }) => {
    // This data part will be in responseMessage.parts in onTurnComplete
    writer.write({
      type: "data-metadata",
      data: { turn, model: "claude-sonnet-4-5", timestamp: Date.now() },
    });
  },
  onTurnComplete: async ({ responseMessage }) => {
    // responseMessage.parts includes the data-metadata part
    await db.messages.save(responseMessage);
  },
  run: async ({ messages, signal }) => {
    // Also works from run() via chat.response
    // (`results` stands in for the output of your own search step)
    chat.response.write({
      type: "data-context",
      data: { searchResults: results },
    });
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

Add `transient: true` to data chunks that should stream to the frontend but NOT persist in the response message. Use this for progress indicators, loading states, and other temporary UI:

```ts theme={"theme":"css-variables"}
// Transient — frontend sees it, but NOT in onTurnComplete's responseMessage
writer.write({
  type: "data-progress",
  id: "search",
  data: { percent: 50 },
  transient: true,
});
```

This matches the AI SDK's semantics: `data-*` chunks persist to `message.parts` by default. Only `transient: true` chunks are ephemeral. Non-data chunks (`text-delta`, `tool-*`, etc.) are handled by `streamText` and captured via `onFinish` — they don't need `chat.response`.

`chat.response` and the `writer` accumulation behavior work with `chat.agent` and `chat.createSession`.
If you're using [`chat.customAgent`](/docs/ai-chat/custom-agents), you own the accumulator — see the raw-task example for the manual pattern. ### Raw streaming with `chat.stream` For low-level stream access (piping from subtasks, reading streams by run ID), use `chat.stream`. Chunks written via `chat.stream` go directly to the realtime output — they are **NOT** accumulated into the response message regardless of the `transient` flag. ```ts theme={"theme":"css-variables"} // Raw stream — always ephemeral, never in responseMessage const { waitUntilComplete } = chat.stream.writer({ execute: ({ write }) => { write({ type: "data-status", data: { message: "Processing..." } }); }, }); await waitUntilComplete(); ``` Use `data-*` chunk types (e.g. `data-status`, `data-progress`) for custom data. The AI SDK processes these into `DataUIPart` objects in `message.parts` on the frontend. Writing the same `type` + `id` again updates the existing part instead of creating a new one — useful for live progress. `chat.stream` exposes the full stream API: | Method | Description | | ------------------------------------- | ------------------------------------------ | | `chat.stream.writer(options)` | Write individual chunks via a callback | | `chat.stream.pipe(stream, options?)` | Pipe a `ReadableStream` or `AsyncIterable` | | `chat.stream.append(value, options?)` | Append raw data | | `chat.stream.read(runId, options?)` | Read the stream by run ID | For piping streams from subtasks to the parent chat (via `target: "root"`), see the [Sub-agents pattern](/docs/ai-chat/patterns/sub-agents). ### Backed by a Session Every `chat.agent` conversation is backed by a durable [Session](/docs/ai-chat/sessions): `externalId` is your `chatId`, `type` is `"chat.agent"`, and `taskIdentifier` is the agent's task ID. The session is the run manager. It owns the chat's runs, persists across run lifecycles, and orchestrates handoffs (idle continuation, `chat.requestUpgrade`). 
You rarely touch it directly, since `chat.stream`, `chat.messages`, and `chat.stopSignal` wrap everything, but `payload.sessionId` is there when you need to reach in, e.g. `sessions.open(payload.sessionId)` to write from a sub-agent or from outside the turn loop. ### Tools Declare your tools on the agent config, then read them back (typed) from the `run()` payload. Declaring them on the config, not just on `streamText`, is what lets the SDK re-apply each tool's `toModelOutput` when it re-converts history on later turns. ```ts theme={"theme":"css-variables"} const tools = { searchDocs }; export const myChat = chat.agent({ id: "my-chat", tools, run: async ({ messages, tools, signal }) => streamText({ ...chat.toStreamTextOptions({ tools }), model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }), }); ``` See [Tools](/docs/ai-chat/tools) for `toModelOutput` across turns, per-turn dynamic tools, the typed run payload, and how config tools relate to skills. ### Lifecycle hooks `chat.agent({ ... })` accepts hooks that fire in a fixed order around each turn, plus dedicated suspend/resume hooks. The full reference lives on its own page: * [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) — `onPreload`, `onChatStart`, `onValidateMessages`, `hydrateMessages`, `onTurnStart`, `onBeforeTurnComplete`, `onTurnComplete`, `onChatSuspend` / `onChatResume`, `exitAfterPreloadIdle`, plus how `ctx` plumbs through every callback. **Per-turn order:** `onValidateMessages` → `hydrateMessages` → `onChatStart` (chat's first message only) → `onTurnStart` → `run()` → `onBeforeTurnComplete` → `onTurnComplete`. ### Using prompts Use [AI Prompts](/docs/ai/prompts) to manage your system prompt as versioned, overridable config. Store the resolved prompt in a lifecycle hook with `chat.prompt.set()`, then spread `chat.toStreamTextOptions()` into `streamText` — it includes the system prompt, model, config, and telemetry automatically. 
```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { prompts } from "@trigger.dev/sdk";
import { streamText, stepCountIs, createProviderRegistry } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const registry = createProviderRegistry({ anthropic });

const systemPrompt = prompts.define({
  id: "my-chat-system",
  model: "anthropic:claude-sonnet-4-5",
  config: { temperature: 0.7 },
  variables: z.object({ name: z.string() }),
  content: `You are a helpful assistant for {{name}}.`,
});

export const myChat = chat.agent({
  id: "my-chat",
  clientDataSchema: z.object({ userId: z.string() }),
  onChatStart: async ({ clientData }) => {
    const user = await db.user.findUnique({ where: { id: clientData.userId } });
    const resolved = await systemPrompt.resolve({ name: user.name });
    chat.prompt.set(resolved);
  },
  run: async ({ messages, signal }) => {
    return streamText({
      ...chat.toStreamTextOptions({ registry }), // system, model, config, telemetry
      messages,
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    });
  },
});
```

`chat.toStreamTextOptions()` returns an object with `system`, `model` (resolved via the registry), `temperature`, and `experimental_telemetry` — all from the stored prompt. Properties you set after the spread (like a client-selected model) take precedence.

**Which form to call:**

| Form | Use when |
| --- | --- |
| `chat.toStreamTextOptions()` | Default. Wires up `prepareStep` (compaction, steering, background injection), the stored prompt's `system` / `model` / `config`, and telemetry metadata. |
| `chat.toStreamTextOptions({ registry })` | You're using [Prompts](/docs/ai/prompts) with a provider-prefixed model string (e.g. `"anthropic:claude-sonnet-4-5"`). The registry resolves the prefix to a real model instance via `createProviderRegistry({ anthropic, openai, ... })`. |
| `chat.toStreamTextOptions({ tools })` | You want HITL tool approvals — pass the same `tools` object you give to `streamText`. The SDK then knows which tool calls need to pause on `needsApproval: true`. |
| `chat.toStreamTextOptions({ registry, tools })` | Both of the above. |

See [Prompts](/docs/ai/prompts) for the full guide — defining templates, variable schemas, dashboard overrides, and the management SDK.

### Stop generation

#### How stop works

Calling `stop()` from `useChat` sends a stop signal to the running task via input streams. The task's `streamText` call aborts (if you passed `signal` or `stopSignal`), but the **run stays alive** and waits for the next message. The partial response is captured and accumulated normally.

#### Abort signals

The `run` function receives three abort signals:

| Signal | Fires when | Use for |
| --- | --- | --- |
| `signal` | Stop **or** cancel | Pass to `streamText` — handles both cases. **Use this in most cases.** |
| `stopSignal` | Stop only (per-turn, reset each turn) | Custom logic that should only run on user stop, not cancellation |
| `cancelSignal` | Run cancel, expire, or maxDuration exceeded | Cleanup that should only happen on full cancellation |

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  run: async ({ messages, signal, stopSignal, cancelSignal }) => {
    return streamText({
      model: anthropic("claude-sonnet-4-5"),
      messages,
      abortSignal: signal, // Handles both stop and cancel
      stopWhen: stepCountIs(15),
    });
  },
});
```

Use `signal` (the combined signal) in most cases. The separate `stopSignal` and `cancelSignal` are only needed if you want different behavior for stop vs cancel.
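The relationship between the three signals can be modelled with plain `AbortController`s. The sketch below only illustrates the documented semantics (the combined signal fires on stop or cancel, the other two fire independently); it is not how the SDK builds them, and `anyOf` and `makeTurnSignals` are hypothetical helpers.

```typescript
// Illustration only: models `signal`, `stopSignal`, and `cancelSignal`
// with standard AbortControllers. Not the SDK's implementation.

// Returns a signal that aborts as soon as any input signal aborts.
function anyOf(...signals: AbortSignal[]): AbortSignal {
  const combined = new AbortController();
  for (const s of signals) {
    if (s.aborted) {
      combined.abort();
      break;
    }
    s.addEventListener("abort", () => combined.abort(), { once: true });
  }
  return combined.signal;
}

// Hypothetical helper: one stop controller and one cancel controller,
// plus the combined signal a turn would pass to streamText.
function makeTurnSignals() {
  const stop = new AbortController();
  const cancel = new AbortController();
  return {
    stop,
    cancel,
    stopSignal: stop.signal,
    cancelSignal: cancel.signal,
    signal: anyOf(stop.signal, cancel.signal),
  };
}

const turn = makeTurnSignals();
turn.stop.abort(); // the user pressed stop

console.log(turn.signal.aborted); // true
console.log(turn.stopSignal.aborted); // true
console.log(turn.cancelSignal.aborted); // false
```

Stop-only logic would listen on `stopSignal`, while cleanup that must run only when the whole run is going away would listen on `cancelSignal`.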
#### Detecting stop in callbacks

The `onTurnComplete` event includes a `stopped` boolean that indicates whether the user stopped generation during that turn:

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  onTurnComplete: async ({ chatId, uiMessages, stopped }) => {
    await db.chat.update({
      where: { id: chatId },
      data: { messages: uiMessages, lastStoppedAt: stopped ? new Date() : undefined },
    });
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

You can also check stop status from **anywhere** during a turn using `chat.isStopped()`. This is useful inside `streamText`'s `onFinish` callback where the AI SDK's `isAborted` flag can be unreliable (e.g. when using `createUIMessageStream` + `writer.merge()`):

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

export const myChat = chat.agent({
  id: "my-chat",
  run: async ({ messages, signal }) => {
    return streamText({
      model: anthropic("claude-sonnet-4-5"),
      messages,
      abortSignal: signal,
      onFinish: ({ isAborted }) => {
        // isAborted may be false even after stop when using createUIMessageStream
        const wasStopped = isAborted || chat.isStopped();
        if (wasStopped) {
          // handle stop — e.g. log analytics
        }
      },
      stopWhen: stepCountIs(15),
    });
  },
});
```

#### Cleaning up aborted messages

When stop happens mid-stream, the captured response message can contain parts in an incomplete state — tool calls stuck in `partial-call`, reasoning blocks still marked as `streaming`, etc. These can cause UI issues like permanent spinners.

`chat.agent` automatically cleans up the `responseMessage` when stop is detected before passing it to `onTurnComplete`.
If you use `chat.pipe()` manually and capture response messages yourself, use `chat.cleanupAbortedParts()`: ```ts theme={"theme":"css-variables"} const cleaned = chat.cleanupAbortedParts(rawResponseMessage); ``` This removes tool invocation parts stuck in `partial-call` state and marks any `streaming` text or reasoning parts as `done`. Stop signal delivery is best-effort. There is a small race window where the model may finish before the stop signal arrives, in which case the turn completes normally with `stopped: false`. This is expected and does not require special handling. ### Tool approvals Tools with `needsApproval: true` pause execution until the user approves or denies via the frontend. Define the tool as normal and pass it to `streamText` — `chat.agent` handles the rest: ```ts theme={"theme":"css-variables"} const sendEmail = tool({ description: "Send an email. Requires human approval.", inputSchema: z.object({ to: z.string(), subject: z.string(), body: z.string() }), needsApproval: true, execute: async ({ to, subject, body }) => { await emailService.send({ to, subject, body }); return { sent: true }; }, }); export const myChat = chat.agent({ id: "my-chat", run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, tools: { sendEmail }, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` When the model calls an approval-required tool, the turn completes with the tool in `approval-requested` state. After the user approves on the frontend, the updated message is sent back and `chat.agent` replaces it in the conversation accumulator by matching the message ID. `streamText` then executes the approved tool and continues. See [Tool approvals](/docs/ai-chat/frontend#tool-approvals) in the frontend docs for the UI setup. ### Persistence To build a chat app that survives page refreshes you persist two things, both server-side from inside the agent: 1. 
**Conversation state.** Full `UIMessage[]` keyed by `chatId`. Written from `onTurnStart` (so the user message is durable before streaming begins) and `onTurnComplete` (so the assistant reply lands). 2. **Session state.** The transport's reconnect metadata: `publicAccessToken` and `lastEventId`. Written alongside the messages from the same hooks. Sessions let the transport reconnect to an existing run after a page refresh. Without them, every page load would start a new run, losing the conversation context that was accumulated in the previous run. For the full per-hook breakdown, race-condition warnings (atomic `lastEventId` writes, why not to use `chat.defer` in `onTurnStart`), token renewal via the `accessToken` callback, and an end-to-end three-file example, see [Database persistence](/docs/ai-chat/patterns/database-persistence). ### Pending messages (steering) Users can send messages while the agent is executing tool calls. With `pendingMessages`, these messages are injected between tool-call steps, steering the agent mid-execution: ```ts theme={"theme":"css-variables"} export const myChat = chat.agent({ id: "my-chat", pendingMessages: { shouldInject: ({ steps }) => steps.length > 0, }, run: async ({ messages, signal }) => { return streamText({ ...chat.toStreamTextOptions({ registry }), messages, tools: { /* ... */ }, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` On the frontend, the `usePendingMessages` hook handles sending, tracking, and rendering injection points. See [Pending Messages](/docs/ai-chat/pending-messages) for the full guide — backend configuration, frontend hook, queuing vs steering, and how injection works with all three chat variants. ### Background injection Inject context from background work into the conversation using `chat.inject()`. Combine with `chat.defer()` to run analysis between turns and inject results before the next response — self-review, RAG augmentation, safety checks, etc. 
```ts theme={"theme":"css-variables"} export const myChat = chat.agent({ id: "my-chat", onTurnComplete: async ({ messages }) => { chat.defer( (async () => { const review = await generateObject({ /* ... */ }); if (review.object.needsImprovement) { chat.inject([ { role: "system", content: `[Self-review]\n${review.object.suggestions.join("\n")}`, }, ]); } })() ); }, run: async ({ messages, signal }) => { return streamText({ ...chat.toStreamTextOptions({ registry }), messages, abortSignal: signal }); }, }); ``` See [Background Injection](/docs/ai-chat/background-injection) for the full guide — timing, self-review example, and how it differs from pending messages. ### Actions Custom actions let the frontend send structured commands (undo, rollback, edit, regenerate) that modify the conversation state. **Actions are not turns**: they fire `hydrateMessages` (if set) and `onAction` only. The full surface (defining `actionSchema`, returning a model response from `onAction`, gating against pending HITL tool calls, and sending actions from the frontend) lives on its own page. See [Actions](/docs/ai-chat/actions). ### Chat history Imperative API for reading and modifying the accumulated message history. Works from any hook (`onAction`, `onTurnStart`, `onBeforeTurnComplete`, `onTurnComplete`, `hydrateMessages`) or from `run()` and AI SDK tools. The agent's accumulator — not `session.out` — is the source of truth for the full conversation. The `.out` stream is a bounded sliding window (roughly one turn at steady state, see [Records on `session.out`](/docs/ai-chat/client-protocol#records-on-session-out)); the durable history lives in the agent's accumulator and is persisted to S3 between turns for fast next-run boots. `chat.history` reads and mutates that accumulator directly. **Reads.** Synchronous against the current accumulator state. 
| Method | Description |
| --- | --- |
| `chat.history.all()` | Returns a copy of the current accumulated UI messages. |
| `chat.history.getChain()` | Same as `all()`. Use whichever name reads better in context. |
| `chat.history.findMessage(messageId)` | Returns the message with that id, or `undefined`. |
| `chat.history.getPendingToolCalls()` | Tool calls on the most recent assistant message that are still in `input-available` state (waiting on `addToolOutput`). |
| `chat.history.getResolvedToolCalls()` | All tool calls in the chain in `output-available` or `output-error` state. |
| `chat.history.extractNewToolResults(message)` | Tool results in `message` whose `toolCallId` is not already resolved in the chain. Most useful in `hydrateMessages` against an incoming wire message, before the runtime merges it. |

Each pending and resolved entry is shaped `{ toolCallId, toolName, messageId }`. Each new-result entry is `{ toolCallId, toolName, output, errorText? }`, where `errorText` is set only for `output-error` parts.

**Mutations.** Applied at lifecycle checkpoints (after hooks return). Multiple mutations in the same hook compose correctly.

| Method | Description |
| --- | --- |
| `chat.history.set(messages)` | Replace all messages. Same as `chat.setMessages()`. |
| `chat.history.remove(messageId)` | Remove a specific message by ID. |
| `chat.history.rollbackTo(messageId)` | Keep messages up to and including the given ID (undo). |
| `chat.history.replace(messageId, message)` | Replace a specific message by ID (edit). |
| `chat.history.slice(start, end?)` | Keep only messages in the given range. |

```ts theme={"theme":"css-variables"}
// Undo the last exchange in onAction
onAction: async ({ action }) => {
  if (action.type === "undo") {
    chat.history.slice(0, -2);
  }
},

// Trim history in onTurnComplete
onTurnComplete: async ({ uiMessages }) => {
  if (uiMessages.length > 50) {
    chat.history.slice(-20);
  }
},
```

The HITL reads let an action or hook decide what to do without walking the accumulator manually:

```ts theme={"theme":"css-variables"}
// Refuse a regenerate while a tool call is still awaiting an answer
onAction: async ({ action }) => {
  if (action.type === "regenerate") {
    if (chat.history.getPendingToolCalls().length > 0) return;
    chat.history.slice(0, -1);
  }
},

// Side-effect once per net-new tool result when wire messages come in
hydrateMessages: async ({ incomingMessages }) => {
  for (const msg of incomingMessages) {
    for (const r of chat.history.extractNewToolResults(msg)) {
      await onToolResolved({ id: r.toolCallId, output: r.output, errorText: r.errorText });
    }
  }
  return incomingMessages;
},
```

`extractNewToolResults` compares against the *current* chain. Inside `onTurnComplete`, the chain already contains the just-finished `responseMessage`, so it returns `[]`. Use it where the message is from outside the accumulator: `hydrateMessages` (incoming wire), `onAction` if the action carries a message, or any custom pre-merge code path.

### prepareMessages

Transform model messages before they're used anywhere — in `run()`, in compaction rebuilds, and in compaction results. Define once, applied everywhere. Use this for Anthropic cache breaks, injecting system context, stripping PII, etc.
```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  prepareMessages: ({ messages, reason }) => {
    // Add Anthropic cache breaks to the last message
    if (messages.length === 0) return messages;
    const last = messages[messages.length - 1];
    return [
      ...messages.slice(0, -1),
      {
        ...last,
        providerOptions: {
          ...last.providerOptions,
          anthropic: { cacheControl: { type: "ephemeral" } },
        },
      },
    ];
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

The `reason` field tells you why messages are being prepared:

| Reason | Description |
| --- | --- |
| `"run"` | Messages being passed to `run()` for `streamText` |
| `"compaction-rebuild"` | Rebuilding from a previous compaction summary |
| `"compaction-result"` | Fresh compaction just produced these messages |

### Version upgrades

Chat agent runs are pinned to the worker version they started on. When you deploy a new version, suspended runs resume on the old code. Call `chat.requestUpgrade()` in `onTurnStart` to skip `run()` and exit immediately — the transport re-triggers the same message on the latest version.

See the [Version Upgrades pattern](/docs/ai-chat/patterns/version-upgrades) for the full guide.

### Ending a run on your terms

By default, a chat agent stays idle after each turn waiting for the next user message. Call `chat.endRun()` from `run()`, `chat.defer()`, `onBeforeTurnComplete`, or `onTurnComplete` to exit the loop once the current turn finishes — no upgrade signal, no idle wait.

```ts theme={"theme":"css-variables"}
chat.agent({
  id: "one-shot",
  run: async ({ messages, signal }) => {
    // Single-response agent — exit after this turn.
    chat.endRun();
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

The current turn streams through normally, `onBeforeTurnComplete` / `onTurnComplete` fire, the turn-complete chunk is written, and the run exits instead of suspending. The next user message on the same `chatId` starts a fresh run via the standard continuation flow.

Use this when the agent knows its work is done (budget exhausted, goal achieved, one-shot response) rather than relying on the idle timeout. Unlike `chat.requestUpgrade()`, no `upgrade-required` signal is sent to the client, so there are no version-migration semantics.

If you persist `lastEventId` to your own storage for cross-page-load resume, **don't clear it on `chat.endRun()`**. The cursor is sessionId-keyed and stays valid across Run boundaries — clearing it forces the next `sendMessages` to subscribe from `seq_num=0`, where it may hit the prior turn's stale `turn-complete` record and close the stream empty before the new Run's chunks arrive.

### Runtime configuration

#### chat.setTurnTimeout()

Override how long the run stays suspended waiting for the next message. Call from inside `run()`:

```ts theme={"theme":"css-variables"}
run: async ({ messages, signal }) => {
  chat.setTurnTimeout("2h"); // Wait longer for this conversation
  return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
},
```

#### chat.setIdleTimeoutInSeconds()

Override how long the run stays idle (active, using compute) after each turn:

```ts theme={"theme":"css-variables"}
run: async ({ messages, signal }) => {
  chat.setIdleTimeoutInSeconds(60); // Stay idle for 1 minute
  return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
},
```

Longer idle timeout means faster responses but more compute usage. Set to `0` to suspend immediately after each turn (minimum compute cost, slight delay on next message).
#### Stream options

Control how `streamText` results are converted to the frontend stream via `toUIMessageStream()`. Set static defaults on the task, or override per-turn.

##### Error handling with onError

When `streamText` encounters an error mid-stream (rate limits, API failures, network errors), the `onError` callback converts it to a string that's sent to the frontend as an `{ type: "error", errorText }` chunk. The AI SDK's `useChat` receives this via its `onError` callback.

By default, the raw error message is sent to the frontend. Use `onError` to sanitize errors and avoid leaking internal details:

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  uiMessageStreamOptions: {
    onError: (error) => {
      // Log the full error server-side for debugging
      console.error("Stream error:", error);
      // Return a sanitized message — this is what the frontend sees
      if (error instanceof Error && error.message.includes("rate limit")) {
        return "Rate limited — please wait a moment and try again.";
      }
      return "Something went wrong. Please try again.";
    },
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

`onError` is also called for tool execution errors, so a single handler covers both LLM errors and tool failures.
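A sanitizer like this is easier to unit-test when it lives in its own function. The sketch below is one way to factor it out. `sanitizeStreamError` and the message strings are illustrative names rather than SDK exports, and the case-insensitive `rate limit` match assumes your provider words its errors that way.

```ts
// Hypothetical helper (not an SDK export): maps any thrown value to a
// string that is safe to show in the UI.
function sanitizeStreamError(error: unknown): string {
  const message = error instanceof Error ? error.message : String(error);

  // Assumption: the provider's rate-limit errors mention "rate limit".
  if (/rate limit/i.test(message)) {
    return "Rate limited. Please wait a moment and try again.";
  }

  // Never echo the raw message: it may contain hosts, keys, or stack details.
  return "Something went wrong. Please try again.";
}
```

Because it takes the error and returns a string, a function of this shape could be passed directly as the `onError` value under `uiMessageStreamOptions`, with server-side logging done separately.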
On the frontend, handle the error in `useChat`: ```tsx theme={"theme":"css-variables"} const { messages, sendMessage } = useChat({ transport, onError: (error) => { // error.message contains the string returned by your onError handler toast.error(error.message); }, }); ``` ##### Reasoning and sources Control which AI SDK features are forwarded to the frontend: ```ts theme={"theme":"css-variables"} export const myChat = chat.agent({ id: "my-chat", uiMessageStreamOptions: { sendReasoning: true, // Forward model reasoning (default: true) sendSources: true, // Forward source citations (default: false) }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` ##### Custom message IDs By default, response message IDs are generated using the AI SDK's built-in `generateId`. Pass a custom `generateMessageId` function to use your own ID format (e.g. UUID-v7): ```ts theme={"theme":"css-variables"} import { v7 as uuidv7 } from "uuid"; export const myChat = chat.agent({ id: "my-chat", uiMessageStreamOptions: { generateMessageId: () => uuidv7(), }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` With the `.withUIMessage()` builder, set it under `streamOptions`: ```ts theme={"theme":"css-variables"} import { v7 as uuidv7 } from "uuid"; export const myChat = chat .withUIMessage({ streamOptions: { generateMessageId: () => uuidv7(), sendReasoning: true, }, }) .agent({ id: "my-chat", run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` The generated ID is sent to the frontend in the stream's `start` chunk, so frontend and backend always reference the same ID for each message. 
This is important for features like tool approvals, where the frontend resends an assistant message and the backend needs to match it by ID in the conversation accumulator. ##### Per-turn overrides Override per-turn with `chat.setUIMessageStreamOptions()` — per-turn values merge with the static config (per-turn wins on conflicts). The override is cleared automatically after each turn. ```ts theme={"theme":"css-variables"} run: async ({ messages, clientData, signal }) => { // Enable reasoning only for certain models if (clientData.model?.includes("claude")) { chat.setUIMessageStreamOptions({ sendReasoning: true }); } return streamText({ model: openai(clientData.model ?? "gpt-4o"), messages, abortSignal: signal }); }, ``` `chat.setUIMessageStreamOptions()` works across all abstraction levels — `chat.agent()`, `chat.createSession()` / `turn.complete()`, and `chat.pipeAndCapture()`. See [ChatUIMessageStreamOptions](/docs/ai-chat/reference#chatuimessagestreamoptions) for the full reference. `onFinish` is managed internally for response capture and cannot be overridden here. Use `streamText`'s `onFinish` callback for custom finish handling, or use [raw task mode](/docs/ai-chat/custom-agents) for full control over `toUIMessageStream()`. 
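The merge rule for per-turn overrides can be pictured as a shallow object spread. The following sketches that rule only and is not the SDK's internals: `StreamOptions` and `resolveStreamOptions` are hypothetical names, and the option set is trimmed to two fields.

```ts
// Hypothetical, trimmed-down option shape for illustration.
type StreamOptions = {
  sendReasoning?: boolean;
  sendSources?: boolean;
};

// Per-turn values win on conflicts; anything not overridden falls back to
// the static config.
function resolveStreamOptions(
  staticConfig: StreamOptions,
  perTurn: StreamOptions | undefined
): StreamOptions {
  return { ...staticConfig, ...(perTurn ?? {}) };
}

const staticConfig: StreamOptions = { sendReasoning: false, sendSources: true };

// Turn 1 sets an override, so reasoning is forwarded for this turn only.
const turn1 = resolveStreamOptions(staticConfig, { sendReasoning: true });

// The override is cleared after the turn, so turn 2 sees the static config.
const turn2 = resolveStreamOptions(staticConfig, undefined);
```

Here `turn1` forwards reasoning while keeping the static `sendSources` value, and `turn2` is back to the static defaults.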
### Manual mode with task()

If you need full control over task options, use the standard `task()` with `ChatTaskPayload` and `chat.pipe()`:

```ts theme={"theme":"css-variables"}
import { task } from "@trigger.dev/sdk";
import { chat, type ChatTaskPayload } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

export const manualChat = task({
  id: "manual-chat",
  retry: { maxAttempts: 3 },
  queue: { concurrencyLimit: 10 },
  run: async (payload: ChatTaskPayload) => {
    const result = streamText({
      model: anthropic("claude-sonnet-4-5"),
      messages: payload.messages,
      stopWhen: stepCountIs(15),
    });
    await chat.pipe(result);
  },
});
```

Manual mode does not get automatic message accumulation or the `onTurnComplete`/`onChatStart` lifecycle hooks. The `responseMessage` field in `onTurnComplete` will be `undefined` when using `chat.pipe()` directly. Use `chat.agent()` for the full multi-turn experience.

***

## Custom agents

Both lower levels — `chat.createSession()` (managed turn iterator, your turn body) and `chat.customAgent()` with raw primitives (hand-rolled loop, full stream-conversion control) — are covered together on the Custom agents page, including the `ChatTurn` surface, the continuation-seeding pattern, and the hand-rolled-loop checklist:

Build agents without the managed lifecycle — createSession or raw primitives.

# Background injection

Source: https://trigger.dev/docs/ai-chat/background-injection

Inject context from background work into the agent's conversation — self-review, RAG augmentation, or any async analysis.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.
## Overview `chat.inject()` queues model messages for injection into the conversation. Messages are picked up at the start of the next turn or at the next `prepareStep` boundary (between tool-call steps). This is the backend counterpart to [pending messages](/docs/ai-chat/pending-messages) — pending messages come from the user via the frontend, while `chat.inject()` comes from your task code. ## Basic usage ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; // Queue a system message for injection chat.inject([ { role: "system", content: "The user's account was just upgraded to Pro.", }, ]); ``` Messages are appended to the model messages before the next LLM inference call. The LLM sees them as part of the conversation context. ## Common pattern: defer + inject The most powerful pattern combines `chat.defer()` (background work) with `chat.inject()` (inject results). Background work runs in parallel with the idle wait between turns, and results are injected before the next response. ```ts theme={"theme":"css-variables"} export const myChat = chat.agent({ id: "my-chat", onTurnComplete: async ({ messages }) => { // Kick off background analysis — doesn't block the turn chat.defer( (async () => { const analysis = await analyzeConversation(messages); chat.inject([ { role: "system", content: `[Analysis of conversation so far]\n\n${analysis}`, }, ]); })() ); }, run: async ({ messages, signal }) => { return streamText({ ...chat.toStreamTextOptions({ registry }), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` ### Timing 1. Turn completes, `onTurnComplete` fires 2. `chat.defer()` registers the background work 3. The run immediately starts waiting for the next message (no blocking) 4. Background work completes, `chat.inject()` queues the messages 5. User sends next message, turn starts 6. Injected messages are appended before `run()` executes 7. 
The LLM sees the injected context alongside the new user message If the background work finishes *during* a tool-call loop (not between turns), the messages are picked up at the next `prepareStep` boundary instead. ## Example: self-review A cheap model reviews the agent's response after each turn and injects coaching for the next one. Uses [Prompts](/docs/ai/prompts) for the review prompt and `generateObject` for structured output. ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { prompts } from "@trigger.dev/sdk"; import { streamText, generateObject, createProviderRegistry, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { z } from "zod"; const registry = createProviderRegistry({ anthropic }); const selfReviewPrompt = prompts.define({ id: "self-review", model: "anthropic:claude-haiku-4-5", content: `You are a conversation quality reviewer. Analyze the assistant's most recent response. Focus on: - Whether the response answered the user's question - Missed opportunities to use tools or provide more detail - Tone mismatches Be concise. Only flag issues worth fixing.`, }); export const myChat = chat.agent({ id: "my-chat", onTurnComplete: async ({ messages }) => { chat.defer( (async () => { const resolved = await selfReviewPrompt.resolve({}); const review = await generateObject({ model: registry.languageModel(resolved.model ?? "anthropic:claude-haiku-4-5"), ...resolved.toAISDKTelemetry(), system: resolved.text, prompt: messages .filter((m) => m.role === "user" || m.role === "assistant") .map((m) => { const text = typeof m.content === "string" ? m.content : Array.isArray(m.content) ? 
m.content .filter((p: any) => p.type === "text") .map((p: any) => p.text) .join("") : ""; return `${m.role}: ${text}`; }) .join("\n\n"), schema: z.object({ needsImprovement: z.boolean(), suggestions: z.array(z.string()), }), }); if (review.object.needsImprovement) { chat.inject([ { role: "system", content: `[Self-review]\n\n${review.object.suggestions.map((s) => `- ${s}`).join("\n")}\n\nApply these naturally.`, }, ]); } })() ); }, run: async ({ messages, signal }) => { return streamText({ ...chat.toStreamTextOptions({ registry }), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` The self-review runs on `claude-haiku-4-5` (fast, cheap) in the background. If the user sends another message before it completes, the coaching is still injected — `chat.inject()` persists across the idle wait. ## Other use cases * **RAG augmentation**: After each turn, fetch relevant documents and inject them as context for the next response * **Safety checks**: Run a moderation model on the response, inject warnings if issues are detected * **Fact-checking**: Verify claims in the response using search tools, inject corrections * **Context enrichment**: Look up user/account data based on what was discussed, inject it as system context ## `chat.defer` standalone `chat.defer()` is also useful on its own, without `chat.inject()`. Any work whose timing has no resume implication — analytics, audit logs, search-index writes, cache warming — can run in parallel with streaming instead of in the critical path. All deferred promises are awaited (with a 5s timeout) before `onTurnComplete` fires. ```ts theme={"theme":"css-variables"} export const myChat = chat.agent({ id: "my-chat", onTurnStart: async ({ chatId, runId }) => { // Analytics — fire-and-forget, irrelevant to resume. 
chat.defer(analytics.track("turn_started", { chatId, runId })); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` `chat.defer()` can be called from anywhere during a turn — hooks, `run()`, or nested helpers. All deferred promises are collected and awaited together before `onTurnComplete`. **Don't use `chat.defer()` for the message-history write in `onTurnStart`.** That write must land *before* the model starts streaming, otherwise a mid-stream page refresh will read `[]` from your DB and lose the user's message from the rendered conversation. See [Database persistence — `onTurnStart`](/docs/ai-chat/patterns/database-persistence#onturnstart). Reserve `chat.defer` for writes whose timing has no resume implication. ## How it differs from pending messages | | `chat.inject()` | [Pending messages](/docs/ai-chat/pending-messages) | | ----------------------- | --------------------------------------------------- | --------------------------------------------- | | **Source** | Backend task code | Frontend user input | | **Triggered by** | Your code (e.g. `onTurnComplete` + `chat.defer()`) | User sending a message during streaming | | **Injection point** | Start of next turn, or next `prepareStep` boundary | Next `prepareStep` boundary only | | **Message role** | Any (`system`, `user`, `assistant`) | Typically `user` | | **Frontend visibility** | Not visible unless you write custom `data-*` chunks | Visible via `usePendingMessages` hook | ## API reference ### chat.inject() ```ts theme={"theme":"css-variables"} chat.inject(messages: ModelMessage[]): void ``` Queue model messages for injection at the next opportunity. Messages persist across the idle wait between turns — they are not reset when a new turn starts. 
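As a mental model, the queue can be pictured as a plain array that only empties when something drains it. This is a sketch of the semantics described on this page, not the SDK's implementation: `InjectQueue`, `QueuedMessage`, and `drain` are hypothetical names.

```ts
// Hypothetical stand-in for the `ModelMessage` type from the `ai` package.
type QueuedMessage = { role: "system" | "user" | "assistant"; content: string };

class InjectQueue {
  private pending: QueuedMessage[] = [];

  // Mirrors chat.inject(): append, never replace.
  inject(messages: QueuedMessage[]): void {
    this.pending.push(...messages);
  }

  // Called at a drain point (turn start or a prepareStep boundary):
  // hands back everything queued so far and empties the queue.
  drain(): QueuedMessage[] {
    const drained = this.pending;
    this.pending = [];
    return drained;
  }
}

const injectQueue = new InjectQueue();
injectQueue.inject([{ role: "system", content: "Account upgraded to Pro." }]);
// ...idle wait between turns: nothing clears the queue...
injectQueue.inject([{ role: "system", content: "[Self-review]\n- Be more concise." }]);

const atTurnStart = injectQueue.drain(); // both messages, in insertion order
const atNextStep = injectQueue.drain(); // empty: messages are consumed once
```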
**Parameters:** | Parameter | Type | Description | | ---------- | ---------------- | ------------------------------------------------ | | `messages` | `ModelMessage[]` | Model messages to inject (from the `ai` package) | Messages are drained (consumed) when: 1. A new turn starts — before `run()` executes 2. A `prepareStep` boundary is reached — between tool-call steps during streaming `chat.inject()` writes to an in-memory queue in the current process. It works from any code running in the same task — lifecycle hooks, deferred work, tool execute functions, etc. It does not work from subtasks or other runs. # Changelog Source: https://trigger.dev/docs/ai-chat/changelog Pre-release updates for AI chat agents. ## chat.agent reliability fixes A batch of fixes for edge cases around message delivery, stopping, and error handling: * **No more duplicate turns from mid-stream sends.** A user message sent while the agent was streaming could be delivered twice — once via steering and again on the next turn — running a duplicate turn. Delivery is now deduplicated. * **Idempotent input appends.** Sends to `session.in` carry an idempotency key, so a client retry after a network blip can't append the same message twice. * **Stop clears streaming state.** Stopping a generation now clears the session's streaming snapshot, so a page reload right after a stop no longer replays the stopped turn. * **`onTurnComplete` fires on errored turns.** When `run()` or a lifecycle hook throws, `onTurnComplete` now runs with `error` carrying the thrown value and `finishReason: "error"`, and the failed turn's user message is persisted so it isn't lost on the next run. Use this to mark the turn failed in your own storage. See [error handling](/docs/ai-chat/error-handling#using-onturncomplete). ```ts theme={"theme":"css-variables"} onTurnComplete: async ({ chatId, uiMessages, stopped, error }) => { await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages, lastTurnStatus: error ? 
"errored" : stopped ? "stopped" : "ok", }, }); }, ``` * **Full tag sets on chat runs.** Runs triggered by chat sessions can now carry the full set of dashboard tags instead of being silently truncated. * **Stream hygiene for custom agents.** Manual `chat.writeTurnComplete()` callers now trim the output stream the way `chat.agent` does, sending a custom action no longer leaves a second stream reader running, and a long-lived `watch` subscription no longer grows its dedupe set without bound. ## Continuation boots no longer stall Continuation runs (after a cancel, crash, or version upgrade) used to stall around 10 seconds before the first turn: finding the `session.in` resume cursor drained an SSE long-poll that always waited out its full 5 second inactivity window, twice per boot. The cursor is now found with a non-blocking records read, the boot reads run concurrently, and chat snapshots carry the cursor so subsequent boots skip the scan entirely. ## chat.headStart: hydration and reasoning fixes Two fixes for the [Head Start](/docs/ai-chat/fast-starts) handover: * With `hydrateMessages` registered, the warm route's step-1 partial now reaches the agent's accumulator, so `onTurnComplete` carries the full first turn, tool-call handovers resume from step 2 instead of re-running step 1, and the assistant `messageId` stays stable across the handover. * Extended-thinking models' step-1 reasoning now lands in the durable session history (and `onTurnComplete`) under the same assistant `messageId`, with provider metadata intact so Anthropic thinking signatures survive replays. ## chat.createSession: stop and continuation fixes Stopping a generation no longer wedges the run: `turn.complete()` bare-awaited the AI SDK's `totalUsage` promise, which never settles after a stop-abort, so the loop hung inside the stopped turn and the chat couldn't take another message. It's now raced with a timeout, the same guard `chat.agent`'s turn loop uses. 
Continuation runs also no longer invoke the model with an empty prompt: a message-less continuation boot now waits for the next session input, and `turn.continuation` is preserved so your loop can seed stored history on the first turn: ```ts theme={"theme":"css-variables"} for await (const turn of session) { if (turn.continuation && turn.number === 0) { const stored = await loadMessages(turn.chatId); const incoming = turn.uiMessages.filter((m) => !stored.some((s) => s.id === m.id)); await turn.setMessages([...stored, ...incoming]); } // ... streamText + turn.complete as usual } ``` See [chat.createSession](/docs/ai-chat/backend#chat-createsession). ## trigger skills: agent skills for your coding assistant The CLI's new `trigger skills` command installs Trigger.dev agent skills — including the chat.agent authoring skill — into your coding assistant's native skills directory (Claude Code, Cursor, GitHub Copilot, and AGENTS-compatible tools such as Codex). The skills ship inside the CLI, versioned with it, and `trigger dev` offers to install them on first run. `trigger init` can now also set up the MCP server and skills as part of project scaffolding. ```bash theme={"theme":"css-variables"} npx trigger.dev@4.5.0-rc.6 skills ``` ## AI SDK 7 support `chat.agent` and the chat surfaces now work against Vercel AI SDK 7. The `ai` peer range widened to include v7, so you can build your agent against v5, v6, or v7 with the same `@trigger.dev/sdk/ai`, `chat`, and `chat/react` imports; your installed `ai` major drives the types. v5 and v6 are unchanged. On v7, model-call spans moved out of `ai` core into the separate `@ai-sdk/otel` adapter, so `experimental_telemetry` alone produces nothing until an integration is registered. 
Install `@ai-sdk/otel` alongside `ai@7` and the SDK registers it for you once per worker at chat agent boot, so your `streamText` spans keep flowing into the run trace with no extra setup: ```sh theme={"theme":"css-variables"} npm install @ai-sdk/otel ``` If you (or a library you import) already register `@ai-sdk/otel`, the SDK detects the existing integration and skips its own registration, so you won't get duplicate spans. Set `TRIGGER_AI_SDK_OTEL_AUTOREGISTER=0` to disable auto-registration entirely. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and [AI SDK 7 telemetry](/docs/ai-chat/reference#ai-sdk-7-telemetry) in the reference. Task-backed tools wired in with `ai.toolExecute` also propagate their tool `context` on v7, which renamed the field from v6's `experimental_context`. ## `useTriggerChatTransport` recovers a stale session When a chat's restored session state pointed at a session that no longer exists in the current environment (restored from a different environment, or from before the sessions model), the transport assumed it was live and never created a real one, so the next message 404'd and the chat could not send. The transport now treats a 404 from a session call as a missing session: after the existing token refresh it recreates the session via `startSession`, drops the stale resume cursor, and retries the send once. ## `tools` option on `chat.agent`: `toModelOutput` survives across turns `chat.agent` now takes a `tools` option. Until now tools only went to `streamText` inside `run()`, which meant the SDK had no tools when it re-converted the persisted `UIMessage` history at the start of each turn. Any tool with a `toModelOutput` (raw image bytes turned into an image content part, or a sub-agent transcript compressed to a summary) had its transform applied on turn 1 and skipped from turn 2 onward, so the raw output got stringified back into the prompt. 
Declare your tools on the config and the SDK threads them into that conversion, so `toModelOutput` is re-applied every turn. The resolved set is handed back, typed, on the `run()` payload as `tools`, so you declare them once: ```ts theme={"theme":"css-variables"} const tools = { searchDocs, renderChart }; export const myChat = chat.agent({ tools, run: async ({ messages, tools, signal }) => streamText({ ...chat.toStreamTextOptions({ tools }), messages, abortSignal: signal }), }); ``` `tools` also accepts a per-turn function (`(event) => ToolSet`) for tools that depend on the user or a feature flag. Only `inputSchema` and `toModelOutput` are read during conversion, never `execute`. No behavior change for agents that don't declare `tools`. A new `InferChatUIMessageFromTools` helper derives the chat `UIMessage` type (with typed tool parts) directly from a tool set. See the new [Tools](/docs/ai-chat/tools) guide. ## HITL continuations — slim wire by default + field-level merge `chat.addToolOutput(...)` and `chat.addToolApproveResponse(...)` continuations on reasoning-heavy agent loops used to fail two ways: either the wire body crossed the `/in/append` cap (encrypted reasoning blobs + tool input routinely > 512 KiB), or apps that slimmed the wire as a workaround landed a tool call with no `arguments` on the next LLM step (the per-turn merge replaced the hydrated message wholesale instead of overlaying only the new tool-state advance). Both modes are fixed. The transport (`TriggerChatTransport.sendMessages`, `AgentChat.sendRaw`) now slims the assistant message itself on `submit-message` turns whose assistant carries resolved or approval-responded tool parts. The wire shape ships as `{ id, role: "assistant", parts: [] }` — `state` plus `output` / `errorText` / `approval`, depending on the new state. Everything else (reasoning blobs, prior text, tool `input`, provider metadata) is reconstructed server-side from `hydrateMessages` or the durable snapshot. 
Continuation payloads typically drop from 600 KiB – 1 MiB to \~1 KiB. The per-turn merge now overlays only the tool-part state advances (`output-available` / `output-error` / `approval-responded` / `output-denied`) from the wire copy onto the matching hydrated entry. Hydrated `input`, text, reasoning, and provider metadata stay put. The agent still accepts a fuller `UIMessage` on the wire (the merge only reads the resolved fields), so custom transports that ship more don't break — they just waste bytes. ### `hydrateMessages` upsert-by-id If your `hydrateMessages` hook persists the incoming message, **upsert by id** — don't unconditionally push. HITL continuations ship the existing assistant's id with a slim payload; a blind `stored.push(newMsg)` duplicates the row in the chain you return, the merge updates the first match, and the slim duplicate hits `toModelMessages` with no `input`. A new `upsertIncomingMessage` helper is exported from `@trigger.dev/sdk/ai` to handle this for the common case: ```ts theme={"theme":"css-variables"} import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai"; chat.agent({ hydrateMessages: async ({ chatId, trigger, incomingMessages }) => { const record = await db.chat.findUnique({ where: { id: chatId } }); const stored = record?.messages ?? []; if (upsertIncomingMessage(stored, { trigger, incomingMessages })) { await db.chat.update({ where: { id: chatId }, data: { messages: stored } }); } return stored; }, }); ``` The helper pushes fresh user messages, no-ops on HITL continuations (so the runtime can overlay the new tool-state advance), and skips on non-`submit-message` triggers. Returns `true` if it mutated `stored`. The examples in [lifecycle hooks](/docs/ai-chat/lifecycle-hooks#hydratemessages), [Database persistence](/docs/ai-chat/patterns/database-persistence#alternative-hydratemessages), and [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay) have all been updated. 
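For reference, the upsert-by-id rule can be hand-written in a few lines. This is a simplified sketch under stated assumptions, not the source of `upsertIncomingMessage`: `StoredMessage` and `upsertById` are hypothetical names, and the `trigger` check the real helper performs is left out.

```ts
// Hypothetical minimal message shape; real UIMessages carry typed parts.
type StoredMessage = { id: string; role: "user" | "assistant"; parts: unknown[] };

// Pushes messages whose id is not yet stored and leaves existing ids alone,
// so a slim HITL continuation never overwrites or duplicates the full stored
// copy. Returns true if `stored` was mutated.
function upsertById(stored: StoredMessage[], incoming: StoredMessage[]): boolean {
  const known = new Set(stored.map((m) => m.id));
  let mutated = false;
  for (const message of incoming) {
    if (known.has(message.id)) continue; // existing id: leave it for the runtime merge
    stored.push(message);
    known.add(message.id);
    mutated = true;
  }
  return mutated;
}

const storedChain: StoredMessage[] = [
  { id: "u1", role: "user", parts: [{ type: "text", text: "Deploy it" }] },
  { id: "a1", role: "assistant", parts: [{ type: "tool-deploy", state: "input-available" }] },
];

// A slim HITL continuation reuses the assistant id: nothing is pushed.
const hitlMutated = upsertById(storedChain, [{ id: "a1", role: "assistant", parts: [] }]);

// A fresh user message has a new id: it is appended.
const userMutated = upsertById(storedChain, [{ id: "u2", role: "user", parts: [] }]);
```

The return value plays the same role as in the helper example above: write to the database only when it is `true`.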
Custom hydrate logic (branching, rollback, etc.) can still write the upsert by hand — the helper is a convenience for the common shape.

### `onValidateMessages` slim wire caveat

The slim wire is what arrives in `onValidateMessages` on HITL turns. `validateUIMessages` from `ai` rejects the slim shape (the AI SDK schema requires `input` on resolved tool parts), so filter to user messages first (or skip validation entirely on those turns). See the updated example in [lifecycle hooks](/docs/ai-chat/lifecycle-hooks#onvalidatemessages).

### `/in/append` 413 + precise cap

In parallel:

* The 413 response now carries CORS headers, so browser fetches can read the status instead of failing as opaque `TypeError: Failed to fetch`. App-side retry-on-disconnect loops no longer spin forever on a permanently-rejected payload.
* The per-record cap is now computed precisely against S2's actual ceiling instead of the conservative 512 KiB floor. Legitimate \~600 – 900 KiB tool outputs (search results, file content) now succeed; pathological all-quote content that would double under JSON escape still rejects cleanly with a clear error.

See the updated [413 row in the client protocol](/docs/ai-chat/client-protocol#step-3-send-messages-stops-and-actions).

## v4.5.0-rc.1 — two bug fixes

Patch release on top of `4.5.0-rc.0`. Upgrade with:

```sh theme={"theme":"css-variables"}
npx trigger.dev@4.5.0-rc.1 update       # npm
pnpm dlx trigger.dev@4.5.0-rc.1 update  # pnpm
yarn dlx trigger.dev@4.5.0-rc.1 update  # yarn
bunx trigger.dev@4.5.0-rc.1 update      # bun
```

### Fixes

* **Agent Skills silently missing in `trigger dev`** for projects whose task files read `process.env` at module top level (e.g. a third-party SDK client initialized at import). [Skill folders](/docs/ai-chat/patterns/skills) now bundle into `.trigger/skills/` reliably regardless of which env vars are set when the CLI launches.
([#3690](https://github.com/triggerdotdev/trigger.dev/pull/3690)) * **`COULD_NOT_FIND_EXECUTOR`** when a task's definition is loaded via `await import(...)` from inside another task's `run()` — common when lazy-loading sub-agent tasks. Runtime workers now register such tasks with a sentinel file context, and the catalog logs a one-time warning per task id. ([#3688](https://github.com/triggerdotdev/trigger.dev/pull/3688)) ## v4.5.0-rc.0 — AI Agents graduate from chat-prerelease First release candidate of v4.5. Everything covered by the `0.0.0-chat-prerelease-*` entries below now ships under a stable semver tag. Install: ```bash theme={"theme":"css-variables"} pnpm add @trigger.dev/sdk@rc ``` (Or pin `4.5.0-rc.0` explicitly.) ### What's in the box * **`chat.agent`** — multi-turn AI chat backends as durable Trigger.dev tasks. Lifecycle hooks, recovery from cancel/crash/OOM, version upgrades, all in. See [Overview](/docs/ai-chat/overview) and [Quick Start](/docs/ai-chat/quick-start). * **Sessions** — the durable bi-directional stream primitive that backs `chat.agent`. Use it directly for any pattern that needs durable bi-directional streaming across runs. See [Sessions](/docs/ai-chat/sessions). * **`useTriggerChatTransport`** — a custom AI SDK `ChatTransport` for `useChat`. No API routes. See [Frontend](/docs/ai-chat/frontend). * **Head Start** — opt-in route handler that runs the first `streamText` step in your warm server while the agent boots in parallel. Cuts cold-start TTFC roughly in half. See [Fast starts](/docs/ai-chat/fast-starts#head-start). * **AI Prompts** — code-defined, deploy-versioned templates with dashboard overrides for text + model. Integrates with `chat.agent` via `chat.prompt.set()` + `chat.toStreamTextOptions()`. See [Prompts](/docs/ai/prompts). * **`ai.toolExecute`** — wire any Trigger subtask in as the `execute` of an AI SDK `tool()`. See [Sub-agents](/docs/ai-chat/patterns/sub-agents). 
### Compatibility `@trigger.dev/sdk@4.5.0-rc.0` requires `ai` `^5.0.0 || ^6.0.0` (Vercel AI SDK), React `^18.0 || ^19.0` (for the `chat/react` subpath), and Node.js `>=18.20.0`. Full matrix on the [API Reference](/docs/ai-chat/reference#compatibility). ### Docs This release ships with a refreshed AI Agents documentation set covering [Backend](/docs/ai-chat/backend), [Frontend](/docs/ai-chat/frontend), [Sessions](/docs/ai-chat/sessions), [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks), [`chat.local`](/docs/ai-chat/chat-local), the [Patterns](/docs/ai-chat/patterns/sub-agents) library, [Testing](/docs/ai-chat/testing), and a full [API Reference](/docs/ai-chat/reference). ## Recovery boot — context-preserving continuation after cancel / crash / OOM When a `chat.agent` run dies mid-stream (the user cancels, the worker OOMs, an unhandled exception kills the process), the next continuation run now reconstructs the conversation context automatically. Follow-ups like "keep going" continue the partial response; fresh follow-ups like "scrap that, what's 7+8?" abandon it and answer the new question. No customer code required. Under the hood: the boot now reads BOTH stream tails — `session.out` for any partial assistant the dead run was streaming, `session.in` for any user messages it never acknowledged — and splices `[firstInFlightUser, partialAssistant]` onto the chain when both are present. The model sees full prior context plus the latest user message. 
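The splice described above can be pictured as a small pure function. This is only a sketch of the stated rule: `Msg` and `spliceRecovered` are hypothetical names, and the fallback of leaving the chain unchanged when either piece is missing is an assumption, since the text only specifies the case where both are present.

```ts
// Hypothetical, simplified message shape for illustration only.
type Msg = { id: string; role: "user" | "assistant"; text: string };

// Appends [firstInFlightUser, partialAssistant] to the settled chain when
// both exist. Assumption: otherwise the chain is returned unchanged.
function spliceRecovered(
  settled: Msg[],
  inFlightUsers: Msg[],
  partialAssistant?: Msg
): Msg[] {
  const firstInFlightUser = inFlightUsers[0];
  if (firstInFlightUser !== undefined && partialAssistant !== undefined) {
    return [...settled, firstInFlightUser, partialAssistant];
  }
  return [...settled];
}

// Usage: the dead run had received "u2" and streamed part of "a2".
const recovered = spliceRecovered(
  [
    { id: "u1", role: "user", text: "Write a haiku" },
    { id: "a1", role: "assistant", text: "Done." },
  ],
  [{ id: "u2", role: "user", text: "Now a longer poem" }],
  { id: "a2", role: "assistant", text: "Roses are" }
);
console.log(recovered.map((m) => m.id));
```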
For policies different from "preserve context" — drop the partial entirely, synthesize tool results for an interrupted tool call, emit a recovery banner to the UI — register the new `onRecoveryBoot` hook: ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; export const myChat = chat.agent({ id: "my-chat", onRecoveryBoot: async ({ partialAssistant, inFlightUsers, writer, cause, previousRunId }) => { writer.write({ type: "data-chat-recovery", data: { cause, previousRunId, partialPresent: partialAssistant !== undefined }, transient: true, }); // return nothing → smart default applies }, run: async ({ messages, signal }) => streamText({ model, messages, abortSignal: signal }), }); ``` The hook receives `settledMessages`, `inFlightUsers`, `partialAssistant`, `pendingToolCalls`, `previousRunId`, `cause`, and a lazy `writer`. Return any of `chain`, `recoveredTurns`, or `beforeBoot` to override the default. Agents using `hydrateMessages` skip the hook — customer-owned persistence is the source of truth. Also retracts the OOM resilience caveat: model context on retry is no longer "incomplete" without `hydrateMessages`. The smart default reconstructs full context from `session.out` replay. See [Recovery boot](/docs/ai-chat/patterns/recovery-boot) for the full guide. ## `session.out` is now bounded — header-form control records + per-turn trim Long-lived chats were accumulating `session.out` records forever (every turn appends; nothing trimmed). The Sessions dashboard re-streamed the entire history from `seq_num=0` on every page load, and OOM-retry boot scanned the whole stream to find the last turn-complete. After this release `session.out` stays roughly **one turn long forever** at steady state. After each `turn-complete`, the agent appends an S2 `trim` command record pointing back to the previous turn-complete's seq\_num. Full conversation history continues to live in the durable S3 snapshot, not on the stream. 
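To see why the stream stays bounded, here is a toy in-memory model of the per-turn trim. Everything in it is an assumption made for illustration: `SessionOutModel`, the record shapes, and the rule that a trim drops records with a lower sequence number than its target. It also applies the trim immediately, whereas the real trim is eventually consistent.

```ts
// Hypothetical record shapes; the real wire format is header-based.
type OutRecord =
  | { seq: number; kind: "data"; body: string }
  | { seq: number; kind: "turn-complete" }
  | { seq: number; kind: "trim"; trimTo: number };

class SessionOutModel {
  private records: OutRecord[] = [];
  private nextSeq = 0;
  private lastTurnCompleteSeq: number | undefined = undefined;

  appendData(body: string): void {
    this.records.push({ seq: this.nextSeq++, kind: "data", body });
  }

  // Writes turn-complete, then a trim pointing at the PREVIOUS turn-complete.
  completeTurn(): void {
    const previous = this.lastTurnCompleteSeq;
    const seq = this.nextSeq++;
    this.records.push({ seq, kind: "turn-complete" });
    this.lastTurnCompleteSeq = seq;
    if (previous !== undefined) {
      this.records.push({ seq: this.nextSeq++, kind: "trim", trimTo: previous });
      // Assumed semantics: everything before the trim target is dropped.
      this.records = this.records.filter((r) => r.seq >= previous);
    }
  }

  retained(): OutRecord[] {
    return [...this.records];
  }
}

// Usage: ten turns of two chunks each leave only a handful of records.
const model = new SessionOutModel();
for (let turn = 0; turn < 10; turn++) {
  model.appendData(`turn ${turn} chunk 1`);
  model.appendData(`turn ${turn} chunk 2`);
  model.completeTurn();
}
console.log(model.retained().map((r) => r.kind));
```

Because the trim targets the previous turn-complete rather than the current one, the last completed turn boundary is always still readable in this model.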
Resume across a single turn boundary still works (the previous `turn-complete` is still on the stream and S2's eventually-consistent trim window gives 10-60s of grace); resume across multiple turns of inactivity falls back to the snapshot.

### What changed on the wire

`trigger:turn-complete` and `trigger:upgrade-required` are no longer JSON data chunks on `session.out`. They're now **header-form control records** under a uniform `trigger-control` namespace:

```
headers:
  ["trigger-control", "turn-complete"]
  ["public-access-token", "eyJ..."]   // optional, refreshed JWT on turn-complete
body: ""
```

```
headers:
  ["trigger-control", "upgrade-required"]
body: ""
```

The control event names ("turn-complete", "upgrade-required") are unchanged conceptually; they just moved from `chunk.type` into a `trigger-control` header value. Body is always empty; metadata that previously rode in the chunk (e.g. `publicAccessToken`) now rides on sibling headers.

`turn-complete` also picks up a new optional sibling header, `["session-in-event-id", ""]`, carrying the agent's committed-consume cursor on `.in` as of this turn. It's an agent-internal contract that lets the next worker boot seed its `.in` SSE subscription past already-processed user messages, without relying on a wall-clock-derived dedup cutoff. Custom transports should ignore the header; it has no client-side meaning.

### Custom transport implementers

Built-in SDK transports (`TriggerChatTransport`, `AgentChat`) handle this transparently: `onTurnComplete` fires the same way with the same payload. Custom transports filtering on `chunk.type === "trigger:turn-complete"` need to switch to the header-based filter:

```ts theme={"theme":"css-variables"}
import { controlSubtype } from "@trigger.dev/core/v3";

const control = controlSubtype(record.headers);
if (control === "turn-complete") {
  // refresh token from record.headers, end turn, etc.
}
```

The full uniform filter rule (data records vs control records vs S2 command records like `trim`) is documented at [Records on `session.out`](/docs/ai-chat/client-protocol#records-on-session-out).

### Sessions dashboard snapshot read

The Sessions detail page in the trigger.dev dashboard now reads the agent's S3 snapshot first via a presigned URL, then SSE-tails from `snapshot.lastOutEventId`. Bandwidth and time-to-first-render are O(unread turns) instead of O(session lifetime). Sessions that registered a `hydrateMessages` hook (which skips snapshot writes) show only the most recent turn; those customers typically have their own DB-backed dashboards.

### Breaking surface

* Custom transports parsing `chunk.type` for turn-complete / upgrade-required must switch to the `trigger-control` header check.
* Snapshot consumers should import `ChatSnapshotV1` / `ChatSnapshotV1Schema` from `@trigger.dev/core/v3` (now an exported shape, not SDK-internal).

Hard cutover, no compat shim. v4.5 is prerelease.

### Docs

* [Records on `session.out`](/docs/ai-chat/client-protocol#records-on-session-out): full filter rule for data / control / command records.
* [Resuming a stream](/docs/ai-chat/client-protocol#resuming-a-stream): explicit single-turn vs multi-turn-away semantics.
* [`turn-complete` control record](/docs/ai-chat/client-protocol#turn-complete-control-record) and [`upgrade-required` control record](/docs/ai-chat/client-protocol#upgrade-required-control-record): replaced the old chunk-shape docs.

## 512 KiB `/in/append` ceiling removed for long chats: slim wire + S3 snapshot

`chat.agent` long-running chats with heavy tool results were hitting the realtime API's 512 KiB body cap on `/realtime/v1/sessions/{id}/in/append` once the accumulated `UIMessage[]` history (which the wire shipped in full on every send) crossed the limit. The 413 surfaced as a CORS error in browsers and stalled chats around turn 10–30 with tool use.
The wire is now **delta-only**: each `.in/append` carries at most one new `UIMessage` (the new user turn or a tool-approval response) instead of the full history. The agent rebuilds prior history at run boot from a durable JSON snapshot in object storage plus a replay of the `session.out` tail. The 512 KiB ceiling stops being pressure — slim payloads are normally a few KB regardless of chat length. ```ts theme={"theme":"css-variables"} // Before — full history shipped on every send { messages: [u1, a1, u2, a2, /* ... 30 turns ... */, u31], chatId, trigger: "submit-message" } // After — only the new turn { message: u31, chatId, trigger: "submit-message" } ``` ### What changed * **`ChatTaskWirePayload`**: `messages: UIMessage[]` is removed. Replaced by `message?: UIMessage` (singular, optional) and a dedicated `headStartMessages?: UIMessage[]` field used only by `chat.headStart` first-turn handover. * **Run boot**: when `hydrateMessages` is not registered, the runtime reads `packets/{projectRef}/{envSlug}/sessions/{sessionId}/snapshot.json` from object storage and replays any `session.out` chunks landed since the snapshot's cursor. Snapshot writes happen after every `onTurnComplete`, awaited so they survive an idle suspend. * **`hydrateMessages` short-circuit**: registering the hook skips snapshot read/write and replay entirely. Customer is the source of truth for history, same as today. * **`hydrateMessages.incomingMessages`**: now consistently 0-or-1-length across every trigger type. Previously `regenerate-message` and continuations occasionally shipped full history; they now ship none. * **`onChatStart` is now once-per-chat**: fires only on the chat's very first user message; does NOT fire on continuation runs (post-`endRun`, post-waitpoint-timeout, post-`chat.requestUpgrade`) or on OOM-retry attempts. The `continuation` and `previousRunId` fields on `ChatStartEvent` are now `@deprecated` (always `false` / `undefined` when the hook fires). 
Drop any `if (continuation) return;` gates from `onChatStart` — they're now unreachable. For per-turn setup that runs on continuations too, move to `onTurnStart`. * **Continuation boot payload**: the server now strips `message` / `messages` / `trigger` from the cached `basePayload` on continuation runs, and the SDK enters a new continuation-wait branch that waits silently on `session.in` for the next user message. Fixes a phantom-turn bug where stale boot-payload fields were replayed on every resume. * **OOM-retry boot**: uses the snapshot's `lastOutTimestamp` as the `session.in` cutoff, saving one stream subscription per retry. * **Built-in transports**: `TriggerChatTransport`, `AgentChat`, mid-stream pending-message handling, and `chat.headStart` route handler all updated to the slim shape. Existing customer code calling `transport.sendMessage(...)` / `agentChat.sendMessage(...)` is unaffected — the change is below those surfaces. ### Object store configuration Snapshot read/write reuses Trigger.dev's existing object-store infrastructure — the same presigned-URL routes used for large payloads. Set `OBJECT_STORE_*` env vars on your webapp deployment if you haven't already; MinIO works locally via `OBJECT_STORE_DEFAULT_PROTOCOL`. If no object store is configured **and** no `hydrateMessages` hook is registered, conversations don't survive run boundaries (the runtime logs a warning at registration time). Either configure an object store or register `hydrateMessages`. ### Breaking surface * **Custom transports**: any code constructing `ChatTaskWirePayload` directly must drop `messages` and use `message`. See the rewritten [Client Protocol](/docs/ai-chat/client-protocol). * **Client-side `setMessages` no longer round-trips**: full-history mutations on the client never reached the agent before this release either, but the slim wire makes that explicit. Use server-side [`chat.history.set()`](/docs/ai-chat/backend#chat-history) inside `onTurnStart` for compaction. 
* **Custom server-to-server senders**: code calling `apiClient.appendToSessionInput(sessionId, ...)` or hitting `/realtime/v1/sessions/{id}/in/append` directly must switch to the slim shape.

Hard cutover: there is no compat shim. v4.5 is prerelease.

### Docs

* Rewritten [Client Protocol](/docs/ai-chat/client-protocol): slim payload, new `headStartMessages` field, new "How history is rebuilt" and "Head-start protocol caveat" sections.
* New [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay): end-to-end walkthrough of the snapshot model, OOM-retry interaction, crash semantics, `hydrateMessages` short-circuit.
* New [Tool result auditing](/docs/ai-chat/patterns/tool-result-auditing): the `extractNewToolResults` + `onTurnComplete` / `hydrateMessages` pattern for HITL audit logging.
* [v4.5 section of the upgrade guide](/docs/ai-chat/upgrade-guide#v45-wire-format-change): migration steps for custom transports and `hydrateMessages` consumers.
* [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages), [`onChatStart`](/docs/ai-chat/lifecycle-hooks#onchatstart): clarifications on the new `incomingMessages` and `messages` shapes.

## `chat.history` read primitives for HITL flows

Customers building human-in-the-loop tools were re-implementing the same accumulator-walking logic to figure out which tool calls were pending, which were resolved, and which results in an incoming wire message were actually new. Lifted into the SDK as five new methods on `chat.history`:

| Method | Description |
| --- | --- |
| `chat.history.getPendingToolCalls()` | Tool calls on the most recent assistant message in `input-available` state; gates fresh user turns during HITL. |
| `chat.history.getResolvedToolCalls()` | All tool calls in the chain in `output-available` or `output-error` state. |
| `chat.history.extractNewToolResults(message)` | Tool results in `message` whose `toolCallId` is not already resolved on the chain. Most useful in `hydrateMessages` against an incoming wire message, before the runtime merges it. |
| `chat.history.getChain()` | Same as `chat.history.all()`; an alias that reads better alongside parent-aware APIs. |
| `chat.history.findMessage(messageId)` | Direct lookup; `undefined` if absent. |

```ts theme={"theme":"css-variables"}
// Refuse a regenerate while a tool call is awaiting an answer
onAction: async ({ action }) => {
  if (action.type === "regenerate") {
    if (chat.history.getPendingToolCalls().length > 0) return;
    chat.history.slice(0, -1);
  }
},

// Side-effect once per net-new tool result on incoming wire messages
hydrateMessages: async ({ incomingMessages }) => {
  for (const msg of incomingMessages) {
    for (const r of chat.history.extractNewToolResults(msg)) {
      await auditLog.record({ id: r.toolCallId, output: r.output, errorText: r.errorText });
    }
  }
  return incomingMessages;
},
```

See [`chat.history`](/docs/ai-chat/backend#chat-history) and [Human-in-the-loop](/docs/ai-chat/patterns/human-in-the-loop).

## Fix: HITL `addToolOutput` resume preserves the assistant message id

In some HITL flows the AI SDK regenerated the assistant message id when the user's `addToolOutput` answer round-tripped back to the agent. The fresh id slipped past the runtime's id-based merge, leaving the resolved tool answer attached to a sibling assistant message instead of the head, which broke downstream dedup and rendered the tool answer twice.

The runtime now records `toolCallId → head messageId` whenever an assistant with tool parts lands in the accumulator and rewrites the incoming id back via that map before the merge. Customers who had a content-match workaround for this can drop it.
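The id-rewrite idea can be sketched as a small index keyed by tool call id. This is an illustration of the mechanism described above, not the runtime's code: `ToolCallIndex`, `record`, `rewriteIncoming`, and the simplified part shapes are all hypothetical (real AI SDK tool parts carry more fields and a different `type` naming).

```ts
// Hypothetical, simplified message and part shapes.
type ToolPart = { type: "tool"; toolCallId: string; state: string };
type TextPart = { type: "text"; text: string };
type Message = {
  id: string;
  role: "user" | "assistant";
  parts: Array<ToolPart | TextPart>;
};

class ToolCallIndex {
  private headByToolCall = new Map<string, string>();

  // Remember which assistant message first carried each tool call.
  record(message: Message): void {
    if (message.role !== "assistant") return;
    for (const part of message.parts) {
      if (part.type === "tool" && !this.headByToolCall.has(part.toolCallId)) {
        this.headByToolCall.set(part.toolCallId, message.id);
      }
    }
  }

  // If an incoming message resolves a known tool call under a fresh id,
  // return a copy that carries the original head id instead.
  rewriteIncoming(incoming: Message): Message {
    for (const part of incoming.parts) {
      if (part.type !== "tool") continue;
      const headId = this.headByToolCall.get(part.toolCallId);
      if (headId !== undefined && headId !== incoming.id) {
        return { ...incoming, id: headId };
      }
    }
    return incoming;
  }
}

// Usage: the answer arrives under a regenerated id and is mapped back to "a1".
const index = new ToolCallIndex();
index.record({
  id: "a1",
  role: "assistant",
  parts: [{ type: "tool", toolCallId: "call-1", state: "input-available" }],
});
const merged = index.rewriteIncoming({
  id: "regenerated-id",
  role: "assistant",
  parts: [{ type: "tool", toolCallId: "call-1", state: "output-available" }],
});
console.log(merged.id);
```

Once the id matches the head again, an id-based merge can update the existing entry rather than creating a sibling.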
## `chat.agent` actions are no longer turns Submitting an action via `transport.sendAction()` previously fell through to the regular turn machinery, calling `onTurnStart`, `run()`, `onTurnComplete`, etc. — meaning every action fired an LLM call by default. The workaround was a `chat.local`-based `skipModelCall` flag read in `run()`. Actions now fire `hydrateMessages` and `onAction` only. No `onTurnStart` / `prepareMessages` / `onBeforeTurnComplete` / `onTurnComplete`, no `run()` invocation, no turn-counter increment. The trace span is named `chat action` instead of `chat turn N`. `onAction`'s return type widens: returning `void` is side-effect-only (default); returning a `StreamTextResult`, `string`, or `UIMessage` produces a model response that's auto-piped back to the frontend. ### Migration If you had `run()` branching on `payload.trigger === "action"` for a model response, return your `streamText(...)` from `onAction` instead. If you persisted in `onTurnComplete`, do that work inside `onAction`. For state-only actions, just remove the skip-the-model workaround. ```ts theme={"theme":"css-variables"} // before onAction: async ({ action }) => { if (action.type === "regenerate") { runState.skipModelCall = false; chat.history.slice(0, -1); } }, run: async ({ messages, signal }) => { if (runState.skipModelCall) return; return streamText({ model, messages, abortSignal: signal }); }, // after onAction: async ({ action, messages, signal }) => { if (action.type === "regenerate") { chat.history.slice(0, -1); return streamText({ model, messages, abortSignal: signal }); } }, run: async ({ messages, signal }) => streamText({ model, messages, abortSignal: signal }), ``` Actions arriving when no `onAction` handler is configured now `console.warn` once and are ignored — previously they silently fell through to `run()` with an empty wire payload. 
## Fix: duplicate turn after `chat.agent` idle-suspends Every message sent to a `chat.agent` after the run idle-suspended produced two turns on the agent side instead of one — same user message, two LLM calls. Internal session-stream reconnect logic was racing the waitpoint and feeding the just-consumed message back into the next turn's input buffer. No public API change. ## `chat.headStart` — fast first-turn for chat.agent A new opt-in flow that cuts first-turn TTFC roughly in half by running step 1's LLM call in your warm process while the chat.agent run boots in parallel. On the LLM's `tool-calls` boundary, ownership of the durable stream hands over to the agent for tool execution and step 2+. Pure-text first turns finish on the customer side with no LLM call from the trigger run at all. Measured on `claude-sonnet-4-6` (same model both sides): TTFT 2801ms → 1218ms (−57%), total turn 4180ms → 2345ms (−44%). With Head Start, first-text time is essentially the LLM TTFB floor. ### Setup ```ts app/api/chat/route.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/chat-server"; import { streamText } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { headStartTools } from "@/lib/chat-tools/schemas"; export const POST = chat.headStart({ agentId: "my-chat", run: async ({ chat: helper }) => streamText({ ...helper.toStreamTextOptions({ tools: headStartTools }), model: anthropic("claude-sonnet-4-6"), system: "You are a helpful assistant.", }), }); ``` ```tsx components/chat.tsx theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, taskId, clientData }) => startChatSession({ chatId, taskId, clientData }), headStart: "/api/chat", }); ``` ### Bundle isolation Tool schemas (`description` + `inputSchema`) live in their own module that imports only `ai` and `zod`. 
The agent task imports those schemas and adds heavy `execute` fns. The route handler imports schemas only; keeping the warm-process bundle light is what makes the win possible. Runtime "strip executes" helpers don't solve this, because bundlers resolve imports at build time. See [Fast starts → Head Start setup](/docs/ai-chat/fast-starts#setup) for the full split.

### Compared to Preload

Preload eagerly triggers the run on page load (good when you're confident the user *will* send a message; it trades idle compute for fast TTFC). Head Start gates the run on a real first message: no idle compute, and the customer's process runs step 1 directly. Pick one per chat.

### Works on every runtime

`chat.headStart` returns a standard Web Fetch handler, `(req: Request) => Promise<Response>`, so it slots into Next.js App Router, Hono, SvelteKit, Remix / React Router v7, TanStack Start, Astro, Nitro/Nuxt, Elysia, Cloudflare Workers, Bun, Deno, and any other runtime that speaks Web Fetch. Verified runtimes: Node 18+, Bun, Deno, Workers, Vercel (Node and Edge), Netlify (Functions and Edge).

For Node-only frameworks (Express, Fastify, Koa, raw `node:http`), the SDK ships `chat.toNodeListener(handler)`, which converts any Web Fetch handler into a Node `(req, res)` listener with proper streaming, header translation, and client-disconnect propagation.

```ts theme={"theme":"css-variables"}
import express from "express";
import { chat } from "@trigger.dev/sdk/chat-server";

// Web Fetch handler; the `run` body is elided here
const handler = chat.headStart({ agentId: "my-chat", run: ... });

// Adapt it to a Node (req, res) listener for Express
const app = express();
app.post("/api/chat", chat.toNodeListener(handler));
```

## Docs

* New [Head Start guide](/docs/ai-chat/fast-starts#head-start): bundle isolation, schema/execute split, route handler setup, transport option, lifecycle, limitations.
* [Reference](/docs/ai-chat/reference#triggerchattransport-options): `headStart` transport option.
## Resilient SSE reconnection The chat transport now retries indefinitely on network drops with bounded exponential backoff (100ms initial, 5s cap, 50% jitter) instead of giving up after 5 attempts. Reconnects are immediate on `online`, on tab refocus after a long background, and on Safari bfcache restore (`pageshow` with `event.persisted`). A 60s stall detector catches silent-dead-socket cases on mobile where the OS killed the TCP socket without the reader noticing. A 30s per-attempt fetch timeout prevents stuck connections from blocking the retry loop. Resume continues to use `Last-Event-ID`, so no chunks are lost when the connection comes back. No public API change — these are defaults on `TriggerChatTransport`. Customers who built `hasActiveStream` / `isStreaming` flag tracking on their side can drop it: the transport handles the silent-but-stale case internally now. `SSEStreamSubscription` (used by `TriggerChatTransport` and `AgentChat`) gained `retryNow()` and `forceReconnect()` for callers writing custom transports, plus options to tune `maxRetries` / `retryDelayMs` / `maxRetryDelayMs` / `retryJitter` / `fetchTimeoutMs` / `stallTimeoutMs` / `nonRetryableStatuses`. `404` and `410` short-circuit retry by default (stream gone / session closed). ## `chat.agent` now runs on Sessions Every chat is backed by a durable Session row that outlives any single run. `externalId` = your chat ID, `type` = `"chat.agent"`. Under the hood: * Output chunks stream on `session.out` (was a run-scoped `streams.writer("chat")`). * Client messages and stops land on `session.in` as a [`ChatInputChunk`](/docs/ai-chat/reference#chatinputchunk) tagged union (was two run-scoped `streams.input` definitions). * Wire endpoints moved from `/realtime/v1/streams/{runId}/...` to `/realtime/v1/sessions/{sessionId}/...`. See the rewritten [Client Protocol](/docs/ai-chat/client-protocol). 
Public surface (`chat.agent()`, `TriggerChatTransport`, `AgentChat`, `chat.stream` / `chat.messages` / `chat.stopSignal`) is unchanged — existing apps keep working. What's new is: * **Cross-run resume is free.** A chat you were in yesterday resumes against the same `sessionId` today, even if the original run long since exited. No more lost conversations when a run idle-times-out. * **Inbox views via `sessions.list({type: "chat.agent"})`.** Enumerate every chat in your environment, filter by tag or status. * **`TriggerChatTaskResult.sessionId`** + **`ChatTaskRunPayload.sessionId`** — you can reach into the raw session via `sessions.open(payload.sessionId)` for advanced cases (writing from a sub-agent, custom transport). * **Dashboard Agent tab** resolves via `sessionId` and stays in sync with the live stream across runs. The full wire-level protocol (session create, channel routes, JWT scopes) is documented in [Client Protocol](/docs/ai-chat/client-protocol). ## `X-Session-Settled` — fast reconnect on idle chats When a client reconnects to `session.out` and the tail record is a `trigger:turn-complete` marker (agent finished a turn, idle-waiting or exited), the server sets `X-Session-Settled: true` and uses `wait=0` on the underlying S2 read. The SSE drains any remaining records then closes in \~1s instead of long-polling for 60s. Practical impact: `TriggerChatTransport.reconnectToStream` no longer needs a client-side `isStreaming` flag. You can drop the field from your persisted `ChatSession` state entirely — the server decides. Existing callers that still persist `isStreaming` are unaffected; `reconnectToStream` keeps the fast-path short-circuit when it's `false`. ## Migration See the [Sessions Upgrade Guide](/docs/ai-chat/upgrade-guide) for the full step-by-step — auth callback split, persisted `ChatSession` shape, server-side helpers (`chat.createStartSessionAction`, `chat.createAccessToken` for renewal), and the `clientData` validation pivot. 
## Docs * Rewritten [Client Protocol](/docs/ai-chat/client-protocol) — full wire format for the new `/realtime/v1/sessions/{sessionId}/...` endpoints, JWT scopes, S2 direct-write credentials, and `Last-Event-ID` resume. * [Database persistence pattern](/docs/ai-chat/patterns/database-persistence) — new `chatId`-keyed `ChatSession` shape (no more `runId`) and a warning on the `onTurnComplete` race that requires a single atomic write of `messages` + `lastEventId`. * [Reference](/docs/ai-chat/reference) — added `chat.createStartSessionAction`, `chat.createAccessToken`, `ChatInputChunk`, `TriggerChatTaskResult.sessionId`, `ChatTaskRunPayload.sessionId`. The old run-scoped stream-ID constants are gone. * Refreshed [Backend](/docs/ai-chat/backend), [Frontend](/docs/ai-chat/frontend), [Server Chat](/docs/ai-chat/server-chat), [Quick start](/docs/ai-chat/quick-start), [Overview](/docs/ai-chat/overview), [Types](/docs/ai-chat/types), [Error handling](/docs/ai-chat/error-handling), and [Testing](/docs/ai-chat/testing) for the session-based wiring. ## Agent Skills Ship reusable capabilities as folders — a `SKILL.md` plus optional scripts, references, and assets. The agent sees short descriptions in its system prompt, loads full instructions on demand via `loadSkill`, and invokes bundled scripts via `bash` — no manual wiring. `skills.define({ id, path })` registers the skill; the CLI bundles the folder into the deploy image. `chat.skills.set([...])` activates skills for the run; `chat.toStreamTextOptions()` auto-injects the preamble and tools. See the new [Agent Skills guide](/docs/ai-chat/patterns/skills). ## `chat.endRun()` — exit on your own terms New imperative API to exit the loop after the current turn completes, without the upgrade-required signal that `chat.requestUpgrade()` sends. Use for one-shot agents, budget-exhausted exits, or goal-reached completions. 
```ts theme={"theme":"css-variables"} chat.agent({ id: "one-shot", run: async ({ messages, signal }) => { chat.endRun(); return streamText({ model: openai("gpt-4o"), messages, abortSignal: signal }); }, }); ``` The current turn streams normally, `onBeforeTurnComplete` / `onTurnComplete` fire, the turn-complete chunk is written, and the run exits instead of suspending. Callable from `run()`, `chat.defer()`, `onBeforeTurnComplete`, or `onTurnComplete`. See [Ending a run on your terms](/docs/ai-chat/backend#ending-a-run-on-your-terms). ## `finishReason` on turn-complete events `TurnCompleteEvent` and `BeforeTurnCompleteEvent` now include the AI SDK's `finishReason` (`"stop" | "tool-calls" | "length" | "content-filter" | "error" | "other"`). Clean signal for distinguishing a normal turn end from one paused on a pending tool call (HITL flows like `ask_user`): ```ts theme={"theme":"css-variables"} onTurnComplete: async ({ finishReason, responseMessage }) => { if (finishReason === "tool-calls") { // Paused — assistant message has a pending tool call waiting for user input await persistCheckpoint(responseMessage); } else { await persistCompleted(responseMessage); } }; ``` Undefined for manual `chat.pipe()` flows or aborted streams. See the new [Human-in-the-loop pattern](/docs/ai-chat/patterns/human-in-the-loop). ## User-initiated compaction pattern The [Compaction guide](/docs/ai-chat/compaction) now covers how to wire a "Summarize conversation" button or `/compact` slash command via `actionSchema` + `onAction`. The agent summarizes on demand, rewrites history with `chat.history.set()`, and short-circuits the LLM call for action turns. Needed a small type fix for this: `ChatTaskPayload.trigger` now correctly includes `"action"`, so `run()` handlers can short-circuit with `if (trigger === "action") return` when an action doesn't need a response. 
## Human-in-the-loop pattern page New [Human-in-the-loop](/docs/ai-chat/patterns/human-in-the-loop) page walks through `ask_user`-style mid-turn user input end-to-end: defining a no-execute tool, rendering pending tool calls on the frontend with `addToolOutput` + `sendAutomaticallyWhen`, detecting paused turns via `finishReason`, and two persistence strategies (overwrite vs. checkpoint nodes). ## Offline test harness for `chat.agent` `@trigger.dev/sdk/ai/test` now ships `mockChatAgent`, a harness that drives a `chat.agent` definition through real turns without network or task runtime. Send messages, actions, and stop signals; inspect emitted chunks; assert on hook order. ```ts theme={"theme":"css-variables"} import { mockChatAgent } from "@trigger.dev/sdk/ai/test"; import { MockLanguageModelV3 } from "ai/test"; import { myAgent } from "./my-agent"; const harness = mockChatAgent(myAgent, { chatId: "test-1", clientData: { model: new MockLanguageModelV3({ /* ... */ }), }, }); const turn = await harness.sendMessage({ id: "u1", role: "user", parts: [{ type: "text", text: "hi" }], }); expect(turn.chunks).toContainEqual(expect.objectContaining({ type: "text-delta", delta: "hello" })); await harness.close(); ``` ### Dependency injection via locals `setupLocals` pre-seeds `locals` before `run()` starts — the pattern for injecting database clients, service stubs, and other server-side dependencies that shouldn't leak through untrusted `clientData`: ```ts theme={"theme":"css-variables"} import { dbKey } from "./db"; const harness = mockChatAgent(agent, { chatId: "test-1", setupLocals: ({ set }) => { set(dbKey, testDb); }, }); ``` Hooks then read the seeded value with `locals.get(dbKey)`. Falls through to the production client in real runs. See [Testing](/docs/ai-chat/testing). ## `runInMockTaskContext` — lower-level test harness `@trigger.dev/core/v3/test` now exports `runInMockTaskContext` for unit-testing any task code offline (not just chat agents). 
Installs in-memory managers for `locals`, `lifecycleHooks`, `runtime`, `inputStreams`, and `realtimeStreams`, plus a mock `TaskContext`. Drivers let you push data into input streams and inspect chunks written to output streams.

## Multi-tab coordination

Prevent duplicate messages when the same chat is open in multiple browser tabs. Enable with `multiTab: true` on the transport.

```tsx theme={"theme":"css-variables"}
const transport = useTriggerChatTransport({ task: "my-chat", multiTab: true, accessToken });
const { messages, setMessages } = useChat({ id: chatId, transport });
const { isReadOnly } = useMultiTabChat(transport, chatId, messages, setMessages);
```

Only one tab can send at a time. Other tabs enter read-only mode with real-time message updates via `BroadcastChannel`. When the active tab's turn completes, any tab can send next. Crashed tabs are detected via heartbeat timeout (10s).

See [Multi-tab coordination](/docs/ai-chat/frontend#multi-tab-coordination) and [`useMultiTabChat`](/docs/ai-chat/reference#usemultitabchat).

## Error stack truncation

Large error stacks no longer OOM the worker process. Stacks are capped at 50 frames (top 5 + bottom 45), individual lines at 1024 chars, messages at 1000 chars. Applied in `parseError`, `sanitizeError`, and OTel span recording.

## Fix: `resume: true` hangs on completed turns

When refreshing a page after a turn completed, `useChat` with `resume: true` would hang indefinitely — `reconnectToStream` opened an SSE connection that never received data.

Added `isStreaming` to session state. The transport sets it to `true` when streaming starts and `false` on `trigger:turn-complete`. `reconnectToStream` returns `null` immediately when `isStreaming` is false, so `resume: initialMessages.length > 0` is now safe to pass unconditionally. The flag flows through `onSessionChange` and is restored from `sessions` — no extra persistence code needed.
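The `isStreaming` guard described above can be pictured with a small sketch. Everything here is an assumption for illustration: `SessionState`, `planReconnect`, and `onStreamEvent` are hypothetical names, and only the `isStreaming` flag and its two transitions come from the text.

```typescript
// Hypothetical session-state shape; only `isStreaming` is taken from the text.
type SessionState = { lastEventId: string | undefined; isStreaming: boolean };

// Decide whether a reconnect should open an SSE connection at all.
// Returns null when no turn is in flight, mirroring the described fix.
function planReconnect(
  session: SessionState | undefined
): { lastEventId: string | undefined } | null {
  if (!session || !session.isStreaming) return null;
  return { lastEventId: session.lastEventId };
}

// Apply the two transitions the text describes: true when streaming
// starts, false once the turn-complete signal arrives.
function onStreamEvent(
  session: SessionState,
  event: "stream-start" | "trigger:turn-complete"
): SessionState {
  return { ...session, isStreaming: event === "stream-start" };
}
```

Because the flag is part of the persisted session state, a page refresh after a finished turn restores `isStreaming: false` and the reconnect is skipped instead of waiting on a silent stream.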
## `hydrateMessages` — backend-controlled message history

Load message history from your database on every turn instead of trusting the frontend accumulator. The hook replaces the built-in linear accumulation entirely — the backend is the source of truth.

```ts theme={"theme":"css-variables"}
chat.agent({
  id: "my-chat",
  hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
    const stored = await db.getMessages(chatId);
    if (trigger === "submit-message" && incomingMessages.length > 0) {
      stored.push(incomingMessages[incomingMessages.length - 1]!);
      await db.persistMessages(chatId, stored);
    }
    return stored;
  },
});
```

Tool approval updates are auto-merged after hydration — no extra handling needed.

See [hydrateMessages](/docs/ai-chat/lifecycle-hooks#hydratemessages).

## `chat.history` — imperative message mutations

Modify the accumulated message history from any hook or `run()`:

```ts theme={"theme":"css-variables"}
chat.history.rollbackTo(messageId); // Undo — keep up to this message
chat.history.remove(messageId); // Remove one message
chat.history.replace(id, newMsg); // Edit a message
chat.history.slice(0, -2); // Remove last 2 messages
chat.history.all(); // Read current state
```

See [chat.history](/docs/ai-chat/backend#chat-history).

## Custom actions — `actionSchema` + `onAction`

Send typed actions (undo, rollback, edit) from the frontend via `transport.sendAction()`. Actions wake the agent, fire `onAction`, then trigger a normal `run()` turn.
```ts theme={"theme":"css-variables"}
chat.agent({
  id: "my-chat",
  actionSchema: z.discriminatedUnion("type", [
    z.object({ type: z.literal("undo") }),
    z.object({ type: z.literal("rollback"), targetMessageId: z.string() }),
  ]),
  onAction: async ({ action }) => {
    if (action.type === "undo") chat.history.slice(0, -2);
    if (action.type === "rollback") chat.history.rollbackTo(action.targetMessageId);
  },
});
```

Frontend: `transport.sendAction(chatId, { type: "undo" })`

Server: `agentChat.sendAction({ type: "undo" })`

See [Actions](/docs/ai-chat/actions) and [Sending actions](/docs/ai-chat/frontend#sending-actions).

## `chat.response` — persistent data parts

Added `chat.response.write()` for writing data parts that both stream to the frontend AND persist in `onTurnComplete`'s `responseMessage` and `uiMessages`.

```ts theme={"theme":"css-variables"}
// Persists to responseMessage.parts — available in onTurnComplete
chat.response.write({ type: "data-handover", data: { context: summary } });

// Transient — streams to frontend only, not in responseMessage
writer.write({ type: "data-progress", data: { percent: 50 }, transient: true });
```

Non-transient `data-*` chunks written via lifecycle hook `writer.write()` now automatically persist to the response message, matching the AI SDK's default semantics. Add `transient: true` for ephemeral chunks (progress indicators, status updates).

See [Custom data parts](/docs/ai-chat/backend#custom-data-parts).

## Tool approvals

Added support for AI SDK tool approvals (`needsApproval: true`). When the model calls a tool that needs approval, the turn completes and the frontend shows approve/deny buttons. After approval, the updated assistant message is sent back and matched by ID in the accumulator.

```ts theme={"theme":"css-variables"}
const sendEmail = tool({
  description: "Send an email. Requires human approval.",
  inputSchema: z.object({ to: z.string(), subject: z.string(), body: z.string() }),
  needsApproval: true,
  execute: async ({ to, subject, body }) => { /* ... */ },
});
```

Frontend setup requires `sendAutomaticallyWhen` and `addToolApprovalResponse` from `useChat`.

See [Tool approvals](/docs/ai-chat/frontend#tool-approvals).

## `transport.stopGeneration(chatId)`

Added `stopGeneration` method to `TriggerChatTransport` for reliable stop after page refresh / stream reconnect. Works regardless of whether the AI SDK passes `abortSignal` through `reconnectToStream`.

```tsx theme={"theme":"css-variables"}
const stop = useCallback(() => {
  transport.stopGeneration(chatId);
  aiStop(); // also update useChat state
}, [transport, chatId, aiStop]);
```

See [Stop generation](/docs/ai-chat/frontend#stop-generation).

## `generateMessageId` support

`generateMessageId` can now be passed via `uiMessageStreamOptions` to control response message ID generation (e.g. UUID-v7). The backend automatically passes `originalMessages` to `toUIMessageStream` so message IDs are consistent between frontend and backend.

## Bug fixes

* **`onTurnComplete` not called**: Fixed `turnCompleteResult?.lastEventId` TypeError that silently skipped `onTurnComplete` when `writeTurnCompleteChunk` returned undefined in dev.
* **Stop during streaming**: Added 2s timeout on `onFinishPromise` so `onBeforeTurnComplete` and `onTurnComplete` fire even when the AI SDK's `onFinish` doesn't fire after abort.
* **`toStreamTextOptions` without `chat.prompt.set()`**: `prepareStep` injection (compaction, steering, background context) now works even when the user passes `system` directly to `streamText` instead of using `chat.prompt.set()`.
* **Background queue vs tool approvals**: Background context injection is now skipped when the last accumulated message is a `tool` message, preventing it from breaking `streamText`'s `collectToolApprovals`.
# chat.local

Source: https://trigger.dev/docs/ai-chat/chat-local

Typed, run-scoped data accessible from hooks, run(), tools, and subtasks. Survives across turns, auto-cleared between runs, auto-hydrated into subtasks.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

Use `chat.local` to create typed, run-scoped data that persists across turns and is accessible from anywhere — the run function, tools, nested helpers. Each run gets its own isolated copy, and locals are automatically cleared between runs.

Lifecycle hooks and **`run`** also receive **`ctx`** ([`TaskRunContext`](/docs/ai-chat/reference#task-context-ctx)) — the same object as on a standard `task()` — for tags, metadata, and cleanup that needs the full run record.

When a subtask is invoked via `ai.toolExecute()` (or the deprecated `ai.tool()`), initialized locals are automatically serialized into the subtask's metadata and hydrated on first access — no extra code needed. Subtask changes to hydrated locals are local to the subtask and don't propagate back to the parent.
## Declaring and initializing

Declare locals at module level with a unique `id`, then initialize them inside a lifecycle hook where you have context (chatId, clientData, etc.):

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, tool, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";
import { db } from "@/lib/db";

// Declare at module level — each local needs a unique id
const userContext = chat.local<{
  userId: string;
  name: string;
  plan: "free" | "pro";
  messageCount: number;
}>({ id: "userContext" });

export const myChat = chat.agent({
  id: "my-chat",
  clientDataSchema: z.object({ userId: z.string() }),
  onBoot: async ({ clientData }) => {
    // Initialize with real data from your database
    const user = await db.user.findUnique({
      where: { id: clientData.userId },
    });
    userContext.init({
      userId: clientData.userId,
      name: user.name,
      plan: user.plan,
      messageCount: user.messageCount,
    });
  },
  run: async ({ messages, signal }) => {
    userContext.messageCount++;

    return streamText({
      model: anthropic("claude-sonnet-4-5"),
      system: `Helping ${userContext.name} (${userContext.plan} plan).`,
      messages,
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    });
  },
});
```

Initialize `chat.local` in [`onBoot`](/docs/ai-chat/lifecycle-hooks#onboot), not `onChatStart`. `onBoot` fires on every fresh worker — including continuation runs (post-cancel, crash, `endRun`, `requestUpgrade`, OOM retry) — whereas `onChatStart` only fires on the chat's very first message. Initializing in `onChatStart` means `run()` will crash on continuation runs with `chat.local can only be modified after initialization`.
## Accessing from tools

Locals are accessible from anywhere during task execution — including AI SDK tools:

```ts theme={"theme":"css-variables"}
const userContext = chat.local<{ plan: "free" | "pro" }>({ id: "userContext" });

const premiumTool = tool({
  description: "Access premium features",
  inputSchema: z.object({ feature: z.string() }),
  execute: async ({ feature }) => {
    if (userContext.plan !== "pro") {
      return { error: "This feature requires a Pro plan." };
    }
    // ... premium logic
  },
});
```

## Accessing from subtasks

When you use `ai.toolExecute()` inside AI SDK `tool()` to expose a subtask, chat locals are automatically available read-only:

```ts theme={"theme":"css-variables"}
import { chat, ai } from "@trigger.dev/sdk/ai";
import { schemaTask } from "@trigger.dev/sdk";
import { streamText, tool, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

const userContext = chat.local<{ name: string; plan: "free" | "pro" }>({ id: "userContext" });

export const analyzeDataTask = schemaTask({
  id: "analyze-data",
  schema: z.object({ query: z.string() }),
  run: async ({ query }) => {
    // userContext.name just works — auto-hydrated from parent metadata
    console.log(`Analyzing for ${userContext.name}`);
    // Changes here are local to this subtask and don't propagate back
  },
});

const analyzeData = tool({
  description: analyzeDataTask.description ?? "",
  inputSchema: analyzeDataTask.schema!,
  execute: ai.toolExecute(analyzeDataTask),
});

export const myChat = chat.agent({
  id: "my-chat",
  onBoot: async ({ clientData }) => {
    userContext.init({ name: "Alice", plan: "pro" });
  },
  run: async ({ messages, signal }) => {
    return streamText({
      model: anthropic("claude-sonnet-4-5"),
      messages,
      tools: { analyzeData },
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    });
  },
});
```

Values must be JSON-serializable for subtask access. Non-serializable values (functions, class instances, etc.) will be lost during transfer.
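To make the serialization caveat concrete, here is a small sketch of what a plain JSON round trip does to a mixed value. It only demonstrates standard `JSON.stringify` / `JSON.parse` behavior; that the SDK's transfer behaves the same way in every detail is an assumption, and `Session` and `original` are hypothetical names.

```typescript
// A class whose instances carry behavior on the prototype.
class Session {
  constructor(public token: string) {}
  isValid(): boolean {
    return this.token.length > 0;
  }
}

// A value mixing serializable and non-serializable members.
const original = {
  name: "Alice",
  plan: "pro",
  format: (s: string) => s.toUpperCase(), // function: dropped by JSON.stringify
  session: new Session("abc"), // class instance: comes back as a plain object
  createdAt: new Date(0), // Date: comes back as an ISO string
};

// Approximates what a JSON-based transfer does to the value.
const transferred = JSON.parse(JSON.stringify(original));

console.log(Object.keys(transferred)); // "format" is no longer present
console.log(transferred.session instanceof Session); // prototype is gone
```

Storing plain data (IDs, strings, numbers, nested plain objects) in a local and rebuilding richer objects inside the subtask avoids these surprises.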
## Dirty tracking and persistence

The `hasChanged()` method returns `true` if any property was set since the last check, then resets the flag. Use it in lifecycle hooks to only persist when data actually changed:

```ts theme={"theme":"css-variables"}
onTurnComplete: async ({ chatId }) => {
  if (userContext.hasChanged()) {
    await db.user.update({
      where: { id: userContext.get().userId },
      data: {
        messageCount: userContext.messageCount,
      },
    });
  }
},
```

## API

| Method               | Description                                                     |
| -------------------- | --------------------------------------------------------------- |
| `chat.local({ id })` | Create a typed local with a unique id (declare at module level) |
| `local.init(value)`  | Initialize with a value (call in hooks or `run`)                |
| `local.hasChanged()` | Returns `true` if modified since last check, resets flag        |
| `local.get()`        | Returns a plain object copy (for serialization)                 |
| `local.property`     | Direct property access (read/write via Proxy)                   |

Locals use shallow proxying. Nested object mutations like `local.prefs.theme = "dark"` won't trigger the dirty flag. Instead, replace the whole property: `local.prefs = { ...local.prefs, theme: "dark" }`.

## See also

* [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) — `onBoot` is the canonical init site for `chat.local`.
* [Database persistence pattern](/docs/ai-chat/patterns/database-persistence) — full per-hook breakdown using `chat.local` alongside DB rows.
* [Code execution sandbox pattern](/docs/ai-chat/patterns/code-sandbox) — example of using `chat.local` to hold a sandbox handle across turns.
* [Database connections](/docs/database-connections) — why the database client and its connection pool belong at module scope, not in `chat.local`.

# Client Protocol

Source: https://trigger.dev/docs/ai-chat/client-protocol

The wire protocol for building custom chat transports — how clients communicate with chat agents over Sessions and SSE.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**.
Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

This page documents the protocol that chat clients use to communicate with `chat.agent()` tasks. Use this if you're building a custom transport (e.g., for a Slack bot, CLI tool, or native app) instead of using the built-in `TriggerChatTransport` or `AgentChat`.

Most users don't need this. Use [`TriggerChatTransport`](/docs/ai-chat/frontend) for browser apps or [`AgentChat`](/docs/ai-chat/server-chat) for server-side code. This page is for building your own from scratch.

## Overview

`chat.agent` is built on a durable Session row — the unit of state that owns the chat's runs across their full lifecycle. A conversation is one session; a session can host many runs over its lifetime. The protocol has three parts:

1. **Create the session** — idempotent on your chat ID. Creates the row **and** triggers the first run in one call. Returns the `publicAccessToken` you'll use for everything else.
2. **Subscribe to `.out`** — receive `UIMessageChunk` events via SSE.
3. **Append to `.in`** — send subsequent user messages, stops, or actions.

```mermaid theme={"theme":"css-variables"}
sequenceDiagram
    participant Client
    participant API as Trigger.dev API
    participant Agent as Chat Agent Run

    Client->>API: POST /api/v1/sessions { type: "chat.agent", externalId, taskIdentifier, triggerConfig.basePayload }
    API-->>Client: { id: sessionId, runId, publicAccessToken, ... }

    Client->>API: GET /realtime/v1/sessions/{sessionId}/out (SSE subscribe)
    Agent-->>Client: UIMessageChunk stream...
    Agent-->>Client: turn-complete control record

    Client->>API: POST /realtime/v1/sessions/{sessionId}/in/append { kind: "message", payload: { message, ... } }
    Agent-->>Client: UIMessageChunk stream...
    Agent-->>Client: turn-complete control record
```

**Stream lifetime.** `session.out` is bounded. After each turn-complete control record, the agent appends an S2 `trim` command record back to the previous turn-complete's seq\_num — the stream stays roughly one turn long forever at steady state. Full conversation history lives in a durable S3 snapshot, not on the stream. The transport's `lastEventId` bookmark plus S2's eventually-consistent trim window (10-60s) keeps single-turn-boundary resume working; multi-turn-away resume falls back to the snapshot. See [Resuming a stream](#resuming-a-stream) and [How history is rebuilt](#how-history-is-rebuilt).

**Session create triggers a run.** Unlike `POST /api/v1/tasks/{taskId}/trigger`, `POST /api/v1/sessions` is the **only** entry point for chat-agent runs. The session row is task-bound and the first run is triggered atomically as part of the create call. Don't call `/tasks/{taskId}/trigger` directly for `chat.agent` tasks — the resulting run won't be bound to a session and `.in`/`.out` won't reach it.

**One message per record.** Each `.in/append` carries at most one new `UIMessage` — the new user turn or a tool-approval response. The agent rebuilds prior history at run boot from a durable object-store snapshot plus a replay of the `session.out` tail; clients never ship full conversation history on the wire. See [How history is rebuilt](#how-history-is-rebuilt).

## End-to-end curl recipe

A single-shell walk-through of the whole protocol — copy, fill in `BASE_URL` / `SECRET_KEY` / `TASK_ID`, and run. Drives a two-turn conversation (`pong` → `echo`) using only `curl` and `jq`.

```bash theme={"theme":"css-variables"}
BASE_URL="https://api.trigger.dev"   # or your local webapp
SECRET_KEY="tr_dev_..."              # secret API key for the env
TASK_ID="ai-chat"                    # your chat.agent task id
CHAT_ID=$(uuidgen | tr '[:upper:]' '[:lower:]')

# 1. Create session + trigger first run with the user's first message.
RESP=$(curl -sS -X POST "$BASE_URL/api/v1/sessions" \
  -H "Authorization: Bearer $SECRET_KEY" \
  -H "Content-Type: application/json" \
  -d @- <
```

```bash trigger: "preload" — warm the agent, no message yet theme={"theme":"css-variables"}
POST /api/v1/sessions
Authorization: Bearer
Content-Type: application/json

{
  "type": "chat.agent",
  "externalId": "conversation-123",
  "taskIdentifier": "ai-chat",
  "triggerConfig": {
    "basePayload": {
      "chatId": "conversation-123",
      "trigger": "preload",
      "metadata": { "userId": "user-456" }
    }
  },
  "tags": ["chat:conversation-123"]
}
```

```bash trigger: "submit-message" — process first user message immediately theme={"theme":"css-variables"}
POST /api/v1/sessions
Authorization: Bearer
Content-Type: application/json

{
  "type": "chat.agent",
  "externalId": "conversation-123",
  "taskIdentifier": "ai-chat",
  "triggerConfig": {
    "basePayload": {
      "chatId": "conversation-123",
      "trigger": "submit-message",
      "message": {
        "id": "msg-1",
        "role": "user",
        "parts": [{ "type": "text", "text": "Hello!" }]
      },
      "metadata": { "userId": "user-456" }
    }
  },
  "tags": ["chat:conversation-123"]
}
```

Pick `"preload"` when the UI has rendered but the user hasn't typed (warms the agent so the first response is fast); pick `"submit-message"` when you already have the first message and want it processed in the same call.

### Required fields

| Field                       | Type     | Description |
| --------------------------- | -------- | ----------- |
| `type`                      | `string` | Discriminator. Use `"chat.agent"`. |
| `taskIdentifier`            | `string` | The `id` you passed to `chat.agent({ id: ... })` — e.g. `"ai-chat"`. |
| `triggerConfig.basePayload` | `object` | The wire payload sent to the **first run** created by this call. Same shape as [`ChatTaskWirePayload`](#chattaskwirepayload) in Step 3. Durable fields (`chatId`, `metadata`, `idleTimeoutInSeconds`, `sessionId`) flow through to continuation runs too; first-turn-only fields (`message`, `trigger`) are stripped on continuations — those are session-create concerns and don't replay. See [What goes in `basePayload`](#what-goes-in-basepayload) below. |

### Optional fields

| Field                                | Type                | Description |
| ------------------------------------ | ------------------- | ----------- |
| `externalId`                         | `string`            | Your stable chat ID. Strongly recommended — without it, repeat calls create new sessions. Cannot start with `session_`. |
| `tags`                               | `string[]`          | Up to 10 dashboard tags. |
| `metadata`                           | `object`            | Arbitrary JSON metadata stored on the session row (separate from `basePayload.metadata`, which goes to the agent). |
| `expiresAt`                          | `string` (ISO date) | Retention cap. |
| `triggerConfig.machine`              | `string`            | Machine preset (`micro`, `small-1x`, …) for every run. |
| `triggerConfig.queue`                | `string`            | Queue name. |
| `triggerConfig.tags`                 | `string[]`          | Tags applied to every run (in addition to session-level `tags`). |
| `triggerConfig.maxAttempts`          | `number`            | Per-run retry cap (1–10). |
| `triggerConfig.maxDuration`          | `number`            | Per-run wall-clock cap, seconds. |
| `triggerConfig.lockToVersion`        | `string`            | Pin every run to a specific worker version. |
| `triggerConfig.region`               | `string`            | Region preference. |
| `triggerConfig.idleTimeoutInSeconds` | `number`            | Surfaced to the agent through the wire payload (1–3600). |

### What goes in `basePayload`

`basePayload` is the [`ChatTaskWirePayload`](#chattaskwirepayload) sent to the agent at run boot — the same shape used for every subsequent `.in/append` (Step 3). Two fields you must always include:

* `chatId` — should equal your `externalId`. The agent uses this as its conversation identity (e.g. as a DB key in `hydrateMessages`); the `externalId` is what the URL routes resolve. Setting them to the same value is the standard pattern and the only way the built-in clients work.
* `trigger` — see the two examples above. `"preload"` and `"submit-message"` are the only valid choices for the first run; the others (`"regenerate-message"`, `"action"`, `"close"`, `"handover-prepare"`) are for subsequent `.in/append` calls.

The agent's typed `clientData` (declared via `chat.withClientData({ schema: ... })`) is read from `basePayload.metadata`. If your agent declares `clientData: { userId: string }`, then `metadata.userId` is required on every run — including the first one in `basePayload`.

### Response

```http theme={"theme":"css-variables"}
HTTP/1.1 201 Created
content-type: application/json; charset=utf-8
x-trigger-jwt: eyJhbGciOi...
x-trigger-jwt-claims: {"sub":"...","scopes":["read:runs:run_abc123","write:inputStreams:run_abc123"]}

{
  "id": "session_cm4z2plfh000abcd1efgh",
  "externalId": "conversation-123",
  "type": "chat.agent",
  "taskIdentifier": "ai-chat",
  "triggerConfig": { "basePayload": { /* echoed back */ } },
  "currentRunId": "run_abc123",
  "tags": ["chat:conversation-123"],
  "metadata": null,
  "closedAt": null,
  "closedReason": null,
  "expiresAt": null,
  "createdAt": "2026-04-24T09:00:00.000Z",
  "updatedAt": "2026-04-24T09:00:00.000Z",
  "runId": "run_abc123",
  "publicAccessToken": "eyJhbGciOi...",
  "isCached": false
}
```

| Field                    | Description |
| ------------------------ | ----------- |
| `id`                     | The `session_*` friendly ID. Stable for the life of the conversation. |
| `runId` / `currentRunId` | Friendly ID of the first run. Identical on a fresh create; will diverge over the conversation (see [Continuations](#continuations)). |
| `publicAccessToken`      | Session-scoped JWT carrying `read:sessions:{externalId}` + `write:sessions:{externalId}`. **This is the token you use for every subsequent `.in`/`.out` call.** Persist it. Lifetime is 60 minutes — see [Refreshing the token](#refreshing-the-token). |
| `isCached`               | `true` if the session existed already (idempotent re-create). HTTP status is 200 in that case, 201 on a fresh create. |

**Use `publicAccessToken` from the body, not the `x-trigger-jwt` response header.** The header is included by the underlying run-trigger machinery and carries **run-scoped** scopes (`read:runs:{runId}` + `write:inputStreams:{runId}`) — it cannot subscribe to `.out` or append to `.in`. The body's `publicAccessToken` is the only token with the correct session-level scopes.
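A custom client might capture the right token along these lines. This is a minimal sketch under stated assumptions: `ParsedResponse` and `sessionCredentials` are hypothetical names, and the body type lists only the fields from the response table that the sketch needs.

```typescript
// Subset of the session-create body, using the field names documented above.
type SessionCreateBody = {
  id: string;
  runId: string;
  publicAccessToken: string;
  isCached: boolean;
};

// Hypothetical container for an already-parsed HTTP response.
type ParsedResponse = {
  status: number;
  headers: Record<string, string>;
  body: SessionCreateBody;
};

// Take the session-scoped token from the body and deliberately ignore the
// run-scoped `x-trigger-jwt` header, as the text above recommends.
function sessionCredentials(res: ParsedResponse) {
  if (res.status !== 200 && res.status !== 201) {
    throw new Error(`unexpected status ${res.status}`);
  }
  return {
    sessionId: res.body.id,
    token: res.body.publicAccessToken,
    resumed: res.body.isCached, // true on an idempotent re-create (HTTP 200)
  };
}
```

Never reading the header at all is the simplest way to make sure a run-scoped token cannot end up in the `.in` / `.out` calls by accident.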
### Idempotency

Re-calling `POST /api/v1/sessions` with the same `(taskIdentifier, externalId)` pair is **idempotent for the lifetime of the session**:

* If the session is still alive: returns the existing row with `isCached: true`, `runId` unchanged, and a **fresh** 60-minute `publicAccessToken`. No duplicate run is triggered. (Idle/exited runs are different — see [Continuations](#continuations).)
* If the session has been closed (`POST /api/v1/sessions/{id}/close`): returns **HTTP 409**. Closed is one-way; reuse a different `externalId` to start a new conversation.
* Any tags / metadata / expiresAt / triggerConfig fields you send on the cached path are written through to the row, so you can update e.g. `triggerConfig.basePayload.metadata` mid-conversation. The new fields apply to **future** runs (continuations); the currently-live run keeps its original config.

**A cached re-POST does not deliver a new `basePayload.message`.** `basePayload` is run-trigger config, not a message channel — the existing run keeps streaming and your message is silently dropped. To send a follow-up message, use `POST /realtime/v1/sessions/{sessionId}/in/append` (Step 3).

### Refreshing the token

The `publicAccessToken` returned by `POST /api/v1/sessions` is valid for 60 minutes. Two ways to keep going past that:

1. **Take refreshed tokens from the stream.** Most `turn-complete` control records on `.out` carry a `public-access-token` header with a refreshed JWT (see [`turn-complete` control record](#turn-complete-control-record)). The header is optional and may be absent on some turns (for example an errored turn), so replace your stored token whenever the header is present rather than expecting it every turn. For active conversations it rolls on its own.
2. **Re-call `POST /api/v1/sessions`.** Idempotent, returns `isCached: true` and a brand-new 60-minute token. Use this if a chat goes idle long enough that the SSE stream has closed and you need to resume.
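The first option, replacing the stored token only when the header is present, can be sketched as a small pure function. `nextToken` and `Header` are hypothetical names for this example; the `[name, value]` pair layout follows how this page describes record headers elsewhere.

```typescript
// Header pairs as they appear on a record: [name, value] tuples.
type Header = [name: string, value: string];

// Return the token to store after a turn-complete control record.
// The refreshed token is optional, so fall back to the current one.
function nextToken(current: string, headers: Header[] | undefined): string {
  const refreshed = headers?.find(([name]) => name === "public-access-token")?.[1];
  return refreshed ? refreshed : current;
}
```

A client would call this on every `turn-complete` record and persist the result; if the stream has been closed for longer than the token lifetime, the second option (re-calling session create) still applies.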
The built-in SDK clients (`TriggerChatTransport` from `@trigger.dev/sdk`, `AgentChat` from `@trigger.dev/sdk/chat`) call this endpoint and persist the refreshed `publicAccessToken` automatically, refreshing on every `turn-complete` control record.

## Step 2: Subscribe to `.out`

Subscribe to the agent's response via SSE on the session's `.out` channel:

```
GET /realtime/v1/sessions/{sessionId}/out
Authorization: Bearer
Accept: text/event-stream
```

`Accept: text/event-stream` is required — without it the request is rejected as a non-SSE caller.

The URL accepts either form for `{sessionId}`: the friendly `session_*` ID, or your `externalId` (the chat ID you created the session with). The `publicAccessToken` from session-create authorizes both forms. Pick whichever your client already has on hand.

A session's `.out` stays the same across runs, so the client doesn't need to re-subscribe when a new run starts on the same chat. `seq_num` is **monotonically increasing across the entire session**, not just within one run — turn 1 might emit seq 0–9, turn 2 picks up at seq 10+, a continuation run on the same session continues numbering from there. This is why a single `Last-Event-ID` cursor is sufficient to resume across turns and across runs.

### Stream timeout

The SSE long-polls until either a record arrives or the timeout expires. The default is **60 seconds**; cap it explicitly via the `Timeout-Seconds` request header (1–600):

```
GET /realtime/v1/sessions/{sessionId}/out
Authorization: Bearer
Accept: text/event-stream
Timeout-Seconds: 30
```

If nothing arrives by the deadline, the server sends `data: [DONE]` and closes. Reconnect with `Last-Event-ID` to continue (see [Resuming a stream](#resuming-a-stream)).

### Stream format (S2)

The output stream uses [S2](https://s2.dev) under the hood and follows the standard SSE wire format ([WHATWG spec](https://html.spec.whatwg.org/multipage/server-sent-events.html#parsing-an-event-stream)).
Three event types arrive on the wire: | Event | Meaning | | ------------------------------------ | ------------------------------------------------------------------------- | | `batch` | One or more records. The records you actually care about. | | `ping` | Keepalive (\~every 5s on idle). Body is `{"timestamp": }`. Ignore it. | | *(no `event:`, just `data: [DONE]`)* | Stream is closing — server sends this once before EOF. | A `batch` event in raw SSE format looks like this — note the `data` is a single line of JSON, no embedded newlines (per the SSE spec): ``` id: 0,1,106 event: batch data: {"records":[{"seq_num":0,"timestamp":1712150400000,"body":"{\"data\":{\"type\":\"text-delta\",\"id\":\"msg_1\",\"delta\":\"pong\"},\"id\":\"abc\"}"}],"tail":{"seq_num":10,"timestamp":1712150400500}} ``` The `id:` line on the wire is a comma-separated triple internal to S2 (`startSeq,endSeq,byteOffset`) — **don't try to parse it**. Use `record.seq_num` from inside the `data` body instead (see [Resuming a stream](#resuming-a-stream)). Decoded `data` payload: ```json theme={"theme":"css-variables"} { "records": [ { "seq_num": 0, "timestamp": 1712150400000, "body": "{\"data\":{\"type\":\"text-delta\",\"id\":\"msg_1\",\"delta\":\"pong\"},\"id\":\"abc\"}" } ], "tail": { "seq_num": 10, "timestamp": 1712150400500 } } ``` | Field | Description | | --------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `records[]` | One or more records delivered in this batch, in arrival order. | | `records[].seq_num` | Monotonic per-record cursor. Use the **last** one you successfully processed as your `Last-Event-ID` on resume. | | `records[].timestamp` | Unix ms when the record was written to S2. 
| | `records[].body` | For data records: a JSON-encoded **string** wrapping `{ data: UIMessageChunk, id: string }`. For control records: an empty string (semantics live in `headers`). For S2 command records: opaque bytes. See [Records on session.out](#records-on-session-out). | | `records[].headers` | Optional `[name, value]` pairs. Empty for data records; a `trigger-control` entry for control records; a single empty-name `["", ""]` entry for S2 command records. | | `tail.seq_num` | Latest known tail of the S2 stream — useful for detecting how far behind the live edge you are. Skip if you don't need it. | | `tail.timestamp` | Timestamp of `tail.seq_num`. | ### Records on `session.out` Three kinds of records can arrive on the wire. They all share the `batch` envelope above; you tell them apart by `headers`. | Kind | `headers[0][0]` | `headers` carries | `body` | | -------------------------- | ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------- | | **Data record** | *empty array or non-empty name* | (currently none from the agent) | JSON envelope `{"data": UIMessageChunk, "id": }` | | **Trigger control record** | `"trigger-control"` | `["trigger-control", ]` plus subtype-specific siblings (e.g. `["public-access-token", ]` and `["session-in-event-id", ]` on `turn-complete`) | empty string | | **S2 command record** | `""` (empty name) | `["", ""]` (currently `"trim"`) | opaque bytes — S2-interpreted | **Uniform filter rule for custom readers:** ```ts theme={"theme":"css-variables"} // Always advance the resume cursor — even for records you skip. lastEventId = String(record.seq_num); // S2 command record: bump cursor, don't dispatch. 
if (record.headers?.[0]?.[0] === "") continue; // Trigger control record: route by `trigger-control` value, don't // dispatch as a UIMessageChunk. const controlValue = record.headers?.find(([name]) => name === "trigger-control")?.[1]; if (controlValue === "turn-complete") { const token = record.headers.find(([name]) => name === "public-access-token")?.[1]; // ...fire your turn-complete handler with the optional refreshed token... continue; } if (controlValue === "upgrade-required") { // ...your upgrade flow, if any. The server has already swapped the run // by the time this arrives — subsequent chunks are from the new run... continue; } // Otherwise: data record. Parse the body, dispatch the UIMessageChunk. const { data: chunk } = JSON.parse(record.body); ``` Built-in SDK transports (`TriggerChatTransport`, `AgentChat`) handle all of this for you — control records surface via `onTurnComplete({ chatId, lastEventId, publicAccessToken })` and the upgrade flow. Custom transports need the routing above. **Prior wire shape.** Earlier SDK versions emitted `trigger:turn-complete` and `trigger:upgrade-required` as `UIMessageChunk`-shaped data records with `chunk.type === "trigger:turn-complete"`. Current versions use the header-form control records described above. Built-in SDK transports handle the new shape transparently; custom transports filtering on `chunk.type` need to switch to the `trigger-control` header check. 
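For a custom transport that is mid-migration, the routing rules above can be folded into one classifier. This is a minimal sketch, not the SDK's implementation: `WireRecord`, `Classified`, and `classifyRecord` are hypothetical names, and accepting the prior `chunk.type` form alongside the header form is an assumption for transports that may still talk to agents on older SDK versions.

```ts
// Hypothetical helper types; only the field names come from the batch envelope.
type WireRecord = {
  seq_num: number;
  body: string;
  headers?: Array<[string, string]>;
};

type Classified =
  | { kind: "command" }
  | { kind: "control"; subtype: string }
  | { kind: "data"; chunk: unknown };

const LEGACY_CONTROL_TYPES = ["trigger:turn-complete", "trigger:upgrade-required"];

function classifyRecord(record: WireRecord): Classified {
  const headers = record.headers ?? [];

  // S2 command record: the first header has an empty name.
  if (headers[0]?.[0] === "") return { kind: "command" };

  // Current shape: control semantics live in the `trigger-control` header.
  const control = headers.find(([name]) => name === "trigger-control")?.[1];
  if (control) return { kind: "control", subtype: control };

  // Data record: the body is a JSON string wrapping `{ data, id }`.
  const { data } = JSON.parse(record.body) as { data?: { type?: unknown } };
  const type = data?.type;

  // Prior shape (assumption: only relevant against older SDK versions).
  if (typeof type === "string" && LEGACY_CONTROL_TYPES.includes(type)) {
    return { kind: "control", subtype: type.slice("trigger:".length) };
  }
  return { kind: "data", chunk: data };
}
```

The caller still advances its resume cursor from `record.seq_num` before classifying, as in the filter rule above.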
### Built-in parser (recommended for SDK users) If you're working in TypeScript and depending on `@trigger.dev/core/v3` is acceptable, use `SSEStreamSubscription` — it handles batch decoding, deduplication, command-record filtering, and `Last-Event-ID` tracking for you: ```ts theme={"theme":"css-variables"} import { SSEStreamSubscription, controlSubtype } from "@trigger.dev/core/v3"; const subscription = new SSEStreamSubscription( `${baseUrl}/realtime/v1/sessions/${sessionId}/out`, { headers: { Authorization: `Bearer ${publicAccessToken}` }, timeoutInSeconds: 120, lastEventId, } ); const stream = await subscription.subscribe(); const reader = stream.getReader(); while (true) { const { done, value } = await reader.read(); if (done) break; // value is { id, chunk, timestamp, headers }. S2 command records are // filtered out of this stream entirely (cursor still advances). Trigger // control records pass through with `chunk === undefined` and a // `trigger-control` header. const control = controlSubtype(value.headers); if (control === "turn-complete") break; if (control === "upgrade-required") continue; const chunk = value.chunk as { type?: string; delta?: string } | undefined; if (chunk?.type === "text-delta") process.stdout.write(chunk.delta ?? ""); } ``` ### Self-contained parser (for custom transports) If you're building a transport in another language or don't want the dependency, here's a complete reader. 
It handles the SSE framing, the comma-separated `id:` line, batch unwrapping, the inner `body` string, and `ping` / `[DONE]` events:

```ts theme={"theme":"css-variables"}
async function* readSessionOut(
  url: string,
  publicAccessToken: string,
  opts: { lastEventId?: string; timeoutSeconds?: number } = {}
) {
  const headers: Record<string, string> = {
    Authorization: `Bearer ${publicAccessToken}`,
    Accept: "text/event-stream",
  };
  if (opts.lastEventId) headers["Last-Event-ID"] = opts.lastEventId;
  if (opts.timeoutSeconds) headers["Timeout-Seconds"] = String(opts.timeoutSeconds);

  const res = await fetch(url, { headers });
  if (!res.ok || !res.body) throw new Error(`SSE failed: ${res.status}`);

  const decoder = new TextDecoder();
  const reader = res.body.getReader();
  let buf = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) return;
    buf += decoder.decode(value, { stream: true });

    // SSE events are separated by blank lines (CRLF or LF).
    const events = buf.split(/\r?\n\r?\n/);
    buf = events.pop() ?? ""; // last chunk is incomplete

    for (const raw of events) {
      let eventType = "message"; // SSE default
      const dataLines: string[] = [];
      for (const line of raw.split(/\r?\n/)) {
        if (line.startsWith("event:")) eventType = line.slice(6).trim();
        else if (line.startsWith("data:")) dataLines.push(line.slice(5).trimStart());
        // We deliberately ignore `id:`; use record.seq_num for resume cursors.
      }
      const data = dataLines.join("\n");
      if (!data) continue;

      if (eventType === "ping") continue;
      if (data === "[DONE]") return;

      if (eventType === "batch") {
        const batch = JSON.parse(data) as {
          records: Array<{
            seq_num: number;
            timestamp: number;
            body: string;
            headers?: Array<[string, string]>;
          }>;
        };
        for (const record of batch.records) {
          const firstHeaderName = record.headers?.[0]?.[0];

          // S2 command record (trim/fence): bump cursor, skip dispatch.
          if (firstHeaderName === "") {
            yield { seqNum: record.seq_num, timestamp: record.timestamp, kind: "command" };
            continue;
          }

          // Trigger control record (turn-complete, upgrade-required):
          // semantics live in headers, body is empty. Route by header.
          const controlValue = record.headers?.find(([n]) => n === "trigger-control")?.[1];
          if (controlValue) {
            const token = record.headers?.find(([n]) => n === "public-access-token")?.[1];
            yield {
              seqNum: record.seq_num,
              timestamp: record.timestamp,
              kind: "control",
              subtype: controlValue,
              publicAccessToken: token,
            };
            continue;
          }

          // Data record: UIMessageChunk wrapped in `{ data, id }`.
          const inner = JSON.parse(record.body) as { data: unknown; id: string };
          yield {
            seqNum: record.seq_num, // use this for Last-Event-ID on resume
            timestamp: record.timestamp,
            kind: "data",
            chunk: inner.data, // the actual UIMessageChunk
          };
        }
      }
    }
  }
}
```

Driving it:

```ts theme={"theme":"css-variables"}
let lastSeq: string | undefined;
for await (const ev of readSessionOut(sseUrl, publicAccessToken)) {
  lastSeq = String(ev.seqNum); // always advance the cursor

  if (ev.kind === "command") continue; // S2 trim/fence: skip

  if (ev.kind === "control") {
    if (ev.subtype === "turn-complete") break; // turn done
    if (ev.subtype === "upgrade-required") continue; // run swap handled server-side
    continue;
  }

  // ev.kind === "data": the UIMessageChunk
  const chunk = ev.chunk as { type: string; delta?: string };
  if (chunk.type === "text-delta") process.stdout.write(chunk.delta ?? "");
}
// On reconnect, pass `lastEventId: lastSeq` to resume from the next record.
```

### Chunk types

Data records on the stream carry a `UIMessageChunk` from the [AI SDK](https://ai-sdk.dev/docs/ai-sdk-ui/ui-message-stream). Two Trigger.dev-specific control events ride alongside as **header-form control records** (see [Records on session.out](#records-on-session-out)).
Within a single assistant turn the AI SDK chunk types you'll typically see, in order: | Chunk type | Shape | Notes | | ---------------------------------------------------------------- | ------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `start` | `{ type: "start", messageId: string }` | First chunk of a new assistant message. **Persist `messageId`** — you'll need it to send tool-approval responses (see [Tool approval responses](#tool-approval-responses)). | | `start-step` | `{ type: "start-step" }` | New `prepareStep` boundary. | | `text-start` / `text-delta` / `text-end` | `{ type: ..., id: string, delta?: string }` | Streaming text. Concatenate `delta`s for the visible reply. | | `tool-input-start` / `tool-input-delta` / `tool-input-available` | tool-call argument streaming | The tool the model is calling. | | `tool-output-available` | tool result | After the agent runs the tool. | | `data-*` | `{ type: "data-", data: ... }` | Custom data parts written by the agent's hooks. | | `finish-step` / `finish` | end markers for the assistant message | Followed by the `turn-complete` control record. | Refer to the AI SDK docs linked above for the full union — only the two control records below are Trigger.dev-specific. ### `turn-complete` control record Signals that the agent's turn is finished — stop reading and wait for user input. 
```
headers:
  ["trigger-control", "turn-complete"]
  ["public-access-token", "eyJ..."]     // optional, refreshed JWT
  ["session-in-event-id", "42"]         // optional, agent-internal resume cursor
body: ""
```

| Header | Description |
| --------------------------------------- | ----------- |
| `trigger-control: turn-complete` | Always present on this record. |
| `public-access-token: ` (optional) | A refreshed JWT with the same session + run scopes. If present, replace your stored token. |
| `session-in-event-id: ` (optional) | Internal cursor used by the agent to resume `.in` across worker boots without replaying already-processed user messages. Custom transports should ignore this header; it carries no client-side meaning. |

When you receive this record:

1. Update `publicAccessToken` if one is included on the headers.
2. Close the stream reader (unless you want to keep it open across turns; see [Resuming a stream](#resuming-a-stream)).
3. Wait for the next user message before sending on `.in`.

### `upgrade-required` control record

Signals that the agent cannot handle this message on its current version and a new run has been started. Emitted when the agent calls [`chat.requestUpgrade()`](/docs/ai-chat/patterns/version-upgrades).

```
headers:
  ["trigger-control", "upgrade-required"]
body: ""
```

The server has already swapped the run on the same session by the time this record is delivered. Subsequent records on the same SSE subscription come from the new run.

When you receive this record:

1. Treat it as informational: no client action is required. The same SSE keeps streaming the new run's chunks on the same session.
2. Optionally surface a "switched to vN.N+1" indicator in your UI. The built-in clients handle this transparently.
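The handling steps for both control records can be sketched as a small state update. The names `ClientState` and `applyControlRecord` are hypothetical, and the `turnDone` / `upgraded` flags are illustrative bookkeeping rather than anything the protocol defines.

```ts
type Header = [string, string];

// Hypothetical client-side state for one conversation.
type ClientState = {
  publicAccessToken: string;
  turnDone: boolean; // true once a turn-complete record has been seen
  upgraded: boolean; // true once an upgrade-required record has been seen
};

function applyControlRecord(state: ClientState, headers: Header[]): ClientState {
  const get = (name: string) => headers.find(([n]) => n === name)?.[1];
  const subtype = get("trigger-control");

  if (subtype === "turn-complete") {
    // Replace the stored token only when a refreshed one is present.
    // `session-in-event-id` is deliberately not read: it has no client-side meaning.
    return {
      ...state,
      publicAccessToken: get("public-access-token") ?? state.publicAccessToken,
      turnDone: true,
    };
  }

  if (subtype === "upgrade-required") {
    // Informational: keep reading, the same subscription carries the new run.
    return { ...state, upgraded: true };
  }

  return state; // not a control record this sketch knows about
}
```

A reader loop would call this for each control record and close the reader when `turnDone` becomes true, unless it intends to stay subscribed across turns.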
### Resuming a stream If the SSE connection drops, reconnect with the `Last-Event-ID` header set to the **last `record.seq_num` you successfully processed** (decoded from the batch body — not the SSE `id:` line, which is a comma-list internal to S2): ``` GET /realtime/v1/sessions/{sessionId}/out Authorization: Bearer Accept: text/event-stream Last-Event-ID: 42 ``` The server resumes streaming from `seq_num = 43` onward. `Last-Event-ID` is a single non-negative integer; passing the SSE `id:` line value verbatim (e.g. `0,1,106`) silently falls back to "start from the beginning." `SSEStreamSubscription` tracks this automatically via its `lastEventId` option. **What "resumable" means.** `session.out` is trimmed back to the previous `turn-complete` control record after each turn finishes. In practice: * **Resume across a single turn boundary always works** — your bookmark is the last turn's `turn-complete` record, which is still on the stream. * **The S2 trim is eventually consistent** (10-60s typical), so close-then-reload-quickly cases reliably still see records that are about to be trimmed. * **Resume across multiple turns of inactivity** may find your bookmark trimmed. The S2 read silently clamps forward to the first surviving record; the cleanest recovery is to fetch the latest snapshot and treat the SSE as fresh from there (or rehydrate via your own DB if you use `hydrateMessages`). See [How history is rebuilt](#how-history-is-rebuilt). ### `X-Peek-Settled` / `X-Session-Settled` — opt-in fast close on idle reconnects On **reconnect-on-reload** paths (resuming a chat where nothing may be streaming), send `X-Peek-Settled: 1` as a request header when opening the SSE. When present, the server peeks the tail of `.out` and walks past any trailing S2 trim command record to find the most recent data/control record underneath. 
If that record is a `turn-complete` control record (agent finished a turn and is idle-waiting or exited), the SSE: * Uses `wait=0` internally — drains any residual records and closes in \~1s instead of long-polling for 60s. * Sets the `X-Session-Settled: true` response header so the client can tell the close is terminal rather than a mid-stream drop. **Do not send `X-Peek-Settled` on the active-send response-stream path.** The peek would race the newly-triggered turn's first chunk — if the agent hasn't written the new turn's first record yet, the peek sees the prior turn's `turn-complete` and closes the SSE before the response lands on S2. The built-in `TriggerChatTransport.reconnectToStream` sets the header; `sendMessages → subscribeToStream` does not. ```ts theme={"theme":"css-variables"} // Reconnect path (page reload) const response = await fetch(sseUrl, { headers: { Authorization: `Bearer ${publicAccessToken}`, "X-Peek-Settled": "1", "Last-Event-ID": lastEventId, }, }); const settled = response.headers.get("X-Session-Settled") === "true"; // ...subscribe as normal; if settled and nothing arrives, you're done. // Active send path — no X-Peek-Settled, keep long-poll semantics const liveResponse = await fetch(sseUrl, { headers: { Authorization: `Bearer ${publicAccessToken}`, "Last-Event-ID": lastEventId, }, }); ``` ## Step 3: Send messages, stops, and actions All client-to-agent signals are appended to the session's `.in` channel: ``` POST /realtime/v1/sessions/{sessionId}/in/append Authorization: Bearer Content-Type: application/json ``` `{sessionId}` accepts the same friendly-or-external forms as `.out`. The `publicAccessToken` from session-create authorizes both. The body is a JSON-serialized [`ChatInputChunk`](#chatinputchunk) — a tagged union covering messages, stops, and actions. Send them as raw JSON strings (not wrapped in a `data` field). 
On success the response is `200 OK` with body `{ "ok": true }`; on failure it's `4xx`/`5xx` with `{ "ok": false, "error": "" }`. Common failures: | Status | When | | ------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `401` | Missing or invalid `Authorization` header. | | `403` | Token doesn't carry `write:sessions:{externalId}`. | | `409` | The session is closed — `{ "ok": false, "error": "Cannot append to a closed session" }`. | | `413` | Body exceeds 1 MiB **or** the wrapped record would exceed S2's \~1 MiB per-record metered ceiling. A normal `kind: "message"` payload is a few KB; if you hit this you're shipping more than one message per record or pushing a single tool output that's itself oversized. Carries CORS headers so browser fetches can read the status. | | `500` | Transient backend failure on the durable stream. Safe to retry — appends are idempotent on `(externalId, X-Part-Id)` if you set the optional `X-Part-Id` request header (the built-in clients set it from a UUID). | **Schema validation of `metadata` happens inside the agent, not at this endpoint.** A `kind: "message"` with bad or missing metadata returns `200 OK` here, but the agent rejects the turn at run time. From the wire the failure looks like a `turn-complete` control record with no preceding `text-delta` — i.e. an empty assistant response. **How to detect from the client:** treat "received `turn-complete` after sending a `submit-message` with no `text-delta`/`tool-input-*` chunks in between" as a schema-validation suspect, and surface a sensible error to your user. 
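One way to implement that client-side heuristic is a small tracker fed from the reader loop. This is a sketch under assumptions: `EmptyTurnDetector` and its method names are hypothetical, and an empty turn is only a hint of a validation failure, not proof.

```ts
type ChunkLike = { type: string };

class EmptyTurnDetector {
  private awaitingReply = false;
  private sawOutput = false;

  // Call right after a `submit-message` append has been accepted.
  onSubmit(): void {
    this.awaitingReply = true;
    this.sawOutput = false;
  }

  // Call for every UIMessageChunk decoded from a data record.
  onChunk(chunk: ChunkLike): void {
    if (chunk.type === "text-delta" || chunk.type.startsWith("tool-input-")) {
      this.sawOutput = true;
    }
  }

  // Call on a `turn-complete` control record.
  // Returns true when the turn produced no text or tool-input chunks.
  onTurnComplete(): boolean {
    const suspect = this.awaitingReply && !this.sawOutput;
    this.awaitingReply = false;
    return suspect;
  }
}
```

When `onTurnComplete()` returns true, the client can show a generic error and point an operator at the run trace for the actual validation message.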
**How to confirm from the dashboard / Trigger MCP:** the run trace includes a `chat turn N [ERROR]` span followed by `waiting for next message (after error)`; the `[ERROR]` span carries the validation error message in its events. Use `mcp__trigger__get_run_details` (or open the run in the dashboard) on the run ID surfaced in the `runId` field of session-create. ### `ChatInputChunk` ```ts theme={"theme":"css-variables"} type ChatInputChunk = | { kind: "message"; payload: ChatTaskWirePayload } | { kind: "stop"; message?: string }; ``` The discriminator `kind` drives the agent's dispatch — `"message"` goes to the turn loop, `"stop"` fires the abort controller. ### `ChatTaskWirePayload` ```ts theme={"theme":"css-variables"} type ChatTaskWirePayload = { /** * The new message for this turn — at most ONE per record. * - "submit-message": the new user message, OR a tool-approval-responded * assistant message (with `state: "approval-responded"` tool parts). * - "regenerate-message": omitted (the server trims its own tail). * - "preload" / "close" / "action": omitted. * - "handover-prepare": omitted (use `headStartMessages` instead — see below). */ message?: TMessage; /** * Escape hatch for chat.headStart. Ships full UIMessage history on the * very first turn — before any snapshot exists. Used ONLY by * trigger: "handover-prepare" against the customer's own HTTP route * handler. The server ignores this field on any other trigger. */ headStartMessages?: TMessage[]; chatId: string; trigger: | "submit-message" | "regenerate-message" | "preload" | "close" | "action" | "handover-prepare"; messageId?: string; /** * Wire envelope for the agent's typed `clientData` (declared via * `chat.withClientData({ schema })`). Whatever you put here is parsed * against that schema at the agent boundary. If the agent declares * `clientData: { userId: string }`, then `metadata.userId` is required. 
*/ metadata?: TMetadata; action?: unknown; /** * Informational — the server sets this automatically on continuation * runs (when the prior run is dead). Clients don't need to send it. * Read by the agent's boot gate to skip `onChatStart` and trigger * snapshot read + replay. */ continuation?: boolean; /** * Informational — paired with `continuation: true`, set by the server * from the prior run's friendly ID. Surfaced to the agent in * `ctx.previousRunId`. Clients don't need to send it. */ previousRunId?: string; idleTimeoutInSeconds?: number; sessionId?: string; }; ``` **`metadata` is the wire envelope for `clientData`.** The agent's `clientData` (typed via `chat.withClientData({ schema })`) is read from this field at run boot. If the agent declares e.g. `{ userId: string, model?: string }`, then every `kind: "message"` payload — and the `triggerConfig.basePayload` you sent at session create — must carry a matching `metadata.userId`. The agent rejects messages whose metadata fails schema validation. ### Sending a message ``` POST /realtime/v1/sessions/{sessionId}/in/append Authorization: Bearer Content-Type: application/json { "kind": "message", "payload": { "message": { "id": "msg-2", "role": "user", "parts": [{ "type": "text", "text": "Tell me more" }] }, "chatId": "conversation-123", "trigger": "submit-message", "metadata": { "userId": "user-456" } } } ``` After sending, subscribe to `.out` (if you closed the stream after the previous turn's `turn-complete`) to receive the response. Send only the **new** user message — never the full history. The agent rebuilds prior history from a durable S3 snapshot plus a `session.out` replay at run boot. See [How history is rebuilt](#how-history-is-rebuilt). ### Sending a stop ```json theme={"theme":"css-variables"} { "kind": "stop" } ``` Interrupts the agent's current turn. `streamText` aborts, the agent emits a `turn-complete` control record, and the run returns to idle. 
An optional `message` field surfaces in the agent's stop handler: ```json theme={"theme":"css-variables"} { "kind": "stop", "message": "user cancelled" } ``` ### Sending an action Custom actions (undo, rollback, edit) ride on the same `.in` channel using `kind: "message"` with `trigger: "action"` in the payload. Omit `message` — actions don't carry a UIMessage: ```json theme={"theme":"css-variables"} { "kind": "message", "payload": { "chatId": "conversation-123", "trigger": "action", "action": { "type": "undo" }, "metadata": { "userId": "user-456" } } } ``` Actions wake the agent from suspension (same as messages) and fire the `onAction` hook — they are not turns, so `run()` and turn lifecycle hooks do not fire. If `onAction` returns a `StreamTextResult`, the response is auto-piped to the frontend (but still no `run()` or `onTurnComplete`). The `action` payload is validated against the agent's `actionSchema`. If the agent didn't register an `actionSchema` (or your `action` payload doesn't match it), validation fails the same way `metadata` does — `.in/append` returns `200 OK`, but the run trace shows `chat turn N [ERROR]` and the wire emits a `turn-complete` control record with no other chunks. See [Actions](/docs/ai-chat/actions) for the agent-side schema setup. ### Regenerating the last response To regenerate the assistant's last response, send `trigger: "regenerate-message"` with no `message`: ```json theme={"theme":"css-variables"} { "kind": "message", "payload": { "chatId": "conversation-123", "trigger": "regenerate-message", "metadata": { "userId": "user-456" } } } ``` The agent trims trailing assistant messages from its accumulator and re-streams from the prior user turn. The frontend's `useChat()` already removed the trailing assistant locally — the wire signal tells the agent to do the same. 
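The message, stop, action, and regenerate bodies shown above differ only in a few fields, so a custom transport can build them with small helpers. The function and type names below are hypothetical; the field names and `trigger` values are the ones from the examples in this section.

```ts
type Metadata = Record<string, unknown>;
type MessageChunk = { kind: "message"; payload: Record<string, unknown> };
type StopChunk = { kind: "stop"; message?: string };

// A new user message: exactly one message per record, never the full history.
function userMessageChunk(
  chatId: string,
  messageId: string,
  text: string,
  metadata: Metadata
): MessageChunk {
  return {
    kind: "message",
    payload: {
      message: { id: messageId, role: "user", parts: [{ type: "text", text }] },
      chatId,
      trigger: "submit-message",
      metadata,
    },
  };
}

// A custom action: no `message`, the command goes in `action`.
function actionChunk(chatId: string, action: unknown, metadata: Metadata): MessageChunk {
  return { kind: "message", payload: { chatId, trigger: "action", action, metadata } };
}

// Regenerate the last assistant response: no `message` either.
function regenerateChunk(chatId: string, metadata: Metadata): MessageChunk {
  return { kind: "message", payload: { chatId, trigger: "regenerate-message", metadata } };
}

// Stop the in-flight turn, optionally with a reason for the agent's stop handler.
function stopChunk(message?: string): StopChunk {
  return message === undefined ? { kind: "stop" } : { kind: "stop", message };
}
```

Each result is sent as the raw JSON body of a `POST` to `/realtime/v1/sessions/{sessionId}/in/append`, for example `body: JSON.stringify(actionChunk(chatId, { type: "undo" }, metadata))`.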
### Tool approval responses When a tool requires approval (`needsApproval: true`), the agent streams the tool call with an `approval-requested` state and completes the turn. After the user approves or denies, send the **updated assistant message** back as a `kind: "message"` chunk — singular, not the full chain. The minimum shape the agent reads is just the resolved tool parts: ```json theme={"theme":"css-variables"} { "kind": "message", "payload": { "message": { "id": "asst-msg-1", "role": "assistant", "parts": [ { "type": "tool-sendEmail", "toolCallId": "call-1", "state": "approval-responded", "approval": { "id": "approval-1", "approved": true } } ] }, "chatId": "conversation-123", "trigger": "submit-message", "metadata": { "userId": "user-456" } } } ``` The agent matches the incoming message by `id` against the rebuilt accumulator (or hydrated chain) and **overlays the tool-state advance** onto the matching entry — `state` plus `output` / `errorText` / `approval`, depending on the new state. Hydrated `input`, text, reasoning, and provider metadata stay put. This is what makes the slim shape above sufficient: the agent rebuilds everything else from the snapshot or from your `hydrateMessages` hook. The same shape applies to HITL `addToolOutput` answers — substitute `state: "output-available"` and `output: ` for the approval pair above. Single-tool HITL `addToolOutput` continuation payloads are typically \~1 KiB on the wire. The built-in transports (`TriggerChatTransport`, `AgentChat`) ship the slim shape by default on `submit-message` continuations. Custom transports can ship a fuller `UIMessage` — the agent still only reads the resolved tool-part fields — but the slim shape is the most efficient and avoids brushing the per-record cap on reasoning-heavy turns. The message `id` must match the one the agent assigned during streaming. `TriggerChatTransport` keeps IDs in sync automatically. Custom transports should use the `messageId` from the stream's `start` chunk. 
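A custom transport could assemble the slim approval shape with a helper like the one below. This is a sketch: `buildApprovalResponse` and `ApprovalInput` are hypothetical names, and deriving the part type as `tool-` plus the tool name is an assumption inferred from the `tool-sendEmail` example above.

```ts
type ApprovalInput = {
  messageId: string; // the `messageId` from the stream's `start` chunk
  toolName: string;
  toolCallId: string;
  approvalId: string;
  approved: boolean;
  chatId: string;
  metadata: Record<string, unknown>;
};

function buildApprovalResponse(input: ApprovalInput) {
  return {
    kind: "message" as const,
    payload: {
      message: {
        id: input.messageId,
        role: "assistant" as const,
        // Only the resolved tool part is sent; the agent rebuilds the rest.
        parts: [
          {
            type: `tool-${input.toolName}`, // assumption: part type is "tool-" + tool name
            toolCallId: input.toolCallId,
            state: "approval-responded" as const,
            approval: { id: input.approvalId, approved: input.approved },
          },
        ],
      },
      chatId: input.chatId,
      trigger: "submit-message" as const,
      metadata: input.metadata,
    },
  };
}
```

Keeping the helper's input limited to identifiers makes it hard to accidentally ship the full assistant message, which is what keeps the record small.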
## How history is rebuilt

The agent rebuilds the full conversation accumulator on every fresh run boot. There are two reconstruction paths, and the agent picks based on which hooks the customer registered:

### Path A — `hydrateMessages` registered

If the agent declares a [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) hook, the runtime trusts the customer to be the source of truth for history. Snapshot read and replay are **skipped entirely** at boot. The hook fires per turn, with `incomingMessages` consistently of length 0 or 1 (since each record carries at most one new message), and returns the canonical chain from the customer's database.

### Path B — Snapshot + replay (default)

When `hydrateMessages` is not registered, the runtime reconstructs history from durable infrastructure on every run boot:

1. The runtime fetches a per-session JSON snapshot from object storage (S3 or compatible). The snapshot stores `{ messages, lastOutEventId, lastOutTimestamp, savedAt }`, that is, what was true at the moment the previous turn finished. A 404 (no snapshot yet) is fine and is treated as empty.
2. The runtime subscribes to `session.out` with `wait=0` starting from the snapshot's `lastOutEventId` (or seq 0 if there is no snapshot). Any chunks since that cursor are fed through the AI SDK's `processUIMessageStream` reducer to materialize fresh `UIMessage[]`. This catches turns whose snapshot write didn't make it before a crash.
3. Snapshot messages and replayed messages are merged by `id`. On collision, replay wins, because `session.out` is the freshest representation of any assistant message. Partial trailing assistant work from a crashed turn is cleaned up via `cleanupAbortedParts`.
4. When `onTurnComplete` fires, the runtime serializes the accumulator and writes it back to object storage. The write is **awaited**: the run may suspend immediately after, and fire-and-forget would lose the snapshot.
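The merge rule in Path B (merge by `id`, replay wins on collision) can be illustrated with a few lines. This is not the runtime's code: `Msg` and `mergeById` are hypothetical, and the ordering choice (snapshot order first, previously unseen replayed messages appended) is an assumption the text above does not spell out.

```ts
type Msg = { id: string; role: string; parts: unknown[] };

function mergeById(snapshot: Msg[], replayed: Msg[]): Msg[] {
  // A Map keeps first-insertion order, so an overwritten id keeps its
  // original position while taking the replayed content.
  const merged = new Map<string, Msg>();
  for (const m of snapshot) merged.set(m.id, m);
  for (const m of replayed) merged.set(m.id, m); // replay wins on collision
  return Array.from(merged.values());
}

const snapshot: Msg[] = [
  { id: "u1", role: "user", parts: [{ type: "text", text: "hi" }] },
  { id: "a1", role: "assistant", parts: [{ type: "text", text: "partial" }] },
];
const replayed: Msg[] = [
  { id: "a1", role: "assistant", parts: [{ type: "text", text: "full reply" }] },
  { id: "u2", role: "user", parts: [{ type: "text", text: "more" }] },
];

const history = mergeById(snapshot, replayed);
```

Here `history` holds `u1`, the replayed version of `a1`, then `u2`.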
Object-store configuration is the same as the rest of Trigger.dev — set `OBJECT_STORE_*` env vars. With no object store configured and no `hydrateMessages` hook, conversations don't survive run boundaries; the runtime logs a warning at registration time. For a deeper walkthrough of the snapshot model, including OOM-retry interaction and crash semantics, see [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay). ## Head-start protocol caveat The [`chat.headStart`](/docs/ai-chat/fast-starts#head-start) flow runs the first turn's LLM call inside the customer's own HTTP route handler, then hands the durable stream off to the agent for tool execution and step 2+. On that first-ever turn no snapshot exists yet — the agent boots empty. To bridge that gap, the head-start route handler ships **full UIMessage history** through the dedicated `headStartMessages` field with `trigger: "handover-prepare"`. This is the **only** path where a wire-shipped UIMessage\[] still seeds the agent's accumulator: ```json theme={"theme":"css-variables"} { "kind": "message", "payload": { "headStartMessages": [ { "id": "u1", "role": "user", "parts": [/* ... */] }, { "id": "a1", "role": "assistant", "parts": [/* ... */] } ], "chatId": "conversation-123", "trigger": "handover-prepare", "metadata": { "userId": "user-456" } } } ``` Two reasons this exception is safe: 1. **The route handler runs against the customer's own HTTP endpoint**, not `/realtime/v1/sessions/{id}/in/append`. The per-record cap on the realtime route doesn't apply. 2. **`headStartMessages` is only honored on `trigger: "handover-prepare"`**. The runtime ignores the field on every other trigger — the one-message-per-record rule still holds for normal turns. After turn 1 completes, the snapshot is written and turn 2+ run as a normal single-message-per-record chat. ## Pending and steering messages You can send messages while the agent is still streaming a response. 
These are **pending messages** — the agent receives them mid-turn and can inject them between tool-call steps. The wire format is identical to a normal `kind: "message"` send — same `.in` channel, single `message` field. The difference is timing. What happens depends on the agent's `pendingMessages` configuration: * **With `pendingMessages.shouldInject`**: the message is injected into the model's context at the next `prepareStep` boundary. The agent sees it and can adjust its behavior mid-response. * **Without `pendingMessages` config**: the message queues for the next turn. See [Pending Messages](/docs/ai-chat/pending-messages) for how to configure the agent side. Unlike a normal `sendMessage`, pending messages should **not** cancel the active stream subscription. Keep reading — the agent incorporates the message into the same turn or queues it for the next one. ## Continuations A run can end for several reasons: idle timeout, max turns reached, `chat.requestUpgrade()`, crash, or cancellation. When this happens, the session row stays alive — only the run is gone. The next message you append to `.in` automatically triggers a fresh run on the same session. **Clients send the wire shape exactly as a normal `submit-message`** — the server detects the absent run and handles the continuation itself: ```json theme={"theme":"css-variables"} { "kind": "message", "payload": { "message": { "id": "u-42", "role": "user", "parts": [{ "type": "text", "text": "Where were we?" }] }, "chatId": "conversation-123", "trigger": "submit-message", "metadata": { "userId": "user-456" } } } ``` POST to the same `/realtime/v1/sessions/{sessionId}/in/append` URL with the same `publicAccessToken` you've been using — both stay valid across runs. The server detects the absent run, triggers a new one on the session's `triggerConfig`, and the agent boots, reads the snapshot from the prior run's last turn, replays any tail, and continues. 
Only `runId` changes — the new run's id is encoded in the next refreshed `publicAccessToken`'s `read:runs:{runId}` scope. **You don't need to track `runId` or set `continuation: true` / `previousRunId` yourself.** The server detects continuation when the prior run is in a terminal state and sets those fields on the new run's boot payload automatically. The `continuation` and `previousRunId` fields on `ChatTaskWirePayload` are informational — used internally by the agent's boot path, never required from the client. **`onChatStart` does NOT fire on continuation runs.** The hook is once-per-chat — it fires only on the chat's very first user message. Customers who want per-turn setup that also runs on continuation turns should use `onTurnStart` instead. This is how [version upgrades](/docs/ai-chat/patterns/version-upgrades) work transparently — the agent calls `chat.requestUpgrade()`, the run exits, and the client's next message triggers a continuation on the new version. Same session, new run, same snapshot. ## Closing the conversation When the user is done with the conversation, close the session: ```bash theme={"theme":"css-variables"} POST /api/v1/sessions/{sessionId}/close Authorization: Bearer Content-Type: application/json { "reason": "user-ended" } ``` The body is optional — `{}` (or no body at all) closes the session with no reason set. If provided, `reason` is a free-form string up to 256 characters used for dashboard / audit display. Closing is **idempotent**: re-calling on an already-closed session returns the existing row without clobbering the original `closedAt` / `closedReason`. A long-running chat that's just between turns is a **live** session, not a closed one — don't close it prematurely. Once closed, the session cannot be reopened; reuse a different `externalId` if the user wants to start fresh. 
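A close call can be prepared with a small request builder. This is a sketch under assumptions: `buildCloseRequest` is a hypothetical helper, `baseUrl` and the credential are supplied by the caller, and the client-side length check simply mirrors the documented 256-character limit (how the server treats a longer `reason` is not stated here).

```ts
type CloseRequest = {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
};

function buildCloseRequest(
  baseUrl: string,
  sessionId: string,
  token: string,
  reason?: string
): CloseRequest {
  // Assumption: reject early instead of relying on server-side behaviour.
  if (reason !== undefined && reason.length > 256) {
    throw new RangeError("reason must be at most 256 characters");
  }
  return {
    url: `${baseUrl}/api/v1/sessions/${encodeURIComponent(sessionId)}/close`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      // An empty object closes the session with no reason set.
      body: JSON.stringify(reason === undefined ? {} : { reason }),
    },
  };
}
```

Usage would be `const { url, init } = buildCloseRequest(baseUrl, sessionId, token, "user-ended"); await fetch(url, init);`. Because closing is idempotent, retrying the same request after a network error is safe.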
## Session state

A client needs to track the following per conversation:

| Field | Description |
| ------------------- | ----------- |
| `sessionId` | Durable session ID (`session_*`). Stable for the life of the conversation. |
| `chatId` | Your stable conversation ID (passed as `externalId` on create). |
| `runId` | Current run ID. Changes when a run ends and a continuation starts. Only needed if you want to display it. |
| `publicAccessToken` | JWT for session access. Stable across runs; refreshed whenever a `turn-complete` control record carries the optional `public-access-token` header. |
| `lastEventId` | Last `record.seq_num` received on `.out`. Use to resume mid-stream. |

`sessionId`, `chatId`, and `publicAccessToken` are durable. `runId` is live-run state that refreshes on each new run. On reload, you only need `sessionId` + `publicAccessToken` + `lastEventId` to resume; `runId` is a hint that can be `null` when no run is active.

## Authentication

| Operation | Auth |
| -------------------------------------------------- | ---- |
| Create session (`POST /api/v1/sessions`) | Secret API key, or JWT with `write:sessions` super-scope plus a matching `tasks:{taskIdentifier}` scope |
| Close session (`POST /api/v1/sessions/{id}/close`) | Secret API key, or JWT with `admin:sessions:{id}` / `admin:sessions` super-scope |
| `.in` append | The session's `publicAccessToken` (carries `write:sessions:{id}`) |
| `.out` subscribe | The session's `publicAccessToken` (carries `read:sessions:{id}`) |

The `publicAccessToken` returned in the body of `POST /api/v1/sessions` carries both `read:sessions:{externalId}` and `write:sessions:{externalId}` and is **the only token you need** for every `.in`/`.out` operation thereafter.
A token minted on the externalId form authorizes both the externalId and the friendlyId URL forms on every read and write route, so use whichever URL form your client already has on hand. **Don't use the `x-trigger-jwt` header from `POST /api/v1/tasks/{taskId}/trigger`.** That header carries `read:runs:{runId}` + `write:inputStreams:{runId}` — run-scoped scopes, not session-scoped. It cannot subscribe to `.out` or append to `.in`. Always use the `publicAccessToken` from the session-create response body. ## FAQ Yes. `.in` records are processed in arrival order — the agent's stop handler aborts the in-flight `streamText`, emits a `turn-complete` control record, and reads the next record. You don't have to wait for `turn-complete` on the wire before posting the next `.in/append`. In practice you usually do anyway, because your UI is gated on the stream coming back to ready. Any opaque ASCII string up to \~64 characters. The built-in clients generate a high-entropy id per logical send (a UUID in the browser, a `nanoid` server-side) and reuse it across auth retries of that send. The server uses it as a per-record idempotency key — re-POSTing the same body with the same `X-Part-Id` produces a single S2 record. If you don't send the header, the server generates one for you and idempotency is per-request only. The `.in/append` route returns standard rate-limit response headers (`x-ratelimit-limit`, `x-ratelimit-remaining`, `x-ratelimit-reset` — Unix ms epoch when the bucket refills). On `429`, back off until `x-ratelimit-reset` and retry with the same `X-Part-Id` to remain idempotent. Default per-environment limits are generous (millions of requests/window); you'll typically only hit this with runaway client loops. You don't need to. There's no `trigger:run-ended` chunk. The protocol is designed so the client doesn't track run lifecycle: * A `turn-complete` control record means **the turn finished**, not that the run is gone. 
The run may still be alive, idle-waiting for the next `.in` record, or it may have suspended / exited shortly after. * When you POST the next message to `.in/append`, the server figures out whether the existing run can pick it up or whether to spawn a continuation. Either way you get streamed responses on the same `.out` URL. If you genuinely need the live `runId` (for displaying the dashboard link, say), read it from the latest `turn-complete` control record's refreshed `public-access-token` header — the JWT's `read:runs:{runId}` scope encodes it. Or call `GET /api/v1/sessions/{sessionId}` (omitted from this page; see the Sessions API reference) to read `currentRunId`. No. `seq_num` is monotonic across the entire session — turn 1 might emit seq 0–9, turn 2 picks up at seq 10+, and a continuation run on the same session continues numbering from where the prior run left off. A single `Last-Event-ID` cursor is sufficient to resume across turns and runs. The HTTP body is capped at 1 MiB as a DoS guard. The actual ceiling is at the storage layer: each `.in/append` becomes a single S2 record, metered as `8 + body_bytes_after_JSON_wrap`, capped at 1 MiB. So the practical limit on the raw HTTP body sits around \~1023 KiB for content with low JSON-escape overhead (ASCII, base64) and \~512 KiB for content that escapes heavily (all quotes / backslashes). A typical `kind: "message"` is a few KiB. If you're brushing the cap you're either shipping a single tool output that's itself oversized — see [Large payloads](/docs/ai-chat/patterns/large-payloads) — or you're shipping more than one message per record, which the protocol forbids. The 413 response carries CORS headers so browser fetches can read the status. The headStart path (`trigger: "handover-prepare"`) sends through the customer's own HTTP route handler, not `.in/append`, so the cap doesn't apply there. 
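The rate-limit answer above (back off until `x-ratelimit-reset`, then retry with the same `X-Part-Id`) could be wired up roughly as follows. This is a minimal sketch, not the built-in clients' implementation: `retryDelayMs`, `appendWithBackoff`, `AppendResponse`, the injected `send` function, and the fallback delay are all hypothetical.

```typescript
// Minimal response shape so the sketch does not depend on a specific fetch type.
type AppendResponse = { status: number; headers: { get(name: string): string | null } };

// `x-ratelimit-reset` is documented as a Unix ms epoch; wait until then.
function retryDelayMs(resetHeader: string | null, nowMs: number, fallbackMs = 1000): number {
  if (resetHeader === null || resetHeader.trim() === "") return fallbackMs;
  const resetAt = Number(resetHeader);
  if (!Number.isFinite(resetAt)) return fallbackMs; // unparseable header: use the assumed fallback
  return Math.max(0, resetAt - nowMs);
}

async function appendWithBackoff(
  send: (partId: string) => Promise<AppendResponse>, // assumption: POSTs to .in/append with X-Part-Id
  partId: string,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise<void>((r) => setTimeout(r, ms)),
): Promise<AppendResponse> {
  let res = await send(partId);
  for (let attempt = 1; attempt < maxAttempts && res.status === 429; attempt++) {
    await sleep(retryDelayMs(res.headers.get("x-ratelimit-reset"), Date.now()));
    res = await send(partId); // same X-Part-Id, so a duplicate POST stays a single record
  }
  return res;
}
```

Injecting `send` and `sleep` keeps the retry logic independent of any particular HTTP client and makes it easy to exercise with fakes.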
## See also * [`TriggerChatTransport`](/docs/ai-chat/frontend) — Built-in browser transport (implements this protocol) * [`AgentChat`](/docs/ai-chat/server-chat) — Built-in server-side client * [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay) — How the snapshot + replay model works end-to-end * [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) — What the agent does on each event * [Version upgrades](/docs/ai-chat/patterns/version-upgrades) — How `chat.requestUpgrade()` uses continuations # Compaction Source: https://trigger.dev/docs/ai-chat/compaction Automatic context compaction to keep long conversations within token limits. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. ## Overview Long conversations accumulate tokens across turns. Eventually the context window fills up, causing errors or degraded responses. Compaction solves this by automatically summarizing the conversation when token usage exceeds a threshold, then using that summary as the context for future turns. 
The `compaction` option on `chat.agent()` handles this in both paths: * **Between tool-call steps** (inner loop) — via the AI SDK's `prepareStep`, compaction runs between tool calls within a single turn * **Between turns** (outer loop) — for single-step responses with no tool calls, where `prepareStep` never fires ## Basic usage Provide `shouldCompact` to decide when to compact and `summarize` to generate the summary: ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { streamText, generateText, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; export const myChat = chat.agent({ id: "my-chat", compaction: { shouldCompact: ({ totalTokens }) => (totalTokens ?? 0) > 80_000, summarize: async ({ messages }) => { const result = await generateText({ model: anthropic("claude-haiku-4-5"), messages: [...messages, { role: "user", content: "Summarize this conversation concisely." }], }); return result.text; }, }, run: async ({ messages, signal }) => { return streamText({ ...chat.toStreamTextOptions({ registry }), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` The `prepareStep` for inner-loop compaction is automatically injected when you spread `chat.toStreamTextOptions()` into your `streamText` call. If you provide your own `prepareStep` after the spread, it overrides the auto-injected one. ## How it works After each turn completes: 1. `shouldCompact` is called with the current token usage 2. If it returns `true`, `summarize` generates a summary from the model messages 3. The **model messages** (sent to the LLM) are replaced with the summary 4. The **UI messages** (persisted and displayed) are preserved by default 5. The `onCompacted` hook fires if configured On the next turn, the LLM receives the compact summary instead of the full history — dramatically reducing token usage while preserving context. 
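The five steps above can be modeled as a small standalone function. This is a simplified sketch under stated assumptions, not the SDK's implementation: `Msg`, `ChatState`, `CompactionConfig`, and `compactAfterTurn` are hypothetical names, and storing the summary as a single `user` message is an assumption made for illustration.

```typescript
// Simplified stand-ins for the SDK's message and config types (hypothetical).
type Msg = { role: "user" | "assistant"; content: string };

interface ChatState {
  modelMessages: Msg[]; // what the LLM receives
  uiMessages: Msg[]; // what is persisted and displayed
}

interface CompactionConfig {
  shouldCompact: (event: { totalTokens?: number }) => boolean;
  summarize: (event: { messages: Msg[] }) => Promise<string>;
  onCompacted?: (event: { summary: string; messageCount: number }) => void;
}

async function compactAfterTurn(
  state: ChatState,
  totalTokens: number | undefined,
  cfg: CompactionConfig,
): Promise<ChatState> {
  // Step 1: ask whether token usage crossed the threshold.
  if (!cfg.shouldCompact({ totalTokens })) return state;
  // Step 2: summarize the model messages.
  const summary = await cfg.summarize({ messages: state.modelMessages });
  // Steps 3 and 4: replace model messages, keep UI messages untouched.
  const next: ChatState = {
    modelMessages: [{ role: "user", content: summary }],
    uiMessages: state.uiMessages,
  };
  // Step 5: notify the optional hook.
  cfg.onCompacted?.({ summary, messageCount: state.modelMessages.length });
  return next;
}
```

The point of the model is the asymmetry: after compaction the model-facing history shrinks to the summary while the user-facing history is left alone unless you opt in to changing it.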
## Customizing what gets persisted By default, compaction only affects model messages — UI messages stay intact so users see the full conversation after a page refresh. You can customize this with `compactUIMessages`: ### Summary + recent messages Replace older messages with a summary but keep the last few exchanges visible: ```ts theme={"theme":"css-variables"} import { generateId } from "ai"; export const myChat = chat.agent({ id: "my-chat", compaction: { shouldCompact: ({ totalTokens }) => (totalTokens ?? 0) > 80_000, summarize: async ({ messages }) => { return generateText({ model: anthropic("claude-haiku-4-5"), messages: [...messages, { role: "user", content: "Summarize." }], }).then((r) => r.text); }, compactUIMessages: ({ uiMessages, summary }) => [ { id: generateId(), role: "assistant", parts: [{ type: "text", text: `[Conversation summary]\n\n${summary}` }], }, ...uiMessages.slice(-4), // Keep the last 4 messages ], }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` ### Flatten to summary only Replace all messages with just the summary (like the LLM sees): ```ts theme={"theme":"css-variables"} compactUIMessages: ({ summary }) => [ { id: generateId(), role: "assistant", parts: [{ type: "text", text: `[Conversation summary]\n\n${summary}` }], }, ], ``` ## Customizing model messages By default, model messages are replaced with a single summary message. 
Use `compactModelMessages` to customize what the LLM sees after compaction: ### Summary + recent context Keep the last few model messages so the LLM has recent detail alongside the summary: ```ts theme={"theme":"css-variables"} compactModelMessages: ({ modelMessages, summary }) => [ { role: "user", content: summary }, ...modelMessages.slice(-2), // Keep last exchange for detail ], ``` ### Keep tool results Preserve tool-call results so the LLM remembers what tools returned: ```ts theme={"theme":"css-variables"} compactModelMessages: ({ modelMessages, summary }) => [ { role: "user", content: summary }, ...modelMessages.filter((m) => m.role === "tool"), ], ``` ## shouldCompact event The `shouldCompact` callback receives context about the current state: | Field | Type | Description | | -------------- | --------------------- | ---------------------------------------------- | | `messages` | `ModelMessage[]` | Current model messages | | `totalTokens` | `number \| undefined` | Total tokens from the triggering step/turn | | `inputTokens` | `number \| undefined` | Input tokens | | `outputTokens` | `number \| undefined` | Output tokens | | `usage` | `LanguageModelUsage` | Full usage object | | `totalUsage` | `LanguageModelUsage` | Cumulative usage across all turns | | `chatId` | `string` | Chat session ID | | `turn` | `number` | Current turn (0-indexed) | | `clientData` | `unknown` | Custom data from the frontend | | `source` | `"inner" \| "outer"` | Whether this is between steps or between turns | | `steps` | `CompactionStep[]` | Steps array (inner loop only) | | `stepNumber` | `number` | Step index (inner loop only) | ## summarize event The `summarize` callback receives similar context: | Field | Type | Description | | ------------ | -------------------- | ----------------------------------- | | `messages` | `ModelMessage[]` | Messages to summarize | | `usage` | `LanguageModelUsage` | Usage from the triggering step/turn | | `totalUsage` | `LanguageModelUsage` | Cumulative 
usage | | `chatId` | `string` | Chat session ID | | `turn` | `number` | Current turn | | `clientData` | `unknown` | Custom data from the frontend | | `source` | `"inner" \| "outer"` | Where compaction is running | | `stepNumber` | `number` | Step index (inner loop only) | ## onCompacted hook Track compaction events for logging, billing, or analytics: ```ts theme={"theme":"css-variables"} export const myChat = chat.agent({ id: "my-chat", compaction: { ... }, onCompacted: async ({ summary, totalTokens, messageCount, chatId, turn }) => { logger.info("Compacted", { chatId, turn, totalTokens, messageCount }); await db.compactionLog.create({ data: { chatId, summary, totalTokens, messageCount }, }); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` ## User-initiated compaction Sometimes you want the user to decide when to compact — a "Summarize conversation" button, a `/compact` slash command, or a settings toggle. Wire this up with [actions](/docs/ai-chat/actions): the frontend sends a typed action, `onAction` runs the summary, and `chat.history.set()` replaces the conversation. ### Backend Define a `compact` action that reuses your existing `summarize` function: ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { streamText, generateText, generateId, convertToModelMessages, type ModelMessage } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { z } from "zod"; // Reusable summarize fn — also used by the automatic compaction config. async function summarize(messages: ModelMessage[]) { const result = await generateText({ model: anthropic("claude-haiku-4-5"), messages: [...messages, { role: "user", content: "Summarize this conversation concisely." }], }); return result.text; } export const myChat = chat.agent({ id: "my-chat", // Automatic compaction still runs on threshold. compaction: { shouldCompact: ({ totalTokens }) => (totalTokens ??
0) > 80_000, summarize: async ({ messages }) => summarize(messages), }, // User-initiated: the frontend sends { type: "compact" }. actionSchema: z.discriminatedUnion("type", [ z.object({ type: z.literal("compact") }), ]), onAction: async ({ action, uiMessages }) => { if (action.type !== "compact") return; const summary = await summarize(convertToModelMessages(uiMessages)); // Replace the full history with a single summary message. chat.history.set([ { id: generateId(), role: "assistant", parts: [{ type: "text", text: `[Conversation summary]\n\n${summary}` }], }, ]); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` Actions fire `onAction` only (plus `hydrateMessages` if set) — `run()` and `onTurnComplete` do not fire for actions. Persist the compacted state directly inside `onAction` after the `chat.history.set` call. See [Actions](/docs/ai-chat/actions) for the full lifecycle. ### Frontend Call `transport.sendAction()` from a button or slash command: ```tsx theme={"theme":"css-variables"} import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react"; import { useChat } from "@ai-sdk/react"; function ChatView({ chatId }: { chatId: string }) { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), }); const { messages } = useChat({ id: chatId, transport }); return ( <> {messages.map(/* ... */)} </> ); } ``` The call returns as soon as the backend accepts the action. Because `onAction` replaces the history with the summary through `chat.history.set()`, `useChat` receives the new state when the action's `trigger:turn-complete` chunk arrives, and the UI updates automatically. ### Indicating compaction in the UI For "Compacting..." 
feedback while the summary generates, append a transient data part from `onAction` via `chat.stream.append()`: ```ts theme={"theme":"css-variables"} onAction: async ({ action, uiMessages }) => { if (action.type !== "compact") return; chat.stream.append({ type: "data-compaction", data: { status: "compacting" } }); const summary = await summarize(convertToModelMessages(uiMessages)); chat.stream.append({ type: "data-compaction", data: { status: "complete" } }); chat.history.set([ /* ... */ ]); }, ``` See [Raw streaming with `chat.stream`](/docs/ai-chat/backend#raw-streaming-with-chat-stream) for the full API. ## Using with chat.createSession() Pass the same `compaction` config to `chat.createSession()`. The session handles outer-loop compaction automatically inside `turn.complete()`: ```ts theme={"theme":"css-variables"} const session = chat.createSession(payload, { signal, idleTimeoutInSeconds: 60, timeout: "1h", compaction: { shouldCompact: ({ totalTokens }) => (totalTokens ?? 0) > 80_000, summarize: async ({ messages }) => generateText({ model: anthropic("claude-haiku-4-5"), messages }).then((r) => r.text), compactUIMessages: ({ uiMessages, summary }) => [ { id: generateId(), role: "assistant", parts: [{ type: "text", text: `[Summary]\n\n${summary}` }] }, ...uiMessages.slice(-4), ], }, }); for await (const turn of session) { const result = streamText({ model: anthropic("claude-sonnet-4-5"), messages: turn.messages, abortSignal: turn.signal, stopWhen: stepCountIs(15), }); await turn.complete(result); // Outer-loop compaction runs automatically after complete() await db.chat.update({ where: { id: turn.chatId }, data: { messages: turn.uiMessages }, }); } ``` ## Using with raw tasks (MessageAccumulator) Pass `compaction` to the `MessageAccumulator` constructor. 
Use `prepareStep()` for inner-loop compaction and `compactIfNeeded()` for the outer loop: ```ts theme={"theme":"css-variables"} const conversation = new chat.MessageAccumulator({ compaction: { shouldCompact: ({ totalTokens }) => (totalTokens ?? 0) > 80_000, summarize: async ({ messages }) => generateText({ model: anthropic("claude-haiku-4-5"), messages }).then((r) => r.text), compactUIMessages: ({ summary }) => [ { id: generateId(), role: "assistant", parts: [{ type: "text", text: `[Summary]\n\n${summary}` }] }, ], }, }); for (let turn = 0; turn < 100; turn++) { const messages = await conversation.addIncoming(payload.message ? [payload.message] : [], payload.trigger, turn); const result = streamText({ model: anthropic("claude-sonnet-4-5"), messages, prepareStep: conversation.prepareStep(), // Inner-loop compaction stopWhen: stepCountIs(15), }); const response = await chat.pipeAndCapture(result); if (response) await conversation.addResponse(response); // Outer-loop compaction const usage = await result.totalUsage; await conversation.compactIfNeeded(usage, { chatId: payload.chatId, turn }); await db.chat.update({ data: { messages: conversation.uiMessages } }); await chat.writeTurnComplete(); } ``` ## Fully manual compaction For maximum control, use `chat.compact()` directly inside a custom `prepareStep`: ```ts theme={"theme":"css-variables"} prepareStep: async ({ messages: stepMessages, steps }) => { const result = await chat.compact(stepMessages, steps, { threshold: 80_000, summarize: async (msgs) => generateText({ model: anthropic("claude-haiku-4-5"), messages: msgs }).then((r) => r.text), }); return result.type === "skipped" ? 
undefined : result; }, ``` Or use the `chat.compactionStep()` factory: ```ts theme={"theme":"css-variables"} prepareStep: chat.compactionStep({ threshold: 80_000, summarize: async (msgs) => generateText({ model: anthropic("claude-haiku-4-5"), messages: msgs }).then((r) => r.text), }), ``` The fully manual APIs only handle inner-loop compaction (between tool-call steps). For outer-loop coverage, use the `compaction` option on `chat.agent()`, `chat.createSession()`, or `MessageAccumulator`. # Custom agents Source: https://trigger.dev/docs/ai-chat/custom-agents Build chat agents without chat.agent()'s managed lifecycle: register with chat.customAgent(), then drive turns with the createSession iterator or a hand-rolled loop. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. **A custom agent is a task you register with `chat.customAgent()` and drive yourself — either with the managed turn iterator from `chat.createSession()`, or with a fully hand-rolled loop over the raw chat primitives.** You give up `chat.agent()`'s lifecycle hooks and automatic continuation recovery; you gain inline control over every turn, and (at the lowest level) full control over the stream conversion. See the [comparison table](/docs/ai-chat/backend) before dropping down. The frontend is unchanged either way: all levels speak the same wire protocol, so [`useTriggerChatTransport`](/docs/ai-chat/frontend) points at a custom agent exactly like a `chat.agent()`. 
## chat.customAgent() `chat.customAgent()` is a thin wrapper around `task()` that does two things: it registers the task as an agent (so it appears in the agent dashboard, the playground, and the MCP server's `list_agents`), and it binds the run to its backing [Session](/docs/ai-chat/sessions) so the `chat.*` primitives resolve to the right `.in`/`.out` channels. There is no managed lifecycle — no turn loop, no hooks, no preload handling. A plain `task()` works with the same primitives but stays invisible to the agent surfaces, so prefer `customAgent` unless you specifically don't want the task listed as an agent. Inside the wrapper, pick one of two loop styles: * **[Managed loop](#managed-loop-chatcreatesession)** — `chat.createSession()` yields turns; the SDK handles stop signals, accumulation, idle suspend/resume, and turn-complete signaling. You write the turn body. * **[Hand-rolled loop](#hand-rolled-loop-with-primitives)** — you write the loop itself with `chat.messages`, `MessageAccumulator`, `pipeAndCapture`, and `writeTurnComplete`. The right choice when you need complete control over `.toUIMessageStream()` (e.g. `onFinish`, `originalMessages`) beyond what `chat.setUIMessageStreamOptions()` provides, or you're implementing a custom protocol. ## Managed loop: chat.createSession() `chat.createSession()` gives you an async iterator of `ChatTurn` objects. Each turn arrives with the accumulated history, a combined stop+cancel signal, and helpers to finish the turn: ```ts trigger/my-chat.ts theme={"theme":"css-variables"} import { chat, type ChatTaskWirePayload } from "@trigger.dev/sdk/ai"; import { streamText, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; export const myChat = chat.customAgent({ id: "my-chat", run: async (payload: ChatTaskWirePayload, { signal }) => { // One-time initialization — plain code, no hooks. Upsert, not create: // continuation runs boot with the row already in place. 
const clientData = payload.metadata as { userId: string }; await db.chat.upsert({ where: { id: payload.chatId }, create: { id: payload.chatId, userId: clientData.userId }, update: {}, }); const session = chat.createSession(payload, { signal, idleTimeoutInSeconds: 60, timeout: "1h", }); for await (const turn of session) { // Persist the incoming user message BEFORE streaming — this is your // onTurnStart equivalent. Without it, a page reload mid-stream // restores the assistant text (replayed from the session) but loses // the user message that prompted it. await db.chat.update({ where: { id: turn.chatId }, data: { messages: turn.uiMessages }, }); const result = streamText({ model: anthropic("claude-sonnet-4-5"), messages: turn.messages, abortSignal: turn.signal, stopWhen: stepCountIs(15), }); // Pipe, capture, accumulate, and signal turn-complete — all in one call await turn.complete(result); // Persist the full exchange after the turn — your onTurnComplete equivalent await db.chat.update({ where: { id: turn.chatId }, data: { messages: turn.uiMessages }, }); } }, }); ``` If you pass `compaction` or `pendingMessages` to `chat.createSession()`, you must also pass `prepareStep: turn.prepareStep()` to `streamText` (or spread `chat.toStreamTextOptions()`, which wires it automatically). Without it, both features silently no-op. 
### ChatSessionOptions

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `signal` | `AbortSignal` | required | Run-level cancel signal (from task context) |
| `idleTimeoutInSeconds` | `number` | `30` | Seconds to stay idle between turns before suspending |
| `timeout` | `string` | `"1h"` | Duration string for suspend timeout |
| `maxTurns` | `number` | `100` | Max turns before ending |
| `compaction` | `ChatAgentCompactionOptions` | `undefined` | Automatic context [compaction](/docs/ai-chat/compaction) — same options as on `chat.agent()` |
| `pendingMessages` | `PendingMessagesOptions` | `undefined` | Mid-execution [message injection](/docs/ai-chat/pending-messages) — same options as on `chat.agent()` |

Between turns the run idles on `waitWithIdleTimeout`: after `idleTimeoutInSeconds` with no message it suspends (compute is freed), and the next message restores it on the same run — the same warm/suspended pipeline `chat.agent()` uses.
### ChatTurn Each turn yielded by the iterator provides: | Field | Type | Description | | ------------------- | --------------------------------- | -------------------------------------------------------- | | `number` | `number` | Turn number (0-indexed) | | `chatId` | `string` | Chat session ID | | `trigger` | `string` | What triggered this turn | | `clientData` | `unknown` | Client data from the transport | | `messages` | `ModelMessage[]` | Full accumulated model messages — pass to `streamText` | | `uiMessages` | `UIMessage[]` | Full accumulated UI messages — use for persistence | | `signal` | `AbortSignal` | Combined stop+cancel signal (fresh each turn) | | `stopped` | `boolean` | Whether the user stopped generation this turn | | `continuation` | `boolean` | Whether this is a continuation run | | `previousTurnUsage` | `LanguageModelUsage \| undefined` | Token usage from the previous turn (undefined on turn 0) | | `totalUsage` | `LanguageModelUsage` | Cumulative token usage across all completed turns | | Method | Description | | ------------------------------ | --------------------------------------------------------------------------------------------------------------------------- | | `turn.complete(source)` | Pipe stream, capture response, accumulate, and signal turn-complete | | `turn.done()` | Signal turn-complete only (when you have piped manually) | | `turn.addResponse(response)` | Add a response to the accumulator manually | | `turn.setMessages(uiMessages)` | Replace the accumulated messages — continuation seeding and on-demand compaction | | `turn.prepareStep()` | `prepareStep` callback wiring compaction + injection — pass to `streamText` when not spreading `chat.toStreamTextOptions()` | ### Continuation runs and history seeding `chat.agent()` rebuilds conversation history automatically when a chat continues on a fresh run (after a cancel, crash, version upgrade, or TTL expiry) — via its snapshot/replay boot or your `hydrateMessages` hook. 
Custom agents do none of that: a continuation run starts with an **empty accumulator**, and history restoration is your job. With `createSession`, check `turn.continuation` on the first turn and seed from your store with `turn.setMessages()`: ```ts theme={"theme":"css-variables"} for await (const turn of session) { if (turn.continuation && turn.number === 0) { const row = await db.chat.findUnique({ where: { id: turn.chatId } }); const stored = (row?.messages ?? []) as UIMessage[]; if (stored.length > 0) { // Keep any incoming message that isn't already persisted const incoming = turn.uiMessages.filter((m) => !stored.some((s) => s.id === m.id)); await turn.setMessages([...stored, ...incoming]); } } // ... streamText + turn.complete as usual } ``` Without this, a resumed chat silently loses its history: the model sees only the message that triggered the continuation. In a hand-rolled loop, seed by passing the stored history into the turn-0 `addIncoming` call — shown in the example below. ### turn.complete() vs manual control `turn.complete(result)` is the one-call path — it handles piping, capturing the response, accumulating messages, cleaning up aborted parts on a stop, and writing the turn-complete chunk. For more control, you can do each step manually: ```ts theme={"theme":"css-variables"} for await (const turn of session) { const result = streamText({ model: anthropic("claude-sonnet-4-5"), messages: turn.messages, abortSignal: turn.signal, stopWhen: stepCountIs(15), }); // Manual: pipe and capture separately const response = await chat.pipeAndCapture(result, { signal: turn.signal }); if (response) { // Custom processing before accumulating await turn.addResponse(response); } // Custom persistence, analytics, etc. await db.chat.update({ ... 
}); // Must call done() when not using complete() await turn.done(); } ``` ## Hand-rolled loop with primitives For full control, skip `createSession` and compose the primitives directly: | Primitive | Description | | ------------------------------- | ------------------------------------------------------------------------------------------- | | `chat.messages` | Input stream for incoming messages — use `.waitWithIdleTimeout()` to wait for the next turn | | `chat.createStopSignal()` | Create a managed stop signal wired to the stop input stream | | `chat.pipeAndCapture(result)` | Pipe a `StreamTextResult` to the chat stream and capture the response | | `chat.writeTurnComplete()` | Signal the frontend that the current turn is complete | | `chat.MessageAccumulator` | Accumulates conversation messages across turns | | `chat.pipe(stream)` | Pipe a stream to the frontend (no response capture) | | `chat.cleanupAbortedParts(msg)` | Clean up incomplete parts from a stopped response | A complete loop: ```ts trigger/my-chat-raw.ts theme={"theme":"css-variables"} import { chat, type ChatTaskWirePayload } from "@trigger.dev/sdk/ai"; import { streamText, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; export const myChat = chat.customAgent({ id: "my-chat-raw", run: async (payload: ChatTaskWirePayload, { signal: runSignal }) => { let currentPayload = payload; // Handle preload — wait for the first real message if (currentPayload.trigger === "preload") { const result = await chat.messages.waitWithIdleTimeout({ idleTimeoutInSeconds: 60, timeout: "1h", spanName: "waiting for first message", }); if (!result.ok) return; currentPayload = result.output; } const stop = chat.createStopSignal(); const conversation = new chat.MessageAccumulator(); // Continuation runs (cancel, crash, upgrade) start with an empty // accumulator — fetch stored history so turn 0 can seed it. 
let continuationSeed: UIMessage[] = []; if (currentPayload.continuation) { const row = await db.chat.findUnique({ where: { id: currentPayload.chatId } }); continuationSeed = (row?.messages ?? []) as UIMessage[]; } for (let turn = 0; turn < 100; turn++) { stop.reset(); // The wire payload carries at most one new message per turn. Turn 0 // REPLACES the accumulator, so seed stored history through // addIncoming together with the incoming message — a setMessages // call before the loop would be wiped here. const incoming = currentPayload.message ? [currentPayload.message] : []; const turnInput = turn === 0 && continuationSeed.length > 0 ? [...continuationSeed.filter((s) => !incoming.some((m) => m.id === s.id)), ...incoming] : incoming; const messages = await conversation.addIncoming(turnInput, currentPayload.trigger, turn); // Persist the incoming user message before streaming so a // mid-stream reload doesn't lose it. await db.chat.update({ where: { id: currentPayload.chatId }, data: { messages: conversation.uiMessages }, }); const combinedSignal = AbortSignal.any([runSignal, stop.signal]); const result = streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: combinedSignal, stopWhen: stepCountIs(15), }); let response; try { response = await chat.pipeAndCapture(result, { signal: combinedSignal }); } catch (error) { if (error instanceof Error && error.name === "AbortError") { if (runSignal.aborted) break; // Stop — fall through to accumulate partial } else { throw error; } } if (response) { const cleaned = stop.signal.aborted && !runSignal.aborted ? chat.cleanupAbortedParts(response) : response; await conversation.addResponse(cleaned); } if (runSignal.aborted) break; // Persist, analytics, etc. 
await db.chat.update({ where: { id: currentPayload.chatId }, data: { messages: conversation.uiMessages }, }); await chat.writeTurnComplete(); // Wait for the next message const next = await chat.messages.waitWithIdleTimeout({ idleTimeoutInSeconds: 60, timeout: "1h", spanName: "waiting for next message", }); if (!next.ok) break; currentPayload = next.output; } stop.cleanup(); }, }); ``` ### MessageAccumulator `addIncoming(messages, trigger, turn)` has two modes: * **Turn 0 or `trigger === "regenerate-message"`: replaces** the accumulator with exactly what you pass. This is why continuation seeding goes through `addIncoming` (above), and why a regenerate needs you to slice your own history — the wire omits the message on regenerate, so pass the stored history minus the last assistant message. * **Every other turn: appends** what you pass (the wire carries at most the one new user message). ```ts theme={"theme":"css-variables"} const conversation = new chat.MessageAccumulator(); // Returns full accumulated ModelMessage[] for streamText const messages = await conversation.addIncoming( payload.message ? [payload.message] : [], payload.trigger, turn ); // After piping, add the response const response = await chat.pipeAndCapture(result); if (response) await conversation.addResponse(response); // Access accumulated messages for persistence conversation.uiMessages; // UIMessage[] conversation.modelMessages; // ModelMessage[] ``` The constructor also accepts `compaction` and `pendingMessages` options (same shapes as on `chat.agent()`); pass `prepareStep: conversation.prepareStep()` to `streamText` to activate them. See [pending messages](/docs/ai-chat/pending-messages#backend-messageaccumulator-raw-task) for the manual steering wiring. ### Hand-rolled loop checklist Things the managed levels do for you that a raw loop has to get right: * **Don't bare-await `result.totalUsage`.** On a stop-abort the AI SDK's `totalUsage` promise never settles, which wedges the loop forever. 
  Race it with a timeout:

  ```ts theme={"theme":"css-variables"}
  const turnUsage = await Promise.race([
    result.totalUsage,
    new Promise((resolve) => setTimeout(() => resolve(undefined), 2000)),
  ]);
  ```

* **Persist the user message before streaming** (shown in the example above). The session replay restores the assistant's streamed text after a page reload, but nothing restores a user message you haven't written down.
* **Seed history on continuation runs through the turn-0 `addIncoming`** (shown above). `payload.continuation` is `true` when this run picked up an existing chat; the accumulator starts empty — and because turn 0 replaces the accumulator, a `setMessages` call before the loop gets wiped.
* **Clean up aborted parts on a stop** with `chat.cleanupAbortedParts()` before accumulating, or the partial response carries half-open tool calls into the next turn's prompt.
* **Read `payload.message` (singular).** The wire payload carries at most one new message per turn; there is no `messages` array on the payload.

## Next steps

* The three abstraction levels compared, and everything chat.agent() adds on top.
* The durable stream pair every agent — managed or custom — is built on.
* Automatic context compression — works with createSession and MessageAccumulator.
* The wire format your loop is speaking, chunk by chunk.

# Error handling

Source: https://trigger.dev/docs/ai-chat/error-handling

How errors flow through chat.agent — stream errors, hook errors, run failures — and how to recover.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

`chat.agent` errors fall into four layers, each with different recovery semantics.
The default behavior is **conversation-preserving**: a thrown error in a hook or `run()` does not kill the chat. The current turn ends with an error chunk, and the agent waits for the user's next message. ## Error layers at a glance | Layer | Source | Default behavior | Recovery | | --------------- | ------------------------------------------------------------------ | --------------------------------------------------------------------- | ----------------------------------------------------- | | **Stream** | `streamText` errors mid-response (rate limits, model API failures) | `onError` callback converts to error chunk | Sanitize message via `uiMessageStreamOptions.onError` | | **Hook / turn** | Throws in `onValidateMessages`, `onTurnStart`, `run`, etc. | Error chunk + turn-complete written to stream; conversation continues | Catch in your hook, or rely on default | | **Run** | Unhandled exception escapes the run | Run fails. No retry by default. Standard task `onFailure` fires. | `onFailure` task hook | | **Frontend** | Stream delivers `{ type: "error", errorText }` | `useChat` exposes via `error` field and `onError` callback | Show toast, retry button, etc. | ## Stream errors mid-turn When the model API errors mid-response (rate limits, network failures, malformed output), the AI SDK's `streamText` calls the `onError` callback. Use `uiMessageStreamOptions.onError` to convert the error to a user-friendly string. The string is sent to the frontend as an error chunk. ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; export const myChat = chat.agent({ id: "my-chat", uiMessageStreamOptions: { onError: (error) => { console.error("Stream error:", error); if (error instanceof Error && error.message.includes("rate limit")) { return "Rate limited. Please wait a moment and try again."; } if (error instanceof Error && error.message.includes("context_length")) { return "This conversation is too long. 
Please start a new chat."; } return "Something went wrong while generating a response. Please try again."; }, }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` Returning a string from `onError` is what gets shown to the user. Do not return raw error messages — they may leak internal details (API keys, stack traces, etc.). The frontend receives this as an error chunk that `useChat` exposes via its `error` field: ```tsx theme={"theme":"css-variables"} const { messages, error } = useChat({ transport }); {error &&
<div>{error.message}</div>
} ``` ## Hook and turn errors If any lifecycle hook (`onValidateMessages`, `onChatStart`, `onTurnStart`, `hydrateMessages`, `onAction`, `prepareMessages`, `onBeforeTurnComplete`, `onTurnComplete`) or `run()` throws an unhandled exception, the turn loop catches it: 1. Writes `{ type: "error", errorText: error.message }` to the stream 2. Writes a turn-complete chunk to close the turn 3. Waits for the next user message The conversation stays alive. The user can send another message and continue. ```ts theme={"theme":"css-variables"} export const myChat = chat.agent({ id: "my-chat", onTurnStart: async ({ chatId, uiMessages }) => { // If this throws, the turn ends with an error chunk // and the agent waits for the next message await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } }); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` ### Catching errors in your own hooks For granular control, wrap your hook code in try/catch and decide what to do. Common patterns: ```ts theme={"theme":"css-variables"} onValidateMessages: async ({ messages }) => { try { return await validateUIMessages({ messages, tools: chatTools }); } catch (err) { // Log to your error tracking service Sentry.captureException(err); // Throw a user-facing error message — this becomes the error chunk throw new Error("Your message contains invalid data and could not be sent."); } }, ``` The `Error.message` you throw is sent verbatim to the frontend as the error chunk's `errorText`. Use messages safe for end users. ### Catching errors inside `run()` `run()` is your code — wrap it in try/catch for full control. 
This is the right place to save partial state to your DB before the error chunk goes out: ```ts theme={"theme":"css-variables"} run: async ({ messages, chatId, signal }) => { try { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); } catch (err) { // Save the failed turn for debugging / undo await db.failedTurn.create({ data: { chatId, error: err instanceof Error ? err.message : String(err), messages, }, }); throw err; // Re-throw to trigger the error chunk } }, ``` ## Saving error state to your DB To persist errors for debugging or undo, use `onTurnComplete` (which fires even after errors) or the standard task `onComplete` hook. ### Using `onTurnComplete` `onTurnComplete` fires after every turn — successful **or** errored. On an errored turn `responseMessage` is undefined or partial and `error` carries the thrown value (with `finishReason` set to `"error"`). Use this to mark the turn as failed: ```ts theme={"theme":"css-variables"} onTurnComplete: async ({ chatId, uiMessages, responseMessage, stopped, error }) => { // Persist the messages regardless of error state await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages, // `error` is set when the turn threw lastTurnStatus: error ? "errored" : stopped ? "stopped" : "ok", }, }); }, ``` ### Using the standard `onFailure` task hook For run-level failures (the entire run dies), use the standard task `onFailure` hook. This fires when the run terminates with an unhandled exception: ```ts theme={"theme":"css-variables"} chat.agent({ id: "my-chat", onFailure: async ({ error, ctx }) => { // Log run-level failure to your monitoring service await monitoring.recordRunFailure({ runId: ctx.run.id, chatId: ctx.run.tags.find(t => t.startsWith("chat:"))?.slice(5), error: error.message, }); }, run: async ({ messages, signal }) => { return streamText({ ... }); }, }); ``` `chat.agent` uses `retry: { maxAttempts: 1 }` internally, so the run never retries on failure. 
To add run-level retries, wrap the agent in a parent task or implement your own retry logic in the frontend (re-send the message). ## Recovery patterns ### Pattern 1: Undo to last successful response A common pattern is to let the user "undo" the failed turn and try again. Combine `chat.history.rollbackTo` with a custom action: ```ts theme={"theme":"css-variables"} chat.agent({ id: "my-chat", actionSchema: z.discriminatedUnion("type", [ z.object({ type: z.literal("undo") }), ]), onAction: async ({ action, uiMessages }) => { if (action.type === "undo") { // Find the last user message and roll back to it const lastUserIdx = [...uiMessages].reverse().findIndex(m => m.role === "user"); if (lastUserIdx !== -1) { const targetIdx = uiMessages.length - 1 - lastUserIdx - 1; const target = uiMessages[targetIdx]; if (target) chat.history.rollbackTo(target.id); } } }, run: async ({ messages, signal }) => { return streamText({ ... }); }, }); ``` On the frontend, show an "Undo" button when an error occurs: ```tsx theme={"theme":"css-variables"} {error && ( )} ``` ### Pattern 2: Retry the last message For transient errors (network blips, rate limits), the simplest recovery is to re-send the last user message. The AI SDK's `useChat` provides `regenerate()`: ```tsx theme={"theme":"css-variables"} const { messages, error, regenerate } = useChat({ transport }); {error && ( )} ``` `regenerate()` removes the last assistant response and re-sends. Combined with `onValidateMessages` or `hydrateMessages`, you can reload the canonical state from your DB before retrying. ### Pattern 3: Save partial responses When a stream errors mid-response, the `responseMessage` in `onBeforeTurnComplete` and `onTurnComplete` contains the partial output. 
Save it as a "draft" so the user can see what was generated before the error: ```ts theme={"theme":"css-variables"} onBeforeTurnComplete: async ({ chatId, responseMessage, stopped }) => { if (responseMessage && responseMessage.parts.length > 0) { // Save partial response — user can manually accept or discard await db.partialResponse.create({ data: { chatId, message: responseMessage, reason: stopped ? "stopped" : "errored", }, }); } }, ``` ### Pattern 4: Fall back to a different model If the primary model errors, try a fallback model in the same turn: ```ts theme={"theme":"css-variables"} run: async ({ messages, signal }) => { try { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); } catch (err) { console.warn("Primary model failed, falling back:", err); return streamText({ model: anthropic("claude-sonnet-4-6"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); } }, ``` This only catches errors thrown synchronously by `streamText` setup. Errors that happen mid-stream go through `uiMessageStreamOptions.onError`, not your try/catch. ## What gets written to the stream on error When an error occurs at any layer, the frontend's `UIMessageChunk` stream surfaces an error chunk: ```json theme={"theme":"css-variables"} { "type": "error", "errorText": "Rate limited. Please wait a moment and try again." } ``` A `turn-complete` control record follows on `session.out` (header-form, not a data chunk — see [`turn-complete` control record](/docs/ai-chat/client-protocol#turn-complete-control-record) for the wire format) to mark the turn as done. The AI SDK's `useChat` processes this and: 1. Sets `useChat`'s `error` field to an `Error` with `message = errorText` 2. Calls the user's `onError` callback (if set) 3. 
Marks the turn as complete (`status` returns to `"ready"`) ```tsx theme={"theme":"css-variables"} const { messages, error, status } = useChat({ transport, onError: (err) => { toast.error(err.message); }, }); ``` ## Frontend error handling ### Showing the error to the user ```tsx theme={"theme":"css-variables"} function Chat() { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), }); const { messages, error, sendMessage } = useChat({ transport }); return (
    <div>
      {messages.map(m => /* ... */)}
      {error && (
        <div>
          {error.message}
        </div>
      )}
      <form onSubmit={(e) => { e.preventDefault(); sendMessage(/* ... */); }}>
        {/* ... */}
      </form>
    </div>
  );
}
```

### Distinguishing error types

The `errorText` is just a string, so distinguish error types via prefixes or codes:

```ts theme={"theme":"css-variables"}
// Backend
uiMessageStreamOptions: {
  onError: (error) => {
    // `error` is untyped here, so narrow it before reading `.message`
    const message = error instanceof Error ? error.message : String(error);
    if (message.includes("rate limit")) return "RATE_LIMIT: Please wait and try again.";
    if (message.includes("context_length")) return "CONTEXT_TOO_LONG: Start a new chat.";
    return "UNKNOWN: Something went wrong.";
  },
},
```

```tsx theme={"theme":"css-variables"}
// Frontend
{error?.message.startsWith("RATE_LIMIT") && /* ... */}
{error?.message.startsWith("CONTEXT_TOO_LONG") && /* ... */}
```

For richer error structures, use [`chat.response.write()`](/docs/ai-chat/backend#custom-data-parts) with a custom `data-error` part type. This lets you ship structured error metadata (codes, retry hints, etc.) instead of stringly-typed messages.

### Errors from `accessToken` / `startSession`

If your `accessToken` or `startSession` callback throws (auth failure, DB write failure, network error), the rejection surfaces through `useChat`'s `error` state — same as a stream error. The transport doesn't retry the callback automatically; the customer is responsible for handling it.

```tsx theme={"theme":"css-variables"}
const transport = useTriggerChatTransport({
  task: "my-chat",
  accessToken: async ({ chatId }) => {
    try {
      return await mintChatAccessToken(chatId);
    } catch (err) {
      // Customer's server action failed (e.g. user lost auth).
      // Re-throw to surface as a useChat error, or return a sentinel
      // your UI can detect and prompt re-auth.
      const reason = err instanceof Error ? err.message : String(err);
      throw new Error(`AUTH_REFRESH: ${reason}`);
    }
  },
  startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
});
```

`startSession` failures most commonly mean the customer's authorization layer rejected the request (no plan, quota exceeded, user not allowed to chat with this agent).
The customer's server should produce a meaningful error message; the transport propagates it verbatim to `useChat`'s `error` state. ## Run-level retries `chat.agent` uses `retry: { maxAttempts: 1 }` — the run **never retries** on unhandled failure. This is intentional: each turn is conversation-preserving, so a true run failure is severe and shouldn't silently retry (which could send duplicate API calls or mutate state twice). To add retry-like behavior: * **Per-turn retries**: handle inside `run()` with try/catch and a fallback model * **Per-message retries**: re-send from the frontend (call `sendMessage` or `regenerate` again) * **Whole-run retries**: wrap `chat.agent` with a parent task that has `retry` configured, and call the agent's task internally ## Best practices 1. **Always set `uiMessageStreamOptions.onError`** to sanitize stream errors before they reach the user. 2. **Persist messages in `onTurnStart`** so a mid-stream failure still leaves the user's message visible. 3. **Use `onTurnComplete` to mark turn status** in your DB (`ok` / `errored` / `stopped`). 4. **Don't throw raw errors with internal details** in hooks — catch, log, then throw a sanitized user-facing message. 5. **Provide an undo or retry affordance** in the UI when errors occur. 6. **Use `onFailure` for run-level monitoring** (Sentry, monitoring dashboards). 7. **For known transient errors (rate limits, network)**, consider a fallback model inside `run()` instead of failing the turn. ## `ChatChunkTooLargeError` A specific run-failing error worth flagging on its own. Anything written through the chat output is one record on the underlying realtime stream, capped at \~1 MiB per record. A single chunk over the cap throws `ChatChunkTooLargeError` (named export from `@trigger.dev/sdk`). The most common trigger is a tool whose result object is large enough to overflow as one `tool-output-available` chunk. The error carries `chunkType`, `chunkSize`, and `maxSize`. 
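A pre-flight size check is one way to avoid hitting the cap in the first place. The sketch below is a minimal illustration, assuming the cap applies to the serialized chunk and that \~1 MiB means 1,048,576 bytes. `MAX_CHUNK_BYTES`, `HEADROOM_BYTES`, `serializedBytes`, `toChunkSafeResult`, and `storeBlob` are hypothetical names, not SDK exports, and the headroom value is a guess.

```typescript
// Hypothetical pre-flight check; none of these names come from the SDK.
const MAX_CHUNK_BYTES = 1024 * 1024; // the ~1 MiB per-record cap described above
const HEADROOM_BYTES = 64 * 1024; // assumed safety margin for chunk framing

type InlineResult<T> = { kind: "inline"; value: T };
type ReferenceResult = { kind: "reference"; id: string; bytes: number };

/** Size of a value once serialized to UTF-8 JSON. */
function serializedBytes(value: unknown): number {
  return new TextEncoder().encode(JSON.stringify(value)).length;
}

/**
 * Return the value inline when it fits, otherwise hand the JSON to
 * `storeBlob` (your own storage) and return only an ID reference.
 */
function toChunkSafeResult<T>(
  value: T,
  storeBlob: (json: string) => string,
): InlineResult<T> | ReferenceResult {
  const bytes = serializedBytes(value);
  if (bytes <= MAX_CHUNK_BYTES - HEADROOM_BYTES) {
    return { kind: "inline", value };
  }
  return { kind: "reference", id: storeBlob(JSON.stringify(value)), bytes };
}
```

A tool's `execute` could pass its result through a helper like this, so an oversized value becomes a small reference object that the frontend resolves separately.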
Catch with the `isChatChunkTooLargeError` guard and route oversized values out-of-band. See [Large payloads in chat.agent](/docs/ai-chat/patterns/large-payloads) for the ID-reference pattern that works around the cap, plus guidance on transient data parts and out-of-band logging.

## See also

* [`uiMessageStreamOptions.onError`](/docs/ai-chat/backend#error-handling-with-onerror) — stream error handler details
* [Custom actions](/docs/ai-chat/actions) — implement undo/retry actions
* [`chat.history`](/docs/ai-chat/backend#chat-history) — rollback to a previous message
* [Large payloads](/docs/ai-chat/patterns/large-payloads) — handling the \~1 MiB per-chunk cap
* [Database persistence](/docs/ai-chat/patterns/database-persistence) — saving conversation state
* [Standard task hooks](/docs/tasks/overview) — `onFailure`, `onComplete`, `onWait`, etc.

# Fast starts

Source: https://trigger.dev/docs/ai-chat/fast-starts

Two ways to cut first-turn TTFC: Preload eagerly triggers the run before the first message; Head Start runs step 1 in your warm server while the agent boots in parallel.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

The first turn of a brand-new conversation pays for the chat.agent run's cold start: dequeue, process boot, `onPreload` / `onChatStart` hooks, and only then the LLM call. Two features address this from different angles.
## Picking an approach | | [Preload](#preload) | [Head Start](#head-start) | | ---------------------------------- | -------------------------------------------------- | ----------------------------------------------------------------------------- | | **What it does** | Eagerly triggers the run before the first message | Runs step 1's LLM call in your warm process while the agent boots in parallel | | **First-turn TTFC win** | Hides agent boot if the user *does* send a message | \~50% reduction (LLM TTFB floor); boot fully overlaps with TTFB | | **When to fire** | Page load / input focus — your call | First message arrival — automatic | | **Cost when user never sends** | Idle compute until the preload window times out | Zero (no run was triggered) | | **Requires a warm server process** | No — works for browser-only surfaces | Yes — your route handler runs step 1 | | **Requires LLM keys client-side?** | No | No — keys stay in your warm server | | **Bundle constraints** | None | Route handler must import schema-only tools (no heavy executes) | **Pick one, not both.** Running both for the same chat is wasted work — Head Start gates on a real first message, so adding Preload on top eats the idle-compute cost Head Start was avoiding. **Use Preload** when the chat surface is browser-only, when you don't have a warm Node/Bun/Edge process serving the page, or when you can confidently predict the user *will* send a message (the run never goes idle). **Use Head Start** when the chat lives behind a warm server (Next.js App Router, Hono, SvelteKit, Workers, etc.) and you want first-turn TTFC down at the LLM TTFB floor without any speculative run. *** ## Preload Preload eagerly triggers a run for a chat before the first message is sent. Initialization (DB setup, context loading) happens while the user is still typing, reducing first-response latency. 
### Frontend Call `transport.preload(chatId)` to start a run early: ```tsx theme={"theme":"css-variables"} import { useEffect } from "react"; import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react"; import { useChat } from "@ai-sdk/react"; export function Chat({ chatId }) { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), clientData: { userId: currentUser.id }, }); // Preload on mount: run starts before the user types anything. // Trigger config (idleTimeoutInSeconds, machine, tags) lives in the // server action that wraps `chat.createStartSessionAction`. useEffect(() => { transport.preload(chatId); }, [chatId]); const { messages, sendMessage } = useChat({ id: chatId, transport }); // ... } ``` Preload is a no-op if a session already exists for this chatId. Your `accessToken` callback receives `{ chatId }` and is invoked the same way on preload as on any other refresh — no special branching by purpose. See [TriggerChatTransport options](/docs/ai-chat/reference#triggerchattransport-options). ### Backend The `onPreload` hook fires immediately. The run then waits for the first message. When the user sends a message, `onChatStart` fires with `preloaded: true` so you can skip work that already ran: ```ts theme={"theme":"css-variables"} export const myChat = chat.agent({ id: "my-chat", onPreload: async ({ chatId, clientData }) => { // Eagerly initialize: runs before the first message userContext.init(await loadUser(clientData.userId)); await db.chat.create({ data: { id: chatId } }); }, onChatStart: async ({ preloaded }) => { if (preloaded) return; // Already initialized in onPreload // ... 
fallback initialization for non-preloaded runs }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` With `chat.createSession()` or raw tasks, check `payload.trigger === "preload"` and wait for the first message: ```ts theme={"theme":"css-variables"} if (payload.trigger === "preload") { // Initialize early... const result = await chat.messages.waitWithIdleTimeout({ idleTimeoutInSeconds: 60, timeout: "1h", }); if (!result.ok) return; currentPayload = result.output; } ``` *** ## Head Start Head Start runs step 1's LLM call in your warm server process while the chat.agent run boots in parallel. The user sees one continuous turn: text first from your server, then a clean handover to the agent for tool execution and any further steps. `chat.headStart` returns a standard [Web Fetch API](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API) handler — `(req: Request) => Promise` — so it slots into any runtime that speaks Web Fetch. **Verified runtimes:** Node 18+, Bun, Deno, Cloudflare Workers, Vercel (Node and Edge), Netlify (Functions and Edge). The handler uses only `fetch` and Web `ReadableStream` / `TransformStream` (no `node:*` imports), and the S2 streaming dependency picks the right transport for each runtime automatically (HTTP/2 on Node/Deno, HTTP/1.1 on Bun/Workers/browsers). **Compatible frameworks (native Web Fetch):** Next.js App Router, Hono, SvelteKit, Remix, React Router v7, TanStack Start, Astro, Nitro/Nuxt, Elysia. Mount the handler directly. **Node-only frameworks (Express, Fastify, Koa):** the handler still works, but the framework gives you a Node `IncomingMessage` instead of a Web `Request`. Use a small adapter — examples in [Mounting in your framework](#mounting-in-your-framework) below. When the first turn is pure text (no tool calls), the agent run boots and exits without ever calling an LLM. You only pay for what the conversation actually needed. 
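The split described above can be sketched as a pure decision function. The signal names `handover` and `handover-skip` are the ones this page uses; `decideHandover` and its types are hypothetical and only illustrate the branching, not the SDK's internals. The sketch also assumes step 1 ends with one of the two finish reasons discussed here.

```typescript
// Hypothetical illustration of the step-1 outcome; not the SDK's implementation.
type Step1FinishReason = "tool-calls" | "stop";

type HandoverDecision =
  | { signal: "handover"; agentCallsLlm: true }
  | { signal: "handover-skip"; agentCallsLlm: false };

function decideHandover(finishReason: Step1FinishReason): HandoverDecision {
  // Tool calls need the agent's executes, so the agent takes over for step 2+.
  if (finishReason === "tool-calls") {
    return { signal: "handover", agentCallsLlm: true };
  }
  // Pure text: the turn is already finished, so the agent run exits without an LLM call.
  return { signal: "handover-skip", agentCallsLlm: false };
}
```

Only the `tool-calls` branch costs an LLM call on the agent side, which is why a pure-text first turn is billed for step 1 alone.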
### Measured TTFC

3 runs each, prompt `"say hi in five words"`, same model both sides (Anthropic Claude Sonnet 4):

|              | Without Head Start | With Head Start | Δ        |
| ------------ | ------------------ | --------------- | -------- |
| TTFT (avg)   | 2801 ms            | **1218 ms**     | **−57%** |
| TTFT (range) | 2351–3101 ms       | 1201–1252 ms    |          |
| Total turn   | 4180 ms            | 2345 ms         | −44%     |

With Head Start, time-to-first-text is essentially the LLM TTFB floor (50ms spread). Without it, agent boot + hooks stack before the LLM call, adding 750ms of variance.

### How it works

```mermaid theme={"theme":"css-variables"}
sequenceDiagram
  autonumber
  participant B as Browser
  participant H as Route handler<br/>(your warm server)
  participant T as chat.agent run<br/>(Trigger.dev)
  B->>H: POST first message<br/>(headStart URL)
  par Step 1 + agent boot in parallel
    H->>H: streamText step 1<br/>(your model, schema-only tools)
    H-->>B: SSE: step 1 chunks
  and
    H->>T: createSession + trigger run
    T->>T: boot → wait on session.in
  end
  alt finishReason: tool-calls
    H->>T: handover signal<br/>(partial assistant message)
    T->>T: execute tools, run step 2 LLM
    T-->>H: chunks via session.out
    H-->>B: SSE: step 2 chunks
    T-->>H: trigger:turn-complete
  else finishReason: stop (pure text)
    H->>T: handover-skip signal
    T->>T: exit (no LLM call)
  end
  H-->>B: SSE close
  Note over B,T: Subsequent turns bypass the handler:<br/>browser writes directly to session.in
```

1. The transport sees `headStart: "/api/chat"` is set and there's no session yet for this chat. It POSTs the wire payload (messages, chatId, metadata) to your route handler.
2. A single `apiClient.createSession` round-trip both creates the chat session and triggers an agent run with `trigger: "handover-prepare"`. The agent run boots into a wait state on `session.in`.
3. `streamText` runs in your warm process with `stopWhen: stepCountIs(1)`. The output is streamed to the browser as SSE while the agent run boots in parallel. Boot time (\~488ms) overlaps with LLM TTFB (\~389ms), fully hidden.
4. On step 1's `tool-calls` finish, your handler signals the agent and the SDK splices the agent's step-2+ stream into the same SSE response. On pure-text finish, your handler signals `handover-skip` and the agent run exits clean — no LLM call from the trigger side.
5. After turn 1, the transport hydrates the session PAT from response headers and writes turn 2 onward directly to `session.in`. Same direct-trigger path as a regular `chat.agent` setup.

### Setup

**Bundle isolation is the load-bearing constraint.** Head Start only saves time because your route-handler bundle stays lightweight. Anything you import in that handler — and anything those modules import transitively — lands in the bundle. If your tool catalog with heavy `execute` fns (E2B, Puppeteer, native bindings, the trigger SDK runtime, Turndown, image processing, `node:child_process`) ends up in the bundle, you've put cold-start back into a different process.

This is an **import-chain** problem, not a runtime one. A "we'll strip the executes at runtime" helper would not fix it — bundlers resolve imports at build time. The only correct shape is to keep schemas in their own module that imports `ai` and `zod` only.

Schemas in one module (light deps), executes in another (heavy deps). The agent task pulls in both; the route handler pulls in schemas only.
```ts lib/chat-tools/schemas.ts theme={"theme":"css-variables"} // ⚠️ This file MUST NOT import anything heavier than `ai` and `zod`. // Any import here lands in the route-handler bundle. import { tool } from "ai"; import { z } from "zod"; export const fetchPage = tool({ description: "Fetch a URL and return text", inputSchema: z.object({ url: z.string().url() }), // No execute — agent task adds it elsewhere. }); export const headStartTools = { fetchPage }; ``` ```ts trigger/chat-tools.ts theme={"theme":"css-variables"} // Heavy deps live here. Only the trigger task imports this module. import { tool } from "ai"; import TurndownService from "turndown"; import { fetchPage as fetchPageSchema } from "@/lib/chat-tools/schemas"; const turndown = new TurndownService(); export const fetchPage = tool({ ...fetchPageSchema, execute: async ({ url }) => { const res = await fetch(url); return { body: turndown.turndown(await res.text()) }; }, }); export const chatTools = { fetchPage }; ``` The agent uses the full tool set — these are the executes that run when step 2+ needs them. ```ts trigger/chat.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { streamText, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { chatTools } from "./chat-tools"; export const myChat = chat.agent({ id: "my-chat", run: async ({ messages, signal }) => streamText({ ...chat.toStreamTextOptions({ tools: chatTools }), model: anthropic("claude-sonnet-4-6"), messages, stopWhen: stepCountIs(10), abortSignal: signal, }), }); ``` Call `chat.headStart({ agentId, run })`. It returns a standard Web Fetch handler: `(req: Request) => Promise`. Inside the `run` callback you call `streamText` yourself and spread `chat.toStreamTextOptions({ tools })` to inherit the SDK-owned wiring (messages, schema-only tools, `stopWhen: stepCountIs(1)`, abort signal). Add your own `model` and `system` on top. 
```ts lib/chat-handler.ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/chat-server";
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { headStartTools } from "@/lib/chat-tools/schemas";

export const chatHandler = chat.headStart({
  agentId: "my-chat",
  run: async ({ chat: helper }) =>
    streamText({
      // Inherits messages, schema-only tools, `stopWhen: stepCountIs(1)`, and the abort signal.
      ...helper.toStreamTextOptions({ tools: headStartTools }),
      model: anthropic("claude-sonnet-4-6"),
      system: "You are a helpful assistant.",
    }),
});
```

Use the **same model** on both sides (route handler and `chat.agent`) to avoid a tone or style shift between step 1 and step 2+. Your LLM provider keys stay server-side in your warm process — Trigger.dev never holds them in this design.

Mount the handler in whatever framework you use — see [Mounting in your framework](#mounting-in-your-framework) below.

Add `headStart: "/api/chat"` to `useTriggerChatTransport`. Subsequent turns bypass this URL automatically — `accessToken` and (optionally) `startSession` still run for the direct-trigger path on turn 2 onward.

```tsx components/chat.tsx theme={"theme":"css-variables"}
const transport = useTriggerChatTransport({
  task: "my-chat",
  accessToken: ({ chatId }) => mintChatAccessToken(chatId),
  startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  headStart: "/api/chat",
});
```

### Mounting in your framework

`chat.headStart` returns a Web Fetch handler — `(req: Request) => Promise<Response>`. Frameworks that natively pass Web `Request` objects mount it as-is. Node-only frameworks (Express, Fastify, Koa) need a small adapter.

#### Web Fetch frameworks (recommended)

```ts Next.js (App Router) theme={"theme":"css-variables"}
// app/api/chat/route.ts
import { chatHandler } from "@/lib/chat-handler";

export const POST = chatHandler;

// Default function timeout on Vercel is 10s.
Bump if your turns // run long (multi-step tool use, slow models): // export const maxDuration = 60;
```

```ts Hono theme={"theme":"css-variables"}
// src/index.ts
import { Hono } from "hono";
import { chatHandler } from "./chat-handler";

const app = new Hono();
app.post("/api/chat", (c) => chatHandler(c.req.raw));

export default app;
```

```ts SvelteKit theme={"theme":"css-variables"}
// src/routes/api/chat/+server.ts
import type { RequestHandler } from "./$types";
import { chatHandler } from "$lib/chat-handler";

export const POST: RequestHandler = ({ request }) => chatHandler(request);
```

```ts Remix / React Router v7 theme={"theme":"css-variables"}
// app/routes/api.chat.ts
import type { ActionFunctionArgs } from "@remix-run/node";
import { chatHandler } from "~/lib/chat-handler";

export async function action({ request }: ActionFunctionArgs) {
  return chatHandler(request);
}
```

```ts TanStack Start theme={"theme":"css-variables"}
// app/routes/api/chat.ts
import { createAPIFileRoute } from "@tanstack/start/api";
import { chatHandler } from "~/lib/chat-handler";

export const Route = createAPIFileRoute("/api/chat")({
  POST: ({ request }) => chatHandler(request),
});
```

```ts Astro theme={"theme":"css-variables"}
// src/pages/api/chat.ts
import type { APIRoute } from "astro";
import { chatHandler } from "../../lib/chat-handler";

export const POST: APIRoute = ({ request }) => chatHandler(request);
```

```ts Nitro / Nuxt theme={"theme":"css-variables"}
// server/api/chat.post.ts
import { chatHandler } from "~/lib/chat-handler";

export default defineEventHandler((event) => chatHandler(toWebRequest(event)));
```

```ts Elysia theme={"theme":"css-variables"}
// src/index.ts
import { Elysia } from "elysia";
import { chatHandler } from "./chat-handler";

new Elysia()
  .post("/api/chat", ({ request }) => chatHandler(request))
  .listen(3000);
```

#### Edge / standalone runtimes

```ts Cloudflare Workers theme={"theme":"css-variables"}
// src/index.ts
import { chatHandler } from "./chat-handler";

export default {
  async fetch(req: Request): Promise<Response> {
    const url = new URL(req.url);
    if (req.method === "POST" && url.pathname === "/api/chat") {
      return chatHandler(req);
    }
    return new Response("Not found", { status: 404 });
  },
};
```

```ts Bun (native server) theme={"theme":"css-variables"}
// server.ts
import { chatHandler } from "./chat-handler";

Bun.serve({
  port: 3000,
  async fetch(req) {
    const url = new URL(req.url);
    if (req.method === "POST" && url.pathname === "/api/chat") {
      return chatHandler(req);
    }
    return new Response("Not found", { status: 404 });
  },
});
```

```ts Deno (Deno.serve) theme={"theme":"css-variables"}
// server.ts
import { chatHandler } from "./chat-handler.ts";

Deno.serve({ port: 3000 }, async (req) => {
  const url = new URL(req.url);
  if (req.method === "POST" && url.pathname === "/api/chat") {
    return chatHandler(req);
  }
  return new Response("Not found", { status: 404 });
});
```

#### Node-only frameworks

Express, Fastify, and Koa pass Node `IncomingMessage` / `ServerResponse` objects rather than Web `Request` / `Response`. The SDK ships `chat.toNodeListener`, which wraps any Web Fetch handler as a Node `(req, res)` listener: body bytes are read upfront, headers are translated, the response body is streamed chunk by chunk, and client disconnect is propagated to the handler via `AbortSignal`.
```ts Express theme={"theme":"css-variables"} import express from "express"; import { chat } from "@trigger.dev/sdk/chat-server"; import { chatHandler } from "./chat-handler"; const app = express(); app.post("/api/chat", chat.toNodeListener(chatHandler)); app.listen(3000); ``` ```ts Fastify theme={"theme":"css-variables"} import Fastify from "fastify"; import { chat } from "@trigger.dev/sdk/chat-server"; import { chatHandler } from "./chat-handler"; const fastify = Fastify(); const listener = chat.toNodeListener(chatHandler); fastify.post("/api/chat", (req, reply) => { // Hand the raw Node request/response to the adapter and tell // Fastify we'll handle the response ourselves (no auto-reply). reply.hijack(); return listener(req.raw, reply.raw); }); fastify.listen({ port: 3000 }); ``` ```ts Koa theme={"theme":"css-variables"} import Koa from "koa"; import Router from "@koa/router"; import { chat } from "@trigger.dev/sdk/chat-server"; import { chatHandler } from "./chat-handler"; const app = new Koa(); const router = new Router(); const listener = chat.toNodeListener(chatHandler); router.post("/api/chat", async (ctx) => { ctx.respond = false; // Tell Koa not to send the response itself. await listener(ctx.req, ctx.res); }); app.use(router.routes()).listen(3000); ``` ```ts Raw node:http theme={"theme":"css-variables"} import http from "node:http"; import { chat } from "@trigger.dev/sdk/chat-server"; import { chatHandler } from "./chat-handler"; const listener = chat.toNodeListener(chatHandler); http .createServer((req, res) => { if (req.method === "POST" && req.url === "/api/chat") { return listener(req, res); } res.statusCode = 404; res.end(); }) .listen(3000); ``` Don't run `express.json()` (or any body-parsing middleware) before the head-start route — it consumes the request body before `chat.toNodeListener` can read the raw bytes. Either skip the parser for this route, or scope it to other routes. 
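To make the adapter's behavior concrete, here is a minimal sketch of what a Node-to-Fetch bridge of this kind does. It is not the SDK's implementation of `chat.toNodeListener`; the names `toWebHeaders` and `toNodeListenerSketch` are hypothetical, and details such as multi-value `Set-Cookie` handling are left out. It also shows why a body parser must not run first: the bridge reads the raw bytes from the request stream itself.

```typescript
import type { IncomingMessage, ServerResponse } from "node:http";

type FetchHandler = (req: Request) => Promise<Response>;

// Translate Node's header map (string | string[] | undefined values) into Web Headers.
function toWebHeaders(nodeHeaders: IncomingMessage["headers"]): Headers {
  const headers = new Headers();
  for (const [name, value] of Object.entries(nodeHeaders)) {
    if (value === undefined) continue;
    for (const item of Array.isArray(value) ? value : [value]) {
      headers.append(name, item);
    }
  }
  return headers;
}

// Hypothetical stand-in for an adapter like chat.toNodeListener.
function toNodeListenerSketch(handler: FetchHandler) {
  return async (req: IncomingMessage, res: ServerResponse): Promise<void> => {
    // Read the raw body bytes upfront. If a body parser already consumed
    // the stream, nothing is left to read here.
    const chunks: Buffer[] = [];
    for await (const chunk of req) {
      chunks.push(typeof chunk === "string" ? Buffer.from(chunk) : chunk);
    }

    // Abort the handler if the client disconnects before the response ends.
    const controller = new AbortController();
    res.on("close", () => {
      if (!res.writableEnded) controller.abort();
    });

    const hasBody = req.method !== "GET" && req.method !== "HEAD";
    const request = new Request(`http://${req.headers.host ?? "localhost"}${req.url ?? "/"}`, {
      method: req.method,
      headers: toWebHeaders(req.headers),
      body: hasBody ? Buffer.concat(chunks) : undefined,
      signal: controller.signal,
    });

    const response = await handler(request);
    res.statusCode = response.status;
    response.headers.forEach((value, name) => res.setHeader(name, value));

    // Stream the response body chunk by chunk instead of buffering it.
    if (response.body) {
      const reader = response.body.getReader();
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        res.write(value);
      }
    }
    res.end();
  };
}
```

In practice, prefer the SDK's `chat.toNodeListener`; the sketch only illustrates the moving parts the paragraph above describes.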
#### Streaming response timeouts The handler keeps the SSE response open until the agent run signals turn-complete (or skip, on a pure-text turn). Make sure your framework / serverless function timeout accommodates that: * **Pure-text first turns**: \~LLM TTFB (1–3 s typically). * **Tool-calling first turns**: LLM step 1 + agent boot + tool execution + step 2 LLM call. Usually 5–15 s; longer for multi-step tool use. * **Vercel**: default function timeout is 10 s on Hobby, 60 s on Pro. Set `export const maxDuration = N;` on the route segment. * **Cloudflare Workers**: default 30 s CPU time (paid plans up to 5 min). Streaming wall time is generally not the bottleneck. * **AWS Lambda behind API Gateway**: 29 s API Gateway hard limit; Lambda Function URL allows up to 15 min. ### What gets routed where | | First turn (handover) | Subsequent turns | | --------------------------------------- | ----------------------------------------------------------------- | ---------------------------- | | Browser sends message via | POST to `headStart` URL | Direct write to `session.in` | | Step 1 LLM call runs in | Your warm process | Trigger.dev agent run | | Tool execution runs in | Trigger.dev agent run | Trigger.dev agent run | | Step 2+ LLM call runs in | Trigger.dev agent run | Trigger.dev agent run | | `onChatStart` / `onTurnStart` fire | After handover signal arrives | Normally | | `hydrateMessages` fires (if registered) | After handover, with the first-turn history as `incomingMessages` | Normally | | `onTurnComplete` fires | After turn finishes (handover) or skipped (handover-skip) | Normally | ### Persistence and the handover contract A head-start turn persists exactly like a normal turn — the handover machinery is invisible to your hooks. 
The guarantees: * **One stable assistant `messageId` across the whole turn.** The route handler generates the id, the handover signal carries it to the agent, and the agent's step 2+ stream reuses it — so the browser merges step 1 and step 2+ into a single assistant message, and you can merge-by-id when persisting. * **`onTurnComplete` is the canonical persistence point**, same as any turn. It carries the full assistant message under that one id: step-1 text, reasoning, and tool calls plus step-2+ tool results and text. The [database persistence](/docs/ai-chat/patterns/database-persistence) patterns apply unchanged. * **Reasoning parts survive the handover.** When step 1 runs on an extended-thinking model, the reasoning streamed by your route handler lands in the durable session history (and `onTurnComplete`) under the same `messageId`, with provider metadata intact — Anthropic thinking signatures survive a replay back to the model. Step-2 reasoning appends to the same message rather than replacing it. #### With `hydrateMessages` Head Start composes with [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages). On the first turn, the hook receives the route handler's first-turn history as `incomingMessages` — the canonical upsert-and-return pattern persists the user message exactly as it would on a direct-trigger turn. The runtime splices the warm handler's partial assistant onto your hydrated chain after the hook returns, deduplicated by the assistant `messageId`, so your hook never needs to include the in-flight partial. **Hydrate hooks must upsert their conversation row, not update it.** Head-start turns skip preload entirely, so row-creating hooks (`onPreload`, or an `onChatStart` create) have not run when `hydrateMessages` first fires. A bare `UPDATE` against a missing row throws and errors the turn. 
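The upsert requirement can be illustrated with a small sketch. An in-memory `Map` stands in for your database here, and `updateConversation`, `upsertConversation`, and `mergeById` are hypothetical helper names, not SDK APIs. The bare update fails on a head-start first turn because no row exists yet, while the upsert creates the row on demand and merges messages by `id`.

```typescript
type StoredMessage = { id: string; role: "user" | "assistant"; text: string };
type ConversationRow = { chatId: string; messages: StoredMessage[] };

// In-memory stand-in for a conversations table (assumption for illustration).
const rows = new Map<string, ConversationRow>();

// Replace messages that share an id, append the ones that are new.
function mergeById(existing: StoredMessage[], incoming: StoredMessage[]): StoredMessage[] {
  const merged = [...existing];
  for (const message of incoming) {
    const index = merged.findIndex((m) => m.id === message.id);
    if (index === -1) merged.push(message);
    else merged[index] = message;
  }
  return merged;
}

// Behaves like a bare UPDATE: throws when the row was never created.
function updateConversation(chatId: string, incoming: StoredMessage[]): ConversationRow {
  const row = rows.get(chatId);
  if (!row) throw new Error(`No conversation row for chat ${chatId}`);
  row.messages = mergeById(row.messages, incoming);
  return row;
}

// Behaves like an UPSERT: creates the row if it is missing, then merges.
function upsertConversation(chatId: string, incoming: StoredMessage[]): ConversationRow {
  const row = rows.get(chatId) ?? { chatId, messages: [] };
  row.messages = mergeById(row.messages, incoming);
  rows.set(chatId, row);
  return row;
}
```

With a real database the same idea maps to `INSERT ... ON CONFLICT DO UPDATE` or your ORM's `upsert` call.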
Your hydrate hook shapes **model context**, not the transcript — dropping reasoning-only entries or unresolved tool rows from the returned chain is fine and does not affect what `onTurnComplete` persists or what the UI renders. ### The `chat.headStart` API ```ts theme={"theme":"css-variables"} chat.headStart({ agentId: string, // The chat.agent({ id }) you're handing off to run: (args: HeadStartRunArgs) => Promise>, idleTimeoutInSeconds?: number, // How long the agent waits for the handover signal. Default: 60 }): (req: Request) => Promise ``` The `run` callback receives: * `messages: UIMessage[]` — user messages parsed from the request body. * `signal: AbortSignal` — fires when the request closes or the SDK times out the handover. * `chat: HeadStartChatHelper` — exposes `chat.toStreamTextOptions({ tools })` and a `chat.session` escape hatch for power users. `chat.toStreamTextOptions({ tools })` returns options to spread into `streamText`. The SDK owns these keys — overriding them will break the protocol: | Key | What the SDK sets | Why | | ------------- | ------------------------------------ | ------------------------------------ | | `messages` | `convertToModelMessages(uiMessages)` | First-turn user history | | `tools` | What you pass | Schema-only tools for step 1 | | `stopWhen` | `stepCountIs(1)` | Step 1 only — agent picks up step 2+ | | `abortSignal` | Combined request + idle timeout | Safe cleanup on disconnect | You bring `model`, `system`, `providerOptions`, `prepareStep`, anything else `streamText` accepts. #### The transport option ```ts theme={"theme":"css-variables"} useTriggerChatTransport({ // ... task, accessToken, startSession, ... headStart?: string, // URL of your chat.headStart route handler }); ``` Optional. When set, the FIRST message of a brand-new chat (no existing session state) routes through this URL. Subsequent turns bypass it and use the direct-trigger path. 
This is **not** a stock `useChat` `endpoint` — it's not the canonical request URL for every turn, just the first-turn shortcut.

### Limitations

* **First turn only.** Step 2+ and turn 2+ run on the trigger side. There's no per-turn "head start every turn" mode — the win comes from amortizing agent boot across the LLM call once.
* **Single step on the warm-server side.** The handler runs `stopWhen: stepCountIs(1)`. Multi-step handover (handler does step 1 + step 2 + ...) is out of scope.
* **Your server needs an LLM provider key.** The first-turn LLM call runs in your warm process, so that environment needs whatever keys the model requires. The agent's tool `execute` functions still run on the Trigger.dev side with whatever environment variables they need there.
* **Browser-only chat surfaces don't apply.** Without a warm server process, there's nowhere to run step 1 ahead of the agent run. Use [Preload](#preload) or eat the cold-start tax.
* **Streaming-capable runtime required.** Your framework / runtime has to support streaming HTTP responses (Web Fetch `Response` body or equivalent). Most modern hosts do — Next.js, Hono, SvelteKit, Workers, Bun, Deno, Vercel, etc. Some legacy platforms that buffer full responses won't deliver chunks until the turn is over, which negates the TTFC benefit (correctness still holds).
* **Non-`useChat` chat surfaces** (Slack bots, Discord bots, custom protocols) don't fit the `chat.headStart` shape — the API expects the AI SDK transport's wire payload on input. For those, trigger the `chat.agent` directly from your bot handler.

## Reference

* [`chat.headStart` factory and types](/docs/ai-chat/reference) — full signatures for `HeadStartRunArgs`, `HeadStartChatHelper`, `HeadStartSession`, `HeadStartHandlerOptions`.
* [`headStart` transport option](/docs/ai-chat/reference#triggerchattransport-options) — alongside `accessToken`, `startSession`, etc.
* [`onPreload` hook](/docs/ai-chat/lifecycle-hooks#onpreload) — the backend hook that fires when a run is preloaded. # Frontend Source: https://trigger.dev/docs/ai-chat/frontend Transport setup, session management, client data, and frontend patterns for AI Chat. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. ## How the transport works Vanilla `useChat` expects an `api` URL — it POSTs the conversation to your own Next.js route handler, which terminates the stream. `useTriggerChatTransport` replaces that round-trip: instead of an `api` URL, you pass a custom [`ChatTransport`](https://ai-sdk.dev/docs/ai-sdk-ui/transport) that talks directly to the Trigger.dev cloud (or your self-hosted webapp) on behalf of `useChat`. There's no API route to maintain. The browser uses a short-lived session-scoped PAT (minted by your `accessToken` server action) to: * **Create the session** via your `startSession` action on the first message (or `transport.preload(chatId)`). * **Append the new user message** to the session's durable `.in` stream. * **Subscribe to the `.out` SSE stream** for the agent's response chunks (text, tool calls, reasoning, custom `data-*` parts). The transport handles the auth refresh, reconnect, `Last-Event-ID` resume, and stop-signal plumbing transparently. `useChat` sees the result as `UIMessageChunk`s and renders them unchanged. 
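The `Last-Event-ID` resume mentioned above follows the standard Server-Sent Events model: each event may carry an `id`, and a reconnecting client sends the last id it saw so the server can continue from there. The sketch below is a simplified illustration of that bookkeeping, not the transport's actual code; `parseSse`, `nextLastEventId`, and `resumeHeaders` are hypothetical names, and the parser ignores SSE fields other than `id` and `data`.

```typescript
type SseEvent = { id?: string; data: string };

// Parse a complete SSE payload into events (blank line = event boundary).
function parseSse(raw: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of raw.split(/\r?\n\r?\n/)) {
    let id: string | undefined;
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("id:")) id = line.slice(3).trim();
      else if (line.startsWith("data:")) data.push(line.slice(5).replace(/^ /, ""));
    }
    if (data.length > 0) events.push({ id, data: data.join("\n") });
  }
  return events;
}

// Track the most recent event id so a later subscription can resume after it.
function nextLastEventId(previous: string | undefined, events: SseEvent[]): string | undefined {
  let last = previous;
  for (const event of events) {
    if (event.id !== undefined) last = event.id;
  }
  return last;
}

// Headers for a resumed subscription; empty on a first connection.
function resumeHeaders(lastEventId: string | undefined): Record<string, string> {
  return lastEventId === undefined ? {} : { "Last-Event-ID": lastEventId };
}
```

Persisting the value returned by `nextLastEventId` is what lets a fresh tab reopen the stream without replaying chunks it has already rendered.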
## Transport setup

Use the `useTriggerChatTransport` hook from `@trigger.dev/sdk/chat/react` to create a memoized transport instance, then pass it to `useChat`:

```tsx theme={"theme":"css-variables"}
import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";
import { useChat } from "@ai-sdk/react";
import type { myChat } from "@/trigger/chat";
import { mintChatAccessToken, startChatSession } from "@/app/actions";

export function Chat() {
  const transport = useTriggerChatTransport<typeof myChat>({
    task: "my-chat",
    accessToken: ({ chatId }) => mintChatAccessToken(chatId),
    startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  });

  const { messages, sendMessage, stop, status } = useChat({ transport });

  // ... render UI
}
```

The transport is created once on first render and reused across re-renders. Pass a type parameter (`typeof myChat` above) for compile-time validation of the task ID.

The two callbacks have distinct responsibilities:

* **`accessToken`** is a *pure* PAT mint — the transport invokes it on a 401/403 to refresh the session-scoped token. Your server action wraps `auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId } } })`, which returns a `Promise<string>` (the JWT). Return that string from your `accessToken` callback.
* **`startSession`** wraps `chat.createStartSessionAction(taskId)` and is called when the transport needs to *create* the session (`transport.preload(chatId)`, or lazily on the first `sendMessage` for a chatId without a cached PAT). Your server controls authorization here, alongside any DB writes paired with session creation.

See [Quick start](/docs/ai-chat/quick-start) for the matching server actions.

The hook keeps `onSessionChange` and `clientData` up to date via internal refs, so you don't need to memoize callbacks or worry about stale closures when those options change between renders.
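The "internal refs" idea can be shown without React. In this sketch, which is an illustration rather than the hook's real internals, the transport object is created once but reads its options through a mutable ref at call time, so it always sees the latest `clientData`. `LatestRef`, `createTransportSketch`, and `buildAppendBody` are hypothetical names.

```typescript
type ClientData = Record<string, unknown>;
type TransportOptions = { clientData?: ClientData };
type LatestRef<T> = { current: T };

// Created once. Options are read through the ref on every call,
// so later updates are visible without recreating the transport.
function createTransportSketch(optionsRef: LatestRef<TransportOptions>) {
  return {
    buildAppendBody(text: string) {
      return { text, metadata: { ...optionsRef.current.clientData } };
    },
  };
}

// Usage: the "render" step only swaps ref.current; the transport stays the same.
const optionsRef: LatestRef<TransportOptions> = { current: { clientData: { userId: "user-a" } } };
const transportSketch = createTransportSketch(optionsRef);
const before = transportSketch.buildAppendBody("hello");

optionsRef.current = { clientData: { userId: "user-b" } };
const after = transportSketch.buildAppendBody("hello again");
```

Had `createTransportSketch` captured `clientData` by value instead of reading the ref, `after` would still carry the first user's id, which is the stale-closure problem the hook avoids.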
## Typed messages (`chat.withUIMessage`)

If your chat agent is defined with [`chat.withUIMessage()`](/docs/ai-chat/types) (custom `data-*` parts, typed tools, etc.), pass the same message type through `useChat` so `messages` and `message.parts` are narrowed on the client:

```tsx theme={"theme":"css-variables"}
import { useChat } from "@ai-sdk/react";
import { useTriggerChatTransport, type InferChatUIMessage } from "@trigger.dev/sdk/chat/react";
import { mintChatAccessToken, startChatSession } from "@/app/actions";
import type { myChat } from "./myChat";

// Derive the UI message type from the agent definition
type Msg = InferChatUIMessage<typeof myChat>;

const transport = useTriggerChatTransport<typeof myChat>({
  task: "my-chat",
  accessToken: ({ chatId }) => mintChatAccessToken(chatId),
  startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
});

const { messages } = useChat<Msg>({ transport });
```

See the [Types](/docs/ai-chat/types) guide for defining `YourUIMessage`, default stream options, and backend examples.

### Calling a fetch endpoint instead of a server action

If you want to mint tokens via a REST endpoint instead of a Next.js server action, the same callbacks accept any async function. Import `AccessTokenParams` and `StartSessionParams` from `@trigger.dev/sdk/chat` to type your fetch handler.
```ts theme={"theme":"css-variables"} import type { AccessTokenParams, StartSessionParams } from "@trigger.dev/sdk/chat"; const transport = useTriggerChatTransport({ task: "my-chat", accessToken: async ({ chatId }: AccessTokenParams) => { const res = await fetch(`/api/chat/${chatId}/access-token`, { method: "POST" }); return res.text(); }, startSession: async ({ chatId, taskId, clientData }: StartSessionParams) => { const res = await fetch(`/api/chat/${chatId}/start`, { method: "POST", body: JSON.stringify({ taskId, clientData }), }); return res.json(); // { publicAccessToken: string } }, }); ``` The fetch handlers on the server side wrap the same SDK helpers as the server-action variant: `auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId } } })` for refresh and `chat.createStartSessionAction(taskId)` for create. ## Session management Every chat is backed by a durable Session — the row that owns the chat's runs, persists across run lifecycles, and orchestrates handoffs. The transport manages the session for you; what you persist on your side is a small piece of state per chat that lets a fresh tab resume without a round-trip to create a new session. ### What the transport persists per chat | Field | Type | Notes | | ------------------- | ---------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `publicAccessToken` | `string` | Session-scoped JWT (`read:sessions:{chatId} + write:sessions:{chatId}`). Refreshed automatically on 401/403 via `accessToken`. | | `lastEventId` | `string \| undefined` | Last SSE event received on `.out`. 
**Valid for the lifetime of the Session** — keep it across `endRun` / `requestUpgrade` / continuation-run boundaries; only clear when the Session itself closes. The cursor lets the next subscription open past the prior turn's stale `turn-complete` record. | | `isStreaming` | `boolean \| undefined` | **Optional.** The transport sets it internally, but you don't have to persist it — the server decides "nothing is streaming" via the session's [`X-Session-Settled`](/docs/ai-chat/client-protocol#x-session-settled-fast-close-on-idle-reconnects) signal on reconnect. If you do persist it, the transport keeps the fast-path short-circuit. If you drop it, reconnects open the SSE and close fast on settled sessions. | ### Session cleanup (frontend) Since session creation and updates are handled server-side, the frontend only needs to handle session deletion when a run ends: ```tsx theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), sessions: loadedSessions, // Restored from DB on page load onSessionChange: (chatId, session) => { if (!session) { deleteSession(chatId); // Server action — run ended } }, }); ``` ### Restoring on page load On page load, fetch both the messages and the session state from your database, then pass them to `useChat` and the transport. Pass `resume: true` to `useChat` when there's an existing conversation — this tells the AI SDK to reconnect to the stream via the transport. Because the underlying Session row outlives individual runs, a chat you were in yesterday resumes against the same chat — even if the original run has long since exited. 
The transport hydrates from the persisted state and uses `lastEventId` to resubscribe; if the client tries to send a new message and no run is alive, the server triggers a fresh continuation run on the same session before the message is appended.

```tsx app/chat/[chatId]/ChatPage.tsx theme={"theme":"css-variables"}
"use client";

import { useEffect, useState } from "react";
import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";
import { useChat } from "@ai-sdk/react";
import type { UIMessage } from "ai";
import {
  mintChatAccessToken,
  startChatSession,
  getChatMessages,
  getSession,
  deleteSession,
} from "@/app/actions";

// Session shape as returned by your own `getSession` server action
type StoredSession = NonNullable<Awaited<ReturnType<typeof getSession>>>;
type Sessions = Record<string, StoredSession>;

// Rendered from `app/chat/[chatId]/page.tsx`, which awaits `params`
// and forwards `chatId` into this client component:
//
//   export default async function Page({ params }: { params: Promise<{ chatId: string }> }) {
//     const { chatId } = await params;
//     return <ChatPage chatId={chatId} />;
//   }

export default function ChatPage({ chatId }: { chatId: string }) {
  const [initialMessages, setInitialMessages] = useState<UIMessage[]>([]);
  const [initialSession, setInitialSession] = useState<Sessions | undefined>(undefined);
  const [loaded, setLoaded] = useState(false);

  useEffect(() => {
    async function load() {
      const [messages, session] = await Promise.all([getChatMessages(chatId), getSession(chatId)]);
      setInitialMessages(messages);
      setInitialSession(session ? { [chatId]: session } : undefined);
      setLoaded(true);
    }
    load();
  }, [chatId]);

  if (!loaded) return null;

  return (
    <ChatClient
      chatId={chatId}
      initialMessages={initialMessages}
      initialSessions={initialSession}
    />
  );
}

function ChatClient({
  chatId,
  initialMessages,
  initialSessions,
}: {
  chatId: string;
  initialMessages: UIMessage[];
  initialSessions?: Sessions;
}) {
  const transport = useTriggerChatTransport({
    task: "my-chat",
    accessToken: ({ chatId }) => mintChatAccessToken(chatId),
    startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
    sessions: initialSessions,
    onSessionChange: (id, session) => {
      if (!session) deleteSession(id);
    },
  });

  const { messages, sendMessage, stop, status } = useChat({
    id: chatId,
    messages: initialMessages,
    transport,
    resume: initialMessages.length > 0, // Resume if there's an existing conversation
  });

  // ... render UI
}
```

`resume: true` causes `useChat` to call `reconnectToStream` on the transport when the component mounts. The transport uses the session's `lastEventId` to skip past already-seen stream events, so the frontend only receives new data. Only enable `resume` when there are existing messages — for brand new chats, there's nothing to reconnect to.

After resuming, `useChat`'s built-in `stop()` won't send the stop signal to the backend because the AI SDK doesn't pass its abort signal through `reconnectToStream`. Use `transport.stopGeneration(chatId)` for reliable stop behavior after resume — see [Stop generation](#stop-generation) for the recommended pattern.

In React strict mode (enabled by default in Next.js dev), you may see a `TypeError: Cannot read properties of undefined (reading 'state')` in the console when using `resume`. This is a [known bug in the AI SDK](https://github.com/vercel/ai/issues/8477) caused by React strict mode double-firing the resume effect. The error is caught internally and **does not affect functionality** — streaming and message display work correctly. It only appears in development and will not occur in production builds.

### Network resilience

You don't need to handle network drops, mobile background-kills, or Safari bfcache restores. The transport retries indefinitely with bounded backoff, reconnects on `online` / tab refocus / `pageshow` with `event.persisted`, and uses `Last-Event-ID` to resume without dropping chunks. See the [changelog entry](/docs/ai-chat/changelog) for the gory details.

## Client data and metadata

### Transport-level client data

Set default client data on the transport that's included in every request.
When the task uses `clientDataSchema`, this is type-checked to match:

```ts theme={"theme":"css-variables"}
const transport = useTriggerChatTransport<typeof myChat>({
  task: "my-chat",
  accessToken: ({ chatId }) => mintChatAccessToken(chatId),
  startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  clientData: { userId: currentUser.id },
});
```

The transport threads `clientData` through three places automatically: into `startSession`'s `params.clientData` for the first run's `payload.metadata`, into per-turn `metadata` on every `.in/append` chunk, and live-updates if the option value changes between renders (so React-driven values like the current user work without reconstructing the transport).

### Per-message metadata

Pass metadata with individual messages via `sendMessage`. Per-message values are merged with transport-level client data (per-message wins on conflicts):

```ts theme={"theme":"css-variables"}
sendMessage({ text: "Hello" }, { metadata: { model: "gpt-4o", priority: "high" } });
```

### Typed client data with clientDataSchema

Instead of manually parsing `clientData` with Zod in every hook, pass a `clientDataSchema` to `chat.agent`. The schema validates the data once per turn, and `clientData` is typed in all hooks and `run`:

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { openai } from "@ai-sdk/openai";
import { z } from "zod";

export const myChat = chat.agent({
  id: "my-chat",
  clientDataSchema: z.object({
    model: z.string().optional(),
    userId: z.string(),
  }),
  onChatStart: async ({ chatId, clientData }) => {
    // clientData is typed as { model?: string; userId: string }
    await db.chat.create({
      data: { id: chatId, userId: clientData.userId },
    });
  },
  run: async ({ messages, clientData, signal }) => {
    // Same typed clientData — no manual parsing needed
    return streamText({
      model: openai(clientData?.model ?? "gpt-4o"),
      messages,
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    });
  },
});
```

The schema also types the `clientData` option on the frontend transport:

```ts theme={"theme":"css-variables"}
// TypeScript enforces that clientData matches the schema
const transport = useTriggerChatTransport<typeof myChat>({
  task: "my-chat",
  accessToken: ({ chatId }) => mintChatAccessToken(chatId),
  startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  clientData: { userId: currentUser.id },
});
```

`clientDataSchema` accepts Zod, ArkType, Valibot, and any other schema library the SDK supports.

## Stop generation

Use `transport.stopGeneration(chatId)` to stop the current generation. This sends a stop signal to the running task via input streams, aborting the current `streamText` call while keeping the run alive for the next message.

`stopGeneration` works in all scenarios — including after a page refresh when the stream was reconnected via `resume`. Call it alongside `useChat`'s `stop()` to also update the frontend state:

```tsx theme={"theme":"css-variables"}
import { useCallback } from "react";

const { messages, sendMessage, stop: aiStop, status } = useChat({ transport });

// Wrap both calls in a single stop handler
const stop = useCallback(() => {
  transport.stopGeneration(chatId);
  aiStop();
}, [transport, chatId, aiStop]);

// In your JSX: show a stop button while a response is streaming
{status === "streaming" && (
  <button type="button" onClick={stop}>
    Stop
  </button>
)}
```

`transport.stopGeneration(chatId)` handles the backend stop signal and closes the SSE connection, while `aiStop()` (from `useChat`) updates the frontend status to `"ready"` and fires the `onFinish` callback.

A [PR to the AI SDK](https://github.com/vercel/ai/pull/14350) has been submitted to pass `abortSignal` through `reconnectToStream`, which would make `useChat`'s built-in `stop()` work after resume without needing `stopGeneration`. Until that lands, use the pattern above for reliable stop behavior after page refresh.

See [Stop generation](/docs/ai-chat/backend#stop-generation) in the backend docs for how to handle stop signals in your task.
## Tool approvals

The AI SDK supports tools that require human approval before execution. To use this with `chat.agent`, define a tool with `needsApproval: true` on the backend, then handle the approval UI and configure `sendAutomaticallyWhen` on the frontend.

### Backend: define an approval-required tool

```ts theme={"theme":"css-variables"}
import { tool } from "ai";
import { z } from "zod";

const sendEmail = tool({
  description: "Send an email. Requires human approval before sending.",
  inputSchema: z.object({
    to: z.string(),
    subject: z.string(),
    body: z.string(),
  }),
  needsApproval: true,
  execute: async ({ to, subject, body }) => {
    await emailService.send({ to, subject, body });
    return { sent: true, to, subject };
  },
});
```

Pass the tool to `streamText` in your `run` function as usual. When the model calls the tool, `chat.agent` streams a `tool-approval-request` chunk. The turn completes and the run waits for the next message.

### Frontend: approval UI

Import `lastAssistantMessageIsCompleteWithApprovalResponses` from the AI SDK and pass it to `sendAutomaticallyWhen`. This tells `useChat` to automatically re-send messages once all approvals have been responded to. Destructure `addToolApprovalResponse` from `useChat` and wire it to your approval buttons:

```tsx theme={"theme":"css-variables"}
import { useChat } from "@ai-sdk/react";
import { lastAssistantMessageIsCompleteWithApprovalResponses } from "ai";

function Chat({ chatId, transport }) {
  const { messages, sendMessage, addToolApprovalResponse, status } = useChat({
    id: chatId,
    transport,
    sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithApprovalResponses,
  });

  const handleApprove = (approvalId: string) => {
    addToolApprovalResponse({ id: approvalId, approved: true });
  };

  const handleDeny = (approvalId: string) => {
    addToolApprovalResponse({ id: approvalId, approved: false, reason: "User denied" });
  };

  return (
    <div>
      {messages.map((msg) =>
        msg.parts.map((part, i) => {
          if (part.state === "approval-requested") {
            return (
              <div key={i}>
                <p>Tool "{part.type}" wants to run with input:</p>
                <pre>{JSON.stringify(part.input, null, 2)}</pre>
                {/* The approval id is read from the tool part's `approval` field */}
                <button onClick={() => handleApprove(part.approval.id)}>Approve</button>
                <button onClick={() => handleDeny(part.approval.id)}>Deny</button>
              </div>
            );
          }
          // ... render other parts
        })
      )}
    </div>
  );
}
```

### How it works

1. Model calls a tool with `needsApproval: true` — the turn completes with the tool in `approval-requested` state
2. Frontend shows Approve/Deny buttons
3. User clicks Approve — `addToolApprovalResponse` updates the tool part to `approval-responded`
4. `sendAutomaticallyWhen` returns `true` — `useChat` re-sends the updated assistant message
5. The transport sends the message via input streams — the backend matches it by ID and replaces the existing assistant message in the accumulator
6. `streamText` sees the approved tool, executes it, and streams the result

Message IDs are kept in sync between frontend and backend automatically. The backend always includes a `generateMessageId` function when streaming responses, ensuring the `start` chunk carries a `messageId` that the frontend uses. This makes the ID-based matching reliable for tool approval updates.

## Sending actions

Send custom actions (undo, rollback, edit) to the agent via `transport.sendAction()`. Actions wake the agent and fire only `hydrateMessages` (if configured) and `onAction` — they're not turns, so `onTurnStart` / `prepareMessages` / `onBeforeTurnComplete` / `onTurnComplete` and `run()` do not fire.

For optimistic UI, mirror the action's effect on the `useChat` state via `setMessages` while the request is in flight:

```tsx theme={"theme":"css-variables"}
function ChatControls({ chatId }: { chatId: string }) {
  const transport = useTriggerChatTransport({
    task: "my-chat",
    accessToken: ({ chatId }) => mintChatAccessToken(chatId),
    startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  });

  const { setMessages } = useChat({ transport });

  return (
    <div>
      <button
        onClick={() => {
          // Optimistic update: drop the last user + assistant exchange locally
          setMessages((prev) => prev.slice(0, -2));
          // Call shape assumed here (chatId first, like stopGeneration(chatId));
          // check the reference for the exact signature.
          transport.sendAction(chatId, { type: "undo" });
        }}
      >
        Undo
      </button>
    </div>
); } ``` The action payload is validated against the agent's `actionSchema` on the backend — invalid actions are rejected. See [Actions](/docs/ai-chat/actions) for the backend setup. `sendAction` returns a `ReadableStream`. For side-effect-only actions (where `onAction` returns `void`), the stream completes immediately with `trigger:turn-complete`. For actions where `onAction` returns a `StreamTextResult`, the stream carries the assistant chunks the same way `sendMessages` does — `useChat` consumes them automatically. For server-to-server usage, `AgentChat` has the same method: ```ts theme={"theme":"css-variables"} const stream = await agentChat.sendAction({ type: "undo" }); for await (const chunk of stream) { if (chunk.type === "text-delta") process.stdout.write(chunk.delta); } ``` ## Multi-tab coordination When the same chat is open in multiple browser tabs, `multiTab: true` prevents duplicate messages and syncs conversation state across tabs. Only one tab can send at a time. Other tabs enter read-only mode with real-time message updates. ```tsx theme={"theme":"css-variables"} import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react"; import { useMultiTabChat } from "@trigger.dev/sdk/chat/react"; import { useChat } from "@ai-sdk/react"; function Chat({ chatId }: { chatId: string }) { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), multiTab: true, }); const { messages, setMessages, sendMessage } = useChat({ id: chatId, transport, }); const { isReadOnly } = useMultiTabChat(transport, chatId, messages, setMessages); return (
    <div>
      {isReadOnly && (
        <p>This chat is active in another tab. Messages are read-only.</p>
      )}
      {/* message list */}
    </div>
); } ``` ### How it works 1. When a tab sends a message, the transport "claims" the chatId via `BroadcastChannel` 2. Other tabs detect the claim and enter read-only mode (`isReadOnly: true`) 3. The active tab broadcasts its messages so read-only tabs see updates in real-time 4. When the turn completes, the claim is released. Any tab can send next. 5. Heartbeats detect crashed tabs (10s timeout clears stale claims) ### What `useMultiTabChat` does * Returns `{ isReadOnly }` for disabling the input UI * Broadcasts `messages` from the active tab to other tabs * Calls `setMessages` on read-only tabs when messages arrive from the active tab * Tracks read-only state via the transport's `BroadcastChannel` coordinator Multi-tab coordination is same-browser only (`BroadcastChannel` is a browser API). It gracefully degrades to a no-op in Node.js, SSR, or browsers without `BroadcastChannel` support. Cross-device coordination requires server-side involvement. ## Self-hosting If you're self-hosting Trigger.dev, pass the `baseURL` option: ```ts theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), baseURL: "https://your-trigger-instance.com", }); ``` `baseURL` also accepts a function so you can route per endpoint — useful when fronting `.in/append` with an edge proxy (e.g. to inject server-trusted signal into the wire) while keeping `.out` SSE direct: ```ts theme={"theme":"css-variables"} baseURL: ({ endpoint }) => endpoint === "out" ? "https://api.trigger.dev" : "https://chat-proxy.example.com", ``` For per-request control beyond URL routing (header injection, custom retries, tracing), pass a `fetch` override. See [Trusted edge signals](/docs/ai-chat/patterns/trusted-edge-signals) for a full proxy walkthrough. 
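
As a rough illustration of the kind of `fetch` override mentioned above, the sketch below wraps a base `fetch` to inject a header and retry on network failure. The helper name `makeTracedFetch`, the `x-trace-id` header, and the retry count are hypothetical and chosen only for illustration. How the transport consumes the override is not shown here, only the wrapper itself.

```typescript
// Minimal structural type for a fetch-compatible function.
type FetchLike = (input: string | URL | Request, init?: RequestInit) => Promise<Response>;

// Hypothetical helper: wraps a base fetch with header injection and simple retries.
function makeTracedFetch(baseFetch: FetchLike, traceId: string, maxRetries = 2): FetchLike {
  return async (input, init) => {
    // Start from the Request's own headers (if any), then layer the caller's init headers
    // on top, so passing `headers` below does not silently drop either set.
    const headers = new Headers(input instanceof Request ? input.headers : undefined);
    new Headers(init?.headers).forEach((value, key) => headers.set(key, value));
    headers.set("x-trace-id", traceId); // assumed header name, for illustration only

    let lastError: unknown;
    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        // Only network-level failures (thrown errors) are retried; HTTP error
        // statuses are returned to the caller unchanged.
        return await baseFetch(input, { ...init, headers });
      } catch (err) {
        lastError = err;
      }
    }
    // Note: a request whose body is a one-shot stream cannot be safely retried;
    // this sketch assumes string or absent bodies.
    throw lastError;
  };
}
```

Keeping the wrapper independent of any SDK type means it can be unit-tested against a fake `fetch` before it is wired into a transport.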
# How it works Source: https://trigger.dev/docs/ai-chat/how-it-works End-to-end mechanics of a chat.agent turn: the two durable channels per session, the long-lived task that reads and writes them, and how a chat survives refreshes, deploys, and idle gaps. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. This page explains how `chat.agent` is put together, what each piece does on a single turn, and how a chat survives across turns. It is not an API tour — for that, see [Backend](/docs/ai-chat/backend), [Frontend](/docs/ai-chat/frontend), and the [Reference](/docs/ai-chat/reference). For the byte-level wire format, see [Client Protocol](/docs/ai-chat/client-protocol). **What you don't have to think about**: SSE reconnects, WebSocket backpressure, container cold starts, whether a worker is currently running, or how to re-deliver chunks the client missed during a reload. The platform handles those. **What you do have to think about**: idempotency in your `run()` function, and how much state you keep in memory between turns versus persist in your own database. ## The primary noun: a chat session is a pair of streams and a task A **chat session** is the unit chat.agent owns. It is three things bound together: * An **inbox** channel called `.in` — every user message lands here as a record. * An **outbox** channel called `.out` — every assistant chunk leaves through here. * A long-lived **agent task** that reads from `.in` and writes to `.out`. Both channels are S2 ([s2.dev](https://s2.dev)) durable append-only streams, keyed by the session. 
Think of them as a pair of per-session topics on a tiny Kafka: records have monotonically increasing sequence numbers, readers resume from a cursor, writers append to the tail. We chose S2 because reads are resumable from an offset — so a browser reload can replay the response stream without re-running the LLM, and a crashed run can rejoin mid-conversation by reading from where it left off. A chat ID identifies the session for the lifetime of the conversation. The same session can be served by **many runs**: one run handles a turn (or several), goes idle, eventually exits, and the next user message triggers a fresh continuation run on the same session. Sessions are the durable identity; runs are the ephemeral compute. ## The lifecycle states A run moves through a small state machine over its lifetime. Each state is named below, with the trigger that moves it to the next. ### Cold start There is no run yet for this session. The frontend's first `sendMessage` posts to the session's `.in` channel; the server sees no live `currentRunId` and triggers a fresh `chat.agent` run with `continuation: false`. Moves to **Streaming** as soon as the task wakes and begins consuming `.in`. ### Streaming The agent task is running. It reads the new message off `.in`, fires `onTurnStart`, runs your `run()` function, and pipes `streamText()` chunks onto `.out`. The browser is SSE-subscribed to `.out` and renders chunks as they land. When `streamText()` ends, the task writes a `trigger:turn-complete` control record (an S2 record with an empty body and a special header) and immediately trims `.out` back to the *previous* turn's completion marker — keeping the outbox bounded to roughly one turn of chunks at steady state. Moves to **Idle** after `onTurnComplete` runs and the post-turn snapshot is written. ### Idle (awaiting next message) The turn is over. The task is alive but not doing work — it is parked in a waitpoint on `.in`, waiting for the next user message. 
If one arrives, it goes back to **Streaming** for the next turn. If `idleTimeoutInSeconds` (30 seconds by default) passes with no new message, it moves to **Suspended**. ### Suspended The task fires `onChatSuspend`, then the engine **checkpoints** the run's whole process state and frees the compute. The session is still live (the row exists, the `.out` stream is still readable, the chat ID still works), but no machine is dedicated to it. This is the same Checkpoint-Resume System that powers every Trigger.dev task — covered in detail at [How it works → Checkpoint-Resume](/docs/how-it-works#the-checkpoint-resume-system). Moves to **Resuming** when the next message lands in `.in`. ### Resuming The engine restores the suspended run from its checkpoint. The same JS process picks up exactly where it parked — `chat.local` values, the accumulator, in-flight promises, in-memory caches all preserved as they were. `onChatResume` fires immediately after the restore, then the task transitions to **Streaming**. No boot work, no snapshot read, no SDK reinitialization. This is the cheap path. ### Continuation (after exit) If the run has fully exited (because it hit `maxTurns`, the customer called `chat.endRun()` or `chat.requestUpgrade()`, or it was cancelled or crashed), the next user message can't resume it — there is nothing to resume. Instead, the server triggers a brand-new run with `continuation: true`. The new run does a cold boot, reads the prior conversation's S3 snapshot, replays any `.out` chunks after the snapshot cursor, AND replays any `.in` records past the last `turn-complete` cursor (the user messages a dead run never acknowledged). If the predecessor died mid-stream and left a partial assistant response in `.out`, the smart default splices `[firstInFlightUser, partialAssistant]` onto the chain so any follow-up has full context — see [Recovery boot](/docs/ai-chat/patterns/recovery-boot). 
The new run then enters **Streaming** with `turn === 0` of the new run but `messageCount > 0`. ### Closed `POST /api/v1/sessions/:id/close` flips `closedAt` on the session row. Future appends are rejected. Reads still work for transcript viewing. The session is terminal. ## One turn, end to end Here is a typical cold turn — user opens the page, types "What's the weather?", reads the response — traced through every component. The Vercel AI SDK's `useChat` hook serializes the user's message into the slim wire format: `{ chatId, trigger: "submit-message", message, metadata }`. Only the new message goes on the wire, not the full history. The transport calls `POST /realtime/v1/sessions/:chatId/in/append`, authenticated with the session's public access token. The body is one S2 record. The append route resolves the session, then calls `ensureRunForSession()`. The session's `currentRunId` is null (cold start), so it triggers a new `chat.agent` run on the project's dev/prod environment and atomically claims the slot via an optimistic version counter. The route writes the message to `s2://sessions/:chatId/in` as a single record. S2 assigns a sequence number. Any waitpoints registered on this channel fire, which would wake an existing run — but there is no run waiting yet, so this is a no-op for now. In parallel with the send, the transport opens `GET /realtime/v1/sessions/:chatId/out` (server-sent events). It passes its `lastEventId` if it has one cached; on a brand-new chat it does not. Any chunks the agent writes from now on will be delivered to this stream. The newly-triggered run starts. `onBoot` fires once per worker process. Because this is a fresh chat, no snapshot is read. The agent reads the pending record off `.in` via a waitpoint. `onChatStart` fires (once per chat lifetime). `onTurnStart` fires (every turn). Your code calls `streamText({ model, messages })`. Each `UIMessageChunk` it produces is appended to `s2://sessions/:chatId/out` as a record. 
The browser sees them arrive on the SSE stream and the AI SDK renders them. When `streamText()` finishes, the agent writes a record with header `trigger:turn-complete` and an empty body. The browser transport sees this header and closes the per-turn readable stream. Immediately after writing the new turn-complete marker, the agent issues an S2 trim command targeting the *previous* turn-complete's sequence number. This bounds the stream's storage to roughly one turn of chunks plus the latest control record. `onTurnComplete` runs (your hook for persistence). Then the agent writes `ChatSnapshotV1` — `{ version: 1, messages, lastOutEventId, lastOutTimestamp }` — to S3 at `sessions/:chatId/snapshot.json`. This write is awaited, not fire-and-forget, so the next run is guaranteed to find it. The agent re-enters the waitpoint on `.in`. After `idleTimeoutInSeconds` of nothing arriving, `onChatSuspend` fires and the engine snapshots the run. Compute is freed. ## Three layers of persistence chat.agent survives idle gaps, deploys, refreshes, and crashes because three separate persistence mechanisms work at three different layers of the stack. They're orthogonal — each protects against a different failure mode, and conflating them is a common source of bugs. ### Layer 1: the engine checkpoint (compute) When a run enters the Suspended state, the engine **checkpoints** the running process — its memory, CPU registers, and open file descriptors — and frees the compute. Today this is done via [CRIU](https://criu.org/) (Checkpoint/Restore in Userspace), the same mechanism that powers every Trigger.dev task's suspend/resume. On the new microVM compute runtime (currently in [private beta](/docs/compute-private-beta)), it becomes a full Firecracker VM snapshot: every byte of memory plus filesystem state plus every kernel object inside the VM. When the next message arrives, the engine **restores** the checkpoint. The same JS process picks up at the exact instruction it parked on. 
From your code's perspective, the line right after the `messagesInput.wait()` waitpoint just continues executing. Anything in process memory survives: `chat.local`, the message accumulator, in-flight Promises, in-memory caches, open DB connections. The runId is unchanged.

This is what lets you write `run()` as a single long-lived function with stateful closures, even though the underlying compute goes through checkpoint/restore cycles between turns. `onChatSuspend` fires immediately before the checkpoint; `onChatResume` fires immediately after the restore.

### Layer 2: the chat snapshot (S3)

After every turn the agent writes a `ChatSnapshotV1` blob to S3: the full accumulated `UIMessage[]` plus the current `lastOutEventId` cursor. This is chat-specific and lives one layer above the engine. It has nothing to do with CRIU or Firecracker.

The chat snapshot bridges run *boundaries*. If a run exits for any reason (it hit `maxTurns`, called `chat.endRun()` or `chat.requestUpgrade()`, was cancelled, crashed, or got bumped to a new version after a deploy), the engine checkpoint is gone with it. When the next user message arrives, the server triggers a fresh run with `continuation: true`. That new run reads the S3 snapshot, replays any post-snapshot chunks from `.out`, merges by message ID, and starts its first turn with the full conversation history already in memory.

The chat snapshot carries only message history, not process memory. `chat.local`, in-memory caches, and open connections all need to be reinitialized on a continuation. This is why `onBoot` (every fresh worker) is the right place to initialize `chat.local`, not `onChatStart` (only the very first turn of the chat). See [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay) for the full snapshot model.

If your task registers a `hydrateMessages` hook, the chat snapshot is skipped entirely: your hook is the single source of truth for history.
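
The continuation read path described above (read the snapshot, replay post-snapshot chunks from `.out`, merge by message ID) can be sketched roughly as follows. This is a simplified model, not the SDK's implementation: the `Msg` and `ReplayedMessage` shapes, the numeric cursor, and the `rebuildHistory` helper are all hypothetical. Only the `messages` and `lastOutEventId` field names are taken from the `ChatSnapshotV1` description.

```typescript
// Simplified message shape; the real UIMessage carries more fields.
interface Msg {
  id: string;
  role: "user" | "assistant";
  text: string;
}

// Mirrors two ChatSnapshotV1 field names from the text; the types are assumptions.
interface SnapshotSketch {
  version: 1;
  messages: Msg[];
  lastOutEventId: number; // assumed to be a numeric sequence cursor in this sketch
}

// Hypothetical: a message reassembled from `.out` records, tagged with its sequence number.
interface ReplayedMessage {
  seq: number;
  message: Msg;
}

function rebuildHistory(snapshot: SnapshotSketch | undefined, outTail: ReplayedMessage[]): Msg[] {
  const history: Msg[] = snapshot ? [...snapshot.messages] : [];
  const cursor = snapshot ? snapshot.lastOutEventId : -1;

  // Index existing messages by ID so replayed copies can replace them in place.
  const indexById = new Map<string, number>();
  history.forEach((m, i) => indexById.set(m.id, i));

  for (const { seq, message } of outTail) {
    if (seq <= cursor) continue; // already reflected in the snapshot
    const existing = indexById.get(message.id);
    if (existing !== undefined) {
      history[existing] = message; // same ID: the newer replayed copy wins
    } else {
      indexById.set(message.id, history.length);
      history.push(message);
    }
  }
  return history;
}
```

Merging by ID rather than appending blindly is what keeps a message that appears in both the snapshot and the `.out` tail from showing up twice.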
### Layer 3: the `lastEventId` cursor (browser)

The transport stores `lastEventId` (the S2 sequence number of the most recent chunk it processed) in its session state. On page reload, it reopens the SSE stream and sends that stored value in the `Last-Event-ID` header. S2 resumes from that cursor; chunks the browser already saw are not redelivered. If the agent was mid-turn when the browser reloaded, the rest of the turn streams in. If the turn had already completed, the stream closes immediately via an `X-Session-Settled` header so the client doesn't long-poll for nothing.

Unlike the other two layers, this one is client-side. The server doesn't even need to know the browser refreshed: the agent run keeps running (or stays suspended) regardless.

### Which layer covers which failure mode

| What happened | Recovery layer | Same run? | In-memory state preserved? |
| --- | --- | --- | --- |
| Idle gap mid-conversation (suspend → resume) | Engine checkpoint | Yes | Yes |
| Run exited cleanly (`endRun`, `requestUpgrade`, `maxTurns`) | Chat snapshot | No (fresh continuation run) | No |
| Run crashed mid-turn (OOM, exception) | Chat snapshot + `.out` tail replay | (retried as a new attempt) | No |
| Browser tab reloaded mid-stream | `lastEventId` cursor on `.out` | (run unaffected) | (n/a) |
| Deploy rolled out a new version mid-chat | Chat snapshot, via `requestUpgrade` flow | No | No |

No single layer covers every case. The engine checkpoint alone can't survive a run exit (there's nothing to restore). The chat snapshot alone can't survive a tab refresh mid-turn (chunks already streamed would be lost). The `lastEventId` cursor alone can't bridge run boundaries (the new run wouldn't know the history). Together they cover every realistic failure.

## Warm vs cold: same chat, three different timings

Take the same conversation, "What's the weather?"
then "What about tomorrow?" — and look at how each second turn lands. **Warm second turn (within a few seconds).** The first turn finished, the agent is parked on the `.in` waitpoint, status is **Idle**. The new message hits `/append`, the waitpoint fires, the agent wakes inside the same run with all memory intact, runs `onTurnStart` for turn 2, streams the response. No checkpoint involved — the process never went to sleep. Latency to first chunk: dominated by the LLM, not the platform. **Resumed second turn (a few minutes later).** The first turn finished and the agent suspended — the engine checkpoint is stored, compute is freed. The new message hits `/append`. The engine restores the checkpoint, fires `onChatResume`, and the task picks up exactly where it parked — all in-memory state preserved (`chat.local`, the accumulator, the lot). Latency to first chunk: the engine's restore overhead, then the LLM. **Continuation second turn (an hour later, or after a deploy).** The first turn finished and the run eventually exited. The new message hits `/append`, the server triggers a fresh run with `continuation: true`. The new run boots cold, `onBoot` fires, the agent reads the S3 chat snapshot, replays the `.out` tail, then enters the turn loop with the full conversation already accumulated. The previous run's in-memory state is gone — anything in `chat.local` has to be re-initialized in `onBoot`. Latency to first chunk: cold start plus snapshot read, then the LLM. All three look identical to the browser. Only the agent task knows which path it took, via `payload.continuation` and `ctx.attempt.number`. 
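
To make the three paths concrete, here is a small sketch that classifies how a second turn would land, given a simplified view of the session. The `SessionView` shape and both helpers are hypothetical and exist only for illustration; the real decision is made inside the platform, and the hook lists include only the hooks the paragraphs above name for each path.

```typescript
type TurnPath = "warm" | "resumed" | "continuation";

// Hypothetical, simplified view of what the server knows about a session.
interface SessionView {
  hasLiveRun: boolean; // a run exists for the session and has not exited
  isSuspended: boolean; // that run was checkpointed after the idle timeout
}

function classifyTurn(session: SessionView): TurnPath {
  if (!session.hasLiveRun) return "continuation"; // fresh run with continuation: true
  if (session.isSuspended) return "resumed"; // engine restores the checkpoint
  return "warm"; // run is parked on the .in waitpoint, memory intact
}

// Hooks named in the text for each path, in firing order (other hooks omitted).
function hooksBeforeTurn(path: TurnPath): string[] {
  switch (path) {
    case "warm":
      return ["onTurnStart"];
    case "resumed":
      return ["onChatResume", "onTurnStart"];
    case "continuation":
      return ["onBoot", "onTurnStart"];
  }
}
```

The point of the model is that only the continuation path passes through `onBoot`, which is why per-process state such as `chat.local` has to be initialized there.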
## Lifecycle hooks: where you plug in

| Hook | When it fires | Typical use |
| --- | --- | --- |
| `onBoot` | Once per worker process, before any chat work | Initialize `chat.local` resources |
| `onPreload` | Once per chat lifetime, if the chat was preloaded before the first message | Warm caches, fetch the user's profile |
| `onChatStart` | Once per chat lifetime, on the first turn of a fresh chat (not on continuation) | First-message persistence, system-prompt setup |
| `onValidateMessages` | Every turn, before merging the incoming message | Reject or transform user input |
| `hydrateMessages` | Every turn, instead of snapshot+replay | Use your DB as the source of truth |
| `onTurnStart` | Every turn, before `run()` | Compact history, persist the user message |
| `onBeforeTurnComplete` | Every turn, after streaming, before the turn-complete record | Emit a final custom chunk |
| `onTurnComplete` | Every turn, after the turn-complete record is written | Persist the assistant message and `lastEventId` |
| `onChatSuspend` / `onChatResume` | At the idle → suspend / suspend → wake transitions | Release/reacquire expensive resources |

See [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) for the full signatures and firing order.

## When chat.agent is the right primitive

**Good fit**:

* Multi-turn conversational agents where the user is expected to come back later.
* Long-running agent loops with tool calls, where a single turn can take a minute or more.
* Cases where you want page reloads to resume the in-flight response without re-running the model.
* Cases where you can't predict idle gaps (humans go to lunch).

**Not a good fit**:

* Single-shot completions where you don't need durability or resume. Call your model directly.
* Workflows where you control both ends and want a custom protocol.
  Use a [raw `task()` with chat primitives](/docs/ai-chat/custom-agents) directly without the `chat.agent` wrapper.
* High-fanout broadcasting (one source, many subscribers). Use Trigger.dev realtime streams against a regular task instead.

## Putting it together

```mermaid theme={"theme":"css-variables"}
sequenceDiagram
    participant Browser
    participant API as Trigger.dev API
    participant S2_in as S2 .in
    participant S2_out as S2 .out
    participant Agent as chat.agent task
    participant S3 as S3 snapshot

    Note over Agent: Cold start
    Browser->>API: POST /sessions/:id/in/append
    API->>S2_in: append(message)
    API->>Agent: trigger run (continuation: false)
    Browser->>API: GET /sessions/:id/out (SSE)
    API->>S2_out: read stream
    Agent->>S2_in: read message (waitpoint)
    Agent->>S2_out: append chunk(s)
    S2_out-->>Browser: SSE chunks
    Agent->>S2_out: append turn-complete (control)
    Agent->>S2_out: trim < previous turn-complete
    Agent->>S3: write snapshot
    Note over Agent: Idle on waitpoint

    Note over Agent: ...time passes...
    Note over Agent: Suspended
    Browser->>API: POST /sessions/:id/in/append
    API->>S2_in: append(message)
    API->>Agent: restore from suspend
    Agent->>S2_in: read message
    Agent->>S2_out: append chunk(s)
    S2_out-->>Browser: SSE chunks
    Agent->>S2_out: append turn-complete
    Agent->>S3: write snapshot
    Note over Agent: Idle again
```

## Where to go next

* [Quick start](/docs/ai-chat/quick-start): get a chat running in a few minutes.
* [Backend](/docs/ai-chat/backend): the `chat.agent()` API in detail.
* [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks): every hook, what fires when.
* [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay): deeper on the snapshot model.
* [Client protocol](/docs/ai-chat/client-protocol): wire format if you're writing a custom transport.

# Lifecycle hooks

Source: https://trigger.dev/docs/ai-chat/lifecycle-hooks

Hook into every stage of a chat agent's run: preload, turn start, turn complete, suspend, resume, and more.
The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features. They aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

`chat.agent({ ... })` accepts a set of lifecycle hooks for persisting state, validating input, transforming messages, and reacting to suspension and resumption. They fire at well-defined points in the chat agent's lifetime.

**Once per worker process (every fresh run boot):** `onBoot` → `onPreload` (preloaded runs only).

**Once per chat (first message of the chat's lifetime):** `onChatStart`.

**Per-turn order:** `onValidateMessages` → `hydrateMessages` → `onChatStart` (chat's first message only) → `onTurnStart` → `run()` → `onBeforeTurnComplete` → `onTurnComplete`.

**Suspend / resume:** `onChatSuspend` fires when the run transitions from idle to suspended (waiting on the next message); `onChatResume` fires on wake.

**Four scopes to keep straight:**

| Scope | Fires when | Use for |
| --- | --- | --- |
| **Process** ([`onBoot`](#onboot)) | Every fresh worker boots: initial, preloaded, and reactive continuation (post-cancel/crash/`endRun`/upgrade). | Initialize `chat.local`, open per-process resources, re-hydrate state from your DB on continuation. |
| **Recovery** ([`onRecoveryBoot`](#onrecoveryboot)) | Continuation boot where the dead run was mid-stream and a partial assistant survives on `session.out`. | Override the smart default: drop the partial, synthesize tool results, emit a recovery banner. |
| **Chat** ([`onChatStart`](#onchatstart)) | First message of a chat's lifetime. Does NOT fire on continuation runs or OOM retries. | One-time DB rows for the chat, resources tied to the chat's lifetime. |
| **Turn** ([`onTurnStart`](#onturnstart), [`onTurnComplete`](#onturncomplete), etc.) | Every turn. | Persist messages, post-process responses. |

## Task context (`ctx`)

Every chat lifecycle callback and the `run` payload include `ctx`: the same run context object as `task({ run: (payload, { ctx }) => ... })`. Import the type with `import type { TaskRunContext } from "@trigger.dev/sdk"` (the `Context` export is the same type). Use `ctx` for tags, metadata, or any API that needs the full run record. The string `runId` on chat events is always `ctx.run.id` (both are provided for convenience). See [Task context (`ctx`)](/docs/ai-chat/reference#task-context-ctx) in the API reference.

Standard [task lifecycle hooks](/docs/tasks/overview) such as `onWait`, `onResume`, `onComplete`, and `onFailure` are also available on `chat.agent()` with the same shapes as on a normal `task()`, but prefer the chat-specific [`onChatSuspend` / `onChatResume`](#onchatsuspend--onchatresume) for any chat-related work. The generic hooks fire on every wait/resume (including ones the runtime uses internally for non-chat reasons); the chat-specific ones fire only at the idle-to-suspended transition you actually care about and carry full chat context.

## onBoot

Fires **once per worker process picking up the chat**: for the initial run, for preloaded runs, AND for reactive continuation runs (post-cancel, crash, `endRun`, `requestUpgrade`, OOM retry). Does NOT fire when the same run resumes from its engine checkpoint via the idle-window suspend/resume path; use [`onChatResume`](#onchatsuspend--onchatresume) for that.
This is the right place to initialize anything that lives in the JS process for the lifetime of the run: [`chat.local`](/docs/ai-chat/chat-local) state, [DB connections](/docs/database-connections), sandboxes, in-memory caches. It runs before `onPreload`, `onChatStart`, the continuation-wait branch, and any turn, so anything you set up here is available everywhere downstream.

If you initialize `chat.local` only in `onChatStart`, your `run()` will crash on continuation runs with `chat.local can only be modified after initialization`. `onChatStart` is once-per-chat by contract; `chat.local` is per-process and needs `onBoot`.

Branch on `continuation` to decide whether to load existing state from your DB or start fresh:

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  clientDataSchema: z.object({ userId: z.string() }),
  onBoot: async ({ chatId, clientData, continuation, previousRunId }) => {
    const user = await db.user.findUnique({ where: { id: clientData.userId } });
    userContext.init({ name: user.name, plan: user.plan });

    if (continuation) {
      // Re-hydrate per-chat in-memory state from your DB.
      // `previousRunId` is the public id of the prior run (use it for
      // logging or to look up persisted state keyed on run id).
      const saved = await db.chatState.findUnique({ where: { chatId } });
      if (saved) {
        // Re-apply your saved per-chat state into wherever your
        // run() reads it from (a chat.local slot, an in-memory map, etc.).
        userContext.applySaved(saved);
      }
    }
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

| Field | Type | Description |
| --- | --- | --- |
| `ctx` | `TaskRunContext` | Full task run context. See [reference](/docs/ai-chat/reference#task-context-ctx). |
| `chatId` | `string` | Chat session ID |
| `runId` | `string` | The Trigger.dev run ID for this run boot |
| `chatAccessToken` | `string` | Scoped access token for this run |
| `clientData` | Typed by `clientDataSchema` | Custom data from the frontend |
| `continuation` | `boolean` | `true` when this run is taking over from a prior dead run |
| `previousRunId` | `string \| undefined` | Public id of the prior run when `continuation` is true |
| `preloaded` | `boolean` | Whether this run was triggered as a preload |

`onBoot` and `onChatStart` are complementary: keep DB-row creation in `onChatStart` (it only needs to happen once per chat) and put process-level setup (`chat.local`, connections, caches) in `onBoot` (it needs to happen on every fresh worker).

## onRecoveryBoot

Fires once on a continuation boot when the dead predecessor was mid-stream and a partial assistant survives on `session.out`. The runtime reconstructs context automatically via a smart default; this hook is the override path for policies that need something different.

The hook does NOT fire when there's no partial: clean continuations after `chat.endRun()` or `chat.requestUpgrade()`, fresh chats, OOM retries on top of a complete snapshot. Those paths dispatch any in-flight user message as a normal turn on the new run without involving the hook. It also does NOT fire when [`hydrateMessages`](#hydratemessages) is registered (the customer owns persistence).

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  onRecoveryBoot: async ({ partialAssistant, inFlightUsers, writer, cause, previousRunId }) => {
    writer.write({
      type: "data-chat-recovery",
      data: { cause, previousRunId, partialPresent: partialAssistant !== undefined },
      transient: true,
    });
    // Return nothing → fall through to the smart default
    // (splice partial + first user into chain, dispatch the rest).
  },
  run: async ({ messages, signal }) =>
    streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }),
});
```

| Field | Type | Description |
| --- | --- | --- |
| `ctx` | `TaskRunContext` | Full task run context |
| `chatId` | `string` | Chat session ID |
| `runId` | `string` | The Trigger.dev run ID for this run boot |
| `previousRunId` | `string` | Public id of the prior run that died |
| `cause` | `"cancelled" \| "crashed" \| "unknown"` | Best-effort cause. Currently always `"unknown"`, so don't branch on it |
| `settledMessages` | `TUIMessage[]` | The chain persisted by the predecessor's last `onTurnComplete` |
| `inFlightUsers` | `TUIMessage[]` | User messages on `session.in` past the cursor: the message(s) the predecessor never acknowledged |
| `partialAssistant` | `TUIMessage \| undefined` | The trailing assistant message whose stream never received `finish` |
| `pendingToolCalls` | `Array<{ toolCallId, toolName, input, partIndex }>` | Tool calls in `input-available` state extracted from `partialAssistant` |
| `writer` | `ChatWriter` | Lazy session.out writer. Write a recovery banner / signal here |

Returns `{ chain?, recoveredTurns?, beforeBoot? }`, with every field optional. Omitted fields fall through to the smart default. See [Recovery boot](/docs/ai-chat/patterns/recovery-boot) for the full guide, examples (drop partial, synthesize tool results, persist before boot), and interaction notes.

Don't put `chat.local` initialization in `onRecoveryBoot`; use [`onBoot`](#onboot). `onRecoveryBoot` is for recovery decisions, not per-process setup. `onBoot` fires first.

## onPreload

Fires when a **preloaded run** starts, before any messages arrive.
Use it to eagerly create chat-scoped DB rows (the Chat row, the ChatSession row) while the user is still typing, so the very first message lands fast.

Preloaded runs are triggered by calling `transport.preload(chatId)` on the frontend. See [Preload](/docs/ai-chat/fast-starts#preload) for details.

Per-process state (anything in [`chat.local`](/docs/ai-chat/chat-local), DB connections, etc.) belongs in [`onBoot`](#onboot). `onBoot` fires before `onPreload` on every fresh worker, including on continuation runs where `onPreload` never fires.

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  clientDataSchema: z.object({ userId: z.string() }),
  onBoot: async ({ clientData }) => {
    // Per-process state: runs on every fresh worker (initial,
    // preloaded, continuation). See onBoot above.
    const user = await db.user.findUnique({ where: { id: clientData.userId } });
    userContext.init({ name: user.name, plan: user.plan });
  },
  onPreload: async ({ chatId, clientData, runId, chatAccessToken }) => {
    // Chat-scoped DB rows: only matters on preload (and onChatStart as
    // a fallback when not preloaded).
    await db.chat.create({ data: { id: chatId, userId: clientData.userId } });
    await db.chatSession.upsert({
      where: { id: chatId },
      create: { id: chatId, runId, publicAccessToken: chatAccessToken },
      update: { runId, publicAccessToken: chatAccessToken },
    });
  },
  onChatStart: async ({ preloaded }) => {
    if (preloaded) return; // Already initialized in onPreload
    // ... non-preloaded chat-row initialization
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

| Field | Type | Description |
| --- | --- | --- |
| `ctx` | `TaskRunContext` | Full task run context. See [reference](/docs/ai-chat/reference#task-context-ctx). |
| `chatId` | `string` | Chat session ID |
| `runId` | `string` | The Trigger.dev run ID |
| `chatAccessToken` | `string` | Scoped access token for this run |
| `clientData` | Typed by `clientDataSchema` | Custom data from the frontend |
| `writer` | [`ChatWriter`](/docs/ai-chat/reference#chatwriter) | Stream writer for custom chunks |

Every lifecycle callback receives a `writer`, a lazy stream writer that lets you send custom `UIMessageChunk` parts (like `data-*` parts) to the frontend. Non-transient `data-*` chunks written via the `writer` are automatically added to the response message and available in `onTurnComplete`. Add `transient: true` for ephemeral chunks (progress indicators, etc.) that should not persist. See [Custom data parts](/docs/ai-chat/backend#custom-data-parts).

## onChatStart

Fires **exactly once per chat**, on the very first user message of the chat's lifetime, before `run()` executes. Use it for one-time chat-scoped setup: create the Chat DB row, mint resources tied to the chat's lifetime.

`onChatStart` does **not** fire on:

* **Continuation runs**: a new run picking up an existing session after the prior run ended (`chat.endRun`, waitpoint timeout, `chat.requestUpgrade`, cancel, crash). The chat already started.
* **OOM-retry attempts**: same chat, same conversation, just on a larger machine.

For per-process state that has to be initialized on every fresh worker (including continuation runs), use [`onBoot`](#onboot). For per-turn setup, use [`onTurnStart`](#onturnstart).

Do not initialize [`chat.local`](/docs/ai-chat/chat-local) here. `chat.local` is per-process state that must be re-created on continuation runs, but `onChatStart` only fires on the chat's very first message. Use [`onBoot`](#onboot) instead.

The `preloaded` field tells you whether [`onPreload`](#onpreload) already ran for this chat, which is useful for skipping setup work that's already done.
Because `onChatStart` fires only on the chat's first ever message, `messages` is either empty (when no message exists yet — e.g. a preloaded run that hasn't received its first turn) or contains just the first user message. There's no prior history to load here.

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  onChatStart: async ({ chatId, clientData, preloaded }) => {
    if (preloaded) return; // Already set up in onPreload

    const { userId } = clientData as { userId: string };
    await db.chat.create({
      data: { id: chatId, userId, title: "New chat" },
    });
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

`clientData` contains custom data from the frontend: either the `clientData` option on the transport constructor (sent with every message) or the `metadata` option on `sendMessage()` (per-message). See [Client data and metadata](/docs/ai-chat/frontend#client-data-and-metadata).

## onValidateMessages

Validate or transform incoming `UIMessage[]` before they are converted to model messages. Fires on turns that carry incoming messages, with the raw messages from the wire payload (after cleanup of aborted tool parts), **before** accumulation and `toModelMessages()`. Turns with no incoming messages — preload, close, and regenerate with nothing re-sent — skip it.

Return the validated messages array. Throw to abort the turn with an error.

This is the right place to call the AI SDK's [`validateUIMessages`](https://ai-sdk.dev/docs/ai-sdk-ui/chatbot-message-persistence#validating-messages-on-the-server) to catch malformed messages from storage or untrusted input before they reach the model, especially useful when persisting conversations to a database where tool schemas may drift between deploys.
| Field | Type | Description |
| --- | --- | --- |
| `messages` | `UIMessage[]` | Incoming UI messages for this turn |
| `chatId` | `string` | Chat session ID |
| `turn` | `number` | Turn number (0-indexed) |
| `trigger` | `"submit-message" \| "regenerate-message" \| "preload" \| "close"` | The trigger type for this turn |

```ts theme={"theme":"css-variables"}
import { validateUIMessages } from "ai";

export const myChat = chat.agent({
  id: "my-chat",
  onValidateMessages: async ({ messages }) => {
    const userMessages = messages.filter((m) => m.role === "user");
    if (userMessages.length > 0) {
      await validateUIMessages({ messages: userMessages, tools: chatTools });
    }
    return messages;
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, tools: chatTools, abortSignal: signal });
  },
});
```

On HITL continuations (`addToolOutput` / `addToolApproveResponse`) the assistant entry in `messages` is **slim** — `state` + `output` / `errorText` / `approval` only, no `input` or other parts. `validateUIMessages` against the AI SDK schema rejects that shape (the schema requires `input` on resolved tool parts), so filter to user messages first (or skip validation entirely on those turns). The example above does the filter.

`onValidateMessages` fires **before** `onTurnStart` and message accumulation. If you need to validate messages loaded from a database, do the loading in `onChatStart` or `onPreload` and let `onValidateMessages` validate the full incoming set each turn.

## hydrateMessages

Load the full message history from your backend on every turn, replacing the built-in linear accumulator. When set, the hook's return value becomes the accumulated state; the normal accumulation logic (append for submit, replace for regenerate) is skipped entirely.
Use this when the backend should be the source of truth for message history: abuse prevention, branching conversations (DAGs), or rollback/undo support.

| Field | Type | Description |
| --- | --- | --- |
| `chatId` | `string` | Chat session ID |
| `turn` | `number` | Turn number (0-indexed) |
| `trigger` | `"submit-message" \| "regenerate-message" \| "action"` | The trigger type for this turn |
| `incomingMessages` | `UIMessage[]` | Validated incoming messages for this turn. Usually 0-or-1 (empty for actions, regenerates, and continuations; one element for normal `submit-message` and tool-approval responses). On a [Head Start](/docs/ai-chat/fast-starts#with-hydratemessages) first turn, this can contain the route handler's first-turn history. |
| `previousMessages` | `UIMessage[]` | Accumulated UI messages before this turn (`[]` on turn 0) |
| `clientData` | Typed by `clientDataSchema` | Custom data from the frontend |
| `continuation` | `boolean` | Whether this run is continuing an existing chat |
| `previousRunId` | `string \| undefined` | The previous run ID (if continuation) |

```ts theme={"theme":"css-variables"}
import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai";

export const myChat = chat.agent({
  id: "my-chat",
  hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
    const record = await db.chat.findUnique({ where: { id: chatId } });
    const stored = record?.messages ?? [];

    if (upsertIncomingMessage(stored, { trigger, incomingMessages })) {
      // Upsert, not update: on a head-start first turn no preload ran,
      // so the row may not exist yet when this hook fires.
      await db.chat.upsert({
        where: { id: chatId },
        create: { id: chatId, messages: stored },
        update: { messages: stored },
      });
    }

    return stored;
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

`upsertIncomingMessage` (exported from `@trigger.dev/sdk/ai`) handles the three cases that matter — fresh user messages get pushed, HITL continuations (`addToolOutput` / `addToolApproveResponse`) no-op because the incoming wire shares the existing assistant's id and the runtime overlays the new tool-state advance onto that entry, and non-`submit-message` triggers (`regenerate-message` / `action`) skip persistence. It returns `true` when it mutated `stored`, so the caller knows whether to persist. If you need branching, rollback, or other custom hydrate logic, you can still write the upsert by hand — `upsertIncomingMessage` is a convenience for the common case, not the only supported shape.

**Lifecycle position:** `onValidateMessages` → **`hydrateMessages`** → `onChatStart` (chat's first message only) → `onTurnStart` → `run()`

After the hook returns, the runtime overlays the wire's tool-state advances (`output-available` / `output-error` / `approval-responded` / `output-denied`) onto matching hydrated entries by id. Everything else on the hydrated entry — text, reasoning, tool `input`, providerMetadata — stays put. This makes [tool approvals](/docs/ai-chat/frontend#tool-approvals) and HITL `addToolOutput` continuations work transparently: ship a slim resolution on the wire, the agent merges the new state onto your DB-backed copy.

`hydrateMessages` also fires for [action](/docs/ai-chat/actions) turns (`trigger: "action"`) with empty `incomingMessages`. This lets the action handler work with the latest DB state.
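The tool-state overlay described above can be pictured with a small, self-contained sketch. It is a simplified model under stated assumptions, not the runtime's actual implementation: the `Message` and `ToolPart` shapes are pared-down stand-ins for the AI SDK's `UIMessage`, and `overlayToolState` and `isToolPart` are hypothetical helper names. The sketch copies only the tool state and resolution fields from the slim wire message onto the stored entry with the same id, and leaves text and tool `input` untouched.

```typescript
// Pared-down stand-ins for the AI SDK message types (assumption, not the real types).
type TextPart = { type: "text"; text: string };
type ToolPart = {
  type: string; // e.g. "tool-lookupOrder"
  toolCallId: string;
  state: string; // e.g. "input-available", "output-available"
  input?: unknown;
  output?: unknown;
  errorText?: string;
  approval?: unknown;
};
type Part = TextPart | ToolPart;
type Message = { id: string; role: "user" | "assistant"; parts: Part[] };

// The tool states the docs list as "advances" carried on the wire.
const ADVANCED_STATES = new Set([
  "output-available",
  "output-error",
  "approval-responded",
  "output-denied",
]);

function isToolPart(part: Part): part is ToolPart {
  return "toolCallId" in part;
}

// Hypothetical helper: merge wire tool-state advances onto hydrated messages by id.
function overlayToolState(hydrated: Message[], wire: Message[]): Message[] {
  const wireById = new Map(wire.map((m) => [m.id, m] as const));

  return hydrated.map((stored) => {
    const incoming = wireById.get(stored.id);
    if (!incoming) return stored; // nothing on the wire for this entry

    // Index the advanced tool parts on the wire by tool call id.
    const advances = new Map<string, ToolPart>();
    for (const part of incoming.parts) {
      if (isToolPart(part) && ADVANCED_STATES.has(part.state)) {
        advances.set(part.toolCallId, part);
      }
    }

    const parts = stored.parts.map((part): Part => {
      if (!isToolPart(part)) return part; // text and other parts stay put
      const advance = advances.get(part.toolCallId);
      if (!advance) return part;

      // Keep the stored part (including `input`); take only the resolution fields.
      const merged: ToolPart = { ...part, state: advance.state };
      if (advance.output !== undefined) merged.output = advance.output;
      if (advance.errorText !== undefined) merged.errorText = advance.errorText;
      if (advance.approval !== undefined) merged.approval = advance.approval;
      return merged;
    });

    return { ...stored, parts };
  });
}
```

Because the stored `input` survives the merge, the slim wire copy never has to resend it, which is why the hydrated (database) entry stays the richer of the two.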
Registering `hydrateMessages` short-circuits the runtime's [snapshot + replay](/docs/ai-chat/patterns/persistence-and-replay) reconstruction at run boot — your hook is the single source of truth for history, so the runtime skips reading or writing the snapshot entirely. No object storage traffic, no replay cost. The trade-off is that you own persistence end-to-end.

`incomingMessages` is **usually 0-or-1-length**. `submit-message` and tool-approval responses ship a single message; `regenerate-message`, continuations, and actions ship none. The exception is a [Head Start](/docs/ai-chat/fast-starts#with-hydratemessages) first turn, where it carries the route handler's first-turn history. Patterns like [tool-result auditing](/docs/ai-chat/patterns/tool-result-auditing) work the same regardless — iterate the array rather than assuming a single element.

## onTurnStart

Fires at the start of **every turn** — including the first turn of a continuation run, where `onChatStart` doesn't fire. Runs after message accumulation and (when applicable) `onChatStart`, but **before** `run()` executes. Use it to persist messages before streaming begins so a mid-stream page refresh still shows the user's message.

| Field | Type | Description |
| --- | --- | --- |
| `ctx` | `TaskRunContext` | Full task run context. See [reference](/docs/ai-chat/reference#task-context-ctx). |
| `chatId` | `string` | Chat session ID |
| `messages` | `ModelMessage[]` | Full accumulated conversation (model format) |
| `uiMessages` | `UIMessage[]` | Full accumulated conversation (UI format) |
| `turn` | `number` | Turn number (0-indexed) |
| `runId` | `string` | The Trigger.dev run ID |
| `chatAccessToken` | `string` | Scoped access token for this run |
| `continuation` | `boolean` | Whether this run is continuing an existing chat |
| `preloaded` | `boolean` | Whether this run was preloaded |
| `clientData` | Typed by `clientDataSchema` | Custom data from the frontend |
| `writer` | [`ChatWriter`](/docs/ai-chat/reference#chatwriter) | Stream writer for custom chunks |

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  onTurnStart: async ({ chatId, uiMessages, runId, chatAccessToken }) => {
    await db.chat.update({
      where: { id: chatId },
      data: { messages: uiMessages },
    });
    await db.chatSession.upsert({
      where: { id: chatId },
      create: { id: chatId, runId, publicAccessToken: chatAccessToken },
      update: { runId, publicAccessToken: chatAccessToken },
    });
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

By persisting in `onTurnStart`, the user's message is saved to your database before the AI starts streaming. If the user refreshes mid-stream, the message is already there.

## onBeforeTurnComplete

Fires after the response is captured but **before** the stream closes. The `writer` can send custom chunks that appear in the current turn. Use this for post-processing indicators, compaction progress, or any data the user should see before the turn ends.
```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  onBeforeTurnComplete: async ({ writer, usage, uiMessages }) => {
    // Write a custom data part while the stream is still open
    writer.write({
      type: "data-usage-summary",
      data: {
        tokens: usage?.totalTokens,
        messageCount: uiMessages.length,
      },
    });

    // You can also compact messages here and write progress
    if (usage?.totalTokens && usage.totalTokens > 50_000) {
      writer.write({ type: "data-compaction", data: { status: "compacting" } });
      // compactedMessages is a placeholder for the output of your own
      // compaction step; it is not defined in this snippet.
      chat.setMessages(compactedMessages);
      writer.write({ type: "data-compaction", data: { status: "complete" } });
    }
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

Receives the same fields as [`TurnCompleteEvent`](/docs/ai-chat/reference#turncompleteevent), plus a [`writer`](/docs/ai-chat/reference#chatwriter).

## onTurnComplete

Fires after each turn completes, after the response is captured and the stream is closed. This is the primary hook for persisting the assistant's response. Does not include a `writer` since the stream is already closed.

| Field | Type | Description |
| --- | --- | --- |
| `ctx` | `TaskRunContext` | Full task run context. See [reference](/docs/ai-chat/reference#task-context-ctx). |
| `chatId` | `string` | Chat session ID |
| `messages` | `ModelMessage[]` | Full accumulated conversation (model format) |
| `uiMessages` | `UIMessage[]` | Full accumulated conversation (UI format) |
| `newMessages` | `ModelMessage[]` | Only this turn's messages (model format) |
| `newUIMessages` | `UIMessage[]` | Only this turn's messages (UI format) |
| `responseMessage` | `UIMessage \| undefined` | The assistant's response for this turn |
| `turn` | `number` | Turn number (0-indexed) |
| `runId` | `string` | The Trigger.dev run ID |
| `chatAccessToken` | `string` | Scoped access token for this run |
| `lastEventId` | `string \| undefined` | Stream position for resumption. Persist this with the session. |
| `stopped` | `boolean` | Whether the user stopped generation during this turn |
| `continuation` | `boolean` | Whether this run is continuing an existing chat |
| `rawResponseMessage` | `UIMessage \| undefined` | The raw assistant response before abort cleanup (same as `responseMessage` when not stopped) |

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  onTurnComplete: async ({ chatId, uiMessages, runId, chatAccessToken, lastEventId }) => {
    // Atomic write — see Database persistence for the race-condition rationale
    await db.$transaction([
      db.chat.update({
        where: { id: chatId },
        data: { messages: uiMessages },
      }),
      db.chatSession.upsert({
        where: { id: chatId },
        create: { id: chatId, runId, publicAccessToken: chatAccessToken, lastEventId },
        update: { runId, publicAccessToken: chatAccessToken, lastEventId },
      }),
    ]);
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

Use `uiMessages` to overwrite the full conversation each turn (simplest). Use `newUIMessages` if you prefer to store messages individually, e.g. one database row per message.

Persist `lastEventId` alongside the session.
When the transport reconnects after a page refresh, it uses this to skip past already-seen events, preventing duplicate messages.

For a full **conversation + session** persistence pattern (including preload, continuation, and token renewal), see [Database persistence](/docs/ai-chat/patterns/database-persistence).

## onChatSuspend / onChatResume

Chat-specific hooks that fire at the **idle-to-suspended** transition: the moment the run stops using compute and waits for the next message. These replace the need for the generic `onWait` / `onResume` task hooks for chat-specific work.

The `phase` discriminator tells you **when** the suspend/resume happened:

* `"preload"`: after `onPreload`, waiting for the first message
* `"turn"`: after `onTurnComplete`, waiting for the next message

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  onChatSuspend: async (event) => {
    // Tear down expensive resources before suspending
    await disposeCodeSandbox(event.ctx.run.id);
    if (event.phase === "turn") {
      logger.info("Suspending after turn", { turn: event.turn });
    }
  },
  onChatResume: async (event) => {
    // Re-initialize after waking up
    logger.info("Resumed", { phase: event.phase });
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

| Field | Type | Description |
| --- | --- | --- |
| `phase` | `"preload" \| "turn"` | Whether this is a preload or post-turn suspension |
| `ctx` | `TaskRunContext` | Full task run context |
| `chatId` | `string` | Chat session ID |
| `runId` | `string` | The Trigger.dev run ID |
| `clientData` | Typed by `clientDataSchema` | Custom data from the frontend |
| `turn` | `number` | Turn number (**`"turn"` phase only**) |
| `messages` | `ModelMessage[]` | Accumulated model messages (**`"turn"` phase only**) |
| `uiMessages` | `UIMessage[]` | Accumulated UI messages (**`"turn"` phase only**) |

Unlike `onWait` (which fires for all wait types: duration, task, batch, token), `onChatSuspend` fires only at chat suspension points with full chat context. No need to filter on `wait.type`.

## exitAfterPreloadIdle

When set to `true`, a preloaded run completes successfully after the idle timeout elapses instead of suspending. Use this for "fire and forget" preloads. If the user doesn't send a message during the idle window, the run ends cleanly.

```ts theme={"theme":"css-variables"}
export const myChat = chat.agent({
  id: "my-chat",
  preloadIdleTimeoutInSeconds: 10,
  exitAfterPreloadIdle: true,
  onPreload: async ({ chatId, clientData }) => {
    // Eagerly set up state. If no message comes, the run just ends.
    await initializeChat(chatId, clientData);
  },
  run: async ({ messages, signal }) => {
    return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
  },
});
```

## See also

* [Reference](/docs/ai-chat/reference) for full event-type definitions
* [Database persistence](/docs/ai-chat/patterns/database-persistence) for the canonical persistence pattern
* [Code execution sandbox](/docs/ai-chat/patterns/code-sandbox) for an `onChatSuspend` use case
* [Backend](/docs/ai-chat/backend) for `chat.agent({ ... })` itself, prompts, stop signals, persistence overview, and runtime configuration

# MCP Server

Source: https://trigger.dev/docs/ai-chat/mcp

Chat with your agents from any AI coding tool using the Trigger.dev MCP server.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.
The Trigger.dev MCP server includes tools for having conversations with your chat agents directly from AI coding tools like Claude Code, Cursor, Windsurf, and others. This lets your AI assistant interact with your agents without writing any code.

## Available tools

| Tool | Description |
| --- | --- |
| `list_agents` | List all agents in the current worker |
| `start_agent_chat` | Start a conversation with an agent |
| `send_agent_message` | Send a message and get the response |
| `close_agent_chat` | Close a conversation |

See the [MCP Tools Reference](/docs/mcp-tools#agent-chat-tools) for full details on each tool.

## Typical workflow

1. Ask your AI assistant to list agents in your project. This calls `list_agents` which returns all tasks created with [`chat.agent()`](/docs/ai-chat/backend#chat-agent) or [`chat.customAgent()`](/docs/ai-chat/custom-agents).
2. Start a conversation with an agent using `start_agent_chat`. This triggers a run and optionally preloads the agent so it's ready to respond immediately. If the agent uses `withClientData`, pass the required fields via `clientData`.
3. Send messages with `send_agent_message`. The agent processes each message and returns the full response text, including which tools were used. The agent retains full context across messages in the same chat.
4. When done, close the conversation with `close_agent_chat`. This signals the agent to exit gracefully. If you skip this step, the agent will close on its own when its idle timeout expires.

## Example conversation

Here's what a typical MCP conversation with an agent looks like:

```
You: List my agents

AI:  Found 3 agents in worker 20250402.12 (dev):
     - support-agent (src/trigger/support.ts)
     - pr-review (src/trigger/pr-review.ts)
     - research (src/trigger/research.ts)

You: Start a chat with the support agent

AI:  Agent chat started and preloaded.
     - Chat ID: abc-123
     - Agent: support-agent
     - Run ID: run_xyz

You: Ask it to look up order #4821

AI:  I found order #4821. It was placed on March 15 and shipped via FedEx
     (tracking: 7891234). Current status: In Transit, estimated delivery March 19.

     Tools used: lookupOrder, getShippingStatus

You: Close the chat

AI:  Chat abc-123 closed.
```

## Setup

If you haven't set up the MCP server yet, see the [MCP Server introduction](/docs/mcp-introduction) for installation and client configuration.

Agent chat tools require:

* A running dev server (`trigger dev`) or a deployed worker
* At least one agent defined with [`chat.agent()`](/docs/ai-chat/backend#chat-agent) or [`chat.customAgent()`](/docs/ai-chat/custom-agents)

## How it works

Under the hood, the MCP tools use the same protocol as the [frontend transport](/docs/ai-chat/frontend) and [AgentChat SDK](/docs/ai-chat/server-chat):

1. **`start_agent_chat`** triggers a task run with the `preload` trigger and stores the session (run ID, chat ID) in memory.
2. **`send_agent_message`** sends the message via the run's input stream and subscribes to the output SSE stream to collect the agent's full response.
3. **`close_agent_chat`** sends a close signal via the input stream and removes the session.

Sessions are held in-memory within the MCP server process. If the MCP server restarts, active sessions are lost — but the underlying agent runs continue until their idle timeout.

The `get_current_worker` tool also labels agents with `[agent]` in its output, making it easy to identify which tasks are agents even when listing all tasks.
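The in-memory session bookkeeping described above can be sketched as a plain map keyed on chat ID. This is an illustrative model only, not the MCP server's actual code: `AgentChatSession`, `InMemorySessionStore`, and their method names are hypothetical, and the real tools also talk to the run's input and output streams, which this sketch leaves out.

```typescript
// Hypothetical session record: the docs say the run ID and chat ID are stored.
type AgentChatSession = { chatId: string; runId: string; agentId: string };

// Hypothetical store: lives only as long as the process that created it.
class InMemorySessionStore {
  private sessions = new Map<string, AgentChatSession>();

  // Mirrors what start_agent_chat is described as doing after triggering the run.
  start(session: AgentChatSession): void {
    this.sessions.set(session.chatId, session);
  }

  // Mirrors the lookup send_agent_message would need before writing to the run.
  get(chatId: string): AgentChatSession | undefined {
    return this.sessions.get(chatId);
  }

  // Mirrors close_agent_chat removing the session; returns false if it was unknown.
  close(chatId: string): boolean {
    return this.sessions.delete(chatId);
  }
}
```

A second `InMemorySessionStore` instance starts empty, which is the same situation as an MCP server restart: the chat IDs are gone from the process even though the agent runs themselves keep going until their idle timeout.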
## See also

* [AgentChat SDK](/docs/ai-chat/server-chat) — programmatic server-side access to agents
* [Sub-Agents](/docs/ai-chat/patterns/sub-agents) — agents calling other agents
* [MCP Tools Reference](/docs/mcp-tools#agent-chat-tools) — full tool parameter reference

# AI Agents

Source: https://trigger.dev/docs/ai-chat/overview

Durable multi-turn AI chats — one Trigger.dev task per conversation, surviving refreshes, deploys, and crashes.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

An AI chat isn't a request — it's a session. `chat.agent` runs every conversation as a single long-lived Trigger.dev task: you write the loop, it wakes up when a message arrives, freezes when none do, and the same in-memory state and on-disk workspace survive across page refreshes, deploys, idle gaps, and crashes. The substrate handles the parts most teams stitch together by hand — turn lifecycle, mid-stream resume, recovery from cancel/crash/OOM, HITL approvals, deploy upgrades — so your code is the loop you'd write anyway: messages in, `streamText` out.

## A minimal example

A `chat.agent` task takes `messages`, calls `streamText`, and returns the result. The frontend wires the [Vercel AI SDK's `useChat`](https://ai-sdk.dev/docs/reference/ai-sdk-ui/use-chat) to a `TriggerChatTransport`. No API routes.
```ts trigger/chat.ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

export const myChat = chat.agent({
  id: "my-chat",
  run: async ({ messages, signal }) =>
    streamText({
      model: anthropic("claude-sonnet-4-5"),
      messages,
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    }),
});
```

```tsx app/components/Chat.tsx theme={"theme":"css-variables"}
import { useChat } from "@ai-sdk/react";
import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";

export function Chat() {
  const transport = useTriggerChatTransport({
    task: "my-chat",
    accessToken: ({ chatId }) => mintChatAccessToken(chatId),
    startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  });
  const { messages, sendMessage } = useChat({ transport });
  // ... render UI
}
```

See [Quick Start](/docs/ai-chat/quick-start) for the matching server actions and a runnable project.

## Why use AI Agents on Trigger.dev

* **Resume across refreshes, deploys, and crashes.** A chat in progress when you redeploy keeps streaming on the new version. Mid-stream refreshes pick up where they left off.
* **Native AI SDK support.** Text, tool calls, reasoning, and custom `data-*` parts all flow through `useChat` over a custom `ChatTransport`. No custom protocol to maintain.
* **Multi-turn for free.** Each turn is a step inside the same durable task; conversation history accumulates server-side, so clients only ship the new message.
* **Fast cold starts.** Opt-in [Head Start](/docs/ai-chat/fast-starts#head-start) runs the first `streamText` step in your warm Next.js / Hono / SvelteKit server while the agent boots in parallel — cuts time-to-first-chunk roughly in half.
* **Production primitives ship in the box.** Stop generation, steering, edits, branching, sub-agents, HITL tool approvals, version upgrades, recovery from cancel/crash/OOM — all first-class.
* **Observable.** Every turn is a span in the Trigger.dev dashboard. Sessions are queryable via `sessions.list` for inbox-style UIs.

## How it fits together

Three primitives, related but distinct:

* **Chat agents** — the SDK surface you define with [`chat.agent()`](/docs/ai-chat/backend#chat-agent). Owns the turn loop, lifecycle hooks, and the response stream.
* **Sessions** — the durable, bi-directional channel keyed on `chatId` that holds the conversation across run boundaries. A chat agent runs *on top of* a [Session](/docs/ai-chat/sessions).
* **Sub-agents** — Delegate work from one agent to another via [`AgentChat`](/docs/ai-chat/patterns/sub-agents). The sub-agent runs as its own durable agent on its own session; its response streams back through the parent as preliminary tool results, so the frontend sees the sub-agent working inside the parent's tool card.

## Next steps

* Get a working chat in three steps — agent, token, frontend.
* Sessions, the turn loop, durable streams, and what survives a refresh.
* `chat.agent` options, lifecycle hooks, and the raw-task primitives.
* Declare tools so `toModelOutput` survives across turns, typed in `run()`.
* HITL approvals, branching, sub-agents, OOM/crash recovery.
* Size and release connection pools so agents don't exhaust your database.

# Branching conversations

Source: https://trigger.dev/docs/ai-chat/patterns/branching-conversations

Build ChatGPT-style conversation trees with edit, regenerate, undo, and branch switching using hydrateMessages, chat.history, and actions.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

Most chat UIs treat conversations as linear sequences.
But real conversations branch — users edit previous messages, regenerate responses, undo exchanges, and explore alternative paths. This pattern shows how to build a branching conversation system using `hydrateMessages`, `chat.history`, and custom actions.

## Data model

The standard approach (used by ChatGPT, Open WebUI, LibreChat, and others) stores messages as a tree with parent pointers:

```ts theme={"theme":"css-variables"}
import type { UIMessage } from "ai";

// Each message is a node in the tree
type ChatNode = {
  id: string;
  chatId: string;
  parentId: string | null; // null for root
  role: "user" | "assistant";
  message: UIMessage; // the full AI SDK message
  createdAt: Date;
};
```

A conversation is a tree of nodes. The **active branch** is resolved by walking from a leaf node up through `parentId` pointers to the root, then reversing:

```
root
├── user: "Hello"
│   └── assistant: "Hi there!"
│       ├── user: "What's the weather?"     ← branch A
│       │   └── assistant: "It's sunny!"
│       └── user: "Tell me a joke"          ← branch B (active)
│           └── assistant: "Why did the..."
```

Switching branches means changing which leaf is "active" — the same tree, different path.

## Backend setup

### Store: tree operations

Define helpers that read and write the node tree. Adapt to your database:

```ts theme={"theme":"css-variables"}
// Resolve the active path: walk from leaf to root, reverse
async function getActiveBranch(chatId: string): Promise<UIMessage[]> {
  const nodes = await db.chatNode.findMany({ where: { chatId } });
  const byId = new Map(nodes.map((n) => [n.id, n]));

  // Find active leaf (most recently created leaf node)
  const childIds = new Set(nodes.map((n) => n.parentId).filter(Boolean));
  const leaves = nodes.filter((n) => !childIds.has(n.id));
  // Compare timestamps numerically; Date objects can't be subtracted in TypeScript
  const activeLeaf = leaves.sort((a, b) => b.createdAt.getTime() - a.createdAt.getTime())[0];
  if (!activeLeaf) return [];

  // Walk to root
  const path: UIMessage[] = [];
  let current: ChatNode | undefined = activeLeaf;
  while (current) {
    path.unshift(current.message);
    current = current.parentId ?
      byId.get(current.parentId) : undefined;
  }
  return path;
}

// Append a message as a child of the current leaf
async function appendMessage(chatId: string, message: UIMessage): Promise<void> {
  const branch = await getActiveBranch(chatId);
  const parentId = branch.length > 0 ? branch[branch.length - 1]!.id : null;
  await db.chatNode.create({
    data: { id: message.id, chatId, parentId, role: message.role, message, createdAt: new Date() },
  });
}
```

### Agent: hydration + actions

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs, generateId } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

export const myChat = chat.agent({
  id: "branching-chat",

  // Load the active branch from the DB on every turn.
  // The frontend's message array is ignored — the tree is the source of truth.
  hydrateMessages: async ({ chatId, trigger, incomingMessages }) => {
    if (trigger === "submit-message" && incomingMessages.length > 0) {
      await appendMessage(chatId, incomingMessages[incomingMessages.length - 1]!);
    }
    return getActiveBranch(chatId);
  },

  actionSchema: z.discriminatedUnion("type", [
    // Edit a previous user message — creates a sibling node in the tree
    z.object({ type: z.literal("edit"), messageId: z.string(), text: z.string() }),
    // Switch to a different branch by selecting a leaf node
    z.object({ type: z.literal("switch-branch"), leafId: z.string() }),
    // Undo the last user + assistant exchange
    z.object({ type: z.literal("undo") }),
  ]),

  onAction: async ({ action, chatId }) => {
    switch (action.type) {
      case "edit": {
        // Find the original message's parent, create a sibling with new content
        const original = await db.chatNode.findUnique({ where: { id: action.messageId } });
        if (!original) break;
        const newId = generateId();
        await db.chatNode.create({
          data: {
            id: newId,
            chatId,
            parentId: original.parentId, // same parent = sibling
            role: "user",
            message: { id: newId, role: "user", parts: [{ type: "text", text: action.text }] },
            createdAt: new Date(),
          },
        });
        // Active branch now resolves through the new sibling (most recent leaf)
        break;
      }
      case "switch-branch": {
        // Mark this leaf as the most recently accessed so getActiveBranch picks it
        await db.chatNode.update({
          where: { id: action.leafId },
          data: { createdAt: new Date() },
        });
        break;
      }
      case "undo": {
        // Remove the last two nodes (user + assistant) from the active branch
        const branch = await getActiveBranch(chatId);
        if (branch.length >= 2) {
          const lastTwo = branch.slice(-2);
          await db.chatNode.deleteMany({
            where: { id: { in: lastTwo.map((m) => m.id) } },
          });
        }
        break;
      }
    }

    // Reload the (now modified) active branch into the accumulator
    const updated = await getActiveBranch(chatId);
    chat.history.set(updated);
  },

  onTurnComplete: async ({ chatId, responseMessage }) => {
    // Persist the assistant's response as a new node
    if (responseMessage) {
      await appendMessage(chatId, responseMessage);
    }
  },

  run: async ({ messages, signal }) => {
    return streamText({
      model: anthropic("claude-sonnet-4-5"),
      messages,
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    });
  },
});
```

## Frontend

### Sending actions

Wire up edit, undo, and branch switching to the transport:

```tsx theme={"theme":"css-variables"}
import { useState } from "react";
import type { UIMessage } from "ai";

function MessageActions({ message, chatId }: { message: UIMessage; chatId: string }) {
  const transport = useTransport();
  const [editing, setEditing] = useState(false);
  const [editText, setEditText] = useState("");

  if (message.role !== "user") return null;

  return (
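    <div>
      {editing ? (
        <form
          onSubmit={(e) => {
            e.preventDefault();
            transport.sendAction(chatId, { type: "edit", messageId: message.id, text: editText });
            setEditing(false);
          }}
        >
          <input value={editText} onChange={(e) => setEditText(e.target.value)} />
          <button type="submit">Save</button>
        </form>
      ) : (
        <button onClick={() => setEditing(true)}>Edit</button>
      )}
    </div>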
); } ``` ### Branch navigation To show the `< 2/3 >` sibling switcher, query the tree for siblings at each fork point. This is a frontend concern — the backend exposes the data, the UI navigates it. ```tsx theme={"theme":"css-variables"} function BranchSwitcher({ message, chatId, siblings }: { message: UIMessage; chatId: string; siblings: { id: string; createdAt: string }[]; }) { const transport = useTransport(); if (siblings.length <= 1) return null; const currentIndex = siblings.findIndex((s) => s.id === message.id); return (
    <span>
      {currentIndex + 1}/{siblings.length}
    </span>
); } ``` The sibling data (which messages share the same parent) needs to come from your database — query it when loading the chat or include it as client data. The agent only returns the active branch via `hydrateMessages`. ## How it works | Operation | What happens | | ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | **Send message** | `hydrateMessages` appends the new message as a child of the current leaf, returns the active path | | **Edit message** | `onAction` creates a sibling node with the same parent. The new node becomes the latest leaf, so `hydrateMessages` resolves through it. LLM responds to the edited history | | **Regenerate** | Same as edit — create a new assistant sibling. The AI SDK's `regenerate()` handles this via `trigger: "regenerate-message"` | | **Undo** | `onAction` removes the last two nodes. `chat.history.set()` updates the accumulator. LLM responds to the earlier state | | **Switch branch** | `onAction` updates which leaf is "active". `hydrateMessages` loads the new path. LLM responds to the switched context | ## Design notes * **Messages are immutable** — edits create siblings, not mutations. This preserves full history for analytics and auditing. * **The tree lives in your database** — the agent loads a linear path from it via `hydrateMessages`. The agent itself doesn't know about the tree structure. * **`hydrateMessages` + `onAction` + `chat.history`** are the three primitives. Hydration loads the active path, actions modify the tree, and `chat.history.set()` syncs the accumulator after tree modifications. * **Frontend owns navigation** — the `< 2/3 >` UI, sibling queries, and branch switching triggers are client-side concerns. The backend just processes actions and returns responses. 
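The sibling lookup described under "Branch navigation" can be sketched as a pure function over the node list. This is a minimal illustration, assuming you have already loaded one chat's nodes; `SiblingNode`, `getSiblings`, and `siblingLabel` are hypothetical names, not part of the SDK.

```ts
// Minimal node shape for this sketch; only the ChatNode fields needed here.
type SiblingNode = { id: string; parentId: string | null; createdAt: Date };

// Hypothetical helper: all nodes sharing the target's parent, oldest first.
// Assumes `nodes` holds the nodes of a single chat.
function getSiblings(nodes: SiblingNode[], messageId: string): SiblingNode[] {
  const target = nodes.find((n) => n.id === messageId);
  if (!target) return [];
  return nodes
    .filter((n) => n.parentId === target.parentId)
    .sort((a, b) => a.createdAt.getTime() - b.createdAt.getTime());
}

// Hypothetical helper: the "2/3" style label, or "" when the node is unknown.
function siblingLabel(nodes: SiblingNode[], messageId: string): string {
  const siblings = getSiblings(nodes, messageId);
  const index = siblings.findIndex((s) => s.id === messageId);
  return index === -1 ? "" : `${index + 1}/${siblings.length}`;
}

const demoTree: SiblingNode[] = [
  { id: "u1", parentId: null, createdAt: new Date(1000) },
  { id: "a1", parentId: "u1", createdAt: new Date(2000) },
  { id: "u2", parentId: "a1", createdAt: new Date(3000) }, // branch A
  { id: "u3", parentId: "a1", createdAt: new Date(4000) }, // branch B
];

console.log(siblingLabel(demoTree, "u3"));
```

Ordering siblings by `createdAt` is a choice made for this sketch. Because the `switch-branch` handler above rewrites `createdAt` on the selected leaf, a dedicated ordering column may be a safer basis for a stable `< 2/3 >` display.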
## See also * [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) — backend-controlled message history * [Actions](/docs/ai-chat/actions) — custom actions with `actionSchema` and `onAction` * [`chat.history`](/docs/ai-chat/backend#chat-history) — imperative history mutations * [Database persistence](/docs/ai-chat/patterns/database-persistence) — basic persistence pattern (linear) # Code execution sandbox Source: https://trigger.dev/docs/ai-chat/patterns/code-sandbox Warm an isolated sandbox on each chat turn, run an AI SDK executeCode tool, and tear down right before the run suspends — using chat.agent hooks and chat.local. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. Use a **hosted code sandbox** (for example [E2B](https://e2b.dev)) when the model should run short scripts to analyze tool output (PostHog queries, CSV-like data, math) without executing arbitrary code on the Trigger worker host. This page describes a **durable chat** pattern that fits `chat.agent()`: * **Warm** the sandbox at the start of each turn (**non-blocking**). * **Reuse** it for every `executeCode` tool call during that turn (and across turns in the same run if you keep the handle). * **Dispose** it **right before the run suspends** waiting for the next user message — using the **`onChatSuspend`** hook, not `onTurnComplete`. ## Why not tear down in `onTurnComplete`? After a turn finishes, the chat runtime still goes through an **idle** window and only then suspends. During that window the run is still executing — useful for `chat.defer()` work — and the run hasn't suspended yet. 
The boundary you want for “turn done, about to sleep” is **`onChatSuspend`**, which fires right before the run transitions from idle to suspended. It provides the `phase` (`"preload"` or `"turn"`) and full chat context. See [onChatSuspend / onChatResume](/docs/ai-chat/lifecycle-hooks#onchatsuspend--onchatresume).

```mermaid theme={"theme":"css-variables"}
sequenceDiagram
  participant TurnStart as onTurnStart
  participant Run as run / streamText
  participant TurnDone as onTurnComplete
  participant Idle as Idle window
  participant Suspend as onChatSuspend
  participant Sleep as suspended
  TurnStart->>Run: warm sandbox (async)
  Run->>TurnDone: persist / inject / etc.
  TurnDone->>Idle: still running
  Idle->>Suspend: dispose sandbox
  Suspend->>Sleep: waiting for next message
```

## Recommended provider: E2B

* **API key** auth — works from any Trigger.dev worker; no Vercel-only OIDC.
* **Code Interpreter** SDK (`@e2b/code-interpreter`): long-lived sandbox, `runCode()`, `kill()`.

Alternatives (Modal, Daytona, raw Docker) are fine but more DIY. Vercel’s sandbox + AI SDK helpers are a better fit when execution stays **on Vercel**, not on the Trigger worker.

## Implementation sketch

### 1. Run-scoped sandbox map

Keep a `Map<string, Promise<Sandbox>>` (or similar), keyed by run id, in a **task-only module** so your Next.js app never imports it.

### 2. `onTurnStart` — warm without blocking

```ts theme={"theme":"css-variables"}
onTurnStart: async ({ runId, ctx, ...rest }) => {
  warmCodeSandbox(runId); // fire-and-forget Sandbox.create()
  // ...persist messages, writer, etc.
},
```

### 3. `chat.local` — run id for tools

Tool `execute` functions do not receive hook payloads.
Use [`chat.local()`](/docs/ai-chat/chat-local) to store the current run id for the sandbox key, **initialized from `onTurnStart`** (same `runId` as the map): ```ts theme={"theme":"css-variables"} // In the same task module as your tools import { chat } from "@trigger.dev/sdk/ai"; export const codeSandboxRun = chat.local<{ runId: string }>({ id: "codeSandboxRun" }); export function warmCodeSandbox(runId: string) { codeSandboxRun.init({ runId }); // ...start Sandbox.create(), store promise in Map by runId } ``` The **`executeCode`** tool reads `codeSandboxRun.runId` and awaits the sandbox promise before `runCode`. ### 4. `onChatSuspend` / `onComplete` — teardown Use **`onChatSuspend`** to dispose the sandbox right before the run suspends, and **`onComplete`** as a safety net when the run ends entirely. ```ts theme={"theme":"css-variables"} export const aiChat = chat.agent({ id: "ai-chat", // ... onChatSuspend: async ({ phase, ctx }) => { await disposeCodeSandboxForRun(ctx.run.id); }, onComplete: async ({ ctx }) => { await disposeCodeSandboxForRun(ctx.run.id); }, }); ``` Unlike `onWait` (which fires for all wait types), `onChatSuspend` only fires at chat suspension points — no need to filter on `wait.type`. The `phase` discriminator tells you if this is a preload or post-turn suspension. Optional **`onChatResume`**: log or reset flags; a fresh sandbox can be warmed again on the next **`onTurnStart`**. ### 5. AI SDK tool Wrap the provider in a normal AI SDK `tool({ inputSchema, execute })` (same pattern as `webFetch`). Keep tool definitions in **task code**, not in the Next.js server bundle. ### 6. Environment Set **`E2B_API_KEY`** (or your provider’s secret) on the **Trigger environment** for the worker — not in public client env. ## Typing `ctx` Every `chat.agent` lifecycle event and the `run` payload include **`ctx`**: the same **[`TaskRunContext`](/docs/ai-chat/reference#task-context-ctx)** shape as `task({ run: (payload, { ctx }) => ... })`. 
```ts theme={"theme":"css-variables"} import type { TaskRunContext } from "@trigger.dev/sdk"; ``` The alias **`Context`** is also exported from `@trigger.dev/sdk` and is the same type. ## See also * [Database persistence for chat](/docs/ai-chat/patterns/database-persistence) — conversation + session rows, hooks, token renewal * [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) * [API Reference — `ctx` on events](/docs/ai-chat/reference#task-context-ctx) * [Per-run data with `chat.local`](/docs/ai-chat/chat-local) # Database persistence for chat Source: https://trigger.dev/docs/ai-chat/patterns/database-persistence Split conversation state and live session metadata across hooks — preload, turn start, turn complete — without tying the pattern to a specific ORM or schema. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. Durable chat runs can span **hours** and **many turns**. You usually want: 1. **Conversation state** — full **`UIMessage[]`** (or equivalent) keyed by **`chatId`**, so reloads and history views work. 2. **Live session state** — a **scoped access token** for the session and optionally **`lastEventId`** for stream resume. This page describes a **hook mapping** that works with any database. Adapt table and column names to your stack. 
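As a rough sketch of the two kinds of state listed above, the row shapes and the read a page reload performs might look like the following. The field names follow this page; the types, the in-memory `Map` stand-ins for tables, and the `loadChatForReload` helper are assumptions for illustration only.

```ts
// Illustrative row shapes; adapt names and column types to your schema.
type StoredUIMessage = { id: string; role: "system" | "user" | "assistant"; parts: unknown[] };

type ConversationRow = {
  id: string; // same value as chatId
  userId: string;
  title: string;
  messages: StoredUIMessage[]; // serialized UIMessage[]
};

type SessionRow = {
  id: string; // same chatId
  publicAccessToken: string;
  lastEventId?: string; // SSE resume cursor, if one has been stored
};

// Hypothetical helper: gather what a page reload needs for one chat.
// The Maps stand in for two database tables.
function loadChatForReload(
  conversations: Map<string, ConversationRow>,
  sessions: Map<string, SessionRow>,
  chatId: string
) {
  const conversation = conversations.get(chatId);
  const session = sessions.get(chatId);
  return {
    initialMessages: conversation?.messages ?? [],
    session: session
      ? { publicAccessToken: session.publicAccessToken, lastEventId: session.lastEventId }
      : undefined,
  };
}
```

Keeping the two reads separate mirrors the split in the list above: the transcript drives rendering, while the session values are handed to the transport.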
## Conceptual data model You can use one table or two; the important split is **semantic**: | Concept | Purpose | Typical fields | | ------------------ | ------------------------------------- | ------------------------------------------------------------------------------------------------------------- | | **Conversation** | Durable transcript + display metadata | Stable id (same as **`chatId`**), serialized **`uiMessages`**, title, model choice, owner/user id, timestamps | | **Active session** | Hydrate the transport on page reload | Same **`chatId`** as key (or FK), **`publicAccessToken`**, optional **`lastEventId`** | The **conversation** row is what your UI lists as "chats." The **session** row is what the **transport** needs after a refresh: a session-scoped PAT (so the transport doesn't have to re-mint on first paint) and the SSE resume cursor. Storing the current **`runId`** is optional — useful for telemetry / dashboard linking ("View this run") but not required for resume. The Session row owns its current run server-side; the transport reads from `session.out` keyed on `chatId`, so a run swap (continuation, upgrade) is invisible to your DB schema. Store **`UIMessage[]`** in a JSON-compatible column, or normalize to a messages table — the pattern is *when* you read/write, not *how* you encode rows. ## Where each hook writes This pattern covers **durable DB rows** (the conversation and the active session). Per-process in-memory state ([`chat.local`](/docs/ai-chat/chat-local), [DB connection pools](/docs/database-connections), sandboxes, etc.) belongs in [`onBoot`](/docs/ai-chat/lifecycle-hooks#onboot) — it fires on every fresh worker including continuation runs, where `onPreload` and `onChatStart` do not. ### `onPreload` (optional) When the user triggers [preload](/docs/ai-chat/fast-starts#preload), the run starts **before** the first user message. * Ensure the **conversation** row exists (create or no-op). 
* **Upsert session**: **`chatAccessToken`** from the event (a session-scoped PAT covering both `read:sessions:{chatId}` and `write:sessions:{chatId}`). * Load any **user / tenant context** you need for prompts (`clientData`). If you skip preload, do the equivalent in **`onChatStart`** when **`preloaded`** is false. ### `onChatStart` (chat's first message, non-preloaded path) * Fires **once per chat**, on the very first user message. Does NOT fire on continuation runs (post-`endRun`, post-waitpoint-timeout, post-`chat.requestUpgrade`) or on OOM-retry attempts. * If **`preloaded`** is true, return early — **`onPreload`** already ran. * Otherwise mirror preload: user/context, conversation create, session upsert. * No need to gate the conversation create on `continuation` — it's always a brand-new chat at this point. * For continuation runs that need to refresh per-run state (new PAT, new `lastEventId`), do it in **`onTurnStart`** / **`onTurnComplete`** — both fire on every turn including the first turn of a continuation run. ### `onTurnStart` * **`await`** persist **`uiMessages`** (full accumulated history including the new user turn) **before** the hook returns — `chat.agent` does not begin streaming until `onTurnStart` resolves, so this is what bounds "user message is durable before the stream". **Don't use [`chat.defer()`](/docs/ai-chat/background-injection#chat-defer-standalone) for the message write here.** `chat.defer` is fire-and-forget — the hook resolves before the write lands and the stream starts immediately. If the user refreshes mid-stream, the next page load reads `[]` from your DB, the resumed SSE stream pushes the assistant into an empty array, and the user's message disappears from the rendered conversation forever. ```ts theme={"theme":"css-variables"} // ❌ Bad — non-blocking write, mid-stream refresh drops the user message. 
onTurnStart: async ({ chatId, uiMessages }) => { chat.defer(db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } })); }, // ✅ Good — awaited, durable before the model starts. onTurnStart: async ({ chatId, uiMessages }) => { await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } }); }, ``` `chat.defer` is for writes whose timing doesn't matter for resume — analytics, audit logs, search-index updates, etc. Anything the next page load reads needs to land before the stream begins. ### `onTurnComplete` * Persist **`uiMessages`** again with the **assistant** reply finalized. * **Upsert session** with the fresh **`chatAccessToken`** and **`lastEventId`** from the event. **`lastEventId`** lets the frontend [resume](/docs/ai-chat/frontend) without replaying SSE events it already applied. Treat it as part of session state, not optional polish, if you care about duplicate chunks after refresh. **Write the messages and `lastEventId` in a single transaction.** Both values are read in parallel on the next page load (one fetches the conversation, the other fetches the session). If a refresh races between the two writes, the page can see the assistant message persisted (full history) but a stale `lastEventId` from the previous turn. The transport then resumes from that stale cursor and replays this turn's chunks on top of the already-persisted assistant message, producing a duplicated render. ```ts theme={"theme":"css-variables"} // ✅ Atomic — refresh on the next page load reads both writes consistently. await db.$transaction([ db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } }), db.chatSession.upsert({ where: { id: chatId }, create: { id: chatId, publicAccessToken: chatAccessToken, lastEventId }, update: { publicAccessToken: chatAccessToken, lastEventId }, }), ]); // ❌ Two awaits — narrow race window where messages are post-write but // lastEventId is still pre-write. 
A page refresh that lands here will // duplicate the assistant message on resume. await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } }); await db.chatSession.upsert({ /* ... */ }); ``` ## Token renewal (app server) The persisted PAT has a TTL (see **`chatAccessTokenTTL`** on **`chat.agent`**, default 1h). When the transport gets a **401** on a session-PAT-authed request, it calls your **`accessToken`** callback to mint a fresh PAT — no DB lookup required, since the session is keyed on `chatId` (which the transport already has). Your `accessToken` callback typically just wraps `auth.createPublicToken`: ```ts theme={"theme":"css-variables"} "use server"; import { auth } from "@trigger.dev/sdk"; export async function mintChatAccessToken(chatId: string) { return auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId } }, expirationTime: "1h", }); } ``` If you want to keep your DB session row in sync, the transport's **`onSessionChange`** callback fires every time the cached PAT changes — persist the new value there. No Trigger task code needs to run for renewal. ## Minimal pseudocode ```typescript theme={"theme":"css-variables"} // Pseudocode — replace saveConversation / saveSession with your DB layer. chat.agent({ id: "my-chat", clientDataSchema: z.object({ userId: z.string() }), onPreload: async ({ chatId, chatAccessToken, clientData }) => { if (!clientData) return; await ensureUser(clientData.userId); await upsertConversation({ id: chatId, userId: clientData.userId /* ... */ }); await upsertSession({ chatId, publicAccessToken: chatAccessToken }); }, onChatStart: async ({ chatId, chatAccessToken, clientData, preloaded }) => { if (preloaded) return; // Fires once per chat — no continuation gate needed. await ensureUser(clientData.userId); await upsertConversation({ id: chatId, userId: clientData.userId /* ... 
*/ }); await upsertSession({ chatId, publicAccessToken: chatAccessToken }); }, onTurnStart: async ({ chatId, uiMessages }) => { // Awaited, not chat.defer — see the warning in `onTurnStart` above. await saveConversationMessages(chatId, uiMessages); }, onTurnComplete: async ({ chatId, uiMessages, chatAccessToken, lastEventId }) => { // Atomic: messages + lastEventId must be readable consistently on resume. // See the warning above for why a non-atomic write causes duplicate renders. await db.$transaction([ saveConversationMessagesQuery(chatId, uiMessages), upsertSessionQuery({ chatId, publicAccessToken: chatAccessToken, lastEventId }), ]); }, run: async ({ messages, signal }) => { /* streamText, etc. */ }, }); ``` ## Alternative: `hydrateMessages` For apps that need the backend to be the single source of truth for message history — abuse prevention, branching conversations, or rollback support — use [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) instead of relying on the frontend's accumulated state. With hydration, the hook loads messages from your database on every turn. The frontend's messages are ignored (except for the new user message, which arrives in `incomingMessages`): ```ts theme={"theme":"css-variables"} import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai"; export const myChat = chat.agent({ id: "my-chat", hydrateMessages: async ({ chatId, trigger, incomingMessages }) => { const record = await db.chat.findUnique({ where: { id: chatId } }); const stored = record?.messages ?? []; // `upsertIncomingMessage` pushes a fresh user message and no-ops // on HITL continuations (the runtime overlays the new tool-state // advance onto the existing entry). See lifecycle hooks for the // full pattern: /ai-chat/lifecycle-hooks#hydratemessages if (upsertIncomingMessage(stored, { trigger, incomingMessages })) { // Upsert, not update: on a head-start first turn no preload ran, // so the row may not exist yet when this hook fires. 
await db.chat.upsert({ where: { id: chatId }, create: { id: chatId, messages: stored }, update: { messages: stored }, }); } return stored; }, onTurnComplete: async ({ chatId, uiMessages, chatAccessToken, lastEventId }) => { // Persist the response and refresh session state atomically — see the // warning in the previous section for why these two writes have to be // in the same transaction. await db.$transaction([ db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } }), db.chatSession.upsert({ where: { id: chatId }, create: { id: chatId, publicAccessToken: chatAccessToken, lastEventId }, update: { publicAccessToken: chatAccessToken, lastEventId }, }), ]); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` This replaces the `onTurnStart` persistence pattern — the hook handles both loading and persisting the new message in one place. Hydration composes with [Head Start](/docs/ai-chat/fast-starts#with-hydratemessages): on a head-start first turn the route handler's history arrives as `incomingMessages`, and the write path must be an upsert because no preload ran to create the row. ## Design notes * **`chatId`** is stable for the life of a thread and is the only identifier the transport persists. Runs come and go (idle continuation, upgrade, cancel/restart) but the chat keeps its identity. * **`continuation: true`** means "same logical chat, new run" — refresh the persisted PAT, don't assume an empty conversation. * The current `runId` is available on every hook event for telemetry / dashboard linking ("View this run"), but you don't need to persist it for resume to work — the transport addresses by `chatId`. * Keep **task modules** that perform writes **out of** browser bundles; the pattern assumes persistence runs **in the worker** (or your BFF that the task calls). 
## Complete example End-to-end implementation across the three files involved: agent task, server actions, and React component. The example below trusts raw `chatId` and returns rows without filtering by user. In a real multi-user app, **scope every query by the authenticated user** — read the user from your auth/session in each server action and add `where: { userId }` to all `db.chat.*` and `db.chatSession.*` queries. Without that, one client could read or delete another user's chat state, and `getAllSessions()` would leak other users' `publicAccessToken`s. The snippet keeps auth out of the way to focus on the persistence shape. ```ts trigger/chat.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { streamText, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { z } from "zod"; import { db } from "@/lib/db"; export const myChat = chat.agent({ id: "my-chat", clientDataSchema: z.object({ userId: z.string(), }), onChatStart: async ({ chatId, clientData }) => { await db.chat.create({ data: { id: chatId, userId: clientData.userId, title: "New chat", messages: [] }, }); }, onTurnStart: async ({ chatId, uiMessages, runId, chatAccessToken }) => { // Persist messages + session before streaming await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages }, }); await db.chatSession.upsert({ where: { id: chatId }, create: { id: chatId, runId, publicAccessToken: chatAccessToken }, update: { runId, publicAccessToken: chatAccessToken }, }); }, onTurnComplete: async ({ chatId, uiMessages, runId, chatAccessToken, lastEventId }) => { // Persist assistant response + stream position atomically — see the // race-condition warning earlier on this page. 
await db.$transaction([ db.chat.update({ where: { id: chatId }, data: { messages: uiMessages }, }), db.chatSession.upsert({ where: { id: chatId }, create: { id: chatId, runId, publicAccessToken: chatAccessToken, lastEventId }, update: { runId, publicAccessToken: chatAccessToken, lastEventId }, }), ]); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` ```ts app/actions.ts theme={"theme":"css-variables"} "use server"; import { auth } from "@trigger.dev/sdk"; import { chat } from "@trigger.dev/sdk/ai"; import { db } from "@/lib/db"; export const startChatSession = chat.createStartSessionAction("my-chat"); export async function mintChatAccessToken(chatId: string) { return auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId } }, expirationTime: "1h", }); } export async function getChatMessages(chatId: string) { const found = await db.chat.findUnique({ where: { id: chatId } }); return found?.messages ?? []; } export async function getAllSessions() { const sessions = await db.chatSession.findMany(); const result: Record< string, { publicAccessToken: string; lastEventId?: string; } > = {}; for (const s of sessions) { result[s.id] = { publicAccessToken: s.publicAccessToken, lastEventId: s.lastEventId ?? 
undefined, }; } return result; } export async function deleteSession(chatId: string) { await db.chatSession.delete({ where: { id: chatId } }).catch(() => {}); } ``` ```tsx app/components/chat.tsx theme={"theme":"css-variables"} "use client"; import { useChat } from "@ai-sdk/react"; import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react"; import type { myChat } from "@/trigger/chat"; import { mintChatAccessToken, startChatSession, deleteSession } from "@/app/actions"; export function Chat({ chatId, initialMessages, initialSessions }) { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), clientData: { userId: currentUser.id }, // Type-checked against clientDataSchema sessions: initialSessions, onSessionChange: (id, session) => { if (!session) deleteSession(id); }, }); const { messages, sendMessage, stop, status } = useChat({ id: chatId, messages: initialMessages, transport, resume: initialMessages.length > 0, }); return (
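    <div>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong>{" "}
          {m.parts.map((part, i) =>
            part.type === "text" ? <span key={i}>{part.text}</span> : null
          )}
        </div>
      ))}
      <form
        onSubmit={(e) => {
          e.preventDefault();
          const input = e.currentTarget.querySelector("input");
          if (input?.value) {
            sendMessage({ text: input.value });
            input.value = "";
          }
        }}
      >
        <input placeholder="Type a message..." />
        <button type="submit">Send</button>
        {status === "streaming" && (
          <button type="button" onClick={stop}>
            Stop
          </button>
        )}
      </form>
    </div>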
); } ```
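The user-scoping advice that precedes the example can be sketched without a database. This is an illustration under assumptions: the arrays stand in for the `chat` and `chatSession` tables, and `getChatMessagesForUser` / `getSessionsForUser` are hypothetical names. In a real server action, `userId` would come from your own auth layer, never from the client.

```ts
type OwnedChat = { id: string; userId: string; messages: unknown[] };
type OwnedSession = { id: string; publicAccessToken: string; lastEventId?: string };
type SessionInfo = { publicAccessToken: string; lastEventId: string | undefined };

// Return messages only when the chat belongs to the requesting user.
function getChatMessagesForUser(chats: OwnedChat[], userId: string, chatId: string): unknown[] {
  const found = chats.find((c) => c.id === chatId && c.userId === userId);
  return found?.messages ?? [];
}

// Return only the sessions whose chat is owned by the requesting user,
// so another user's publicAccessToken is never included.
function getSessionsForUser(
  chats: OwnedChat[],
  sessions: OwnedSession[],
  userId: string
): Record<string, SessionInfo> {
  const ownedIds = new Set(chats.filter((c) => c.userId === userId).map((c) => c.id));
  const result: Record<string, SessionInfo> = {};
  for (const s of sessions) {
    if (!ownedIds.has(s.id)) continue;
    result[s.id] = { publicAccessToken: s.publicAccessToken, lastEventId: s.lastEventId };
  }
  return result;
}
```

With an ORM, the same idea is usually expressed as an extra `userId` condition in each `where` clause rather than filtering in memory.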
## See also

* [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks)
* [Session management](/docs/ai-chat/frontend#session-management) — `resume`, `lastEventId`, transport
* [`chat.defer()`](/docs/ai-chat/background-injection#chat-defer-standalone) — non-blocking writes during a turn
* [Code execution sandbox](/docs/ai-chat/patterns/code-sandbox) — combines **`onChatSuspend`** / **`onComplete`** with this persistence model

# Human-in-the-loop

Source: https://trigger.dev/docs/ai-chat/patterns/human-in-the-loop

Pause the agent mid-response to ask the user a clarifying question, then resume with their answer.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

Some turns need to stop and ask the user something before they can finish — picking between options, confirming a destructive action, or clarifying an ambiguous request. The AI SDK calls this **human-in-the-loop** (HITL), and the building block is a tool with no `execute` function.

When the LLM calls a tool that has no `execute`, `streamText` ends with the tool call still pending. The turn completes cleanly, the frontend renders UI to collect the answer, and when the user responds, a new turn resumes with the answer merged into the same assistant message.
## How it works ``` Turn N: User message → run() LLM streams text → calls askUser tool (no execute) streamText ends with tool-call in `input-available` state onTurnComplete fires (finishReason = "tool-calls") Agent idle Frontend: Renders question + option buttons from tool input User clicks → addToolOutput({ tool, toolCallId, output }) sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls → sendMessage() fires next turn Turn N+1: hydrateMessages / accumulator sees the updated assistant message run() is called, LLM continues from the tool result onTurnComplete fires (finishReason = "stop", responseMessage is the FULL merged message) ``` The AI SDK's `toUIMessageStream` automatically reuses the assistant message ID across the pause (we pass `originalMessages` internally), so `responseMessage` in the post-resume `onTurnComplete` is the **full merged message** — the original text, the completed tool call, and any follow-up content — not just the new parts. ## Backend: define the tool A HITL tool has an `inputSchema` describing what the model can ask, but **no `execute` function**. When the LLM calls it, `streamText` returns control to your agent. ```ts trigger/my-chat.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { streamText, tool, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { z } from "zod"; const askUser = tool({ description: "Ask the user a clarifying question when you need their input. " + "Present 2-4 options for them to pick from.", inputSchema: z.object({ question: z.string(), options: z .array( z.object({ id: z.string(), label: z.string(), description: z.string().optional(), }) ) .min(2) .max(4), }), // No execute function — streamText ends, the frontend supplies the output // via addToolOutput, and the next turn continues from the result. 
}); export const myChat = chat.agent({ id: "my-chat", tools: { askUser }, run: async ({ messages, tools, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, tools, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` Declaring `tools` on the config (and reading them back from the payload) is the recommended shape for any agent with tools. See [Tools](/docs/ai-chat/tools). ## Frontend: render the question and collect the answer Two pieces on the client: 1. **UI for the pending tool call** — render when the tool part is in `input-available` state, i.e. the LLM has called the tool but there's no output yet. 2. **Auto-send on resolution** — use `sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls` so answering kicks off the next turn without the user having to hit "send." ```tsx theme={"theme":"css-variables"} import { useChat, lastAssistantMessageIsCompleteWithToolCalls } from "@ai-sdk/react"; import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react"; function ChatView({ chatId }: { chatId: string }) { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), }); const { messages, sendMessage, addToolOutput } = useChat({ id: chatId, transport, sendAutomaticallyWhen: lastAssistantMessageIsCompleteWithToolCalls, }); return ( <> {messages.map((m) => m.parts.map((part, i) => { if (part.type === "tool-askUser" && part.state === "input-available") { return ( addToolOutput({ tool: "askUser", toolCallId: part.toolCallId, output: { optionId: opt.id, label: opt.label }, }) } /> ); } if (part.type === "text") return {part.text}; return null; }) )} ); } ``` `addToolOutput` patches the assistant message locally with `state: "output-available"` and fills in `output`. 
`lastAssistantMessageIsCompleteWithToolCalls` detects that every pending tool call now has a result, and `useChat` fires a new `sendMessage` — the backend picks it up as the next turn. ## Detecting a paused turn in `onTurnComplete` Two ways to detect "this turn paused for user input" vs "this turn finished normally": ### Via `finishReason` (recommended) The AI SDK's finish reason is surfaced on every `onTurnComplete` event. If the model stopped on tool calls, it's `"tool-calls"`: ```ts theme={"theme":"css-variables"} onTurnComplete: async ({ finishReason, responseMessage }) => { if (finishReason === "tool-calls") { // Turn paused — assistant message has pending tool call(s) const pending = responseMessage?.parts.filter( (p) => p.type.startsWith("tool-") && p.state === "input-available" ); // Persist as a checkpoint / partial turn } else { // finishReason === "stop" — normal completion // Persist as a completed turn } }; ``` `finishReason` is only undefined for manual `chat.pipe()` flows or aborted streams. For the common `run() → return streamText(...)` pattern it's always populated. ### Via response parts If you need more nuance (e.g. which specific tool is pending), use `chat.history.getPendingToolCalls()`: ```ts theme={"theme":"css-variables"} const pending = chat.history.getPendingToolCalls(); // [{ toolCallId, toolName, messageId }] ``` The result reflects the most recent assistant message: the one waiting on `addToolOutput`. Use it from `onAction` to gate fresh user turns ("can't send a new message while a HITL is open"), or from `onTurnComplete` to decide what to persist. Both `finishReason === "tool-calls"` and `chat.history.getPendingToolCalls().length > 0` are equivalent in practice. Use `finishReason` for dispatch, the helper for detail. ### Acting once per net-new tool result When the user's `addToolOutput` round-trips a tool answer back to the agent, the wire message carries the resolved tool part. 
If you want to fire side-effects (audit log, billing, notifications) exactly once per resolved tool call, do it in `hydrateMessages` before the runtime merges. `chat.history.extractNewToolResults(message)` returns only the parts whose `toolCallId` isn't already resolved on the chain: ```ts theme={"theme":"css-variables"} hydrateMessages: async ({ incomingMessages }) => { for (const msg of incomingMessages) { if (msg.role !== "assistant") continue; for (const r of chat.history.extractNewToolResults(msg)) { await auditLog.record({ toolCallId: r.toolCallId, toolName: r.toolName, output: r.output, errorText: r.errorText, // set only for output-error parts }); } } return incomingMessages; }, ``` `extractNewToolResults` compares against the current `chat.history`. By the time `onTurnComplete` fires, the chain already contains `responseMessage`, so the helper returns `[]` there. Use it where the message is from outside the accumulator: `hydrateMessages`, `onAction` if the action carries a message, or any custom pre-merge code path. ## Persistence: one message vs one record per pause Because the AI SDK reuses the assistant message ID across the pause, the "same turn" from the user's perspective maps to **two `onTurnComplete` firings** on the server — but both receive a `responseMessage` with the **same `id`**, and the second firing's `responseMessage` contains the fully merged content. Two common persistence patterns: ### Overwrite on every turn (simplest) Just store the latest `uiMessages` array on every `onTurnComplete`. The paused-turn write is overwritten by the resume-turn write; the final DB state has the full merged message. ```ts theme={"theme":"css-variables"} onTurnComplete: async ({ chatId, uiMessages }) => { await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages }, }); }, ``` Use this unless you specifically need an audit trail. 
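To make the overwrite behavior concrete, the sketch below models the two `onTurnComplete` firings with an in-memory map standing in for the database. `chatStore`, `saveTurn`, and the simplified message shapes are hypothetical stand-ins for illustration, not SDK types.

```ts
// Simplified stand-ins for UIMessage and a chat table; not SDK types.
type Part = { type: string; text?: string; state?: string };
type Message = { id: string; role: "user" | "assistant"; parts: Part[] };

// In-memory "database": chatId -> latest full message list.
const chatStore = new Map<string, Message[]>();

// Mirrors the handler body above: always overwrite with the latest list.
function saveTurn(chatId: string, uiMessages: Message[]): void {
  chatStore.set(chatId, uiMessages);
}

const user: Message = {
  id: "u1",
  role: "user",
  parts: [{ type: "text", text: "Plan a trip" }],
};

// First firing (finishReason "tool-calls"): the tool call is still pending.
const paused: Message = {
  id: "a1",
  role: "assistant",
  parts: [{ type: "tool-askUser", state: "input-available" }],
};
saveTurn("chat-1", [user, paused]);

// Second firing (finishReason "stop"): same id, merged content.
const merged: Message = {
  id: "a1",
  role: "assistant",
  parts: [
    { type: "tool-askUser", state: "output-available" },
    { type: "text", text: "Great, booking the beach option." },
  ],
};
saveTurn("chat-1", [user, merged]);

const stored = chatStore.get("chat-1") ?? [];
console.log(stored.length, stored[1].parts.length);
```

Because the second write replaces the first, the stored record ends with one assistant message carrying both the resolved tool call and the follow-up text.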
### Checkpoint nodes (immutable history) For apps that want every pause point recorded as its own immutable snapshot (branching, replay, diff review), save a checkpoint when paused and a sibling when complete: ```ts theme={"theme":"css-variables"} onTurnComplete: async ({ chatId, responseMessage, finishReason, uiMessages }) => { if (!responseMessage) return; if (finishReason === "tool-calls") { // Paused — save a checkpoint await db.turnCheckpoint.create({ data: { chatId, messageId: responseMessage.id, parts: responseMessage.parts, kind: "partial", }, }); } else { // Completed — save a sibling with the merged full message await db.turnCheckpoint.create({ data: { chatId, messageId: responseMessage.id, parts: responseMessage.parts, kind: "final", }, }); } // Always update the canonical chat record for `hydrateMessages` to load await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages }, }); }; ``` Both writes see `responseMessage.id` as the same value — they're checkpoints of the same logical message. Grouping by `messageId` + ordering by `createdAt` gives you the progression. ## Multi-pause turns A single logical turn can pause more than once — the LLM asks question A, gets the answer, thinks, then asks question B before finishing. Each pause fires its own `onTurnComplete` with `finishReason === "tool-calls"`; only the last firing has `finishReason === "stop"`. The checkpoint pattern above handles this naturally — each pause adds a new checkpoint sharing the same `responseMessage.id`. ## Gotchas * **Don't set an `execute` function on the HITL tool.** If it has one, `streamText` will call it immediately instead of handing control back. * **The frontend must use `sendAutomaticallyWhen`.** Without it, the user has to press Enter after answering — `addToolOutput` updates local state but doesn't fire a new turn by itself. * **Don't mutate `responseMessage` in `onTurnComplete`.** It's the captured snapshot. 
To add custom parts, use `chat.response.append()` in `onBeforeTurnComplete` (while the stream is open). * **Stop handling.** If the user stops the run while a pause is active (`chat.stop()` on the transport), `onTurnComplete` fires with `stopped: true` and `finishReason` reflecting the last successful step. Treat stopped paused turns the same as stopped normal turns. # Large payloads in chat.agent Source: https://trigger.dev/docs/ai-chat/patterns/large-payloads Why a single chunk on the chat stream is capped at ~1 MiB, what error you'll see, and how to work around it with ID references. The realtime stream that backs `chat.agent` enforces a **per-record cap of \~1 MiB** (`1048576` bytes minus a small envelope reserve). Anything written through the chat output — auto-piped LLM chunks, `chat.response.write`, custom `writer.write` parts — counts as one record per chunk and is rejected if it crosses the cap. This is a platform-level limit and cannot be raised per project or per stream. ## What you'll see When a chunk crosses the cap, the run fails with a typed [`ChatChunkTooLargeError`](/docs/ai-chat/error-handling): ``` ChatChunkTooLargeError: chat.agent chunk of type "tool-output-available" is 2000126 bytes, over the realtime stream's per-record cap of 1047552 bytes. For oversized payloads (e.g. large tool outputs), write the value to your own store and emit only an id/url through the chat stream — see https://trigger.dev/docs/ai-chat/patterns/large-payloads. ``` The error includes: * `chunkType` — discriminant on the chunk that failed (e.g.
`tool-output-available`, `data-handover`, `text-delta`).
* `chunkSize` — UTF-8 byte count of the JSON-serialized record.
* `maxSize` — the effective cap.

You can catch and re-throw / log it explicitly:

```ts theme={"theme":"css-variables"}
import { ChatChunkTooLargeError, isChatChunkTooLargeError, logger } from "@trigger.dev/sdk";

try {
  await someWrite();
} catch (err) {
  if (isChatChunkTooLargeError(err)) {
    logger.error("Oversized chunk", { type: err.chunkType, size: err.chunkSize });
  }
  throw err;
}
```

## Most common cause: large tool outputs

If you return a `streamText` result from `run()`, the AI SDK auto-pipes its `UIMessageStream` into the chat output. A tool whose result object is large (a fetched HTML body, a CSV blob, an image as base64, a deep DB row dump) gets emitted as one `tool-output-available` chunk — and that's the chunk that overruns.

**Diagnose first**: log tool sizes during development.

```ts theme={"theme":"css-variables"}
import { logger } from "@trigger.dev/sdk";
import { tool } from "ai";
import { z } from "zod";

const fetchPage = tool({
  inputSchema: z.object({ url: z.string().url() }),
  execute: async ({ url }) => {
    const html = await (await fetch(url)).text();
    // html.length counts UTF-16 code units; measure UTF-8 bytes instead,
    // since the cap is expressed in bytes.
    const bytes = new TextEncoder().encode(html).length;
    if (bytes > 500_000) {
      logger.warn("Large tool output", { tool: "fetchPage", bytes });
    }
    return { html };
  },
});
```

If the size is unbounded by input, fix the tool — not the stream.

## ID-reference pattern

Store the large value in your own database (or object store) and emit only an identifier through the chat stream. The frontend fetches the full payload separately on demand. This keeps the chat stream small, predictable, and resumable, and lets you reuse the value across turns or sessions without re-streaming it.
```ts task.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { tool } from "ai"; import { z } from "zod"; const fetchPage = tool({ description: "Fetch a URL and store the HTML for later inspection.", inputSchema: z.object({ url: z.string().url() }), execute: async ({ url }) => { const html = await (await fetch(url)).text(); const docId = await db.documents.create({ data: { url, html, byteSize: html.length }, }); // Tool result is small — just an id and metadata. // The model and the UI both work with this lightweight handle. return { docId, url, byteSize: html.length, preview: html.slice(0, 500), }; }, }); ``` ```ts api/document/[id]/route.ts theme={"theme":"css-variables"} // Frontend fetches the full document on demand. import { auth, currentUser } from "@/lib/auth"; export async function GET(_req: Request, { params }: { params: { id: string } }) { const user = await currentUser(); const doc = await db.documents.findUniqueOrThrow({ where: { id: params.id, userId: user.id }, }); return new Response(doc.html, { headers: { "content-type": "text/html" } }); } ``` ```tsx component.tsx theme={"theme":"css-variables"} function ToolResultCard({ part }: { part: ToolUIPart<"fetchPage"> }) { const { docId, url, byteSize, preview } = part.output; return (

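    <div>
      {/* Element choices are illustrative; the link targets the route handler above. */}
      <p>
        {url} — {(byteSize / 1024).toFixed(0)} KB
      </p>
      <pre>{preview}…</pre>
      <a href={`/api/document/${docId}`}>Open full HTML</a>
    </div>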
); } ``` The same pattern works for `chat.response.write` — push the heavy value to your DB, then emit a small data part with the id: ```ts theme={"theme":"css-variables"} const id = await db.attachments.create({ data: { content: hugeReport } }); chat.response.write({ type: "data-report", data: { id, summary: shortSummary } }); ``` Persist the large value **before** you emit the id chunk. If the chunk reaches the UI before the row is written, the frontend gets a 404 on the follow-up fetch. ## Transient UI parts For progress indicators or status data that should stream to the UI but not persist into the response message, use `chat.response.write` with `transient: true`. The chunk still travels on the chat stream (so the 1 MiB per-record cap still applies), but it never lands in `responseMessage` or `uiMessages`: ```ts theme={"theme":"css-variables"} chat.response.write({ type: "data-progress", data: { percent: 50 }, transient: true, }); ``` For genuinely high-volume diagnostic data (per-token traces, large debug dumps), don't try to ship it through the realtime stream at all. Log to your own store (DB, object storage, OTel logger) and surface it through a separate UI route that isn't tied to the chat session. ## What does **not** trigger the cap These calls don't go through the realtime stream and have no per-record cap: * [`chat.history.set` / `slice` / `replace` / `remove`](/docs/ai-chat/backend#chat-history) — locals-only mutations on the in-memory message list. * [`chat.inject`](/docs/ai-chat/background-injection#chat-inject) — appends to the run's pending message queue, not the stream. * [`chat.defer`](/docs/ai-chat/background-injection#chat-defer-standalone) — promise registry; awaited at turn boundaries, never serialized to the stream. The control markers `chat.agent` emits internally (`trigger:turn-complete`, `trigger:upgrade-required`) are tiny by construction. 
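A pre-flight size check is one way to catch an oversized value before it reaches the stream. The sketch below is illustrative only: `chunkByteSize` and `fitsInOneRecord` are hypothetical helpers, and the cap constant is copied from the example error message earlier on this page rather than from an SDK export. The real record also carries an envelope, so treat the result as an approximation.

```ts
// Effective cap taken from the example error message on this page.
// Assumption: the real value may differ between releases.
const MAX_RECORD_BYTES = 1_047_552;

// UTF-8 byte count of the JSON-serialized value, matching how `chunkSize`
// is described above.
function chunkByteSize(chunk: unknown): number {
  return new TextEncoder().encode(JSON.stringify(chunk)).length;
}

// True when the serialized chunk is at or under the cap.
function fitsInOneRecord(chunk: unknown, cap: number = MAX_RECORD_BYTES): boolean {
  return chunkByteSize(chunk) <= cap;
}

// A small ID-reference part versus a part that inlines the heavy value.
const small = { type: "data-report", data: { id: "att_123", summary: "Q3 revenue up" } };
const large = { type: "data-report", data: { content: "x".repeat(2_000_000) } };

console.log(fitsInOneRecord(small)); // true
console.log(fitsInOneRecord(large)); // false
```

When the check fails, the ID-reference pattern applies: persist the value first, then emit only the identifier.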
## See also * [Error handling](/docs/ai-chat/error-handling) — how `ChatChunkTooLargeError` flows through the layers. * [Database persistence](/docs/ai-chat/patterns/database-persistence) — your own store as the durable backing for ID references. * [Client protocol](/docs/ai-chat/client-protocol) — chunk shapes that travel on the chat stream. # OOM resilience Source: https://trigger.dev/docs/ai-chat/patterns/oom-resilience Recover from out-of-memory errors mid-turn by automatically retrying the failed turn on a larger machine — without losing the in-flight user message or re-processing completed turns. When a `chat.agent` turn runs out of memory, the worker process dies and everything in it is gone: the in-flight LLM call, the accumulator, any tool execution mid-flight. By default, Trigger.dev surfaces the OOM as a run failure. Setting `oomMachine` opts the agent into automatic recovery: the failed turn re-runs on a larger machine, picks up the user message that triggered the OOM (without re-processing earlier completed turns), and produces a normal response. ## Setup ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; export const myChat = chat.agent({ id: "my-chat", machine: "small-1x", // default machine oomMachine: "medium-2x", // fallback on OOM run: async ({ messages, signal }) => streamText({ model, messages, abortSignal: signal }), }); ``` That's the entire opt-in. With `oomMachine` set, the agent gets: * **`retry.maxAttempts: 2`** internally — one retry for OOM only; non-OOM errors don't retry.
* **`retry.outOfMemory.machine: oomMachine`** — the fresh attempt boots on the larger machine. * **`session.in` cursor recovery** — the new attempt skips records belonging to turns that already completed on the prior attempt and only re-runs the OOM'd turn. `chat.agent` does not expose generic `retry` options. OOM recovery is the only retry path because retrying an LLM-driven loop on non-OOM errors tends to be expensive and side-effecting. Drop down to a [raw `task()` with chat primitives](/docs/ai-chat/custom-agents) if you need richer retry semantics. ## How recovery works The recovery doesn't need any customer-side persistence to avoid duplicate processing. It uses two pieces of durable state Trigger already maintains for every chat: * **`session.out`** — the durable response stream. Every successful turn writes a `trigger:turn-complete` chunk here. * **`session.in`** — the durable input stream. Every user message after the first turn lands here as a record with a server-assigned timestamp. On retry boot, the SDK: 1. Scans `session.out` for the latest `trigger:turn-complete` chunk and reads its timestamp. Call this `T_last_complete`. 2. Sets a per-stream filter on `session.in` so any record with `timestamp <= T_last_complete` is dropped before it reaches the turn loop. 3. Begins normal processing. The first record that passes the filter is the message that triggered the OOM (or any newer message that arrived during the retry window). Result: turns 1..N-1 are not re-processed, turn N runs on the larger machine, and the conversation continues. 
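The three boot steps above can be sketched as pure functions. This is a simplified model, not the SDK's implementation: the `OutChunk` and `InRecord` shapes and both function names are hypothetical, and the real streams are read incrementally rather than from arrays.

```ts
type OutChunk = { type: string; timestamp: number };
type InRecord = { id: string; timestamp: number };

// Step 1: find the timestamp of the latest turn-complete chunk, if any.
function lastCompleteTimestamp(out: OutChunk[]): number | undefined {
  let latest: number | undefined;
  for (const chunk of out) {
    if (chunk.type !== "trigger:turn-complete") continue;
    if (latest === undefined || chunk.timestamp > latest) latest = chunk.timestamp;
  }
  return latest;
}

// Step 2: drop every session.in record at or before that cutoff.
function recordsToProcess(input: InRecord[], cutoff: number | undefined): InRecord[] {
  if (cutoff === undefined) return input;
  return input.filter((r) => r.timestamp > cutoff);
}

const sessionOut: OutChunk[] = [
  { type: "text-delta", timestamp: 105 },
  { type: "trigger:turn-complete", timestamp: 110 }, // T1
  { type: "trigger:turn-complete", timestamp: 210 }, // T2
  { type: "text-delta", timestamp: 305 }, // partial output of the OOM'd turn
];
const sessionIn: InRecord[] = [
  { id: "u2", timestamp: 100 },
  { id: "u3", timestamp: 200 },
  { id: "u4", timestamp: 300 },
];

// Step 3: only the records that pass the filter reach the turn loop.
const cutoff = lastCompleteTimestamp(sessionOut);
const pending = recordsToProcess(sessionIn, cutoff);
console.log(pending.map((r) => r.id)); // [ 'u4' ]
```

With the sample data, `u2` and `u3` fall at or before `T2` and are dropped, so only `u4`, the message whose turn ran out of memory, is re-processed.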
```mermaid theme={"theme":"css-variables"} sequenceDiagram participant User participant Run as chat.agent run participant SessionIn as session.in participant SessionOut as session.out User->>SessionIn: u2 (turn 2) Run->>SessionIn: read u2 Run->>SessionOut: turn-complete (T1) User->>SessionIn: u3 (turn 3) Run->>SessionIn: read u3 Run->>SessionOut: turn-complete (T2) User->>SessionIn: u4 (turn 4) Run->>SessionIn: read u4 Note over Run: OOM mid-turn Run->>Run: ⚠️ killed Note over Run: Attempt 2 boots on oomMachine Run->>SessionOut: scan → T_last_complete = T2 Run->>SessionIn: read with filter (ts > T2) SessionIn-->>Run: u2 (filtered, ts < T2) SessionIn-->>Run: u3 (filtered, ts < T2) SessionIn-->>Run: u4 (passes — the OOM'd turn) Run->>SessionOut: turn 4 complete ``` The scan on `session.out` is streaming and bounded in memory: each chunk is inspected and discarded one at a time, so a long-running chat doesn't bloat the retry-boot worker. Bandwidth scales linearly with `session.out` size, but only on the OOM-retry path — a rare event. ## With `hydrateMessages` If your agent uses [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) to load the durable conversation history per turn, the OOM'd turn re-runs against the full prior accumulator: the model sees `[u1, a1, u2, a2, ..., u_N]` and responds in context. This is the recommended pattern for production chats. ## Without `hydrateMessages` Recovery boot reconstructs context automatically. The boot reads both the durable `session.out` snapshot (settled turns) and the `session.out` tail past the snapshot cursor (the partial assistant chunks the OOM'd turn streamed before dying). When the new attempt processes the OOM'd user message, the model sees the full prior conversation **plus** the partial assistant that was cut off — so a "keep going" follow-up continues naturally, and any other follow-up has the same context the original turn had. 
`hydrateMessages` is still the right choice if you want a single source of truth in your own database (branching conversations, message-level access control, etc.). It's no longer required for OOM continuity.

For full control over recovery — drop the partial, synthesize tool results for an interrupted tool call, emit a recovery banner to the UI — register [`onRecoveryBoot`](/docs/ai-chat/patterns/recovery-boot).

## Tool execute idempotency

If an OOM hits mid-tool-execution, the new attempt re-runs the entire turn — including the tool call. Make tool `execute` functions idempotent or checkpoint their progress externally. Trigger doesn't roll back side effects automatically.

```ts theme={"theme":"css-variables"}
import { tool } from "ai";
import { z } from "zod";

// `mailer` stands in for your own email client.
export const sendEmail = tool({
  description: "Send an email",
  inputSchema: z.object({ to: z.string(), idempotencyKey: z.string() }),
  execute: async ({ to, idempotencyKey }) => {
    // Stripe-style: dedupe at the side-effect layer with a customer-supplied key.
    return await mailer.send({ to, idempotencyKey });
  },
});
```

## Limitations

* **One OOM retry per run.** `chat.agent` sets `maxAttempts: 2`. If attempt 2 also OOMs, the run fails. Use a sufficiently large `oomMachine` to avoid this.
* **Single fallback tier.** Only one `oomMachine`. There's no "tiered retry" (small → medium → large). If you need that, drop down to a [raw `task()` with chat primitives](/docs/ai-chat/custom-agents) and configure `retry` directly.
* **Non-OOM errors don't retry.** Schema errors, model-call rejections, tool throws, etc. fail the run as before. Out-of-memory is the only retry trigger.
* **Tools mid-execution are not checkpointed.** A partially-run tool re-runs from scratch on the new attempt. Make them idempotent.
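To illustrate why idempotent tools matter on the retry path, the sketch below dedupes a side effect by key so that re-running the same call has no extra effect. `FakeMailer` is a hypothetical in-memory stand-in for a real email provider. Production code would dedupe in durable storage, since process memory is lost when the worker is killed.

```ts
type SendResult = { messageId: string; deduped: boolean };

// Hypothetical mailer that remembers the idempotency keys it has handled.
class FakeMailer {
  private sent = new Map<string, string>();
  deliveries = 0;

  send(args: { to: string; idempotencyKey: string }): SendResult {
    const existing = this.sent.get(args.idempotencyKey);
    if (existing !== undefined) {
      // Key already seen: return the earlier result and send nothing.
      return { messageId: existing, deduped: true };
    }
    this.deliveries += 1;
    const messageId = `msg_${this.deliveries}`;
    this.sent.set(args.idempotencyKey, messageId);
    return { messageId, deduped: false };
  }
}

const mailer = new FakeMailer();
const call = { to: "ada@example.com", idempotencyKey: "turn-4-email-1" };

const first = mailer.send(call); // attempt 1, before the OOM
const second = mailer.send(call); // attempt 2 re-runs the same tool call

console.log(mailer.deliveries, second.deduped); // 1 true
```

The second call returns the original `messageId` without delivering again, which is the property a re-run tool needs.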
## See also * [Recovery boot](/docs/ai-chat/patterns/recovery-boot) — the underlying hook + smart default that gives OOM recovery its full-context behavior * [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) — `onChatResume` fires on every retry attempt with `phase: "preload"` or `"turn"` * [Database persistence](/docs/ai-chat/patterns/database-persistence) — the `hydrateMessages` pattern for branching, ACL, and DB-as-source-of-truth scenarios # Persistence and replay Source: https://trigger.dev/docs/ai-chat/patterns/persistence-and-replay How chat.agent rebuilds conversation history at run boot — durable JSON snapshot in object storage plus session.out replay, with a hydrateMessages short-circuit for backend-owned history. `chat.agent` runs are processes — they boot, stream a turn, and either suspend (waiting for the next message) or exit. When the next message arrives at a session whose previous run already exited, a **fresh** run boots with no in-memory state. Something has to rebuild the conversation history before that turn can produce a coherent response. This page walks through the **snapshot + replay** model the runtime uses by default, and the [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) short-circuit that turns the whole thing off when the customer owns history. ## Why a snapshot at all The wire is delta-only: each `.in/append` carries at most one new `UIMessage` (see [Client Protocol](/docs/ai-chat/client-protocol#chattaskwirepayload)). A long conversation might be 50 turns deep with megabytes of tool results — the wire never carries that.
So when run #2 boots to handle turn 51, the wire alone tells it almost nothing about turns 1–50. Two existing pieces of durable state already capture everything that happened: * **`session.in`** — every user message and tool-approval response ever sent. * **`session.out`** — every assistant token, tool call, and tool result the agent emitted, ordered. Replaying `session.out` from the beginning is correct but expensive — bandwidth scales with chat length, and parsing N megabytes of streamed chunks at every boot adds latency. So the runtime writes a **snapshot** after every turn and reads it on the next boot. Replay only covers the gap between the snapshot's cursor and now. ## The model end-to-end ```mermaid theme={"theme":"css-variables"} sequenceDiagram participant User participant Run1 as Run 1 (turn 1) participant Snapshot as Object storage participant SessionOut as session.out participant Run2 as Run 2 (turn 2+) User->>Run1: u1 Run1->>SessionOut: assistant chunks for a1 Run1->>Run1: onTurnComplete Run1->>Snapshot: write { messages: [u1, a1], lastOutEventId, lastOutTimestamp } Note over Run1: idle suspend (or exit) User->>Run2: u2 (delta only) Run2->>Snapshot: read snapshot Run2->>SessionOut: subscribe(lastEventId, wait=0) SessionOut-->>Run2: (empty — nothing since snapshot) Note over Run2: accumulator = [u1, a1] Run2->>Run2: append u2 from wire Run2->>SessionOut: assistant chunks for a2 Run2->>Run2: onTurnComplete Run2->>Snapshot: write { messages: [u1, a1, u2, a2], ... } ``` ### Run 1 — first turn The accumulator starts empty. The wire delivers `u1`. After the model finishes, `onTurnComplete` fires, then the runtime serializes the full accumulator and writes: ```json theme={"theme":"css-variables"} { "version": 1, "savedAt": 1715180400000, "messages": [u1, a1], "lastOutEventId": "42", "lastOutTimestamp": 1715180399000 } ``` The key is `packets/{projectRef}/{envSlug}/sessions/{sessionId}/snapshot.json` — overwritten every turn, never appended. 
The write is **awaited**, not fire-and-forget — if the run idle-suspends immediately after, in-flight promises don't reliably complete and the snapshot would be lost. ### Run 2 — boot A new run boots when the user sends `u2`. Run 1 has long since exited. Run 2 has no in-memory state. The boot sequence: GET the JSON blob. On 404 (no snapshot yet — first-ever turn) or read error or version mismatch, treat as empty and continue. Snapshot misses are non-fatal — replay alone may still be sufficient. Subscribe to `session.out` with `wait=0` starting from `snapshot.lastOutEventId`. Drain whatever's there and close. Returns: * **Settled messages** — closed assistant turns past the snapshot cursor (the chunks of a turn that completed after the snapshot was written but before the run exited cleanly). * **A partial assistant** — the trailing message if its stream never received a `finish` chunk. The dead run was mid-response when it died. `cleanupAbortedParts` has already stripped streaming-in-progress fragments. In the steady state this returns empty. In recovery, it returns whatever the dead run was in the middle of. GET `session.in` records past the last `turn-complete`'s `session-in-event-id` cursor. Returns the user messages the dead run hadn't acknowledged — typically the message that triggered the cancelled / crashed turn, plus anything the customer typed after. Snapshot messages merge with the settled replay (replay wins on `id` collision). Then: * If there's a partial assistant **and** at least one in-flight user message, splice `[firstInFlightUser, partialAssistant]` onto the end of the chain. The model sees the prior turn's incomplete attempt and can continue, abandon, or pivot based on the next user message. * Remaining in-flight users dispatch as fresh turns after the recovered first one. * If there's no partial OR no in-flight users, the chain is just the settled chain and any in-flight users dispatch normally. 
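The merge and splice rules above can be written as a small pure function. This is a sketch of the described behavior under simplifying assumptions, not the runtime's code: `Msg`, `RecoveryInput`, and `buildRecoveryChain` are hypothetical names, and messages are reduced to an id, a role, and a text field.

```ts
type Msg = { id: string; role: "user" | "assistant"; text: string };

type RecoveryInput = {
  snapshot: Msg[]; // messages read from the snapshot
  settledReplay: Msg[]; // closed turns replayed from session.out
  partialAssistant?: Msg; // trailing message that never got a finish chunk
  inFlightUsers: Msg[]; // session.in messages the dead run never acknowledged
};

function buildRecoveryChain(input: RecoveryInput): { chain: Msg[]; pendingTurns: Msg[] } {
  // Merge snapshot with settled replay; replay wins on id collision.
  // A Map keeps first-insertion order, so an overridden message keeps its position.
  const byId = new Map<string, Msg>();
  for (const m of input.snapshot) byId.set(m.id, m);
  for (const m of input.settledReplay) byId.set(m.id, m);
  const chain = Array.from(byId.values());

  const [first, ...rest] = input.inFlightUsers;
  if (input.partialAssistant && first) {
    // Splice the interrupted exchange onto the end of the chain.
    chain.push(first, input.partialAssistant);
    return { chain, pendingTurns: rest };
  }
  // No partial, or no in-flight users: dispatch in-flight users normally.
  return { chain, pendingTurns: input.inFlightUsers };
}

const recovered = buildRecoveryChain({
  snapshot: [
    { id: "u1", role: "user", text: "Hi" },
    { id: "a1", role: "assistant", text: "Hello!" },
  ],
  settledReplay: [],
  partialAssistant: { id: "a2", role: "assistant", text: "Espresso is" },
  inFlightUsers: [
    { id: "u2", role: "user", text: "Write an essay about espresso" },
    { id: "u3", role: "user", text: "keep going" },
  ],
});

console.log(recovered.chain.map((m) => m.id).join(",")); // u1,a1,u2,a2
console.log(recovered.pendingTurns.map((m) => m.id).join(",")); // u3
```

Here `u3` is left as a pending turn, so it dispatches as a fresh turn after the recovered exchange.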
Customers can override this entirely via [`onRecoveryBoot`](/docs/ai-chat/patterns/recovery-boot). Append `u2` from the wire payload, exactly as on turn 1. The model now sees `[u1, a1, u2]` and produces `a2`. After `onTurnComplete`, the runtime overwrites the snapshot with `[u1, a1, u2, a2]` and the cycle repeats. ### Crash mid-turn — replay carries the load Suppose Run 1's turn 1 streams partial assistant chunks to `session.out` and then crashes (OOM, exception, server-side cancel) before `onTurnComplete` fires. No snapshot was written. The next run boots and: 1. Snapshot read returns 404 → empty. 2. `session.out` tail replay picks up the partial assistant chunks emitted before the crash. `cleanupAbortedParts` strips streaming-in-progress fragments but keeps the cleaned trailing message as the `partialAssistant`. 3. `session.in` tail replay finds the user message the dead run was answering (no `turn-complete` was written, so the cursor never advanced past it). 4. Smart default splices `[firstInFlightUser, partialAssistant]` onto the chain. Any later user messages (including the customer's follow-up) dispatch as fresh turns. 5. The model sees full prior context and responds in kind — continuing a cut-off essay on "keep going", answering a fresh question on "actually, what's 7+8?", abandoning the prior work on "scrap that, do X instead". Replay carries the conversation across the crash boundary with zero customer code. For policies different from "preserve context" — drop the partial entirely, synthesize tool results for an interrupted tool call, write a recovery banner to the UI — register [`onRecoveryBoot`](/docs/ai-chat/patterns/recovery-boot). ## OOM-retry interaction The runtime already had an OOM-retry path that scans `session.out` for the latest `trigger:turn-complete` timestamp to use as a cutoff for `session.in` (so the retry doesn't re-process completed turns — see [OOM resilience](/docs/ai-chat/patterns/oom-resilience)). 
The snapshot includes a `lastOutTimestamp` field that is exactly that high-water mark. When a snapshot exists, the OOM-retry path reads `lastOutTimestamp` directly instead of scanning `session.out`. One fewer stream subscription per retry. Free win. If no snapshot exists (first turn, or `hydrateMessages` registered), the path falls back to the scan. ## Action turns — no snapshot write [Action turns](/docs/ai-chat/actions) (`trigger: "action"`) don't fire `onTurnComplete` — they fire `onAction` only. The snapshot write site is gated on `onTurnComplete`, so action turns don't snapshot. If `onAction` mutates `chat.history.*` and then the run crashes before the next regular turn, the mutation is lost. The user re-fires the action. This matches `chat.history` semantics in general — mutations are persisted at turn boundaries, not action boundaries. ## The `hydrateMessages` short-circuit When the customer registers a [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) hook, the runtime trusts the hook to be the source of truth for history. Snapshot read and replay are **skipped entirely** at boot. The hook fires per turn, returns the canonical chain from the customer's database, and the accumulator is set to whatever the hook returned. ```ts theme={"theme":"css-variables"} import { chat, upsertIncomingMessage } from "@trigger.dev/sdk/ai"; import { db } from "@/lib/db"; export const myChat = chat.agent({ id: "my-chat", hydrateMessages: async ({ chatId, trigger, incomingMessages }) => { const stored = (await db.chat.findUnique({ where: { id: chatId } }))?.messages ?? []; // See lifecycle-hooks for the full upsert pattern + rationale: // /ai-chat/lifecycle-hooks#hydratemessages if (upsertIncomingMessage(stored, { trigger, incomingMessages })) { // Upsert, not update: head-start first turns run without a preload // to create the row. 
await db.chat.upsert({ where: { id: chatId }, create: { id: chatId, messages: stored }, update: { messages: stored }, }); } return stored; }, onTurnComplete: async ({ chatId, uiMessages }) => { await db.chat.update({ where: { id: chatId }, data: { messages: uiMessages } }); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` What you gain: * **Zero object-store traffic per turn.** No snapshot read, no snapshot write, no replay subscription. `OBJECT_STORE_*` env vars don't have to be set. * **Branching, undo, edit, abuse prevention** — patterns that need a backend-side single source of truth work naturally because the customer mediates every read. What you give up: * **You own persistence end-to-end.** A bug in `hydrateMessages` that returns the wrong chain corrupts the conversation visible to the model. * **OOM-retry needs a `session.out` scan again** because there's no snapshot to short-circuit it. (Same as the pre-snapshot baseline — not a regression, just a missed optimization.) The runtime's snapshot+replay is the safer default. `hydrateMessages` is the right choice when you already have authoritative storage for messages and want one consistent persistence path. ## When neither is configured If `hydrateMessages` is not registered **and** no object store is configured, conversations don't survive run boundaries. A continuation boots empty. The runtime logs a warning at agent registration time so you see this at deploy time, not at user-traffic time. For local development this is sometimes fine — you're not testing continuations. For production it isn't. Configure one of: * **Object store** (`OBJECT_STORE_*` env vars on your webapp) — easiest, default behavior. * **`hydrateMessages` + your own database** — stronger control, suits multi-tenant apps with audit needs. 
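The three configurations described on this page reduce to a small decision. The function below is purely illustrative: `AgentConfig`, `historySource`, and the returned labels are hypothetical and do not correspond to SDK exports.

```ts
type AgentConfig = { hasHydrateMessages: boolean; objectStoreConfigured: boolean };
type HistorySource = "hydrateMessages" | "snapshot+replay" | "none";

function historySource(cfg: AgentConfig): HistorySource {
  // A registered hook short-circuits snapshot read and replay entirely.
  if (cfg.hasHydrateMessages) return "hydrateMessages";
  // Otherwise the runtime default applies when an object store is available.
  if (cfg.objectStoreConfigured) return "snapshot+replay";
  // Neither configured: a continuation boots with empty history.
  return "none";
}

console.log(historySource({ hasHydrateMessages: false, objectStoreConfigured: true }));
// → snapshot+replay
```

The `"none"` branch corresponds to the case the runtime warns about at agent registration time.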
## Snapshot key & lifecycle | Field | Value | | ---------- | ------------------------------------------------------------------- | | Bucket | Whatever `OBJECT_STORE_BASE_URL` points to | | Key prefix | `packets/{projectRef}/{envSlug}/` (server-prefixed) | | Key suffix | `sessions/{sessionId}/snapshot.json` | | Final key | `packets/{projectRef}/{envSlug}/sessions/{sessionId}/snapshot.json` | | Size | Tens of KB typical, capped only by object-store limits | | Cadence | Overwritten after every successful `onTurnComplete` | Snapshots accumulate per-session forever unless you set a lifecycle policy on the bucket. A 90-day expiry on `packets/*/sessions/*/snapshot.json` is a reasonable default if your chats don't typically resume after that window. Closed sessions are not auto-cleaned today. ### MinIO and S3-compatible stores Snapshot read/write reuses the same object-store layer as Trigger.dev's existing large-payload routes. Anything that already works for large payloads — AWS S3, MinIO (self-host or local development), Cloudflare R2, Tigris, Backblaze B2 — works for snapshots too. `OBJECT_STORE_DEFAULT_PROTOCOL` controls the routing (`s3`, `minio`, etc.) and the SDK picks the right driver automatically. No snapshot-specific config. For local development against `pnpm run docker`, the bundled MinIO container is enough — set `OBJECT_STORE_DEFAULT_PROTOCOL=minio` and the standard MinIO env vars on the webapp, and continuations work end-to-end against a local stack. 
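For reference, the key layout in the table can be assembled as below. This is a sketch only: `snapshotKey` and the `ChatSnapshot` type are hypothetical names, with fields taken from the JSON example earlier on this page. The runtime builds the key itself, so this is mainly useful when writing bucket lifecycle rules or inspecting objects by hand.

```ts
// Shape of the snapshot blob, per the JSON example on this page.
type ChatSnapshot = {
  version: number;
  savedAt: number; // assumed to be epoch milliseconds, based on the sample value
  messages: unknown[];
  lastOutEventId: string;
  lastOutTimestamp: number;
};

function snapshotKey(projectRef: string, envSlug: string, sessionId: string): string {
  const prefix = `packets/${projectRef}/${envSlug}/`; // server-side prefix
  const suffix = `sessions/${sessionId}/snapshot.json`;
  return prefix + suffix;
}

// The identifiers below are made up for the example.
console.log(snapshotKey("proj_abc", "prod", "sess_123"));
// → packets/proj_abc/prod/sessions/sess_123/snapshot.json
```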
## See also * [Client Protocol](/docs/ai-chat/client-protocol#how-history-is-rebuilt) — the wire-level view of the same model * [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) — the short-circuit hook * [OOM resilience](/docs/ai-chat/patterns/oom-resilience) — how `session.in` cutoffs interact with snapshots * [Database persistence](/docs/ai-chat/patterns/database-persistence) — the canonical persistence pattern using `onTurnComplete` * [v4.5 upgrade guide](/docs/ai-chat/upgrade-guide#v45-wire-format-change) — when this model landed and what changed # Recovery boot Source: https://trigger.dev/docs/ai-chat/patterns/recovery-boot Recover from cancel-mid-stream, crashes, and OOM kills with full conversational context. The smart default Just Works; the onRecoveryBoot hook is the override path for advanced policies. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. When a `chat.agent` run dies in the middle of streaming a response — the user cancels, the worker OOMs, or an unhandled exception kills the process — the durable streams hold what was in flight. The next run boots as a continuation, reads both stream tails, and reconstructs a chain that preserves the partial response so any follow-up (`keep going`, `actually do X instead`, a new question) has full context. The behavior is automatic. The `onRecoveryBoot` hook is opt-in for policies that need something different. ## The scenario ```ts theme={"theme":"css-variables"} // Turn 1 is mid-essay when the user clicks Cancel. window.__chat.send("Write me a long essay about espresso"); // ... assistant has written 3000 characters ... 
window.__chat.stop(); // OR: server-side cancel_run // User decides what they want next. window.__chat.send("keep going"); // OR: "what's 7+8?", or anything ``` The cancelled run never wrote `onTurnComplete`. The snapshot is stale or absent. `session.out` has a half-written assistant message. `session.in` has the original user message (the run consumed it but never marked the turn complete) plus the new follow-up. A naive continuation would either re-run the cancelled essay (the user already chose to stop) or drop everything (no context for the follow-up). Recovery boot handles this without either failure mode. ## The smart default On a continuation boot, the runtime reads: * **Snapshot** — settled turns persisted by the last successful `onTurnComplete`. * **`session.out` tail past the snapshot cursor** — closed assistant turns plus, optionally, a `partialAssistant` (the trailing message whose stream never received a `finish` chunk). `cleanupAbortedParts` has already stripped streaming-in-progress fragments. * **`session.in` tail past the last `turn-complete` cursor** — user messages the dead run hadn't acknowledged. If both `partialAssistant` and `inFlightUsers` are non-empty, the runtime splices `[firstInFlightUser, partialAssistant]` onto the chain. The remaining in-flight users dispatch as fresh turns. The model sees: ``` [ ...settledMessages, // chain through the last completed turn firstInFlightUser, // the question the dead run was answering partialAssistant, // the dead run's incomplete response followUpUser ] // the new turn the customer just sent ``` Modern instruction-following models prioritize the latest user message. The follow-up determines the response: | Follow-up | Model behavior | | ---------------------------------- | ---------------------------------------------------------- | | "keep going" / "continue" / "more" | Continues the partial essay from where it stopped. | | "actually, what's 7+8?" | Answers the new question. 
Prior context doesn't derail it. | | "scrap that, do something else" | Abandons the partial work and follows the new direction. | No customer code needed for any of these. ## When to register `onRecoveryBoot` The hook fires only when recovery found a `partialAssistant` (see "Fires when" in the hook reference). Register it when you need a policy different from "preserve context": * **Drop the partial entirely.** Your UX means "cancel discards the work — start fresh from the follow-up." * **Synthesize tool results.** The partial has tool calls in `input-available` state (a human-in-the-loop (HITL) tool was mid-call when the run died). Return a chain that has fabricated `output-available` results so the model can continue. * **Emit a recovery banner.** Write a `data-chat-recovery` UIMessage chunk via the event's `writer` so the frontend can render "Recovering interrupted response..." before the model speaks. * **Persist recovered state.** Use `beforeBoot` to flush the partial to your own database before the next turn starts. ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { streamText } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; export const myChat = chat.agent({ id: "my-chat", onRecoveryBoot: async ({ partialAssistant, inFlightUsers, writer, cause, previousRunId }) => { writer.write({ type: "data-chat-recovery", data: { cause, previousRunId, partialPresent: partialAssistant !== undefined }, transient: true, }); // Return nothing → fall through to smart default. }, run: async ({ messages, signal }) => streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }), }); ``` ## Hook reference ### Fires when The hook fires once on a continuation boot, AFTER both stream tails have been read, AND only when there's a partial assistant — the mid-stream-died signal: ```ts theme={"theme":"css-variables"} const shouldFire = partialAssistant !== undefined; ``` In-flight users alone don't fire the hook. 
Graceful exits like `chat.requestUpgrade()` and `chat.endRun()` may leave an unacknowledged user on `session.in` (the message that triggered the upgrade, the next message after `endRun`), but no partial — that's a normal continuation, not recovery. The next message just dispatches as turn 1 on the new run via the normal `session.in` pump. Skipped scenarios (where the hook does NOT fire): * A clean continuation after `chat.endRun()` with no buffered follow-up. * A fresh chat (no continuation, attempt 1). * An OOM retry that booted onto a complete snapshot (no partial on the tail). * `chat.requestUpgrade()` graceful exit — predecessor ended cleanly before processing, no partial. * An agent with [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) registered. Customers using `hydrateMessages` own persistence — recovery decisions live in their own DB query. ### Event shape ```ts theme={"theme":"css-variables"} type RecoveryBootEvent = { ctx: TaskRunContext; chatId: string; runId: string; previousRunId: string; cause: "cancelled" | "crashed" | "unknown"; settledMessages: TUIM[]; inFlightUsers: TUIM[]; partialAssistant: TUIM | undefined; pendingToolCalls: Array<{ toolCallId: string; toolName: string; input: unknown; partIndex: number; }>; writer: ChatWriter; }; ``` `cause` is currently always `"unknown"` — the run engine doesn't yet plumb the real reason into the continuation payload. The enum is forward-looking; don't branch behavior on it for now. ### Return shape Every field is optional. Returning `undefined` (or nothing) accepts the smart default for every field. ```ts theme={"theme":"css-variables"} type RecoveryBootResult = { chain?: TUIM[]; recoveredTurns?: TUIM[]; beforeBoot?: () => Promise<void>; }; ``` * **`chain`** — replaces the seed chain. Defaults to `[...settledMessages, firstInFlightUser, partialAssistant]` when both partial and in-flight users exist, otherwise `settledMessages` alone. 
* **`recoveredTurns`** — user messages to dispatch as fresh turns after the chain is restored. Defaults to `inFlightUsers.slice(1)` when the smart default consumed the first user, otherwise `inFlightUsers`. * **`beforeBoot`** — runs after the writer flushes and before the first recovered turn fires. Use for blocking persistence (write the partial to your DB so a later turn can reference it). Errors bubble — wrap your own try/catch if you want to soft-fail. ## Examples ### Drop the partial — strict "cancel means discard" The customer's UX treats cancel as "throw the work away": ```ts theme={"theme":"css-variables"} onRecoveryBoot: async ({ settledMessages, inFlightUsers, partialAssistant }) => { if (!partialAssistant) return; // No partial → nothing to drop return { chain: settledMessages, // Settled turns only; leaving chain undefined would accept the default splice recoveredTurns: inFlightUsers.slice(1), // Still skip the first user (the dead run was answering it) }; } ``` ### Synthesize tool results for a mid-call interruption The dead run was processing a tool call when it died. The partial has tool parts in `input-available` state with no `output-available`. Synthesize a result so the model can keep going: ```ts theme={"theme":"css-variables"} onRecoveryBoot: async ({ partialAssistant, pendingToolCalls, settledMessages, inFlightUsers }) => { if (pendingToolCalls.length === 0) return; // Rebuild the partial with synthetic outputs for any input-available tool call. 
const repaired = { ...partialAssistant!, parts: partialAssistant!.parts!.map((part, i) => { const pending = pendingToolCalls.find(p => p.partIndex === i); if (!pending) return part; return { ...part, state: "output-available" as const, output: { interrupted: true, reason: "previous run was cancelled" }, }; }), }; return { chain: [...settledMessages, inFlightUsers[0]!, repaired], recoveredTurns: inFlightUsers.slice(1), }; } ``` ### Persist the partial before the next turn fires ```ts theme={"theme":"css-variables"} onRecoveryBoot: async ({ chatId, partialAssistant }) => { return { beforeBoot: async () => { if (partialAssistant) { await db.partial.create({ data: { chatId, partialJson: JSON.stringify(partialAssistant) }, }); } }, }; } ``` ## Interaction with other features ### `hydrateMessages` If your agent registers [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages), the runtime skips snapshot read, `session.out` replay, `session.in` replay, AND `onRecoveryBoot`. Your DB is the source of truth — recovery decisions live in your own query. To detect a cancel-recovery scenario yourself, persist a `runState: "in-progress"` flag in `onTurnStart` and check for it in `hydrateMessages`. ### `chat.requestUpgrade()` [`chat.requestUpgrade()`](/docs/ai-chat/patterns/version-upgrades) is a graceful exit — the old run doesn't crash, it returns cleanly. The new continuation run boots with a clean `session.out` tail (`partialAssistant` is undefined) and the upgrade-trigger message on `session.in` (one in-flight user). The smart default doesn't splice (it requires both partial AND in-flight users), so the chain is just `settledMessages` and the in-flight user dispatches as a fresh turn. `onRecoveryBoot` does not fire here: with no partial, an in-flight user alone is not enough to trigger the hook. ### Hooks throwing If the body of `onRecoveryBoot` throws (or rejects), the runtime logs a warning and falls back to the smart default — the run does not fail. 
Wrap your own try/catch if you want stricter handling. `beforeBoot` is the exception: it's the contract you opted into for blocking persistence, so errors thrown there **bubble** and fail the run rather than dispatch recovered turns against half-persisted state. Wrap it yourself if you want to soft-fail. ## See also * [OOM resilience](/docs/ai-chat/patterns/oom-resilience) — `oomMachine` opt-in for automatic memory-driven recovery; uses the same recovery boot path. * [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay) — the snapshot + dual-tail replay model that recovery boot sits on top of. * [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) — where `onRecoveryBoot` sits in the broader hook taxonomy. # Agent Skills Source: https://trigger.dev/docs/ai-chat/patterns/skills Ship reusable capabilities (folders with SKILL.md + scripts) that a chat agent discovers and invokes on demand. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. Agent skills are reusable capabilities you ship as folders — a `SKILL.md` describing when and how to use them, plus optional scripts, references, and assets. The chat agent sees a short description of each skill in its system prompt, loads the full instructions on demand via a `loadSkill` tool, and invokes the bundled scripts via `bash` — all without you wiring anything up manually. Built on the [AI SDK cookbook pattern](https://ai-sdk.dev/cookbook/guides/agent-skills). Works with any provider (OpenAI, Anthropic, Gemini, etc.) — not tied to Anthropic's server-side skills. ## Why skills? Compared to regular AI SDK tools: * **Tools** are typed functions you pre-declare. 
Great when you know up-front exactly what capability the agent needs. * **Skills** are folders the model discovers and reads on demand. Great when the capability is a bundle of instructions + helper scripts that would be awkward to encode as a single tool. PDFs are the canonical example: you don't want to ask the LLM to parse PDF bytes inline. You want it to `bash scripts/extract.py report.pdf` using a bundled `pdfplumber` wrapper. A skill ships the script, the instructions, and any reference notes together. Dashboard-editable `SKILL.md` is on the roadmap so a platform team can tighten a skill's description or "when to use" text without a redeploy. Today, skills are SDK-only — defined in your task code and shipped with each deploy. ## Trust model Skills are **developer-authored code**, not end-user-supplied. The same developer who writes the `chat.agent()` writes the skill bundle. The trust boundary is identical to any `tool.execute` handler the developer writes — scripts run directly in the Trigger.dev worker container, no sandboxing required. This makes skills different from the Claude Code / end-user model where arbitrary user-provided skills need isolation. Don't accept skill paths from untrusted input. ## Skill folder layout A skill is a directory under your project (conventionally `trigger/skills/{id}/`): ``` trigger/skills/time-utils/ ├── SKILL.md # Required — frontmatter + instructions ├── scripts/ │ ├── now.sh │ └── add.sh ├── references/ │ └── timezones.txt └── assets/ # Optional — templates, data files, etc. ``` ### SKILL.md Frontmatter is YAML-subset — only `name` and `description` are required: ```md theme={"theme":"css-variables"} --- name: time-utils description: Compute and format dates/times in arbitrary timezones. Use when the user asks "what time is it", timezone conversions, or date math. 
--- # Time utilities ## When to use - The user asks for the current time in a timezone - The user wants date math ("3 days from now") ## Scripts ### `scripts/now.sh [TZ]` Prints the current time in the given IANA timezone (default `UTC`). ### `scripts/add.sh DAYS [TZ]` Prints a date `DAYS` days from now. ## Tips - IANA timezone names only (`America/New_York`, not `EST`). - See `references/timezones.txt` for a cheat-sheet. ``` The **description** is what the model sees in its system prompt — write it like you're explaining to the agent when to reach for the skill. The **body** is loaded on demand via the `loadSkill` tool when the agent decides to use the skill. Write it like documentation for the agent. ## Defining and using a skill ```ts trigger/chat.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { skills } from "@trigger.dev/sdk"; import { streamText, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; const timeUtilsSkill = skills.define({ id: "time-utils", path: "./skills/time-utils", }); export const agent = chat.agent({ id: "docs-chat", onChatStart: async () => { chat.skills.set([await timeUtilsSkill.local()]); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, ...chat.toStreamTextOptions(), stopWhen: stepCountIs(15), }); }, }); ``` `skills.define({ id, path })` does two things: 1. Registers the skill with the Trigger.dev build system so the CLI **automatically bundles the folder** into your deploy image at `/app/.trigger/skills/{id}/`. No `trigger.config.ts` changes, no build extension — it just works. 2. Returns a `SkillHandle` you use at runtime. `skill.local()` reads the bundled `SKILL.md` from disk and returns a `ResolvedSkill` with the parsed frontmatter + body + on-disk path. `chat.skills.set([...])` stores the resolved skills for the current run. 
`chat.toStreamTextOptions()` spreads them into `streamText` automatically: * The frontmatter `description` lands in the system prompt under "Available skills:". * Three tools are added: `loadSkill`, `readFile`, `bash` — scoped per skill. ## What gets auto-injected When you spread `chat.toStreamTextOptions()` with skills set, the AI SDK call receives three tools: ### `loadSkill({ name })` Returns the full `SKILL.md` body for the named skill. The model calls this first when it decides a skill is relevant, to load the full instructions. ### `readFile({ skill, path })` Reads a file inside the skill's bundled folder. Paths are relative to the skill's root and are rejected if they attempt to escape via `..` or absolute paths. Output is capped at 1 MB per call. Use for reference files and templates that the model should read literally: ``` readFile({ skill: "time-utils", path: "references/timezones.txt" }) ``` ### `bash({ skill, command })` Runs a bash command with `cwd` set to the skill's root. Stdout and stderr are captured and returned (each capped at 64 KB per call, with tail truncation). The turn's abort signal propagates — cancelling the run kills the child process. Use to invoke the skill's bundled scripts: ``` bash({ skill: "time-utils", command: "bash scripts/now.sh America/Los_Angeles" }) ``` Script runtime expectations are yours to manage. If your skill uses `extract.py`, your deploy image needs Python — add it via your build config the same way you would for any other task dependency. ## How discovery works in the model The model sees a short preamble appended to your system prompt: ``` Available skills (call `loadSkill` to read the full instructions before using one): - time-utils: Compute and format dates/times in arbitrary timezones... - pdf-processing: Extract text from PDFs, fill forms... 
``` When the user asks something that matches a description, the model calls `loadSkill({ name: "time-utils" })` to load the body, then follows the body's instructions — typically by calling `bash` or `readFile` on the bundled scripts. This is **progressive disclosure**: each skill costs \~100 tokens up front (its one-line description), and only the ones the model actually uses pay the full context cost. ## Mixing skills with custom tools If you also define your own AI SDK tools, pass them through `chat.toStreamTextOptions()` so the merge is explicit: ```ts theme={"theme":"css-variables"} return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, ...chat.toStreamTextOptions({ tools: { webFetch, // your tool deepResearch, // your tool }, }), stopWhen: stepCountIs(15), }); ``` Your tools win on name conflicts. (Pick names that don't collide with `loadSkill` / `readFile` / `bash` to keep things predictable.) Also declare those same tools on the agent's [`tools`](/docs/ai-chat/tools) config. `toStreamTextOptions` merges them with the skill tools for the model call, while the config option threads them into history re-conversion so any `toModelOutput` survives across turns. The auto-injected skill tools (`loadSkill` / `readFile` / `bash`) don't define `toModelOutput`, so they don't need to be on the config. ## Bundling Bundling is **built-in to the CLI** — there's no extension to import. When you run `trigger deploy` or `trigger dev`: 1. esbuild bundles your task code as usual. 2. The CLI forks the indexer locally against the bundled output, collects every `skills.define({ path })` registration. 3. Each skill's folder is copied to `{outputPath}/.trigger/skills/{id}/` via a recursive copy. 4. The existing Dockerfile `COPY` picks up `.trigger/skills/` along with the rest of the bundle — no Dockerfile changes. If you're running `trigger dev`, the same layout appears in the local dev output directory, so `skill.local()` works the same way. 
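The copy performed in steps 2 and 3 above can be pictured as a small planning function that maps each registration to its destination folder. This is a sketch under stated assumptions, not the CLI's actual implementation: `planSkillCopies`, `SkillRegistration`, and `CopyStep` are hypothetical names, and the `/tmp/build` output path is made up.

```ts
// Illustrative sketch of bundling steps 2-3; the real CLI internals may differ.
interface SkillRegistration {
  id: string; // the `id` passed to skills.define
  path: string; // the source folder passed to skills.define
}

interface CopyStep {
  from: string;
  to: string;
}

function planSkillCopies(outputPath: string, registrations: SkillRegistration[]): CopyStep[] {
  const root = outputPath.replace(/\/+$/, ""); // drop any trailing slashes
  return registrations.map((skill) => ({
    from: skill.path,
    to: `${root}/.trigger/skills/${skill.id}/`, // documented destination layout
  }));
}

const plan = planSkillCopies("/tmp/build", [{ id: "time-utils", path: "./skills/time-utils" }]);
for (const step of plan) {
  console.log(`${step.from} -> ${step.to}`);
}
// prints "./skills/time-utils -> /tmp/build/.trigger/skills/time-utils/"
```

Each planned step would then be executed as a recursive folder copy, which is why the whole skill directory (scripts, references, assets) arrives intact in the bundle.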
## Path scoping rules * `skill.path` always resolves to `${process.cwd()}/.trigger/skills/{id}/` at runtime. Don't hardcode paths elsewhere. * `readFile` rejects `..` segments and absolute paths — the tool only exposes files inside the skill's own directory. * `bash` runs with `cwd` set to the skill's root. Inside the script, relative paths resolve against the skill directory. * Cross-skill access isn't provided — each skill is isolated by design. If two skills need to share data, either duplicate the shared file or consolidate the skills. ## Current limitations * `skill.resolve()` (backend-managed overrides) is not available yet — use `.local()` for now. Dashboard-editable `SKILL.md` is on the roadmap. * No per-skill metrics in the dashboard yet. * No Anthropic `/v1/skills` integration — use the portable path today; we're tracking the Anthropic optimization separately. ## Full example See [`projects/ai-chat/src/trigger/skills/time-utils/`](https://github.com/triggerdotdev/references/tree/main/projects/ai-chat/src/trigger/skills/time-utils) in the [references repo](https://github.com/triggerdotdev/references) for a working skill that bundles two bash scripts and a reference cheat-sheet, wired into a `chat.agent` that answers timezone questions. ## Related * [AI SDK cookbook — Agent Skills](https://ai-sdk.dev/cookbook/guides/agent-skills) — the userland pattern we build on * [Anthropic Agent Skills](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview) — Anthropic's codified version (server-side, optional future integration) # Sub-Agents Source: https://trigger.dev/docs/ai-chat/patterns/sub-agents Delegate work to durable sub-agents from within a parent agent's tool calls, with streaming preliminary results. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. 
Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. Sub-agents let a parent agent delegate work to other agents running as durable Trigger.dev tasks. The sub-agent's response streams back through the parent as preliminary tool results, so the frontend sees the sub-agent working inside the parent's tool call card. This builds on the AI SDK's [async generator tool pattern](https://ai-sdk.dev/docs/agents/subagents) and Trigger.dev's [AgentChat](/docs/ai-chat/server-chat) for server-side agent interaction. ## How it works 1. The parent LLM calls a tool (e.g., `researchAgent`) 2. The tool's `execute` is an `async function*` (async generator) 3. Inside, it creates an `AgentChat` and sends a message to the sub-agent 4. `yield* stream.messages()` streams each accumulated `UIMessage` snapshot as a preliminary tool result 5. The frontend renders the sub-agent's response building up inside the parent's tool card 6. `toModelOutput` compresses the full output into a summary for the parent LLM ``` Parent LLM │ ├─ calls researchAgent tool │ │ │ ├─ AgentChat triggers sub-agent run │ ├─ sub-agent streams response (text, tool calls, etc.) │ ├─ yield* sends UIMessage snapshots as preliminary results │ └─ toModelOutput compresses for parent LLM │ └─ parent LLM reads compressed summary, continues reasoning ``` ## Single-turn sub-agent The simplest pattern: one tool call, one sub-agent turn, conversation closes. 
```ts theme={"theme":"css-variables"} import { tool } from "ai"; import { AgentChat } from "@trigger.dev/sdk/chat"; import { z } from "zod"; import type { prReviewAgent } from "./trigger/pr-review"; const prReviewTool = tool({ description: "Delegate a PR review to the PR review agent.", inputSchema: z.object({ prNumber: z.number().describe("The PR number to review"), repo: z.string().describe("The GitHub repo URL"), }), execute: async function* ({ prNumber, repo }, { abortSignal }) { const chat = new AgentChat({ agent: "pr-review", id: `review-${prNumber}`, clientData: { userId: "parent-agent", githubUrl: repo }, }); const stream = await chat.sendMessage(`Review PR #${prNumber}`, { abortSignal }); // Each yield sends a UIMessage snapshot to the frontend yield* stream.messages(); await chat.close(); }, // The parent LLM only sees this compressed summary toModelOutput: ({ output: message }) => { const lastText = message?.parts?.findLast( (p: { type: string }) => p.type === "text" ) as { text?: string } | undefined; return { type: "text", value: lastText?.text ?? "Review complete." }; }, }); ``` Use this tool in a parent agent's `streamText` call: ```ts theme={"theme":"css-variables"} import { streamText, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; const result = streamText({ model: anthropic("claude-sonnet-4-6"), tools: { prReview: prReviewTool }, prompt: "Review PR #42 on triggerdotdev/trigger.dev", stopWhen: stepCountIs(15), }); ``` ## Multi-turn sub-agent (LLM-driven) The parent LLM drives a persistent conversation with a sub-agent across multiple tool calls. Each call with the same `conversationId` hits the same durable agent run. ```ts theme={"theme":"css-variables"} import { tool } from "ai"; import { AgentChat } from "@trigger.dev/sdk/chat"; import { z } from "zod"; // Track active sub-agent conversations const subAgents = new Map(); const researchTool = tool({ description: "Talk to a research agent. 
Use the same conversationId to continue " + "an existing conversation — the agent remembers full context.", inputSchema: z.object({ conversationId: z .string() .describe("Unique ID for this research thread. Reuse to continue."), message: z.string().describe("Your message to the research agent"), }), execute: async function* ({ conversationId, message }, { abortSignal }) { let agent = subAgents.get(conversationId); if (!agent) { agent = new AgentChat({ agent: "research-agent", id: conversationId, }); subAgents.set(conversationId, agent); } const stream = await agent.sendMessage(message, { abortSignal }); yield* stream.messages(); }, toModelOutput: ({ output: message }) => { const lastText = message?.parts?.findLast( (p: { type: string }) => p.type === "text" ) as { text?: string } | undefined; return { type: "text", value: lastText?.text ?? "Done." }; }, }); ``` The parent LLM naturally calls this tool multiple times: 1. `researchAgent({ conversationId: "competitors", message: "Research competitors in AI agents" })` — first call triggers a new sub-agent run 2. `researchAgent({ conversationId: "competitors", message: "Go deeper on pricing" })` — same run, sub-agent has full context 3. `researchAgent({ conversationId: "new-topic", message: "..." })` — different ID = different sub-agent ### Cross-turn persistence Sub-agent conversations persist across **parent turns** because the `Map` lives in the parent's process heap. When the parent suspends and restores via snapshot, the heap is preserved — the Map still has the conversations, the sessions still have the run IDs. ```ts theme={"theme":"css-variables"} export const orchestrator = chat .withClientData({ schema: z.object({ userId: z.string() }) }) .customAgent({ id: "orchestrator", run: async (payload, { signal: runSignal }) => { // These survive across parent turns via snapshot/restore const subAgents = new Map(); const researchTool = tool({ // ... 
closes over subAgents Map }); // Turn loop — subAgents persist across all turns for (let turn = 0; turn < 50; turn++) { // ... streamText with researchTool } // Cleanup when parent exits await Promise.all( Array.from(subAgents.values()).map((a) => a.close().catch(() => {})) ); }, }); ``` ## How sub-agents clean up Sub-agents clean up through three mechanisms: 1. **Explicit close**: Call `chat.close()` or `agent.close()` when done 2. **Idle timeout**: The sub-agent's idle timeout expires, it suspends 3. **Suspend timeout**: The sub-agent's suspend timeout expires, the run ends For the multi-turn pattern, the parent should clean up sub-agents when it exits (in `onComplete` for managed agents, or at the end of the loop for custom agents). Without explicit cleanup, sub-agents close on their own via timeouts — no leaked resources or cost while suspended. ## What the frontend sees Each `yield` from `stream.messages()` sends a complete `UIMessage` containing all the sub-agent's parts accumulated so far. The AI SDK delivers these as `tool-output-available` chunks with `preliminary: true`. The frontend renders the tool part with: * `state: "output-available"` and `preliminary: true` while streaming * `state: "output-available"` and `preliminary: false` (or absent) when done The tool output contains the full `UIMessage` with nested parts — text, the sub-agent's own tool calls and results, reasoning, etc. ### Controlling what the parent LLM sees `toModelOutput` transforms the tool's output before it enters the parent LLM's context. 
The full UIMessage streams to the frontend, but the model only sees the compressed version: ```ts theme={"theme":"css-variables"} toModelOutput: ({ output: message }) => { // Extract just the final text — the model doesn't need // to see all the sub-agent's tool calls and intermediate work const lastText = message?.parts?.findLast( (p: { type: string }) => p.type === "text" ) as { text?: string } | undefined; return { type: "text", value: lastText?.text ?? "Done." }; }, ``` This is important for token efficiency: the sub-agent might use 100K tokens exploring and reasoning, but the parent LLM only consumes the summary. `toModelOutput` only runs when the SDK has your tools at conversion time. On a multi-turn parent, the SDK re-converts the persisted history at the start of each turn, so you must declare the sub-agent tool on the agent config (`chat.agent({ tools })`) for the compression to survive. Without it, the summary holds on turn 1 but turn 2 onward re-ingests the full sub-agent output. In a `chat.customAgent` loop you own the conversion, so pass the tools to `convertToModelMessages(uiMessages, { tools })` yourself. See [Tools: toModelOutput across turns](/docs/ai-chat/tools#tomodeloutput-across-turns). ## ChatStream.messages() The `messages()` method on `ChatStream` wraps the AI SDK's `readUIMessageStream`. It reads the raw `UIMessageChunk` stream and yields complete `UIMessage` snapshots — each containing all parts received so far. 
```ts theme={"theme":"css-variables"} const stream = await chat.sendMessage("Research this topic"); // Each yield is a complete UIMessage with all accumulated parts for await (const message of stream.messages()) { console.log(message.parts.length, "parts so far"); } ``` For the sub-agent pattern, use `yield*` to delegate all yields to the parent tool's generator: ```ts theme={"theme":"css-variables"} execute: async function* ({ topic }, { abortSignal }) { const stream = await chat.sendMessage(topic, { abortSignal }); yield* stream.messages(); }, ``` `stream.messages()` consumes the stream. You can't also call `stream.text()` or iterate over chunks on the same stream. Pick one consumption mode. ## Combining with chat.agent() Sub-agent tools work inside both `chat.agent()` (managed) and `chat.customAgent()` (manual lifecycle): ```ts theme={"theme":"css-variables"} // Managed agent with sub-agent tool const tools = { research: researchTool }; export const myAgent = chat.agent({ id: "orchestrator", tools, // declare here so toModelOutput survives across turns run: async ({ messages, tools, stopSignal }) => { return streamText({ model: anthropic("claude-sonnet-4-6"), messages, tools, abortSignal: stopSignal, stopWhen: stepCountIs(15), }); }, }); ``` For `chat.customAgent()`, define the tool and sub-agent Map inside the `run` closure so they survive across turns. Since you own the turn loop there, convert history with your tools in scope so `toModelOutput` is re-applied each turn: `convertToModelMessages(uiMessages, { tools })`. See [Tools: manual turn loops](/docs/ai-chat/tools#manual-turn-loops-chatcustomagent). ## Streaming progress from a subtask to the parent chat When a tool invokes a subtask via `triggerAndWait`, the subtask can stream custom data parts directly to the parent chat using `chat.stream.writer({ target: "root" })`. 
The frontend receives these as `DataUIPart` objects in `message.parts` on the **parent's** message stream: ```ts theme={"theme":"css-variables"} import { chat, ai } from "@trigger.dev/sdk/ai"; import { schemaTask } from "@trigger.dev/sdk"; import { streamText, tool, generateId } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { z } from "zod"; export const researchTask = schemaTask({ id: "research", schema: z.object({ query: z.string() }), run: async ({ query }) => { const partId = generateId(); // Stream a data-* chunk to the root run's chat stream. const { waitUntilComplete } = chat.stream.writer({ target: "root", execute: ({ write }) => { write({ type: "data-research-status", id: partId, data: { query, status: "in-progress" }, }); }, }); await waitUntilComplete(); const result = await doResearch(query); // Update the same part with the final status — same type + id replaces it. const { waitUntilComplete: waitDone } = chat.stream.writer({ target: "root", execute: ({ write }) => { write({ type: "data-research-status", id: partId, data: { query, status: "done", resultCount: result.length }, }); }, }); await waitDone(); return result; }, }); const research = tool({ description: researchTask.description ?? "", inputSchema: researchTask.schema!, execute: ai.toolExecute(researchTask), }); ``` On the frontend, render the custom data part: ```tsx theme={"theme":"css-variables"} {message.parts.map((part, i) => { if (part.type === "data-research-status") { const { query, status, resultCount } = part.data; return (
        <div key={i}>{status === "done" ? `Found ${resultCount} results` : `Researching "${query}"...`}</div>
    );
  }
  // ...other part types
})}
```

The `target` option accepts:

* `"self"`: current run (default)
* `"parent"`: parent task's run
* `"root"`: root task's run (the chat agent)
* A specific run ID string

## Inside `ai.toolExecute`: accessing tool + chat context

When a subtask runs via `execute: ai.toolExecute(task)`, it can read the parent's tool call ID and chat context from inside the subtask body:

```ts theme={"theme":"css-variables"}
import { ai, chat } from "@trigger.dev/sdk/ai";
import { schemaTask } from "@trigger.dev/sdk";
import { z } from "zod";
import type { myChat } from "./chat";

export const mySubtask = schemaTask({
  id: "my-subtask",
  schema: z.object({ query: z.string() }),
  run: async ({ query }) => {
    // The AI SDK tool call ID, useful as a stable `data-*` chunk id
    const toolCallId = ai.toolCallId();

    // Typed chat context: `clientData` is typed off your chat's `clientDataSchema`
    const { chatId, clientData } = ai.chatContextOrThrow<typeof myChat>();

    const { waitUntilComplete } = chat.stream.writer({
      target: "root",
      execute: ({ write }) => {
        write({
          type: "data-progress",
          id: toolCallId,
          data: { status: "working", query, userId: clientData?.userId },
        });
      },
    });
    await waitUntilComplete();

    return { result: "done" };
  },
});
```

| Helper | Returns | Description |
| --- | --- | --- |
| `ai.toolCallId()` | `string \| undefined` | The AI SDK tool call ID |
| `ai.chatContext()` | `{ chatId, turn, continuation, clientData } \| undefined` | Chat context with typed `clientData`. Returns `undefined` if not in a chat context. |
| `ai.chatContextOrThrow()` | `{ chatId, turn, continuation, clientData }` | Same as above but throws if not in a chat context |
| `ai.currentToolOptions()` | `ToolCallExecutionOptions \| undefined` | Full tool execution options |

The subtask body also has read-only access to any [`chat.local`](/docs/ai-chat/chat-local) values initialized in the parent, auto-hydrated from the parent's metadata on first access.

# Tool result auditing

Source: https://trigger.dev/docs/ai-chat/patterns/tool-result-auditing

Fire side effects (audit logs, billing, notifications) exactly once per resolved tool call, using extractNewToolResults inside hydrateMessages or onTurnComplete.

The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features; they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

When a chat agent uses [tools](/docs/ai-chat/tools) (especially [human-in-the-loop](/docs/ai-chat/patterns/human-in-the-loop) tools that wait on `addToolOutput` from the frontend), you often need to fire side effects exactly once per resolved tool call:

* **Audit logs**: record every tool result for compliance.
* **Billing**: charge per tool invocation.
* **Notifications**: alert downstream systems when a specific tool resolves.
* **Search-index updates**: reflect tool outputs into a derived store.

The naive approach ("log every tool part you see") over-counts. The same assistant message gets re-shown across re-renders, replays, and retries. You want a function of the form **"is this tool result one I haven't already logged?"** That's exactly what [`chat.history.extractNewToolResults`](/docs/ai-chat/backend#chat-history) returns.
## The pattern ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { auditLog } from "@/lib/audit"; export const myChat = chat.agent({ id: "my-chat", hydrateMessages: async ({ chatId, incomingMessages }) => { for (const msg of incomingMessages) { for (const r of chat.history.extractNewToolResults(msg)) { await auditLog.record({ chatId, toolCallId: r.toolCallId, toolName: r.toolName, output: r.output, errorText: r.errorText, }); } } return await db.getMessages(chatId); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` The hook fires per turn. `incomingMessages` is the new wire message (0-or-1-length, see [v4.5 wire format change](/docs/ai-chat/upgrade-guide#v45-wire-format-change)). For each new tool result on that message, write one audit row. Then return the canonical chain from your DB. `extractNewToolResults` compares the message against the current `chat.history` chain and returns only tool parts whose `toolCallId` is **not** already resolved. That's what makes the call exactly-once: * A re-emitted message (same id, same toolCallId) returns `[]` — no duplicate log. * A genuinely new tool result on a known assistant message returns just the new ones. * A first-time tool result returns the full set. ## Why `hydrateMessages` is the right hook The pattern works in any pre-merge callback, but `hydrateMessages` is the canonical spot for two reasons: 1. **It fires before the runtime merges** the incoming message into the accumulator. Once merged, the tool results are already on the chain, and `extractNewToolResults` returns `[]` for them. 2. **It always fires per turn** — including HITL turns where the user resolved a tool with `addToolOutput`, which is the highest-volume audit event in most apps. 
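The exactly-once behaviour and the pre-merge requirement described above can be illustrated with a plain TypeScript sketch. This is a simplified model, not the SDK's implementation: the `ToolPart` and `Message` types and the `netNewToolResults` function are hypothetical stand-ins that only mimic the documented behaviour of `extractNewToolResults`.

```typescript
// Illustrative only: a simplified model of "net-new resolved tool results".
// The types and function below are assumptions, not part of @trigger.dev/sdk.
type ToolPart = {
  type: string; // e.g. "tool-ask_user"
  toolCallId: string;
  state: "input-available" | "output-available" | "output-error";
  output?: unknown;
  errorText?: string;
};
type Message = { id: string; role: "user" | "assistant"; parts: ToolPart[] };

const isResolved = (p: ToolPart): boolean =>
  p.state === "output-available" || p.state === "output-error";

// Return resolved tool parts on `incoming` whose toolCallId is not yet
// resolved anywhere on `chain`.
function netNewToolResults(chain: Message[], incoming: Message): ToolPart[] {
  const alreadyResolved = new Set<string>();
  for (const msg of chain) {
    for (const part of msg.parts) {
      if (isResolved(part)) alreadyResolved.add(part.toolCallId);
    }
  }
  return incoming.parts.filter(
    (p) => isResolved(p) && !alreadyResolved.has(p.toolCallId)
  );
}

// Chain before the merge: the tool call is still pending.
const chainBeforeMerge: Message[] = [
  {
    id: "a-1",
    role: "assistant",
    parts: [{ type: "tool-ask_user", toolCallId: "call-1", state: "input-available" }],
  },
];

// Wire message: the same assistant message, with the tool now resolved.
const incoming: Message = {
  id: "a-1",
  role: "assistant",
  parts: [
    { type: "tool-ask_user", toolCallId: "call-1", state: "output-available", output: "yes" },
  ],
};

// Pre-merge: the resolution is net-new, so it would be audited once.
const preMerge = netNewToolResults(chainBeforeMerge, incoming);

// Post-merge: the chain already holds the resolved part, so nothing is new.
const chainAfterMerge: Message[] = [incoming];
const postMerge = netNewToolResults(chainAfterMerge, incoming);

console.log(preMerge.length, postMerge.length);
```

In this model the same wire message yields one result before the merge and none after it, which is why the audit call belongs in a pre-merge hook.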
By the time `onTurnComplete` fires, the chain already contains `responseMessage`, so calling `extractNewToolResults(responseMessage)` there returns `[]`. Don't put audit logging there for the resolution path.

## Without `hydrateMessages`: `onTurnComplete` for self-emitted tool calls

If you don't use `hydrateMessages`, the runtime's snapshot+replay path handles persistence. You can still audit the agent's **own** tool executions in `onTurnComplete`, but read the resolved tool parts directly from `newUIMessages` rather than calling `extractNewToolResults` on the just-emitted message:

```ts theme={"theme":"css-variables"}
onTurnComplete: async ({ chatId, newUIMessages }) => {
  // The assistant message from this turn is in newUIMessages.
  for (const msg of newUIMessages) {
    if (msg.role !== "assistant") continue;
    for (const part of msg.parts) {
      if (
        typeof part.type === "string" &&
        part.type.startsWith("tool-") &&
        ((part as any).state === "output-available" || (part as any).state === "output-error")
      ) {
        await auditLog.record({
          chatId,
          toolCallId: (part as any).toolCallId,
          toolName: (part as any).type.slice("tool-".length),
          output: (part as any).output,
          errorText: (part as any).errorText,
        });
      }
    }
  }
},
```

`newUIMessages` contains only the messages this turn produced, with no prior-chain noise. Each tool part shows up exactly once.

This works for tools the agent itself calls (no HITL pause). For HITL flows where the user resolves a tool with `addToolOutput`, the resolution arrives on the **next** turn's wire message, not in `newUIMessages` of the resolving turn, so use `hydrateMessages` for those.

## Idempotency at the storage layer

Even with `extractNewToolResults`, transient failures (e.g. an audit-log POST that times out and is retried) can produce duplicates. Make the audit-log writer idempotent on `toolCallId`:

```ts theme={"theme":"css-variables"}
await auditLog.upsert({
  where: { toolCallId: r.toolCallId },
  create: { /* ... */ },
  update: { /* timestamp, retry count, etc.
*/ }, }); ``` `toolCallId` is unique per tool invocation (assigned by the AI SDK when the model emits the tool call) and stable across retries — perfect for an idempotency key. ## What `extractNewToolResults` returns ```ts theme={"theme":"css-variables"} type ExtractedToolResult = { toolCallId: string; toolName: string; input: unknown; // The arguments the model passed when calling the tool output?: unknown; // The tool's return value (output-available state) errorText?: string; // Error message (output-error state) }; ``` Tool parts in `input-available` state (the model called the tool but it hasn't resolved yet) are not returned — only **resolved** results count. ## Combining with HITL [Human-in-the-loop](/docs/ai-chat/patterns/human-in-the-loop) tools pause the turn waiting for `addToolOutput` from the frontend. When the user submits, the wire message carries an updated assistant message with the tool now in `output-available` state. `extractNewToolResults` against that message returns the just-resolved tool — exactly one audit row per user resolution: ```ts theme={"theme":"css-variables"} hydrateMessages: async ({ chatId, incomingMessages }) => { for (const msg of incomingMessages) { for (const r of chat.history.extractNewToolResults(msg)) { // Fires once per ask_user / approval / similar resolution await auditLog.record({ chatId, /* ... */ }); } } return await db.getMessages(chatId); } ``` This is the original motivator for the helper — see the [HITL pattern's net-new-tool-result section](/docs/ai-chat/patterns/human-in-the-loop#acting-once-per-net-new-tool-result). 
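As a sketch of the idempotent writer described under "Idempotency at the storage layer", the snippet below keys an in-memory store on `toolCallId`. The `InMemoryAuditLog` class and its `attempts` counter are hypothetical stand-ins; a real store would more likely be a database table with a unique constraint on `toolCallId`.

```typescript
// Hypothetical in-memory stand-in for an audit store. Not an SDK API.
type AuditRow = {
  toolCallId: string;
  toolName: string;
  output?: unknown;
  attempts: number;
};

class InMemoryAuditLog {
  private rows = new Map<string, AuditRow>();

  // Upsert keyed on toolCallId: the first write creates the row,
  // retries only bump the attempt counter instead of adding a duplicate.
  upsert(entry: Omit<AuditRow, "attempts">): AuditRow {
    const existing = this.rows.get(entry.toolCallId);
    const row: AuditRow = existing
      ? { ...existing, attempts: existing.attempts + 1 }
      : { ...entry, attempts: 1 };
    this.rows.set(entry.toolCallId, row);
    return row;
  }

  get size(): number {
    return this.rows.size;
  }
}

const auditLogSketch = new InMemoryAuditLog();

// First delivery of the tool result.
auditLogSketch.upsert({ toolCallId: "call-1", toolName: "ask_user", output: "yes" });

// Simulated retry of the same write after a timeout.
const retried = auditLogSketch.upsert({
  toolCallId: "call-1",
  toolName: "ask_user",
  output: "yes",
});

console.log(auditLogSketch.size, retried.attempts);
```

The retry leaves a single row behind, so a duplicated delivery changes bookkeeping fields only and never the number of audit records.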
## See also * [`chat.history`](/docs/ai-chat/backend#chat-history) — full reference for `extractNewToolResults`, `getPendingToolCalls`, `getResolvedToolCalls` * [Human-in-the-loop](/docs/ai-chat/patterns/human-in-the-loop) — the pattern this auditing hook complements * [`hydrateMessages`](/docs/ai-chat/lifecycle-hooks#hydratemessages) — where pre-merge auditing lives * [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay) — how the runtime rebuilds chains, and why `extractNewToolResults` works against them # Trusted edge signals Source: https://trigger.dev/docs/ai-chat/patterns/trusted-edge-signals How to safely deliver server-trusted signals (bot scores, JA4, ASN, ReCAPTCHA verdicts) to a chat.agent run via an edge proxy. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. A common need for chat-style endpoints is to drive agent behavior from **server-trusted signals** that the browser cannot be allowed to declare itself — bot management scores, JA4 fingerprints, ASN, ReCAPTCHA verdicts, or any other anti-abuse data only the edge can see. The agent's [`clientData`](/docs/ai-chat/reference#withclientdata) channel is the right delivery mechanism, but `clientData` set in the browser is by definition spoofable. The fix is to move the value population out of the browser and into a trusted edge proxy. This page documents the pattern using Cloudflare Workers as the proxy. The same shape applies to any edge layer (custom reverse proxy, Vercel Edge Middleware, AWS Lambda\@Edge) — the trust comes from the deployment topology, not from Trigger.dev validating the source. 
## Why headers don't work It's tempting to ask whether `POST /realtime/v1/sessions/{id}/in/append` could carry the signal as an HTTP header. It cannot. The realtime route reads only `Authorization` and `X-Part-Id`; the remaining headers are dropped at the route boundary and the body is persisted to the durable stream as opaque bytes. There is no `headers → run payload` channel. The trigger.dev wire payload, on the other hand, has a typed per-turn metadata channel ([`ChatTaskWirePayload.metadata`](/docs/ai-chat/client-protocol#chattaskwirepayload)). It already flows from the wire into [`clientData`](/docs/ai-chat/reference#withclientdata) on every hook (`onBoot`, `onChatStart`, `onTurnStart`, `run`, `onTurnComplete`). That field is where signals must land. ## The trust boundary The pattern has one architectural requirement and one wire-shape convention. **Topology**: the browser must not be able to reach `trigger.dev` directly. All four chat-related requests (`POST /api/v1/sessions`, `GET /realtime/v1/sessions/{id}/out`, `POST /realtime/v1/sessions/{id}/in/append`, `POST /api/v1/auth/jwt/claims`) flow through your edge proxy. The proxy holds the trust; trigger.dev simply persists whatever the proxy writes. **Namespace**: pick a key your edge proxy owns exclusively — e.g. `__cf`, `__edge`, `__trust`. The proxy **strips** anything in that key on the way in and **injects** its own value on every request. Nothing else in your system should write that key. This is the convention that converts deployment topology into a guarantee the agent can rely on. ```mermaid theme={"theme":"css-variables"} sequenceDiagram participant Browser participant Edge as Edge Proxy (CF Worker) participant Trigger as trigger.dev API participant Agent as chat.agent run Browser->>Edge: POST /api/v1/sessions { triggerConfig.basePayload.metadata: {...} } Edge->>Edge: strip body.triggerConfig.basePayload.metadata.__cf
    Edge->>Edge: inject body.triggerConfig.basePayload.metadata.__cf = { botScore, ja4, asn }
    Edge->>Trigger: POST /api/v1/sessions (rewritten body)
    Trigger-->>Agent: run boots with payload.metadata.__cf
    Browser->>Edge: POST /realtime/v1/sessions/{id}/in/append { kind: "message", payload: {...} }
    Edge->>Edge: strip payload.metadata.__cf
    Edge->>Edge: inject payload.metadata.__cf
    Edge->>Trigger: POST /in/append (rewritten body)
    Trigger-->>Agent: chat.messages.wait() resolves with payload.metadata.__cf
```

## Wire payload: the two endpoints to rewrite

The signal needs to land in **two** places. Both bodies are JSON; the edge proxy parses, mutates the namespaced key, and re-serializes.

### `POST /api/v1/sessions`: session create

The browser's session-create call carries the first-turn metadata under `triggerConfig.basePayload.metadata`. The proxy mutates that:

```ts theme={"theme":"css-variables"}
// Before
{
  "type": "chat.agent",
  "externalId": "conv-123",
  "taskIdentifier": "my-agent",
  "triggerConfig": {
    "basePayload": {
      "chatId": "conv-123",
      "trigger": "preload",
      "metadata": { "userId": "user-456" }
    }
  }
}

// After
{
  "type": "chat.agent",
  "externalId": "conv-123",
  "taskIdentifier": "my-agent",
  "triggerConfig": {
    "basePayload": {
      "chatId": "conv-123",
      "trigger": "preload",
      "metadata": {
        "userId": "user-456",
        "__cf": { "botScore": 95, "ja4": "...", "asn": 13335, "country": "US" }
      }
    }
  }
}
```

### `POST /realtime/v1/sessions/{id}/in/append`: every follow-up turn

The body is a JSON-serialized `ChatInputChunk`. The proxy parses it, checks `kind === "message"`, and mutates `payload.metadata`:

```ts theme={"theme":"css-variables"}
// Before
{
  "kind": "message",
  "payload": {
    "message": { "id": "u-2", "role": "user", "parts": [{ "type": "text", "text": "..." }] },
    "chatId": "conv-123",
    "trigger": "submit-message",
    "metadata": { "userId": "user-456" }
  }
}

// After
{
  "kind": "message",
  "payload": {
    "message": { ... },
    "chatId": "conv-123",
    "trigger": "submit-message",
    "metadata": {
      "userId": "user-456",
      "__cf": { "botScore": 95, "ja4": "...", "asn": 13335, "country": "US" }
    }
  }
}
```

Both bodies stay well under the [per-record cap on `/in/append`](/docs/ai-chat/client-protocol#step-3-send-messages-stops-and-actions); a typical trust object is \~200 bytes.
Other paths (`.out` SSE, `/api/v1/auth/jwt/claims`, anything else) pass through the proxy untouched. The SSE stream in particular must not be buffered; preserve the response body as-is.

## Cloudflare Worker reference implementation

A complete worker that proxies all paths to `TRIGGER_API_UPSTREAM` and injects `__cf` on the two body-write endpoints:

```ts theme={"theme":"css-variables"}
export interface Env {
  TRIGGER_API_UPSTREAM: string; // e.g. "https://api.trigger.dev"
}

type CfTrustData = {
  botScore: number;
  ja4: string;
  asn: number;
  country: string;
};

function readCfTrustData(request: Request): CfTrustData {
  const cf = (request as Request & { cf?: Record<string, unknown> }).cf;
  const bm = cf?.botManagement as Record<string, unknown> | undefined;
  return {
    botScore: (bm?.score as number) ?? 0,
    ja4: (bm?.ja4 as string) ?? "",
    asn: (cf?.asn as number) ?? 0,
    country: (cf?.country as string) ?? "",
  };
}

function injectCf(metadata: Record<string, unknown> | undefined, cf: CfTrustData) {
  // Strip anything the client tried to send under our namespace,
  // then inject the edge-trusted value. Topology + convention =
  // trust.
  const stripped = { ...(metadata ?? {}) };
  delete stripped.__cf;
  return { ...stripped, __cf: cf };
}

function rewriteSessionsCreate(body: string, cf: CfTrustData) {
  const parsed = JSON.parse(body) as Record<string, unknown>;
  const tc = (parsed.triggerConfig as Record<string, unknown>) ?? {};
  const bp = (tc.basePayload as Record<string, unknown>) ?? {};
  parsed.triggerConfig = {
    ...tc,
    basePayload: { ...bp, metadata: injectCf(bp.metadata as Record<string, unknown>, cf) },
  };
  return JSON.stringify(parsed);
}

function rewriteAppend(body: string, cf: CfTrustData) {
  let parsed: Record<string, unknown>;
  try {
    parsed = JSON.parse(body);
  } catch {
    return body;
  }
  if (parsed.kind !== "message") return body;
  const payload = (parsed.payload as Record<string, unknown>) ?? {};
  parsed.payload = { ...payload, metadata: injectCf(payload.metadata as Record<string, unknown>, cf) };
  return JSON.stringify(parsed);
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const incoming = new URL(request.url);
    const target = new URL(incoming.pathname + incoming.search, env.TRIGGER_API_UPSTREAM);
    const cf = readCfTrustData(request);

    const isSessionsCreate = request.method === "POST" && incoming.pathname === "/api/v1/sessions";
    const isAppend =
      request.method === "POST" &&
      /^\/realtime\/v1\/sessions\/[^/]+\/in\/append$/.test(incoming.pathname);

    let body: BodyInit | null = null;
    if (request.method !== "GET" && request.method !== "HEAD") {
      const raw = await request.text();
      if (isSessionsCreate && raw) body = rewriteSessionsCreate(raw, cf);
      else if (isAppend && raw) body = rewriteAppend(raw, cf);
      else body = raw;
    }

    const headers = new Headers(request.headers);
    headers.delete("host");
    headers.delete("content-length");

    return fetch(target.toString(), {
      method: request.method,
      headers,
      body,
      redirect: "manual",
    });
  },
};
```

Browser-only deployments also need CORS on the worker: echo `Access-Control-Request-Headers` on preflight and set `Access-Control-Allow-Origin` to your frontend origin. The trigger.dev route itself allows all origins, but the worker becomes the visible cross-origin endpoint to the browser.

### Streaming and latency

The SDK's `baseURL` accepts a function (see [Browser transport configuration](#browser-transport-configuration)), so the recommended setup routes `.in/append` and session-create through the worker but lets `.out` SSE go direct to `api.trigger.dev`. Body-mutation only happens on the POST paths; the SSE stream is read-only, doesn't need rewriting, and routing it direct saves an edge hop on every reconnect. If you do route `.out` through the proxy (e.g.
you want a single origin in front of `api.trigger.dev` and don't care about the extra hop), the template above handles it correctly because the worker returns `response.body` as a `ReadableStream`. **Do not replace that with `await response.text()`** anywhere in your fork; doing so converts the streaming SSE response into a buffered read and breaks per-chunk delivery. [Cloudflare Workers HTTP requests](https://developers.cloudflare.com/workers/platform/limits/) have no wall-clock duration limit while the client stays connected — the 60-second long-poll runs to completion on every plan, including Free. CPU-time limits (10 ms on Free, 30 s default on Paid) only apply to active computation; relaying bytes through `fetch` doesn't burn CPU. The two body-rewrite paths use sub-millisecond CPU for typical message sizes, well under either ceiling. Network-wise the proxy adds one edge hop: roughly 10–50 ms per request round trip versus talking to `api.trigger.dev` directly. Routing SSE direct via the function-form `baseURL` eliminates that hop on the long-lived path. ## Agent side — declare the namespace in `clientDataSchema` Mirror the namespace in the agent so every turn lands typed: ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { z } from "zod"; export const myAgent = chat .withClientData({ schema: z.object({ userId: z.string(), __cf: z.object({ botScore: z.number(), ja4: z.string(), asn: z.number(), country: z.string(), }), }), }) .agent({ id: "my-agent", run: async ({ messages, clientData, signal }) => { // Score-based routing. The values arrive from the edge proxy. if (clientData.__cf.botScore < 30) { return streamText({ model: anthropic("claude-haiku-4-5"), messages: [{ role: "system", content: "Reject politely; do not engage." }], abortSignal: signal, stopWhen: stepCountIs(15), }); } return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, // ... 
stopWhen: stepCountIs(15), }); }, }); ``` Because the schema requires `__cf` on every turn, any request that *doesn't* go through the proxy fails at the agent boundary — the turn produces a `[ERROR]` span on the trace and an empty `turn-complete` on the wire (see [the client protocol error-detection note](/docs/ai-chat/client-protocol#step-3-send-messages-stops-and-actions)). That gives you a server-side enforcement check for "did this request actually come through the trusted path?" ## Browser transport configuration Point the `TriggerChatTransport` at the worker, not at `api.trigger.dev`: `baseURL` accepts a function so you can route `.in/append` through the worker while keeping `.out` SSE direct to `api.trigger.dev`. The append path is where the body-mutation matters; the SSE stream is a read-only one-way channel that doesn't need to be proxied. Routing it direct saves an edge hop on every long-poll. ```tsx theme={"theme":"css-variables"} import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react"; const WORKER = "https://worker.your-domain.com"; const DIRECT = "https://api.trigger.dev"; const transport = useTriggerChatTransport({ task: "my-agent", baseURL: ({ endpoint }) => (endpoint === "out" ? DIRECT : WORKER), // ... accessToken, startSession, etc. // NOTE: do not set __cf in clientData here. The browser cannot be // trusted to populate it — the worker is the source of truth. 
clientData: { userId: currentUserId }, }); ``` If you'd rather route everything through the worker, pass a single string: ```tsx theme={"theme":"css-variables"} baseURL: "https://worker.your-domain.com", ``` `baseURL` accepts the same string-or-function shape on `chat.createStartSessionAction`, so the Next.js server action that creates the session also flows through the worker — that's how the very first run's `basePayload.metadata.__cf` gets injected before reaching `api.trigger.dev`: ```ts theme={"theme":"css-variables"} // actions.ts — server-only import { chat } from "@trigger.dev/sdk/ai"; export const startSession = chat.createStartSessionAction("my-agent", { tokenTTL: "1h", baseURL: ({ endpoint }) => endpoint === "sessions" ? WORKER : DIRECT, }); ``` The session-create endpoint discriminator is `"sessions"` (POST `/api/v1/sessions`) or `"auth"` (POST `/api/v1/auth/jwt/claims`) — distinct from the chat transport's `"in"` / `"out"`. If you want everything proxied, pass a string. ## Threat model Two important invariants follow from this design: 1. **Direct browser-to-trigger.dev requests cannot succeed**. As long as your agent's `clientDataSchema` requires the namespaced field, any request that doesn't go through the proxy fails schema validation and produces an empty turn. This is your gate. 2. **Anything inside the namespaced key is trusted only as far as the proxy is the sole writer**. If a client could obtain the public access token and bypass the proxy, they could send arbitrary values under `__cf`. The schema would still validate (it only checks shape, not provenance). The mitigation is operational: the public access token must only be served to clients that reach trigger.dev through the proxy. In practice this means your Next.js server actions and your browser are both behind the same edge layer, and the worker is the only fetch destination for `trigger.dev` baked into either of them. 
You can harden further with a shared-secret header the worker injects (e.g. `X-Edge-Signature`) and an agent-side check, but in most CDN deployments the deployment topology is already sufficient. ## Recipe summary 1. Pick a namespaced key the edge proxy owns (`__cf`, `__edge`, `__trust`). 2. Deploy a proxy in front of `trigger.dev` that rewrites POST `/api/v1/sessions` and POST `/realtime/v1/sessions/{id}/in/append` to inject your trusted values under that key. 3. Declare the namespace in the agent's `clientDataSchema` so missing or malformed signals fail at the agent boundary. 4. Point your transport's `baseURL` at the proxy. Never expose `api.trigger.dev` directly to the browser. ## See also * [Client Protocol](/docs/ai-chat/client-protocol) — the full wire shape the proxy is rewriting. * [`withClientData`](/docs/ai-chat/reference#withclientdata) — agent-side typed metadata channel. * [Large payloads](/docs/ai-chat/patterns/large-payloads) — for when injected signals or hooks need to ship more than the 1 MiB stream cap allows. # Version upgrades Source: https://trigger.dev/docs/ai-chat/patterns/version-upgrades Gracefully migrate suspended chat agents to a new deployment using chat.requestUpgrade() and the continuation mechanism. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. Chat agent runs are pinned to the worker version they started on. When you deploy a new version, suspended runs resume on the **old** code. If your deploy includes breaking changes (new tools, changed schemas, updated API contracts), this can cause issues. 
`chat.requestUpgrade()` lets the agent opt out of the current run so the transport triggers a new one on the latest version. ## How it works When `chat.requestUpgrade()` is called in `onTurnStart` or `onValidateMessages`: 1. `run()` is **skipped** — no response is generated on old code 2. The agent calls the server-side `endAndContinueSession` endpoint, which atomically swaps the Session's `currentRunId` to a freshly-triggered run on the latest deployment (optimistic-claim against `currentRunVersion`) 3. The new run picks up the conversation and produces the response 4. The transport's existing SSE subscription to `session.out` keeps receiving chunks across the swap — no client-side reconnect The new run lives on the **same Session** as the old one. `chatId` is the durable identity; only the underlying `currentRunId` rotates. The audit log records the new run with `reason: "upgrade"`. When called from inside `run()` or `chat.defer()`, the current turn completes normally first and the run exits afterward. The next message triggers the continuation on the same session. ```mermaid theme={"theme":"css-variables"} sequenceDiagram participant User participant Transport participant RunV1 as Run (v1) participant RunV2 as Run (v2) User->>Transport: send message Transport->>RunV1: input stream RunV1->>RunV1: onTurnStart → requestUpgrade() RunV1-->>Transport: trigger:upgrade-required RunV1->>RunV1: exit (run() never called) Transport->>RunV2: trigger new run (continuation, same message) RunV2-->>Transport: response stream Transport-->>User: response (seamless) ``` ## Contract versioning Define an explicit version for the contract between your frontend and agent. The frontend sends a `protocolVersion` via `clientData`, and the agent declares which versions it supports. When a breaking change ships (new tools, changed data parts, updated response format), bump the version. 
This gives you full control: the frontend can be backwards-compatible across multiple agent versions, and the agent only upgrades when it sees a version it doesn't support.

```tsx title="app/components/Chat.tsx" theme={"theme":"css-variables"}
import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";
import { useChat } from "@ai-sdk/react";

export function Chat() {
  const transport = useTriggerChatTransport({
    task: "my-chat",
    accessToken: ({ chatId }) => mintChatAccessToken(chatId),
    startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
    // Bump this when you ship a breaking change to the chat UI or tools
    clientData: { userId: user.id, protocolVersion: "v2" },
  });

  const { messages, sendMessage } = useChat({ transport });
  // ...
}
```

On the agent side, declare which versions the current code supports:

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

// The set of frontend protocol versions this agent code supports.
// When you deploy a breaking change, remove old versions from this set.
const SUPPORTED_VERSIONS = new Set(["v2", "v3"]);

export const myChat = chat
  .withClientData({
    schema: z.object({
      userId: z.string(),
      protocolVersion: z.string(),
    }),
  })
  .agent({
    id: "my-chat",
    onTurnStart: async ({ clientData }) => {
      if (clientData?.protocolVersion && !SUPPORTED_VERSIONS.has(clientData.protocolVersion)) {
        chat.requestUpgrade();
      }
    },
    run: async ({ messages, signal }) => {
      return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
    },
  });
```

The transport includes `clientData` in every payload (both the initial trigger and subsequent records on the session's `.in` channel), so the agent always has the current value.
This pattern is useful when: * Your frontend is backwards-compatible across several agent versions, but occasionally ships breaking changes * You want explicit control over when upgrades happen rather than upgrading on every deploy * Multiple frontend versions may be active at the same time (e.g., users with cached tabs) ## Auto-detect from build ID (Next.js / Vercel) For automatic upgrade on every deploy, pass your platform's build ID via `clientData` instead of a manual version. The agent stores the ID from the first message and upgrades when it changes: ```tsx title="app/components/Chat.tsx" theme={"theme":"css-variables"} // Vercel sets this at build time, or use your own build ID const APP_VERSION = process.env.NEXT_PUBLIC_VERCEL_DEPLOYMENT_ID ?? process.env.NEXT_PUBLIC_BUILD_ID ?? "dev"; export function Chat() { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), clientData: { userId: user.id, appVersion: APP_VERSION }, }); // ... } ``` ```ts title="trigger/chat.ts" theme={"theme":"css-variables"} const initialAppVersion = chat.local<{ version: string }>({ id: "appVersion" }); export const myChat = chat .withClientData({ schema: z.object({ userId: z.string(), appVersion: z.string(), }), }) .agent({ id: "my-chat", onBoot: async ({ clientData }) => { initialAppVersion.init({ version: clientData.appVersion }); }, onTurnStart: async ({ clientData }) => { if (clientData?.appVersion && clientData.appVersion !== initialAppVersion.version) { chat.requestUpgrade(); } }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` This upgrades on **every** deploy, not just breaking changes. Good for fast-moving projects where you always want the latest code. 
## Other agent types * **`chat.agent()`** and **`chat.createSession()`** — use `chat.requestUpgrade()` as shown above * **`chat.customAgent()`** — you control the turn loop, so just `return` from `run()` when you want to exit ## Interaction with recovery boot `chat.requestUpgrade()` is a graceful exit — the old run returns cleanly, never writing a partial assistant. The new continuation run boots with an empty `session.out` tail and the upgrade-trigger message on `session.in`. The trigger message dispatches as turn 1 on the new version via the normal continuation-wait path. [`onRecoveryBoot`](/docs/ai-chat/patterns/recovery-boot) does NOT fire on this path — the hook is reserved for mid-stream interruptions (cancel / crash / OOM) where a partial assistant exists on the tail. ## See also * [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) — where `onTurnStart` and `onChatResume` fit in the turn cycle * [Recovery boot](/docs/ai-chat/patterns/recovery-boot) — the sibling hook for mid-stream interruptions (does NOT fire on `requestUpgrade`) * [Database persistence](/docs/ai-chat/patterns/database-persistence) — how continuations interact with session state * [Client Protocol](/docs/ai-chat/client-protocol#step-4-handle-continuations) — how clients handle continuations at the wire level # Pending Messages Source: https://trigger.dev/docs/ai-chat/pending-messages Inject user messages mid-execution to steer agents between tool-call steps. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. 
## Overview

When an AI agent is executing tool calls, users may want to send a message that **steers the agent mid-execution**: adding context, correcting course, or refining the request without waiting for the response to finish.

The `pendingMessages` option enables this by injecting user messages between tool-call steps via the AI SDK's `prepareStep`. Messages that arrive during streaming are queued and injected at the next step boundary. If there are no more step boundaries (single-step response or final text generation), the message becomes the next turn automatically.

## How it works

1. User sends a message while the agent is streaming
2. The message is sent to the backend via input stream (`transport.sendPendingMessage`)
3. The backend queues it in the steering queue
4. At the next `prepareStep` boundary (between tool-call steps), `shouldInject` is called
5. If it returns `true`, the message is injected into the LLM's context
6. A `data-pending-message-injected` stream chunk confirms injection to the frontend
7. If `prepareStep` never fires (no tool calls), the message becomes the next turn

## Backend: chat.agent

Add `pendingMessages` to your `chat.agent` configuration:

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

export const myChat = chat.agent({
  id: "my-chat",
  pendingMessages: {
    // Only inject when there are completed steps (tool calls happened)
    shouldInject: ({ steps }) => steps.length > 0,
  },
  run: async ({ messages, signal }) => {
    return streamText({
      ...chat.toStreamTextOptions({ registry }),
      messages,
      tools: { /* ... */ },
      abortSignal: signal,
      stopWhen: stepCountIs(15),
    });
  },
});
```

The `prepareStep` for injection is automatically included when you spread `chat.toStreamTextOptions()`. If you provide your own `prepareStep` after the spread, it overrides the auto-injected one.
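The queue-then-inject flow listed under "How it works" can be modelled without the SDK. The sketch below is a toy model only: `SteeringQueue`, `onStepBoundary`, and `drain` are hypothetical names chosen for illustration, and the real steering queue inside `@trigger.dev/sdk` may be structured differently.

```typescript
// Toy model of a steering queue; NOT the SDK's implementation.
type QueuedMessage = { id: string; text: string };
type BatchEvent = { messages: QueuedMessage[]; steps: string[] };

class SteeringQueue {
  private queue: QueuedMessage[] = [];

  constructor(private shouldInject: (event: BatchEvent) => boolean) {}

  // A message arrives while the agent is streaming.
  enqueue(message: QueuedMessage): void {
    this.queue.push(message);
  }

  // Called at each step boundary. Returns the batch to inject, or an
  // empty array when nothing is pending or shouldInject declines.
  onStepBoundary(steps: string[]): QueuedMessage[] {
    if (this.queue.length === 0) return [];
    const batch = [...this.queue];
    if (!this.shouldInject({ messages: batch, steps })) return [];
    this.queue = [];
    return batch;
  }

  // Called when the response finishes: leftovers become the next turn.
  drain(): QueuedMessage[] {
    const rest = this.queue;
    this.queue = [];
    return rest;
  }
}

// Same policy as the example above: inject only after a completed step.
const steering = new SteeringQueue(({ steps }) => steps.length > 0);

steering.enqueue({ id: "m1", text: "also compare to vercel" });
const skipped = steering.onStepBoundary([]); // no completed steps yet
const injected = steering.onStepBoundary(["search"]); // one step done

steering.enqueue({ id: "m2", text: "thanks" });
const nextTurn = steering.drain(); // no further boundary before the end

console.log({ skipped, injected, nextTurn });
```

A message that `shouldInject` declines stays queued, so it is offered again at the next boundary and only falls through to the next turn when no boundary remains.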
### Options

| Option         | Type                                                   | Description                                                                                         |
| -------------- | ------------------------------------------------------ | --------------------------------------------------------------------------------------------------- |
| `shouldInject` | `(event: PendingMessagesBatchEvent) => boolean`        | Decide whether to inject the batch. Called once per step boundary. If absent, no injection happens. |
| `prepare`      | `(event: PendingMessagesBatchEvent) => ModelMessage[]` | Transform the batch before injection. Default: convert each message via `convertToModelMessages`.   |
| `onReceived`   | `(event) => void`                                      | Called when a message arrives during streaming (per-message).                                       |
| `onInjected`   | `(event) => void`                                      | Called after a batch is injected.                                                                   |

### shouldInject

Called once per step boundary with the full batch of pending messages. Return `true` to inject all of them, `false` to skip (they'll be available at the next boundary or become the next turn).

```ts theme={"theme":"css-variables"}
pendingMessages: {
  // Always inject
  shouldInject: () => true,

  // Alternative: only inject after tool calls
  // shouldInject: ({ steps }) => steps.length > 0,

  // Alternative: only inject if there's one message
  // shouldInject: ({ messages }) => messages.length === 1,
},
```

The event includes:

| Field           | Type               | Description                  |
| --------------- | ------------------ | ---------------------------- |
| `messages`      | `UIMessage[]`      | All pending messages (batch) |
| `modelMessages` | `ModelMessage[]`   | Current conversation         |
| `steps`         | `CompactionStep[]` | Completed steps              |
| `stepNumber`    | `number`           | Current step (0-indexed)     |
| `chatId`        | `string`           | Chat session ID              |
| `turn`          | `number`           | Current turn                 |
| `clientData`    | `unknown`          | Frontend metadata            |

### prepare

Transform the batch of pending messages before they're injected into the LLM's context. By default, each `UIMessage` is converted to `ModelMessage`s individually.
Use `prepare` to combine multiple messages or add context: ```ts theme={"theme":"css-variables"} pendingMessages: { shouldInject: ({ steps }) => steps.length > 0, prepare: ({ messages }) => [{ role: "user", content: messages.length === 1 ? messages[0].parts[0]?.text ?? "" : `The user sent ${messages.length} messages:\n${ messages.map((m, i) => `${i + 1}. ${m.parts[0]?.text}`).join("\n") }`, }], }, ``` ### Stream chunk When messages are injected, the SDK automatically writes a `data-pending-message-injected` stream chunk containing the message IDs and text. The frontend uses this to: * Confirm which messages were injected * Remove them from the pending overlay * Render them inline at the injection point in the assistant response A "pending message injected" span also appears in the run trace. ## Backend: chat.createSession Pass `pendingMessages` to the session options: ```ts theme={"theme":"css-variables"} const session = chat.createSession(payload, { signal, idleTimeoutInSeconds: 60, pendingMessages: { shouldInject: () => true, }, }); for await (const turn of session) { const result = streamText({ model: anthropic("claude-sonnet-4-5"), messages: turn.messages, abortSignal: turn.signal, prepareStep: turn.prepareStep(), // Handles injection + compaction stopWhen: stepCountIs(15), }); await turn.complete(result); } ``` Use `turn.prepareStep()` to get a prepareStep function that handles both injection and compaction. Users who spread `chat.toStreamTextOptions()` get it automatically. ## Backend: MessageAccumulator (raw task) Pass `pendingMessages` to the constructor and wire up the message listener manually: ```ts theme={"theme":"css-variables"} const conversation = new chat.MessageAccumulator({ pendingMessages: { shouldInject: () => true, prepare: ({ messages }) => [{ role: "user", content: `[Steering]: ${messages.map(m => m.parts[0]?.text).join(", ")}`, }], }, }); for (let turn = 0; turn < 100; turn++) { // The wire payload carries at most one new message per turn. 
const messages = await conversation.addIncoming( payload.message ? [payload.message] : [], payload.trigger, turn ); // Listen for steering messages during streaming const sub = chat.messages.on(async (msg) => { if (msg.message) await conversation.steerAsync(msg.message); }); const result = streamText({ model: anthropic("claude-sonnet-4-5"), messages, prepareStep: conversation.prepareStep(), // Handles injection + compaction stopWhen: stepCountIs(15), }); const response = await chat.pipeAndCapture(result); sub.off(); if (response) await conversation.addResponse(response); await chat.writeTurnComplete(); } ``` ### MessageAccumulator methods | Method | Description | | -------------------------------- | -------------------------------------------------------------- | | `steer(message, modelMessages?)` | Queue a UIMessage for injection (sync) | | `steerAsync(message)` | Queue a UIMessage, converting to model messages automatically | | `drainSteering()` | Get and clear unconsumed steering messages | | `prepareStep()` | Returns a prepareStep function handling injection + compaction | ## Frontend: usePendingMessages hook The `usePendingMessages` hook manages all the frontend complexity — tracking pending messages, detecting injections, and handling the turn lifecycle. ```tsx theme={"theme":"css-variables"} import { useChat } from "@ai-sdk/react"; import { useTriggerChatTransport, usePendingMessages } from "@trigger.dev/sdk/chat/react"; function Chat({ chatId }: { chatId: string }) { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), }); const { messages, setMessages, sendMessage, stop, status } = useChat({ id: chatId, transport, }); const pending = usePendingMessages({ transport, chatId, status, messages, setMessages, sendMessage, metadata: { model: "gpt-4o" }, }); return (
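    <div>
      {/* Render messages */}
      {messages.map((msg) => (
        <div key={msg.id}>
          {msg.role === "assistant" ? (
            msg.parts.map((part, i) =>
              pending.isInjectionPoint(part) ? (
                // Render injected messages inline at the injection point
                <div key={i}>
                  {pending.getInjectedMessages(part).map((m) => (
                    <div key={m.id}>{m.text}</div>
                  ))}
                </div>
              ) : (
                // Render other parts however your UI does; text parts shown here
                part.type === "text" ? <span key={i}>{part.text}</span> : null
              )
            )
          ) : (
            // User messages: render their text parts
            <span>
              {msg.parts.map((part, i) =>
                part.type === "text" ? <span key={i}>{part.text}</span> : null
              )}
            </span>
          )}
        </div>
      ))}

      {/* Render pending messages */}
      {pending.pending.map((msg) => (
        <div key={msg.id}>
          <span>{msg.text}</span>
          <span>{msg.mode === "steering" ? "Steering" : "Queued"}</span>
          {msg.mode === "queued" && status === "streaming" && (
            <button type="button" onClick={() => pending.promoteToSteering(msg.id)}>
              Steer instead
            </button>
          )}
        </div>
      ))}

      {/* Send form */}
      <form
        onSubmit={(e) => {
          e.preventDefault();
          pending.steer(input); // Steers during streaming, sends normally when ready
          setInput("");
        }}
      >
        <input value={input} onChange={(e) => setInput(e.target.value)} />
        <button type="submit">Send</button>
        {status === "streaming" && (
          <button
            type="button"
            onClick={() => {
              pending.queue(input); // Holds the message for the next turn
              setInput("");
            }}
          >
            Queue
          </button>
        )}
      </form>
    </div>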
); } ``` ### Hook API | Property/Method | Type | Description | | ----------------------------- | -------------------------------------- | ------------------------------------------------------------------------- | | `pending` | `PendingMessage[]` | Current pending messages with `id`, `text`, `mode`, and `injected` status | | `steer(text)` | `(text: string) => void` | Send a steering message during streaming, or normal message when ready | | `queue(text)` | `(text: string) => void` | Queue for next turn during streaming, or send normally when ready | | `promoteToSteering(id)` | `(id: string) => void` | Convert a queued message to steering (sends via input stream immediately) | | `isInjectionPoint(part)` | `(part: unknown) => boolean` | Check if an assistant message part is an injection confirmation | | `getInjectedMessageIds(part)` | `(part: unknown) => string[]` | Get message IDs from an injection point | | `getInjectedMessages(part)` | `(part: unknown) => InjectedMessage[]` | Get messages (id + text) from an injection point | ### PendingMessage | Field | Type | Description | | ---------- | ------------------------ | --------------------------------------- | | `id` | `string` | Unique message ID | | `text` | `string` | Message text | | `mode` | `"steering" \| "queued"` | How the message is being handled | | `injected` | `boolean` | Whether the backend confirmed injection | ### Message lifecycle * **Steering messages** are sent via `transport.sendPendingMessage()` immediately. They appear as purple pending bubbles. If injected, they disappear from the overlay and render inline at the injection point. If not injected (no more step boundaries), they auto-send as the next turn when the response finishes. * **Queued messages** stay client-side until the turn completes, then auto-send as the next turn via `sendMessage()`. They can be promoted to steering mid-stream by clicking "Steer instead". * **Promoted messages** are queued messages that were converted to steering. 
They get sent via input stream immediately and follow the steering lifecycle from that point. ## Transport: sendPendingMessage The `TriggerChatTransport` exposes a `sendPendingMessage` method for sending messages via input stream without disrupting the active stream subscription: ```ts theme={"theme":"css-variables"} const sent = await transport.sendPendingMessage(chatId, { id: crypto.randomUUID(), role: "user", parts: [{ type: "text", text: "and compare to vercel" }], }, { model: "gpt-4o" }); ``` Unlike `sendMessage()` from useChat, this does NOT: * Add the message to useChat's local state * Cancel the active stream subscription * Start a new response stream The `usePendingMessages` hook calls this internally — you typically don't need to use it directly. ## Coexistence with compaction Pending message injection and compaction both use `prepareStep`. When both are configured, the auto-injected `prepareStep` handles them in order: 1. **Compaction** runs first — checks threshold, generates summary if needed 2. **Injection** runs second — pending messages are appended to either the compacted or original messages This means injected messages are always included after compaction, ensuring the LLM sees both the compressed history and the new steering input. # Quick Start Source: https://trigger.dev/docs/ai-chat/quick-start Get a working AI agent in 3 steps — define an agent, generate a token, and wire up the frontend. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. 
These steps assume you already have a Trigger.dev project with the SDK installed and the CLI authenticated — if you don't, follow [Manual setup](/docs/manual-setup) (or `npx trigger.dev@latest init` in an existing project) first. You should be able to run `pnpm exec trigger dev` from your project root before continuing. The chat surface works with Vercel AI SDK **v5, v6, or v7**; install whichever major you want. On **v7**, also install `@ai-sdk/otel` so your model calls are traced (the SDK registers it for you). See [compatibility](/docs/ai-chat/reference#compatibility) for the full matrix. Use `chat.agent` from `@trigger.dev/sdk/ai` to define an agent that handles chat messages. The `run` function receives `ModelMessage[]` (already converted from the frontend's `UIMessage[]`) — pass them directly to `streamText`. If you return a `StreamTextResult`, it's **automatically piped** to the frontend. ```ts trigger/chat.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { streamText, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; export const myChat = chat.agent({ id: "my-chat", run: async ({ messages, signal }) => { return streamText({ // Spread chat.toStreamTextOptions() FIRST — it wires up // prepareStep (compaction, steering, background injection), // the system prompt set via chat.prompt(), and telemetry. // Skipping this is the single most common cause of subtle // bugs (silent broken compaction, missing steering, etc.). ...chat.toStreamTextOptions(), model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` **Always spread `chat.toStreamTextOptions()` into your `streamText` call.** It wires up the `prepareStep` callback that drives compaction, mid-turn steering, and background injection — features that silently no-op if the spread is missing. Spread it **first** so any explicit overrides (e.g. a custom `prepareStep`) win. 
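Why the spread position matters comes down to plain JavaScript object-literal semantics: when the same key appears more than once, the last occurrence wins. The sketch below uses a stand-in `defaults` object with made-up values; it is an assumption for illustration and does not reflect what `chat.toStreamTextOptions()` actually returns.

```typescript
// Stand-in shape for illustration only; the real options object differs.
type StreamOptions = { prepareStep?: string; telemetry?: boolean };

// Pretend this is what a helper such as chat.toStreamTextOptions() returned.
const defaults: StreamOptions = { prepareStep: "auto", telemetry: true };

// Spread first: keys written after the spread override the defaults.
const spreadFirst: StreamOptions = { ...defaults, prepareStep: "custom" };

// Spread last: the defaults silently overwrite the explicit key.
const spreadLast: StreamOptions = { prepareStep: "custom", ...defaults };

console.log(spreadFirst.prepareStep, spreadLast.prepareStep);
```

Keys that are not overridden (here `telemetry`) are carried through in both cases, which is why spreading first keeps the wired-up defaults while still letting an explicit override win.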
For a **custom** [`UIMessage`](https://sdk.vercel.ai/docs/reference/ai-sdk-core/ui-message) subtype (typed `data-*` parts, tool map, etc.), define the agent with [`chat.withUIMessage<...>().agent({...})`](/docs/ai-chat/types) instead of `chat.agent`. On your server (e.g. as Next.js server actions), expose two helpers the transport will call: one that creates the chat session, and one that mints a fresh session-scoped access token for refresh. ```ts app/actions.ts theme={"theme":"css-variables"} "use server"; import { auth } from "@trigger.dev/sdk"; import { chat } from "@trigger.dev/sdk/ai"; // Creates the Session row + triggers the first run, returns the // session PAT. Idempotent on (env, chatId) so concurrent calls // converge to the same session. export const startChatSession = chat.createStartSessionAction("my-chat"); // Pure mint — fresh session-scoped PAT for an existing session. // The transport calls this on 401/403 to refresh. export async function mintChatAccessToken(chatId: string) { return auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId }, }, expirationTime: "1h", }); } ``` The browser never holds your environment's secret key — both helpers run on your server, where customer-side authorization (per-user, per-plan, etc.) lives alongside any DB writes you want to pair with session creation. Use the `useTriggerChatTransport` hook from `@trigger.dev/sdk/chat/react` to create a memoized transport instance, then pass it to `useChat`. Wire both server actions into the transport's `accessToken` and `startSession` callbacks. The example below uses the Next.js `@/*` path alias for imports from `@/trigger/chat` and `@/app/actions`. If you're not using Next.js (or haven't configured the alias), swap them for relative imports. 
```tsx app/components/chat.tsx theme={"theme":"css-variables"}
"use client";

import { useState } from "react";
import { useChat } from "@ai-sdk/react";
import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";
import type { myChat } from "@/trigger/chat";
import { mintChatAccessToken, startChatSession } from "@/app/actions";

export function Chat() {
  // The type parameter ties the transport to the agent's types
  const transport = useTriggerChatTransport<typeof myChat>({
    task: "my-chat",
    accessToken: ({ chatId }) => mintChatAccessToken(chatId),
    startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  });

  const { messages, sendMessage, stop, status } = useChat({ transport });
  const [input, setInput] = useState("");

  return (
    <div>
      {messages.map((m) => (
        <div key={m.id}>
          <strong>{m.role}:</strong>{" "}
          {m.parts.map((part, i) =>
            part.type === "text" ? <span key={i}>{part.text}</span> : null
          )}
        </div>
      ))}

      <form
        onSubmit={(e) => {
          e.preventDefault();
          // Ignore empty submissions
          if (input.trim()) {
            sendMessage({ text: input });
            setInput("");
          }
        }}
      >
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Type a message..."
        />
        <button type="submit">Send</button>
        {status === "streaming" && (
          <button type="button" onClick={stop}>
            Stop
          </button>
        )}
      </form>
    </div>
  );
}
```
## Next steps * [Backend](/docs/ai-chat/backend) — Lifecycle hooks, persistence, session iterator, raw task primitives * [Tools](/docs/ai-chat/tools): Declare tools so `toModelOutput` survives across turns, typed in `run()` * [Frontend](/docs/ai-chat/frontend) — Session management, client data, reconnection * [Types](/docs/ai-chat/types) — `chat.withUIMessage`, `InferChatUIMessage`, and related typing * [`chat.local`](/docs/ai-chat/chat-local) — Per-run typed state across hooks, run, tools, subtasks * [Sub-agents pattern](/docs/ai-chat/patterns/sub-agents) — Subtask-as-tool, `target: "root"` streaming, `ai.toolExecute` helpers * [Background injection](/docs/ai-chat/background-injection) — `chat.inject()` and `chat.defer()` for between-turn work # API Reference Source: https://trigger.dev/docs/ai-chat/reference Complete API reference for the AI Agents SDK — backend options, events, frontend transport, and hooks. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. ## Compatibility | Dependency | Supported | Notes | | --------------------------------------------------------------- | ------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `@trigger.dev/sdk` | `>=4.5.0-rc.0` | The chat agent surface lives in this SDK release. Install with `@trigger.dev/sdk@rc`. 
| | `ai` (Vercel AI SDK) | `^5.0.0 \|\| ^6.0.0 \|\| >=7.0.0-canary <8` | Declared as a peer. v6 is what we develop against day to day; v5 and v7 work too (v7 is in canary/beta upstream). Your installed `ai` major drives the chat surface's types. | | `@ai-sdk/otel` | `1.x` (v7 only) | Optional. AI SDK 7 moved model-call span emission out of `ai` core into this adapter. Install it alongside `ai@7` and the SDK auto-registers it, so your model calls show up as spans in the run trace. Not needed on v5/v6, where `ai` core emits spans. See [AI SDK 7 telemetry](#ai-sdk-7-telemetry) below. | | `@ai-sdk/react` | matches your `ai` major | Pulled in by `useChat`. The transport works with whichever React hook ships in the same major as your `ai` version. | | `react` | `^18.0 \|\| ^19.0` | Required only if you use `@trigger.dev/sdk/chat/react` (the frontend transport). Server-only consumers can skip React entirely. | | Node.js | `>=18.20.0` | The SDK's engine constraint. The chat agent itself works on any version the SDK supports. | | Provider packages (`@ai-sdk/openai`, `@ai-sdk/anthropic`, etc.) | versions that target your `ai` major | Pick a provider package whose `ai` peer matches yours. The chat agent doesn't depend on any specific provider — pass whatever model you want into `streamText`. | The `ai` peer is **optional** — server-only setups that don't call `streamText` (raw `task()` with chat primitives) can skip the AI SDK entirely. ### AI SDK 7 telemetry On **AI SDK 7**, model-call spans are emitted by `@ai-sdk/otel` rather than `ai` core. Install it alongside `ai@7`: ```bash theme={"theme":"css-variables"} npm install @ai-sdk/otel ``` The SDK registers it once per worker at chat agent boot, so the `experimental_telemetry` config wired up by `chat.toStreamTextOptions()` keeps producing spans in your run trace with no extra setup. On v5 and v6 nothing changes: `ai` core emits the spans and `@ai-sdk/otel` isn't needed. 
If you (or a library you import) already register `@ai-sdk/otel` yourself, the SDK detects the existing integration and skips its own registration, so you won't get duplicate spans. To opt out of the auto-registration entirely, set `TRIGGER_AI_SDK_OTEL_AUTOREGISTER=0`. ## ChatAgentOptions Options for `chat.agent()`. | Option | Type | Default | Description | | ----------------------------- | ------------------------------------------------------------------------------------------------- | ------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `id` | `string` | required | Task identifier | | `run` | `(payload: ChatTaskRunPayload) => Promise` | required | Handler for each turn | | `clientDataSchema` | `TaskSchema` | — | Schema for validating and typing `clientData` | | `onBoot` | `(event: BootEvent) => Promise \| void` | — | Fires once per worker process — initial, preloaded, AND reactive continuation. Use for `chat.local` init and per-process resources. See [onBoot](/docs/ai-chat/lifecycle-hooks#onboot). | | `onRecoveryBoot` | `(event: RecoveryBootEvent) => Promise \| RecoveryBootResult \| void` | — | Fires on a continuation boot when the dead predecessor left recovered state (partial assistant or in-flight users). Override the smart default — drop partial, synthesize tool results, emit a recovery banner. See [Recovery boot](/docs/ai-chat/patterns/recovery-boot). | | `onPreload` | `(event: PreloadEvent) => Promise \| void` | — | Fires on preloaded runs before the first message | | `onChatStart` | `(event: ChatStartEvent) => Promise \| void` | — | Fires once per chat, on the very first user message. Does NOT fire on continuation runs or OOM-retries — see [onChatStart](/docs/ai-chat/lifecycle-hooks#onchatstart). 
| | `onValidateMessages` | `(event: ValidateMessagesEvent) => UIMessage[] \| Promise` | — | Validate/transform UIMessages before model conversion. See [onValidateMessages](/docs/ai-chat/lifecycle-hooks#onvalidatemessages) | | `hydrateMessages` | `(event: HydrateMessagesEvent) => UIMessage[] \| Promise` | — | Load message history from backend, replacing the linear accumulator. See [hydrateMessages](/docs/ai-chat/lifecycle-hooks#hydratemessages) | | `actionSchema` | `TaskSchema` | — | Schema for validating custom actions sent via `transport.sendAction()`. See [Actions](/docs/ai-chat/actions) | | `onAction` | `(event: ActionEvent) => Promise \| unknown` | — | Handle custom actions. Actions are not turns — only `hydrateMessages` + `onAction` fire. Return a `StreamTextResult` (or `string` / `UIMessage`) for a model response; return `void` for side-effect-only. See [Actions](/docs/ai-chat/actions) | | `onTurnStart` | `(event: TurnStartEvent) => Promise \| void` | — | Fires every turn before `run()` | | `onBeforeTurnComplete` | `(event: BeforeTurnCompleteEvent) => Promise \| void` | — | Fires after response but before stream closes. Includes `writer`. | | `onTurnComplete` | `(event: TurnCompleteEvent) => Promise \| void` | — | Fires after each turn completes (stream closed) | | `onCompacted` | `(event: CompactedEvent) => Promise \| void` | — | Fires when compaction occurs. Includes `writer`. See [Compaction](/docs/ai-chat/compaction) | | `compaction` | `ChatAgentCompactionOptions` | — | Automatic context compaction. See [Compaction](/docs/ai-chat/compaction) | | `pendingMessages` | `PendingMessagesOptions` | — | Mid-execution message injection. See [Pending Messages](/docs/ai-chat/pending-messages) | | `prepareMessages` | `(event: PrepareMessagesEvent) => ModelMessage[]` | — | Transform model messages before use (cache breaks, context injection, etc.) | | `tools` | `ToolSet \| ((event: ResolveToolsEvent) => ToolSet \| Promise)` | — | Tools for this agent. 
Threads each tool's `toModelOutput` through cross-turn history re-conversion, and hands the resolved set back on the run payload. Static set or per-turn function. See [Tools](/docs/ai-chat/tools). | | `maxTurns` | `number` | `100` | Max conversational turns per run | | `turnTimeout` | `string` | `"1h"` | How long to wait for next message | | `idleTimeoutInSeconds` | `number` | `30` | Seconds to stay idle before suspending | | `chatAccessTokenTTL` | `string` | `"1h"` | How long the scoped access token remains valid | | `preloadIdleTimeoutInSeconds` | `number` | Same as `idleTimeoutInSeconds` | Idle timeout after `onPreload` fires | | `preloadTimeout` | `string` | Same as `turnTimeout` | Suspend timeout for preloaded runs | | `uiMessageStreamOptions` | `ChatUIMessageStreamOptions` | — | Default options for `toUIMessageStream()`. Per-turn override via `chat.setUIMessageStreamOptions()` | | `onChatSuspend` | `(event: ChatSuspendEvent) => Promise \| void` | — | Fires right before the run suspends. See [onChatSuspend](/docs/ai-chat/lifecycle-hooks#onchatsuspend--onchatresume) | | `onChatResume` | `(event: ChatResumeEvent) => Promise \| void` | — | Fires right after the run resumes from suspension | | `exitAfterPreloadIdle` | `boolean` | `false` | Exit run after preload idle timeout instead of suspending. See [exitAfterPreloadIdle](/docs/ai-chat/lifecycle-hooks#exitafterpreloadidle) | | `oomMachine` | `MachinePresetName` | — | Fallback machine when an attempt fails with OOM. Setting it enables a single OOM retry on the larger machine. See [OOM resilience](/docs/ai-chat/patterns/oom-resilience) | Plus most standard [TaskOptions](/docs/tasks/overview) — `queue`, `machine`, `maxDuration`, **`onWait`**, **`onResume`**, **`onComplete`**, and other lifecycle hooks. Generic `retry` is **not** exposed on `chat.agent`; use `oomMachine` for OOM recovery, or drop down to a raw [`task()`](/docs/ai-chat/custom-agents) if you need richer retry semantics. 
Standard hooks use the same parameter shapes as on a normal `task()` (including `ctx`). ## Task context (`ctx`) All **`chat.agent`** lifecycle events (**`onBoot`**, **`onPreload`**, **`onChatStart`**, **`onTurnStart`**, **`onBeforeTurnComplete`**, **`onTurnComplete`**, **`onCompacted`**) and the object passed to **`run`** include **`ctx`**: the same **`TaskRunContext`** shape as the `ctx` in `task({ run: (payload, { ctx }) => ... })`. **`onValidateMessages`** does not include `ctx` — it fires before message accumulation and is designed for pure validation/transformation of incoming messages. Use **`ctx`** for run metadata, tags, parent links, or any API that needs the full run record. The chat-specific string **`runId`** on events is always **`ctx.run.id`**; both are provided for convenience. ```ts theme={"theme":"css-variables"} import type { TaskRunContext } from "@trigger.dev/sdk"; // Equivalent alias (same type): import type { Context } from "@trigger.dev/sdk"; ``` Prefer `import type { TaskRunContext } from "@trigger.dev/sdk"` in application code. Do not depend on `@trigger.dev/core` directly. ## ChatTaskRunPayload The payload passed to the `run` function. | Field | Type | Description | | ------------------- | ------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------- | | `ctx` | `TaskRunContext` | Full task run context — same as `task` `run`’s `{ ctx }` | | `messages` | `ModelMessage[]` | Model-ready messages — pass directly to `streamText` | | `tools` | `ToolSet` | Resolved tools declared on the agent config (empty object when none). Pass straight to `streamText`. See [Tools](/docs/ai-chat/tools). | | `chatId` | `string` | Your conversation ID (the session's `externalId`) | | `sessionId` | `string` | Friendly ID of the backing Session (`session_*`). Use with `sessions.open()` for advanced cases. 
Always set — every chat.agent run is bound to a Session. | | `trigger` | `"submit-message" \| "regenerate-message"` | What triggered the request | | `messageId` | `string \| undefined` | Message ID (for regenerate) | | `clientData` | Typed by `clientDataSchema` | Custom data from the frontend (typed when schema is provided) | | `continuation` | `boolean` | Whether this run is continuing an existing chat (previous run ended) | | `signal` | `AbortSignal` | Combined stop + cancel signal | | `cancelSignal` | `AbortSignal` | Cancel-only signal | | `stopSignal` | `AbortSignal` | Stop-only signal (per-turn) | | `previousTurnUsage` | `LanguageModelUsage \| undefined` | Token usage from the previous turn (undefined on turn 0) | | `totalUsage` | `LanguageModelUsage` | Cumulative token usage across completed turns so far | ## BootEvent Passed to the `onBoot` callback. | Field | Type | Description | | ----------------- | --------------------------- | ------------------------------------------------------------------------------------------------- | | `ctx` | `TaskRunContext` | Full task run context — see [Task context](#task-context-ctx) | | `chatId` | `string` | Chat session ID | | `runId` | `string` | The Trigger.dev run ID for this run boot | | `chatAccessToken` | `string` | Scoped access token for this run | | `clientData` | Typed by `clientDataSchema` | Custom data from the frontend | | `continuation` | `boolean` | `true` when this run is taking over from a prior dead run (cancel / crash / `endRun` / OOM retry) | | `previousRunId` | `string \| undefined` | Public id of the prior run when `continuation` is true | | `preloaded` | `boolean` | Whether this run was triggered as a preload | ## RecoveryBootEvent Passed to the `onRecoveryBoot` callback. See [Recovery boot](/docs/ai-chat/patterns/recovery-boot) for the full guide. 
| Field | Type | Description |
| --- | --- | --- |
| `ctx` | `TaskRunContext` | Full task run context; see [Task context](#task-context-ctx) |
| `chatId` | `string` | Chat session ID |
| `runId` | `string` | The Trigger.dev run ID for this run boot |
| `previousRunId` | `string` | Public id of the prior run that died |
| `cause` | `"cancelled" \| "crashed" \| "unknown"` | Best-effort cause. Currently always `"unknown"`; the field is forward-looking, so don't branch on it |
| `settledMessages` | `TUIMessage[]` | Chain persisted by the predecessor's last `onTurnComplete` |
| `inFlightUsers` | `TUIMessage[]` | User messages on `session.in` past the cursor, i.e. the message(s) the predecessor never acknowledged |
| `partialAssistant` | `TUIMessage \| undefined` | The trailing assistant message whose stream never received `finish` |
| `pendingToolCalls` | [`RecoveryPendingToolCall[]`](#recoverypendingtoolcall) | Tool calls in `input-available` state extracted from `partialAssistant` |
| `writer` | [`ChatWriter`](#chatwriter) | Lazy `session.out` writer; emit a recovery banner / signal here |

## RecoveryBootResult

Return value of `onRecoveryBoot`. Every field is optional; omit one to accept the smart default.

| Field | Type | Description |
| --- | --- | --- |
| `chain` | `TUIMessage[]` | Replaces the seed chain. Default: `[...settledMessages, firstInFlightUser, partialAssistant]` when both present; `settledMessages` otherwise. |
| `recoveredTurns` | `TUIMessage[]` | User messages to dispatch as fresh turns. Default: `inFlightUsers.slice(1)` when smart-default fires; `inFlightUsers` otherwise. |
| `beforeBoot` | `() => Promise<void>` | Runs after the writer flushes and before the first recovered turn fires. Use for blocking persistence work. |

## RecoveryPendingToolCall

| Field | Type | Description |
| --- | --- | --- |
| `toolCallId` | `string` | The AI SDK tool call id |
| `toolName` | `string` | The tool name (the `tool-${name}` suffix on the part type) |
| `input` | `unknown` | The input the model produced for the call |
| `partIndex` | `number` | Index into `partialAssistant.parts` for in-place edits |

## PreloadEvent

Passed to the `onPreload` callback.

| Field | Type | Description |
| --- | --- | --- |
| `ctx` | `TaskRunContext` | Full task run context; see [Task context](#task-context-ctx) |
| `chatId` | `string` | Chat session ID |
| `runId` | `string` | The Trigger.dev run ID |
| `chatAccessToken` | `string` | Scoped access token for this run |
| `clientData` | Typed by `clientDataSchema` | Custom data from the frontend |
| `writer` | [`ChatWriter`](#chatwriter) | Stream writer for custom chunks. Lazy, so there is no overhead if unused. |

## ChatStartEvent

Passed to the `onChatStart` callback.
| Field | Type | Description | | ----------------- | --------------------------- | -------------------------------------------------------------- | | `ctx` | `TaskRunContext` | Full task run context — see [Task context](#task-context-ctx) | | `chatId` | `string` | Chat session ID | | `messages` | `ModelMessage[]` | Initial model-ready messages | | `clientData` | Typed by `clientDataSchema` | Custom data from the frontend | | `runId` | `string` | The Trigger.dev run ID | | `chatAccessToken` | `string` | Scoped access token for this run | | `continuation` | `boolean` | Whether this run is continuing an existing chat | | `previousRunId` | `string \| undefined` | Previous run ID (only when `continuation` is true) | | `preloaded` | `boolean` | Whether this run was preloaded before the first message | | `writer` | [`ChatWriter`](#chatwriter) | Stream writer for custom chunks. Lazy — no overhead if unused. | ## ValidateMessagesEvent Passed to the `onValidateMessages` callback. | Field | Type | Description | | ---------- | ------------------------------------------------------------------ | ---------------------------------- | | `messages` | `UIMessage[]` | Incoming UI messages for this turn | | `chatId` | `string` | Chat session ID | | `turn` | `number` | Turn number (0-indexed) | | `trigger` | `"submit-message" \| "regenerate-message" \| "preload" \| "close"` | The trigger type for this turn | ## ResolveToolsEvent Passed to the `tools` function form on `chat.agent`, once per turn, to resolve the tool set for that turn. See [Tools](/docs/ai-chat/tools#static-or-per-turn-tools). 
| Field | Type | Description | | -------------- | --------------------------- | ----------------------------------------------- | | `chatId` | `string` | Chat session ID | | `turn` | `number` | Turn number (0-indexed) | | `continuation` | `boolean` | Whether this run is continuing an existing chat | | `clientData` | Typed by `clientDataSchema` | Custom data from the frontend | ## HydrateMessagesEvent Passed to the `hydrateMessages` callback. See [hydrateMessages](/docs/ai-chat/lifecycle-hooks#hydratemessages). | Field | Type | Description | | ------------------ | ------------------------------------------------------ | ------------------------------------------------------------- | | `chatId` | `string` | Chat session ID | | `turn` | `number` | Turn number (0-indexed) | | `trigger` | `"submit-message" \| "regenerate-message" \| "action"` | The trigger type for this turn | | `incomingMessages` | `UIMessage[]` | Validated wire messages from the frontend (empty for actions) | | `previousMessages` | `UIMessage[]` | Accumulated UI messages before this turn (`[]` on turn 0) | | `clientData` | Typed by `clientDataSchema` | Custom data from the frontend | | `continuation` | `boolean` | Whether this run is continuing an existing chat | | `previousRunId` | `string \| undefined` | Previous run ID (only when `continuation` is true) | ## ActionEvent Passed to the `onAction` callback. See [Actions](/docs/ai-chat/actions). 
| Field | Type | Description | | ------------ | --------------------------- | ---------------------------------------------------- | | `action` | Typed by `actionSchema` | The parsed and validated action payload | | `chatId` | `string` | Chat session ID | | `turn` | `number` | Turn number (0-indexed) | | `clientData` | Typed by `clientDataSchema` | Custom data from the frontend | | `uiMessages` | `UIMessage[]` | Accumulated UI messages (after hydration, if set) | | `messages` | `ModelMessage[]` | Accumulated model messages (after hydration, if set) | ## TurnStartEvent Passed to the `onTurnStart` callback. | Field | Type | Description | | ----------------- | --------------------------- | -------------------------------------------------------------- | | `ctx` | `TaskRunContext` | Full task run context — see [Task context](#task-context-ctx) | | `chatId` | `string` | Chat session ID | | `messages` | `ModelMessage[]` | Full accumulated conversation (model format) | | `uiMessages` | `UIMessage[]` | Full accumulated conversation (UI format) | | `turn` | `number` | Turn number (0-indexed) | | `runId` | `string` | The Trigger.dev run ID | | `chatAccessToken` | `string` | Scoped access token for this run | | `clientData` | Typed by `clientDataSchema` | Custom data from the frontend | | `continuation` | `boolean` | Whether this run is continuing an existing chat | | `previousRunId` | `string \| undefined` | Previous run ID (only when `continuation` is true) | | `preloaded` | `boolean` | Whether this run was preloaded | | `writer` | [`ChatWriter`](#chatwriter) | Stream writer for custom chunks. Lazy — no overhead if unused. | ## TurnCompleteEvent Passed to the `onTurnComplete` callback. 
| Field | Type | Description | | -------------------- | --------------------------------- | ----------------------------------------------------------------------- | | `ctx` | `TaskRunContext` | Full task run context — see [Task context](#task-context-ctx) | | `chatId` | `string` | Chat session ID | | `messages` | `ModelMessage[]` | Full accumulated conversation (model format) | | `uiMessages` | `UIMessage[]` | Full accumulated conversation (UI format) | | `newMessages` | `ModelMessage[]` | Only this turn's messages (model format) | | `newUIMessages` | `UIMessage[]` | Only this turn's messages (UI format) | | `responseMessage` | `UIMessage \| undefined` | The assistant's response for this turn | | `rawResponseMessage` | `UIMessage \| undefined` | Raw response before abort cleanup | | `turn` | `number` | Turn number (0-indexed) | | `runId` | `string` | The Trigger.dev run ID | | `chatAccessToken` | `string` | Scoped access token for this run | | `lastEventId` | `string \| undefined` | Stream position for resumption | | `stopped` | `boolean` | Whether the user stopped generation during this turn | | `continuation` | `boolean` | Whether this run is continuing an existing chat | | `usage` | `LanguageModelUsage \| undefined` | Token usage for this turn | | `totalUsage` | `LanguageModelUsage` | Cumulative token usage across all turns | | `finishReason` | `FinishReason \| undefined` | Why the LLM stopped (`"stop"`, `"tool-calls"`, `"error"`, …) | | `error` | `unknown` | Set when the turn threw; `responseMessage` is then undefined or partial | ## BeforeTurnCompleteEvent Passed to the `onBeforeTurnComplete` callback. Same fields as `TurnCompleteEvent` (including **`ctx`**) plus a `writer`. 
| Field | Type | Description | | -------------------------------- | --------------------------- | ----------------------------------------------------------------------------- | | *(all TurnCompleteEvent fields)* | | See [TurnCompleteEvent](#turncompleteevent) (includes `ctx`) | | `writer` | [`ChatWriter`](#chatwriter) | Stream writer — the stream is still open so chunks appear in the current turn | ## ChatSuspendEvent Passed to the `onChatSuspend` callback. A discriminated union on `phase`. | Field | Type | Description | | ------------ | --------------------------- | ---------------------------------------------------- | | `phase` | `"preload" \| "turn"` | Whether this is a preload or post-turn suspension | | `ctx` | `TaskRunContext` | Full task run context | | `chatId` | `string` | Chat session ID | | `runId` | `string` | The Trigger.dev run ID | | `clientData` | Typed by `clientDataSchema` | Custom data from the frontend | | `turn` | `number` | Turn number (**`"turn"` phase only**) | | `messages` | `ModelMessage[]` | Accumulated model messages (**`"turn"` phase only**) | | `uiMessages` | `UIMessage[]` | Accumulated UI messages (**`"turn"` phase only**) | ## ChatResumeEvent Passed to the `onChatResume` callback. Same discriminated union shape as `ChatSuspendEvent`. 
| Field | Type | Description |
| --- | --- | --- |
| `phase` | `"preload" \| "turn"` | Whether this is a preload or post-turn resumption |
| `ctx` | `TaskRunContext` | Full task run context |
| `chatId` | `string` | Chat session ID |
| `runId` | `string` | The Trigger.dev run ID |
| `clientData` | Typed by `clientDataSchema` | Custom data from the frontend |
| `turn` | `number` | Turn number (**`"turn"` phase only**) |
| `messages` | `ModelMessage[]` | Accumulated model messages (**`"turn"` phase only**) |
| `uiMessages` | `UIMessage[]` | Accumulated UI messages (**`"turn"` phase only**) |

## ChatWriter

A stream writer passed to lifecycle callbacks. Write custom `UIMessageChunk` parts (e.g. `data-*` parts) to the chat stream.

The writer is lazy: no stream is opened unless you call `write()` or `merge()`, so there's zero overhead for callbacks that don't use it.

| Method | Type | Description |
| --- | --- | --- |
| `write(part)` | `(part: UIMessageChunk) => void` | Write a single chunk to the chat stream |
| `merge(stream)` | `(stream: ReadableStream) => void` | Merge another stream's chunks into the chat stream |

```ts theme={"theme":"css-variables"}
onTurnStart: async ({ writer }) => {
  // Write a custom data part; render it on the frontend
  writer.write({ type: "data-status", data: { loading: true } });
},
onBeforeTurnComplete: async ({ writer, usage }) => {
  // Stream is still open, so these chunks arrive before the turn ends
  writer.write({ type: "data-usage", data: { tokens: usage?.totalTokens } });
},
```

## ChatAgentCompactionOptions

Options for the `compaction` field on `chat.agent()`. See [Compaction](/docs/ai-chat/compaction) for a usage guide.
| Option | Type | Required | Description |
| --- | --- | --- | --- |
| `shouldCompact` | `(event: ShouldCompactEvent) => boolean \| Promise<boolean>` | Yes | Decide whether to compact. Return `true` to trigger compaction |
| `summarize` | `(event: SummarizeEvent) => Promise<string>` | Yes | Generate a summary from the current messages |
| `compactUIMessages` | `(event: CompactMessagesEvent) => UIMessage[] \| Promise<UIMessage[]>` | No | Transform UI messages after compaction. Default: preserve all |
| `compactModelMessages` | `(event: CompactMessagesEvent) => ModelMessage[] \| Promise<ModelMessage[]>` | No | Transform model messages after compaction. Default: replace all with summary |

## CompactMessagesEvent

Passed to `compactUIMessages` and `compactModelMessages` callbacks.

| Field | Type | Description |
| --- | --- | --- |
| `summary` | `string` | The generated summary text |
| `uiMessages` | `UIMessage[]` | Current UI messages (full conversation) |
| `modelMessages` | `ModelMessage[]` | Current model messages (full conversation) |
| `chatId` | `string` | Chat session ID |
| `turn` | `number` | Current turn (0-indexed) |
| `clientData` | `unknown` | Custom data from the frontend |
| `source` | `"inner" \| "outer"` | Whether compaction is between steps or between turns |

## CompactedEvent

Passed to the `onCompacted` callback.
| Field | Type | Description | | -------------- | --------------------------- | ------------------------------------------------------------- | | `ctx` | `TaskRunContext` | Full task run context — see [Task context](#task-context-ctx) | | `summary` | `string` | The generated summary text | | `messages` | `ModelMessage[]` | Messages that were compacted (pre-compaction) | | `messageCount` | `number` | Number of messages before compaction | | `usage` | `LanguageModelUsage` | Token usage from the triggering step/turn | | `totalTokens` | `number \| undefined` | Total token count that triggered compaction | | `inputTokens` | `number \| undefined` | Input token count | | `outputTokens` | `number \| undefined` | Output token count | | `stepNumber` | `number` | Step number (-1 for outer loop) | | `chatId` | `string \| undefined` | Chat session ID | | `turn` | `number \| undefined` | Current turn | | `writer` | [`ChatWriter`](#chatwriter) | Stream writer for custom chunks during compaction | ## PendingMessagesOptions Options for the `pendingMessages` field. See [Pending Messages](/docs/ai-chat/pending-messages) for usage guide. | Option | Type | Required | Description | | -------------- | --------------------------------------------------------------------------------- | -------- | ----------------------------------------------------------------------------------------- | | `shouldInject` | `(event: PendingMessagesBatchEvent) => boolean \| Promise` | No | Decide whether to inject the batch between tool-call steps. If absent, no injection. | | `prepare` | `(event: PendingMessagesBatchEvent) => ModelMessage[] \| Promise` | No | Transform the batch before injection. Default: convert each via `convertToModelMessages`. | | `onReceived` | `(event: PendingMessageReceivedEvent) => void \| Promise` | No | Called when a message arrives during streaming (per-message). 
| | `onInjected` | `(event: PendingMessagesInjectedEvent) => void \| Promise` | No | Called after a batch is injected via prepareStep. | ## PendingMessagesBatchEvent Passed to `shouldInject` and `prepare` callbacks. | Field | Type | Description | | --------------- | ------------------ | ----------------------------- | | `messages` | `UIMessage[]` | All pending messages (batch) | | `modelMessages` | `ModelMessage[]` | Current conversation | | `steps` | `CompactionStep[]` | Completed steps so far | | `stepNumber` | `number` | Current step (0-indexed) | | `chatId` | `string` | Chat session ID | | `turn` | `number` | Current turn (0-indexed) | | `clientData` | `unknown` | Custom data from the frontend | ## PendingMessagesInjectedEvent Passed to `onInjected` callback. | Field | Type | Description | | ----------------------- | ---------------- | ------------------------------------- | | `messages` | `UIMessage[]` | All injected UI messages | | `injectedModelMessages` | `ModelMessage[]` | The model messages that were injected | | `chatId` | `string` | Chat session ID | | `turn` | `number` | Current turn | | `stepNumber` | `number` | Step where injection occurred | ## UsePendingMessagesReturn Return value of `usePendingMessages` hook. See [Pending Messages — Frontend](/docs/ai-chat/pending-messages#frontend-usependingmessages-hook). 
| Property/Method | Type | Description | | ----------------------- | -------------------------------------- | --------------------------------------------------------------- | | `pending` | `PendingMessage[]` | Current pending messages with mode and injection status | | `steer` | `(text: string) => void` | Send a steering message (or normal message when not streaming) | | `queue` | `(text: string) => void` | Queue for next turn (or send normally when not streaming) | | `promoteToSteering` | `(id: string) => void` | Convert a queued message to steering | | `isInjectionPoint` | `(part: unknown) => boolean` | Check if an assistant message part is an injection confirmation | | `getInjectedMessageIds` | `(part: unknown) => string[]` | Get message IDs from an injection point | | `getInjectedMessages` | `(part: unknown) => InjectedMessage[]` | Get messages (id + text) from an injection point | ## ChatSessionOptions Options for `chat.createSession()`. | Option | Type | Default | Description | | ---------------------- | ---------------------------- | ----------- | ---------------------------------------------------------------------------------------------------------------- | | `signal` | `AbortSignal` | required | Run-level cancel signal | | `idleTimeoutInSeconds` | `number` | `30` | Seconds to stay idle between turns | | `timeout` | `string` | `"1h"` | Duration string for suspend timeout | | `maxTurns` | `number` | `100` | Max turns before ending | | `compaction` | `ChatAgentCompactionOptions` | `undefined` | Automatic context [compaction](/docs/ai-chat/compaction) — same options as `chat.agent({ compaction })` | | `pendingMessages` | `PendingMessagesOptions` | `undefined` | Mid-execution [message injection](/docs/ai-chat/pending-messages) — same options as `chat.agent({ pendingMessages })` | ## ChatTurn Each turn yielded by `chat.createSession()`. 
| Field | Type | Description | | ------------------- | --------------------------------- | -------------------------------------------------------- | | `number` | `number` | Turn number (0-indexed) | | `chatId` | `string` | Chat session ID | | `trigger` | `string` | What triggered this turn | | `clientData` | `unknown` | Client data from the transport | | `messages` | `ModelMessage[]` | Full accumulated model messages | | `uiMessages` | `UIMessage[]` | Full accumulated UI messages | | `signal` | `AbortSignal` | Combined stop+cancel signal (fresh each turn) | | `stopped` | `boolean` | Whether the user stopped generation this turn | | `continuation` | `boolean` | Whether this is a continuation run | | `previousTurnUsage` | `LanguageModelUsage \| undefined` | Token usage from the previous turn (undefined on turn 0) | | `totalUsage` | `LanguageModelUsage` | Cumulative token usage across all completed turns | | Method | Returns | Description | | ------------------------- | --------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | `complete(source)` | `Promise` | Pipe, capture, accumulate, cleanup, and signal turn-complete | | `done()` | `Promise` | Signal turn-complete (when you've piped manually) | | `addResponse(response)` | `Promise` | Add response to accumulator manually | | `setMessages(uiMessages)` | `Promise` | Replace the accumulated messages (continuation seeding, compaction) | | `prepareStep()` | `function \| undefined` | `prepareStep` callback wiring compaction + injection — pass to `streamText` when not using `chat.toStreamTextOptions()` | ## chat namespace All methods available on the `chat` object from `@trigger.dev/sdk/ai`. 
| Method | Description | | ------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- | | `chat.agent(options)` | Create a chat agent | | `chat.createSession(payload, options)` | Create an async iterator for chat turns | | `chat.pipe(source, options?)` | Pipe a stream to the frontend (from anywhere inside a task) | | `chat.pipeAndCapture(source, options?)` | Pipe and capture the response `UIMessage` | | `chat.writeTurnComplete(options?)` | Signal the frontend that the current turn is complete | | `chat.createStopSignal()` | Create a managed stop signal wired to the stop input stream | | `chat.messages` | Input stream for incoming messages — use `.waitWithIdleTimeout()` | | `chat.local({ id })` | Create a per-run typed local (see [`chat.local`](/docs/ai-chat/chat-local)) | | `chat.createStartSessionAction(taskId, options?)` | Returns a server action that creates a chat Session + triggers the first run + returns a session-scoped PAT. Idempotent on `(env, externalId)`. | | `chat.requestUpgrade()` | End the current run after this turn so the next message starts on the latest agent version. Server-orchestrated handoff. | | `chat.setTurnTimeout(duration)` | Override turn timeout at runtime (e.g. 
`"2h"`) | | `chat.setTurnTimeoutInSeconds(seconds)` | Override turn timeout at runtime (in seconds) | | `chat.setIdleTimeoutInSeconds(seconds)` | Override idle timeout at runtime | | `chat.setUIMessageStreamOptions(options)` | Override `toUIMessageStream()` options for the current turn | | `chat.defer(promise)` | Run background work in parallel with streaming, awaited before `onTurnComplete` | | `chat.isStopped()` | Check if the current turn was stopped by the user | | `chat.cleanupAbortedParts(message)` | Remove incomplete parts from a stopped response message | | `chat.response.write(chunk)` | Write a data part that streams to the frontend AND persists in `onTurnComplete`'s `responseMessage` | | `chat.stream` | Raw chat output stream — use `.writer()`, `.pipe()`, `.append()`, `.read()`. Chunks are NOT accumulated into the response. | | `chat.history.all()` | Read the current accumulated UI messages (returns a copy). See [chat.history](/docs/ai-chat/backend#chat-history) | | `chat.history.set(messages)` | Replace all accumulated messages (same as `chat.setMessages()`) | | `chat.history.remove(messageId)` | Remove a specific message by ID | | `chat.history.rollbackTo(messageId)` | Keep messages up to and including the given ID (undo/rollback) | | `chat.history.replace(messageId, message)` | Replace a specific message by ID (edit) | | `chat.history.slice(start, end?)` | Keep only messages in the given range | | `chat.MessageAccumulator` | Class that accumulates conversation messages across turns | | `chat.withUIMessage(config?)` | Returns a [ChatBuilder](/docs/ai-chat/types#chatbuilder) with a fixed `UIMessage` subtype. See [Types](/docs/ai-chat/types) | | `chat.withClientData({ schema })` | Returns a [ChatBuilder](/docs/ai-chat/types#chatbuilder) with a fixed client data schema. 
See [Types](/docs/ai-chat/types#typed-client-data-with-chatwithclientdata) | ## `chat.withUIMessage` Returns a [`ChatBuilder`](/docs/ai-chat/types#chatbuilder) with a fixed `UIMessage` subtype. Chain `.withClientData()`, hook methods, and `.agent()`. ```ts theme={"theme":"css-variables"} chat.withUIMessage(config?: ChatWithUIMessageConfig): ChatBuilder; ``` | Parameter | Type | Description | | ---------------------- | ---------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- | | `config.streamOptions` | `ChatUIMessageStreamOptions` | Optional defaults for `toUIMessageStream()`. Shallow-merged with `uiMessageStreamOptions` on the inner `.agent({ ... })` (agent wins on key conflicts). | Use this when you need [`InferChatUIMessage`](#inferchatuimessage) / typed `data-*` parts / `InferUITools` to line up across backend hooks and `useChat`. Full guide: [Types](/docs/ai-chat/types). ## `chat.withClientData` Returns a [`ChatBuilder`](/docs/ai-chat/types#chatbuilder) with a fixed client data schema. All hooks and `run` get typed `clientData` without passing `clientDataSchema` in `.agent()` options. ```ts theme={"theme":"css-variables"} chat.withClientData({ schema: TSchema }): ChatBuilder; ``` | Parameter | Type | Description | | --------- | ------------ | -------------------------------------------------- | | `schema` | `TaskSchema` | Zod, ArkType, Valibot, or any supported schema lib | Full guide: [Typed client data](/docs/ai-chat/types#typed-client-data-with-chatwithclientdata). 
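The shallow merge that `chat.withUIMessage` applies to `config.streamOptions` behaves like an object spread in which the agent-level `uiMessageStreamOptions` come last. The sketch below models that rule with a plain helper. `StreamOptionsSketch` and `mergeStreamOptions` are hypothetical names used for illustration, not SDK exports, and only a few of the real options are represented.

```typescript
// Simplified stand-in for ChatUIMessageStreamOptions (illustration only).
interface StreamOptionsSketch {
  sendReasoning?: boolean;
  sendSources?: boolean;
  onError?: (error: unknown) => string;
}

// Shallow merge: builder defaults first, agent options last, so the agent
// value wins whenever both sides define the same key.
function mergeStreamOptions(
  builderDefaults: StreamOptionsSketch,
  agentOptions: StreamOptionsSketch
): StreamOptionsSketch {
  return { ...builderDefaults, ...agentOptions };
}

const merged = mergeStreamOptions(
  { sendReasoning: true, sendSources: true }, // defaults set on the builder
  { sendReasoning: false } // overrides set on .agent({ ... })
);
// merged.sendReasoning is false (agent wins); merged.sendSources stays true.
```

Because the merge is shallow, a key set on the agent replaces the builder's value for that key outright rather than being combined with it.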
## `ChatWithUIMessageConfig` | Field | Type | Description | | --------------- | ---------------------------------- | ----------------------------------------------------------------------- | | `streamOptions` | `ChatUIMessageStreamOptions` | Default `toUIMessageStream()` options for agents created via `.agent()` | ## `InferChatUIMessage` Type helper: extracts the `UIMessage` subtype from a chat agent’s wire payload. ```ts theme={"theme":"css-variables"} import type { InferChatUIMessage } from "@trigger.dev/sdk/ai"; // Use the /chat/react re-export when you're already importing other React helpers. type Msg = InferChatUIMessage; ``` Use with `useChat({ transport })` when using [`chat.withUIMessage`](/docs/ai-chat/types). For agents defined with plain `chat.agent()` (no custom generic), this resolves to the base `UIMessage`. ## `InferChatUIMessageFromTools` Type helper: derives the chat `UIMessage` type (with typed `tool-${name}` parts) directly from a tool set. Shorthand for `UIMessage>`. ```ts theme={"theme":"css-variables"} import type { InferChatUIMessageFromTools } from "@trigger.dev/sdk/ai"; const tools = { search, readFile }; type ChatUiMessage = InferChatUIMessageFromTools; ``` Pin it on the agent with [`chat.withUIMessage()`](/docs/ai-chat/types) and reuse it on the client. See [Tools](/docs/ai-chat/tools#typing-messages-from-your-tools). ## AI helpers (`ai` from `@trigger.dev/sdk/ai`) | Export | Status | Description | | ----------------------------------------------------------------------------------- | -------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `ai.toolExecute(task)` | **Preferred** | Returns the `execute` function for AI SDK `tool()`. Runs the task via `triggerAndSubscribe` and attaches tool/chat metadata (same behavior the deprecated wrapper used internally). 
| | `ai.tool(task, options?)` | **Deprecated** | Wraps `tool()` / `dynamicTool()` and the same execute path. Migrate to `tool({ ..., execute: ai.toolExecute(task) })`. See [Task-backed AI tools](/docs/tasks/schemaTask#task-backed-ai-tools). | | `ai.toolCallId`, `ai.chatContext`, `ai.chatContextOrThrow`, `ai.currentToolOptions` | Supported | Work for any task-backed tool execute path, including `ai.toolExecute`. | ## ChatUIMessageStreamOptions Options for customizing `toUIMessageStream()`. Set as static defaults via `uiMessageStreamOptions` on `chat.agent()`, or override per-turn via `chat.setUIMessageStreamOptions()`. See [Stream options](/docs/ai-chat/backend#stream-options) for usage examples. Derived from the AI SDK's `UIMessageStreamOptions` with `onFinish` and `originalMessages` omitted (managed internally — `onFinish` for response capture, `originalMessages` for cross-turn message ID reuse). | Option | Type | Default | Description | | ------------------- | --------------------------------- | --------------------- | --------------------------------------------------------------------------------------------------------------------------------------------- | | `onError` | `(error: unknown) => string` | Raw error message | Called on LLM errors and tool execution errors. Return a sanitized string — sent as `{ type: "error", errorText }` to the frontend. | | `sendReasoning` | `boolean` | `true` | Send reasoning parts to the client | | `sendSources` | `boolean` | `false` | Send source parts to the client | | `sendFinish` | `boolean` | `true` | Send the finish event. Set to `false` when chaining multiple `streamText` calls. | | `sendStart` | `boolean` | `true` | Send the message start event. Set to `false` when chaining. | | `messageMetadata` | `(options: { part }) => metadata` | — | Extract message metadata to send to the client. Called on `start` and `finish` events. 
| | `generateMessageId` | `() => string` | AI SDK's `generateId` | Custom message ID generator for response messages (e.g. UUID-v7). IDs are shared between frontend and backend via the stream's `start` chunk. | ## TriggerChatTransport options Options for the frontend transport constructor and `useTriggerChatTransport` hook. | Option | Type | Default | Description | | ---------------------- | --------------------------------------------------------------------------------------------------------- | --------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `task` | `string` | required | Task ID the transport's session is bound to. Threaded into `startSession`'s params. | | `accessToken` | `(params: AccessTokenParams) => string \| Promise` | required | Pure refresh — mints a fresh session-scoped PAT. Called on 401/403. See [callback shape](#accesstoken-callback). | | `startSession` | `(params: StartSessionParams) => Promise` | optional | Creates the chat Session and returns the session-scoped PAT. Called on `transport.preload(chatId)` and lazily on the first `sendMessage` for any chatId without a cached PAT. See [callback shape](#startsession-callback). | | `baseURL` | `string \| (ctx: { endpoint: "in" \| "out"; chatId: string }) => string` | `"https://api.trigger.dev"` | API base URL. String form applies to every endpoint; function form lets you pick per endpoint — e.g. route `.in/append` through a trusted edge proxy while keeping `.out` SSE direct (see [Trusted edge signals](/docs/ai-chat/patterns/trusted-edge-signals)). | | `fetch` | `(url: string, init: RequestInit, ctx: { endpoint: "in" \| "out"; chatId: string }) => Promise` | — | Per-request fetch override. Invoked for both `.in/append` POSTs and the `.out` SSE GET. 
Use for header injection (tracing), custom retries, or proxy rewrites beyond what `baseURL` can express. | | `headers` | `Record` | — | Extra headers for API requests | | `streamTimeoutSeconds` | `number` | `120` | How long to wait for stream data | | `clientData` | Typed by `clientDataSchema` | — | Default client data merged into per-turn `metadata` and threaded through `startSession`'s params (so the first run's `payload.metadata` matches per-turn `metadata`). Live-updated when the option value changes. | | `sessions` | `Record` | — | Restore sessions from storage. See [ChatSession](#chatsession). | | `onSessionChange` | `(chatId, session \| null) => void` | — | Fires when session state changes. `session` is the full `ChatSession` or `null` when the run ends. | | `multiTab` | `boolean` | `false` | Enable multi-tab claim coordination via `BroadcastChannel`. See [Frontend → multi-tab](/docs/ai-chat/frontend#multi-tab-coordination). | | `watch` | `boolean` | `false` | Read-only watcher mode — keep the SSE subscription open across `trigger:turn-complete` so a viewer sees turns 2, 3, … through one long-lived stream. | | `headStart` | `string` | — | URL of a [`chat.headStart`](/docs/ai-chat/fast-starts#head-start) route handler. When set, the FIRST message of a brand-new chat POSTs to this URL so step 1's LLM call runs in your warm process while the agent run boots in parallel. Subsequent turns bypass it. | ### `accessToken` callback The transport invokes `accessToken` whenever it needs a *fresh* session-scoped PAT — initial use after no PAT is cached, or after a 401/403 from any session-PAT-authed request. The callback's job is to **return a token, not to start a run.** `AccessTokenParams`: | Field | Type | Description | | -------- | -------- | -------------------- | | `chatId` | `string` | The conversation id. 
| Customer implementation typically wraps `auth.createPublicToken` server-side: ```ts theme={"theme":"css-variables"} "use server"; import { auth } from "@trigger.dev/sdk"; export async function mintChatAccessToken(chatId: string) { return auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId } }, expirationTime: "1h", }); } ``` ```ts theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), }); ``` ### `startSession` callback The transport invokes `startSession` when it needs to *create* the session — on `transport.preload(chatId)`, and lazily on the first `sendMessage` for any chatId without a cached PAT. Concurrent and repeat calls dedupe via an in-flight promise, and the customer's wrapped helper is idempotent on `(env, externalId)` so two tabs / two `preload` calls converge on the same session. `StartSessionParams`: | Field | Type | Description | | ------------ | ------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `taskId` | `string` | The transport's `task` value. | | `chatId` | `string` | The conversation id (the session's `externalId`). | | `clientData` | `TClientData` | The transport's current `clientData` option. Pass through to `triggerConfig.basePayload.metadata` so the first run's `payload.metadata` matches per-turn `metadata`. 
| Customer implementation wraps `chat.createStartSessionAction(taskId)`: ```ts theme={"theme":"css-variables"} "use server"; import { chat } from "@trigger.dev/sdk/ai"; export const startChatSession = chat.createStartSessionAction("my-chat"); ``` ```ts theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ task: "my-chat", startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), }); ``` `startSession` is optional only when the customer fully manages the session lifecycle externally (e.g. by hydrating `sessions: { [chatId]: ... }` and never calling `preload`). Most customers should provide it. ### multiTab Enable multi-tab coordination. When `true`, only one browser tab can send messages to a given chatId at a time. Other tabs enter read-only mode with real-time message updates via `BroadcastChannel`. ```ts theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ task: "my-chat", accessToken, multiTab: true, }); ``` No-op when `BroadcastChannel` is unavailable (SSR, Node.js). See [Multi-tab coordination](/docs/ai-chat/frontend#multi-tab-coordination). ### Trigger configuration Trigger config (machine, queue, tags, maxAttempts, idleTimeoutInSeconds) lives server-side in `chat.createStartSessionAction(taskId, options?)`. The transport doesn't accept these options directly — pass them when wrapping the action: ```ts theme={"theme":"css-variables"} "use server"; import { chat } from "@trigger.dev/sdk/ai"; export const startChatSession = chat.createStartSessionAction("my-chat", { triggerConfig: { machine: "small-1x", queue: "chat-queue", tags: ["user:123"], maxAttempts: 3, idleTimeoutInSeconds: 60, }, }); ``` A `chat:{chatId}` tag is automatically added to every run. For per-call values that vary by chatId (e.g. plan-tier-driven machine), accept extra params on the customer's server action and pass them into `chat.createStartSessionAction(...)`'s options at call time. 
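The per-call variation described above can be sketched as a small pure helper that maps a plan tier to trigger config values. This is a hedged illustration, not SDK code: `Plan`, `TriggerConfigSketch`, `triggerConfigForPlan`, the `chat-queue-pro` queue name, and the choice of `medium-1x` for the higher tier are all hypothetical. Only the field names (`machine`, `queue`, `tags`, `maxAttempts`, `idleTimeoutInSeconds`) are taken from the example above.

```ts
// Hypothetical plan tiers for illustration; not an SDK type.
type Plan = "free" | "pro";

// Local stand-in for the trigger config fields shown in the example above.
interface TriggerConfigSketch {
  machine: string;
  queue: string;
  tags: string[];
  maxAttempts: number;
  idleTimeoutInSeconds: number;
}

// Maps a plan tier to per-call trigger config values (all values are assumptions).
function triggerConfigForPlan(plan: Plan, userId: string): TriggerConfigSketch {
  const base = { tags: [`user:${userId}`, `plan:${plan}`], maxAttempts: 3 };
  if (plan === "pro") {
    // Assumed: higher tiers get a larger machine and a longer warm window.
    return { ...base, machine: "medium-1x", queue: "chat-queue-pro", idleTimeoutInSeconds: 120 };
  }
  return { ...base, machine: "small-1x", queue: "chat-queue", idleTimeoutInSeconds: 60 };
}
```

In your own server action you would look up the caller's plan, build the config with a helper like this, and pass the result as `triggerConfig` when wrapping the start-session action.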
### transport.stopGeneration()

Stop the current generation for a chat session. Sends a stop signal to the backend task and closes the active SSE connection.

```ts theme={"theme":"css-variables"}
transport.stopGeneration(chatId: string): Promise<boolean>
```

Returns `true` if the stop signal was sent, `false` if there's no active session. Works for both initial connections and reconnected streams (after page refresh with `resume: true`).

Use alongside `useChat`'s `stop()` for a complete stop experience:

```tsx theme={"theme":"css-variables"}
const { stop: aiStop } = useChat({ transport });

const stop = useCallback(() => {
  transport.stopGeneration(chatId);
  aiStop();
}, [transport, chatId, aiStop]);
```

See [Stop generation](/docs/ai-chat/frontend#stop-generation) for full details.

### transport.sendAction()

Send a custom action to the agent. Actions wake the agent from suspension and fire `onAction`. They are not turns — `run()` and turn lifecycle hooks do not fire. If `onAction` returns a `StreamTextResult`, the response is auto-piped to the frontend.

```ts theme={"theme":"css-variables"}
transport.sendAction(chatId: string, action: unknown): Promise>
```

The action payload is validated against the agent's `actionSchema` on the backend.

```tsx theme={"theme":"css-variables"}
// Undo button
```

See [Actions](/docs/ai-chat/actions) for backend setup and [Sending actions](/docs/ai-chat/frontend#sending-actions) for frontend usage.

### transport.preload()

Eagerly trigger a run before the first message.

```ts theme={"theme":"css-variables"}
transport.preload(chatId): Promise
```

No-op if a session already exists for this chatId. The preload idle window is set by `preloadIdleTimeoutInSeconds` on the agent, not by this call. See [Preload](/docs/ai-chat/fast-starts#preload) for full details.

## useTriggerChatTransport

React hook that creates and memoizes a `TriggerChatTransport` instance. Import from `@trigger.dev/sdk/chat/react`.
```tsx theme={"theme":"css-variables"}
import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react";
import type { myChat } from "@/trigger/chat";

const transport = useTriggerChatTransport<typeof myChat>({
  task: "my-chat",
  accessToken: ({ chatId }) => mintChatAccessToken(chatId),
  startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  sessions: savedSessions,
  onSessionChange: handleSessionChange,
});
```

The transport is created once on first render and reused across re-renders. Pass a type parameter for compile-time validation of the task ID.

## AgentChat options

Options for the server-side chat client constructor. Import `AgentChat` from `@trigger.dev/sdk/chat`.

| Option | Type | Default | Description |
| ---------------------- | ------------------------------------------------------------------------ | -------------------------- | ----------- |
| `agent` | `string` | required | Task ID of the chat agent to converse with. |
| `id` | `string` | `crypto.randomUUID()` | Conversation ID. Used as the Session `externalId` and for tagging runs. |
| `clientData` | Typed by `clientDataSchema` | — | Client data included in every request. Same shape as the agent's `clientDataSchema`. |
| `session` | `ChatSession` | — | Restore a previous session (pass `lastEventId` to resume SSE). |
| `triggerConfig` | `Partial` | — | Default trigger config used when starting a new session (machine, tags, etc.). |
| `streamTimeoutSeconds` | `number` | `120` | SSE timeout in seconds. |
| `onTriggered` | `(event) => void \| Promise` | — | Fires when a new run is triggered for this session. |
| `onTurnComplete` | `(event) => void \| Promise` | — | Fires when a turn completes. Persist `event.lastEventId` for stream resumption.
| | `baseURL` | `string \| (ctx: { endpoint: "in" \| "out"; chatId: string }) => string` | `apiClientManager.baseURL` | API base URL. String form applies to every endpoint; function form picks per endpoint. Defaults to whatever `@trigger.dev/sdk` was configured with (typically `TRIGGER_API_URL`). | | `fetch` | `(url: string, init: RequestInit, ctx: { endpoint: "in" \| "out"; chatId: string }) => Promise` | — | Per-request fetch override. Invoked for both `.in/append` POSTs and the `.out` SSE GET. Use for header injection, custom retries, or proxy rewrites. | ## createStartSessionAction options Second argument to `chat.createStartSessionAction(taskId, options?)`. Controls how the server-mediated session-create call reaches the trigger.dev API. | Option | Type | Default | Description | | --------------- | ---------------------------------------------------------------------------------------------------------------- | -------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `tokenTTL` | `string \| number \| Date` | `"1h"` | TTL for the session-scoped public access token returned to the browser. | | `triggerConfig` | `Partial` | — | Default trigger config (machine, tags, queue, etc.). Per-call config shallow-merges on top. | | `baseURL` | `string \| (ctx: { endpoint: "sessions" \| "auth"; chatId: string }) => string` | `apiClientManager.baseURL` | API base URL. `endpoint` is `"sessions"` for `POST /api/v1/sessions` or `"auth"` for `POST /api/v1/auth/jwt/claims` (only fires when `tokenTTL` is set). | | `fetch` | `(url: string, init: RequestInit, ctx: { endpoint: "sessions" \| "auth"; chatId: string }) => Promise` | — | Per-request fetch override. Use to route session-create through a trusted edge proxy so `basePayload.metadata` is rewritten before reaching `api.trigger.dev`. 
| ## useMultiTabChat React hook for multi-tab message coordination. Import from `@trigger.dev/sdk/chat/react`. ```tsx theme={"theme":"css-variables"} import { useMultiTabChat } from "@trigger.dev/sdk/chat/react"; const { isReadOnly } = useMultiTabChat(transport, chatId, messages, setMessages); ``` | Parameter | Type | Description | | ------------- | ---------------------- | ---------------------------------------- | | `transport` | `TriggerChatTransport` | Transport instance with `multiTab: true` | | `chatId` | `string` | The chat session ID | | `messages` | `UIMessage[]` | Current messages from `useChat` | | `setMessages` | `(messages) => void` | Message setter from `useChat` | **Returns:** `{ isReadOnly: boolean }` — `true` when another tab is actively sending to this chatId. The hook handles: * Tracking read-only state from the transport's `BroadcastChannel` coordinator * Broadcasting messages when this tab is the active sender * Receiving messages from other tabs and updating via `setMessages` See [Multi-tab coordination](/docs/ai-chat/frontend#multi-tab-coordination). ## ChatSession Persistable session state for the frontend `TriggerChatTransport` and the server-side `AgentChat`. The underlying Session row is keyed on `chatId` (durable across runs); the persistable shape is just the SSE resume cursor and a refresh token. | Field | Type | Description | | ------------------- | ---------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `publicAccessToken` | `string` | Session-scoped JWT (`read:sessions:{chatId} + write:sessions:{chatId}`). Refreshed automatically on 401/403 via the transport's `accessToken` callback. | | `lastEventId` | `string \| undefined` | Last SSE event received on `.out`. Used to resume mid-stream after a disconnect. 
| | `isStreaming` | `boolean \| undefined` | Optional. If persisted, `reconnectToStream` uses it as a fast-path short-circuit. If omitted, the server decides via the session's [`X-Session-Settled`](/docs/ai-chat/client-protocol#x-session-settled-fast-close-on-idle-reconnects) response header. | ## ChatInputChunk The wire shape for records sent on `.in`. Consumed by `chat.agent` internally — you typically don't write these yourself; `transport.sendMessage`, `transport.stopGeneration`, and `transport.sendAction` all serialize into this shape. ```ts theme={"theme":"css-variables"} type ChatInputChunk = | { kind: "message"; payload: ChatTaskWirePayload } | { kind: "stop"; message?: string }; ``` | Variant | When | Payload | | ----------------- | ------------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------- | | `kind: "message"` | New message, action, approval response, or close | `payload` is a full `ChatTaskWirePayload` — its `trigger` field (`"submit-message"` / `"action"` / `"close"`) determines the agent's dispatch | | `kind: "stop"` | Client aborted the active turn | Optional `message` surfaces in the stop handler | For the raw wire format, see [Client Protocol — ChatInputChunk](/docs/ai-chat/client-protocol#chatinputchunk). ## Session token scopes Tokens minted for `TriggerChatTransport` and `AgentChat` are session-scoped — keyed on the chat's `externalId` (the `chatId` you assign). 
| Scope | Grants | | ------------------------- | --------------------------------------------------------------------- | | `read:sessions:` | Subscribe to `.out`, HEAD probe the stream, retrieve the session row | | `write:sessions:` | Append to `.in`, close the session, end-and-continue, update metadata | Tokens are produced by `auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId } } })` (used by the customer's `accessToken` server action) or returned automatically from `chat.createStartSessionAction` / `POST /api/v1/sessions`. Either form authorizes both URL forms (`/sessions/{chatId}/...` and `/sessions/session_*/...`) on every read and write route. ## Related * [Realtime Streams](/docs/tasks/streams) — How streams work under the hood * [Using the Vercel AI SDK](/docs/guides/examples/vercel-ai-sdk) — Basic AI SDK usage with Trigger.dev * [Realtime React Hooks](/docs/realtime/react-hooks/overview) — Lower-level realtime hooks * [Authentication](/docs/realtime/auth) — Public access tokens and trigger tokens # Server-Side Chat Source: https://trigger.dev/docs/ai-chat/server-chat Use AgentChat to interact with chat agents from server-side code — tasks, webhooks, scripts, or other agents. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. `AgentChat` lets you chat with agents from server-side code. It works inside tasks (agent-to-agent), request handlers, webhook processors, and scripts. 
```ts theme={"theme":"css-variables"}
import { AgentChat } from "@trigger.dev/sdk/chat";

const chat = new AgentChat({ agent: "my-agent" });
const stream = await chat.sendMessage("Hello!");
const text = await stream.text();
await chat.close();
```

## Type-safe client data

Pass `typeof yourAgent` as a type parameter and `clientData` is automatically typed from the agent's `withClientData` schema:

```ts theme={"theme":"css-variables"}
import { AgentChat } from "@trigger.dev/sdk/chat";
import type { myAgent } from "./trigger/my-agent";

const chat = new AgentChat<typeof myAgent>({
  agent: "my-agent",
  clientData: { userId: "user_123" }, // ← typed from agent definition
});
```

## Conversation lifecycle

Each `AgentChat` instance represents one conversation. The conversation ID is auto-generated or can be set explicitly:

```ts theme={"theme":"css-variables"}
// Auto-generated ID
const chat = new AgentChat({ agent: "my-agent" });

// Explicit ID — useful for persistence or finding the run later
const chat = new AgentChat({ agent: "my-agent", id: `review-${prNumber}` });
```

### Sending messages

`sendMessage()` triggers a new run on the first call, then reuses the same run for subsequent messages via input streams:

```ts theme={"theme":"css-variables"}
// First message — triggers a new run
const stream1 = await chat.sendMessage("Review PR #42");
const review = await stream1.text();

// Follow-up — same run, agent has full context
const stream2 = await chat.sendMessage("Can you fix the main bug?");
const fix = await stream2.text();
```

### Preloading (optional)

If you want the agent to initialize before the first message (e.g., load data, authenticate), call `preload()`. This is optional — `sendMessage()` triggers the run automatically if needed.
```ts theme={"theme":"css-variables"}
await chat.preload();
// Agent's onPreload hook fires now, before user types anything

const stream = await chat.sendMessage("Hello");
```

### Closing

Signal the agent to exit its loop gracefully:

```ts theme={"theme":"css-variables"}
await chat.close();
```

Without `close()`, the agent exits on its own when its idle/suspend timeout expires.

## Reading responses

`sendMessage()` returns a `ChatStream` — a typed wrapper around the response.

### Get the full text

```ts theme={"theme":"css-variables"}
const stream = await chat.sendMessage("What is Trigger.dev?");
const text = await stream.text();
```

### Get structured results

```ts theme={"theme":"css-variables"}
const stream = await chat.sendMessage("Research this topic");
const { text, toolCalls, toolResults } = await stream.result();

for (const tc of toolCalls) {
  console.log(`Tool: ${tc.toolName}, Input: ${JSON.stringify(tc.input)}`);
}
```

### Stream chunks in real-time

```ts theme={"theme":"css-variables"}
const stream = await chat.sendMessage("Write a report");

for await (const chunk of stream) {
  if (chunk.type === "text-delta") {
    process.stdout.write(chunk.delta);
  }
  if (chunk.type === "tool-input-available") {
    console.log(`Using tool: ${chunk.toolName}`);
  }
}
```

## Stateless request handlers

In a stateless environment (HTTP handler, serverless function), you need to persist and restore the session across requests. Each chat is backed by a durable Session row that outlives any single run. `AgentChat` exposes the persistable state via `chat.session` (the SSE resume cursor) and surfaces the current run id via the `onTriggered` callback for telemetry / dashboard linking.
```ts theme={"theme":"css-variables"}
import { AgentChat } from "@trigger.dev/sdk/chat";

export async function POST(req: Request) {
  const { chatId, message } = await req.json();
  const saved = await db.sessions.find({ chatId });

  const chat = new AgentChat({
    agent: "my-agent",
    id: chatId,
    // Restore from previous request — `lastEventId` is the SSE resume
    // cursor; the underlying Session is keyed on `chatId` so it's
    // implicit and durable.
    session: saved ? { lastEventId: saved.lastEventId } : undefined,
    // Useful for telemetry / dashboard linking. The `runId` is the
    // current run, which may change across continuations and upgrades.
    onTriggered: async ({ runId }) => {
      await db.sessions.upsert({ chatId, runId });
    },
    // Persist after each turn for stream resumption
    onTurnComplete: async ({ lastEventId }) => {
      await db.sessions.update({ chatId, lastEventId });
    },
  });

  const stream = await chat.sendMessage(message);
  const text = await stream.text();
  return Response.json({ text });
}
```

The Session row is the run manager — a chat that was active yesterday resumes against the same chatId today, even if the original run has long since exited. `AgentChat` (server-side) and `TriggerChatTransport` (browser) both rely on this: send a new message and the server triggers a fresh continuation run on the same session, carrying the conversation forward without losing history or identity.

## Sub-agent tool pattern

`AgentChat` can be used inside an AI SDK tool to delegate work to a durable sub-agent.
The sub-agent's response streams as preliminary tool results:

```ts theme={"theme":"css-variables"}
import { tool } from "ai";
import { AgentChat } from "@trigger.dev/sdk/chat";
import { z } from "zod";

const researchTool = tool({
  description: "Delegate research to a specialist agent.",
  inputSchema: z.object({ topic: z.string() }),
  execute: async function* ({ topic }, { abortSignal }) {
    const chat = new AgentChat({ agent: "research-agent" });
    const stream = await chat.sendMessage(topic, { abortSignal });
    yield* stream.messages();
    await chat.close();
  },
  toModelOutput: ({ output: message }) => {
    const lastText = message?.parts?.findLast(
      (p: { type: string }) => p.type === "text"
    ) as { text?: string } | undefined;
    return { type: "text", value: lastText?.text ?? "Done." };
  },
});
```

This supports single-turn delegation, multi-turn LLM-driven conversations with persistent sub-agents, and cross-turn state that survives snapshot/restore. See the [Sub-Agents guide](/docs/ai-chat/patterns/sub-agents) for the full pattern including multi-turn conversations, cleanup, and what the frontend sees.
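The `toModelOutput` step above reduces the sub-agent's final message to its last text part, falling back to a fixed string. A self-contained sketch of that reduction follows. The names `PartSketch`, `MessageSketch`, and `lastTextOf` are hypothetical local stand-ins, not AI SDK exports, and the sketch uses a plain backwards loop so it does not depend on `Array.prototype.findLast` being available in the target runtime.

```ts
// Local stand-ins for the message shape; the real types come from the AI SDK.
interface PartSketch {
  type: string;
  text?: string;
}

interface MessageSketch {
  parts?: PartSketch[];
}

// Returns the text of the last part whose type is "text", or the fallback
// when there is no message, no text part, or the text part has no text.
function lastTextOf(message: MessageSketch | undefined, fallback = "Done."): string {
  const parts = message?.parts ?? [];
  for (let i = parts.length - 1; i >= 0; i--) {
    const part = parts[i];
    if (part && part.type === "text") {
      return part.text ?? fallback;
    }
  }
  return fallback;
}
```

Keeping the reduction in a small function like this makes the fallback behaviour easy to check in isolation before wiring it into a tool definition.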
## Additional methods ### Steering Send a message during an active stream without interrupting it: ```ts theme={"theme":"css-variables"} await chat.steer("Focus on security issues specifically"); ``` ### Stop generation Abort the current `streamText` call without ending the run: ```ts theme={"theme":"css-variables"} await chat.stop(); ``` ### Raw messages For full control over the UIMessage shape: ```ts theme={"theme":"css-variables"} const rawStream = await chat.sendRaw([ { id: "msg-1", role: "user", parts: [ { type: "text", text: "Hello" }, { type: "file", url: "https://...", mediaType: "image/png" }, ], }, ]); ``` ### Reconnect Resume a stream subscription after a disconnect: ```ts theme={"theme":"css-variables"} const stream = await chat.reconnect(); ``` ## AgentChat options | Option | Type | Default | Description | | ---------------------- | --------------------------------------------------------------------------------------------------------- | -------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `agent` | `string` | required | The agent task ID to trigger | | `id` | `string` | `crypto.randomUUID()` | Conversation ID for tagging and correlation | | `clientData` | typed from agent | `undefined` | Client data included in every request | | `session` | `ChatSession` (`{ lastEventId?: string }`) | `undefined` | Restore a previous session's SSE resume cursor. The Session row itself is keyed on `chatId` (durable) — no other state to thread. 
| | `onTriggered` | `(event) => void` | `undefined` | Called when a new run is created | | `onTurnComplete` | `(event) => void` | `undefined` | Called when a turn's stream ends | | `streamTimeoutSeconds` | `number` | `120` | SSE timeout in seconds | | `triggerConfig` | `SessionTriggerConfig` | `undefined` | Tags, queue, machine, `maxAttempts`, `idleTimeoutInSeconds`, `basePayload` — folded into `sessions.start({...})` | | `baseURL` | `string \| (ctx: { endpoint: "in" \| "out"; chatId: string }) => string` | `apiClientManager.baseURL` | API base URL. String form applies to every endpoint; function form picks per endpoint — useful for routing `.in/append` through an edge proxy while keeping `.out` SSE direct. Defaults to whatever `@trigger.dev/sdk` was configured with (typically `TRIGGER_API_URL`). | | `fetch` | `(url: string, init: RequestInit, ctx: { endpoint: "in" \| "out"; chatId: string }) => Promise` | `undefined` | Per-request fetch override. Invoked for both `.in/append` POSTs and the `.out` SSE GET. Use for header injection, custom retries, or proxy rewrites. | ## ChatStream methods | Method | Returns | Description | | ------------------------ | -------------------------------- | --------------------------------------------------------- | | `text()` | `Promise` | Consume stream, return accumulated text | | `result()` | `Promise` | Consume stream, return `{ text, toolCalls, toolResults }` | | `messages()` | `AsyncGenerator` | Yield accumulated UIMessage snapshots (sub-agent pattern) | | `[Symbol.asyncIterator]` | `UIMessageChunk` | Iterate over typed stream chunks | | `.stream` | `ReadableStream` | Raw stream for AI SDK utilities | # Sessions Source: https://trigger.dev/docs/ai-chat/sessions A Session is a pair of durable streams — input carries your users' messages to the agent, output carries everything the agent produces back — plus orchestration of the runs that process them. 
The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details.

**A Session is a pair of durable streams.** The input stream (`.in`) carries incoming user messages to your task. The output stream (`.out`) carries everything the agent produces back to your clients: AI generation parts (text, reasoning, tool calls) and any custom data parts you write.

Sessions also **orchestrate the runs that process those streams**. A Session is keyed on your stable id (`externalId` — for chat, the `chatId`) and owns its current run: when a run suspends, idles out, or hands off to a new version, the Session starts or swaps to a fresh run and the streams carry on. Clients keep sending and reading against the same id; they never know a run changed underneath.

```mermaid theme={"theme":"css-variables"}
flowchart LR
  C[Browser / backend clients] -- "user messages" --> IN([Session .in])
  IN --> R["current run (runs come and go)"]
  R -- "text, reasoning, tool calls, data parts" --> OUT([Session .out])
  OUT --> C
```

`chat.agent` is built on Sessions. You can also use them directly for any pattern that needs durable bi-directional streaming across runs: long-lived agent inboxes, multi-step approval flows, server-to-server pipelines that survive worker restarts.

## A minimal example

A task that echoes whatever lands on its input stream, and a backend that starts the session, sends a message, and reads the reply:

```ts trigger/inbox.ts theme={"theme":"css-variables"}
import { task, sessions } from "@trigger.dev/sdk";

export const inboxAgent = task({
  id: "inbox-agent",
  run: async (payload: { sessionId: string }) => {
    const session = sessions.open(payload.sessionId);
    while (true) {
      // Suspends the run (no compute billed) until a record arrives.
      const next = await session.in.wait<{ text: string }>({ timeout: "1h" });
      if (!next.ok) return;
      await session.out.append({ type: "reply", text: `echo: ${next.output.text}` });
    }
  },
});
```

```ts Your backend theme={"theme":"css-variables"}
import { sessions } from "@trigger.dev/sdk";

// Atomically create the session AND trigger its first run.
await sessions.start({
  type: "inbox",
  externalId: userId,
  taskIdentifier: "inbox-agent",
  triggerConfig: { basePayload: { sessionId: userId } },
});

const session = sessions.open(userId);
await session.in.send({ text: "hello" });

const stream = await session.out.read({ signal: AbortSignal.timeout(30_000) });
for await (const chunk of stream) {
  console.log(chunk); // { type: "reply", text: "echo: hello" }
}
```

The run can suspend, crash, or be replaced between the `send` and the `read` — the streams are durable, so nothing is lost and the client code doesn't change.

## Sessions and runs

One Session spans many runs over its lifetime. The Session row tracks `currentRunId`; the runs do the work:

* **First run**: created atomically by `sessions.start` (no gap where the session exists but nothing is listening).
* **Idle suspend**: a run blocked on `in.wait` suspends and frees compute. A new record on `.in` wakes it. * **Continuation**: when a run ends (idle timeout, `chat.endRun`, a crash, a version upgrade), the next incoming record triggers a fresh run against the same Session. The new run picks up the streams where the old one left off. This is what makes a Session the durable identity for a conversation: runs are an execution detail, the Session (and its `externalId`) is what your clients address. See [How it works](/docs/ai-chat/how-it-works) for how `chat.agent` drives this loop. ## When to reach for Sessions directly `chat.agent` handles 90% of chat-shaped workloads — message accumulation, the turn loop, stop signals, lifecycle hooks. Use the raw `sessions` API when you need any of: * **Non-chat conversational state**: an agent inbox where each "turn" is a webhook event rather than a UI message. * **Server-to-server bi-directional streaming** where an external service produces records the task consumes (and vice-versa) over the same durable channel. * **A custom turn loop** where the agent abstraction doesn't fit but you still want session-survival across runs. For chat use cases, prefer [`chat.agent`](/docs/ai-chat/backend#chat-agent) or [`chat.createSession`](/docs/ai-chat/backend#chat-createsession). ## `sessions` namespace ```ts theme={"theme":"css-variables"} import { sessions } from "@trigger.dev/sdk"; ``` ### `sessions.start(body, requestOptions?)` Atomically create a Session row and trigger its first run. Idempotent on `(env, externalId)` — two concurrent calls with the same `externalId` converge to one session. 
```ts theme={"theme":"css-variables"} const { id, runId, publicAccessToken, isCached } = await sessions.start({ type: "chat.agent", externalId: chatId, taskIdentifier: "my-chat", triggerConfig: { tags: [`chat:${chatId}`], basePayload: { /* whatever your task's payload shape is */ }, }, }); ``` | Field | Type | Notes | | ---------------- | -------------------------- | ----------------------------------------------------------------------------------------------------------------------- | | `type` | `string` | Free-form discriminator. `chat.agent` uses `"chat.agent"`. | | `externalId` | `string?` | Your stable identity. Cannot start with `session_` (reserved). | | `taskIdentifier` | `string` | Task this session triggers runs against. | | `triggerConfig` | `SessionTriggerConfig` | Trigger options applied to every run: `tags`, `queue`, `machine`, `maxAttempts`, `idleTimeoutInSeconds`, `basePayload`. | | `tags` | `string[]?` | Up to 10 tags on the Session row (separate from `triggerConfig.tags`). | | `metadata` | `Record?` | Arbitrary JSON. | | `expiresAt` | `Date?` | Hard retention deadline. | Returns `CreatedSessionResponseBody`: | Field | Type | Notes | | ------------------- | --------- | ---------------------------------------------------------------- | | `id` | `string` | Server-assigned `session_*` friendlyId. | | `runId` | `string` | The first run created alongside the session. | | `publicAccessToken` | `string` | Session-scoped PAT (`read:sessions:{id} + write:sessions:{id}`). | | `isCached` | `boolean` | `true` if the session already existed (idempotent upsert). | ### `sessions.retrieve(idOrExternalId, requestOptions?)` Retrieve a Session by either its server-assigned `session_*` id or your user-supplied `externalId`. The server disambiguates via the `session_` prefix. 
```ts theme={"theme":"css-variables"} const session = await sessions.retrieve(chatId); console.log(session.currentRunId, session.tags, session.closedAt); ``` ### `sessions.update(idOrExternalId, body, requestOptions?)` Mutate `tags` or `metadata` on an existing Session. `externalId` is read-only after create: it cannot be changed or cleared (it keys the session's durable streams and token scope), so sending a different value returns `422`. ### `sessions.close(idOrExternalId, body?, requestOptions?)` Mark a Session as closed. Terminal and idempotent. The optional `reason` is stored on the row. ```ts theme={"theme":"css-variables"} await sessions.close(chatId, { reason: "user signed out" }); ``` ### `sessions.list(options?, requestOptions?)` Cursor-paginated list of Sessions in the current environment. Returns a `CursorPagePromise` you can iterate with `for await`. ```ts theme={"theme":"css-variables"} for await (const s of sessions.list({ type: "chat.agent", tag: `user:${userId}`, status: "ACTIVE", limit: 50, })) { console.log(s.id, s.externalId, s.createdAt); } ``` | Filter | Type | Notes | | ---------------------------- | ----------------------------------- | --------------------------------------- | | `type` | `string \| string[]` | e.g. `"chat.agent"` | | `tag` | `string \| string[]` | Matches `triggerConfig.tags` | | `taskIdentifier` | `string \| string[]` | Filter by task | | `externalId` | `string` | Exact match | | `status` | `"ACTIVE" \| "CLOSED" \| "EXPIRED"` | Lifecycle state | | `period` / `from` / `to` | window | Time-range filter | | `limit` / `after` / `before` | cursor | Pagination (1–100 per page; default 20) | ### `sessions.open(idOrExternalId)` Open a lightweight `SessionHandle` to the realtime channels. Does **not** hit the network — each handle method calls the corresponding endpoint lazily. 
```ts theme={"theme":"css-variables"} const session = sessions.open(chatId); await session.out.append({ kind: "message", text: "hello" }); const next = await session.in.once({ timeoutMs: 30_000 }); ``` ## `SessionHandle` ```ts theme={"theme":"css-variables"} class SessionHandle { readonly id: string; readonly in: SessionInputChannel; readonly out: SessionOutputChannel; } ``` The two channels mirror the producer/consumer pair in `streams.define` (out) and `streams.input` (in), but are **session-scoped** rather than run-scoped — they survive across run boundaries. ## `session.out` — task → clients The output channel. The task writes; external clients (browser, server action, another task) read via SSE. The underlying HTTP endpoints are documented in [Session channels](/docs/management/sessions/channels) for non-SDK callers. ### `out.append(value, options?)` Append a single record. Routes through `writer` internally so SSE consumers see the same parsed-object shape on every event. ### `out.pipe(stream, options?)` Pipe an `AsyncIterable` or `ReadableStream` directly to S2 (the durable backing store). Returns `{ stream, waitUntilComplete }`. ### `out.writer({ execute, ... })` Imperative writer. `execute({ write, merge })` runs against an in-memory queue whose records are piped to S2. ```ts theme={"theme":"css-variables"} session.out.writer({ execute: ({ write }) => { write({ type: "text", text: "hi" }); write({ type: "text", text: " there" }); }, }); ``` ### `out.read(options?)` Subscribe to SSE records on `.out`. Returns an async-iterable stream with auto-retry and `Last-Event-ID` resume. ```ts theme={"theme":"css-variables"} const stream = await session.out.read({ signal: AbortSignal.timeout(30_000), lastEventId: lastSeenSeqNum, }); for await (const chunk of stream) { // ... } ``` ### `out.writeControl(subtype, extraHeaders?)` Write a Trigger control record. Carries a `trigger-control` header valued with `subtype` (e.g. 
`turn-complete`, `upgrade-required`); the body is empty. The SDK transport filters control records out of the consumer-facing chunk stream — readers route them via `onControl` instead. Returns `{ lastEventId }` — useful for trim chains. ### `out.trimTo(earliestSeqNum)` Append an S2 `trim` command. Records with `seq_num < earliestSeqNum` are eventually deleted. Idempotent and monotonic. `chat.agent` uses this to keep `session.out` bounded to roughly one turn at steady state. ## `session.in` — clients → task The input channel. External clients call `send`; the task consumes via `on` / `once` / `peek` / `wait` / `waitWithIdleTimeout`. The underlying HTTP endpoints are documented in [Session channels](/docs/management/sessions/channels) for non-SDK callers. ### `in.send(value, requestOptions?)` Append a single record. Called from outside the task (browser, server action, another task). ```ts theme={"theme":"css-variables"} const session = sessions.open(chatId); await session.in.send({ kind: "user-event", payload: { ... } }); ``` ### `in.on(handler)` Register a handler that fires for every record landing on `.in`. Buffered records flush on attach. Returns `{ off }`. ### `in.once(options?)` Wait for the next record without suspending the run. `{ ok: true, output }` or `{ ok: false, error }` on timeout. Chain `.unwrap()` to get the data directly. ```ts theme={"theme":"css-variables"} const result = await session.in.once({ timeoutMs: 5_000 }); if (result.ok) handle(result.output); ``` ### `in.peek()` Non-blocking peek at the head of the `.in` buffer. ### `in.wait(options?)` Suspend the current run until the next record arrives — frees compute while blocked. Only callable from inside `task.run()`. ```ts theme={"theme":"css-variables"} const next = await session.in.wait({ timeout: "1h" }); ``` ### `in.waitWithIdleTimeout({ idleTimeoutInSeconds, timeout, ... })` Hybrid: stay warm for `idleTimeoutInSeconds`, then suspend via `wait` if nothing arrives. 
`chat.agent`'s turn loop uses this to balance responsiveness and cost. ```ts theme={"theme":"css-variables"} const next = await session.in.waitWithIdleTimeout({ idleTimeoutInSeconds: 30, timeout: "1h", onSuspend: () => { /* persist before suspending */ }, onResume: () => { /* re-hydrate after resume */ }, }); ``` ### `in.lastDispatchedSeqNum()` The highest S2 `seq_num` this channel has delivered to a consumer. Used by `chat.agent` to persist a resume cursor on each `turn-complete` so the next worker boot subscribes past already-processed records. ## Authorization Browser and server-side clients use a session-scoped Public Access Token: ```ts theme={"theme":"css-variables"} import { auth } from "@trigger.dev/sdk"; const pat = await auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId }, }, expirationTime: "1h", }); ``` Tokens authorize **both** URL forms: `/sessions/{externalId}/...` and `/sessions/session_*/...`. For the `chat.agent` transport, `auth.createPublicToken` is wrapped by `accessToken` in `useTriggerChatTransport`; for direct session access from your server, mint a token per request just like any other realtime resource. See [Session scopes](/docs/management/authentication#session-scopes) for exactly what `read:sessions` and `write:sessions` grant, and why updating, closing, and appending to `.out` require a secret key. ## See also * [Sessions HTTP API](/docs/management/sessions/create) — The REST endpoints for creating, listing, retrieving, updating, and closing sessions, plus the [channel endpoints](/docs/management/sessions/channels) for non-SDK callers. * [Session scopes](/docs/management/authentication#session-scopes) — The public-token scopes that authorize session and channel access. * [How it works](/docs/ai-chat/how-it-works) — How `chat.agent` builds on Sessions. * [Backend](/docs/ai-chat/backend) — `chat.agent` / `chat.createSession` / raw `task()` with chat primitives. 
* [Client Protocol](/docs/ai-chat/client-protocol) — The wire-level view of `.in/append` and `.out` SSE. * [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay) — How tails are read at boot. # Testing Source: https://trigger.dev/docs/ai-chat/testing Drive a chat.agent through real turns in unit tests — no network, no task runtime, no mocking the SDK. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. ## Overview `@trigger.dev/sdk/ai/test` exports `mockChatAgent`, an offline harness that runs your `chat.agent` definition's `run()` function inside an in-memory task runtime. You send messages, actions, and stop signals through driver methods and assert against the chunks the agent emits. Under the hood the harness drives the agent's backing Session channels — `.in` receives the records your `sendMessage` / `sendStop` / `sendAction` produce, `.out` captures the chunks the agent emits. The harness API itself is session-agnostic; you don't need to manage `sessionId` in tests. The harness exercises the real turn loop, lifecycle hooks, validation, hydration, and action routing — only the language model and the surrounding Trigger.dev runtime are replaced. Pair it with [`MockLanguageModelV3`](https://sdk.vercel.ai/docs/reference/ai-sdk-core/mock-language-model-v3) and `simulateReadableStream` from `ai` to control LLM responses. Import `@trigger.dev/sdk/ai/test` **before** your agent module. It installs the resource catalog so `chat.agent({ id, ... })` can register tasks during testing. 
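To see why the import order matters, here is a toy model of a process-global catalog. Everything in it (`Catalog`, `installCatalog`, `registerAgent`) is hypothetical, and it assumes that registrations made before the catalog is installed are simply dropped; it approximates the idea and is not the SDK's registry code.

```typescript
// Toy model only: none of these names are SDK exports.
interface Catalog {
  register(id: string): void;
  has(id: string): boolean;
}

// Assumption: until the test entry point loads, registrations are discarded.
const noopCatalog: Catalog = {
  register: () => {},
  has: () => false,
};

let activeCatalog: Catalog = noopCatalog;

// Stands in for the side effect of importing the test entry point.
function installCatalog(): void {
  const ids = new Set<string>();
  activeCatalog = {
    register: (id) => {
      ids.add(id);
    },
    has: (id) => ids.has(id),
  };
}

// Stands in for `chat.agent({ id, ... })` running at module load time.
function registerAgent(id: string): void {
  activeCatalog.register(id);
}

// Wrong order: the agent module loads before the catalog exists.
registerAgent("early-agent");
installCatalog();
// Correct order: the catalog is installed first.
registerAgent("late-agent");

const earlyFound = activeCatalog.has("early-agent"); // false
const lateFound = activeCatalog.has("late-agent"); // true
console.log({ earlyFound, lateFound });
```

In this model the early registration is lost silently, which is why a later lookup by id would fail even though the agent module was imported.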
## Quick start ```ts trigger/my-chat.test.ts theme={"theme":"css-variables"} import { mockChatAgent } from "@trigger.dev/sdk/ai/test"; import { describe, expect, it } from "vitest"; import { simulateReadableStream } from "ai"; import { MockLanguageModelV3 } from "ai/test"; import type { LanguageModelV3StreamPart } from "@ai-sdk/provider"; import { myChatAgent } from "./my-chat.js"; function modelWithText(text: string) { const chunks: LanguageModelV3StreamPart[] = [ { type: "text-start", id: "t1" }, { type: "text-delta", id: "t1", delta: text }, { type: "text-end", id: "t1" }, { type: "finish", finishReason: { unified: "stop", raw: "stop" }, usage: { inputTokens: { total: 10, noCache: 10, cacheRead: undefined, cacheWrite: undefined }, outputTokens: { total: 10, text: 10, reasoning: undefined }, }, }, ]; return new MockLanguageModelV3({ doStream: async () => ({ stream: simulateReadableStream({ chunks }) }), }); } describe("myChatAgent", () => { it("streams the model's response", async () => { const model = modelWithText("hello world"); const harness = mockChatAgent(myChatAgent, { chatId: "test-1", clientData: { model }, }); try { const turn = await harness.sendMessage({ id: "u1", role: "user", parts: [{ type: "text", text: "hi" }], }); const text = turn.chunks .filter((c) => c.type === "text-delta") .map((c) => (c as { delta: string }).delta) .join(""); expect(text).toBe("hello world"); } finally { await harness.close(); } }); }); ``` The agent reads the mock model from `clientData`: ```ts trigger/my-chat.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { stepCountIs, streamText, type LanguageModel } from "ai"; import { z } from "zod"; type ClientData = { model: LanguageModel }; export const myChatAgent = chat .withClientData({ schema: z.custom<ClientData>( (v) => !!v && typeof v === "object" && "model" in (v as object) ), }) .agent({ id: "my-chat", run: async ({ messages, clientData, signal }) => { return streamText({ model: 
clientData?.model ?? "openai/gpt-4o-mini", messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` ## Setup ### Install dev dependencies The harness itself ships with `@trigger.dev/sdk`. You need a test runner and the AI SDK's mock model utilities: ```bash theme={"theme":"css-variables"} pnpm add -D vitest ai @ai-sdk/provider ``` `@ai-sdk/provider` is only needed to type the chunk array as `LanguageModelV3StreamPart[]`, so drop it if you cast inline. ### Vitest config A minimal `vitest.config.ts` for a Trigger.dev project: ```ts theme={"theme":"css-variables"} import { defineConfig } from "vitest/config"; export default defineConfig({ test: { include: ["src/**/*.test.ts"], environment: "node", }, }); ``` ### Import order `mockChatAgent` must be imported **first** so the resource catalog is installed before any `chat.agent({ id, ... })` registration runs: ```ts theme={"theme":"css-variables"} // ✅ Correct import { mockChatAgent } from "@trigger.dev/sdk/ai/test"; import { myAgent } from "./my-agent.js"; // ❌ Wrong: agent loads before the catalog exists import { myAgent } from "./my-agent.js"; import { mockChatAgent } from "@trigger.dev/sdk/ai/test"; ``` If the agent isn't registered when `mockChatAgent` runs, you'll get: ``` mockChatAgent: no task registered with id "my-chat". ``` ## Inject the model via clientData `MockLanguageModelV3` lives in test code and shouldn't leak into your agent module. Pass it through `clientData` so the agent picks it up at runtime in tests, and falls back to a real model in production: ```ts trigger/agent.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { anthropic } from "@ai-sdk/anthropic"; import { stepCountIs, streamText, type LanguageModel } from "ai"; import { z } from "zod"; type ClientData = { model?: LanguageModel }; export const agent = chat .withClientData({ schema: z.custom<ClientData>() }) .agent({ id: "agent", run: async ({ messages, clientData, signal }) => { return streamText({ model: clientData?.model ?? 
anthropic("claude-haiku-4-5"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` ```ts agent.test.ts theme={"theme":"css-variables"} const harness = mockChatAgent(agent, { chatId: "test", clientData: { model: mockModel }, }); ``` ## Driving turns The harness exposes one method per chat trigger. Each waits for the next `trigger:turn-complete` chunk before resolving. ### sendMessage ```ts theme={"theme":"css-variables"} const turn = await harness.sendMessage({ id: "u1", role: "user", parts: [{ type: "text", text: "hi" }], }); ``` Pass an array to send multiple messages at once. ### sendRegenerate ```ts theme={"theme":"css-variables"} const turn = await harness.sendRegenerate(messages); ``` Equivalent to the frontend's `useChat().regenerate()` — replays a turn with the given message history. ### sendAction Routes a payload through `actionSchema` + `onAction`. Actions are not turns: only `hydrateMessages` and `onAction` fire on the agent side — no turn lifecycle hooks, no `run()`. The returned `turn.rawChunks` contains whatever `onAction` produced (a streamed model response if it returned a `StreamTextResult`, otherwise just `trigger:turn-complete`): ```ts theme={"theme":"css-variables"} const turn = await harness.sendAction({ type: "undo" }); ``` If the action fails schema validation, an `error` chunk appears in `turn.rawChunks`. ### sendStop Fires a stop signal. Does **not** wait for a turn — the agent's `signal.aborted` becomes `true` and the current turn unwinds: ```ts theme={"theme":"css-variables"} await harness.sendStop("user requested stop"); ``` ### close Sends a `close` trigger, closes the session's `.in` channel, and aborts the run signal so the task exits cleanly. 
Always call this at the end of every test: ```ts theme={"theme":"css-variables"} afterEach(() => harness.close()); // or with a try/finally try { await harness.sendMessage(...); } finally { await harness.close(); } ``` ## Inspecting output Each turn returns: ```ts theme={"theme":"css-variables"} type MockChatAgentTurn = { chunks: UIMessageChunk[]; // text-delta, tool-call, etc. rawChunks: unknown[]; // includes control chunks (turn-complete, errors) }; ``` The harness also exposes accumulators across all turns: ```ts theme={"theme":"css-variables"} harness.allChunks; // every UIMessageChunk since creation harness.allRawChunks; // every raw chunk including control frames ``` A small helper to assemble streamed text: ```ts theme={"theme":"css-variables"} function collectText(chunks: UIMessageChunk[]): string { return chunks .filter((c) => c.type === "text-delta") .map((c) => (c as { delta: string }).delta) .join(""); } ``` ## Common patterns ### Asserting hook order ```ts theme={"theme":"css-variables"} const events: string[] = []; const agent = chat.agent({ id: "hook-order", onChatStart: async () => { events.push("onChatStart"); }, onTurnStart: async () => { events.push("onTurnStart"); }, onBeforeTurnComplete: async () => { events.push("onBeforeTurnComplete"); }, onTurnComplete: async () => { events.push("onTurnComplete"); }, run: async ({ messages, signal }) => { events.push("run"); return streamText({ model, messages, abortSignal: signal }); }, }); const harness = mockChatAgent(agent, { chatId: "t" }); await harness.sendMessage(userMessage("hi")); // onTurnComplete fires after the turn-complete chunk is written — // give it a tick before asserting. 
await new Promise((r) => setTimeout(r, 20)); expect(events).toEqual([ "onChatStart", "onTurnStart", "run", "onBeforeTurnComplete", "onTurnComplete", ]); await harness.close(); ``` ### Testing onValidateMessages ```ts theme={"theme":"css-variables"} const turn = await harness.sendMessage(userMessage("hello blocked-word")); // The turn completes with an error chunk, not text expect(collectText(turn.chunks)).toBe(""); expect(turn.rawChunks.some((c) => typeof c === "object" && c !== null && (c as { type?: string }).type === "trigger:turn-complete" )).toBe(true); ``` ### Testing actions and rejection ```ts theme={"theme":"css-variables"} // Valid action await harness.sendAction({ type: "undo" }); // Invalid action — schema validation fails, error chunk emitted const turn = await harness.sendAction({ type: "not-a-real-action" }); const errors = turn.rawChunks.filter((c) => typeof c === "object" && c !== null && (c as { type?: string }).type === "error" ); expect(errors.length).toBeGreaterThan(0); ``` ### Multi-turn accumulation The harness preserves chat history across turns, just like the real runtime: ```ts theme={"theme":"css-variables"} const seenLengths: number[] = []; const agent = chat.agent({ id: "multi-turn", run: async ({ messages, signal }) => { seenLengths.push(messages.length); return streamText({ model, messages, abortSignal: signal }); }, }); const harness = mockChatAgent(agent, { chatId: "t" }); await harness.sendMessage(userMessage("first")); await harness.sendMessage(userMessage("second")); await harness.sendMessage(userMessage("third")); // Turn 1: 1 message; turn 2: user + assistant + user = 3; turn 3: 5 expect(seenLengths).toEqual([1, 3, 5]); ``` ### Hydrating from a "database" Use `clientData` to seed a synthetic prior context for `hydrateMessages`: ```ts theme={"theme":"css-variables"} const hydrated = [ { id: "h1", role: "user", parts: [{ type: "text", text: "prior question" }] }, { id: "h2", role: "assistant", parts: [{ type: "text", text: "prior 
answer" }] }, ]; const harness = mockChatAgent(agent, { chatId: "test-hydrate", clientData: { model, hydrated: [...hydrated, userMessage("follow up")] }, }); await harness.sendMessage(userMessage("follow up")); // Model should have been called with the hydrated context expect(model.doStreamCalls[0]!.prompt.length).toBeGreaterThanOrEqual(3); ``` The agent reads `clientData.hydrated` inside its `hydrateMessages` hook: ```ts theme={"theme":"css-variables"} hydrateMessages: async ({ clientData, incomingMessages }) => { return clientData?.hydrated ?? incomingMessages; }, ``` ### Testing continuation runs A continuation run is a new run picking up an existing session after the prior run ended — `chat.endRun`, waitpoint timeout, or `chat.requestUpgrade`. The contract differs from a fresh run in two ways: * `onChatStart` does **not** fire (it's once-per-chat — fires only on the chat's very first user message ever). * The boot payload arrives with `continuation: true` and no `message`. The SDK waits silently on `session.in` until the next user message arrives. Pass `continuation: true` to drive this path: ```ts theme={"theme":"css-variables"} const onChatStart = vi.fn(); const onTurnStart = vi.fn(); const agent = chat.agent({ id: "my-chat", onChatStart, onTurnStart, run: async ({ messages, signal }) => streamText({ model, messages, abortSignal: signal }), }); const harness = mockChatAgent(agent, { chatId: "test-continuation", // Auto-selects `mode: "continuation"` — boots with `trigger` omitted // and `continuation: true` in the wire payload, exactly as the server // produces it on continuation runs in production. continuation: true, previousRunId: "run_test_prior", }); try { // The SDK enters continuation-wait; sendMessage wakes it and drives turn 0. await harness.sendMessage({ id: "u1", role: "user", parts: [{ type: "text", text: "where were we?" 
}], }); await new Promise((r) => setTimeout(r, 20)); expect(onChatStart).not.toHaveBeenCalled(); expect(onTurnStart).toHaveBeenCalledTimes(1); } finally { await harness.close(); } ``` To simulate an **OOM-retry attempt** (also a continuation by contract — same `onChatStart` skip), bump `ctx.attempt.number`: ```ts theme={"theme":"css-variables"} const harness = mockChatAgent(agent, { chatId: "test-oom-retry", taskContext: { ctx: { attempt: { number: 2, startedAt: new Date(0), status: "EXECUTING" } }, }, }); await harness.sendMessage(/* ... */); expect(onChatStart).not.toHaveBeenCalled(); ``` ### Testing recovery boot `onRecoveryBoot` fires when the dead predecessor left state behind — a partial assistant on `session.out`, in-flight users on `session.in`, or both. The harness exposes two seeders to drive this state at boot time: * `harness.seedSessionOutPartial(message)` — pre-seed a trailing partial assistant. The next boot's replay surfaces it as `event.partialAssistant`. * `harness.seedSessionInTail(messages)` — pre-seed user messages on the input tail. The next boot's replay surfaces them as `event.inFlightUsers`. Combined with `continuation: true`, this drives the full recovery boot path: ```ts theme={"theme":"css-variables"} import { mockChatAgent } from "@trigger.dev/sdk/ai/test"; const onRecoveryBoot = vi.fn(async () => { // accept smart default }); const agent = chat.agent({ id: "my-chat", onRecoveryBoot, run: async ({ messages, signal }) => streamText({ model, messages, abortSignal: signal }), }); const harness = mockChatAgent(agent, { chatId: "test-recovery", continuation: true, previousRunId: "run_prior", }); // Predecessor was answering "write an essay" and got cut off mid-stream // after producing some text. Customer then sent a follow-up. harness.seedSessionOutPartial({ id: "a-orphan", role: "assistant", parts: [{ type: "text", text: "Espresso originated in..." 
}], }); harness.seedSessionInTail([ { id: "u-1", role: "user", parts: [{ type: "text", text: "Write an essay about espresso." }] }, { id: "u-2", role: "user", parts: [{ type: "text", text: "keep going" }] }, ]); await new Promise((r) => setTimeout(r, 50)); expect(onRecoveryBoot).toHaveBeenCalledTimes(1); const event = onRecoveryBoot.mock.calls[0]![0]; expect(event.partialAssistant?.id).toBe("a-orphan"); expect(event.inFlightUsers).toHaveLength(2); ``` Use `harness.seedSnapshot({ messages: [...] })` alongside these to model a continuation where settled history exists. See the [Recovery boot](/docs/ai-chat/patterns/recovery-boot) pattern for what each field means and what the smart default does with it. ## Testing against a database Most agents call into a database from `hydrateMessages` or `onTurnComplete` to load history and persist replies. You shouldn't pass database clients through `clientData`: that's wire data from the browser. Use **`locals` for dependency injection** instead. `locals` are task-scoped, server-side only, and independent of the wire format. The mock harness exposes a `setupLocals` callback that pre-seeds them before the agent's `run()` starts. ### Define a locals key for the dependency Create a single key per dependency, exported from your project: ```ts db.ts theme={"theme":"css-variables"} import { locals } from "@trigger.dev/sdk"; import { PrismaClient } from "@prisma/client"; export type Db = PrismaClient; export const dbKey = locals.create<Db>("db"); export function getDb(): Db { // Returns the seeded test instance if present, otherwise lazy-creates prod. return locals.get(dbKey) ?? 
locals.set(dbKey, new PrismaClient()); } ``` ### Use the dependency from agent hooks Hooks read from `locals` instead of constructing clients themselves: ```ts trigger/agent.ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { getDb } from "../db"; export const agent = chat.agent({ id: "agent", hydrateMessages: async ({ chatId }) => { const db = getDb(); const row = await db.chat.findUnique({ where: { id: chatId } }); return (row?.messages as UIMessage[]) ?? []; }, onTurnComplete: async ({ chatId, messages }) => { const db = getDb(); await db.chat.upsert({ where: { id: chatId }, create: { id: chatId, messages }, update: { messages }, }); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` ### Inject a test database in the harness `setupLocals` runs *before* the agent starts, so `getDb()` returns the test instance for every hook: ```ts agent.test.ts theme={"theme":"css-variables"} import { mockChatAgent } from "@trigger.dev/sdk/ai/test"; import { dbKey } from "./db"; import { agent } from "./trigger/agent"; const harness = mockChatAgent(agent, { chatId: "test-1", setupLocals: ({ set }) => { set(dbKey, testDb); // testDb = your testcontainers Prisma client, sqlite stub, etc. }, }); ``` ### Pick a backing database You still need to decide what `testDb` actually is: * **Testcontainers (recommended).** Spin up Postgres in Docker via `@internal/testcontainers` (or `testcontainers` directly), run migrations, hand the resulting `PrismaClient` to `set(dbKey, ...)`. Highest fidelity — catches schema drift, migration bugs, transaction issues. * **Embedded SQLite / PGlite.** Fast and no Docker, but a different SQL dialect from production. Fine for hooks that only do simple CRUD; risky for raw SQL or Postgres-specific features. * **In-memory fake.** Hand-rolled object with the same interface as your DB module. 
Fastest, lowest fidelity — works when you only care about whether the agent *called* the right method, not what the DB *did* with it. ### Drizzle, Kysely, etc. The pattern is the same — replace `PrismaClient` with your client class: ```ts db.ts theme={"theme":"css-variables"} import { drizzle } from "drizzle-orm/node-postgres"; import { Pool } from "pg"; export type Db = ReturnType; export const dbKey = locals.create("db"); export function getDb(): Db { return locals.get(dbKey) ?? locals.set( dbKey, drizzle(new Pool({ connectionString: process.env.DATABASE_URL })), ); } ``` The same `setupLocals` pattern works for any server-side dependency: feature flag clients, Stripe SDK, internal HTTP clients, Sentry. Anything you'd normally inject via constructor parameters in a class-based design. ## API reference ### mockChatAgent(agent, options?) ```ts theme={"theme":"css-variables"} function mockChatAgent( agent: { id: string }, options?: MockChatAgentOptions, ): MockChatAgentHarness; ``` #### MockChatAgentOptions | Option | Type | Default | Description | | --------------- | ----------------------------------------------------------------------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | | `chatId` | `string` | `"test-chat"` | Chat session id passed in every wire payload. | | `clientData` | `unknown` | `undefined` | Client-provided data forwarded to `run()` and every hook. | | `taskContext` | `MockTaskContextOptions` | `{}` | Overrides for the mock `TaskRunContext`. Use `ctx.attempt.number > 1` to simulate an OOM-retry attempt — the agent skips `onChatStart` (same as continuation runs). | | `preload` | `boolean` | `true` | Start in preload mode. 
When `false`, the first `sendMessage()` starts turn 0 directly without preload. Ignored when `mode` is set explicitly. | | `mode` | `"preload" \| "submit-message" \| "handover-prepare" \| "continuation"` | derived | Initial boot trigger. Defaults to `"preload"` (or `"submit-message"` when `preload: false`, or `"continuation"` when `continuation: true`). See [Boot modes](#boot-modes) below. | | `continuation` | `boolean` | `false` | Boot as a continuation run (a new run on an existing session). Auto-selects `mode: "continuation"` if `mode` is not set — boots with `trigger` omitted and `continuation: true` in the payload, exercising the SDK's continuation-wait branch. `onChatStart` does NOT fire on continuation runs. | | `previousRunId` | `string` | `undefined` | Set `payload.previousRunId` on the initial wire payload. Typically paired with `continuation: true`. | | `snapshot` | `ChatSnapshotV1` | `undefined` | Pre-seed the snapshot the agent reads at run boot (replaces the real S3 GET). Use to drive resume scenarios with prior history. See [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay) for the production snapshot model. | | `setupLocals` | `({ set }) => void \| Promise` | `undefined` | Callback invoked before `run()` starts. Use `set(key, value)` to inject server-side dependencies (DB clients, service stubs) that the agent reads via `locals.get()`. | ##### Boot modes The harness's initial wire payload depends on `mode`: | Mode | Wire payload | Use when | | -------------------- | --------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `"preload"` | `{ trigger: "preload" }` | Simulating a `transport.preload(chatId)` warm-up. 
Fires `onPreload`, waits for the first `sendMessage()`. | | `"submit-message"` | `{ trigger: "submit-message" }` | Skipping preload — `sendMessage()` drives turn 0 directly. | | `"continuation"` | `{ continuation: true }` (no `trigger`) | A new run picking up an existing session after the prior run ended (`chat.endRun`, waitpoint timeout, `chat.requestUpgrade`). Mirrors the boot payload the server's `ensureRunForSession` / `swapSessionRun` produce. The SDK enters its continuation-wait branch — `onPreload` and `onChatStart` do NOT fire. | | `"handover-prepare"` | `{ trigger: "handover-prepare" }` | Driving the `chat.handover` wait path. Use `sendHandover()` / `sendHandoverSkip()` to dispatch the handover signal. | #### MockChatAgentHarness | Member | Description | | ----------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `chatId` | The chat session id used by this harness. | | `sendMessage(message)` | Send a single user message (or tool-approval-responded assistant message). Slim wire: at most ONE message per record. Returns the chunks produced during the resulting turn. | | `sendRegenerate()` | Send a regenerate-message trigger (no body — slim wire). The agent trims trailing assistant messages from its accumulator and re-runs. | | `sendHeadStart({ messages })` | Drive the head-start path: sends `trigger: "handover-prepare"` with `headStartMessages` carrying the first-turn UIMessage history. Used only at the very first turn before any snapshot exists. | | `sendHandover({ partialAssistantMessage, isFinal?, messageId? })` | Dispatch a `handover` signal — only meaningful when started with `mode: "handover-prepare"`. The agent picks up partial assistant messages and continues the turn. 
| | `sendHandoverSkip()` | Dispatch a `handover-skip` signal — only meaningful when started with `mode: "handover-prepare"`. The agent exits cleanly without firing turn hooks. | | `sendAction(action)` | Route a custom action through `actionSchema` + `onAction`. | | `sendStop(message?)` | Fire a stop signal. Does not wait for the turn — the run's `signal.aborted` becomes `true`. | | `seedSnapshot(snapshot)` | Pre-seed the snapshot read for the next boot. Effective on the next run boot only. | | `seedSessionOutTail(chunks?)` | Pre-seed `session.out` chunks for the next boot's replay. Reduces to settled assistant turns. | | `seedSessionOutPartial(message?)` | Pre-seed a trailing partial assistant for the next boot's replay. Surfaces as `event.partialAssistant` in `onRecoveryBoot`. | | `seedSessionInTail(messages)` | Pre-seed user messages on `session.in` for the next boot. Surfaces as `event.inFlightUsers` in `onRecoveryBoot`. | | `getSnapshot()` | The most recently written snapshot, or `undefined` if no snapshot was written. | | `close()` | Send a `close` trigger, abort the signal, wait for `run()` to return. Always call at end of test. | | `allChunks` | Every `UIMessageChunk` emitted since the harness was created. | | `allRawChunks` | Every raw chunk emitted since creation, including control chunks (`trigger:turn-complete`, errors). | ### runInMockTaskContext `mockChatAgent` is a higher-level wrapper around `runInMockTaskContext`, re-exported from `@trigger.dev/sdk/ai/test` so you don't need to depend on `@trigger.dev/core` directly. 
Use it when you need to drive a non-chat task offline: ```ts theme={"theme":"css-variables"} import { runInMockTaskContext } from "@trigger.dev/sdk/ai/test"; await runInMockTaskContext( async ({ inputs, outputs, ctx }) => { setTimeout(() => { inputs.send("chat-messages", { messages: [], chatId: "c1" }); }, 0); await myTask.fns.run(payload, { ctx, signal: new AbortController().signal, }); expect(outputs.chunks("chat")).toContainEqual( expect.objectContaining({ type: "text-delta", delta: "hi" }), ); }, { ctx: { run: { id: "run_abc" } } }, ); ``` ## Limitations * **No network.** The mock task context replaces realtime streams, run metadata, lifecycle managers, and the runtime. Anything that bypasses these (raw `fetch`, direct DB clients) runs against the real network. * **Single agent per process.** The resource catalog is process-global; tests within a file are sequential by default. If you parallelize across files, vitest runs each file in its own worker, which avoids registry collisions. * **Time-sensitive hooks.** `onTurnComplete` runs *after* the `turn-complete` chunk is written, so `sendMessage()` resolves before that hook finishes. Add a brief `await new Promise((r) => setTimeout(r, 20))` if you need to assert on hook side-effects. * **No real LLM.** The harness does not call providers — you must inject `MockLanguageModelV3` (or another mock) yourself. # Tools Source: https://trigger.dev/docs/ai-chat/tools Declare tools on chat.agent so toModelOutput survives across turns, get them back typed in run(), and type your messages from them. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. 
`chat.agent` doesn't call the model for you. Your tools still go to [`streamText`](https://sdk.vercel.ai/docs/ai-sdk-core/tools-and-tool-calling) inside `run()`. But you should **also declare them on the agent config**: ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { streamText, stepCountIs, tool } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { z } from "zod"; const tools = { searchDocs: tool({ description: "Search the docs.", inputSchema: z.object({ query: z.string() }), execute: async ({ query }) => searchIndex(query), }), }; export const myChat = chat.agent({ id: "my-chat", tools, // ← declare here run: async ({ messages, tools, signal }) => streamText({ ...chat.toStreamTextOptions({ tools }), // ← the same set, handed back on the payload model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }), }); ``` Declaring `tools` on the config does two things you can't get by passing them to `streamText` alone: * It threads your tools into the SDK's internal message conversion, so each tool's [`toModelOutput`](https://sdk.vercel.ai/docs/ai-sdk-core/tools-and-tool-calling#tomodeloutput) is re-applied when prior-turn history is re-converted (see [`toModelOutput` across turns](#tomodeloutput-across-turns)). * It hands the resolved set back, typed, on the `run()` payload as `tools`, so you declare them once and don't re-import the map. ## Where tools go There are three places a tool set shows up. Declare once, reuse: | Surface | What it's for | | ------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | `chat.agent({ tools })` | Re-applies `toModelOutput` on prior-turn history; hands the set back typed on the `run()` payload. 
| | `chat.toStreamTextOptions({ tools })` | Detects which tool calls need [HITL approval](/docs/ai-chat/patterns/human-in-the-loop) (`needsApproval`) and merges any auto-injected [skill](/docs/ai-chat/patterns/skills) tools. | | `streamText({ tools })` | What the model actually calls. `chat.toStreamTextOptions({ tools })` already sets this, so spread it instead of passing `tools` twice. | The canonical pattern: declare `tools` on the config, read them back from the `run()` payload, and pass that to `chat.toStreamTextOptions({ tools })`. One declaration flows everywhere. Conversion only reads each tool's `inputSchema` and `toModelOutput`, never `execute`. If you keep heavy `execute` dependencies out of a module (for bundle reasons), you can declare a lightweight schema-only tool map on the config and add the executes where you call `streamText`. ## `toModelOutput` across turns `toModelOutput` transforms a tool's result before it enters the model's context, turning raw image bytes into an image content part, or compressing a long sub-agent transcript into a one-line summary. The full result still streams to the frontend; the model only sees the transformed version. The catch is multi-turn. After each turn, `chat.agent` persists the conversation as `UIMessage[]` and re-converts it to model messages at the start of the next turn. That conversion needs your tools to find each `toModelOutput`. **If you only pass tools to `streamText` and not to the config, the transform runs on turn 1 but is skipped on every later turn.** The raw output gets stringified back into the prompt instead, and the model loses the transformed view. Declaring `tools` on the config fixes this: the SDK threads them into the conversion, so `toModelOutput` is re-applied on every turn. 
```ts theme={"theme":"css-variables"} const tools = { renderChart: tool({ description: "Render a chart and return it as an image.", inputSchema: z.object({ spec: z.string() }), execute: async ({ spec }) => renderToPng(spec), // raw bytes // The model should see an image part, not base64 bytes: toModelOutput: ({ output }) => ({ type: "content", value: [{ type: "media", mediaType: "image/png", data: output.base64 }], }), }), }; export const chartChat = chat.agent({ id: "chart-chat", tools, // ← without this, the image is "remembered" on turn 1 and gone from turn 2 run: async ({ messages, tools, signal }) => streamText({ ...chat.toStreamTextOptions({ tools }), model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }), }); ``` ## Static or per-turn tools `tools` accepts either a static `ToolSet` or a function that returns one per turn, for tools that depend on the user, a feature flag, or anything in the turn context: ```ts theme={"theme":"css-variables"} export const myChat = chat .withClientData({ schema: z.object({ userId: z.string(), plan: z.string() }) }) .agent({ id: "my-chat", tools: ({ clientData }) => ({ searchDocs, ...(clientData?.plan === "pro" ? { deepResearch } : {}), }), run: async ({ messages, tools, signal }) => streamText({ ...chat.toStreamTextOptions({ tools }), model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal, stopWhen: stepCountIs(15), }), }); ``` The function receives a `ResolveToolsEvent` and runs once per turn (after `clientData` is parsed): | Field | Type | Description | | -------------- | ------------- | ------------------------------------------------ | | `chatId` | `string` | The chat session ID. | | `turn` | `number` | The current turn number (0-indexed). | | `continuation` | `boolean` | Whether this run is continuing an existing chat. | | `clientData` | `TClientData` | Parsed client data from the frontend. | The resolved set is what lands on the `run()` payload's `tools`. 
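
The per-turn resolver described above is an ordinary function of the event, so its gating logic can be exercised in isolation. The following is a minimal sketch, not SDK code: the local `ResolveToolsEvent` type only mirrors the fields in the table, and `searchDocs`, `deepResearch`, and `resolveTools` are hypothetical placeholders.

```ts
// Local stand-in for the SDK's event type; only the field names come from the table above.
type ClientData = { userId: string; plan: string };

type ResolveToolsEvent = {
  chatId: string;
  turn: number; // 0-indexed
  continuation: boolean;
  clientData?: ClientData;
};

// Placeholder tools (assumption). In a real agent these are built with `tool()` from "ai".
const searchDocs = { description: "Search the docs." };
const deepResearch = { description: "Run a longer research pass." };

// Every chat gets `searchDocs`; only "pro" plans also get `deepResearch`.
function resolveTools(event: ResolveToolsEvent) {
  return {
    searchDocs,
    ...(event.clientData?.plan === "pro" ? { deepResearch } : {}),
  };
}

const freeTools = resolveTools({
  chatId: "c1",
  turn: 0,
  continuation: false,
  clientData: { userId: "u1", plan: "free" },
});
const proTools = resolveTools({
  chatId: "c2",
  turn: 0,
  continuation: false,
  clientData: { userId: "u2", plan: "pro" },
});

console.log(Object.keys(freeTools)); // [ 'searchDocs' ]
console.log(Object.keys(proTools)); // [ 'searchDocs', 'deepResearch' ]
```

Keeping the resolver free of side effects makes it cheap to run once per turn and easy to unit test without booting an agent.
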
## Typed tools in `run()`

The `run()` payload's `tools` is typed to whatever you declared, so you can pass it straight through without re-importing the map:

```ts theme={"theme":"css-variables"}
run: async ({ messages, tools, signal }) => {
  // `tools` is typed as your tool set, not a broad `ToolSet`
  return streamText({
    ...chat.toStreamTextOptions({ tools }),
    model: anthropic("claude-sonnet-4-5"),
    messages,
    abortSignal: signal,
  });
};
```

When no `tools` are declared, the payload's `tools` is an empty object and behaves exactly as before, so declaring tools is fully opt-in.

## Typing messages from your tools

To get typed tool parts (`tool-${name}` with typed input/output) on your `UIMessage`, in hooks like `onTurnComplete` and on the frontend, derive the message type from your tool set with `InferChatUIMessageFromTools`:

```ts theme={"theme":"css-variables"}
import type { InferChatUIMessageFromTools } from "@trigger.dev/sdk/ai";

const tools = { searchDocs, renderChart };

export type ChatUiMessage = InferChatUIMessageFromTools<typeof tools>;
```

This is shorthand for `UIMessage<unknown, UIDataTypes, InferUITools<typeof tools>>`. Pin it on the agent with [`chat.withUIMessage()`](/docs/ai-chat/types#custom-uimessage-with-chat-withuimessage) and reuse it on the client. If you also have custom `data-*` parts, build the `UIMessage` generic directly instead. See [Types](/docs/ai-chat/types).

## Skills

[Agent skills](/docs/ai-chat/patterns/skills) are auto-injected as tools (`loadSkill`, `readFile`, `bash`) by `chat.toStreamTextOptions()`. They're separate from your config `tools`: declare your own tools on the config (so their `toModelOutput` survives across turns), and let `toStreamTextOptions` merge the skill tools on top at call time. Skill tools don't define `toModelOutput`, so they don't need to be on the config.

## Manual turn loops (`chat.customAgent`)

The `tools` config option belongs to the managed [`chat.agent`](/docs/ai-chat/backend#chat-agent).
When you drive the loop yourself with [`chat.customAgent`](/docs/ai-chat/custom-agents#chat-customagent) (or build messages from `chat.history`), you own the conversion, so pass your tools to `convertToModelMessages` directly to get the same cross-turn `toModelOutput` behavior: ```ts theme={"theme":"css-variables"} import { convertToModelMessages, streamText } from "ai"; // Inside your loop, with `tools` in scope: const uiMessages = chat.history.all(); const messages = await convertToModelMessages(uiMessages, { tools, ignoreIncompleteToolCalls: true, }); return streamText({ model: anthropic("claude-sonnet-4-5"), messages, tools }); ``` ## Learn more * [Human-in-the-loop](/docs/ai-chat/patterns/human-in-the-loop): tools that pause for approval. * [Sub-agents](/docs/ai-chat/patterns/sub-agents): tools that delegate to other agents and compress their output with `toModelOutput`. * [Tool result auditing](/docs/ai-chat/patterns/tool-result-auditing): logging tool results as they resolve. * [AI SDK: Tools and tool calling](https://sdk.vercel.ai/docs/ai-sdk-core/tools-and-tool-calling). # Types Source: https://trigger.dev/docs/ai-chat/types TypeScript types for AI Agents, UI messages, and the frontend transport. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. TypeScript patterns for [AI Chat](/docs/ai-chat/overview). This page covers how to pin a custom AI SDK [`UIMessage`](https://sdk.vercel.ai/docs/reference/ai-sdk-core/ui-message) subtype with `chat.withUIMessage`, fix a typed `clientData` schema with `chat.withClientData`, chain builder-level hooks, and align types on the client. 
## Custom `UIMessage` with `chat.withUIMessage`

`chat.agent()` types the wire payload with the base AI SDK `UIMessage`. That is enough for many apps.

When you add **custom `data-*` parts** (via `chat.stream` / `writer`) or a **typed tool map** (e.g. `InferUITools`), you want a **narrower** `UIMessage` generic so that:

* `onTurnStart`, `onTurnComplete`, and similar hooks expose correctly typed `uiMessages`
* Stream options like `sendReasoning` align with your message shape
* The frontend can treat `useChat` messages as the same subtype end-to-end

`chat.withUIMessage<YourUIMessage>(config?)` returns a [ChatBuilder](#chatbuilder) where `.agent(...)` accepts the **same options as** [`chat.agent()`](/docs/ai-chat/backend#chat-agent) but fixes `YourUIMessage` as the UI message type for that chat agent.

### Defining a `UIMessage` subtype

Build the type from AI SDK helpers and your tools object:

```ts theme={"theme":"css-variables"}
import type { InferUITools, UIDataTypes, UIMessage } from "ai";
import { tool } from "ai";
import { z } from "zod";

const myTools = {
  lookup: tool({
    description: "Look up a record",
    inputSchema: z.object({ id: z.string() }),
    execute: async ({ id }) => ({ id, label: "example" }),
  }),
};

// Tool part types inferred from the tool map
type MyChatTools = InferUITools<typeof myTools>;

// Built-in data types plus one custom `data-turn-status` part
type MyChatDataTypes = UIDataTypes & {
  "turn-status": { status: "preparing" | "streaming" | "done" };
};

// UIMessage<METADATA, DATA_PARTS, TOOLS>
export type MyChatUIMessage = UIMessage<unknown, MyChatDataTypes, MyChatTools>;
```

If you don't need custom `data-*` parts, [`InferChatUIMessageFromTools`](/docs/ai-chat/tools#typing-messages-from-your-tools) from `@trigger.dev/sdk/ai` collapses the tools half into one line (it's shorthand for `UIMessage<unknown, UIDataTypes, InferUITools<typeof tools>>`).

Task-backed tools should use AI SDK [`tool()`](https://sdk.vercel.ai/docs/ai-sdk-core/tools-and-tool-calling) with `execute: ai.toolExecute(schemaTask)` where needed; see [Task-backed AI tools](/docs/tasks/schemaTask#task-backed-ai-tools).

### Backend: `chat.withUIMessage(...).agent(...)`

Call `withUIMessage` **once**, then chain `.agent({ ...
})` instead of `chat.agent({ ... })`. You can also chain `.withClientData()` and hook methods before `.agent()`:

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs, tool } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";
import type { MyChatUIMessage } from "./my-chat-types";

const myTools = {
  lookup: tool({
    description: "Look up a record",
    inputSchema: z.object({ id: z.string() }),
    execute: async ({ id }) => ({ id, label: "example" }),
  }),
};

export const myChat = chat
  .withUIMessage<MyChatUIMessage>({
    streamOptions: {
      sendReasoning: true,
      onError: (error) =>
        error instanceof Error ? error.message : "Something went wrong.",
    },
  })
  .withClientData({
    schema: z.object({ userId: z.string() }),
  })
  .agent({
    id: "my-chat",
    tools: myTools,
    onTurnStart: async ({ uiMessages, writer }) => {
      // uiMessages is MyChatUIMessage[], so custom data parts are typed
      writer.write({
        type: "data-turn-status",
        data: { status: "preparing" },
      });
    },
    run: async ({ messages, tools, signal }) => {
      // `tools` is myTools, typed, handed back on the payload
      return streamText({
        model: anthropic("claude-sonnet-4-5"),
        messages,
        tools,
        abortSignal: signal,
        stopWhen: stepCountIs(15),
      });
    },
  });
```

### Default stream options

The optional `streamOptions` object becomes the **default** [`uiMessageStreamOptions`](/docs/ai-chat/reference#chatagentoptions) for `toUIMessageStream()`. If you also set `uiMessageStreamOptions` on the inner `.agent({ ... })`, the two objects are **shallow-merged**: keys on the **agent** win on conflicts. Per-turn overrides via [`chat.setUIMessageStreamOptions()`](/docs/ai-chat/backend#stream-options) still apply on top.
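
The precedence described above (builder defaults, then agent options, then per-turn overrides) can be pictured with plain object spreads. This is a sketch under the assumption that a shallow spread models the merge; `resolveStreamOptions` and the local `StreamOptions` type are illustrative names, not SDK API.

```ts
// Illustrative option shape (assumption); the real type comes from the AI SDK.
type StreamOptions = {
  sendReasoning?: boolean;
  sendSources?: boolean;
};

// Later objects win key-by-key. Nested objects are replaced, not deep-merged.
function resolveStreamOptions(
  builderDefaults: StreamOptions,
  agentOptions: StreamOptions,
  perTurn: StreamOptions = {},
): StreamOptions {
  return { ...builderDefaults, ...agentOptions, ...perTurn };
}

const builderDefaults: StreamOptions = { sendReasoning: true, sendSources: true };
const agentOptions: StreamOptions = { sendReasoning: false };

// Agent key wins over the builder default; untouched keys fall through.
const merged = resolveStreamOptions(builderDefaults, agentOptions);
console.log(merged); // { sendReasoning: false, sendSources: true }

// A per-turn override sits on top of both.
const turnMerged = resolveStreamOptions(builderDefaults, agentOptions, {
  sendSources: false,
});
console.log(turnMerged); // { sendReasoning: false, sendSources: false }
```

Because the merge is shallow, setting a key on the agent replaces the builder's value for that key entirely, which matters for function-valued options such as `onError`.
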
### Frontend: `InferChatUIMessage`

Import the helper type and pass it to `useChat` so `messages` and render logic match the backend:

```tsx theme={"theme":"css-variables"}
import { useChat } from "@ai-sdk/react";
import { useTriggerChatTransport, type InferChatUIMessage } from "@trigger.dev/sdk/chat/react";
import type { myChat } from "./myChat";

type Msg = InferChatUIMessage<typeof myChat>;

export function Chat() {
  const transport = useTriggerChatTransport({
    task: "my-chat",
    accessToken: ({ chatId }) => mintChatAccessToken(chatId),
    startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }),
  });

  const { messages } = useChat<Msg>({ transport });

  return messages.map((m) => (
    <div key={m.id}>{/* m.parts narrowed for your UIMessage subtype */}</div>
  ));
}
```

You can also import `InferChatUIMessage` from `@trigger.dev/sdk/ai` in non-React modules.

## Typed client data with `chat.withClientData`

`chat.withClientData({ schema })` returns a [ChatBuilder](#chatbuilder) that fixes the client data schema. All hooks and `run` receive typed `clientData` without needing `clientDataSchema` in `.agent()` options.

```ts theme={"theme":"css-variables"}
import { chat } from "@trigger.dev/sdk/ai";
import { streamText, stepCountIs } from "ai";
import { z } from "zod";

export const myChat = chat
  .withClientData({
    schema: z.object({ userId: z.string(), model: z.string().optional() }),
  })
  .agent({
    id: "my-chat",
    onPreload: async ({ clientData }) => {
      // clientData is typed as { userId: string; model?: string }
      await initUser(clientData.userId);
    },
    run: async ({ messages, clientData, signal }) => {
      return streamText({
        model: getModel(clientData.model),
        messages,
        abortSignal: signal,
        stopWhen: stepCountIs(15),
      });
    },
  });
```

## ChatBuilder

Both `chat.withUIMessage()` and `chat.withClientData()` return a **ChatBuilder**, a chainable object that accumulates configuration before creating the agent with `.agent()`.

Builder methods can be chained in any order:

```ts theme={"theme":"css-variables"}
export const myChat = chat
  .withUIMessage({
    streamOptions: { sendReasoning: true },
  })
  .withClientData({
    schema: z.object({ userId: z.string() }),
  })
  .onChatSuspend(async ({ ctx }) => {
    await disposeCodeSandbox(ctx.run.id);
  })
  .onChatResume(async ({ ctx }) => {
    warmCache(ctx.run.id);
  })
  .agent({
    id: "my-chat",
    run: async ({ messages, signal }) => {
      return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal });
    },
  });
```

### Builder-level hooks

All [lifecycle hooks](/docs/ai-chat/lifecycle-hooks) can be set on the builder: `onPreload`, `onChatStart`, `onTurnStart`, `onBeforeTurnComplete`, `onTurnComplete`, `onCompacted`, `onChatSuspend`, `onChatResume`.

Builder hooks and task-level hooks **coexist**.
When both are defined for the same event, the builder hook runs first, then the task hook: ```ts theme={"theme":"css-variables"} chat .withUIMessage() .onPreload(async (event) => { // Runs first — shared setup across tasks using this builder await initializeSharedState(event.chatId); }) .agent({ id: "my-chat", onPreload: async (event) => { // Runs second — task-specific logic await createChatRecord(event.chatId); }, run: async ({ messages, signal }) => { return streamText({ model: anthropic("claude-sonnet-4-5"), messages, abortSignal: signal }); }, }); ``` Set types first (`.withUIMessage()`, `.withClientData()`), then hooks. Hook parameters are typed based on the builder's current generics — so hooks registered after `.withClientData()` get typed `clientData`. ### When plain `chat.agent()` is enough If you do not rely on custom `UIMessage` generics (only default text, reasoning, and built-in tool UI types), **`chat.agent()` alone is fine** — no need for `withUIMessage`. ## See also * [Backend — `chat.agent()`](/docs/ai-chat/backend#chat-agent) * [Lifecycle hooks](/docs/ai-chat/lifecycle-hooks) * [Frontend — transport & `useChat`](/docs/ai-chat/frontend) * [API reference — `chat.withUIMessage`](/docs/ai-chat/reference#chat-withuimessage) * [API reference — `chat.withClientData`](/docs/ai-chat/reference#chat-withclientdata) * [Task-backed AI tools — `ai.toolExecute`](/docs/tasks/schemaTask#task-backed-ai-tools) # Upgrade Guide: prerelease → Sessions-as-run-manager Source: https://trigger.dev/docs/ai-chat/upgrade-guide Migrating chat.agent code from the prerelease API to the Sessions-as-run-manager release. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. 
See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. This guide is for customers who tried `chat.agent` during the prerelease period. The public surface of `chat.agent({...})`, `useTriggerChatTransport`, `AgentChat`, `chat.defer`, and `chat.history` is largely unchanged — but the transport's auth callbacks and the server-side helpers that feed them were reshaped, so most prerelease apps need a small wiring update. ## TL;DR ```ts before.ts theme={"theme":"css-variables"} // Single accessToken callback, dispatches on purpose accessToken: async ({ chatId, purpose }) => { if (purpose === "trigger") { return chat.createAccessToken("my-chat"); } // purpose === "preload" — same call, same trigger token return chat.createAccessToken("my-chat"); }; ``` ```ts after.ts theme={"theme":"css-variables"} // Two callbacks: pure refresh + server action that creates the session accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), ``` What changed: * `accessToken` is now a **pure session-PAT mint** — called only on 401/403 to refresh. It must return a token scoped to the session, not a `trigger:tasks` JWT. * `startSession` is a **new callback** that wraps a server action calling `chat.createStartSessionAction(taskId)`. The transport invokes it on `transport.preload(chatId)` and lazily on the first `sendMessage` for any chatId without a cached PAT. * `ChatSession` persistable state drops `runId` — store only `{publicAccessToken, lastEventId?}`. * Per-call options on `transport.preload(chatId, ...)` are gone. Trigger config (machine, idleTimeoutInSeconds, tags, queue, maxAttempts) lives server-side in `chat.createStartSessionAction(taskId, options)`. The architectural shift is that `chat.agent` no longer rolls its own per-run streams. 
It runs on top of a durable **Session** row that owns its current run, persists across run lifecycles, and orchestrates upgrades server-side. The customer-facing surface is similar; the wire path beneath it changed completely. ## Step 1: Replace your access-token server action with two server actions The old pattern was a single helper that minted a trigger token: ```ts app/actions.ts (before) theme={"theme":"css-variables"} "use server"; import { chat } from "@trigger.dev/sdk/ai"; import type { myChat } from "@/trigger/chat"; export const getChatToken = () => chat.createAccessToken("my-chat"); ``` Replace with two helpers — one for session creation, one for PAT refresh: ```ts app/actions.ts (after) theme={"theme":"css-variables"} "use server"; import { auth } from "@trigger.dev/sdk"; import { chat } from "@trigger.dev/sdk/ai"; // Server-side wrapper for session creation. Idempotent on (env, chatId). // The customer's server is the only entry point that creates Session rows; // the browser never holds a `trigger:tasks` JWT. export const startChatSession = chat.createStartSessionAction("my-chat"); // Pure session-PAT mint for the transport's 401/403 retry path. export async function mintChatAccessToken(chatId: string) { return auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId }, }, expirationTime: "1h", }); } ``` `chat.createStartSessionAction(taskId)` returns a server action that: 1. Creates the Session row for `chatId` (idempotent on the `(env, externalId)` unique pair). 2. Triggers the agent task's first run with `basePayload: {messages: [], trigger: "preload"}` defaults plus any overrides you pass. 3. Returns `{sessionId, runId, publicAccessToken}` to the browser. 
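
The idempotency in step 1 amounts to a get-or-create keyed on the `(env, externalId)` pair. The following in-memory model is purely illustrative and is not how the SDK or webapp stores sessions; `InMemorySessions`, `getOrCreate`, and the generated IDs are hypothetical.

```ts
type SessionRow = { sessionId: string; env: string; externalId: string };

// In-memory stand-in for a table with a unique constraint on (env, externalId).
class InMemorySessions {
  private rows = new Map<string, SessionRow>();
  private nextId = 1;

  // Returns the existing row for the pair, or creates one on first call.
  getOrCreate(env: string, externalId: string): SessionRow {
    const key = JSON.stringify([env, externalId]);
    const existing = this.rows.get(key);
    if (existing) return existing;

    const row: SessionRow = {
      sessionId: `session_${this.nextId++}`,
      env,
      externalId,
    };
    this.rows.set(key, row);
    return row;
  }
}

const sessions = new InMemorySessions();
const firstCall = sessions.getOrCreate("prod", "chat-123");
const secondCall = sessions.getOrCreate("prod", "chat-123"); // same row, no duplicate
const otherEnv = sessions.getOrCreate("staging", "chat-123"); // different env, new row

console.log(firstCall.sessionId === secondCall.sessionId); // true
console.log(firstCall.sessionId === otherEnv.sessionId); // false
```

The practical consequence is that a `startSession` callback can be invoked more than once for the same `chatId` without producing duplicate Session rows.
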
## Step 2: Update the transport wiring The transport now takes two callbacks instead of one: ```tsx app/components/chat.tsx (after) theme={"theme":"css-variables"} "use client"; import { useChat } from "@ai-sdk/react"; import { useTriggerChatTransport } from "@trigger.dev/sdk/chat/react"; import type { myChat } from "@/trigger/chat"; import { mintChatAccessToken, startChatSession } from "@/app/actions"; export function Chat() { const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), }); const { messages, sendMessage, status } = useChat({ transport }); // ... } ``` The transport calls them in two distinct flows: | Trigger | Callback fired | | ---------------------------------------------------------------- | --------------------------- | | `transport.preload(chatId)` | `startSession` | | First `sendMessage` for a chatId with no cached PAT | `startSession` (auto) | | Any 401/403 from `.in/append`, `.out` SSE, or `end-and-continue` | `accessToken` | | Page hydrates with `sessions: { [chatId]: ... }` | Neither (uses hydrated PAT) | `startSession` is deduped via an in-flight promise — concurrent `preload` + `sendMessage` calls converge to one server action invocation. ## Step 3: Drop transport-level trigger config The prerelease transport accepted `triggerConfig`, `triggerOptions`, and per-call options on `preload`. All of that moved server-side: ```ts before theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ task: "my-chat", accessToken: getChatToken, triggerConfig: { basePayload: { /* ... */ } }, triggerOptions: { tags: [...], machine: "small-1x", maxAttempts: 3 }, }); transport.preload(chatId, { idleTimeoutInSeconds: 60, metadata: { ... 
} }); ``` ```ts after theme={"theme":"css-variables"} // Trigger config now lives in chat.createStartSessionAction export const startChatSession = chat.createStartSessionAction("my-chat", { triggerConfig: { machine: "small-1x", maxAttempts: 3, tags: ["my-tag"], idleTimeoutInSeconds: 60, }, }); // Browser side const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), }); transport.preload(chatId); // no second arg ``` For metadata that varies per chat, use `clientData` on the transport (see the next step) — it's typed and threaded through `startSession` automatically. ## Step 4: Use `clientData` for typed payload metadata If your agent uses `withClientData({schema})`, the transport's `clientData` option is now the canonical place to set it. The same value: * Is passed to your `startSession` callback as `params.clientData`, where you forward it into `chat.createStartSessionAction`'s `triggerConfig.basePayload.metadata`. The agent's first run sees it in `payload.metadata` (visible to `onPreload` / `onChatStart`). * Merges into per-turn `metadata` on every `.in/append` chunk (visible to `onTurnStart` / inside `run` via `turn.clientData`). ```tsx theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ task: "my-chat", accessToken: ({ chatId }) => mintChatAccessToken(chatId), startSession: ({ chatId, clientData }) => startChatSession({ chatId, clientData }), clientData: { userId: currentUser.id, plan: currentUser.plan, }, }); ``` The `clientData` value is live-updated when the option changes (the hook calls `setClientData` under the hood), so dynamic values work without reconstructing the transport. Server-side authorization can still override or augment the browser-claimed `clientData` inside `startSession` — never trust the browser's identity claim. 
A typical pattern: the server action looks up the user from the request session, then merges the trusted server fields on top of `params.clientData`. ## Step 5: Update your `ChatSession` persistence If you persist session state across page loads, drop the `runId` field: ```ts before theme={"theme":"css-variables"} type ChatSession = { runId: string; publicAccessToken: string; lastEventId?: string; }; ``` ```ts after theme={"theme":"css-variables"} type ChatSession = { publicAccessToken: string; lastEventId?: string; }; ``` If your DB has a `runId` column, you can drop it (the transport doesn't read it) or keep it for telemetry. The current run ID lives on the Session row server-side now. Hydration on page reload is unchanged: ```tsx theme={"theme":"css-variables"} const transport = useTriggerChatTransport({ // ... sessions: persistedSession ? { [chatId]: persistedSession } : {}, }); ``` ## `chat.requestUpgrade()`: same call, faster handoff Calling `chat.requestUpgrade()` inside `onTurnStart` / `onValidateMessages` still ends the current run so the next message starts on the latest version. What changed is the mechanism: * **Before:** the agent emitted a `trigger:upgrade-required` chunk on `.out`; the transport consumed it browser-side and triggered a new run. * **After:** the agent calls `endAndContinueSession` server-to-server; the webapp triggers a new run and atomically swaps `Session.currentRunId` via optimistic locking. The browser's existing SSE subscription keeps receiving chunks across the swap — no transport-side bookkeeping. The new run is recorded in a `SessionRun` audit row with `reason: "upgrade"` for dashboard provenance. 
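
Optimistic locking, in general terms, means a write only applies if the row is still in the state the writer last read. The sketch below is a generic in-memory illustration of that idea, not the webapp's implementation; the `version` field and `swapCurrentRun` are assumptions made for the example.

```ts
type Session = { id: string; currentRunId: string; version: number };

// Applies the swap only if nobody else has written since `expectedVersion` was read.
// Returns true when the swap was applied, false when the caller's view was stale.
function swapCurrentRun(
  session: Session,
  expectedVersion: number,
  newRunId: string,
): boolean {
  if (session.version !== expectedVersion) return false;
  session.currentRunId = newRunId;
  session.version += 1;
  return true;
}

const session: Session = { id: "session_1", currentRunId: "run_old", version: 1 };

// Two writers read the same version, then both try to swap.
const seenVersion = session.version;
const firstSwap = swapCurrentRun(session, seenVersion, "run_new_a");
const secondSwap = swapCurrentRun(session, seenVersion, "run_new_b");

console.log(firstSwap, secondSwap, session.currentRunId); // true false run_new_a
```

The losing writer learns that its view was stale and can re-read before retrying, so two concurrent upgrade requests cannot both install their own run as current.
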
## Hitting raw URLs If your code talks to the realtime API directly instead of going through the SDK, the URL shapes changed: | Before | After | | ----------------------------------------------------------------- | ------------------------------------------------------------------------------------ | | `GET /realtime/v1/streams/{runId}/chat` | `GET /realtime/v1/sessions/{chatId}/out` | | `POST /realtime/v1/streams/{runId}/{target}/chat-messages/append` | `POST /realtime/v1/sessions/{chatId}/in/append` (body: `{kind: "message", payload}`) | | `POST /realtime/v1/streams/{runId}/{target}/chat-stop/append` | `POST /realtime/v1/sessions/{chatId}/in/append` (body: `{kind: "stop"}`) | The session-scoped PAT (`read:sessions:{chatId} + write:sessions:{chatId}`) authorizes both the externalId form (`/sessions/my-chat-id/...`) and the friendlyId form (`/sessions/session_abc.../...`). The transport always uses the externalId form; the friendlyId form is available for dashboard tooling and direct API consumers. ## What didn't change * `chat.agent({...})` definition — `id`, `idleTimeoutInSeconds`, `clientDataSchema`, `actionSchema`, `hydrateMessages`, `onPreload`, `onChatStart`, `onValidateMessages`, `onTurnStart`, `onTurnComplete`, `onChatSuspend`, `run`. All callbacks have the same signature and fire at the same lifecycle points. * `onAction` is still defined the same way, but its semantics changed in the [May 6 prerelease](/docs/ai-chat/changelog) — actions are no longer turns, and `onAction` returning a `StreamTextResult` produces a model response. * `chat.customAgent({...})` and the `chat.createSession(payload, ...)` helper for building a session loop manually inside a custom agent. * `chat.defer` (deferred work) and `chat.history` (imperative history mutations from inside `onAction`). * `AgentChat` (server-side chat client) — `agent`, `id`, `clientData`, `session`, `onTriggered`, `onTurnComplete`, `sendMessage`, `text()`. 
* `useTriggerChatTransport` React semantics (created once, kept in a ref, callbacks updated under the hood). * Multi-tab coordination (`multiTab: true`), [pending messages / steering](/docs/ai-chat/pending-messages), [background injection](/docs/ai-chat/background-injection), [compaction](/docs/ai-chat/compaction). * Per-turn `metadata` flowing through `sendMessage({ text }, { metadata })` to `turn.metadata` server-side. ## Verifying the migration After updating, the smoke check is the same as before: send a message, confirm the assistant streams a response, reload mid-stream, confirm resume. A few new things worth verifying once you've cut over: * **Eager preload.** Click the button (or call `transport.preload(id)` programmatically) — your `startSession` callback should fire and a Session row + first run should be created before you send a message. * **Idle-timeout continuation.** Wait past the agent's `idleTimeoutInSeconds` so the run exits, then send another message — the transport's `.in/append` should boot a new run on the same Session, with a `SessionRun` row of `reason: "continuation"`. * **PAT refresh.** Force a stale PAT in your DB (corrupt the signature) and reload — the first request should 401, your `accessToken` callback should fire, and the retry should succeed. If any of those misfire, check that: * Your `accessToken` callback returns a token minted via `auth.createPublicToken({ scopes: { read: { sessions: chatId }, write: { sessions: chatId } } })`, **not** `chat.createAccessToken` or `auth.createTriggerPublicToken`. The transport rejects trigger tokens now. * Your `startSession` callback returns `{publicAccessToken: string}` — the result of `chat.createStartSessionAction(taskId)({chatId, ...})` already has this shape. * You haven't left a stale `getStartToken` option on the transport; it's not part of `TriggerChatTransportOptions` anymore. ## v4.5 wire format change A second migration lands on top of the Sessions release. 
v4.5 removes the full-history wire payload — clients now ship at most one new `UIMessage` per `.in/append`, and the agent rebuilds prior history from a durable JSON snapshot in object storage plus a replay of the `session.out` tail. If you use the built-in `TriggerChatTransport` / `AgentChat` and don't reach into the wire shape directly, **most apps need no changes** — the change is below the customer-facing surface. Customers who built custom transports, hit `/realtime/v1/sessions/{id}/in/append` directly, or rely on specific behaviors of `hydrateMessages` / `onChatStart` should read this section. ### Why the change Long chats with heavy tool results were hitting the realtime API's 512 KiB body cap on `/in/append` once the accumulated `UIMessage[]` history (which the wire shipped in full on every send) crossed the limit. The 413 surfaced as a CORS error in browsers and stalled chats around turn 10–30 with tool use. The wire is now **delta-only**: each `.in/append` carries at most one new `UIMessage`. The agent rebuilds prior history at run boot. The 512 KiB ceiling stops being pressure — typical payloads are a few KB regardless of chat length. ### Object-store configuration Snapshot read/write uses Trigger.dev's existing object-store infrastructure — the same presigned-URL routes used for large payloads. Set the standard `OBJECT_STORE_*` env vars on your webapp deployment if you haven't already; MinIO and S3-compatible stores work via `OBJECT_STORE_DEFAULT_PROTOCOL`. | Env var | Purpose | | -------------------------------- | ---------------------------------- | | `OBJECT_STORE_BASE_URL` | Endpoint URL (S3, MinIO, R2, etc.) | | `OBJECT_STORE_ACCESS_KEY_ID` | Access key | | `OBJECT_STORE_SECRET_ACCESS_KEY` | Secret key | | `OBJECT_STORE_DEFAULT_PROTOCOL` | `s3` (default), `minio`, etc. | Snapshots are written under `packets/{projectRef}/{envSlug}/sessions/{sessionId}/snapshot.json`. 
Each snapshot is small (typically tens of KB) and overwritten every turn — no append-only growth. **No object store + no `hydrateMessages` = conversations don't survive run boundaries.** With neither piece of state, a continuation boots empty and the agent can't reconstruct prior turns. Either configure an object store or register `hydrateMessages`. The runtime logs a warning at agent registration time when both are missing. ### Custom transports If you've built your own transport (Slack bot, CLI, native app) against the [Client Protocol](/docs/ai-chat/client-protocol), the `ChatTaskWirePayload` shape changed: ```ts before theme={"theme":"css-variables"} type ChatTaskWirePayload = { messages: UIMessage[]; // full history chatId: string; trigger: "submit-message" | "regenerate-message" | "preload" | "close" | "action"; // ... }; ``` ```ts after theme={"theme":"css-variables"} type ChatTaskWirePayload = { message?: UIMessage; // singular, optional headStartMessages?: UIMessage[]; // chat.headStart only, "handover-prepare" chatId: string; trigger: | "submit-message" | "regenerate-message" | "preload" | "close" | "action" | "handover-prepare"; // ... }; ``` What to send per trigger: | Trigger | What to put in the payload | | ------------------------------------ | ---------------------------------------------------------------------------------- | | `submit-message` | The new user message (or a tool-approval-responded assistant message) in `message` | | `regenerate-message` | No `message` — the agent trims its own tail | | `preload` / `close` / `action` | No `message` | | `handover-prepare` (head-start only) | Full prior history in `headStartMessages` (route handler — not on `/in/append`) | The full wire breakdown is in the rewritten [Client Protocol](/docs/ai-chat/client-protocol). ### `hydrateMessages` consumers The hook signature is unchanged. Two behavior tightenings worth knowing: 1. 
**`incomingMessages` is now consistently 0-or-1-length.** Previously some triggers (`regenerate-message`, continuation) shipped full history; now all triggers ship at most one. If you assumed `incomingMessages` could contain multiple messages and acted on them as a batch, the loop now runs zero or one times. Patterns like the one below work the same — they just iterate fewer messages: ```ts theme={"theme":"css-variables"} hydrateMessages: async ({ chatId, incomingMessages }) => { for (const msg of incomingMessages) { // 0-or-1 iterations for (const r of chat.history.extractNewToolResults(msg)) { await auditLog.record({ id: r.toolCallId, output: r.output }); } } return await db.getMessages(chatId); } ``` 2. **Registering `hydrateMessages` short-circuits snapshot+replay.** The runtime trusts your hook to be the source of truth, so it doesn't read or write the JSON snapshot or replay `session.out`. Zero object-store traffic. Trade-off: you own persistence end-to-end. ### `onChatStart` is now once-per-chat `onChatStart` no longer fires on continuation runs (post-`endRun`, post-waitpoint-timeout, post-`chat.requestUpgrade`, post-cancel, post-crash) or on OOM-retry attempts. It fires **exactly once per chat**, on the very first user message of the chat's lifetime. The `continuation` and `previousRunId` fields on `ChatStartEvent` are now `@deprecated` (always `false` / `undefined` when the hook fires). This makes once-per-chat setup code (create the Chat DB row, mint chat-scoped resources) safe to write without continuation gates. Drop any `if (continuation) return;` checks from `onChatStart`: ```ts before theme={"theme":"css-variables"} onChatStart: async ({ continuation, chatId, clientData }) => { if (continuation) return; // ❌ no longer needed — fires only on first message ever await db.chat.create({ /* ... */ }); } ``` ```ts after theme={"theme":"css-variables"} onChatStart: async ({ chatId, clientData }) => { await db.chat.create({ /* ...
*/ }); // ✅ guaranteed first-message-of-chat } ``` If you need per-turn setup that **does** run on continuations, move it to [`onTurnStart`](/docs/ai-chat/lifecycle-hooks#onturnstart) — that hook still fires on every turn, including the first turn of a continuation run. ### Move `chat.local` init from `onChatStart` to `onBoot` Because `onChatStart` no longer fires on continuation runs, **`chat.local`** state initialized there will be missing when a continuation run starts — `run()` then crashes with `"chat.local can only be modified after initialization"`. The fix is to move per-process initialization to the new [`onBoot`](/docs/ai-chat/lifecycle-hooks#onboot) hook, which fires once per worker boot (initial, preloaded, AND continuation): ```ts before theme={"theme":"css-variables"} const userContext = chat.local<{ name: string; plan: string }>({ id: "userContext" }); onChatStart: async ({ clientData }) => { const user = await db.user.findUnique({ where: { id: clientData.userId } }); userContext.init({ name: user.name, plan: user.plan }); // ❌ never runs on continuation } ``` ```ts after theme={"theme":"css-variables"} const userContext = chat.local<{ name: string; plan: string }>({ id: "userContext" }); onBoot: async ({ clientData }) => { const user = await db.user.findUnique({ where: { id: clientData.userId } }); userContext.init({ name: user.name, plan: user.plan }); // ✅ runs on every fresh worker } ``` Anything else that's per-process (DB connection pools, sandbox handles, in-memory caches) belongs in `onBoot` for the same reason. Branch on `continuation` inside `onBoot` if you need to re-load state from your DB on takeover. ### Client-side `setMessages` doesn't round-trip The new wire makes one thing explicit that was implicit before: **mutating `useChat()`'s messages on the client doesn't change the agent's history.** Full-history mutations were silently overwritten by the wire's accumulator before this release; now they aren't even shipped. 
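To see why client-side edits no longer reach the agent, here is a minimal sketch of a delta-only payload builder. The `UIMessage` and `AppendPayload` types are simplified stand-ins, and `buildAppendPayload` is a hypothetical helper for illustration, not part of the SDK or the actual transport implementation.

```typescript
// Simplified stand-ins for the real types; only the fields needed for this sketch.
interface UIMessage {
  id: string;
  role: "user" | "assistant";
  parts: { type: "text"; text: string }[];
}

interface AppendPayload {
  chatId: string;
  trigger: "submit-message" | "regenerate-message";
  message?: UIMessage; // at most one new message, never the full history
}

// Hypothetical helper: builds what a client might send on `.in/append`.
function buildAppendPayload(
  chatId: string,
  trigger: AppendPayload["trigger"],
  localMessages: UIMessage[]
): AppendPayload {
  if (trigger === "regenerate-message") {
    // No message is sent: the agent trims its own tail.
    return { chatId, trigger };
  }
  // Only the newest local message is shipped; earlier ones stay on the client.
  const message = localMessages[localMessages.length - 1];
  return { chatId, trigger, message };
}

// Local history where the first message was edited on the client.
const localMessages: UIMessage[] = [
  { id: "m1", role: "user", parts: [{ type: "text", text: "EDITED ON CLIENT" }] },
  { id: "m2", role: "assistant", parts: [{ type: "text", text: "Hi!" }] },
  { id: "m3", role: "user", parts: [{ type: "text", text: "What's next?" }] },
];

const payload = buildAppendPayload("chat_123", "submit-message", localMessages);
const serialized = JSON.stringify(payload);
console.log(serialized.includes("EDITED ON CLIENT")); // false
```

Because only the newest message is serialized, the edited `m1` never leaves the client, so the agent's accumulator cannot observe the change.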
For history compaction, summarization, or branch-swap, mutate the agent's accumulator inside `onTurnStart` using [`chat.setMessages()`](/docs/ai-chat/backend) or [`chat.history.set()`](/docs/ai-chat/backend#chat-history). The client's `useChat` will reconcile against the next `session.out` payload. ### Verifying the v4.5 migration After updating, the smoke check is the same as for v4.4: * Send a message, confirm the assistant streams a response. * Reload mid-stream, confirm resume. * Send 30+ turns with tool calls — `.in/append` body sizes stay under \~5 KB the entire time. (Pre-change baseline: payloads grew past 512 KB around turn 10-30.) * Idle out a run, send another message — the new run reads the snapshot, replays the tail, and continues seamlessly. If continuations boot empty: * Confirm `OBJECT_STORE_*` env vars are set on the webapp. * Confirm the bucket key `packets/{projectRef}/{envSlug}/sessions/{sessionId}/snapshot.json` exists after a successful turn. * Or — register `hydrateMessages` and let your DB be the source of truth. ## Reference * [TriggerChatTransport options](/docs/ai-chat/reference#triggerchattransport-options) * [`chat.createStartSessionAction`](/docs/ai-chat/reference) * [Backend setup](/docs/ai-chat/backend) * [Frontend setup](/docs/ai-chat/frontend) * [Client Protocol](/docs/ai-chat/client-protocol) — wire format reference * [Persistence and replay](/docs/ai-chat/patterns/persistence-and-replay) — snapshot model end-to-end # Prompts Source: https://trigger.dev/docs/ai/prompts Define prompt templates as code, version them on deploy, and override from the dashboard without redeploying. The AI Agents and Prompts surface ships as part of the **v4.5 release candidate**. Install with `@trigger.dev/sdk@rc` (or pin `4.5.0-rc.0` or later) to use these features — they aren't yet on the latest stable, and APIs may still change before the 4.5.0 GA. 
See [supported AI SDK versions](/docs/ai-chat/reference#compatibility) and the [AI chat changelog](/docs/ai-chat/changelog) for details. ## Overview AI Prompts let you define prompt templates in your codebase alongside your tasks. When you deploy, Trigger.dev automatically versions your prompts. You can then: * View all prompt versions in the dashboard * Create **overrides** to change the prompt text or model without redeploying * Track every generation that used each prompt version * See token usage, cost, and latency metrics per prompt * Manage prompts programmatically via SDK methods ## Defining a prompt Use `prompts.define()` to create a prompt with typed variables: ```ts theme={"theme":"css-variables"} import { prompts } from "@trigger.dev/sdk"; import { z } from "zod"; export const supportPrompt = prompts.define({ id: "customer-support", description: "System prompt for customer support interactions", model: "gpt-4o", config: { temperature: 0.7 }, variables: z.object({ customerName: z.string(), plan: z.string(), issue: z.string(), }), content: `You are a support agent for Acme SaaS. ## Customer context - **Name:** {{customerName}} - **Plan:** {{plan}} - **Issue:** {{issue}} Respond to the customer's issue. Be concise and helpful.`, }); ``` ### Options | Option | Type | Required | Description | | ------------- | ------------------ | -------- | ------------------------------------------------------------------- | | `id` | `string` | Yes | Unique identifier (becomes the prompt slug) | | `description` | `string` | No | Shown in the dashboard | | `model` | `string` | No | Default model (e.g. `"gpt-4o"`, `"claude-sonnet-4-6"`) | | `config` | `object` | No | Default config (temperature, maxTokens, etc.) 
| | `variables` | Zod/ArkType schema | No | Schema for template variables (enables validation and dashboard UI) | | `content` | `string` | Yes | The prompt template with `{{variable}}` placeholders | ### Template syntax Templates use Mustache-style placeholders: * `{{variableName}}` — replaced with the variable value * `{{#conditionalVar}}...{{/conditionalVar}}` — content only included if the variable is truthy ```ts theme={"theme":"css-variables"} export const prompt = prompts.define({ id: "summarizer", model: "gpt-4o-mini", variables: z.object({ text: z.string(), maxSentences: z.string().optional(), }), content: `Summarize the following text{{#maxSentences}} in {{maxSentences}} sentences or fewer{{/maxSentences}}: {{text}}`, }); ``` ## Resolving a prompt ### Via prompt handle Call `.resolve()` on the handle returned by `define()`: ```ts theme={"theme":"css-variables"} const resolved = await supportPrompt.resolve({ customerName: "Alice", plan: "Pro", issue: "Cannot access billing dashboard", }); console.log(resolved.text); // The compiled prompt with variables filled in console.log(resolved.version); // e.g. 3 console.log(resolved.model); // "gpt-4o" console.log(resolved.labels); // ["current"] or ["override"] ``` ### Via standalone prompts.resolve() Resolve any prompt by slug without needing a handle. Pass the prompt handle as a type parameter for full type safety: ```ts theme={"theme":"css-variables"} import { prompts } from "@trigger.dev/sdk"; import type { supportPrompt } from "./prompts"; // Fully typesafe — ID and variables are checked at compile time const resolved = await prompts.resolve<typeof supportPrompt>("customer-support", { customerName: "Alice", plan: "Pro", issue: "Cannot access billing dashboard", }); ``` Without the generic, the function still works but accepts any string slug and `Record<string, unknown>` variables.
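As an illustration of the template syntax described above, the sketch below implements just the two placeholder forms with regular expressions. `renderTemplate` is a hypothetical helper for illustration only; it is not the SDK's renderer, which may handle escaping, nesting, and non-string values differently.

```typescript
// Illustrative stand-in for Mustache-style rendering; NOT the SDK's renderer.
type TemplateVars = Record<string, string | undefined>;

function renderTemplate(template: string, vars: TemplateVars): string {
  // 1. Sections: keep the inner text only when the variable is truthy.
  const withSections = template.replace(
    /\{\{#(\w+)\}\}([\s\S]*?)\{\{\/\1\}\}/g,
    (_match: string, name: string, inner: string) => (vars[name] ? inner : "")
  );
  // 2. Placeholders: substitute the value, or an empty string when it is missing.
  return withSections.replace(
    /\{\{(\w+)\}\}/g,
    (_match: string, name: string) => vars[name] ?? ""
  );
}

// Same template as the "summarizer" prompt above.
const template =
  "Summarize the following text{{#maxSentences}} in {{maxSentences}} sentences or fewer{{/maxSentences}}: {{text}}";

const withLimit = renderTemplate(template, { text: "Hello world.", maxSentences: "2" });
const withoutLimit = renderTemplate(template, { text: "Hello world." });

console.log(withLimit); // Summarize the following text in 2 sentences or fewer: Hello world.
console.log(withoutLimit); // Summarize the following text: Hello world.
```

The section is resolved before the plain placeholders, so a `{{maxSentences}}` nested inside the conditional block is only substituted when the block survives.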
### Resolve options You can resolve a specific version or label: ```ts theme={"theme":"css-variables"} // Resolve a specific version const v2 = await supportPrompt.resolve(variables, { version: 2 }); // Resolve by label const current = await supportPrompt.resolve(variables, { label: "current" }); ``` By default, `resolve()` returns the **override** version if one is active, otherwise the **current** (latest deployed) version. Both `promptHandle.resolve()` and `prompts.resolve()` call the Trigger.dev API when a client is configured. During local dev with `trigger dev`, this means you'll always get the server version (including overrides). ## Using with the AI SDK The resolved prompt integrates with the [Vercel AI SDK](https://ai-sdk.dev) via `toAISDKTelemetry()`. This links AI generation spans to the prompt in the dashboard. ### generateText ```ts theme={"theme":"css-variables"} import { task } from "@trigger.dev/sdk"; import { generateText } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { openai } from "@ai-sdk/openai"; export const supportTask = task({ id: "handle-support", run: async (payload) => { const resolved = await supportPrompt.resolve({ customerName: payload.name, plan: payload.plan, issue: payload.issue, }); const result = await generateText({ model: openai(resolved.model ?? "gpt-4o"), system: resolved.text, prompt: payload.issue, ...resolved.toAISDKTelemetry(), }); return { response: result.text }; }, }); ``` ### streamText ```ts theme={"theme":"css-variables"} import { streamText, stepCountIs } from "ai"; export const streamTask = task({ id: "stream-support", run: async (payload) => { const resolved = await supportPrompt.resolve({ customerName: payload.name, plan: payload.plan, issue: payload.issue, }); const result = streamText({ model: openai(resolved.model ??
"gpt-4o"), system: resolved.text, prompt: payload.issue, ...resolved.toAISDKTelemetry(), stopWhen: stepCountIs(15), }); let fullText = ""; for await (const chunk of result.textStream) { fullText += chunk; } return { response: fullText }; }, }); ``` ### Custom telemetry metadata Pass additional metadata to `toAISDKTelemetry()` that will appear on the generation span: ```ts theme={"theme":"css-variables"} const result = await generateText({ model: anthropic("claude-sonnet-4-5"), prompt: resolved.text, ...resolved.toAISDKTelemetry({ "task.type": "summarization", "customer.tier": "enterprise", }), }); ``` ## Using with chat.agent() Prompts integrate with `chat.agent()` via `chat.prompt` — a run-scoped store for the resolved prompt. Store a prompt once in a lifecycle hook, then access it anywhere during the run. ### chat.prompt.set() and chat.prompt() ```ts theme={"theme":"css-variables"} import { chat } from "@trigger.dev/sdk/ai"; import { prompts } from "@trigger.dev/sdk"; import { streamText, createProviderRegistry, stepCountIs } from "ai"; import { anthropic } from "@ai-sdk/anthropic"; import { z } from "zod"; const registry = createProviderRegistry({ anthropic }); const systemPrompt = prompts.define({ id: "my-chat-system", model: "anthropic:claude-sonnet-4-5", config: { temperature: 0.7 }, variables: z.object({ name: z.string() }), content: `You are a helpful assistant for {{name}}.`, }); export const myChat = chat.agent({ id: "my-chat", onChatStart: async ({ clientData }) => { const resolved = await systemPrompt.resolve({ name: clientData.name }); chat.prompt.set(resolved); }, run: async ({ messages, signal }) => { return streamText({ ...chat.toStreamTextOptions({ registry }), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, }); ``` ### chat.toStreamTextOptions() Returns an options object ready to spread into `streamText()`.
When a prompt is stored via `chat.prompt.set()`, it includes: * `system` — the compiled prompt text * `model` — resolved via the `registry` when provided * `temperature`, `maxTokens`, etc. — from the prompt's `config` * `experimental_telemetry` — links generations to the prompt in the dashboard ```ts theme={"theme":"css-variables"} // With registry — model is resolved automatically const options = chat.toStreamTextOptions({ registry }); // { system: "...", model: LanguageModel, temperature: 0.7, experimental_telemetry: { ... } } // Without registry — model is not included const options = chat.toStreamTextOptions(); // { system: "...", temperature: 0.7, experimental_telemetry: { ... } } ``` When the user provides a `registry` and the prompt has a `model` string (e.g. `"anthropic:claude-sonnet-4-5"`), the model is resolved via `registry.languageModel()` and included in the returned options. This means `streamText` uses the prompt's model by default — no manual model selection needed. ### Reading the prompt Access the stored prompt from anywhere in the run: ```ts theme={"theme":"css-variables"} run: async ({ messages, signal }) => { const prompt = chat.prompt(); // Throws if not set console.log(prompt.text); // The compiled prompt console.log(prompt.model); // "anthropic:claude-sonnet-4-5" console.log(prompt.version); // 3 return streamText({ ...chat.toStreamTextOptions({ registry }), messages, abortSignal: signal, stopWhen: stepCountIs(15), }); }, ``` You can also set a plain string if you don't need the full prompt system: ```ts theme={"theme":"css-variables"} chat.prompt.set("You are a helpful assistant."); ``` ## Prompt management SDK The `prompts` namespace includes methods for managing prompts programmatically. These work both inside tasks and outside (e.g. scripts, API handlers) as long as an API client is configured. 
### List prompts ```ts theme={"theme":"css-variables"} const allPrompts = await prompts.list(); ``` ### List versions ```ts theme={"theme":"css-variables"} const versions = await prompts.versions("customer-support"); ``` ### Create an override Create a new override that takes priority over the deployed version: ```ts theme={"theme":"css-variables"} const result = await prompts.createOverride("customer-support", { textContent: "New prompt template: Hello {{customerName}}!", model: "gpt-4o-mini", commitMessage: "Shorter prompt", }); ``` ### Update an override ```ts theme={"theme":"css-variables"} await prompts.updateOverride("customer-support", { textContent: "Updated template: Hi {{customerName}}!", model: "gpt-4o", }); ``` ### Remove an override Remove the active override, reverting to the deployed version: ```ts theme={"theme":"css-variables"} await prompts.removeOverride("customer-support"); ``` ### Promote a version ```ts theme={"theme":"css-variables"} await prompts.promote("customer-support", 2); ``` ### All management methods | Method | Description | | --------------------------------------------- | ------------------------------------------- | | `prompts.list()` | List all prompts in the current environment | | `prompts.versions(slug)` | List all versions for a prompt | | `prompts.resolve(slug, variables?, options?)` | Resolve a prompt by slug | | `prompts.promote(slug, version)` | Promote a version to current | | `prompts.createOverride(slug, body)` | Create an override | | `prompts.updateOverride(slug, body)` | Update the active override | | `prompts.removeOverride(slug)` | Remove the active override | | `prompts.reactivateOverride(slug, version)` | Reactivate a removed override | ## Overrides Overrides let you change a prompt's template or model from the dashboard or SDK without redeploying your code. When an override is active, `resolve()` returns the override version instead of the deployed version. 
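As a rough mental model of the precedence between a pinned version, an active override, and a label (following the resolution order listed later in this section), consider the sketch below. `PromptVersion`, `pickVersion`, and the idea that an override is simply a version carrying an `"override"` label are assumptions made for illustration; the real lookup happens on the Trigger.dev API and may differ in detail.

```typescript
// Hypothetical in-memory model of prompt versions; not an SDK type.
interface PromptVersion {
  version: number;
  labels: string[]; // e.g. ["current"] or ["override"]
  text: string;
}

interface ResolveOptions {
  version?: number;
  label?: string;
}

function pickVersion(
  versions: PromptVersion[],
  options: ResolveOptions = {}
): PromptVersion | undefined {
  // 1. An explicit version number wins over everything else.
  if (options.version !== undefined) {
    return versions.find((v) => v.version === options.version);
  }
  // 2. Otherwise an active override takes priority.
  const override = versions.find((v) => v.labels.includes("override"));
  if (override) return override;
  // 3./4. Otherwise fall back to the requested label, defaulting to "current".
  const label = options.label ?? "current";
  return versions.find((v) => v.labels.includes(label));
}

const versions: PromptVersion[] = [
  { version: 2, labels: [], text: "older deploy" },
  { version: 3, labels: ["current"], text: "latest deploy" },
  { version: 4, labels: ["override"], text: "dashboard edit" },
];

const byDefault = pickVersion(versions); // the override (version 4)
const pinned = pickVersion(versions, { version: 2 }); // explicit pin beats the override
const afterRemoval = pickVersion(
  versions.filter((v) => !v.labels.includes("override"))
); // falls back to "current" (version 3)

console.log(byDefault?.version, pinned?.version, afterRemoval?.version);
```

Modelling removal as filtering out the override mirrors the documented behavior that removing an override reverts to the deployed version.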
### How overrides work * Overrides take priority over the deployed ("current") version * Only one override can be active at a time * Creating a new override replaces the previous one * Removing an override reverts to the deployed version * Overrides are environment-scoped (dev, staging, production are independent) ### Creating an override (dashboard) 1. Go to the prompt detail page 2. Click **Create Override** 3. Edit the template text and/or model 4. Add an optional commit message 5. Click **Create override** ### Version resolution order When `resolve()` is called, versions are resolved in this order: 1. **Specific version** — if `{ version: N }` is passed 2. **Override** — if an override is active in this environment 3. **Label** — if `{ label: "..." }` is passed (defaults to `"current"`) 4. **Current** — the latest deployed version with the "current" label ## Dashboard ### Prompts list The prompts list page shows all prompts in the current environment with the current or override version, default model, and a usage sparkline. 
### Prompt detail Click a prompt to see: * **Template panel** — the prompt template for the selected version * **Details tab** — slug, description, model, config, source file, and variable schema * **Versions tab** — all versions with labels, source, and commit messages * **Generations tab** — every AI generation that used this prompt, with live polling * **Metrics tab** — token usage, cost, and latency charts ### AI span inspectors When you use `toAISDKTelemetry()`, AI generation spans in the run trace get a custom inspector showing: * **Overview** — model, provider, token usage, cost, input/output preview * **Messages** — the full message thread * **Tools** — tool definitions and tool call details * **Prompt** — the linked prompt's metadata, input variables, and template content ## Type utilities ```ts theme={"theme":"css-variables"} import type { PromptHandle, PromptIdentifier, PromptVariables } from "@trigger.dev/sdk"; type Id = PromptIdentifier<typeof supportPrompt>; // "customer-support" type Vars = PromptVariables<typeof supportPrompt>; // { customerName: string; plan: string; issue: string } ``` # API keys Source: https://trigger.dev/docs/apikeys How to authenticate with Trigger.dev so you can trigger tasks. ### Authentication and your secret keys When you [trigger a task](/docs/triggering) from your backend code, you need to set the `TRIGGER_SECRET_KEY` environment variable. Each environment has its own secret key. You can find the value on the API keys page in the Trigger.dev dashboard. For preview branches, you also need to set the `TRIGGER_PREVIEW_BRANCH` environment variable. You can find the value on the API keys page when you're on the preview branch. ### Automatically Configuring the SDK To automatically configure the SDK with your secret key, you can set the `TRIGGER_SECRET_KEY` environment variable. The SDK will automatically use this value when calling API methods (like `trigger`).
```bash .env theme={"theme":"css-variables"} TRIGGER_SECRET_KEY="tr_dev_…" TRIGGER_PREVIEW_BRANCH="my-branch" # Only needed for preview branches ``` You can do the same if you are self-hosting and need to change the default URL by using `TRIGGER_API_URL`. ```bash .env theme={"theme":"css-variables"} TRIGGER_API_URL="https://trigger.example.com" TRIGGER_PREVIEW_BRANCH="my-branch" # Only needed for preview branches ``` The default URL is `https://api.trigger.dev`. ### Manually Configuring the SDK If you prefer to manually configure the SDK, you can call the `configure` method: ```ts theme={"theme":"css-variables"} import { configure } from "@trigger.dev/sdk"; import { myTask } from "./trigger/myTasks"; configure({ secretKey: "tr_dev_1234", // WARNING: Never actually hardcode your secret key like this previewBranch: "my-branch", // Only needed for preview branches baseURL: "https://mytrigger.example.com", // Optional }); async function triggerTask() { await myTask.trigger({ userId: "1234" }); // This will use the secret key and base URL you configured } ``` # Overview Source: https://trigger.dev/docs/building-with-ai Tools and resources for building Trigger.dev projects with AI coding assistants. ## Quick setup We provide multiple tools to help AI coding assistants write correct Trigger.dev code. Use one or all of them for the best developer experience. Give your AI assistant direct access to Trigger.dev tools — search docs, trigger tasks, deploy projects, and monitor runs. Works with Claude Code, Cursor, Windsurf, VS Code (Copilot), and Zed. ```bash theme={"theme":"css-variables"} npx trigger.dev@latest install-mcp ``` [Learn more →](/docs/mcp-introduction) Portable instruction sets that teach any AI coding assistant Trigger.dev best practices: writing tasks, realtime frontends, and `chat.agent` AI agents. 
They ship with the CLI, versioned with it, and install into Claude Code, Cursor, VS Code (Copilot), and AGENTS-compatible tools such as Codex via `.agents/skills/`. ```bash theme={"theme":"css-variables"} npx trigger.dev@latest skills ``` [Learn more →](/docs/skills) ## Skills and the MCP server Skills and the MCP server do different jobs and work best together. Here's how they compare: | | **Skills** | **MCP Server** | | :---------------- | :------------------------------------------------------------------------- | :-------------------------------------------------- | | **What it does** | Drops skill files into your project that teach Trigger.dev patterns | Runs a live server your AI connects to | | **Installs to** | `.claude/skills/`, `.cursor/skills/`, `.github/skills/`, `.agents/skills/` | `mcp.json`, `~/.claude.json`, etc. | | **Updates** | Re-run `npx trigger.dev@latest skills`, or auto-prompted on `trigger dev` | Always latest (uses `@latest`) | | **Best for** | Teaching patterns and best practices | Live project interaction (deploy, trigger, monitor) | | **Works offline** | Yes | No (calls Trigger.dev API) | **Our recommendation:** Install both. Skills teach your AI *how* to write Trigger.dev code; the MCP Server lets it *do things* in your project. ## Project-level context snippet If you prefer a lightweight/passive approach, paste the snippet below into a context file at the root of your project. Different AI tools read different files: | File | Read by | | :-------------------------------- | :---------------------------- | | `CLAUDE.md` | Claude Code | | `AGENTS.md` | OpenAI Codex, Jules, OpenCode | | `.cursor/rules/*.md` | Cursor | | `.github/copilot-instructions.md` | GitHub Copilot | | `CONVENTIONS.md` | Windsurf, Cline, and others | Create the file that matches your AI tool (or multiple files if your team uses different tools) and paste the snippet below. This gives the AI essential Trigger.dev context without installing anything. 
````markdown theme={"theme":"css-variables"} # Trigger.dev rules ## Imports Always import from `@trigger.dev/sdk`. Never import from `@trigger.dev/sdk/v3`, and never use the deprecated `client.defineJob` pattern. ## Task pattern Every task must be exported. Use `task()` from `@trigger.dev/sdk`: ```ts import { task } from "@trigger.dev/sdk"; export const myTask = task({ id: "my-task", retry: { maxAttempts: 3, factor: 1.8, minTimeoutInMs: 500, maxTimeoutInMs: 30_000, }, run: async (payload: { url: string }) => { // No timeouts — runs can take as long as needed return { success: true }; }, }); ``` ## Triggering tasks From your backend (Next.js route, Express handler, etc.): ```ts import type { myTask } from "./trigger/my-task"; import { tasks } from "@trigger.dev/sdk"; // Fire and forget const handle = await tasks.trigger<typeof myTask>("my-task", { url: "https://example.com" }); // Batch trigger (up to 1,000 items) const batchHandle = await tasks.batchTrigger<typeof myTask>("my-task", [ { payload: { url: "https://example.com/1" } }, { payload: { url: "https://example.com/2" } }, ]); ``` ### From inside other tasks ```ts export const parentTask = task({ id: "parent-task", run: async (payload) => { // Fire and forget await childTask.trigger({ data: "value" }); // Wait for result — returns a Result object, NOT the output directly const result = await childTask.triggerAndWait({ data: "value" }); if (result.ok) { console.log(result.output); // The actual return value } else { console.error(result.error); } // Or use .unwrap() to get output directly (throws on failure) const output = await childTask.triggerAndWait({ data: "value" }).unwrap(); }, }); ``` > Never wrap `triggerAndWait` or `batchTriggerAndWait` in `Promise.all` — this is not supported.
## Error handling ```ts import { task, retry, AbortTaskRunError } from "@trigger.dev/sdk"; export const resilientTask = task({ id: "resilient-task", retry: { maxAttempts: 5 }, run: async (payload) => { // Permanent error — skip retrying if (!payload.isValid) { throw new AbortTaskRunError("Invalid payload, will not retry"); } // Retry a specific block (not the whole task) const data = await retry.onThrow( async () => await fetchExternalApi(payload), { maxAttempts: 3 } ); return data; }, }); ``` ## Schema validation Use `schemaTask` with Zod for payload validation: ```ts import { schemaTask } from "@trigger.dev/sdk"; import { z } from "zod"; export const processVideo = schemaTask({ id: "process-video", schema: z.object({ videoUrl: z.string().url() }), run: async (payload) => { // payload is typed and validated }, }); ``` ## Waits Use `wait.for` for delays, `wait.until` for dates, and `wait.forToken` for external callbacks: ```ts import { wait } from "@trigger.dev/sdk"; await wait.for({ seconds: 30 }); await wait.until({ date: new Date("2025-01-01") }); ``` ## Configuration `trigger.config.ts` lives at the project root: ```ts import { defineConfig } from "@trigger.dev/sdk/build"; export default defineConfig({ project: "", dirs: ["./trigger"], }); ``` ## Common mistakes 1. **Forgetting to export tasks** — every task must be a named export 2. **Importing from `@trigger.dev/sdk/v3`** — this is the old v3 path; always use `@trigger.dev/sdk` 3. **Using `client.defineJob()`** — this is the deprecated v2 API 4. **Calling `task.trigger()` directly** — use `tasks.trigger("task-id", payload)` from your backend 5. **Using `triggerAndWait` result as output** — it returns a `Result` object; check `result.ok` then access `result.output`, or use `.unwrap()` 6. **Wrapping waits/triggerAndWait in `Promise.all`** — not supported in Trigger.dev tasks 7. 
**Adding timeouts to tasks** — tasks have no built-in timeout; use `maxDuration` in config if needed ```` ## llms.txt We also publish machine-readable documentation for LLM consumption: * [trigger.dev/docs/llms.txt](https://trigger.dev/docs/llms.txt) — concise overview * [trigger.dev/docs/llms-full.txt](https://trigger.dev/docs/llms-full.txt) — full documentation These follow the [llms.txt standard](https://llmstxt.org) and can be fed directly into any LLM context window. ## Troubleshooting Install [Skills](/docs/skills); they override the outdated patterns in the AI's training data. The [context snippet](#project-level-context-snippet) above is a quick alternative. 1. Make sure you've restarted your AI client after adding the config 2. Run `npx trigger.dev@latest install-mcp` again — it will detect and fix common issues 3. Check that `npx trigger.dev@latest mcp` runs without errors in your terminal 4. See the [MCP introduction](/docs/mcp-introduction) for client-specific config details Both if possible: * **Skills** to teach your AI how to write Trigger.dev code (tasks, realtime, chat.agent) * **MCP Server** if you need to trigger tasks, deploy, and search docs from your AI ## Next steps Install and configure the MCP Server for live project interaction. Install Trigger.dev agent skills into any AI coding assistant. Learn the task patterns your AI assistant will follow. # Bulk actions Source: https://trigger.dev/docs/bulk-actions Perform actions like replay and cancel on multiple runs at once. Bulk actions let you replay or cancel multiple runs at once. This is especially useful when you need to retry a batch of failed runs with a new version of your code, or when you need to cancel multiple in-progress runs.