The AI Agents and Prompts surface ships as part of the v4.5 release candidate. Install with @trigger.dev/sdk@rc (or pin 4.5.0-rc.0 or later) to use these features — they aren’t yet on the latest stable, and APIs may still change before the 4.5.0 GA. See supported AI SDK versions and the AI chat changelog for details.
The realtime stream that backs chat.agent enforces a per-record cap of ~1 MiB (1048576 bytes minus a small envelope reserve). Anything written through the chat output — auto-piped LLM chunks, chat.response.write, custom writer.write parts — counts as one record per chunk and is rejected if it crosses the cap. This is a platform-level limit and cannot be raised per project or per stream.

What you’ll see

When a chunk crosses the cap, the run fails with a typed ChatChunkTooLargeError:
ChatChunkTooLargeError: chat.agent chunk of type "tool-output-available" is 2000126 bytes,
over the realtime stream's per-record cap of 1047552 bytes. For oversized payloads
(e.g. large tool outputs), write the value to your own store and emit only an id/url
through the chat stream — see https://trigger.dev/docs/ai-chat/patterns/large-payloads.
The error includes:
  • chunkType — discriminant on the chunk that failed (e.g. tool-output-available, data-handover, text-delta).
  • chunkSize — UTF-8 byte count of the JSON-serialized record.
  • maxSize — the effective cap.
You can catch it explicitly to log it before re-throwing:
import { ChatChunkTooLargeError, isChatChunkTooLargeError, logger } from "@trigger.dev/sdk";

try {
  await someWrite();
} catch (err) {
  if (isChatChunkTooLargeError(err)) {
    logger.error("Oversized chunk", { type: err.chunkType, size: err.chunkSize });
  }
  throw err;
}
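Because chunkSize is the UTF-8 byte count of the JSON-serialized record, you can estimate a payload's size yourself before writing it. The sketch below is illustrative only: the helper names are hypothetical, and the 1047552 figure is copied from the sample error message above, so treat it as an assumption rather than a guaranteed constant. The record the SDK actually writes may carry extra envelope fields, so leave some headroom.

```typescript
// Assumption: effective cap taken from the sample error message
// (1048576 bytes minus a 1024-byte reserve). The platform may change it.
const ASSUMED_MAX_RECORD_BYTES = 1_047_552;

// UTF-8 byte count of the JSON-serialized value (hypothetical helper).
function serializedByteSize(chunk: unknown): number {
  return new TextEncoder().encode(JSON.stringify(chunk)).length;
}

// True when the serialized value is at or under the given byte budget.
function fitsInRecord(
  chunk: unknown,
  maxBytes: number = ASSUMED_MAX_RECORD_BYTES
): boolean {
  return serializedByteSize(chunk) <= maxBytes;
}
```

Note that a string's length property counts UTF-16 code units, not bytes, which is why the helper encodes to UTF-8 before counting.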

Most common cause: large tool outputs

If you return a streamText result from run(), the AI SDK auto-pipes its UIMessageStream into the chat output. A tool whose result object is large (a fetched HTML body, a CSV blob, an image as base64, a deep DB row dump) is emitted as a single tool-output-available chunk, and that is the chunk that overruns.
Diagnose first: log tool sizes during development.
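import { logger } from "@trigger.dev/sdk";
import { tool } from "ai";
import { z } from "zod";

const fetchPage = tool({
  inputSchema: z.object({ url: z.string().url() }),
  execute: async ({ url }) => {
    const html = await (await fetch(url)).text();
    // Measure UTF-8 bytes, not string length: the cap is enforced on bytes.
    const bytes = Buffer.byteLength(html, "utf8");
    if (bytes > 500_000) {
      logger.warn("Large tool output", { tool: "fetchPage", bytes });
    }
    return { html };
  },
});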
If the size is unbounded by input, fix the tool — not the stream.
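One way to bound a tool's output is to truncate text to a fixed byte budget before returning it. The following is a minimal sketch, not part of the SDK: truncateToBytes and utf8Size are hypothetical helpers, and the budget you pass is your own choice. It cuts on code point boundaries so a multi-byte character is never split.

```typescript
// UTF-8 byte length of a single Unicode code point.
function utf8Size(codePoint: number): number {
  if (codePoint < 0x80) return 1;
  if (codePoint < 0x800) return 2;
  if (codePoint < 0x10000) return 3;
  return 4;
}

// Returns the longest prefix of `text` whose UTF-8 encoding fits in `maxBytes`.
function truncateToBytes(
  text: string,
  maxBytes: number
): { text: string; truncated: boolean } {
  let usedBytes = 0;
  let i = 0; // index in UTF-16 code units
  while (i < text.length) {
    const codePoint = text.codePointAt(i) as number;
    const size = utf8Size(codePoint);
    if (usedBytes + size > maxBytes) {
      return { text: text.slice(0, i), truncated: true };
    }
    usedBytes += size;
    // Code points above U+FFFF occupy two UTF-16 code units.
    i += codePoint > 0xffff ? 2 : 1;
  }
  return { text, truncated: false };
}
```

Inside a tool's execute function you could return the truncated text together with the truncated flag, so the model knows the content was cut.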

ID-reference pattern

Store the large value in your own database (or object store) and emit only an identifier through the chat stream. The frontend fetches the full payload separately on demand. This keeps the chat stream small, predictable, and resumable, and lets you reuse the value across turns or sessions without re-streaming it.
import { chat } from "@trigger.dev/sdk/ai";
import { tool } from "ai";
import { z } from "zod";

const fetchPage = tool({
  description: "Fetch a URL and store the HTML for later inspection.",
  inputSchema: z.object({ url: z.string().url() }),
  execute: async ({ url }) => {
    const html = await (await fetch(url)).text();
    const byteSize = Buffer.byteLength(html, "utf8"); // UTF-8 bytes, not string length
    // `db` stands in for your own data layer; `create` is assumed to return the new row's id.
    const docId = await db.documents.create({
      data: { url, html, byteSize },
    });

    // Tool result is small: just an id and metadata.
    // The model and the UI both work with this lightweight handle.
    return {
      docId,
      url,
      byteSize,
      preview: html.slice(0, 500),
    };
  },
});
The same pattern works for chat.response.write — push the heavy value to your DB, then emit a small data part with the id:
const id = await db.attachments.create({ data: { content: hugeReport } });
chat.response.write({ type: "data-report", data: { id, summary: shortSummary } });
Persist the large value before you emit the id chunk. If the chunk reaches the UI before the row is written, the frontend gets a 404 on the follow-up fetch.
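The ordering rule above can be made hard to get wrong by wrapping both steps in one helper. This is a sketch under stated assumptions: PayloadStore, EmitPart, storeThenEmit, and MemoryStore are all hypothetical names for illustration, with the in-memory store standing in for your real database.

```typescript
// Hypothetical stand-in for your database or object store.
interface PayloadStore {
  save(content: string): Promise<string>; // resolves to the new record's id
}

// Hypothetical stand-in for whatever writes a part to the chat output.
type ReportPart = { type: string; data: { id: string; summary: string } };
type EmitPart = (part: ReportPart) => void;

async function storeThenEmit(
  store: PayloadStore,
  emit: EmitPart,
  content: string,
  summary: string
): Promise<string> {
  // 1. Persist first and wait for the write to complete.
  const id = await store.save(content);
  // 2. Only then emit the small reference through the stream.
  emit({ type: "data-report", data: { id, summary } });
  return id;
}

// In-memory store, for demonstration only.
class MemoryStore implements PayloadStore {
  readonly rows = new Map<string, string>();
  async save(content: string): Promise<string> {
    const id = `att_${this.rows.size + 1}`;
    this.rows.set(id, content);
    return id;
  }
}
```

In an agent, the emit argument would typically be a thin wrapper such as (part) => chat.response.write(part), matching the call shown earlier.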

Transient UI parts

For progress indicators or status data that should stream to the UI but not persist into the response message, use chat.response.write with transient: true. The chunk still travels on the chat stream (so the 1 MiB per-record cap still applies), but it never lands in responseMessage or uiMessages:
chat.response.write({
  type: "data-progress",
  data: { percent: 50 },
  transient: true,
});
For genuinely high-volume diagnostic data (per-token traces, large debug dumps), don’t try to ship it through the realtime stream at all. Log to your own store (DB, object storage, OTel logger) and surface it through a separate UI route that isn’t tied to the chat session.

What does not trigger the cap

These calls don’t go through the realtime stream and have no per-record cap:
The control markers chat.agent emits internally (trigger:turn-complete, trigger:upgrade-required) are tiny by construction.
