This feature is only available in the v4 beta. To upgrade to v4, see the upgrade to v4 docs.

Acknowledgements: This example project is derived from the brilliant deep research guide by Nico Albanese.

Overview

This full-stack project is an intelligent deep research agent that autonomously conducts multi-layered web research, generating comprehensive reports that are then converted to PDF and uploaded to cloud storage.

Tech stack:

  • Trigger.dev for task orchestration and Realtime progress updates
  • Vercel AI SDK with OpenAI’s GPT-4o for query generation, relevance evaluation and report writing
  • Exa API for web search with live crawling
  • LibreOffice for converting the HTML report to PDF
  • Cloudflare R2 for storing the generated PDFs

Features:

  • Recursive research: AI generates search queries, evaluates their relevance, asks follow-up questions and searches deeper based on initial findings.
  • Real-time progress: Live updates are shown on the frontend using Trigger.dev Realtime as research progresses.
  • Intelligent source evaluation: AI evaluates search result relevance before processing.
  • Research report generation: The completed research is converted to a structured HTML report using a detailed system prompt.
  • PDF creation and upload to cloud storage: The completed reports are then converted to PDF using LibreOffice and uploaded to Cloudflare R2.

GitHub repo

View the Vercel AI SDK deep research agent repo

Click here to view the full code for this project in our examples repository on GitHub. You can fork it and use it as a starting point for your own project.

How the deep research agent works

Trigger.dev orchestration

The research process is orchestrated through three connected Trigger.dev tasks:

  1. deepResearchOrchestrator - Main task that coordinates the entire research workflow.
  2. generateReport - Processes the research data into a structured HTML report using OpenAI’s GPT-4o model.
  3. generatePdfAndUpload - Converts the HTML report to PDF using LibreOffice and uploads it to R2 cloud storage.

Each task uses triggerAndWait() to create a dependency chain, ensuring proper sequencing while maintaining isolation and error handling.
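
A minimal sketch of that chain, assuming v4-style imports (the task IDs for the two child tasks, the payload fields and the return values are illustrative, not the exact ones in the repo):

import { task } from "@trigger.dev/sdk";

export const generateReport = task({
  id: "generate-report",
  run: async (payload: { learnings: string[] }) => {
    // Turn the accumulated research learnings into a structured HTML report
    return { html: "<html>...</html>" };
  },
});

export const generatePdfAndUpload = task({
  id: "generate-pdf-and-upload",
  run: async (payload: { html: string }) => {
    // Convert the HTML report to PDF with LibreOffice and upload it to R2
    return { pdfUrl: "..." };
  },
});

export const deepResearchOrchestrator = task({
  id: "deep-research",
  run: async (payload: { query: string }) => {
    // ...the recursive research phase runs first and produces `learnings`...
    const learnings: string[] = [];

    // triggerAndWait() runs the child task and pauses this run until it finishes,
    // so each step only starts once the previous one has succeeded
    const report = await generateReport.triggerAndWait({ learnings });
    if (!report.ok) throw new Error("Report generation failed");

    const pdf = await generatePdfAndUpload.triggerAndWait({ html: report.output.html });
    if (!pdf.ok) throw new Error("PDF generation failed");

    return { pdfUrl: pdf.output.pdfUrl };
  },
});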

The deep research recursive function

The core research logic uses a recursive depth-first search approach. A query is recursively expanded and the results are collected.

Key parameters:

  • depth: Controls recursion levels (default: 2)
  • breadth: Number of queries per level (default: 2, halved each recursion)

Example recursion with the default settings:

Level 0 (Initial Query): "AI safety in autonomous vehicles"

├── Level 1 (depth = 1, breadth = 2):
│   ├── Sub-query 1: "Machine learning safety protocols in self-driving cars"
│   │   ├── → Search Web → Evaluate Relevance → Extract Learnings
│   │   └── → Follow-up: "How do neural networks handle edge cases?"
│   │
│   └── Sub-query 2: "Regulatory frameworks for autonomous vehicle testing"
│       ├── → Search Web → Evaluate Relevance → Extract Learnings
│       └── → Follow-up: "What are current safety certification requirements?"

└── Level 2 (depth = 2, breadth = 1):
    ├── From Sub-query 1 follow-up:
    │   └── "Neural network edge case handling in autonomous systems"
    │       └── → Search Web → Evaluate → Extract → DEPTH LIMIT REACHED

    └── From Sub-query 2 follow-up:
        └── "Safety certification requirements for self-driving vehicles"
            └── → Search Web → Evaluate → Extract → DEPTH LIMIT REACHED

Process flow:

  1. Query generation: OpenAI’s GPT-4o generates multiple search queries from the input
  2. Web search: Each query searches the web via the Exa API with live crawling
  3. Relevance evaluation: OpenAI’s GPT-4o evaluates if results help answer the query
  4. Learning extraction: Relevant results are analyzed for key insights and follow-up questions
  5. Recursive deepening: Follow-up questions become new queries for the next depth level
  6. Accumulation: All learnings, sources, and queries are accumulated across recursion levels
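
A condensed sketch of this recursive flow (the helper functions and their signatures are illustrative, not the exact ones in the repo):

// Illustrative helper signatures; the repo's actual helpers may differ
declare function generateQueries(query: string, count: number): Promise<string[]>;
declare function searchWeb(query: string): Promise<SearchResult[]>;
declare function evaluateRelevance(query: string, result: SearchResult): Promise<boolean>;
declare function extractLearnings(
  query: string,
  results: SearchResult[]
): Promise<{ learnings: string[]; sources: string[]; followUpQuestion?: string }>;

type SearchResult = { url: string; content: string };
type Research = { learnings: string[]; sources: string[]; queries: string[] };

async function deepResearch(
  query: string,
  depth: number,    // remaining recursion levels (default: 2)
  breadth: number,  // queries generated at this level (default: 2, halved each recursion)
  acc: Research = { learnings: [], sources: [], queries: [] }
): Promise<Research> {
  if (depth === 0) return acc; // depth limit reached

  // 1. Generate `breadth` search queries from the input
  const queries = await generateQueries(query, breadth);

  for (const q of queries) {
    acc.queries.push(q);

    // 2. Search the web via the Exa API
    const results = await searchWeb(q);

    // 3. Keep only the results the model judges relevant to the query
    const relevant: SearchResult[] = [];
    for (const r of results) {
      if (await evaluateRelevance(q, r)) relevant.push(r);
    }

    // 4. Extract key learnings, sources and a follow-up question
    const { learnings, sources, followUpQuestion } = await extractLearnings(q, relevant);
    acc.learnings.push(...learnings);
    acc.sources.push(...sources);

    // 5. Recurse on the follow-up question with reduced depth and halved breadth
    if (followUpQuestion) {
      await deepResearch(followUpQuestion, depth - 1, Math.ceil(breadth / 2), acc);
    }
  }

  // 6. Everything accumulates in `acc` across all recursion levels
  return acc;
}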

Using Trigger.dev Realtime to trigger and subscribe to the deep research task

We use the useRealtimeTaskTrigger React hook to trigger the deep-research task and subscribe to its updates.

Frontend (React Hook):

import { useRealtimeTaskTrigger } from "@trigger.dev/react-hooks";

const triggerInstance = useRealtimeTaskTrigger<typeof deepResearchOrchestrator>("deep-research", {
  accessToken: triggerToken, // Public Access Token generated on the server
});
// Map the run's latest metadata into a progress value and label for the UI
const { progress, label } = parseStatus(triggerInstance.run?.metadata);
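
The run itself is started by calling submit on the same instance (the payload field and the researchTopic variable are illustrative):

// Starts the "deep-research" run; the hook then streams run updates
// (including metadata) back into triggerInstance.run
triggerInstance.submit({ query: researchTopic });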

As the research progresses, the tasks set status metadata and the frontend is kept updated with every new status:

Task Metadata:

import { metadata } from "@trigger.dev/sdk";

// Inside a task's run(), publish the current status for Realtime subscribers
metadata.set("status", {
  progress: 25,
  label: `Searching the web for: "${query}"`,
});
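
On the frontend, parseStatus simply reads that status object back out of the run metadata; a minimal sketch (the repo's implementation may differ):

// Reads the "status" object written with metadata.set() and falls back
// to sensible defaults while the run is still starting up
function parseStatus(metadata?: Record<string, unknown>) {
  const status = metadata?.status as { progress?: number; label?: string } | undefined;
  return {
    progress: status?.progress ?? 0,
    label: status?.label ?? "Starting research...",
  };
}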

Relevant code

Learn more about Trigger.dev Realtime

To learn more, take a look at the following resources: