Customer Story

How we built a kanban-style triage agent for managing coding agents

Justin Sun, Co-founder and CTO, Capy


In this customer story, Justin Sun, Co-founder and CTO of Capy, explains how they use Trigger.dev to orchestrate parallel agents through an automated triage system. After hitting serverless limits and struggling with manual coordination, they built a "PM that never sleeps": a triage agent that manages dozens of concurrent coding agents. Trigger.dev provides the durable execution and observability that make this possible at scale.

Identifying the problem

We noticed our team constantly running multiple Claude sessions in parallel, alt-tabbing between terminals, manually checking progress, and restarting failed runs. It worked, but it was clunky. That's when we realized: if coding agents are going to scale, they need proper orchestration that goes beyond "open more tabs."

At Capy, we'd already built VM infrastructure for computer-use agents (Scrapybara). But watching real usage patterns (1,000+ weekly active users doing general coding tasks, not just quick websites) showed us that the real problem wasn't the agents themselves. It was managing many of them efficiently.

The limitations of our existing approach

We started hitting fundamental limits, which led us to Trigger.dev:

Serverless timeouts broke long-running tasks. Vercel's runtime caps meant agents were cut off mid-execution on anything substantial.
Streaming was a hack. Our Python websockets were brittle, unobservable, and fell apart with concurrent runs.
Manual coordination didn't scale. Starting agents, checking progress, restarting failures: our operational overhead grew with usage.
Existing interfaces weren't fit for purpose. CLIs and chat UIs dump walls of text, which makes for a poor user experience.

We needed infrastructure that could handle hours-long execution, coordinate parallel work, and present results in a form developers actually use.

The triage agent concept

Our solution: a triage agent that acts like an engineering manager. It takes a queue of work (features, bugs, refactors), breaks it down, assigns tasks to specialized coding agents, monitors progress, and consolidates results. This is visualised as an intelligent kanban board in which the board itself is active. Rather than leaving us to juggle terminal sessions manually, the triage agent:

Spawns child coding agents with appropriate context
Monitors their progress via structured events
Handles retries, timeouts, and escalations
Aggregates outputs into actionable summaries
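
To make the "active board" idea concrete, here is a minimal TypeScript sketch of how structured events from child agents could be folded into kanban columns. The event shapes, column names, and the retry budget of three attempts are all assumptions made for illustration; the article does not describe Capy's actual schema, and this is not Trigger.dev's API.

```typescript
// Hypothetical event shapes emitted by child coding agents.
type AgentEvent =
  | { type: "spawned"; taskId: string; title: string }
  | { type: "progress"; taskId: string; note: string }
  | { type: "failed"; taskId: string; error: string }
  | { type: "completed"; taskId: string; summary: string };

type Column = "running" | "needs_attention" | "done";

interface Card {
  taskId: string;
  title: string;
  column: Column;
  attempts: number;
  lastNote: string;
}

// The board is a plain map from task id to card.
type Board = Record<string, Card>;

// Assumed retry budget: after this many attempts a failure is escalated.
const MAX_ATTEMPTS = 3;

// Fold one event into the board and return a new board (the input is not mutated).
function applyEvent(board: Board, event: AgentEvent): Board {
  if (event.type === "spawned") {
    const fresh: Card = {
      taskId: event.taskId,
      title: event.title,
      column: "running",
      attempts: 1,
      lastNote: "",
    };
    return { ...board, [event.taskId]: fresh };
  }

  const card = board[event.taskId];
  if (!card) return board; // ignore events for tasks we never spawned

  let updated: Card;
  if (event.type === "progress") {
    updated = { ...card, lastNote: event.note };
  } else if (event.type === "failed") {
    // Retry while attempts remain; otherwise escalate to a human.
    const exhausted = card.attempts >= MAX_ATTEMPTS;
    updated = {
      ...card,
      column: exhausted ? "needs_attention" : "running",
      attempts: exhausted ? card.attempts : card.attempts + 1,
      lastNote: event.error,
    };
  } else {
    updated = { ...card, column: "done", lastNote: event.summary };
  }
  return { ...board, [event.taskId]: updated };
}

// Aggregate the board into task titles per column.
function summarize(board: Board): Record<Column, string[]> {
  const columns: Record<Column, string[]> = { running: [], needs_attention: [], done: [] };
  for (const id of Object.keys(board)) {
    columns[board[id].column].push(board[id].title);
  }
  return columns;
}
```

Treating the board as a pure function of the event stream keeps it reproducible: replaying the same events always yields the same columns, which is one plausible way to keep a unified view consistent across many concurrent agents.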

Building it with Trigger.dev

Trigger.dev became the execution layer that made this architecture possible:

Developer-friendly integration

TypeScript-native SDK fit our monorepo perfectly
Frontend stays on Vercel; long-running work moves to Trigger
Migration from our hacky Python scripts was straightforward

Durable execution beyond serverless limits

Coding agents run for hours, not seconds
Step-level checkpoints mean failures don't waste completed work
Automatic retries with exponential backoff handle transient issues
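
The two ideas in this list, checkpointed steps and exponential backoff, can be sketched in a few lines of plain TypeScript. This is a simplified in-memory model with hypothetical names (`runWithCheckpoints`, `backoffDelayMs`, and the option fields are all invented here); it is not how Trigger.dev implements durable execution, only an illustration of why a retry does not have to repeat completed work.

```typescript
// Assumed retry settings, chosen for illustration only.
interface BackoffOptions {
  minDelayMs: number;
  maxDelayMs: number;
  factor: number;
}

// Delay before retry number `attempt` (1-based):
// minDelayMs * factor^(attempt - 1), capped at maxDelayMs.
function backoffDelayMs(attempt: number, opts: BackoffOptions): number {
  const raw = opts.minDelayMs * Math.pow(opts.factor, attempt - 1);
  return Math.min(raw, opts.maxDelayMs);
}

interface Step {
  name: string;
  run: () => string;
}

// Hypothetical in-memory checkpoint store: step name -> saved output.
type Checkpoints = Record<string, string>;

// Run steps in order, skipping any step that already has a checkpoint.
// Throws on the first failing step; outputs of completed steps remain
// in `checkpoints`, so a later call resumes from the failed step.
function runWithCheckpoints(steps: Step[], checkpoints: Checkpoints): string[] {
  const outputs: string[] = [];
  for (const step of steps) {
    const done = Object.prototype.hasOwnProperty.call(checkpoints, step.name);
    if (!done) {
      // The checkpoint is written only if run() returns without throwing.
      checkpoints[step.name] = step.run();
    }
    outputs.push(checkpoints[step.name]);
  }
  return outputs;
}
```

With `minDelayMs: 1000`, `factor: 2`, and `maxDelayMs: 30000`, the delays grow as 1 s, 2 s, 4 s, and so on until they reach the 30 s cap. A real system would persist checkpoints outside process memory and usually add jitter to the delays.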

Observable, structured workflows

A dashboard with a detailed runs list, trace views, and logs
Alerts for when things go wrong
Advanced filtering and search

First-class orchestration

The triage agent uses Trigger's job scheduling to manage child agents
Concurrency limits prevent resource exhaustion
State management and queuing happen at the platform level, not in our code
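
As a rough illustration of what a concurrency limit does, the sketch below models a queue that lets only a fixed number of runs execute at once and holds the rest in FIFO order. The class and method names are hypothetical and the model is deliberately simplified; with Trigger.dev this bookkeeping lives in the platform rather than in application code, which is the point the list above makes.

```typescript
// A simplified, in-memory model of a concurrency-limited run queue.
// Names are hypothetical; this is not Trigger.dev's API.
class ConcurrencyQueue {
  private limit: number;
  private running: string[] = [];
  private waiting: string[] = [];

  constructor(limit: number) {
    this.limit = limit;
  }

  // Enqueue a run; it starts immediately only if a slot is free.
  enqueue(runId: string): void {
    this.waiting.push(runId);
    this.fill();
  }

  // Mark a run finished, freeing its slot for the next waiting run.
  complete(runId: string): void {
    this.running = this.running.filter((id) => id !== runId);
    this.fill();
  }

  // Copy of the current state, safe for callers to inspect.
  snapshot(): { running: string[]; waiting: string[] } {
    return { running: this.running.slice(), waiting: this.waiting.slice() };
  }

  // Promote waiting runs in FIFO order until the limit is reached.
  private fill(): void {
    while (this.running.length < this.limit && this.waiting.length > 0) {
      this.running.push(this.waiting.shift() as string);
    }
  }
}
```

With a limit of 2, enqueuing runs `a`, `b`, and `c` starts `a` and `b` and leaves `c` waiting; `c` starts only once one of the first two completes. However many tasks the triage agent spawns, the number executing at once never exceeds the limit.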

The result

We went from brittle scripts and manual coordination to a self-managing system. The triage agent spawns dozens of coding tasks in parallel, each running as a durable workflow with full observability. When something fails, we see exactly where and why. When tasks complete, results flow back to a unified view.

More importantly, we can focus on making the agents smarter instead of keeping them running. The orchestration layer just works, with no more babysitting processes or opaque debugging.

Capy is an AI software engineer that ships dozens of features in parallel. It works end-to-end, autonomously triaging issues, executing code in isolated VMs, and pushing PRs to GitHub.

USE CASE

Use Capy to plan and implement features end-to-end, turn tickets into production-ready PRs, and review code with our SOTA agent compatible with all of the latest frontier models.

FOUNDED

September, 2024

CUSTOMER SINCE

April, 2025
