May 31, 2024

Major v3 reliability improvements

CTO, Trigger.dev

We strongly recommend you upgrade to the latest version of the v3 SDK because we've made some major improvements to run execution reliability.

Run this command in your repo to easily upgrade:

npx trigger.dev@beta update

Run attempts

Each v3 run has at least one attempt. An attempt is a single execution of your code. If the attempt succeeds, the run succeeds. If it fails, the run is retried with further attempts until one succeeds or the maximum number of attempts is reached.
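
The relationship between a run and its attempts can be sketched as a simple loop. This is a minimal illustration, not the platform's implementation: `runWithAttempts`, `RunResult`, and `maxAttempts` are hypothetical names chosen for this example, and real retries would normally wait between attempts.

```typescript
// Hypothetical model of a run and its attempts (not Trigger.dev's real code).
type RunResult<T> =
  | { ok: true; output: T; attempts: number }
  | { ok: false; error: unknown; attempts: number };

// Executes `attempt` until it succeeds or `maxAttempts` is reached.
function runWithAttempts<T>(
  attempt: (attemptNumber: number) => T,
  maxAttempts: number
): RunResult<T> {
  if (maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  let lastError: unknown = undefined;
  for (let n = 1; n <= maxAttempts; n++) {
    try {
      // A successful attempt makes the whole run succeed.
      return { ok: true, output: attempt(n), attempts: n };
    } catch (error) {
      // A failed attempt leads to another attempt, if any remain.
      lastError = error;
    }
  }
  // Every allowed attempt failed, so the run fails.
  return { ok: false, error: lastError, attempts: maxAttempts };
}

// A task that fails twice and then succeeds on its third attempt.
const flaky = runWithAttempts((n) => {
  if (n < 3) throw new Error(`attempt ${n} failed`);
  return "done";
}, 5);

// A task that always fails, limited to two attempts.
const exhausted = runWithAttempts((n): string => {
  throw new Error(`attempt ${n} failed`);
}, 2);

console.log(flaky, exhausted.ok, exhausted.attempts);
```

Here `flaky` succeeds after three attempts, while `exhausted` fails once its two attempts are used up.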

We recommend reading the Errors & Retrying guide. If you apply its guidance, your runs will be highly reliable.

What's changed and why is it better?

When we first shipped v3 this is how attempts worked:

1. Your run is taken from the queue on the platform.
2. We create a run attempt on the platform.
3. We spin up a new worker to execute your run.
4. The worker runs (hopefully).
5. The attempt succeeds or fails.
6. Failed attempts go back into the queue.
7. Repeat steps 1–6.

This had a few problems:

If the worker failed to start, the attempt would hang. We automatically fail attempts that haven't communicated with the platform recently, so the run would eventually try again.
Every retry had to go back into the queue, which is inefficient and adds load to the platform.
If an attempt didn't start, it still counted against your run's attempt limit.

We've moved creating run attempts to the worker, so now:

1. Your run is taken from the queue on the platform.
2. We spin up a new worker to execute your run.
3. The worker creates a run attempt via the platform.
4. The worker runs (hopefully).
5. The attempt succeeds or fails.
6. Failed attempts are retried by the worker, with no need to requeue them.
7. Repeat steps 3–6.

This is far more reliable because attempts are only created when the worker is actually running. It's also more efficient because we don't need to requeue failed attempts. Win-win.
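
To make the difference concrete, here is a rough model of how the two flows account for attempts. It is a sketch under stated assumptions rather than the platform's actual logic: `simulate`, `Outcome`, and `Tally` are hypothetical names, and how the new flow recovers from a worker that never starts is not covered here, so it is simply skipped.

```typescript
// One entry per time the platform tries to execute the run.
type Outcome = "worker-failed-to-start" | "code-failed" | "code-succeeded";

interface Tally {
  attemptsCreated: number;
  requeuedAttempts: number;
  succeeded: boolean;
}

// Hypothetical model: counts attempts and requeues for each flow.
function simulate(
  flow: "old" | "new",
  outcomes: Outcome[],
  maxAttempts: number
): Tally {
  const tally: Tally = { attemptsCreated: 0, requeuedAttempts: 0, succeeded: false };
  for (const outcome of outcomes) {
    if (tally.attemptsCreated >= maxAttempts) break; // attempt limit reached

    if (flow === "old") {
      // Old flow: the attempt is created before the worker starts,
      // so a worker that never starts still uses up an attempt.
      tally.attemptsCreated++;
      if (outcome === "code-succeeded") {
        tally.succeeded = true;
        break;
      }
      // Failed (or hung) attempts go back into the queue while any remain.
      if (tally.attemptsCreated < maxAttempts) tally.requeuedAttempts++;
    } else {
      // New flow: only a running worker creates an attempt.
      // Assumption: recovery from a failed start is not modelled here.
      if (outcome === "worker-failed-to-start") continue;
      tally.attemptsCreated++;
      if (outcome === "code-succeeded") {
        tally.succeeded = true;
        break;
      }
      // Failed attempts are retried by the same worker: no requeue.
    }
  }
  return tally;
}

const outcomes: Outcome[] = ["worker-failed-to-start", "code-failed", "code-succeeded"];
const oldFlow = simulate("old", outcomes, 2);
const newFlow = simulate("new", outcomes, 2);

console.log({ oldFlow, newFlow });
```

With a limit of two attempts, the old flow uses up both before the code ever succeeds, because the worker that failed to start consumed one. The new flow spends attempts only on real executions, so the run succeeds on its second attempt without any requeue.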
