Medium and large machines on Trigger.dev Cloud start faster and more reliably, especially during peak usage. When we need to spin up brand new servers, they're ready to accept your runs immediately: we've eliminated an entire class of failures where runs could start before critical infrastructure was healthy. The result is fewer incidents and better availability.
Technical details:
- Implemented a new scheduling strategy with MostAllocated bin packing
- Deployed Smooth Operator, a custom Kubernetes operator that continuously monitors supervisor DaemonSet health
- Configured parallel image pulls and aggressive garbage collection to support denser bin packing
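
To make the bin-packing idea concrete, here is a minimal sketch of how a MostAllocated-style score behaves. It is an illustration only, not our scheduler configuration: the `Node` class, the `most_allocated_score` and `pick_node` helpers, the equal CPU/memory weighting, and the example capacities are all assumptions made for this example. The idea is that each node that can fit the run is scored by how full it would be afterwards, and the fullest node wins, so work is packed onto existing machines before it spills onto emptier ones.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical view of a node: allocatable capacity and what is already requested."""
    name: str
    cpu_allocatable: float   # cores
    mem_allocatable: float   # GiB
    cpu_requested: float
    mem_requested: float


def most_allocated_score(node: Node, cpu: float, mem: float) -> Optional[float]:
    """Score 0-100: higher means the node would be fuller after placing the run.

    Returns None when the run does not fit on the node.
    """
    cpu_total = node.cpu_requested + cpu
    mem_total = node.mem_requested + mem
    if cpu_total > node.cpu_allocatable or mem_total > node.mem_allocatable:
        return None
    cpu_fraction = cpu_total / node.cpu_allocatable
    mem_fraction = mem_total / node.mem_allocatable
    # Equal weighting of CPU and memory is an assumption for this sketch.
    return 100 * (cpu_fraction + mem_fraction) / 2


def pick_node(nodes: list[Node], cpu: float, mem: float) -> Optional[Node]:
    """Pick the feasible node with the highest score (the most allocated one)."""
    best: Optional[Node] = None
    best_score = -1.0
    for node in nodes:
        score = most_allocated_score(node, cpu, mem)
        if score is not None and score > best_score:
            best, best_score = node, score
    return best


# Example: one mostly full node and one idle node, both 8 cores / 32 GiB.
busy = Node("busy", 8, 32, cpu_requested=6, mem_requested=24)
idle = Node("idle", 8, 32, cpu_requested=0, mem_requested=0)

small_run = pick_node([busy, idle], cpu=1, mem=4)   # fits on both, lands on "busy"
large_run = pick_node([busy, idle], cpu=4, mem=4)   # only fits on "idle"
```

In this sketch the small run goes to the busy node because it would leave that node the fullest, while the large run falls back to the idle node because it no longer fits anywhere else. Packing nodes this tightly is also why fast image pulls and prompt garbage collection matter: more runs share each node's disk and network.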