We’ve migrated run logs and spans from PostgreSQL to ClickHouse, which enables a faster dashboard experience and prepares us for a long-requested feature: log search and filtering.
What’s new
Your OpenTelemetry logs and spans associated with runs are now stored in ClickHouse, which is better suited to this type of data and enables efficient querying and aggregation that wasn't possible while logs and spans lived in PostgreSQL. Because of this move, we're able to increase the number of logs and spans we can display in the dashboard (up to 50,000 in the Trigger.dev Cloud).
Under the hood
This migration represents a fundamental shift in how we store and query OpenTelemetry data. Here's what changed architecturally:
Why ClickHouse over PostgreSQL?
Columnar vs Row-based storage: ClickHouse's columnar storage format is specifically designed for analytical workloads like log aggregation and time-series queries. Unlike PostgreSQL's row-oriented model (optimized for transactional operations), ClickHouse stores data by column, enabling:
- Massive compression gains: Similar log attributes compress extremely well when stored together
- Faster analytical queries: Only the columns needed for a query are read from disk
- Better aggregation performance: Sum, count, and group-by operations are significantly faster
Schema design differences: Our new ClickHouse schema leverages native partitioning and advanced data types:
-- Actual ClickHouse table structure for task events
CREATE TABLE task_events_v1 (
    environment_id String,
    organization_id String,
    project_id String,
    task_identifier String CODEC(ZSTD(1)),
    run_id String CODEC(ZSTD(1)),
    start_time DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    trace_id String CODEC(ZSTD(1)),
    span_id String CODEC(ZSTD(1)),
    parent_span_id String CODEC(ZSTD(1)),
    message String CODEC(ZSTD(1)),
    -- Unified event type classification
    kind LowCardinality(String) CODEC(ZSTD(1)),
    status LowCardinality(String) CODEC(ZSTD(1)),
    -- Native JSON type for complex attributes
    attributes JSON CODEC(ZSTD(1)),
    attributes_text String MATERIALIZED toJSONString(attributes),
    metadata String CODEC(ZSTD(1)),
    duration UInt64 CODEC(ZSTD(1)),
    expires_at DateTime64(3),
    -- Specialized indexes for different query patterns
    INDEX idx_run_id run_id TYPE bloom_filter(0.001) GRANULARITY 1,
    INDEX idx_span_id span_id TYPE bloom_filter(0.001) GRANULARITY 1,
    INDEX idx_duration duration TYPE minmax GRANULARITY 1,
    INDEX idx_attributes_text attributes_text TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = MergeTree
PARTITION BY toDate(start_time)
ORDER BY (environment_id, toUnixTimestamp(start_time), trace_id)
TTL toDateTime(expires_at) + INTERVAL 7 DAY;
Performance improvements: The migration solved the core limitation we hit with PostgreSQL around dynamic attribute search. PostgreSQL's JSON support made it difficult to efficiently search across the varied OpenTelemetry attributes that change per task and span - creating indexes for every possible attribute path wasn't feasible. Searching log messages and attributes required complex PostgreSQL queries that didn't scale well with billions of records, and grouping and filtering by dynamic JSON attributes was slow, making dashboard queries that aggregate across runs impossible.
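For illustration, here's the kind of query this unlocks (the environment ID and search token are placeholders, not values from our dashboard):

-- Illustrative: find the slowest recent runs whose span attributes mention a given token.
SELECT
    run_id,
    task_identifier,
    max(duration) AS max_duration
FROM task_events_v1
WHERE environment_id = 'env_123'                -- leading column of the ORDER BY key
  AND start_time >= now() - INTERVAL 24 HOUR    -- prunes the daily partitions
  AND hasToken(attributes_text, 'ECONNREFUSED') -- served by the tokenbf_v1 index
GROUP BY run_id, task_identifier
ORDER BY max_duration DESC
LIMIT 20;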
Technical implementation details
Dual-store architecture: We implemented a feature-flagged approach that allows gradual migration:
- Both PostgreSQL and ClickHouse event repositories implement the same IEventRepository interface
- Environment variables control which store is used for reads/writes
- This enables zero-downtime migration and easy rollback if needed
OpenTelemetry data handling: The new system better handles the unique characteristics of OTEL data:
- Native JSON attributes: ClickHouse's JSON data type preserves the full structure of OpenTelemetry attributes without flattening, while the materialized attributes_text column enables full-text search
- Advanced indexing: Bloom filters for exact ID lookups, token-based full-text search on attributes, and minmax indexes for duration queries (example lookups after this list)
- Compression optimizations: ZSTD compression with delta encoding for timestamps and specialized codecs for different data patterns
- Time-based queries: Most log queries are time-range based, perfectly suited for ClickHouse's date partitioning
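To make the indexing strategy concrete, here are two illustrative lookups these indexes are designed to accelerate. The IDs are placeholders, and the duration threshold assumes durations are stored in nanoseconds:

-- Exact span lookup: the bloom_filter indexes on run_id and span_id let
-- ClickHouse skip granules that cannot contain these values.
SELECT message, kind, status, duration
FROM task_events_v1
WHERE environment_id = 'env_123'
  AND run_id = 'run_abc'
  AND span_id = 'span_def';

-- Duration outliers: the minmax index skips granules whose duration range
-- falls entirely below the threshold.
SELECT trace_id, span_id, duration
FROM task_events_v1
WHERE environment_id = 'env_123'
  AND start_time >= now() - INTERVAL 1 HOUR
  AND duration > 5000000000   -- roughly 5 seconds, assuming nanoseconds
ORDER BY duration DESC
LIMIT 50;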
Data consistency model: Unlike PostgreSQL's ACID guarantees, ClickHouse provides eventual consistency, which is acceptable for observability data where slight delays are tolerable in exchange for massive performance gains.
Materialized views for metrics: The ClickHouse migration also enabled us to build real-time usage aggregations using materialized views:
-- Pre-aggregated usage metrics by minute and hour
CREATE MATERIALIZED VIEW mv_task_event_usage_by_minute_v1 AS
SELECT
    organization_id,
    project_id,
    environment_id,
    toStartOfMinute(start_time) AS bucket_start,
    count() AS event_count
FROM task_events_v1
GROUP BY organization_id, project_id, environment_id, bucket_start;
These materialized views automatically maintain real-time usage statistics as events are inserted, powering an upcoming logs metrics feature that will give you insights into your task execution patterns and resource usage without expensive on-the-fly aggregations.
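As an illustration of how a view like this gets queried (the rollup below is an example, not necessarily the exact dashboard query): materialized view rows are written per insert block, so reads typically re-aggregate with sum().

-- Illustrative: hourly event counts for one project over the last day (placeholder IDs).
SELECT
    toStartOfHour(bucket_start) AS hour,
    sum(event_count) AS events
FROM mv_task_event_usage_by_minute_v1
WHERE organization_id = 'org_123'
  AND project_id = 'proj_456'
  AND bucket_start >= now() - INTERVAL 24 HOUR
GROUP BY hour
ORDER BY hour;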
Migration challenges solved
During implementation, we addressed several technical hurdles:
- Unicode handling: Implemented safeguards against unpaired Unicode surrogates that could cause ClickHouse insert failures
- Async insertions: Enabled asynchronous insertions to improve write throughput while maintaining data durability (see the settings sketch after this list)
- DateTime precision: Ensured proper handling of nanosecond-precision timestamps (DateTime64(9)) for accurate trace correlation
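For reference, asynchronous inserts are controlled through ClickHouse settings along these lines; the values and columns below are illustrative rather than our exact production configuration:

-- Let ClickHouse buffer many small inserts and flush them in larger batches.
-- wait_for_async_insert = 1 keeps the durability acknowledgement: the insert
-- only returns once the buffered batch has been written to the table.
SET async_insert = 1;
SET wait_for_async_insert = 1;

-- Placeholder row: columns not listed fall back to their defaults.
INSERT INTO task_events_v1 (environment_id, run_id, span_id, trace_id,
                            start_time, message, kind, status, duration)
VALUES ('env_123', 'run_abc', 'span_def', 'trace_ghi',
        now64(9), 'Task started', 'SPAN', 'OK', 0);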
The result is a system that can efficiently store and query millions of log entries while maintaining the real-time dashboard experience developers expect.
For self-hosted users
If you’re running Trigger.dev yourself, you’ll need to add a ClickHouse instance to your deployment and enable the new logging backend via environment variables. The migration is non-breaking, and the PostgreSQL event store remains supported until you switch over. Check the self-hosting guide for setup details and schema configuration.

