8 Time-Series Database Engines Compared — 22 Million Row Benchmark
When you need to store and analyze tens of millions of time-series records, which database should you choose? IoT sensor logs, server metrics, real-time traffic data, financial tick data — large-scale time-indexed datasets hit the limits of general-purpose RDBMS fast. We benchmarked 8 database engines against 22 million time-series rows under identical conditions. Aggregate query speeds ranged from 25ms to 3,493ms — a 140x difference.
Why Dedicated Time-Series Databases?
Time-series data has fundamentally different characteristics from general relational data. Writes are almost exclusively append-only in chronological order. Reads are dominated by time-range aggregate queries. Updates are rare. A standard PostgreSQL query that takes 3,493ms drops to 25ms on a purpose-built engine. The difference comes from storage layout, indexing strategy, and query optimization designed for temporal data.
Time-Series Data Characteristics
- Write pattern: 99%+ append-only, ordered by timestamp
- Read pattern: Time-range aggregation (5-min, 1-hour, 1-day buckets)
- Volume: Tens of thousands to millions of rows per second
- Retention: Old data gets downsampled or deleted
- Compression: Time-sorted data achieves 90%+ compression
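The time-range read pattern above reduces to simple bucket arithmetic: every timestamp maps to the start of its interval, and rows are aggregated per bucket. A minimal Python sketch (the 300-second width and toy rows are illustrative):

```python
# Map epoch timestamps to the start of their 5-minute (300 s) bucket,
# then aggregate per bucket -- the core operation behind time-bucketed queries.
from collections import defaultdict

def bucket_start(ts: int, width: int = 300) -> int:
    """Floor a Unix timestamp to the start of its bucket."""
    return ts - (ts % width)

# Toy rows: (timestamp, value)
rows = [(1000, 2.0), (1100, 4.0), (1300, 6.0)]

sums = defaultdict(float)
for ts, value in rows:
    sums[bucket_start(ts)] += value

print(dict(sums))  # two buckets: one starting at 900, one at 1200
```

Dedicated engines bake this mapping into their storage layout and query syntax instead of computing it per row at query time.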
General RDBMS Limitations
- Row-based storage scans unnecessary columns during aggregation
- B-tree indexes are inefficient for time-range scans
- VACUUM overhead degrades performance after mass INSERTs
- No native compression — storage costs grow linearly
- Time-based partitioning requires manual management
Common Use Cases
| Domain | Example Data |
|---|---|
| Server monitoring | CPU, memory, disk metrics |
| IoT / sensors | Temperature, humidity, vibration readings |
| Traffic analysis | Request count, response time, error rate |
| Financial data | Price movements, trading volume |
Test Environment and Evaluation Criteria
We tested with 22 million time-series rows (timestamp + 4 numeric columns + volume), measuring time-range aggregate query response times. Beyond raw performance, we scored each engine across 6 practical criteria on a 100-point scale.
Dataset Specifications
- Row count: 22 million
- Columns: timestamp, open, high, low, close, volume
- Time span: ~20 years of mixed minute/daily bars
- Raw size: ~4.2GB (CSV)
Benchmark Queries
- Time-range aggregation (5-min / 1-hour / 1-day buckets)
- FIRST/LAST value extraction
- MAX/MIN/SUM calculations
- Multi-symbol filter + aggregation
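To make the query shapes concrete, here is roughly what the 5-minute OHLCV aggregation looks like in two dialects, held as Python strings. The table and column names are hypothetical, not the benchmark's actual schema:

```python
# Hypothetical 5-minute OHLCV aggregation in two dialects.
# Table name `bars` and its columns are illustrative.

TIMESCALEDB_QUERY = """
SELECT time_bucket('5 minutes', ts) AS bucket,
       first(open, ts) AS open,
       max(high)       AS high,
       min(low)        AS low,
       last(close, ts) AS close,
       sum(volume)     AS volume
FROM bars
GROUP BY bucket
ORDER BY bucket;
"""

QUESTDB_QUERY = """
SELECT ts, first(open) AS open, max(high) AS high,
       min(low) AS low, last(close) AS close, sum(volume) AS volume
FROM bars
SAMPLE BY 5m;
"""
```

Note how QuestDB's SAMPLE BY replaces the GROUP BY over a bucket expression; the engine knows the designated timestamp column and buckets on it natively.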
Six Evaluation Criteria
- SQL compatibility: Standard SQL support level
- AI/ML integration: Python, vector, and pipeline compatibility
- Frontend speed: Dashboard response time
- Ecosystem: Tools, drivers, community size
- Ease of learning: How easy the engine is to adopt and operate
- Scalability: Horizontal scaling, clustering support
Overall Rankings
Combining aggregate query performance, ingestion throughput, compression ratio, and the 6 practical evaluation criteria, here are the final rankings on a 100-point scale.
| # | Engine | Type | Query Speed | Ingestion | Compression | Score |
|---|---|---|---|---|---|---|
| 1 | PostgreSQL + TimescaleDB | Time-series RDBMS | 1,021ms | 220K rows/s | 90%+ | 92/100 |
| 2 | ClickHouse | Columnar OLAP | 547ms | 1M+ rows/s | 95%+ | 85/100 |
| 3 | QuestDB | Time-series DB | 25ms | 4M+ rows/s | 85%+ | 73/100 |
| 4 | DuckDB | Embedded OLAP | ~500ms | Local optimized | Parquet | 68/100 |
| 5 | InfluxDB 3.0 | Time-series DB | N/A | High | Arrow+Parquet | 55/100 |
| 6 | MongoDB | Document NoSQL | Slow | Medium | WiredTiger | 45/100 |
| 7 | TDengine | IoT Time-series | ~300ms | High | 90%+ | 42/100 |
| 8 | Apache Druid | Real-time OLAP | Fast | High | Columnar | 38/100 |
Why the Top 3 Stand Out
- TimescaleDB (92 points): The best balance of PostgreSQL compatibility and time-series optimization. Existing PostgreSQL infrastructure can be reused directly.
- ClickHouse (85 points): Fastest pure analytical query speed. Ingests 1M+ rows/s with 95%+ compression. Ideal for analytics-heavy workloads.
- QuestDB (73 points): 25ms aggregate queries are unmatched, but limited ecosystem and missing OLTP features lower its overall score.
Why Lower-Ranked Engines Score Low
- MongoDB (45): Not purpose-built for time-series. Document model creates structural limitations for aggregation.
- TDengine (42): IoT-specialized with limited general-purpose use. Small community.
- Apache Druid (38): Extremely high operational complexity. Unrealistic for small teams.
Aggregate Query Benchmark
Time-range aggregate queries (OHLCV aggregation per time bucket) were run against all 22 million rows, with each query written in the engine's native syntax for a fair comparison. KDB+, a commercial engine not among the 8 ranked above, is included for reference.
| Engine | Query Time | vs PostgreSQL |
|---|---|---|
| QuestDB | 25ms | 140x faster |
| KDB+ | 109ms (commercial) | 32x faster |
| TDengine | ~300ms | 12x faster |
| DuckDB | ~500ms | 7x faster |
| ClickHouse | 547ms | 6x faster |
| TimescaleDB | 1,021ms | 3.4x faster |
| PostgreSQL | 3,493ms | baseline |
Why Such a Massive Speed Difference?
PostgreSQL (3,493ms) uses row-based storage that scans all columns during aggregation, lacks native time-bucket functions, and suffers from excessive I/O without compression or partitioning.
QuestDB (25ms) uses columnar storage with native time partitioning, a SAMPLE BY syntax optimized for time aggregation, and memory-mapped I/O that minimizes disk access.
Why Query Speed Alone Is Not Enough
QuestDB is the fastest at 25ms, yet it ranks 3rd overall. It lacks UPDATE/DELETE support, has limited JOIN capabilities, and offers a smaller ecosystem, all significant constraints in production. TimescaleDB at 1,021ms is roughly 40x slower on raw aggregation, but gives you full PostgreSQL functionality. Its Continuous Aggregates feature incrementally materializes aggregate results as new data arrives, bringing actual dashboard response times down to single-digit milliseconds.
Deep Comparison by Engine
PostgreSQL + TimescaleDB (Rank 1 — Score 92/100)
TimescaleDB extends PostgreSQL with time-series optimizations while preserving 100% SQL compatibility. Key features include the time_bucket(), FIRST(), and LAST() functions, Continuous Aggregates for automated real-time caching, and 90%+ native compression. The pgvector extension adds AI vector search capability.
Strengths: Minimal migration from existing PostgreSQL, all PostgreSQL tools and drivers work out of the box, Continuous Aggregates automate materialized views. Weaknesses: Pure analytical query speed trails columnar databases, multi-node horizontal scaling has been discontinued.
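Continuous Aggregates are defined declaratively. A hedged sketch of the DDL, held as Python strings, with a hypothetical `bars` hypertable and illustrative interval values:

```python
# Sketch of a TimescaleDB continuous aggregate over a hypothetical `bars`
# hypertable -- incrementally maintained, so dashboards read precomputed
# buckets instead of rescanning raw rows.
CONTINUOUS_AGG_DDL = """
CREATE MATERIALIZED VIEW bars_5m
WITH (timescaledb.continuous) AS
SELECT time_bucket('5 minutes', ts) AS bucket,
       first(open, ts) AS open,
       max(high)       AS high,
       min(low)        AS low,
       last(close, ts) AS close,
       sum(volume)     AS volume
FROM bars
GROUP BY bucket;
"""

# A refresh policy keeps the view current automatically (intervals illustrative):
REFRESH_POLICY = """
SELECT add_continuous_aggregate_policy('bars_5m',
    start_offset      => INTERVAL '1 hour',
    end_offset        => INTERVAL '5 minutes',
    schedule_interval => INTERVAL '5 minutes');
"""
```

Dashboard queries then hit `bars_5m` directly, which is how the single-digit-millisecond response times mentioned earlier become achievable despite the slower raw scan.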
ClickHouse (Rank 2 — Score 85/100)
ClickHouse is a columnar OLAP database optimized for analytical queries. It achieves 547ms aggregate queries, ingests over 1 million rows per second, and compresses data by 95%+. Materialized Views provide automatic aggregation, and native horizontal scaling is built in.
Strengths: Best-in-class analytical speed, massive ingestion throughput, fully open source (Apache 2.0). Weaknesses: Limited UPDATE/DELETE (not suitable for OLTP), requires separate operation from PostgreSQL, steeper learning curve, no transaction support.
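A sketch of how a ClickHouse table for this workload might be declared, held as Python strings. The schema and partitioning choices are illustrative assumptions, not the benchmark's actual setup:

```python
# Sketch of a ClickHouse table tuned for time-series: columnar MergeTree
# engine, monthly partitions, (symbol, ts) ordering for range scans.
# All names are illustrative.
CLICKHOUSE_DDL = """
CREATE TABLE bars (
    ts     DateTime,
    symbol LowCardinality(String),
    open   Float64, high Float64, low Float64, close Float64,
    volume UInt64
) ENGINE = MergeTree
PARTITION BY toYYYYMM(ts)
ORDER BY (symbol, ts);
"""

CLICKHOUSE_QUERY = """
SELECT toStartOfInterval(ts, INTERVAL 5 MINUTE) AS bucket,
       max(high) AS high, min(low) AS low, sum(volume) AS volume
FROM bars
WHERE ts BETWEEN '2020-01-01' AND '2020-12-31'
GROUP BY bucket
ORDER BY bucket;
"""
```

The PARTITION BY and ORDER BY clauses are what let range queries skip most of the data on disk, which is where much of the 6x speedup over row-based PostgreSQL comes from.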
QuestDB (Rank 3 — Score 73/100)
QuestDB is a purpose-built time-series database with the fastest aggregate query speed in this benchmark at 25ms. It natively supports ASOF JOIN (critical for time-series data alignment), SAMPLE BY for time aggregation, and LATEST ON for retrieving the most recent values. Ingestion peaks at 4M+ rows/s.
Strengths: Unmatched time-series query speed, PostgreSQL wire protocol compatibility, completely free (Apache 2.0). Weaknesses: No UPDATE/DELETE, limited JOIN support, small ecosystem, requires a separate DB for metadata management.
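The QuestDB-specific clauses mentioned above look roughly like this, held as Python strings with hypothetical table and column names:

```python
# Sketches of QuestDB's time-series clauses (names are illustrative).

# LATEST ON: the most recent row per symbol, without a self-join.
LATEST_ON_QUERY = """
SELECT * FROM bars
LATEST ON ts PARTITION BY symbol;
"""

# ASOF JOIN: align each trade with the most recent quote at or
# before the trade's timestamp, per symbol.
ASOF_JOIN_QUERY = """
SELECT t.ts, t.price, q.bid, q.ask
FROM trades t
ASOF JOIN quotes q ON (symbol);
"""
```

Both operations are awkward and slow to express in standard SQL (correlated subqueries or window functions over the full table), which is a large part of QuestDB's appeal for time-aligned data.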
DuckDB (Rank 4 — Score 68/100)
DuckDB is an embedded OLAP engine — install with pip install duckdb and query Parquet/CSV files directly. Native Python/Pandas integration makes it ideal for ML pipelines and feature engineering. It supports ASOF JOIN and runs entirely in-process.
Strengths: Zero-install, queries files directly, deep Python integration. Weaknesses: Single-process (no concurrent connections), not designed for production serving, unsuitable for real-time ingestion.
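The in-process workflow can be sketched in a few lines. This assumes the duckdb package is installed (the import is guarded so the sketch degrades to plain Python otherwise), and the Parquet path is a placeholder:

```python
# DuckDB runs in-process: no server, just a library call.
try:
    import duckdb
except ImportError:  # sketch still illustrates the workflow without it
    duckdb = None

# Typical usage queries files directly (path is illustrative):
PARQUET_QUERY = "SELECT max(high) FROM read_parquet('bars.parquet')"

def max_of(values):
    """Tiny demo: run an aggregate through DuckDB when available."""
    if duckdb is None:
        return max(values)  # pure-Python fallback
    con = duckdb.connect()  # in-memory database, nothing to install or start
    rows = ", ".join(f"({v})" for v in values)
    return con.execute(f"SELECT max(v) FROM (VALUES {rows}) AS t(v)").fetchone()[0]

print(max_of([3, 1, 2]))  # 3
```

The same connection object accepts Pandas DataFrames as queryable tables, which is what makes it convenient inside feature-engineering code.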
InfluxDB 3.0 (Rank 5 — Score 55/100)
InfluxDB 3.0 is a complete rewrite in Rust with an Arrow + Parquet architecture. It switches from Flux to SQL as the primary query language and removes cardinality limits. However, the Core edition has a 72-hour retention limit, Continuous Aggregates are not supported, and it is incompatible with previous InfluxDB versions.
Remaining Engines (Ranks 6–8)
| Engine | Type | Best For | Limitation |
|---|---|---|---|
| MongoDB | Document NoSQL | Schema-flexible log storage | Slow aggregation, not optimized for time-series |
| TDengine | IoT time-series | Mass IoT device ingestion | Small community, low SQL compatibility |
| Apache Druid | Real-time OLAP | Large-scale real-time dashboards | Extreme operational complexity |
Recommended Architecture
Trying to solve everything with a single database fails. The key is placing the right tool at each layer: ingestion, storage, analysis, and serving.
| Layer | Recommended Tools | Purpose |
|---|---|---|
| Data ingestion | Python, Celery + Redis, APScheduler | Periodic collection from multiple sources |
| Main storage | PostgreSQL + TimescaleDB | Unified time-series, metadata, and user data |
| Caching | Redis (open source) | Recent data cache, API response cache, Pub/Sub |
| Analytics | DuckDB / ClickHouse | Batch analysis, statistics, ML feature generation |
| AI/ML | Python + PyTorch, DuckDB, Parquet | Model training, embedding, inference pipeline |
| API | FastAPI / Next.js API | Serve data to frontend and AI modules |
| Frontend | Next.js + React, charting libraries | Real-time charts, dashboards, visualization |
Why Multi-Database?
No single engine covers all needs. TimescaleDB is slow for heavy analytics — pair it with ClickHouse or DuckDB. ClickHouse cannot handle OLTP — you still need PostgreSQL for metadata. QuestDB has limited JOINs — complex queries require a different engine.
Recommended combinations by scale:
- Small: TimescaleDB alone (sufficient for most use cases)
- Medium: TimescaleDB + DuckDB (analytics supplement)
- Large: TimescaleDB + ClickHouse + Redis (full pipeline)
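The caching layer in the pipeline above typically follows the cache-aside pattern. A minimal sketch, using a dict as a stand-in for Redis and a plain function as a stand-in for the database query:

```python
# Cache-aside: serve recent results from cache, fall back to the database
# on a miss, and populate the cache on the way out.
import time

CACHE: dict = {}   # stand-in for Redis
TTL_SECONDS = 30   # illustrative freshness window for "recent data"

def fetch_metrics(key: str, query_db):
    """Return cached data when fresh, otherwise query and cache it."""
    entry = CACHE.get(key)
    if entry is not None and time.monotonic() - entry[1] < TTL_SECONDS:
        return entry[0]                     # cache hit
    value = query_db(key)                   # cache miss: hit the database
    CACHE[key] = (value, time.monotonic())
    return value

calls = []
def slow_db_query(key):
    calls.append(key)                       # track how often the DB is hit
    return f"rows-for-{key}"

print(fetch_metrics("cpu", slow_db_query))  # miss: queries the DB
print(fetch_metrics("cpu", slow_db_query))  # hit: served from cache
print(len(calls))                           # 1
```

With Redis in place of the dict, the same pattern keeps dashboard endpoints in millisecond territory even when the underlying aggregate query takes hundreds of milliseconds.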
Use-Case Selection Guide
General-Purpose Time-Series Storage
When you need CRUD + time-series aggregation in a single database with existing PostgreSQL infrastructure: choose TimescaleDB (score 92/100). Full SQL compatibility, plus pgvector for AI vector search.
Large-Scale Analytics / Dashboards
When serving real-time aggregate dashboards over hundreds of millions of rows: choose ClickHouse (547ms aggregation, 1M+ rows/s ingestion, 95% compression).
Ultra-Low-Latency Time-Series Queries
When millisecond response times are required for real-time monitoring or trading systems: choose QuestDB (25ms aggregation, native ASOF JOIN, read-only analytics layer).
ML/AI Pipeline
For data exploration, feature engineering, and model training in Python: choose DuckDB (one-line install, direct Parquet queries, Pandas integration).
Common Mistakes to Avoid
- Sticking with plain PostgreSQL: Once data exceeds a few million rows, aggregate queries slow dramatically. Adding TimescaleDB alone provides a 3.4x speedup.
- Choosing by benchmark speed alone: QuestDB's 25ms is impressive, but missing UPDATE support and limited JOINs create real production constraints.
- Expecting InfluxDB 2.x to 3.0 migration: They are completely different engines. All Flux queries must be rewritten in SQL.
- Using MongoDB as a time-series DB: Time Series Collections exist, but aggregation performance significantly trails purpose-built engines.
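On the InfluxDB migration point, here is roughly what one rewritten query looks like, a Flux 2.x query next to a plausible SQL 3.0 equivalent, held as Python strings. Bucket, measurement, and field names are hypothetical:

```python
# The same 5-minute mean aggregation in InfluxDB 2.x Flux and in
# InfluxDB 3.0 SQL (names are illustrative, not from the benchmark).

FLUX_2X = """
from(bucket: "metrics")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu")
  |> aggregateWindow(every: 5m, fn: mean)
"""

SQL_3X = """
SELECT date_bin(INTERVAL '5 minutes', time) AS bucket, avg(usage) AS mean
FROM cpu
WHERE time >= now() - INTERVAL '1 hour'
GROUP BY bucket;
"""
```

Nothing in the Flux pipeline carries over syntactically, which is why every saved query, task, and dashboard panel needs a hand rewrite.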
Conclusion
There is no universal database for large-scale time-series data processing. The key principle is separating the main storage layer from the analytics layer.
- Main storage: PostgreSQL + TimescaleDB (score 92, 100% SQL compatible)
- Analytics acceleration: ClickHouse or DuckDB based on workload
- Ultra-low latency: QuestDB as a read-only query layer
- Caching: Redis to keep frontend response times in milliseconds
Storage infrastructure performance is not determined by database engine choice alone. The physical disk arrangement matters just as much — NVMe 3-tier storage strategies and per-device I/O benchmarks are equally important components of a comprehensive data infrastructure design.