Column-Oriented Storage
QuestDB stores table data in a columnar format where each column is written to separate files. This architecture provides several advantages for time-series workloads:
File Layout
For a table named trades with columns timestamp, symbol, price, quantity:
Column File Types
Each column type has specific storage characteristics:
Fixed-width columns (INT, LONG, DOUBLE, TIMESTAMP):
- Stored in .d files as contiguous arrays
- Random access via direct offset calculation: offset = rowIndex * columnWidth
- Memory-mapped for zero-copy reads
Variable-length columns:
- Data file (.d): Contains actual variable-length data
- Index file (.i): Contains 8-byte offsets to data file
- Two-level lookup: index[rowIndex] → offset → data
SYMBOL columns:
- .v file: Dictionary of unique string values
- .k file: Integer keys per row (maps to dictionary)
- .o file: Offsets into .v file
- Enables integer-based comparisons and bitmap indexing
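As a rough illustration of the fixed-width and variable-length access patterns, the sketch below packs two columns into in-memory byte buffers and reads single rows back. The little-endian encoding, the index layout with N+1 offsets, and the helper names (read_fixed, build_var_column, read_var) are assumptions made for this example; they are not QuestDB's actual on-disk format or API.

```python
import struct

def read_fixed(data: bytes, row_index: int, fmt: str = "<d"):
    """Read one fixed-width value using offset = rowIndex * columnWidth."""
    width = struct.calcsize(fmt)
    return struct.unpack_from(fmt, data, row_index * width)[0]

def build_var_column(values):
    """Build a (data, index) pair; the index holds N+1 8-byte offsets (assumed layout)."""
    data = bytearray()
    index = bytearray()
    for value in values:
        index += struct.pack("<q", len(data))  # start offset of this row
        data += value.encode("utf-8")
    index += struct.pack("<q", len(data))      # end offset of the last row
    return bytes(data), bytes(index)

def read_var(data: bytes, index: bytes, row_index: int) -> str:
    """Two-level lookup: index[rowIndex] -> offset -> data."""
    start = struct.unpack_from("<q", index, row_index * 8)[0]
    end = struct.unpack_from("<q", index, (row_index + 1) * 8)[0]
    return data[start:end].decode("utf-8")

# Three rows of a DOUBLE-like column and a string-like column.
prices = struct.pack("<3d", 101.5, 102.0, 99.75)
names_data, names_index = build_var_column(["BTC-USD", "ETH-USD", "SOL"])
```

Reading row 2 of the fixed-width buffer touches only bytes 16 to 23, while the variable-length read needs two index lookups before it can slice the data buffer.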
Memory-Mapped I/O
QuestDB uses memory-mapped files extensively for performance:
Readers map column files as MemoryCMR (Memory Contiguous Mapped Read) regions (source: core/src/main/java/io/questdb/cairo/TableReader.java:90), allowing the OS to manage page cache transparently.
Writers use MemoryMA (Memory Mapped Appendable) for efficient appends (source: core/src/main/java/io/questdb/cairo/TableWriter.java:191).
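To make the memory-mapped read path concrete, here is a minimal sketch that uses Python's standard mmap module as a stand-in. It writes a small fixed-width column to a temporary file, maps it read-only, and fetches one row by offset. The file name quantity.d, the 8-byte little-endian encoding, and the helper names are assumptions for illustration; this is not how MemoryCMR itself is implemented.

```python
import mmap
import os
import struct
import tempfile

def write_column(path, values):
    """Write 64-bit integers back to back, like a fixed-width column file."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}q", *values))

def read_row(path, row_index):
    """Map the whole file read-only and decode a single 8-byte value."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Only the page holding this offset has to be resident;
            # the OS page cache decides what stays in memory.
            return struct.unpack_from("<q", mm, row_index * 8)[0]

tmp_dir = tempfile.mkdtemp()
column_path = os.path.join(tmp_dir, "quantity.d")
write_column(column_path, [10, 20, 30, 40])
third_row = read_row(column_path, 2)
```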
Partitioning Strategy
Time-Based Partitioning
Tables are partitioned automatically based on the designated timestamp column. The PartitionBy class defines supported intervals:
Partition Directory Naming
Partitions use formatted timestamps as directory names:
- HOUR: 2024-03-01T14
- DAY: 2024-03-01
- WEEK: 2024-W09
- MONTH: 2024-03
- YEAR: 2024
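The naming scheme can be reproduced with a short helper. The function below is a sketch that derives a directory name from a timestamp for each interval listed here. The function name partition_dir_name is hypothetical, and the use of ISO week numbering for WEEK is an assumption that happens to match the 2024-W09 example.

```python
from datetime import datetime, timezone

def partition_dir_name(ts: datetime, partition_by: str) -> str:
    """Format a timestamp as a partition directory name (illustrative only)."""
    if partition_by == "HOUR":
        return ts.strftime("%Y-%m-%dT%H")
    if partition_by == "DAY":
        return ts.strftime("%Y-%m-%d")
    if partition_by == "WEEK":
        # Assumes ISO-8601 week numbering.
        iso_year, iso_week, _ = ts.isocalendar()
        return f"{iso_year}-W{iso_week:02d}"
    if partition_by == "MONTH":
        return ts.strftime("%Y-%m")
    if partition_by == "YEAR":
        return ts.strftime("%Y")
    raise ValueError(f"unsupported partition interval: {partition_by}")

sample_ts = datetime(2024, 3, 1, 14, 30, tzinfo=timezone.utc)
names = {unit: partition_dir_name(sample_ts, unit)
         for unit in ("HOUR", "DAY", "WEEK", "MONTH", "YEAR")}
```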
Benefits of Partitioning
- Query pruning: Skip partitions outside query time range
- Parallel processing: Process multiple partitions concurrently
- Data lifecycle: Drop old partitions for efficient data retention
- Write isolation: The current partition receives writes while older partitions are read-only
- Backup efficiency: Copy individual partitions incrementally
Multi-Tier Storage
QuestDB implements a multi-tier storage pipeline: WAL → Native → Parquet.
Tier 1: Write-Ahead Log (WAL)
For WAL-enabled tables, writes go through the WAL layer:
WalWriter buffers incoming data in WAL segments before applying it to the table (source: core/src/main/java/io/questdb/cairo/wal/WalWriter.java:110-118).
Characteristics:
- Append-only segments
- Fast writes with minimal fsync
- Multiple concurrent writers
- Asynchronously applied to native storage
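The interplay of these characteristics can be sketched in a few lines. The toy model below gives each writer its own append-only segment, records commits in a shared order, and lets a separate apply step copy committed rows into the table later. All class and function names (WalSegment, Sequencer, WalWriterSketch, apply_wal) are hypothetical and exist only for this example; the real WalWriter is considerably more involved.

```python
from dataclasses import dataclass, field

@dataclass
class WalSegment:
    writer_id: int
    rows: list = field(default_factory=list)

class Sequencer:
    """Records committed batches in a single global order (hypothetical helper)."""
    def __init__(self):
        self.log = []  # entries of (segment, start_row, end_row)

    def commit(self, segment, start, end):
        self.log.append((segment, start, end))
        return len(self.log)  # sequence number of this commit

class WalWriterSketch:
    def __init__(self, writer_id, sequencer):
        self.segment = WalSegment(writer_id)
        self.sequencer = sequencer
        self.committed = 0

    def append(self, row):
        # Append-only: each writer touches only its own segment.
        self.segment.rows.append(row)

    def commit(self):
        start, end = self.committed, len(self.segment.rows)
        self.committed = end
        return self.sequencer.commit(self.segment, start, end)

def apply_wal(sequencer, table, applied_upto=0):
    """Copy committed batches into the table in commit order; return the new watermark."""
    for segment, start, end in sequencer.log[applied_upto:]:
        table.extend(segment.rows[start:end])
    return len(sequencer.log)

seq = Sequencer()
w1, w2 = WalWriterSketch(1, seq), WalWriterSketch(2, seq)
w1.append("a1"); w2.append("b1"); w1.append("a2")
w2.commit(); w1.commit()
table = []
watermark = apply_wal(seq, table)
```

In this model, rows reach the table in commit order rather than append order, and uncommitted rows stay invisible until a later apply pass.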
Tier 2: Native Columnar Format
The primary storage format after WAL application:
TableWriter manages native column files and partitions (source: core/src/main/java/io/questdb/cairo/TableWriter.java:155).
Characteristics:
- Column files (.d, .i, .k, .v)
- Time-based partitions
- Optimized for query performance
- Supports updates and schema changes
Tier 3: Parquet Format
Older partitions can be converted to Parquet.
Parquet partitions are read through PartitionDecoder (source: core/src/main/java/io/questdb/cairo/TableReader.java:93-94).
Characteristics:
- Better compression ratios
- S3-compatible storage
- Read-only (immutable)
- Industry-standard format
Storage Tier Transitions
Transaction Management
Transaction File (_txn)
The transaction file tracks the current state of the table:
TxReader reads the transaction file for consistent snapshot isolation (source: core/src/main/java/io/questdb/cairo/TableReader.java:148-152).
Contains:
- Transaction number (sequentially incrementing)
- Row counts per partition
- Timestamp min/max per partition
- Column structure version
- Attached/detached partition list
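A small model helps show why recording the transaction number and per-partition row counts is enough for snapshot isolation. In the sketch below, a reader captures a snapshot and only ever reads up to the row counts stored in it, so rows appended later stay invisible. TxnSnapshot and TableSketch are hypothetical names for this example and do not mirror the real _txn file layout.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TxnSnapshot:
    txn: int
    partition_row_counts: tuple  # ((partition_name, row_count), ...)

class TableSketch:
    def __init__(self):
        self.partitions = {}  # partition name -> rows, may include uncommitted rows
        self.txn = 0
        self.committed = {}   # partition name -> committed row count

    def append(self, partition, row):
        self.partitions.setdefault(partition, []).append(row)

    def commit(self):
        # Publishing a commit bumps the transaction number and row counts together.
        self.txn += 1
        self.committed = {name: len(rows) for name, rows in self.partitions.items()}

    def snapshot(self):
        return TxnSnapshot(self.txn, tuple(sorted(self.committed.items())))

    def read(self, snap):
        # A reader never looks past the row counts captured in its snapshot.
        rows = []
        for name, count in snap.partition_row_counts:
            rows.extend(self.partitions[name][:count])
        return rows

t = TableSketch()
t.append("2024-03-01", {"price": 1})
t.append("2024-03-01", {"price": 2})
t.commit()
snap = t.snapshot()
t.append("2024-03-01", {"price": 3})  # not yet committed
```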
Column Version File (_cv)
Tracks column schema evolution:
ColumnVersionReader tracks schema changes per partition (source: core/src/main/java/io/questdb/cairo/TableReader.java:142).
Supports:
- Adding/dropping columns
- Column type changes
- Per-partition column versions
- Schema evolution without rewriting data
Partition Formats
Tables can contain partitions in different formats:
Native Format
- Individual column files
- Random read/write access
- Optimal for recent, active data
Parquet Format
- Single .parquet file per partition
- Compressed, columnar
- Read-only, optimal for archival
Mixed-Format Tables
A single table can have:
- Recent partitions in native format (fast writes)
- Historical partitions in Parquet format (space efficiency)
Compression
Native Format Compression
- SYMBOL columns: Dictionary compression (unique values stored once)
- Fixed-width columns: OS-level page compression (if supported)
- Sparse columns: Null bitmap reduces storage
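Dictionary compression for SYMBOL columns can be pictured with a short sketch: each unique string is stored once, and every row holds only a small integer key. The class name SymbolColumnSketch and its methods are hypothetical, and the in-memory lists stand in for the dictionary and key files.

```python
class SymbolColumnSketch:
    def __init__(self):
        self.dictionary = []  # unique values, each stored once
        self.key_of = {}      # value -> integer key
        self.keys = []        # one integer key per row

    def append(self, value):
        key = self.key_of.get(value)
        if key is None:
            key = len(self.dictionary)
            self.dictionary.append(value)
            self.key_of[value] = key
        self.keys.append(key)

    def get(self, row_index):
        return self.dictionary[self.keys[row_index]]

    def rows_equal_to(self, value):
        """Filter by comparing integer keys instead of strings."""
        key = self.key_of.get(value)
        if key is None:
            return []
        return [i for i, k in enumerate(self.keys) if k == key]

symbols = SymbolColumnSketch()
for s in ["BTC-USD", "ETH-USD", "BTC-USD", "BTC-USD", "ETH-USD"]:
    symbols.append(s)
```

Five rows end up as two dictionary entries plus five integers, and an equality filter never has to compare strings row by row.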
Parquet Format Compression
Parquet partitions use configurable compression:
- SNAPPY (default, balanced)
- GZIP (higher compression)
- ZSTD (modern, efficient)
- LZ4 (fastest)
Storage Patterns
Hot-Warm-Cold Architecture
Deduplication via LATEST ON
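In QuestDB SQL, a query such as SELECT * FROM trades LATEST ON timestamp PARTITION BY symbol returns the most recent row for each key. The sketch below emulates that semantics over a list of dictionaries so the behaviour is easy to inspect. The function name latest_on and the sample rows are assumptions for illustration, and the loop says nothing about how the engine actually executes the query.

```python
def latest_on(rows, ts_field, key_field):
    """Return the row with the greatest timestamp for each key, ordered by timestamp."""
    latest = {}
    for row in rows:
        key = row[key_field]
        current = latest.get(key)
        # On equal timestamps, the row seen later in the input wins.
        if current is None or row[ts_field] >= current[ts_field]:
            latest[key] = row
    return sorted(latest.values(), key=lambda r: r[ts_field])

trades = [
    {"timestamp": 1, "symbol": "BTC-USD", "price": 100.0},
    {"timestamp": 2, "symbol": "ETH-USD", "price": 10.0},
    {"timestamp": 3, "symbol": "BTC-USD", "price": 101.0},
    {"timestamp": 4, "symbol": "ETH-USD", "price": 11.0},
]
latest_rows = latest_on(trades, "timestamp", "symbol")
```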
For tables with multiple timestamped rows per key, LATEST ON returns the most recent row for each key.
Performance Implications
Column Pruning
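Because every column lives in its own file, a query that references one column never has to open the others. The sketch below writes a two-column partition to a temporary directory and then reads back a single column while recording which files were opened. The helper names and the little-endian double encoding are assumptions for this example.

```python
import os
import struct
import tempfile

def write_partition(partition_dir, columns):
    """Write each column to its own <name>.d file of 8-byte doubles (assumed encoding)."""
    for name, values in columns.items():
        with open(os.path.join(partition_dir, name + ".d"), "wb") as f:
            f.write(struct.pack(f"<{len(values)}d", *values))

def read_columns(partition_dir, wanted, row_count):
    """Open only the files of the requested columns; return (data, files_opened)."""
    result, opened = {}, []
    for name in wanted:
        path = os.path.join(partition_dir, name + ".d")
        with open(path, "rb") as f:
            raw = f.read(row_count * 8)
        result[name] = list(struct.unpack(f"<{row_count}d", raw))
        opened.append(os.path.basename(path))
    return result, opened

partition_dir = tempfile.mkdtemp()
write_partition(partition_dir, {"price": [101.5, 102.0], "quantity": [3.0, 7.0]})
data, opened = read_columns(partition_dir, ["price"], row_count=2)
```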
Only the column files a query references are read.
Partition Pruning
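Since each partition covers a known time range, a time filter can be checked against partition names before any data is touched. The following sketch applies that idea to DAY partitions, treating each one as the half-open range from midnight to the next midnight. The function name prune_day_partitions and the half-open query interval are assumptions for this example.

```python
from datetime import datetime, timedelta

def prune_day_partitions(partition_names, query_start, query_end):
    """Keep DAY partitions whose range overlaps [query_start, query_end)."""
    kept = []
    for name in partition_names:
        part_start = datetime.strptime(name, "%Y-%m-%d")
        part_end = part_start + timedelta(days=1)
        # Two half-open intervals overlap when each starts before the other ends.
        if part_start < query_end and query_start < part_end:
            kept.append(name)
    return kept

partitions = ["2024-02-28", "2024-02-29", "2024-03-01", "2024-03-02"]
selected = prune_day_partitions(
    partitions,
    datetime(2024, 2, 29, 12),  # query start
    datetime(2024, 3, 1, 6),    # query end (exclusive)
)
```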
Entire partitions are skipped when they fall outside the query's time filter.
Parallel Execution
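Partitions are independent units of data, so an aggregate can be computed per partition and the partial results combined afterwards. The sketch below illustrates the pattern with Python's ThreadPoolExecutor. The function name parallel_sum, the worker count, and the in-memory partitions are assumptions for this example and do not reflect QuestDB's actual worker pool.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(partitions, workers=4):
    """Sum each partition on a worker thread, then combine the partial sums."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum, partitions.values()))
    return sum(partials)

quantity_by_partition = {
    "2024-03-01": [1, 2, 3],
    "2024-03-02": [4, 5],
    "2024-03-03": [6],
}
total_quantity = parallel_sum(quantity_by_partition)
```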
Multiple worker threads process partitions concurrently.
See Also
- Partitioning - Detailed partitioning strategies
- WAL Architecture - Write-Ahead Log implementation
- SQL Extensions - Query capabilities