SQL Compiler (Griffin)

Overview

Griffin is QuestDB’s SQL compiler and execution engine. Named after the mythical creature, Griffin transforms SQL text into efficient execution plans optimized for time-series workloads. Location: core/src/main/java/io/questdb/griffin/

Architecture

The SQL compiler follows a multi-stage pipeline:

SQL Text → Lexer → Parser → Query Model (AST) → Optimizer → Code Generator → Execution Plan

Stage 1: Lexing and Parsing

Input: SQL text string Output: QueryModel (Abstract Syntax Tree) Key classes:

SqlParser — Top-level parser: io/questdb/griffin/SqlParser.java
SqlLexer — Tokenizes SQL text
ExpressionParser — Parses expressions: io/questdb/griffin/ExpressionParser.java
QueryModel — AST representation: io/questdb/griffin/model/QueryModel.java

Example:

SELECT symbol, price
FROM trades
WHERE price > 100
ORDER BY ts DESC

Parsed into:

QueryModel {
  columns: [symbol, price]
  tableName: "trades"
  whereClause: ExpressionNode(price > 100)
  orderBy: [ts DESC]
}

Stage 2: Optimization

Input: QueryModel (AST) Output: Optimized QueryModel Key class: SqlOptimiser: io/questdb/griffin/SqlOptimiser.java Optimizations:

Predicate pushdown — Move filters closer to data source
Partition pruning — Skip partitions outside time range
Index selection — Choose best index for filter
Join reordering — Optimize join order
Constant folding — Evaluate constants at compile time
Subquery flattening — Convert subqueries to joins where possible

Example optimization:

-- Original
SELECT * FROM (
  SELECT * FROM trades WHERE symbol = 'AAPL'
) WHERE price > 100;

-- Optimized (predicate pushdown)
SELECT * FROM trades WHERE symbol = 'AAPL' AND price > 100;

Stage 3: Code Generation

Input: Optimized QueryModel Output: RecordCursorFactory Key class: SqlCodeGenerator: io/questdb/griffin/SqlCodeGenerator.java Generates execution plan as a tree of factories:

Each factory creates a RecordCursor for data iteration
Factories compose to form complex plans
Supports lazy evaluation (only compute what’s needed)

Example plan:

FilteredRecordCursorFactory
  ├── filter: price > 100
  └── base: DataFrameRecordCursorFactory
        └── reader: TableReader(trades)

Stage 4: Execution

Input: RecordCursorFactory Output: RecordCursor (iterator over rows) Execution is lazy:

Factory creates cursor
Cursor iterates row-by-row (or in batches)
Each row is a Record with column accessors

Key interfaces:

RecordCursorFactory — Creates cursors
RecordCursor — Iterates over records
Record — Represents a single row
RecordMetadata — Schema information

Key Classes

SqlCompiler / SqlCompilerImpl

Location: io/questdb/griffin/SqlCompilerImpl.java:149 Main entry point for SQL compilation. Key methods:

compile(sql, context) — Compiles SQL to CompiledQuery
compileQuery(sql, context) — Compiles SELECT to RecordCursorFactory
execute(sql, context) — Executes DDL/DML

Example usage:

try (SqlCompiler compiler = engine.getSqlCompiler()) {
    CompiledQuery cq = compiler.compile("SELECT * FROM trades", context);
    try (RecordCursorFactory factory = cq.getRecordCursorFactory()) {
        try (RecordCursor cursor = factory.getCursor(context)) {
            Record record = cursor.getRecord();
            while (cursor.hasNext()) {
                // Read record columns
            }
        }
    }
}

SqlParser

Location: io/questdb/griffin/SqlParser.java Parses SQL text into QueryModel or other model objects. Key methods:

parseQuery(sql) — Parses SELECT statement
parseInsert(sql) — Parses INSERT statement
parseCreateTable(sql) — Parses CREATE TABLE
parseAlterTable(sql) — Parses ALTER TABLE

Parsing strategy:

Recursive descent parser
Zero-allocation (reuses objects from pools)
Error recovery (provides position of syntax errors)

ExpressionParser

Location: io/questdb/griffin/ExpressionParser.java Parses SQL expressions (WHERE clauses, SELECT expressions, etc.). Key methods:

parseExpression(sql) — Parses expression tree
parseFunction(sql) — Parses function calls

Precedence handling: Operator precedence table ensures correct expression trees. Example:

price > 100 AND (symbol = 'AAPL' OR symbol = 'GOOGL')

Parsed into:

ExpressionNode(AND)
  ├── ExpressionNode(>)
  │     ├── price
  │     └── 100
  └── ExpressionNode(OR)
        ├── ExpressionNode(=)
        │     ├── symbol
        │     └── 'AAPL'
        └── ExpressionNode(=)
              ├── symbol
              └── 'GOOGL'

SqlCodeGenerator

Location: io/questdb/griffin/SqlCodeGenerator.java Generates execution plans from optimized query models. Key methods:

generate(queryModel, context) — Generates RecordCursorFactory
generateFilter(queryModel) — Generates filter plan
generateJoin(queryModel) — Generates join plan
generateGroupBy(queryModel) — Generates aggregation plan

Factory selection:

Chooses optimal factory based on query characteristics
Considers indexes, partition pruning, parallelism
Generates SIMD-accelerated code where applicable

FunctionFactory

Location: io/questdb/griffin/FunctionFactory.java Base interface for SQL functions. Implementation:

Each function is a separate class (e.g., SumDoubleGroupByFunction)
Function factories registered in FunctionFactoryCache
Code generation instantiates appropriate function class

Function categories:

Scalar functions (e.g., round(), abs())
Aggregate functions (e.g., sum(), avg(), count())
Window functions (e.g., row_number(), rank())
Time-series functions (e.g., sample_by(), fill())

Example: io/questdb/griffin/engine/functions/math/AbsIntFunctionFactory.java

Query Execution Models

Table Scan

Simple sequential scan of a table: Factory: DataFrameRecordCursorFactory Execution:

Open table reader
Iterate partitions
For each partition, iterate rows
Return each row as Record

Optimization: Skip partitions outside time range (partition pruning)

Filtered Scan

Table scan with WHERE clause: Factory: FilteredRecordCursorFactory Execution:

Base cursor (table scan)
Evaluate filter for each row
Return only matching rows

Optimization: Use indexes where applicable (bitmap index scan)

Index Scan

Use bitmap index to find matching rows: Factory: BitmapIndexRecordCursorFactory Execution:

Lookup value in bitmap index
Get bitmap of matching rows
Iterate only matching rows

Example:

SELECT * FROM trades WHERE symbol = 'AAPL';

If symbol has a bitmap index:

Lookup “AAPL” in index → bitmap
Iterate only rows where bitmap[i] = 1

Performance: O(log n) lookup + O(k) scan (k = matching rows)

Hash Join

Join two tables using a hash table: Factory: HashJoinRecordCursorFactory Execution:

Build phase: Scan smaller table, build hash table on join key
Probe phase: Scan larger table, probe hash table for matches
Return joined rows

Join types supported:

INNER JOIN
LEFT JOIN
OUTER JOIN
CROSS JOIN

Optimization: Reorder joins to build hash table on smaller table

ASOF Join

Time-series specific join (join on nearest timestamp): Factory: AsOfJoinRecordCursorFactory Use case: Join time-series tables with different sampling rates Example:

SELECT *
FROM trades
ASOf JOIN quotes ON (symbol)

For each trade, find the most recent quote with the same symbol. Performance: O(n + m) with timestamp index

SPLICE Join

Time-series full outer join: Factory: SpliceJoinRecordCursorFactory Use case: Merge two time-series tables by timestamp Example:

SELECT *
FROM trades
SPLICE JOIN quotes ON (ts)

Interleave rows from both tables ordered by timestamp.

Group By / Aggregation

Aggregate data by key: Factory: GroupByRecordCursorFactory Execution:

Scan input cursor
For each row, lookup/create group in hash table
Update aggregates for group
Return aggregated results

Aggregate functions:

sum(), avg(), min(), max(), count()
first(), last() (time-series specific)
ksum() (Kahan summation for precision)

Optimization: Use SIMD for aggregations (see SIMD Optimizations)

Sample By

Time-series specific downsampling: Factory: SampleByRecordCursorFactory Use case: Aggregate data into time buckets Example:

SELECT ts, symbol, avg(price), sum(volume)
FROM trades
SAMPLE BY 1h;

Aggregates data into 1-hour buckets. Alignment: Buckets aligned to calendar (e.g., midnight for daily buckets) Fill modes:

FILL(NONE) — Skip empty buckets
FILL(NULL) — Return NULL for empty buckets
FILL(LINEAR) — Interpolate missing values
FILL(PREV) — Forward-fill previous value

LATEST ON

Deduplication by key (return latest row per key): Factory: LatestByAllIndexedRecordCursorFactory Example:

SELECT * FROM trades
LATEST ON ts PARTITION BY symbol;

Returns the most recent trade for each symbol. Performance: O(n) with deduplication index

Query Optimization Techniques

Partition Pruning

Skip partitions outside the query time range. Example:

SELECT * FROM trades
WHERE ts BETWEEN '2024-01-01' AND '2024-01-02';

With daily partitions:

Scan only 2024-01-01/ and 2024-01-02/ partitions
Skip all other partitions

Implementation: IntrinsicModel extracts time range from WHERE clause See: io/questdb/griffin/model/IntrinsicModel.java

Index Selection

Choose the best index for a filter. Heuristic:

Equality on indexed symbol column → bitmap index
IN list on indexed symbol → bitmap index with OR
Range on timestamp → partition pruning + timestamp index
Otherwise → full table scan with filter

Example:

SELECT * FROM trades
WHERE symbol = 'AAPL' AND price > 100;

Optimal plan:

Use bitmap index on symbol = 'AAPL'
Filter remaining rows with price > 100

Predicate Pushdown

Move filters closer to data source to reduce data scanned. Example:

SELECT * FROM (
  SELECT * FROM trades WHERE ts > '2024-01-01'
) WHERE symbol = 'AAPL';

Optimized:

SELECT * FROM trades
WHERE ts > '2024-01-01' AND symbol = 'AAPL';

Both filters applied at table scan (no intermediate results).

Constant Folding

Evaluate constant expressions at compile time. Example:

SELECT * FROM trades WHERE price > 100 * 2;

Optimized:

SELECT * FROM trades WHERE price > 200;

Multiplication computed once at compile time, not per row.

Join Reordering

Optimize join order to minimize intermediate results. Heuristic:

Smaller table on the right (build side of hash join)
Apply filters before joins
Joins on indexed columns preferred

Example:

SELECT * FROM large_table
JOIN small_table ON (key)
WHERE small_table.value > 100;

Optimized plan:

Filter small_table first (value > 100)
Build hash table on filtered small_table
Probe with large_table

JIT Compilation

Location: core/src/main/java/io/questdb/jit/ QuestDB can JIT-compile WHERE clause filters to native code for maximum performance. Supported:

x86-64 only (not available on ARM64)
Simple filters (comparisons, boolean logic)
Numeric types

Example:

SELECT * FROM trades
WHERE price > 100 AND volume < 1000;

Compiled to x86-64 assembly:

mov rax, [price]    ; load price
cmp rax, 100        ; compare with 100
jle skip            ; skip if <=
mov rbx, [volume]   ; load volume
cmp rbx, 1000       ; compare with 1000
jge skip            ; skip if >=
; row matches

Performance: 2-3x faster than interpreted filter evaluation See: io/questdb/jit/JitUtil.java

Parallel Query Execution

QuestDB can parallelize query execution across partitions. Strategy:

Each partition processed independently
Worker threads from shared pool
Results merged in order

Example:

SELECT symbol, sum(volume)
FROM trades
WHERE ts BETWEEN '2024-01-01' AND '2024-12-31'
GROUP BY symbol;

With daily partitions:

Split 365 partitions across N threads
Each thread aggregates its partitions
Merge aggregated results

Configuration:

shared.worker.count — Number of worker threads

Error Handling

SqlException

Location: io/questdb/griffin/SqlException.java All SQL errors throw SqlException with:

Error message
Position in SQL text (character offset)
Error code (optional)

Error position convention: Point to the specific offending character, not the start of the expression. Example:

throw SqlException.$(position, "Invalid column name: ").put(columnName);

Storage Engine — Cairo storage internals
SIMD Optimizations — Vector operations
Architecture Overview — System architecture
Testing — SQL testing conventions

Contributing

Internals

SQL Compiler (Griffin)

Overview

Architecture

Stage 1: Lexing and Parsing

Stage 2: Optimization

Stage 3: Code Generation

Stage 4: Execution

Key Classes

SqlCompiler / SqlCompilerImpl

SqlParser

ExpressionParser

SqlCodeGenerator

FunctionFactory

Query Execution Models

Table Scan

Filtered Scan

Index Scan

Hash Join

ASOF Join

SPLICE Join

Group By / Aggregation

Sample By

LATEST ON

Query Optimization Techniques

Partition Pruning

Index Selection

Predicate Pushdown

Constant Folding

Join Reordering

JIT Compilation

Parallel Query Execution

Error Handling

SqlException

Contributing

Internals

Documentation Index

​Overview

​Architecture

​Stage 1: Lexing and Parsing

​Stage 2: Optimization

​Stage 3: Code Generation

​Stage 4: Execution

​Key Classes

​SqlCompiler / SqlCompilerImpl

​SqlParser

​ExpressionParser

​SqlCodeGenerator

​FunctionFactory

​Query Execution Models

​Table Scan

​Filtered Scan

​Index Scan

​Hash Join

​ASOF Join

​SPLICE Join

​Group By / Aggregation

​Sample By

​LATEST ON

​Query Optimization Techniques

​Partition Pruning

​Index Selection

​Predicate Pushdown

​Constant Folding

​Join Reordering

​JIT Compilation

​Parallel Query Execution

​Error Handling

​SqlException

​Related Pages

Overview

Architecture

Stage 1: Lexing and Parsing

Stage 2: Optimization

Stage 3: Code Generation

Stage 4: Execution

Key Classes

SqlCompiler / SqlCompilerImpl

SqlParser

ExpressionParser

SqlCodeGenerator

FunctionFactory

Query Execution Models

Table Scan

Filtered Scan

Index Scan

Hash Join

ASOF Join

SPLICE Join

Group By / Aggregation

Sample By

LATEST ON

Query Optimization Techniques

Partition Pruning

Index Selection

Predicate Pushdown

Constant Folding

Join Reordering

JIT Compilation

Parallel Query Execution

Error Handling

SqlException

Related Pages