Build a columnar OLAP store (ClickHouse / Druid style) (13 scenes)
Scene 08 · ORDER BY — filtering becomes range-scan
Rows inside a part are sorted by ORDER BY; WHERE on a leftmost-prefix column is a contiguous range, anything else is a full scan.
Previously

Merge keeps the part count bounded, but what a query does inside each part — scan everything, or jump to a range — depends on how the rows were sorted when the part was written.

Scene 08
ORDER BY: filtering becomes range-scan
Diagram
One part shown as a horizontal strip of 60 row tiles in physical on-disk order, colored by the leading sort-key column. A WHERE clause at the top of the strip either draws a contiguous green range bracket (the engine reads one slice) or scattered red arrows (the engine has to scan every tile). Below the strip, a badge reports the cardinality of the leading sort-key column — the lever that decides whether ORDER BY can prune at all.
WHEREWHERE service = 'api'SORT KEYORDER BY (service, ts)ONE PART · 0 rows in physical on-disk orderLEADING-COLUMN CARDINALITYservice: 3 distinct values (low)filling part…Sort key decides what is contiguous on disk; WHERE on a leading prefix is a range scan.
Watch the part fill up with 60 row tiles in physical on-disk order — all the api rows first, then web, then cron, because the part was written with ORDER BY (service, ts). Once the strip lands, WHERE service='api' lights up the first 20 tiles as a contiguous range — the engine reads that slice and nothing else.
Implementation
Planner.plan_query(part, where)
walk ORDER BY left-to-right, match WHERE, pick range vs scan
1def plan_query(part, where):
2 sort_cols = part.order_by # e.g. ('service', 'ts')
3 # Find the longest prefix of sort_cols that WHERE
4 # constrains with an equality / range predicate.
5 prefix = []
6 for col in sort_cols:
7 if where.constrains(col):
8 prefix.append(col)
9 else:
10 break # gap kills everything to the right
11 if not prefix:
12 return FullPartScan() # non-prefix WHERE
13 return RangeScanBySortKey(prefix, where)
RangeScanBySortKey.execute(part, where)
on (service, ts): seek to service='api', read until it flips
1def range_scan_by_sort_key(part, where):
2 # Binary-search the in-memory primary.idx to slice
3 # the granule range where the prefix could live.
4 lo = bisect_left(part.sort_columns, where.prefix_lo)
5 hi = bisect_right(part.sort_columns, where.prefix_hi)
6 # Seek to first row where service = 'api',
7 # then sequential read until service != 'api'.
8 for row in part.rows[lo:hi]:
9 if where.matches(row):
10 yield row
# Leftmost-prefix rule
the schema-design law the planner is enforcing
1# ORDER BY (A, B):
2# WHERE A = ... -> range scan (fast)
3# WHERE A = .. AND B = .. -> range scan (fastest)
4# WHERE B = ... -> full scan (B's range
5# restarts inside every A)
6#
7# ORDER BY (user_id, ts) with user_id unique per row:
8# primary.idx has one entry per granule, each a
9# different user_id -> no granule can be skipped,
10# every WHERE degenerates to a full part scan.
11# This is the canonical ClickHouse anti-pattern.