How a query becomes points

Build a Prometheus-style time-series database (12 scenes)

Scene 07 · How a query becomes points

A read has four stages: parse, resolve label selectors to series IDs, decompress the matching chunks, then aggregate. Stage 3, decompression, dominates the time spent.

Previously

Writes are durable. Now flip the system: a query says `{method="POST", status="500"}` and we need to walk from those labels to actual bytes on disk.

Diagram

Four stages from left to right: Parse (the query becomes selectors and a time range), Resolve (selectors hit a black-box index that emits a small set of series IDs), Decompress (the matching chunks unpack into raw points), Aggregate (the points fold into a single result). A timing bar at the bottom shows ms spent per stage.

A PromQL query enters on the left. Watch it walk the four stages: parse splits text into selectors and a range, resolve hits a (still-opaque) index that emits a small set of series IDs, decompress unpacks the matching chunks for the [5m] window, and aggregate folds them into one number. The timing bar at the bottom shows where the milliseconds went.

Implementation

TSDB.query

the read path: four sequential stages in three calls (stages 3 and 4 share one)

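def query(text, t0, t1, fn):
    # stage 1: text -> label selectors and the range selector (e.g. [5m]);
    # the caller supplies the absolute bounds t0 and t1
    selectors, window = parseQuery(text)
    # stage 2: selectors -> series IDs
    series_ids = resolveSeries(selectors)
    # stages 3 and 4: decode chunks overlapping [t0, t1], then fold with fn
    result = decompressAndAggregate(
        series_ids, t0, t1, fn,
    )
    return result  # the aggregate, not the raw points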
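The timing bar in the diagram can be approximated by wrapping each stage call in a timer. The following is a minimal sketch, not the scene's implementation: `run_timed` and the four toy stage functions are hypothetical stand-ins, and only `time.perf_counter` comes from the standard library.

```python
import time


def run_timed(stages, value):
    """Run stages in order, feeding each output to the next stage.

    Returns the final value and a dict of elapsed milliseconds per stage.
    """
    timings_ms = {}
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        timings_ms[name] = (time.perf_counter() - start) * 1000.0
    return value, timings_ms


# Toy stages (hypothetical): each one only mimics the shape of the real stage.
stages = [
    ("parse", lambda text: text.strip("{}").split(",")),
    ("resolve", lambda selectors: [3, 7]),
    ("decompress", lambda ids: [float(i) for i in ids for _ in range(1000)]),
    ("aggregate", lambda points: sum(points) / len(points)),
]

result, timings_ms = run_timed(stages, '{method="POST",status="500"}')
print(result, timings_ms)
```

With real stages plugged in, the `decompress` entry is the one to watch, since that is where the scene says the time goes.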

parseQuery

stage 1 — tokenise text into selectors and a range

def parseQuery(text):
    # promql stands in for a PromQL parser; it is not defined in this scene
    tree = promql.parse(text)
    # each matcher becomes a (name, value) pair; the match operator is ignored
    selectors = [
        (matcher.name, matcher.value)
        for matcher in tree.label_matchers
    ]
    window = tree.range  # e.g. [5m], [1h], [30d]
    return selectors, window
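To make stage 1 concrete without a full PromQL parser, here is a small sketch that handles only the shape used in this scene: a brace-enclosed list of equality matchers with an optional range such as `[5m]`. The name `parse_selector` and both regular expressions are illustrative assumptions. The sketch splits matchers on commas, so label values that contain a comma are not supported.

```python
import re

# One name="value" pair; equality matchers only.
_MATCHER = re.compile(r'\s*([A-Za-z_][A-Za-z0-9_]*)\s*=\s*"([^"]*)"\s*')
# {matchers} followed by an optional [duration] such as [5m].
_QUERY = re.compile(r'^\s*\{(.*)\}\s*(?:\[(\d+)([smhd])\])?\s*$')
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_selector(text):
    """Return (selectors, window_seconds); window_seconds is None if absent."""
    m = _QUERY.match(text)
    if m is None:
        raise ValueError(f"not a selector: {text!r}")
    body, amount, unit = m.groups()
    selectors = []
    for part in body.split(","):
        if not part.strip():
            continue  # tolerate an empty body or a trailing comma
        pm = _MATCHER.fullmatch(part)
        if pm is None:
            raise ValueError(f"bad matcher: {part!r}")
        selectors.append((pm.group(1), pm.group(2)))
    window_s = int(amount) * _UNIT_SECONDS[unit] if amount else None
    return selectors, window_s


selectors, window_s = parse_selector('{method="POST", status="500"}[5m]')
print(selectors, window_s)
```

Unquoted values such as `status=500` are rejected with a `ValueError`, which matches PromQL's requirement that label values be quoted strings.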

resolveSeries

stage 2 — selectors hit the inverted index (black box)

def resolveSeries(selectors):
    # postings list per (label, value)
    # cost depends on cardinality, NOT on range
    postings = [
        index.postings(label, value)
        for (label, value) in selectors
    ]
    return intersect(postings)  # → {S3, S7}
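The index stays a black box in this scene, but the `postings` and `intersect` calls can be illustrated with a toy version. This sketch assumes postings are sorted lists of integer series IDs and that series are registered in increasing ID order. `PostingsIndex` and its methods are hypothetical names, not the layout a real TSDB uses on disk.

```python
from collections import defaultdict


class PostingsIndex:
    """Toy inverted index: (label, value) -> sorted list of series IDs."""

    def __init__(self):
        self._postings = defaultdict(list)

    def add(self, series_id, labels):
        # Assumes IDs arrive in increasing order, so each list stays sorted.
        for pair in labels.items():
            self._postings[pair].append(series_id)

    def postings(self, label, value):
        return self._postings.get((label, value), [])


def intersect(lists):
    """Intersect sorted ID lists pairwise, starting from the shortest."""
    if not lists:
        return []
    lists = sorted(lists, key=len)
    result = lists[0]
    for other in lists[1:]:
        merged, i, j = [], 0, 0
        while i < len(result) and j < len(other):
            if result[i] == other[j]:
                merged.append(result[i])
                i += 1
                j += 1
            elif result[i] < other[j]:
                i += 1
            else:
                j += 1
        result = merged
    return result


index = PostingsIndex()
index.add(1, {"method": "GET", "status": "200"})
index.add(3, {"method": "POST", "status": "500"})
index.add(5, {"method": "POST", "status": "200"})
index.add(7, {"method": "POST", "status": "500"})

ids = intersect([
    index.postings("method", "POST"),
    index.postings("status", "500"),
])
print(ids)  # [3, 7]
```

Nothing here touches a timestamp, which is why the cost of this stage tracks the length of the postings lists rather than the query's time range.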

decompressAndAggregate

stages 3 & 4 — unpack chunks in [t0,t1], then fold

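def decompressAndAggregate(ids, t0, t1, fn):
    points = []
    for sid in ids:
        for chunk in chunksFor(sid):
            # skip chunks entirely outside the window without decoding them
            if not chunk.overlaps(t0, t1):
                continue
            # delta-of-delta + XOR decode; a boundary chunk also holds
            # samples outside [t0, t1], so trim them (samples are assumed
            # to be (timestamp, value) pairs)
            points += [
                (t, v) for (t, v) in chunk.decompress()
                if t0 <= t <= t1
            ]
    return fn(points)  # rate / sum / avg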
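The "delta-of-delta" half of the decode step can be shown on plain integers. This is a simplified sketch for timestamps only: real chunk formats pack the values into variable-width bit fields and encode sample values separately with XOR, and both parts are left out here. The function names are illustrative assumptions.

```python
def encode_timestamps(timestamps):
    """Return [first, first_delta, dod, dod, ...] for a list of ints."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev, prev_delta = timestamps[0], 0
    for t in timestamps[1:]:
        delta = t - prev
        # Store how much the gap changed, not the gap itself.
        out.append(delta - prev_delta)
        prev, prev_delta = t, delta
    return out


def decode_timestamps(encoded):
    """Invert encode_timestamps by accumulating deltas, then timestamps."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps


raw = [1000, 1015, 1030, 1046, 1061]
encoded = encode_timestamps(raw)
print(encoded)  # [1000, 15, 0, 1, -1]
print(decode_timestamps(encoded) == raw)  # True
```

Regular scrape intervals make most delta-of-delta values zero or close to it, which is what lets them fit in very few bits. The price is that decoding is sequential: a sample in the middle of a chunk can only be reached by decoding everything before it.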
