Build a Prometheus-style time-series database (12 scenes)
Scene 07 · How a query becomes points
A read is four stages — parse, resolve label-selectors to series IDs, decompress the matching chunks, then aggregate. Stage 3 dominates.
Previously

Writes are durable. Now flip the system: a query says `{method=POST, status=500}` and we need to walk from those labels to actual bytes on disk.

Diagram
Four stages from left to right: Parse (the query becomes selectors and a time range), Resolve (selectors hit a black-box index that emits a small set of series IDs), Decompress (the matching chunks unpack into raw points), Aggregate (the points fold into a single result). A timing bar at the bottom shows ms spent per stage.
[Diagram panels: QUERY sum(rate(http_requests_total{method=POST,status=500}[5m])) → 1 · PARSE (selectors {__name__=http_requests_to…}, {method=POST}, {status=500}; range [5m]) → 2 · SERIES RESOLVE (index, posting list, series IDs) → 3 · DECOMPRESS (chunks, raw points) → 4 · AGGREGATE (reduce, Δ/t, rate(), output). Timing bar: per stage, 0 ms total. Initial state: read path idle, query waiting (range [5m]).]
A PromQL query enters on the left. Watch it walk the four stages: parse splits text into selectors and a range, resolve hits a (still-opaque) index that emits a small set of series IDs, decompress unpacks the matching chunks for the [5m] window, and aggregate folds them into one number. The timing bar at the bottom shows where the milliseconds went.
Implementation
TSDB.query
the read path: four sequential stages in three calls (stages 3 and 4 share one)
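def query(text, t0, t1, fn):
    # window is the parsed range selector, e.g. [5m]; it is unused here
    # because the caller already passes the scan bounds t0 and t1
    selectors, window = parseQuery(text)
    series_ids = resolveSeries(selectors)
    result = decompressAndAggregate(
        series_ids, t0, t1, fn,
    )
    return result  # the aggregated value, not raw points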
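The per-stage timing bar can be approximated by wrapping each stage in a wall-clock timer. The following is a minimal sketch, not the scene's actual instrumentation: `timed_query`, the `stages` list, and the stand-in lambdas are all hypothetical, and each stage is assumed to take the previous stage's output as its only argument.

```python
import time

def timed_query(stages, initial):
    """Run stages in order; return (result, {stage name: elapsed ms})."""
    timings = {}
    value = initial
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)
        timings[name] = (time.perf_counter() - start) * 1000.0
    return value, timings

# Stand-in stages (hypothetical): each consumes the previous stage's output.
stages = [
    ("parse", lambda text: [("method", "POST"), ("status", "500")]),
    ("resolve", lambda selectors: ["S3", "S7"]),
    ("decompress", lambda ids: [(0, 10.0), (15, 13.0), (30, 19.0)]),
    # Toy aggregate: (last value - first value) / elapsed seconds.
    ("aggregate", lambda pts: (pts[-1][1] - pts[0][1]) / (pts[-1][0] - pts[0][0])),
]

result, timings = timed_query(
    stages, "sum(rate(http_requests_total{method=POST,status=500}[5m]))"
)
```

With real stages plugged in, `timings` gives the per-stage breakdown that the bar visualises, which is one way to check the claim that decompression dominates.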
parseQuery
stage 1 — tokenise text into selectors and a range
def parseQuery(text):
    ast = promql.parse(text)
    selectors = []
    for matcher in ast.label_matchers:
        selectors.append(
            (matcher.name, matcher.value),
        )
    # named window so it does not shadow Python's built-in range
    window = ast.range  # e.g. [5m], [1h], [30d]
    return selectors, window
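The listing leaves `promql.parse` as a stand-in. As a rough illustration of what stage 1 produces, the sketch below pulls the matchers and the range out of the scene's simplified query syntax with a regular expression. It is a toy under stated assumptions, not a PromQL parser: `parse_selector` is a hypothetical name, only equality matchers are handled, values may not contain commas, and only the `s`, `m`, `h`, `d` units are recognised.

```python
import re

# Duration units in seconds (a subset; an assumption of this sketch).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

SELECTOR_RE = re.compile(
    r"(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)"     # metric name
    r"\{(?P<matchers>[^}]*)\}"                 # {label=value, ...}
    r"(?:\[(?P<num>\d+)(?P<unit>[smhd])\])?"   # optional [5m] range
)

def parse_selector(text):
    """Return (selectors, range_seconds) for the first selector in text."""
    m = SELECTOR_RE.search(text)
    if m is None:
        raise ValueError(f"no selector found in {text!r}")
    # The metric name is treated as one more label, __name__.
    selectors = [("__name__", m.group("name"))]
    for part in m.group("matchers").split(","):
        part = part.strip()
        if not part:
            continue
        label, _, value = part.partition("=")
        # Accept both quoted and unquoted values.
        selectors.append((label.strip(), value.strip().strip('"')))
    range_seconds = None
    if m.group("num") is not None:
        range_seconds = int(m.group("num")) * UNIT_SECONDS[m.group("unit")]
    return selectors, range_seconds

selectors, range_seconds = parse_selector(
    "sum(rate(http_requests_total{method=POST,status=500}[5m]))"
)
```

The output is three `(label, value)` pairs plus a range of 300 seconds, which matches the three selectors and the `[5m]` range shown in the diagram's PARSE panel.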
resolveSeries
stage 2 — selectors hit the inverted index (black box)
def resolveSeries(selectors):
    # postings list per (label, value)
    # cost depends on cardinality, NOT on range
    postings = [
        index.postings(label, value)
        for (label, value) in selectors
    ]
    return intersect(postings)  # → {S3, S7}
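The index is deliberately a black box in this scene, but the intersection step can be sketched on its own. The example below assumes one plausible shape: a dictionary from `(label, value)` to an ascending list of integer series IDs, intersected with a two-pointer merge. The names `intersect_sorted` and `resolve_series` and the index contents are hypothetical.

```python
def intersect_sorted(a, b):
    """Two-pointer intersection of two ascending lists of series IDs."""
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out

def resolve_series(index, selectors):
    """Intersect the postings lists for every (label, value) selector."""
    postings = [index.get(sel, []) for sel in selectors]
    if not postings:
        return []
    # Start from the shortest list so the running result stays small.
    postings.sort(key=len)
    result = postings[0]
    for plist in postings[1:]:
        result = intersect_sorted(result, plist)
        if not result:
            break  # nothing can match once the intersection is empty
    return result

# Hypothetical index contents; series IDs are plain integers here.
index = {
    ("__name__", "http_requests_total"): [1, 2, 3, 5, 7, 8],
    ("method", "POST"): [3, 4, 7, 9],
    ("status", "500"): [2, 3, 7],
}
ids = resolve_series(index, [
    ("__name__", "http_requests_total"),
    ("method", "POST"),
    ("status", "500"),
])
```

The work here scales with the lengths of the postings lists, not with the query's time range, which is the point of the "cardinality, NOT range" comment in the listing.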
decompressAndAggregate
stages 3 & 4 — unpack chunks in [t0,t1], then fold
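def decompressAndAggregate(ids, t0, t1, fn):
    points = []
    for sid in ids:
        for chunk in chunksFor(sid):
            if chunk.overlaps(t0, t1):
                # delta-of-delta + XOR decode; a chunk can straddle the
                # window edge, so keep only samples inside [t0, t1]
                # (assumes decompress() yields (timestamp, value) pairs)
                points += [
                    (t, v) for (t, v) in chunk.decompress()
                    if t0 <= t <= t1
                ]
    return fn(points)  # rate / sum / avg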
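To make the "delta-of-delta" comment concrete, here is a minimal sketch of the timestamp half of the decode, followed by a simplified rate over the decoded points. It works on plain integers rather than a packed bitstream, leaves out the XOR value encoding entirely, and the rate ignores the counter resets and extrapolation that a production system handles. `dod_encode`, `dod_decode`, and `simple_rate` are hypothetical names.

```python
def dod_encode(timestamps):
    """Encode ascending timestamps as [first, delta-of-deltas...]."""
    if not timestamps:
        return []
    out = [timestamps[0]]
    prev_delta = 0  # so the first stored entry is the first delta itself
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        out.append(delta - prev_delta)
        prev_delta = delta
    return out

def dod_decode(encoded):
    """Invert dod_encode by accumulating deltas, then timestamps."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for dod in encoded[1:]:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps

def simple_rate(points, t0, t1):
    """Per-second increase between the first and last sample in [t0, t1]."""
    window = [(t, v) for (t, v) in points if t0 <= t <= t1]
    if len(window) < 2:
        return 0.0
    (first_t, first_v), (last_t, last_v) = window[0], window[-1]
    return (last_v - first_v) / (last_t - first_t)

# A regular 15-second scrape interval compresses to mostly zeros.
raw_ts = [1000, 1015, 1030, 1045, 1060]
values = [100.0, 103.0, 106.0, 109.0, 112.0]
encoded = dod_encode(raw_ts)
points = list(zip(dod_decode(encoded), values))
rate = simple_rate(points, 1015, 1060)
```

Decoding is inherently sequential, since each timestamp depends on the running delta before it. That gives some intuition for why stage 3 tends to dominate: every overlapping chunk has to be walked from its start, even when only its tail falls inside the window.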