Cardinality is the killer

Build a Prometheus-style time-series database (12 scenes)

Scene 09 · Cardinality is the killer

Each unique label-set is one series with its own head chunk in RAM. Add an unbounded label like user_id and you OOM in minutes.

Previously

The index is fast — until you make it index a million tiny postings lists. Cardinality is where the architecture meets reality.

Scene 09

Cardinality is the killer

Watch

Diagram

Left: a grid of label-set combinations. Each cell is one time series the database has to hold a head chunk for. Right: a vertical RAM bar with a red dashed BUDGET line at 8 GB. As dangerous labels (host, user_id) get added, the cell count multiplies and the bar climbs; once the bar pierces the budget line, the head OOMs and the banner reads 'OOM-killed at t+4 min'.

Start with two safe labels — method and status. Four series, head chunk barely visible on the RAM bar. Then add `path` with 7 bounded values: 28 series, still calm. Both labels are *enumerable*, and that is what keeps the curve flat.

Implementation

Series.id

the full label-set IS the series identity

1def series_id(metric_name, labels):
2    # canonical: name + sorted (k, v) pairs
3    parts = [metric_name]
4    for k in sorted(labels.keys()):
5        parts.append(k + '=' + labels[k])
6    # one distinct value of ANY label
7    # mints a brand-new series id
8    return hash('|'.join(parts))

Head.onSample

lookup-or-create per scraped sample — create is the OOM path

1def on_sample(metric_name, labels, ts, value):
2    sid = series_id(metric_name, labels)
3    series = head.index.get(sid)
4    if series is None:
5        # NEW series: head chunk + index entry +
6        # one postings insert per label.
7        series = HeadSeries(labels)
8        head.index[sid] = series
9        for k, v in labels.items():
10            postings[(k, v)].append(sid)
11    series.append(ts, value)

Head.estimateRamMb

cardinality is the PRODUCT of label cardinalities

1HEAD_CHUNK_BYTES   = 12_288  # ~12 KB live chunk
2INDEX_ENTRY_BYTES  =  4_096  # postings + label strings
3BUDGET_MB          =  8_000  # head RAM budget
4 
5def estimate_ram_mb(dimensions, churn):
6    series = 1
7    for d in dimensions:
8        series *= len(d.values)   # MULTIPLY, not add
9    bytes_per = HEAD_CHUNK_BYTES + INDEX_ENTRY_BYTES
10    stale = 1.25 if churn else 1.0
11    mb = series * bytes_per * stale / (1024 * 1024)
12    assert mb < BUDGET_MB, 'OOM-killed'

PreviousInverted index — labels to series NextDownsampling — a retention pyramid