Design your LSM — and feel its trades — Build an LSM-tree storage engine (LevelDB / RocksDB style)

Build an LSM-tree storage engine (LevelDB / RocksDB style) (11 scenes)

Scene 11 · Design your LSM — and feel its trades

Capstone: pick a workload, set knobs, watch the verifier trace each choice back to the scene that taught it. The amp triangle is the load-bearing trade.

Previously

All ten knobs are on the table. The capstone asks: given a workload, which ones do you turn — and which scene's insight justifies each turn?

Diagram

TOP: the workload SLO card (SLO = Service Level Objective, what the workload promises its users; here, read latency, disk budget, and write rate). MIDDLE: the LSM as you've built it: memtable + WAL (scene 2) → L0 (just-flushed SSTables, scene 3) → L1+ (compacted, leveled, scene 7). RIGHT: the three roles, each labeled with the knobs it owns: READER (consults memtable + bloom + blocks), WRITER (memtable + WAL only), COMPACTOR (rewrites SSTables). BOTTOM: three trade-off bars forming the AMP TRIANGLE (write-amp / read-amp / space-amp, from scene 8), plus a live VERIFIER panel that flags each knob choice green when the insight from its earlier scene justifies it and red when that insight contradicts it.
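
To get a feel for the three bars before touching the knobs, here is a rough sketch of the amp triangle in Python. The formulas are common back-of-envelope rules of thumb, not figures from the course or from any engine, and the names `Amp`, `estimate_amp`, and `l0_files` are hypothetical. Only the direction of the trade matters: leveled compaction pays write-amp to buy lower read-amp and space-amp, and tiered does the reverse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Amp:
    write: float  # bytes written to disk per byte of user data
    read: float   # sorted runs a point lookup may have to consult
    space: float  # bytes on disk per byte of live data


def estimate_amp(policy: str, levels: int, multiplier: int, l0_files: int = 4) -> Amp:
    """Back-of-envelope amplification; every formula here is an assumed rule of thumb."""
    if policy == "leveled":
        return Amp(
            write=multiplier * levels,   # roughly `multiplier` rewrites per level descended
            read=l0_files + levels,      # overlapping L0 files, then one run per level
            space=1 + 1 / multiplier,    # stale data mostly sits above the last level
        )
    if policy == "tiered":
        return Amp(
            write=levels,                # roughly one rewrite per level
            read=multiplier * levels,    # up to `multiplier` runs per level
            space=2.0,                   # assumed: old and new runs coexist during merges
        )
    raise ValueError(f"unknown policy: {policy!r}")


leveled = estimate_amp("leveled", levels=3, multiplier=10)
tiered = estimate_amp("tiered", levels=3, multiplier=10)
print(leveled)
print(tiered)
```

Treat the output as an ordering, not a prediction. Real engines deviate from these numbers depending on key distribution, L0 triggers, and partial compactions.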

Sources

OLTP mixed: 10M keys × 1KB values, p99 read <5ms, disk budget 2× live data. Default knobs are pre-loaded; the verifier traces each to its scene.
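
The workload card turns into concrete numbers with a little arithmetic. The sketch below makes several assumptions that are not in the card: "1KB" means 1024 bytes, keys average 16 bytes, L1 holds 256 MiB, the level multiplier is 10, and bloom filters use 10 bits per key. All function names are hypothetical. The bloom false-positive estimate uses the standard approximation 0.6185^(bits/key), which holds when the number of hash functions is optimal.

```python
KEYS = 10_000_000          # from the workload card
VALUE_BYTES = 1024         # "1KB" read as 1024 bytes (assumption)
DISK_BUDGET_FACTOR = 2.0   # disk budget = 2x live data, from the workload card


def live_bytes(keys: int, value_bytes: int, key_bytes: int = 16) -> int:
    """Logical data size; key_bytes=16 is an assumed average key size."""
    return keys * (key_bytes + value_bytes)


def levels_needed(total_bytes: int, l1_bytes: int, multiplier: int) -> int:
    """Smallest n such that level Ln can hold total_bytes when each level
    is `multiplier` times larger than the one above it."""
    levels, capacity = 1, l1_bytes
    while capacity < total_bytes:
        capacity *= multiplier
        levels += 1
    return levels


def bloom_ram_bytes(keys: int, bits_per_key: int) -> int:
    """RAM needed to keep every bloom filter resident."""
    return keys * bits_per_key // 8


def bloom_false_positive_rate(bits_per_key: float) -> float:
    """Approximate false-positive rate with an optimal hash count."""
    return 0.6185 ** bits_per_key


data = live_bytes(KEYS, VALUE_BYTES)
budget = DISK_BUDGET_FACTOR * data
print(f"live data:   {data / 1e9:.1f} GB")
print(f"disk budget: {budget / 1e9:.1f} GB")
print(f"levels:      {levels_needed(data, 256 * 1024 * 1024, 10)}")
print(f"bloom RAM:   {bloom_ram_bytes(KEYS, 10) / 1e6:.1f} MB")
print(f"bloom FPR:   {bloom_false_positive_rate(10):.4f}")
```

A 2× disk budget leaves little room for stale copies, so under these assumptions the space-amp bar is likely the first one to watch.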

Implementation

LSM design checklist

knob → workload axis → earlier scene

memtable size       → recovery time vs flush frequency  (scenes 2, 3)
compaction policy   → write-amp vs read/space-amp       (scene 8)
level multiplier    → #levels vs per-compaction work    (scene 7)
bloom bits/key      → miss-cost vs RAM                  (scene 5)
block size          → point-read I/O vs compression     (scene 10)
compression         → CPU vs disk bytes                 (scene 10)
block_cache_size    → hot-working-set fits in RAM?      (scene 10)
tombstone retention → safe deletion vs space            (scene 9)
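
The checklist can also be held as data, which is roughly what a verifier panel needs in order to trace a choice back to its scene. This is a minimal sketch, not the course's verifier: `CHECKLIST` and `review` are hypothetical names, the knob names are copied from the table above, and the example values passed in are arbitrary.

```python
# knob -> (workload axis, scenes that taught it), copied from the checklist
CHECKLIST = {
    "memtable size": ("recovery time vs flush frequency", (2, 3)),
    "compaction policy": ("write-amp vs read/space-amp", (8,)),
    "level multiplier": ("#levels vs per-compaction work", (7,)),
    "bloom bits/key": ("miss-cost vs RAM", (5,)),
    "block size": ("point-read I/O vs compression", (10,)),
    "compression": ("CPU vs disk bytes", (10,)),
    "block_cache_size": ("hot-working-set fits in RAM?", (10,)),
    "tombstone retention": ("safe deletion vs space", (9,)),
}


def review(choices):
    """Return (trace lines, knobs left unset). Unknown knobs raise KeyError."""
    trace = []
    for knob, value in choices.items():
        axis, scenes = CHECKLIST[knob]
        scene_list = ", ".join(str(s) for s in scenes)
        trace.append(f"{knob} = {value}: {axis} (scene {scene_list})")
    missing = [knob for knob in CHECKLIST if knob not in choices]
    return trace, missing


# Example values are arbitrary illustrations, not recommended settings.
trace, missing = review({"compaction policy": "leveled", "bloom bits/key": 10})
for line in trace:
    print(line)
print("still unset:", missing)
```

Listing the unset knobs is the useful part: a design that silently leaves a knob at its default has still made a choice on that axis.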

When LSM is the wrong answer

scene 1 is still in scope

# fat values, low key cardinality, all point reads
#   → Bitcask wins (scene 1 contrast)
# transactional multi-key OLTP
#   → B-tree DB (Postgres, MySQL); LSM compaction is a tax here
# columnar OLAP
#   → Parquet + columnar engine; LSM cannot beat sort-merge-with-pushdown
# tiny dataset that fits in RAM
#   → just use a hash table
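
The four cases above can be read as a short decision function. The sketch below is only one way to encode them: the `Workload` fields and `suggest_engine` are hypothetical names, and the order in which the rules are checked is an assumption, since the list itself does not rank them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    fits_in_ram: bool = False
    columnar_analytics: bool = False
    multi_key_transactions: bool = False
    fat_values: bool = False
    low_key_cardinality: bool = False
    point_reads_only: bool = False


def suggest_engine(w: Workload) -> str:
    """Map a workload to the alternatives listed above; fall back to an LSM-tree."""
    if w.fits_in_ram:
        return "hash table"
    if w.columnar_analytics:
        return "Parquet + columnar engine"
    if w.multi_key_transactions:
        return "B-tree DB"
    if w.fat_values and w.low_key_cardinality and w.point_reads_only:
        return "Bitcask"
    return "LSM-tree"


print(suggest_engine(Workload(fits_in_ram=True)))
print(suggest_engine(Workload(fat_values=True, low_key_cardinality=True,
                              point_reads_only=True)))
print(suggest_engine(Workload()))
```

Real workloads rarely set one flag cleanly, so treat the result as a prompt to revisit scene 1 rather than a verdict.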
