Build an LSM-tree storage engine (LevelDB / RocksDB style) (11 scenes)
Scene 11 · Design your LSM — and feel its trades
Capstone: pick a workload, set knobs, watch the verifier trace each choice back to the scene that taught it. The amp triangle is the load-bearing trade.
Previously

All ten knobs are on the table. The capstone asks: given a workload, which ones do you turn — and which scene's insight justifies each turn?

Diagram
TOP: the workload SLO card (Service Level Objective: what the workload promises its users; here, read latency, disk budget, write rate). MIDDLE: the LSM as you've built it: memtable+WAL (scene 2) → L0 (just-flushed SSTables, scene 3) → L1+ (compacted, leveled, scene 7). RIGHT: the three roles, READER (consults memtable + bloom + blocks), WRITER (memtable + WAL only), and COMPACTOR (rewrites SSTables), each labeled with the knobs it owns. BOTTOM: three trade-off bars, the AMP TRIANGLE (write-amp / read-amp / space-amp from scene 8), and a live VERIFIER panel that flags each knob choice against the earlier scene whose insight justifies (green) or contradicts (red) it.
Workload: OLTP mixed (10M keys × 1 KB, p99 read <5ms, disk budget 2× live). Mixed read/write OLTP. Disk budget 2× live data, so space-amp matters. p99 read <5ms, so read-amp matters. Write rate ~10 MB/s, so write-amp is tolerable.
Layout: memtable (64 MB) + WAL → L0 (just-flushed: L0/A, L0/B, L0/C, L0/D) → L1+ (L1/0, L1/1, L1/2, L1/3, L1/…)
Compaction policy: leveled
Roles:
reader (bloom 10 bits/key, blo…): owns memtable, L0/A, L1/0
writer (memtable + WAL): owns WAL, memtable
compactor (leveled, lz4): owns L0/A, L0/B, L0/C +4
Trade-offs: Throughput 4/5 · Durability 4/5 · Complexity 4/5
Warnings: leveled matches the workload (scene lsm-08). leveled + 10 b/k bloom
OLTP mixed: 10M keys × 1KB values, p99 read <5ms, disk budget 2× live data. Default knobs are pre-loaded; the verifier traces each to its scene.
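The SLO card implies a few numbers worth working out before touching any knob. The sketch below is back-of-envelope arithmetic only, using the figures from the workload card (10M keys, 1 KB values, 2× disk budget, 10 bits/key bloom, 64 MB memtable, ~10 MB/s writes). It assumes 1 KB = 1024 bytes, ignores key and metadata overhead, and uses the standard textbook Bloom filter false-positive formula rather than any particular engine's implementation. All names are hypothetical.

```python
import math

# Figures taken from the workload card; units are assumptions (1 KB = 1024 B).
KEYS = 10_000_000
VALUE_BYTES = 1024                      # key bytes and per-entry overhead ignored
DISK_BUDGET_FACTOR = 2.0                # "disk budget 2x live"
BLOOM_BITS_PER_KEY = 10
MEMTABLE_BYTES = 64 * 1024**2
WRITE_RATE_BYTES_PER_S = 10 * 1024**2   # "~10 MB/s"


def bloom_false_positive_rate(bits_per_key: float) -> float:
    """Textbook estimate (1 - e^(-k/b))^k with k = round(b * ln 2) hash functions."""
    k = max(1, round(bits_per_key * math.log(2)))
    return (1.0 - math.exp(-k / bits_per_key)) ** k


live_bytes = KEYS * VALUE_BYTES
disk_budget_bytes = live_bytes * DISK_BUDGET_FACTOR
bloom_ram_bytes = KEYS * BLOOM_BITS_PER_KEY / 8
bloom_fpr = bloom_false_positive_rate(BLOOM_BITS_PER_KEY)
flush_interval_s = MEMTABLE_BYTES / WRITE_RATE_BYTES_PER_S

print(f"live data        : {live_bytes / 1e9:.2f} GB")
print(f"disk budget      : {disk_budget_bytes / 1e9:.2f} GB")
print(f"bloom filter RAM : {bloom_ram_bytes / 1e6:.1f} MB")
print(f"bloom FPR        : {bloom_fpr:.2%}")
print(f"memtable fills in: {flush_interval_s:.1f} s at the stated write rate")
```

Under these assumptions the bloom filters cost on the order of tens of megabytes of RAM for a false-positive rate under 1%, and the memtable fills every few seconds, which is why the flush-frequency side of the memtable knob shows up quickly at this write rate.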
Implementation
LSM design checklist
knob → workload axis → earlier scene
memtable size → recovery time vs flush frequency (scenes 2, 3)
compaction policy → write-amp vs read/space-amp (scene 8)
level multiplier → #levels vs per-compaction work (scene 7)
bloom bits/key → miss-cost vs RAM (scene 5)
block size → point-read I/O vs compression (scene 10)
compression → CPU vs disk bytes (scene 10)
block_cache_size → hot-working-set fits in RAM? (scene 10)
tombstone retention → safe deletion vs space (scene 9)
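The checklist above can be read as a lookup table plus a set of rules. The following is a minimal sketch of how a verifier might trace one knob back to its scene; it is not the page's actual verifier. The knob-to-scene mapping comes from the checklist, while the `Workload` flags, the `Finding` type, and the single rule for compaction policy are hypothetical simplifications (leveled favors read-amp and space-amp at the cost of write-amp; tiered favors write-amp).

```python
from dataclasses import dataclass

# Knob -> scene numbers, copied from the checklist.
KNOB_SCENES = {
    "memtable_size": (2, 3),
    "compaction_policy": (8,),
    "level_multiplier": (7,),
    "bloom_bits_per_key": (5,),
    "block_size": (10,),
    "compression": (10,),
    "block_cache_size": (10,),
    "tombstone_retention": (9,),
}


@dataclass(frozen=True)
class Workload:
    """Hypothetical summary of which corners of the amp triangle the SLO cares about."""
    read_amp_matters: bool
    space_amp_matters: bool
    write_amp_matters: bool


@dataclass(frozen=True)
class Finding:
    knob: str
    scenes: tuple
    ok: bool        # True = green (justified), False = red (contradicted)
    reason: str


def check_compaction_policy(policy: str, w: Workload) -> Finding:
    """Illustrative rule only: match the policy to the amp the workload cares about."""
    scenes = KNOB_SCENES["compaction_policy"]
    reads_or_space = w.read_amp_matters or w.space_amp_matters
    if policy == "leveled":
        ok = reads_or_space
        reason = "leveled bounds read/space-amp and pays for it in write-amp"
    elif policy == "tiered":
        ok = w.write_amp_matters and not reads_or_space
        reason = "tiered lowers write-amp and pays for it in read/space-amp"
    else:
        raise ValueError(f"unknown compaction policy: {policy!r}")
    return Finding("compaction_policy", scenes, ok, reason)


# The OLTP mixed workload: read-amp and space-amp matter, write-amp is tolerable.
oltp = Workload(read_amp_matters=True, space_amp_matters=True, write_amp_matters=False)
for policy in ("leveled", "tiered"):
    f = check_compaction_policy(policy, oltp)
    colour = "green" if f.ok else "red"
    print(f"{policy:8s} -> {colour:5s} (scene {f.scenes[0]}): {f.reason}")
```

A fuller verifier would carry one rule per knob; keeping each rule as a small function that returns a `Finding` makes it easy to show the scene reference next to the verdict.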
When LSM is the wrong answer
scene 1 is still in scope
# fat values, low key cardinality, all point reads
# → Bitcask wins (scene 1 contrast)
# transactional multi-key OLTP
# → B-tree DB (Postgres, MySQL); LSM compaction is a tax here
# columnar OLAP
# → Parquet + columnar engine; LSM cannot beat sort-merge-with-pushdown
# tiny dataset that fits in RAM
# → just use a hash table
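The four cases above amount to a short decision list. As a rough sketch, assuming the workload can be described by four boolean traits (the `Traits` fields and the order of the checks are hypothetical choices, not something the scenes prescribe), it might be encoded like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Traits:
    """Hypothetical workload traits, one per case in the list above."""
    fits_in_ram: bool = False
    columnar_olap: bool = False
    multi_key_transactions: bool = False
    fat_values_low_cardinality_point_reads: bool = False


def suggest_engine(t: Traits) -> str:
    """Return the alternative named in the list, or the LSM-tree if none applies."""
    if t.fits_in_ram:
        return "hash table"
    if t.columnar_olap:
        return "Parquet + columnar engine"
    if t.multi_key_transactions:
        return "B-tree DB"
    if t.fat_values_low_cardinality_point_reads:
        return "Bitcask"
    return "LSM-tree"


print(suggest_engine(Traits()))
print(suggest_engine(Traits(multi_key_transactions=True)))
```

The checks run cheapest-alternative first, so a dataset that fits in RAM short-circuits everything else; a real selection process would weigh these traits rather than treat them as strict yes/no gates.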