Build an LSM-tree storage engine (LevelDB / RocksDB style) (11 scenes)
Scene 08 · Every byte, written ten times
Leveled compaction pays 10–30× write amplification; tiered/universal trades that down to 3–9× at the cost of read-amp and space-amp.
Previously

Levels capped reads — at a price. We just chose a multiplier without counting the bytes. Time to count them.

Scene 08
Every byte, written ten times
Diagram
Two staircases under one shared ingest. LEFT: LEVELED (scene 7) — one sorted run per level, non-overlapping below L0; this scene names its cost. RIGHT: TIERED (a.k.a. 'universal compaction' — the alternative policy, where multiple sorted runs are allowed per level and merged less often). Each side shows three META METERS — the new word this scene is WRITE AMPLIFICATION (WA): bytes written to disk divided by bytes the user wrote. Sister meters: READ AMPLIFICATION (RA — files consulted per missing-key GET) and SPACE AMPLIFICATION (SA — disk bytes / live bytes). Shared write-rate on top; same 1 GB ingest drives both staircases.
shared ingest: 100 MB/s — bytes ingested: 0 Blevel multiplier 10× — same bytes drive both staircasesleveled (low SA · low RA · high WA)L0L1L2L3L4L5write amphidden during predictread amphidden during predictspace amphidden during predicttiered (low WA · high RA · high SA)L0L1L2L3L4write amphidden during predictread amphidden during predictspace amphidden during predictSame 1 GB ingest streams into both staircases. Watch the meters tick.
Same 1 GB ingest into both staircases. Leveled climbs to ~24× WA, tiered to ~6×. Read-amp and space-amp tell the inverse story.
Implementation
WA on a leveled LSM
every byte rewritten once per level transition
1# rough WA accounting (per byte ingested):
2WAL_write = 1
3flush_to_L0 = 1
4L0 -> L1 = 1 # rewritten with overlapping L1 files
5L1 -> L2 = 1
6L2 -> L3 = 1
7L3 -> L4 = 1
8L4 -> L5 = 1
9# total ≈ 7 in this simple model; real LSMs land at 10–30×
10# because a single L_i→L_{i+1} pulls in mul× overlap on the L_{i+1} side
WA on a tiered LSM
merges happen RARELY — multiple runs sit in each level
1# rough WA accounting:
2WAL_write = 1
3flush_to_L0_run = 1
4merge_when_K_runs = 1 # only when K runs accumulate
5# repeated maybe 3–4 times across the lifetime of a byte
6# total ≈ 3–9×
7# cost paid in read-amp (more files per get) and space-amp (more shadowed bytes)