Build a Bitcask-style KV store (9 scenes)
Scene 05 · Merge — GC without stopping writes
Merge scans only immutable files, writes a deduplicated copy, then atomically retargets keydir pointers. The active file is never touched.
Previously
RAM held one entry per live key. Disk held every record ever written. Merge is what makes the disk remember its dead — without ever asking the writer to slow down.
Scene 05
Merge — GC without stopping writes
Diagram
BEFORE (top) and AFTER (bottom) share one keydir. Pointers retarget atomically at the swap tick; the active file is identical in both panels.
Three immutable files at ~60% dead bytes — overwrites and tombstones marked dead in red. Below them, the active file is still accepting appends (the 'live writes during merge' counter ticks throughout). AFTER will hold a single merged data file with its hint sibling — only the latest record per key. Watch merge walk idle → scanning → emitting → swap → done; every keydir row flips at the swap tick, never half.
Implementation
Bitcask.should_merge
the trigger — operator-tunable, OR of two conditions
1def should_merge(stats, cfg, clock):2 if not within_window(clock, cfg.merge_window):3 return False4 frag = stats.dead_bytes / stats.total_bytes5 if frag >= cfg.frag_merge_trigger:6 return True # default 60%7 if stats.dead_bytes >= cfg.dead_bytes_merge_trigger:8 return True # default 512 MB9 return False
Bitcask.merge
scan immutables, emit deduped copy + hint, swap, delete
1def merge(files, keydir, active_file):2 # scope: immutable files only — active file excluded3 inputs = [f for f in files if f is not active_file]4 out = open_new_data_file()5 hint = open_new_hint_file()6 new_pointers = {}7 for f in inputs:8 for rec in f.scan():9 if live_record(rec, keydir):10 pos = out.append(rec)11 hint.append(rec.key, pos, rec.vsz)12 new_pointers[rec.key] = (out.id, pos, rec.vsz)13 fsync(out); fsync(hint)14 atomic_keydir_swap(keydir, new_pointers)15 for f in inputs: delete(f)
Bitcask.live_record
a record is live iff the keydir still points at exactly it
1def live_record(rec, keydir):2 entry = keydir.get(rec.key)3 if entry is None:4 return False # tombstoned or never existed5 # newer write would have moved the pointer elsewhere6 return (7 entry.file_id == rec.file_id8 and entry.value_pos == rec.value_pos9 )
Bitcask.atomic_keydir_swap
every key flips at one tick — never half
1def atomic_keydir_swap(keydir, new_pointers):2 with keydir.write_lock: # held only for the swap3 for key, ptr in new_pointers.items():4 current = keydir.get(key)5 # a concurrent put may have moved the key into the6 # active file while we were emitting — keep that.7 if current is None:8 continue9 if current.file_id == new_pointers_origin(key):10 keydir[key] = ptr11 # writers were never blocked from appending to active_file