Build an LSM-tree storage engine (LevelDB / RocksDB style) (11 scenes)
Scene 01 · Why Bitcask cannot give you ranges
Bitcask died on cardinality, the B-tree died on write rate. The brief LSM is built to satisfy: ordered reads, more keys than fit in RAM, no seek per write.
Scene 01
Why Bitcask cannot give you ranges
Diagram
Three panels under one shared workload of random PUTs. LEFT — Bitcask: a small 'keydir' RAM box (Bitcask's keydir is a hash table holding one row per live key — the previous curriculum) with a vertical bar that grows with key count, crossed by a red 'RAM budget' line; an OOM badge (Out-Of-Memory) fires when the bar pierces. MIDDLE — a B-tree on disk: a sorted tree where each random-key insert seeks to a random leaf; a 'seeks/sec' meter saturates red as the disk's seek ceiling is hit. RIGHT — an empty box with a '?': the slot the LSM (the design we're about to build) fills.
One workload of random-key writes drives three panels. Watch the first two panels die in turn — first on cardinality, then on write rate. The third panel is the empty slot we fill next.
Implementation
The brief — pseudocode
what the next scenes will satisfy
1# we want a single store that satisfies all of these:2store.put(key, value) # at memory speed3store.get(key) # bounded latency4store.scan(from_key, to_key) # ordered range56# subject to:7# key_count > RAM_BUDGET / row_overhead8# write_rate > disk_random_seeks_per_sec910# Bitcask fails the first constraint (RAM ceiling).11# B-tree fails the second (seek per random write).