Build an LSM-tree storage engine (LevelDB / RocksDB style) (11 scenes)
Scene 01 · Why Bitcask cannot give you ranges
Bitcask died on cardinality, the B-tree died on write rate. The brief LSM is built to satisfy: ordered reads, more keys than fit in RAM, no seek per write.
Scene 01
Why Bitcask cannot give you ranges
Diagram
Three panels under one shared workload of random PUTs. LEFT — Bitcask: a small 'keydir' RAM box (Bitcask's keydir is a hash table holding one row per live key — the previous curriculum) with a vertical bar that grows with key count, crossed by a red 'RAM budget' line; an OOM badge (Out-Of-Memory) fires when the bar pierces. MIDDLE — a B-tree on disk: a sorted tree where each random-key insert seeks to a random leaf; a 'seeks/sec' meter saturates red as the disk's seek ceiling is hit. RIGHT — an empty box with a '?': the slot the LSM (the design we're about to build) fills.
WORKLOADput random keys — 50/tick · randomness 100%ticks: 0Bitcask0 keysRAM · keydirRAM budgetB-tree on disk0/srootseekseeks/sec0 / 200?next scene?fast point reads, hot keys in cacheevery key is a row in RAMordered reads, range scansone seek per random-key insertgood at: ?next scenes fill this inBoth work at small scale. The brief is to keep both alive when the workload grows.
One workload of random-key writes drives three panels. Watch the first two panels die in turn — first on cardinality, then on write rate. The third panel is the empty slot we fill next.
Implementation
The brief — pseudocode
what the next scenes will satisfy
1# we want a single store that satisfies all of these:
2store.put(key, value) # at memory speed
3store.get(key) # bounded latency
4store.scan(from_key, to_key) # ordered range
5
6# subject to:
7# key_count > RAM_BUDGET / row_overhead
8# write_rate > disk_random_seeks_per_sec
9
10# Bitcask fails the first constraint (RAM ceiling).
11# B-tree fails the second (seek per random write).