Build an LSM-tree storage engine (LevelDB / RocksDB style) (11 scenes)
Scene 10 · Blocks, cache, and the CPU/disk dial
SSTables are blocks; the block cache holds hot decompressed blocks; compression trades CPU for disk bytes with no effect on correctness.
Previously

Reads, writes, deletes, and bounded levels are all sorted out. But every read so far has touched a whole SSTable. Real systems read BLOCKS, cache them, and squeeze them with compression.

Scene 10
Blocks, cache, and the CPU/disk dial
Diagram
The selected SSTable (scene 3) on the shelf zooms into a row of BLOCKS — fixed-size chunks (default 4 KB) that are the unit of disk I/O for an LSM, NOT individual keys. A tiny block-index above the strip lists the first key of each block → its byte offset, so the read path binary-searches the index and reads ONE block per get. In the RAM column, the new term lands: a BLOCK CACHE — an in-RAM map holding recently-read DECOMPRESSED blocks, so a hot working set never touches disk again after the first read. A 'recent hits/misses' strip shows cache effectiveness; a hit-rate readout updates live. Compression knob (none / lz4 / zstd) trades CPU for disk bytes per block.
RAMmemtablej = 9BLOCK CACHE32 MBhit 0%DISKSSTable shelfL1L1/000005.sst[a … z]immutableBLOCKS · 4 KBb0b1b2b3b4b5OPERATIONidle — no operation in flightAn SSTable is a strip of 4 KB blocks. Block index says first-key-of-each-block → byte offset.
Watch one GET land on one block (not the whole file), and a second GET in the same block hit the cache. Hot reads stay in RAM.
Implementation
Block-aware read
block index → 1 block → cache
1def read_value(sst, key):
2 block_offset = sst.block_index.lookup(key) # binary search in RAM
3 if (sst.id, block_offset) in block_cache:
4 block = block_cache.get(sst.id, block_offset) # RAM hit
5 else:
6 raw = disk.pread(sst.id, block_offset, block_size)
7 block = decompress(raw) # CPU work
8 block_cache.put(sst.id, block_offset, block)
9 return scan_block(block, key)
Compression: pure CPU↔disk
correctness unchanged; bytes and CPU move
1# compression presets:
2# none → 1.0× disk, 1.0× CPU (fastest reads, biggest disk)
3# lz4 → 0.55× disk, 1.4× CPU (LSM default)
4# zstd → 0.40× disk, 2.6× CPU (best ratio, slowest decode)
5# the choice is a workload knob, not a correctness knob.