Build a Bitcask-style KV store (9 scenes)
Scene 01 · Log on disk, hash in RAM
A Bitcask is exactly two structures: an append-only log of records on disk + a single in-memory hash (the keydir). Every other property follows from this split.
Scene 01
Log on disk, hash in RAM
Diagram
Left: data-file directory · active at bottom. Right: keydir hash. Bottom: latest op.
DATA DIRECTORY001.dataclosed · immutable002.dataclosed · immutable003.dataactive · acceptingKEYDIR (in RAM)keyfile_idvalue_posvalue_szOPERATIONcell: crc · tstamp · ksz · vsz · key=valueidle — no operation in flightEmpty directory · empty keydir.
An empty Bitcask: no data files written, no keydir rows. Watch what a PUT actually does — one record lands on disk, one row appears in the keydir. Two structures, growing in lockstep.
Implementation
Record on disk
fixed-size header + two variable blobs; appended at tail
1# one entry in <file_id>.data, written end-to-end:
2record = struct(
3 crc: u32, # checksum of the bytes that follow
4 tstamp: u32, # logical write time
5 ksz: u32, # key size in bytes
6 vsz: u32, # value size in bytes
7 key: bytes[ksz],
8 value: bytes[vsz],
9)
10# closed files are immutable forever; only the active file appends.
Bitcask.put
append a record at the active tail, then retarget the keydir slot
1def put(key, value):
2 tstamp = now()
3 bytes = encode(crc, tstamp, len(key), len(value), key, value)
4 offset = active_file.size # current end-of-file
5 active_file.append(bytes) # sequential write
6 fsync_if_policy(active_file) # sync_strategy
7 keydir[key] = Entry(
8 file_id = active_file.id,
9 value_pos = offset + header_sz + len(key),
10 value_sz = len(value),
11 tstamp = tstamp,
12 ) # one slot per LIVE key
13 return ok
Bitcask.get
one hash lookup; one pread at the recorded offset
1def get(key):
2 entry = keydir.lookup(key)
3 if entry is None: return not_found
4 fd = open_or_reuse(entry.file_id)
5 bytes = pread(fd, entry.value_pos, entry.value_sz)
6 return bytes # at most one disk seek