Build an S3-style distributed object store (12 scenes)
Scene 03 · Folders are a lie — the flat keyspace and its index
Keys are flat strings; the index maps key→location and shards by range, with a per-prefix throughput ceiling.
Previously
Surviving disk death means the bytes live somewhere other than one disk — so something has to record where; that something is the index, the first plane we crack open.
Scene 03
Folders are a lie — the flat keyspace and its index
Diagram
Left: object keys shown as flat opaque strings — every slash is just a byte in one string, not a folder. Right: the index, a sorted map from key to a (dim) storage location, cut into key-range shards. Each shard carries a request-rate meter with a ceiling line; the console-folder toggle shows the same keys as a fake '/'-split tree.
Each object is addressed by a key — and a key is one flat string; the slashes in "logs/2026/06/03/app.log" are just bytes, not folders. The map on the right sorts those strings and shards them by range. Try the console-folder toggle: it splits the keys on '/' into a tree, but there is no tree underneath.
Implementation
Index.resolve
a GET resolves a flat key string to its storage location
1def resolve(key):2 # key is ONE opaque string; '/' is just a byte.3 # No tree walk — binary search the sorted map.4 part = partitionForKey(key)5 entry = part.lookup(key) # key -> location + metadata6 return entry.location78def partitionForKey(key):9 # partitions are contiguous lexicographic key ranges,10 # so keys sharing a prefix land on ONE partition.11 return bisect(partitionBoundaries, key)
Partition.serveRequest
each key-range partition meters its own request rate
1# limits are PER partition, not per bucket2PUT_CEILING = 3500 # req/s3GET_CEILING = 5500 # req/s45def serveRequest(req):6 rate = self.observedRate(req.kind)7 ceiling = (PUT_CEILING if req.kind == PUT8 else GET_CEILING)9 if rate > ceiling:10 return Http503('SlowDown')11 return serve(req)
Index.maybeSplit
a saturated partition is gradually split into two
1# background loop on the index plane2def maybeSplit(part):3 if part.sustainedRate > part.ceiling:4 mid = part.medianKey()5 # split the range at mid -> two partitions,6 # each owning half the keys. Takes minutes,7 # so a sudden hot prefix throttles first.8 lo, hi = part.splitAt(mid)9 partitionBoundaries.insert(mid)
Not sure what to ask? Tap a question — the staff engineer answers in the chat panel.