Build an S3-style distributed object store (12 scenes)
Scene 07 · The payoff: durability is repair winning a race
Eleven nines is a rate equation: scrub finds rot, repair rebuilds lost fragments faster than disks destroy them.
Previously

Now that the index plane tracks every fragment and the data plane holds them, a background process can watch for losses and fix them — which is exactly the answer to the mystery we left open at the start: how do eleven nines survive disks failing every week?

Scene 07
The payoff: durability is repair winning a race
Diagram
One object's 17 data + 3 parity fragments live on 20 distinct disks inside a fleet of millions. A scrub worker (amber dashed) sweeps fragments at rest, recomputing checksums to catch silent bit rot; a repair worker (green glow) reads the survivors, rebuilds any lost fragment, and writes it to a fresh disk — restoring full redundancy. The REDUNDANCY DEFICIT gauge counts how many of the 20 are currently missing or bad.
DISK FLEET · scrub + repair2,400,000 disksMTTR ~9 hscrub ONDURABILITY11 ninesshowing 180 of 2,400,000 disksdata fragparity fragdeadrottedREDUNDANCY DEFICITbelow full k+m0 / 4 fragments below redundancyOne object, 17 data + 3 parity fragments, scattered across the fleet — watched by a scrub worker and a repair worker.
Disks die on the fleet cadence — every few minutes, somewhere, one fails and takes a fragment with it. Watch the two background workers keep the object whole: the scrub worker sweeps fragments at rest checking they haven't silently gone bad, and the repair worker reads the survivors and rebuilds each lost fragment onto a fresh disk. The deficit gauge keeps drifting back to zero.
Implementation
Scrub.sweep
the background sweep that catches silent rot at rest
1# runs continuously on the background-ops fleet
2def sweep():
3 for frag in fleet.fragments_at_rest():
4 stored = read(frag.location)
5 if checksum(stored) != frag.expected_checksum:
6 # silent bit rot: a live disk, a bad fragment
7 mark_lost(frag) # hand to Repair
8 advance_cursor()
Repair.rebuild
reads k survivors, reconstructs the lost fragment
1def rebuild(obj, lost_frag):
2 survivors = read_any(obj.fragments, count = k)
3 rebuilt = reed_solomon_decode(survivors)
4 fresh = pick_disk(distinct_failure_domain)
5 write(fresh, rebuilt) # full k+m restored
6 index.point(obj, lost_frag, fresh)
7 # MTTR = wall-clock this rebuild took
Durability.estimate
why repair time, not fragment count, sets the nines
1def annual_loss_probability(failure_rate, mttr):
2 # one fragment down opens a window of length mttr
3 window = failure_rate * mttr
4 # need MORE THAN m losses inside one window to lose obj
5 p_lose = window ** (m + 1)
6 # m (code) sets the exponent; mttr scales the base
7 return p_lose
Not sure what to ask? Tap a question — the staff engineer answers in the chat panel.