Build an S3-style distributed object store (12 scenes)
Scene 08 · Why strong consistency took S3 fourteen years
Eventual-consistency anomalies, the Dec 2020 flip, and the witness read-barrier — strong within a region only.
Previously
The bytes are durable, but the index that points at them was, for fourteen years, sometimes lying to readers — and making it stop lying was S3's hardest problem.
Diagram
A client PUTs and GETs through a CACHE LAYER of several nodes sitting in front of the INDEX + PERSISTENCE TIER (the source of truth: key → location). Under eventual consistency a write updates one cache node while a read reads another — returning the OLD pointer (a stale read). In strong mode a WITNESS read barrier records per-object write order and reloads any stale cache node before answering. A cross-region replica lags regardless.
Eventual consistency: a PUT updates the index through one cache node, but a GET/LIST served from a DIFFERENT cache node still holds the OLD pointer. Watch three real anomalies replay in turn — each one a reader being told something that isn't true any more.
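The stale read described above can be reproduced with a toy model. This is a minimal sketch, not S3's implementation: `CacheNode`, `EventualIndex`, and the `via` parameter (which cache node a request happens to pass through) are hypothetical names introduced for illustration, and the whole thing runs in one process with no real network.

```python
class CacheNode:
    """One node of the cache layer: a possibly outdated key -> location map."""

    def __init__(self):
        self.entries = {}


class EventualIndex:
    """Toy index: a persistence tier fronted by several independent cache nodes."""

    def __init__(self, node_count):
        self.persistence = {}  # source of truth: key -> location
        self.nodes = [CacheNode() for _ in range(node_count)]

    def put(self, key, location, via):
        # The write reaches persistence, but only the cache node it
        # passed through learns the new pointer.
        self.persistence[key] = location
        self.nodes[via].entries[key] = location

    def get(self, key, via):
        # Eventual mode: answer from whatever this node holds.
        # Only a cache miss falls through to the persistence tier.
        node = self.nodes[via]
        if key not in node.entries and key in self.persistence:
            node.entries[key] = self.persistence[key]
        return node.entries.get(key)


index = EventualIndex(node_count=2)
index.put("photo.jpg", "loc-v1", via=0)
index.get("photo.jpg", via=1)            # node 1 now caches loc-v1
index.put("photo.jpg", "loc-v2", via=0)  # overwrite goes through node 0 only

stale = index.get("photo.jpg", via=1)    # node 1 still holds the old pointer
fresh = index.get("photo.jpg", via=0)    # node 0 saw the write
print(stale, fresh)
```

In this sketch nothing ever tells node 1 that its entry is out of date, so the reader routed there keeps getting the old pointer even though the persistence tier already holds the new one.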
Implementation
IndexCache.read(key)
the read path — eventual returns whatever the node holds
def read(key):
    node = cacheRing.nodeFor(key)        # may differ from write's node
    entry = node.lookup(key)
    if not cfg.strong:
        return entry                     # may be stale: an older pointer, or a miss
    # strong: ask the witness whether this entry is current
    if witness.isStale(key, entry):
        entry = persistence.load(key)    # reload source of truth
        node.put(key, entry)             # repair the cache node
    return entry                         # read-your-writes, within this region
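The read path above leaves `witness`, `persistence`, and the cache ring undefined. One way to make it runnable is sketched below, under loud assumptions: the witness is modelled as a per-key version counter, each cache entry carries the version it was written at, and the caller picks the cache node explicitly through a hypothetical `via` argument rather than a ring lookup. `Entry`, `Witness`, and `StrongIndex` are illustrative names, and the single-threaded code ignores the ordering and failure handling a real witness would need.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    location: str
    version: int  # assumed: per-key write sequence number


class Witness:
    """Assumed design: remembers the latest write version per key."""

    def __init__(self) -> None:
        self.latest: Dict[str, int] = {}

    def record_write(self, key: str) -> int:
        version = self.latest.get(key, 0) + 1
        self.latest[key] = version
        return version

    def is_stale(self, key: str, entry: Optional[Entry]) -> bool:
        # A miss counts as version 0, so it is stale once any write exists.
        seen = entry.version if entry is not None else 0
        return seen < self.latest.get(key, 0)


class StrongIndex:
    def __init__(self, node_count: int) -> None:
        self.persistence: Dict[str, Entry] = {}  # source of truth
        self.nodes: List[Dict[str, Entry]] = [{} for _ in range(node_count)]
        self.witness = Witness()

    def put(self, key: str, location: str, via: int) -> None:
        entry = Entry(location, self.witness.record_write(key))
        self.persistence[key] = entry
        self.nodes[via][key] = entry  # only this cache node sees the write

    def read(self, key: str, via: int, strong: bool = True) -> Optional[Entry]:
        node = self.nodes[via]
        entry = node.get(key)
        if not strong:
            return entry  # eventual: may be stale or a miss
        if self.witness.is_stale(key, entry):
            entry = self.persistence.get(key)  # reload source of truth
            if entry is not None:
                node[key] = entry  # repair the cache node
        return entry


index = StrongIndex(node_count=2)
index.put("photo.jpg", "loc-v1", via=0)
index.read("photo.jpg", via=1)            # node 1 loads and caches loc-v1
index.put("photo.jpg", "loc-v2", via=0)   # node 1 is now behind

eventual = index.read("photo.jpg", via=1, strong=False)
strong = index.read("photo.jpg", via=1)   # barrier reloads and repairs node 1
repaired = index.read("photo.jpg", via=1, strong=False)
print(eventual.location, strong.location, repaired.location)
```

The strong read costs one witness check on every request but touches the persistence tier only when the cache node is actually behind, which is why the repaired node answers correctly afterwards even in eventual mode. Nothing here extends across regions: a replica in another region would have its own cache and no access to this witness.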