Build a wide-column store (Cassandra / DynamoDB family) (13 scenes)
Scene 11 · Heal on read, heal on a schedule
When a read sees disagreement, push the winner to stale replicas; on a cron, compare every key.
Previously

Hints expire and replicas drift cold; the cluster needs a way to find and fix divergence on its own — one mechanism for hot data on every read, one for cold data on a schedule.

Scene 11
Heal on read, heal on a schedule
Diagram
The ring at RF=3, with one featured key replicated to three successive nodes. **read repair**: when a coordinator sees replicas disagree on a read, it picks the LWW winner and asynchronously updates the stale replica — but only for the keys actually read. **anti-entropy**: a scheduled background process diffs replicas using Merkle trees (the data structure used to diff two replicas in O(log n) — each subtree carries a hash of its leaves, so mismatched roots descend straight to the differing rows) and bulk-heals every divergent key, regardless of read activity.
mode: with-merkleS0S1S2S3S4S5S6S7user-42R1R2R3READ REPAIR (per-key inline)n3 merkleRn5 merkleRrepairdiffering leaf #2 → ship only that subtreeRead repair: a single key was read, replicas disagreed, the coordinator pushed the LWW winner to the stale re…
RF=3 — three replicas of user-42, but they don't agree
read repair → coordinator pushes LWW winner to stale replica (only this key)
Merkle tree: hash of subtree → spot mismatches in O(log n)
Watch the read on user-42. The three replicas reply with different values; the coordinator picks the latest by timestamp and pushes the winner back to the stale replica. The Merkle inset on the right shows where the two replicas disagreed — one glowing leaf.
Implementation
Coordinator.read_repair
heal the just-read key, then return to the client
1def read_repair(key, replicas):
2 responses = [r.read(key) for r in replicas]
3 winner = max(responses, key=lambda x: x.ts)
4 # only the keys actually read get healed
5 for resp in responses:
6 if resp.ts < winner.ts:
7 send_async(resp.replica,
8 Write(key, winner.value, winner.ts))
9 return winner.value
AntiEntropy.merkle_repair
diff two replicas' trees; bulk-heal every mismatched leaf
1def merkle_repair(a, b, range):
2 tree_a = a.build_merkle(range)
3 tree_b = b.build_merkle(range)
4 diffs = diff_subtree(tree_a.root, tree_b.root)
5 # diffs is the set of mismatched leaf ranges
6 for leaf_range in diffs:
7 rows_a = a.fetch(leaf_range)
8 rows_b = b.fetch(leaf_range)
9 merged = lww_merge(rows_a, rows_b)
10 a.bulk_write(merged)
11 b.bulk_write(merged)
AntiEntropy.diff_subtree
recursive descent — skip whole subtrees on hash match
1def diff_subtree(node_a, node_b):
2 if node_a.hash == node_b.hash:
3 return [] # entire subtree matches; skip
4 if node_a.is_leaf:
5 return [node_a.range]
6 out = []
7 out += diff_subtree(node_a.left, node_b.left)
8 out += diff_subtree(node_a.right, node_b.right)
9 return out