#52 Cache Invalidation Across a Fleet
Write-through vs write-behind. Two generals.

A cache only earns its keep if readers trust it. The instant a write lands in the database, every cached copy of the affected key — across hundreds of cache hosts, in several regions, plus the CDN at the edge — is wrong. Cache invalidation across a fleet is the problem of propagating "this key changed" to all of those copies, quickly, reliably, and provably enough that users do not see stale data.

The hard part is not the happy path. It is that you can never be certain an invalidation was delivered (the Two Generals problem), so a system that assumes "I sent the delete, therefore the cache is correct" is wrong by construction. Production systems instead make invalidation idempotent, replayable from a durable log, version-guarded, and continuously measured, and they accept that "correct" means "inconsistent for less than X milliseconds, less than one time in ten billion," not "never stale." This canonical design models the look-aside fleet that Meta (memcache + mcsqueal + leases + Polaris), Netflix (EVCache), and Uber (CacheFront + Flux) actually run.
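
To make "idempotent, replayable, version-guarded" concrete, here is a minimal Python sketch of a single cache host applying invalidations. It is an illustration under stated assumptions, not the implementation used by any of the systems named above: `Invalidation`, `CacheNode`, and `replay` are hypothetical names, the durable log is modeled as an in-memory list, and every write is assumed to carry a per-key version that only increases (for example, a database commit sequence number).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Invalidation:
    """One 'this key changed' event. Assumption: version grows with every write."""
    key: str
    version: int


@dataclass
class CacheNode:
    """A single cache host (hypothetical model, not a real client library)."""
    # key -> (version, value) for entries currently cached
    entries: Dict[str, Tuple[int, str]] = field(default_factory=dict)
    # key -> highest invalidated version seen; blocks fills older than it
    floor: Dict[str, int] = field(default_factory=dict)

    def apply(self, inv: Invalidation) -> None:
        """Idempotent: duplicates and out-of-order older events are no-ops."""
        if inv.version <= self.floor.get(inv.key, -1):
            return
        self.floor[inv.key] = inv.version
        cached = self.entries.get(inv.key)
        if cached is not None and cached[0] < inv.version:
            del self.entries[inv.key]  # cached copy predates the write

    def fill(self, key: str, version: int, value: str) -> bool:
        """Version guard: reject a fill older than an invalidation already seen."""
        if version < self.floor.get(key, -1):
            return False
        cached = self.entries.get(key)
        if cached is not None and cached[0] >= version:
            return False  # never overwrite newer data with older data
        self.entries[key] = (version, value)
        return True

    def get(self, key: str) -> Optional[str]:
        cached = self.entries.get(key)
        return cached[1] if cached is not None else None


def replay(node: CacheNode, log: List[Invalidation], from_offset: int = 0) -> int:
    """Re-apply the log from an offset; safe to repeat because apply() is idempotent."""
    for inv in log[from_offset:]:
        node.apply(inv)
    return len(log)  # next offset to resume from
```

Because `apply` ignores anything at or below the recorded floor, a sender that cannot tell whether a delete arrived can simply send it again, or the host can replay the whole log after a restart, without changing the outcome. The floor also closes the classic race in which a slow reader tries to fill the cache with a value it read before the write: `fill` refuses any version older than the newest invalidation seen for that key.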

Reading: Scaling Memcache at Facebook (NSDI 2013) · TAO: Facebook's Distributed Data Store for the Social Graph (ATC 2013) · Cache Made Consistent / Polaris (Meta, 2022) · Netflix EVCache global replication · Uber CacheFront
write-through
write-behind
pub/sub invalidation
look-aside leases
binlog-tailed invalidation
cross-region staleness
consistency monitoring