MrBeast is live and 5 million phones are open at once. A goal is scored in the World Cup final and 30 million phones buzz in the same instant. Somebody types "BRO" and a hundred thousand people see it in the same second; by design, the other four million nine hundred thousand don't. This is the dominant shape of "live comments + score updates": a write-explosive, read-explosive workload where O(n) publishers create O(n²) potential delivery work, and where the right answer to "deliver every comment to every viewer" is no. Sampling is the design, not a degradation.
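To make the O(n²) claim concrete, here is a back-of-the-envelope sketch. The publish rate (1,000 commenters per million viewers per second) and the per-viewer display budget (10 comments per second) are illustrative assumptions, not figures from this document; only the 5M room size comes from the text above.

```python
def full_fanout_deliveries(viewers: int, publishers_per_million: int) -> int:
    """Deliveries per second if every comment is sent to every viewer.

    publishers_per_million is a hypothetical rate: how many viewers out of
    each million post one comment in a given second.
    """
    comments_per_sec = viewers * publishers_per_million // 1_000_000
    return comments_per_sec * viewers  # grows with viewers squared


def sampled_deliveries(viewers: int, shown_per_sec: int) -> int:
    """Deliveries per second if each viewer receives a fixed sampled budget."""
    return shown_per_sec * viewers  # grows linearly with viewers


if __name__ == "__main__":
    viewers = 5_000_000
    full = full_fanout_deliveries(viewers, publishers_per_million=1_000)
    sampled = sampled_deliveries(viewers, shown_per_sec=10)
    print(f"full fanout: {full:,} deliveries/s")
    print(f"sampled:     {sampled:,} deliveries/s")
    print(f"reduction:   {full // sampled}x")
```

Under these assumed rates, full fanout needs 25 billion deliveries per second while the sampled path needs 50 million, a 500x difference that widens as the room grows.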
This canonical is about the actual production fabric: where the long-lived socket terminates, where the sampler lives, how a per-room fanout actor survives a hot room of 5M concurrent viewers without a single shard saturating, how the publish leg keeps a moderation outage from breaking the user's send button, and how cross-region active-active maintains a single writer per room so the replay buffer doesn't fork on failover. It is also explicitly about the things that DON'T work at this scale: topic-per-room Kafka above 50K rooms, naïve EventSource 3-second reconnect, exact XTRIM MAXLEN, "dual-write for safety" during failover, and the SSE-over-HTTP/2 HOL-blocking trap.
Two design pressures distinguish this from every other real-time problem in the catalog. (1) It is bidirectional: every viewer is both a publisher AND a subscriber on the same socket, unlike the asymmetric "broadcast-only" shape of sports scores. (2) It is approximate by design: in a mega-room you SHIP a sampler that drops 99.99% of comments under a deliberate, ranked, fair-by-window policy, and the SLO for "did the user see this specific comment" is intentionally not 100%. Score events, conversely, travel on a privileged lane that bypasses both the sampler and the moderation pipeline; they're low-volume, authoritative, and never sacrificed.
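The two-lane idea can be sketched in a few lines. This is a minimal illustration, not the production sampler: the `Event` type, the `rank` field, and the reading of "fair-by-window" as a fixed per-window comment budget are all assumptions made for the example.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Event:
    kind: str          # "comment" or "score" (hypothetical event kinds)
    ts: float          # seconds since room start
    payload: str
    rank: float = 0.0  # hypothetical ranking signal; higher is better


def sample_window(events: Iterable[Event], budget: int) -> List[Event]:
    """Keep every score event plus the top `budget` comments by rank."""
    events = list(events)
    # Privileged lane: score events never pass through the sampler.
    scores = [e for e in events if e.kind == "score"]
    comments = [e for e in events if e.kind == "comment"]
    kept = heapq.nlargest(budget, comments, key=lambda e: e.rank)
    # Restore time order for delivery.
    return sorted(scores + kept, key=lambda e: e.ts)


def sample_stream(events: Iterable[Event], window_s: float, budget: int) -> List[Event]:
    """Bucket events into fixed windows and apply the budget per window,
    so a burst in one window cannot consume another window's slots."""
    windows: Dict[int, List[Event]] = defaultdict(list)
    for e in events:
        windows[int(e.ts // window_s)].append(e)
    out: List[Event] = []
    for w in sorted(windows):
        out.extend(sample_window(windows[w], budget))
    return out
```

With a budget of 2 per one-second window and 1,000 comments arriving in that second, 998 comments are dropped deliberately while any score event in the same window is always delivered. A real sampler would run incrementally per room rather than over a materialized list; the batch form here only shows the policy.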