Build a video streaming platform like YouTube or Netflix. Users press play and expect a 4 K stream to start within two seconds; creators upload 50 GB mezzanines and expect the full ABR ladder published within minutes. Almost every byte the system serves is read; almost every byte it stores was produced by transcoding. The architectural pressure is not QPS — it is egress (you are 15% of the US internet at peak), fan-out (one source mezzanine becomes 250 K encode jobs across rungs × codecs × shots), and the popularity Pareto (the top 1% of catalogue serves 50% of bytes; the bottom 50% serves <0.1%).
The reference design below is what an SRE staff team at Netflix-class scale would actually run a five-year incident retro against — not the 45-minute whiteboard sketch. Every node has load-bearing internals; every edge declares wire semantics; every chaos card is grounded in a cited real-world incident. Focus is on the four pillars: ABR, CDN, transcoding, and hot/cold tiering. Comments / search / recommendations / ads are explicitly out of scope.