Scenes
250 scenes · 20 curricula · build the primitive

Build the primitive. Then nothing scares you.

Animated scenes that execute the canonical systems behind every interview — show, manipulate, predict. The day you finish your toy Kafka, every product design gets easier.

Build a Bitcask-style KV store
9 scenes

The simplest possible KV store that still works: an append-only log on disk + an in-memory hash index. Build it from first principles and feel which trade-offs every later store inherits.
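
As a rough illustration of the "append-only log + in-memory hash index" idea, here is a minimal sketch. The class name `TinyBitcask`, the record layout, and the use of an in-memory buffer in place of a real file are all assumptions made for this example, not the curriculum's actual code.

```python
import io
import struct


class TinyBitcask:
    """Toy append-only log with an in-memory hash index (hypothetical)."""

    HEADER = struct.Struct(">II")  # key length, value length (assumed layout)

    def __init__(self, log=None):
        # An in-memory buffer stands in for the on-disk log file.
        self.log = log if log is not None else io.BytesIO()
        self.index = {}  # key -> (offset of value bytes, value length)

    def put(self, key: bytes, value: bytes) -> None:
        # Writes only ever append; nothing is updated in place.
        self.log.seek(0, io.SEEK_END)
        start = self.log.tell()
        self.log.write(self.HEADER.pack(len(key), len(value)) + key + value)
        # Point the index at the newest copy; older copies become garbage.
        self.index[key] = (start + self.HEADER.size + len(key), len(value))

    def get(self, key: bytes):
        entry = self.index.get(key)
        if entry is None:
            return None
        offset, length = entry
        self.log.seek(offset)  # one seek, one read
        return self.log.read(length)
```

Overwriting a key leaves the old record in the log, which is why a store shaped like this eventually needs compaction, and why every key has to fit in RAM.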

~63 min
Build an LSM-tree storage engine (LevelDB / RocksDB style)
11 scenes

The simplest possible storage engine that gives you BOTH ordered reads AND more keys than fit in RAM, by accepting a deal: write to RAM at memory speed, log to disk for safety, then merge sorted files in the background forever.
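
The deal described above can be sketched in a few lines. This is a toy under stated assumptions: `TinyLSM` and its method names are hypothetical, the write-ahead log is omitted, sorted runs live in Python lists rather than files, and compaction merges everything into one run instead of working level by level.

```python
import bisect
import heapq


class TinyLSM:
    """Toy LSM: a dict memtable plus sorted immutable runs (hypothetical)."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.runs = []  # newest first; each run is a sorted list of (key, value)
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Freeze the memtable into a sorted run (an SSTable stand-in).
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def compact(self):
        # Tag each entry with its run's age so the newest copy sorts first.
        tagged = [[(k, age, v) for k, v in run] for age, run in enumerate(self.runs)]
        merged = []
        for k, _age, v in heapq.merge(*tagged):
            if not merged or merged[-1][0] != k:
                merged.append((k, v))
        self.runs = [merged] if merged else []
```

Reads get slower as runs pile up, which is the pressure that background merging relieves.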

~77 min
Build a B-tree storage engine (SQLite-style)
11 scenes

What actually happens when you run INSERT INTO users(...). One file of fixed-size pages, organized as B-trees, with a write-ahead log that turns commits into appends. Build it from a SQL writer's perspective and feel why every knob exists.

~77 min
Build a wide-column store (Cassandra / DynamoDB family)
13 scenes

One server is not enough — disk fills, throughput maxes, the box dies. Build a multi-node store from first principles: hash sharding, the consistent-hash ring, vnodes, replication, eventual consistency, tunable W+R quorum, hinted handoff, read repair. Every modern Dynamo-style store is a point in this design space.
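
A minimal sketch of two of the ideas named above, the consistent-hash ring with vnodes and the W+R quorum rule. The names `Ring`, `replicas`, and `quorums_overlap` are hypothetical, and SHA-256 is simply an assumed choice of hash.

```python
import bisect
import hashlib


def _hash(s: str) -> int:
    # Any stable hash works; SHA-256 is an arbitrary choice for this sketch.
    return int.from_bytes(hashlib.sha256(s.encode()).digest()[:8], "big")


class Ring:
    """Consistent-hash ring where each node owns several virtual points."""

    def __init__(self, nodes, vnodes=8):
        self.points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self.hashes = [h for h, _ in self.points]

    def replicas(self, key: str, n: int = 3):
        """Walk clockwise from the key, collecting n distinct nodes."""
        start = bisect.bisect_right(self.hashes, _hash(key))
        owners = []
        for step in range(len(self.points)):
            node = self.points[(start + step) % len(self.points)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == n:
                    break
        return owners


def quorums_overlap(n: int, w: int, r: int) -> bool:
    # With W + R > N, every read quorum shares a replica with every write quorum.
    return w + r > n
```

Adding a node only inserts new points on the ring, so a key's first owner either stays put or moves to the new node; nothing reshuffles between existing nodes.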

~91 min
Build a Prometheus-style time-series database
12 scenes

The simplest database that can absorb a 1M-points-per-second firehose and still answer `sum(rate(http_requests_total{status="500"}[5m]))` in milliseconds — built bit by bit, literally.

~84 min
Build a graph database (Neo4j / Dgraph-style)
16 scenes

When the workload is 'friends of friends', a relational join melts. Build a store where edges are first-class: index-free adjacency, traversals that follow pointers instead of joining tables, and a query language (Cypher / GraphQL+-) that thinks in patterns. Feel why graph storage shines for traversal-heavy work and stumbles on full-graph aggregates.

~112 min
Build a vector database (Pinecone / Weaviate / pgvector style)
15 scenes

Approximate nearest-neighbor over a billion 1536-dim vectors in 10 ms. Build the index from scratch (HNSW, IVF, PQ), pay the recall-vs-latency tax explicitly, support filtered + hybrid search, and feel why every LLM stack in 2026 has a vector store next to its KV store.

~105 min
Build an S3-style distributed object store
12 scenes

Eleven nines of durability over disks that fail weekly. Build the object store from first principles — flat keyspace, immutable objects, erasure coding instead of replication, eventual consistency turned strong, multipart upload, lifecycle and tiering — and feel why every modern data lake sits on top of something shaped exactly like this.

~84 min
Build Kafka
13 scenes

A partitioned, replicated, append-only log. The log is the database — internalize that, and a dozen product designs get easier.

~91 min
Build a CDC pipeline (Debezium + outbox)
12 scenes

Your service writes to its DB and publishes to Kafka — and any crash between those two writes is permanent inconsistency. Build a Change Data Capture pipeline (modeled on Debezium + the outbox pattern) that closes the gap by making the database itself the event source.

~84 min
Build a distributed logging stack (ELK / Loki)
12 scenes

Ship lines off N hosts, choose what to index, age data through tiers, retain or delete on schedule, and survive a chatty service — built one decision at a time.

~84 min
Build Redis
10 scenes

An in-memory data-structure server: one thread, rich types, optional persistence, async replication. Internalize the cost of single-threaded simplicity and a dozen caching/HA decisions get easier.

~70 min
Build a CDN
13 scenes

A globally-distributed reverse proxy whose only job is to (a) terminate the user's TCP/TLS milliseconds away and (b) serve a cached origin response so origin never sees the request. Internalize edge caching, anycast, TTL, revalidation, SWR, purge, the Vary footgun, origin shield, bypass, and hit ratio — and the dozen ways to misconfigure each.

~91 min
Build a distributed search engine (Elasticsearch / OpenSearch style)
12 scenes

Five million books, a search box, and a 100 ms budget. Build the engine from the inverted index up — segment, refresh, shard, replica, scatter-gather, BM25 — and feel why every guarantee that lives across shards is paid for in either an extra round trip or a small lie about the rankings.

~84 min
Build Raft — consensus you can defend
12 scenes

Replicate a deterministic state machine across N servers with safety as a theorem and liveness under partial synchrony. Build the protocol from term to commit to safety proof to reads, and feel why etcd, CockroachDB, and TiKV ship slightly different Rafts.

~84 min
Build a Message Queue (RabbitMQ / SQS)
14 scenes

A point-to-point work queue: the messaging primitive Kafka is NOT. Each message goes to one consumer, an ack deletes it, a message that keeps failing is moved to a dead-letter queue, and until then a poisoned message is everyone's problem. Internalize ack vs visibility timeout vs DLQ vs prefetch vs FIFO groups, and learn to tell when Kafka is the wrong tool and when a queue is the wrong one.
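
To make ack, visibility timeout, and DLQ concrete, here is a small sketch. `TinyQueue`, its parameters, and the explicit `now` argument (used instead of a real clock so the behavior is deterministic) are assumptions for illustration; prefetch and FIFO groups are left out.

```python
from collections import deque


class TinyQueue:
    """Toy work queue with visibility timeout and a dead-letter queue."""

    def __init__(self, visibility_timeout=30.0, max_receives=3):
        self.visibility_timeout = visibility_timeout
        self.max_receives = max_receives
        self.ready = deque()    # (msg_id, body) waiting for a consumer
        self.in_flight = {}     # msg_id -> (body, redelivery deadline)
        self.receives = {}      # msg_id -> delivery attempts so far
        self.dead_letters = []  # bodies that exhausted their attempts
        self._next_id = 0

    def send(self, body):
        msg_id = self._next_id
        self._next_id += 1
        self.receives[msg_id] = 0
        self.ready.append((msg_id, body))
        return msg_id

    def receive(self, now):
        self._requeue_expired(now)
        while self.ready:
            msg_id, body = self.ready.popleft()
            self.receives[msg_id] += 1
            if self.receives[msg_id] > self.max_receives:
                self.dead_letters.append(body)  # poisoned: park it
                continue
            # Hide the message from other consumers until the deadline.
            self.in_flight[msg_id] = (body, now + self.visibility_timeout)
            return msg_id, body
        return None

    def ack(self, msg_id):
        # An ack is the delete; an un-acked message comes back later.
        self.in_flight.pop(msg_id, None)

    def _requeue_expired(self, now):
        for msg_id, (body, deadline) in list(self.in_flight.items()):
            if deadline <= now:
                del self.in_flight[msg_id]
                self.ready.append((msg_id, body))
```

A consumer that crashes never acks, so the message reappears after the timeout; one that always crashes on the same message eventually sends it to the dead-letter list.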

~98 min
Build a columnar OLAP store (ClickHouse / Druid style)
13 scenes

OLTP picks one row by key; OLAP scans a billion rows of one column and asks for a percentile. Build the analytical engine that makes that fast: columnar layout, dictionary/RLE/delta compression, vectorized execution, late materialization, MPP shuffle. Internalize why Postgres can be orders of magnitude slower than ClickHouse on a wide analytical scan, and why the gap reverses on single-row lookups and updates.

~91 min
Build a gRPC-style RPC framework
14 scenes

Every microservice talks over RPC, and the framework you ship determines half the system's failure modes. Build an RPC framework with codec, streams, deadlines, cancellation, retries, interceptors, and load-aware client-side balancing — and feel why gRPC ate the polyglot RPC market and why Thrift and JSON-over-HTTP linger.

~98 min
Build a workflow engine (Temporal / Airflow / Cadence style)
13 scenes

A function that survives crashes, restarts, and re-deploys — and still finishes. Build a durable execution engine where workflow code is replayed deterministically from an event history, activities retry with exponential backoff, sagas compensate on failure, and the same workflow definition runs identically a year later. Internalize why 'just retry the cron job' breaks at the second step.
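
The replay idea can be sketched as follows. `WorkflowContext`, `activity`, and `order_workflow` are hypothetical names for this example; retries with backoff and saga compensation are omitted, so this only shows how a recorded event history lets workflow code resume without redoing completed steps.

```python
class WorkflowContext:
    """Records activity results on first run and replays them afterwards."""

    def __init__(self, history=None):
        self.history = list(history or [])  # [(activity_name, result), ...]
        self.cursor = 0

    def activity(self, name, fn, *args):
        if self.cursor < len(self.history):
            # Replay: return the recorded result instead of re-running fn.
            recorded_name, result = self.history[self.cursor]
            if recorded_name != name:
                raise RuntimeError(
                    f"non-deterministic replay: expected {recorded_name!r}, got {name!r}"
                )
        else:
            # First execution: run the side effect and record its result.
            result = fn(*args)
            self.history.append((name, result))
        self.cursor += 1
        return result


def order_workflow(ctx, charge, ship):
    # Workflow code must be deterministic: same inputs, same activity order.
    receipt = ctx.activity("charge", charge, 42)
    return ctx.activity("ship", ship, receipt)
```

If the worker dies between the two steps, a fresh context built from the saved history skips the charge and goes straight to shipping. The name check is what catches workflow code that changed shape between runs.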

~91 min
Build a Service Mesh (Envoy / Istio style)
13 scenes

Every microservice request crosses two proxies. This curriculum is what they do: routing, load balancing, timeout-and-retry-budget, circuit breakers, outlier detection, token-bucket rate limits, mTLS with workload identity, and a control plane that streams config to all of them. Build it in the order the production problems show up — and feel why Envoy plus a control plane has eaten the east-west world.
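
Of the mechanisms listed above, the token-bucket rate limit is the easiest to show in isolation. This is a minimal sketch: the class and parameter names are assumptions, and time is passed in explicitly rather than read from a clock.

```python
class TokenBucket:
    """Toy token bucket: refill at `rate` tokens/second, cap at `capacity`."""

    def __init__(self, rate: float, capacity: float, now: float = 0.0):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full, so an initial burst is allowed
        self.last = now

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill for the time elapsed since the last call, up to capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

`capacity` bounds the burst and `rate` bounds the sustained throughput, which is why the two are tuned separately.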

~91 min
Coming soon
Build a Spanner-style strongly consistent distributed database
Coming soon

External consistency at global scale. Build a database where a Paxos group per shard agrees on every write, TrueTime turns a clock interval into a serialization point, and 2PC across shards stays correct because everyone honors the same wait. Feel why CockroachDB and YugabyteDB diverge from Spanner precisely where TrueTime sits.

Build a distributed SQL engine (CockroachDB-style)
Coming soon

SQL on top of a sea of Raft groups. Build a serializable, horizontally-scaled SQL database where every range is its own Raft group, transactions stitch commits across ranges, and the SQL layer is just an interpreter over a distributed KV store. Internalize why a single transaction can touch dozens of consensus groups and still be correct.

Build a document store (MongoDB-style)
Coming soon

Schemaless documents that still behave like a database. Build a system that stores JSON/BSON natively, ships replica sets with primary elections, shards a collection by key range or hash, and keeps secondary indexes consistent under write — and feel exactly where 'flexible schema' turns into 'silent inconsistency'.

Build a stream processor (Flink / Kafka Streams style)
Coming soon

Event time isn't processing time. Build a stream processor that tracks watermarks, windows by event time, holds keyed state with checkpoints, and recovers exactly-once after a node crash — and internalize why 'streaming SQL' is mostly the dataflow model in a different syntax.

Build a pub/sub system (Google Pub/Sub / Redis Pub/Sub style)
Coming soon

One publisher, N subscribers, no shared queue. Build the fanout primitive that powers notifications, cache invalidation, and reactive UIs — with explicit knobs for at-most-once vs at-least-once delivery, durable vs ephemeral subscriptions, and what happens when a slow subscriber stalls the pipe.

Build a distributed cache (Memcached / Pelikan style)
Coming soon

One Redis is easy. A hundred Redises serving 10M QPS with sub-millisecond p99 across a fleet is the real test. Build the client-side consistent-hashing distributed cache: hot-key replication, mirroring for fault tolerance, the thundering-herd dogpile, and the cache-coherence gymnastics that come with sharding.

Build an inverted index (Lucene-style)
Coming soon

The data structure underneath every search engine. Build a postings-list-based inverted index that survives ingest, merge, and query — term dictionary, skip lists, segment merges, deletes via tombstone. Internalize the per-segment immutable design before you scale it across shards in the Elasticsearch chapter.
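
A rough sketch of the per-segment immutable design with tombstone deletes. `Segment` and `TinyIndex` are hypothetical names, tokenization is a plain whitespace split, doc ids are assumed never to be reused, and skip lists and the term dictionary are left out.

```python
from collections import defaultdict


class Segment:
    """Immutable once built: term -> sorted list of doc ids."""

    def __init__(self, docs):
        postings = defaultdict(set)
        for doc_id, text in docs.items():
            for term in text.lower().split():
                postings[term].add(doc_id)
        self.postings = {t: sorted(ids) for t, ids in postings.items()}


class TinyIndex:
    def __init__(self):
        self.segments = []
        self.tombstones = set()  # deleted doc ids, filtered at query time

    def add_segment(self, docs):
        self.segments.append(Segment(docs))

    def delete(self, doc_id):
        # Segments are never edited; a delete is just a marker.
        self.tombstones.add(doc_id)

    def search(self, term):
        hits = set()
        for seg in self.segments:
            hits.update(seg.postings.get(term.lower(), []))
        return sorted(hits - self.tombstones)

    def merge(self):
        # Rewrite all segments into one, physically dropping deleted docs.
        merged = defaultdict(set)
        for seg in self.segments:
            for term, ids in seg.postings.items():
                merged[term].update(i for i in ids if i not in self.tombstones)
        new = Segment({})
        new.postings = {t: sorted(ids) for t, ids in merged.items() if ids}
        self.segments = [new]
        self.tombstones.clear()
```

Deleted documents keep occupying space until a merge rewrites the segments, which is the cost of never modifying a segment in place.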

Build a metrics / monitoring system
Coming soon

Time-series at scale. Cardinality is the enemy.

Build a coordination service (ZooKeeper / etcd style)
Coming soon

Raft alone isn't enough. Build the service layer on top of Raft that the rest of your infrastructure depends on: a hierarchical namespace, watches, ephemeral nodes, leases, and the recipe library (locks, leader election, queues) that everyone reimplements badly. Internalize why Kubernetes, Kafka, and Consul all run on something shaped like this.

Build a CRDT library
Coming soon

Conflict-free merging as a math problem. Build the canonical CRDTs — counters, sets, registers, sequences — and feel why a commutative-associative-idempotent merge function buys you offline-first sync without a coordinator. The price is paid in metadata growth, and that's the load-bearing trade.
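
The simplest of the canonical CRDTs, a grow-only counter, shows the merge properties directly. This is a minimal sketch; the class name `GCounter` and its method names are chosen for this example.

```python
class GCounter:
    """Grow-only counter: one monotonically increasing slot per replica."""

    def __init__(self, counts=None):
        self.counts = dict(counts or {})  # replica id -> local increments

    def increment(self, replica: str, amount: int = 1) -> None:
        # A replica only ever bumps its own slot.
        self.counts[replica] = self.counts.get(replica, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        # Element-wise max is commutative, associative, and idempotent,
        # so replicas converge regardless of delivery order or duplicates.
        keys = self.counts.keys() | other.counts.keys()
        return GCounter(
            {k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys}
        )
```

The state holds one entry per replica that has ever incremented, a small instance of the metadata growth mentioned above.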

Build a load balancer (HAProxy / NGINX style)
Coming soon

The first hop every request takes — and the single piece of infra most likely to break a deploy. Build a load balancer from L4 (TCP) through L7 (HTTP) — connection pooling, health checks, sticky sessions, slow start, drain, the thundering-herd retry storm. Feel why 'just round-robin it' is the wrong default.

Build a distributed tracing system (Jaeger / Zipkin style)
Coming soon

Logs tell you what happened in one service; metrics tell you the rate; only traces tell you the full causal chain across N services for one user request. Build the third pillar of observability: span propagation, sampling that doesn't lie, span ingestion, trace storage by trace-id, and the UI flame graph that finally answers 'where did the latency go?'.