Build the primitive. Then nothing scares you.

One server is not enough — disk fills, throughput maxes, the box dies. Build a multi-node store from first principles: hash sharding, the consistent-hash ring, vnodes, replication, eventual consistency, tunable W+R quorum, hinted handoff, read repair. Every modern Dynamo-style store is a point in this design space.
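As a taste of the first two ideas in that list, here is a minimal sketch of a consistent-hash ring with vnodes plus the W+R quorum check. The names `HashRing`, `replicas`, and `quorum_overlaps` are hypothetical, and MD5 is only an illustrative choice of hash; none of this is taken from the course itself.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # 64-bit ring position from an MD5 digest (illustrative choice of hash).
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical ring: each physical node owns `vnodes` points on the ring."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes
        self._points = []   # sorted ring positions
        self._owner = {}    # ring position -> node name

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def replicas(self, key: str, n: int = 3) -> list:
        """Walk clockwise from the key's position and collect n distinct nodes."""
        if not self._points:
            return []
        start = bisect.bisect_right(self._points, _hash(key))
        found = []
        for step in range(len(self._points)):
            point = self._points[(start + step) % len(self._points)]
            node = self._owner[point]
            if node not in found:
                found.append(node)
                if len(found) == n:
                    break
        return found


def quorum_overlaps(n: int, w: int, r: int) -> bool:
    # Every read quorum intersects every write quorum exactly when W + R > N.
    return w + r > n
```

Adding a node only inserts new points on the ring, so a key's primary owner either stays put or moves to the new node; that is the property that keeps rebalancing cheap.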

Build a Prometheus-style time-series database

The simplest database that can absorb a 1M-points-per-second firehose and still answer `sum(rate(http_requests_total{status="500"}[5m]))` in milliseconds — built bit by bit, literally.

Build a graph database (Neo4j / Dgraph-style)

16 scenes

When the workload is 'friends of friends', a relational join melts. Build a store where edges are first-class: index-free adjacency, traversals that follow pointers instead of joining tables, and a query language (Cypher / GraphQL+-) that thinks in patterns. Feel why graph storage shines for traversal-heavy work and stumbles on full-graph aggregates.

~112 min

Build a vector database (Pinecone / Weaviate / pgvector style)

15 scenes

Approximate nearest-neighbor over a billion 1536-dim vectors in 10 ms. Build the index from scratch (HNSW, IVF, PQ), pay the recall-vs-latency tax explicitly, support filtered + hybrid search, and feel why every LLM stack in 2026 has a vector store next to its KV store.

~105 min

Build an S3-style distributed object store

Eleven nines of durability over disks that fail weekly. Build the object store from first principles — flat keyspace, immutable objects, erasure coding instead of replication, eventual consistency turned strong, multipart upload, lifecycle and tiering — and feel why every modern data lake sits on top of something shaped exactly like this.

Build Kafka

A partitioned, replicated, append-only log. The log is the database — internalize that, and a dozen product designs get easier.

Build a CDC pipeline (Debezium + outbox)

Your service writes to its DB and publishes to Kafka — and any crash between those two writes is permanent inconsistency. Build a Change Data Capture pipeline (modeled on Debezium + the outbox pattern) that closes the gap by making the database itself the event source.

Build a distributed logging stack (ELK / Loki)

Ship lines off N hosts, choose what to index, age data through tiers, retain or delete on schedule, and survive a chatty service — built one decision at a time.

Build Redis

10 scenes

An in-memory data-structure server: one thread, rich types, optional persistence, async replication. Internalize the cost of single-threaded simplicity and a dozen caching/HA decisions get easier.

~70 min

Build a CDN

A globally-distributed reverse proxy whose only job is to (a) terminate the user's TCP/TLS milliseconds away and (b) serve a cached origin response so origin never sees the request. Internalize edge caching, anycast, TTL, revalidation, SWR, purge, the Vary footgun, origin shield, bypass, and hit ratio — and the dozen ways to misconfigure each.

Build a distributed search engine (Elasticsearch / OpenSearch style)

Five million books, a search box, and a 100 ms budget. Build the engine from the inverted index up — segment, refresh, shard, replica, scatter-gather, BM25 — and feel why every guarantee that lives across shards is paid for in either an extra round trip or a small lie about the rankings.

Build Raft — consensus you can defend

Replicate a deterministic state machine across N servers with safety as a theorem and liveness under partial synchrony. Build the protocol from term to commit to safety proof to reads, and feel why etcd, Cockroach, and TiKV ship slightly different Rafts.

Build a Message Queue (RabbitMQ / SQS)

14 scenes

A point-to-point work queue, the messaging primitive Kafka is NOT. Each message goes to one consumer, an ack deletes it, a message that exhausts its retries moves to a dead-letter queue, and a poisoned message is everyone's problem. Internalize ack vs visibility timeout vs DLQ vs prefetch vs FIFO groups, and learn to tell when Kafka is the wrong tool and when a queue is.

~98 min

Build a columnar OLAP store (ClickHouse / Druid style)

OLTP picks one row by key; OLAP scans a billion rows of one column and asks for a percentile. Build the analytical engine that makes that fast: columnar layout, dictionary/RLE/delta compression, vectorized execution, late materialization, MPP shuffle. Internalize why Postgres can be 1000× slower than ClickHouse on an analytical scan, and why the reverse holds for single-row lookups and updates.
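To make the compression step concrete, here is a rough sketch of two of the encodings named above, run-length and delta, applied to a single column. The function names are hypothetical and a real engine would operate on typed, packed buffers rather than Python lists.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse runs of equal adjacent values into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(runs):
    return [value for value, count in runs for _ in range(count)]


def delta_encode(values):
    """Keep the first value, then store each value as a difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out
```

A sorted or low-cardinality column turns into a handful of runs, and a monotonically increasing timestamp column turns into small integers, which is why columnar layouts compress so much better than row layouts.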

Build a gRPC-style RPC framework

14 scenes

Every microservice talks over RPC, and the framework you ship determines half the system's failure modes. Build an RPC framework with codec, streams, deadlines, cancellation, retries, interceptors, and load-aware client-side balancing — and feel why gRPC ate the polyglot RPC market and why Thrift and JSON-over-HTTP linger.

~98 min

Build a workflow engine (Temporal / Airflow / Cadence style)

A function that survives crashes, restarts, and re-deploys — and still finishes. Build a durable execution engine where workflow code is replayed deterministically from an event history, activities retry with exponential backoff, sagas compensate on failure, and the same workflow definition runs identically a year later. Internalize why 'just retry the cron job' breaks at the second step.
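The retry half of that sentence can be sketched in a few lines. This is a minimal illustration of exponential backoff only, with no durable event history or replay; `backoff_delays` and `run_activity` are hypothetical names, and the `sleep` parameter is injectable so the schedule can be inspected without waiting.

```python
import time


def backoff_delays(max_attempts, base=0.1, factor=2.0, cap=10.0):
    """Delay before each retry: base, base*factor, ... capped at `cap`.

    There are max_attempts - 1 retries, hence max_attempts - 1 delays.
    """
    return [min(cap, base * factor ** i) for i in range(max_attempts - 1)]


def run_activity(fn, max_attempts=5, sleep=time.sleep, **backoff):
    """Call fn until it succeeds or attempts run out, sleeping between tries."""
    delays = backoff_delays(max_attempts, **backoff)
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(delays[attempt])
```

Production schedulers usually add random jitter to each delay so that many failing callers do not retry in lockstep; that is left out here to keep the schedule deterministic.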

Build a Service Mesh (Envoy / Istio style)

Every microservice request crosses two proxies. This curriculum covers what they do: routing, load balancing, timeout-and-retry-budget, circuit breakers, outlier detection, token-bucket rate limits, mTLS with workload identity, and a control plane that streams config to all of them. Build it in the order the production problems show up, and feel why Envoy plus a control plane has eaten the east-west world.
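Of the mechanisms listed, the token bucket is the quickest to sketch. The class below is a hypothetical single-process version, not Envoy's implementation; the `clock` parameter is an assumption added so the behavior can be checked with a fake clock.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start full
        self.last = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill lazily for the time elapsed since the last call, up to capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `rate=1` and `capacity=2`, two back-to-back requests pass, a third is rejected, and one more passes after a second has elapsed.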