Build a wide-column store (Cassandra / DynamoDB family)
13 scenes · ~91 min · build the primitive

One server is not enough — disk fills, throughput maxes, the box dies. Build a multi-node store from first principles: hash sharding, the consistent-hash ring, vnodes, replication, eventual consistency, tunable W+R quorum, hinted handoff, read repair. Every modern Dynamo-style store is a point in this design space.

  1. 01
    One server, three ways to die
    A single box hits a capacity wall, a throughput wall, and a death event — and your data is gone.
    ~7 min
  2. 02
    Split keys with hash mod N
    Hash the key, take it mod N, route to that server — keys spread evenly across the cluster.
    ~7 min
  3. 03
    Adding a server remaps everything
    When N changes from 4 to 5, about 80% of keys change home (only keys whose hash gives the same remainder mod 4 and mod 5 stay put), which means a cluster-wide migration.
    ~7 min
  4. 04
    The ring — keys and nodes on a circle
    Map both keys and servers onto a circle; each key belongs to the next server clockwise.
    ~7 min
  5. 05
    Vnodes flatten the lumpy ring
    Give each physical server many small ring positions; arcs become uniform and deaths spread their load.
    ~7 min
  6. 06
    Copies on the next N servers
    Store each key on the next RF distinct servers clockwise so a node death loses no data.
    ~7 min
  7. 07
    Replicas disagree, then converge
    Concurrent writes hit replicas at different times; for a moment they disagree, but they converge under last-write-wins (LWW).
    ~7 min
  8. 08
    W plus R greater than N
    Slide two knobs, W and R, until W + R > N (N here being the number of replicas per key), so every read overlaps every write on at least one replica.
    ~7 min
  9. 09
    Partition forces a choice
    Split the cluster in two; one side keeps quorum, the other goes unavailable — or you accept divergence.
    ~7 min
  10. 10
    Hold the write until it wakes up
    When a replica is briefly unreachable, the coordinator stashes the write and replays it on return.
    ~7 min
  11. 11
    Heal on read, heal on a schedule
    When a read sees disagreement, push the winner to stale replicas; on a cron, compare every key.
    ~7 min
  12. 11a
    Gossip keeps everyone informed
    Each node, every second, swaps state with a few peers; the cluster picture converges without a master.
    ~7 min
  13. 12
    Design canvas — pick every knob
    Given a real workload, pick partition key, RF, W/R, vnodes, and DC topology — and grade the result.
    ~7 min
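
Scenes 02 and 03 can be previewed with a few lines of Python. This is a minimal sketch, not the course's own code: `stable_hash`, `shard_for`, and `moved_fraction` are hypothetical helper names, and MD5 is assumed only as a convenient deterministic hash (Python's built-in `hash()` is salted per process for strings).

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a key (assumption: MD5 truncated to 8 bytes)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def shard_for(key: str, n_servers: int) -> int:
    """Scene 02: hash the key, take it mod N, route to that server."""
    return stable_hash(key) % n_servers


def moved_fraction(keys, old_n: int, new_n: int) -> float:
    """Scene 03: fraction of keys whose home changes when N changes."""
    moved = sum(1 for k in keys if shard_for(k, old_n) != shard_for(k, new_n))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.1%} of keys change servers when N goes from 4 to 5")
```

A key stays put only when its hash has the same remainder mod 4 and mod 5, which holds for 4 of every 20 hash values, so the printed fraction should land near 80%.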
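
Scenes 04 through 06 (the ring, vnodes, and replica placement) fit in one small class. The sketch below rests on stated assumptions: the `Ring` class, the `node#i` token labels, and the default of 64 vnodes per node are all illustrative choices, not values taken from any real store.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Place a label (key or vnode token) on the ring; MD5 is an assumed hash."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest()[:8], "big")


class Ring:
    def __init__(self, nodes, vnodes_per_node: int = 64):
        # Scene 05: each physical node owns many small ring positions.
        self._tokens = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self._positions = [pos for pos, _ in self._tokens]
        self._node_count = len(set(nodes))

    def replicas(self, key: str, rf: int = 3):
        """Scenes 04 and 06: walk clockwise, collect the next rf distinct nodes."""
        rf = min(rf, self._node_count)
        start = bisect.bisect_right(self._positions, ring_position(key))
        found = []
        for step in range(len(self._tokens)):
            _, node = self._tokens[(start + step) % len(self._tokens)]
            if node not in found:
                found.append(node)
                if len(found) == rf:
                    break
        return found


old_ring = Ring(["a", "b", "c", "d"])
new_ring = Ring(["a", "b", "c", "d", "e"])
keys = [f"user:{i}" for i in range(5_000)]
moved = [k for k in keys if old_ring.replicas(k, 1)[0] != new_ring.replicas(k, 1)[0]]
print(f"{len(moved) / len(keys):.1%} of keys change primary after adding node e")
```

Unlike hash mod N, adding a node only moves the keys that the new node takes over: every key in `moved` now has `e` as its primary, and the share should be roughly one fifth rather than four fifths.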
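
For scenes 07 and 08, two small functions capture the core ideas: a last-write-wins merge and a brute-force check of the quorum overlap rule. This is a sketch under assumptions: `Versioned`, `lww_merge`, and `quorums_always_overlap` are hypothetical names, timestamps are plain integers, and ties are broken by value purely to keep the merge deterministic.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Versioned:
    value: str
    timestamp: int  # assumed to be the writer's timestamp


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Scene 07: the higher timestamp wins; ties fall back to comparing values."""
    return max(a, b, key=lambda v: (v.timestamp, v.value))


def quorums_always_overlap(n: int, w: int, r: int) -> bool:
    """Scene 08: does every set of w replicas intersect every set of r replicas?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


x = Versioned("blue", timestamp=10)
y = Versioned("green", timestamp=12)
print(lww_merge(x, y).value, lww_merge(y, x).value)  # same winner in either order
print(quorums_always_overlap(3, 2, 2), quorums_always_overlap(3, 1, 1))
```

Because the merge gives the same answer regardless of argument order, replicas that see the same writes in different orders end up with the same value. The overlap check agrees with the W + R > N rule for every small configuration it can enumerate.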
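
Scenes 10 and 11 (hinted handoff and read repair) can be illustrated with an in-memory toy. Everything here is an assumption made for illustration: the `Replica` and `Coordinator` classes, hints held in a coordinator-side dictionary, and reads that consult the first R live replicas. Real stores persist hints, expire them, and pick replicas differently.

```python
from collections import defaultdict


class Replica:
    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data = {}  # key -> (timestamp, value)

    def apply(self, key, ts, value):
        # Last-write-wins: keep the larger (timestamp, value) pair.
        current = self.data.get(key)
        if current is None or (ts, value) > current:
            self.data[key] = (ts, value)


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = defaultdict(list)  # replica name -> writes it missed

    def write(self, key, ts, value, w: int) -> bool:
        acks = 0
        for rep in self.replicas:
            if rep.up:
                rep.apply(key, ts, value)
                acks += 1
            else:
                # Hinted handoff: stash the write for the unreachable replica.
                self.hints[rep.name].append((key, ts, value))
        return acks >= w

    def replay_hints(self, rep: Replica):
        for key, ts, value in self.hints.pop(rep.name, []):
            rep.apply(key, ts, value)

    def read(self, key, r: int):
        live = [rep for rep in self.replicas if rep.up][:r]
        if len(live) < r:
            raise RuntimeError("not enough live replicas for the read quorum")
        answers = [(rep, rep.data.get(key)) for rep in live]
        versions = [v for _, v in answers if v is not None]
        if not versions:
            return None
        winner = max(versions)
        for rep, seen in answers:
            if seen != winner:
                rep.apply(key, *winner)  # read repair: push the winner to stale replicas
        return winner[1]


a, b, c = Replica("a"), Replica("b"), Replica("c")
coord = Coordinator([a, b, c])

c.up = False
ok_first = coord.write("k", 1, "v1", w=2)   # c misses the write, a hint is stored
c.up = True
coord.replay_hints(c)                        # c catches up on return

b.up = False
ok_second = coord.write("k", 2, "v2", w=2)  # b misses this one
b.up = True
coord.hints.clear()                          # pretend the hint was lost
result = coord.read("k", r=2)                # read sees a and b disagree, repairs b
```

After the replay, `c` holds the first write; after the read, `b` holds the second even though its hint was lost, which is why the two mechanisms are usually paired.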
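
Scene 11a's claim that the cluster picture converges without a master can be tried in a short simulation. This sketch assumes a push-pull exchange with one random peer per node per round and a per-node version number as the only state; `gossip_round` and `rounds_to_converge` are hypothetical names, and real gossip protocols also carry heartbeats and failure suspicion.

```python
import random


def gossip_round(views, rng, fanout: int = 1):
    """Each node swaps state with `fanout` random peers; both keep the merged view."""
    names = list(views)
    for name in names:
        peers = rng.sample([n for n in names if n != name], fanout)
        for peer in peers:
            merged = dict(views[peer])
            for node, version in views[name].items():
                merged[node] = max(version, merged.get(node, 0))
            views[name] = merged
            views[peer] = dict(merged)


def rounds_to_converge(n_nodes: int, seed: int = 0, max_rounds: int = 50):
    """Return how many rounds pass until every node knows every node, or None."""
    rng = random.Random(seed)
    # Initially each node knows only about itself.
    views = {f"n{i}": {f"n{i}": 1} for i in range(n_nodes)}
    for round_no in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(len(view) == n_nodes for view in views.values()):
            return round_no
    return None


rounds = rounds_to_converge(8)
print(f"8 nodes agreed on membership after {rounds} round(s)")
```

The number of rounds depends on the seed, but it grows slowly with cluster size because each exchange spreads everything both sides know.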