Build a graph database (Neo4j / Dgraph-style) (16 scenes)
Scene 14 · Design your graph database
Every graph-DB deployment is a deliberate set of choices — storage layout, index strategy, supernode handling, single-node ACID vs distributed, and traversal-heavy vs aggregate-heavy workload — and the right configuration for a fraud-ring traversal is wrong for a PageRank pipeline even though the primitives are identical.
Previously

You've felt every force: index-free adjacency makes local traversal fly, the supernode and the partition cut break it, ACID anchors correctness on one machine, and Pregel inverts the payoff for whole-graph work. The capstone is choosing deliberately for a real workload — every knob traceable to the scene that justified it, with the local-vs-global spine as your compass.
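To make the local-vs-global spine concrete, here is a minimal Python sketch that counts how many nodes each kind of work touches. The toy ring graph and the function names `k_hop` and `pagerank_pass` are illustrative assumptions, not part of the course material, and the adjacency dict stands in for a real storage layer.

```python
from collections import deque


def k_hop(adj, start, k):
    """Local query: breadth-first expansion from one anchor, at most k hops out."""
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue  # do not expand past the hop limit
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen


def pagerank_pass(adj, rank, damping=0.85):
    """Global pass: one synchronous PageRank iteration reads and writes every node."""
    n = len(adj)
    nxt = {v: (1.0 - damping) / n for v in adj}
    for v, nbrs in adj.items():
        if nbrs:
            share = damping * rank[v] / len(nbrs)
            for u in nbrs:
                nxt[u] += share
        else:
            # Dangling node: spread its rank uniformly over all nodes.
            for u in adj:
                nxt[u] += damping * rank[v] / n
    return nxt


# Hypothetical toy graph: a directed ring of N nodes.
N = 1000
adj = {i: [(i + 1) % N] for i in range(N)}

local_nodes = k_hop(adj, start=0, k=3)
rank = {v: 1.0 / N for v in adj}
new_rank = pagerank_pass(adj, rank)

print(len(local_nodes), "nodes touched by the 3-hop query")
print(len(new_rank), "nodes touched by one PageRank pass")
```

The 3-hop query's cost depends on the neighbourhood around the anchor, while the PageRank pass scales with the size of the whole graph regardless of where you start.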

Diagram
The capstone canvas. Four workload cards are docked across the top — each carries a one-line constraints summary (mutating or read-mostly, fits one box or too big, supernodes, local hops or whole-graph). One card is live at a time. Down the side sits the palette of every knob the arc earned, in six groups: storage layout (doubly-linked chains vs CSR array), index strategy (the anchor you SEEK before you EXPAND), supernode handling (expand the cheap direction, dense-node grouping, relationship-chain locks), consistency (single-primary ACID vs distributed/weaker isolation), distribution (single node, edge-cut, vertex-cut, predicate sharding), and workload model (on-demand local traversal vs whole-graph Pregel). As you choose knobs, the verifier grades each one against the LIVE workload — green 'fits', amber 'wasteful' (you're paying on an axis this workload doesn't constrain), red 'violation' (it breaks a hard limit) — and every verdict cites the scene that justifies it. The compass is the local-vs-global spine: a local query lights a few nodes; a whole-graph pass lights them all.
[Diagram labels, truncated in the source where marked with …]
Workload cards: Fraud-ring traversal (ACTIVE; fraud-traversal; real-time multi-hop · constantly mutating · fits one big box); Knowledge graph (knowledge-graph; billions of typed edges · type-…); Social feed (social-feed; celebrity supernodes · follow-f…); Nightly PageRank (pagerank-batch; whole-graph scan · read-mostly …)
DESIGN PALETTE
Storage layout: doubly-linked chains (…; CSR array (compact / i…
Index strategy: B-tree anchor on the l…; full-text (Lucene) anc…; no anchor index (label…
Supernode handling: expand the low-degree …; dense-node relationshi…; relationship-chain loc…
Consistency: single-primary ACID (l…; distributed / weaker i…
Distribution: single node (no partit…; edge-cut partition; vertex-cut partition; predicate sharding (ed…
Workload model: on-demand local traver…; whole-graph Pregel / B…
VERIFIER (configuring: Fraud-ring traversal)
FITS · doubly-linked chains (mutable / O… · "Doubly-linked chains take cheap edge inserts on a …" · scene 6
FITS · B-tree anchor on the lookup prope… · "You still SEEK the flagged account once via an index before…" · scene 5
FITS · single-primary ACID (logical log) · "A mutating fraud graph on one box gets full ACID with no …" · scene 10
FITS · single node (no partition) · "It fits one big box, so every hop stays a pointer dereference, no…" · scenes 3, 11
FITS · on-demand local traversal · "k-hop 'who-transacted-with-whom' lights a few nodes: the local …" · scene 3
four workloads, one toolkit
the spine is your compass: local lights a few · global lights all
every knob traces back to a scene →
Here is everything the arc built, in one place. Four workload cards are docked across the top — *fraud-ring traversal*, *knowledge graph*, *social feed*, *nightly PageRank* — and each card states its real constraints: does the graph mutate or sit read-mostly, does it fit one box or sprawl across machines, are there celebrity supernodes, and does a query touch a few nodes or every node. Down the side is the palette: *storage layout* (mutable doubly-linked chains vs compact CSR), the *index* anchor you SEEK before you EXPAND, *supernode handling*, *consistency* (single-primary ACID vs distributed), *distribution* (single node, edge-cut, vertex-cut, predicate sharding), and the *workload model* (on-demand local traversal vs whole-graph Pregel). Every one of those knobs was earned in an earlier scene. The job now is not to learn anything new — it's to read each workload off the local-vs-global spine and pick the honest configuration. The same toolkit; four different right answers.
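The grading idea can be sketched as a small rule table in Python. Everything below is a hypothetical simplification: the `Workload` fields, the knob names, and the rules in `grade` are assumptions chosen to mirror the fits / wasteful / violation verdicts described above, not the course's actual verifier. The fraud-ring constraints come from its card; the PageRank card is truncated in the source, so its `fits_one_box` value is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    FITS = "fits"            # the knob matches a real constraint
    WASTEFUL = "wasteful"    # paying on an axis the workload does not constrain
    VIOLATION = "violation"  # the knob breaks a hard limit


@dataclass(frozen=True)
class Workload:
    name: str
    mutating: bool       # constantly mutating vs read-mostly
    fits_one_box: bool   # fits one machine vs must be partitioned
    whole_graph: bool    # touches every node vs a few local hops


def grade(workload: Workload, group: str, choice: str) -> Verdict:
    """Grade one knob against the live workload (assumed, simplified rules)."""
    if group == "storage":
        if choice == "chains":
            return Verdict.FITS if workload.mutating else Verdict.WASTEFUL
        if choice == "csr":
            # Assumption: a compact CSR array is costly to update in place.
            return Verdict.VIOLATION if workload.mutating else Verdict.FITS
    if group == "distribution":
        if choice == "single-node":
            return Verdict.FITS if workload.fits_one_box else Verdict.VIOLATION
        if choice in ("edge-cut", "vertex-cut", "predicate-sharding"):
            return Verdict.WASTEFUL if workload.fits_one_box else Verdict.FITS
    if group == "model":
        if choice == "local-traversal":
            return Verdict.WASTEFUL if workload.whole_graph else Verdict.FITS
        if choice == "pregel":
            return Verdict.FITS if workload.whole_graph else Verdict.WASTEFUL
    raise ValueError(f"unknown knob: {group}/{choice}")


fraud = Workload("fraud-traversal", mutating=True, fits_one_box=True, whole_graph=False)
# fits_one_box=False is an assumption; the PageRank card is cut off in the source.
pagerank = Workload("pagerank-batch", mutating=False, fits_one_box=False, whole_graph=True)

config = [("storage", "chains"), ("distribution", "single-node"), ("model", "local-traversal")]
for wl in (fraud, pagerank):
    for group, choice in config:
        print(f"{wl.name:16} {group:13} {choice:16} -> {grade(wl, group, choice).value}")
```

Running the same three knobs against both workloads shows the point of the scene: the configuration that grades as all "fits" for the fraud-ring card picks up "wasteful" and "violation" verdicts on the PageRank card. A fuller verifier would also grade the index, supernode, and consistency groups and cite the justifying scene for each verdict.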