Build a graph database (Neo4j / Dgraph-style) (16 scenes)
Scene 08 · Anchor selection and cardinality
The planner's real job is to anchor on the most selective node and estimate cardinality — start from the few, not the many — so flipping a WHERE filter can flip which end is cheapest and reverse the whole traversal direction.
Previously
Cypher hid one decision: which end of the pattern to start from. The planner makes it by estimating cardinality and anchoring on the few, not the many — and a WHERE clause can flip that choice and reverse the whole walk. But this entire optimization rests on an assumption: that every node has a SMALL, comparable number of edges. What happens when one node has ten million?
Scene 08
Anchor selection and cardinality
Diagram
The same social graph and the same Cypher pattern from last scene, with one addition: each end of the pattern now carries an estimated-count badge — its cardinality estimation, the planner's guess of how many nodes that end pulls into memory. Alice's RATED end is tiny (~12 movies); the popular Movie end is huge (~1,000,000 raters). The query planner is the component that compares those estimates and anchors on the smaller end, then expands outward; the ringed node is the anchor it chose. Add a WHERE clause that pins a single rare title and the Movie end shrinks to ~1 — now it's the smallest, so the planner re-anchors on the Movie and the whole walk reverses direction.
Same pattern as last scene: Alice's RATED movies. But now look at the two ends. Each one gets a number — an estimate of how many nodes that end pulls into memory. Alice rated about 12 movies. The movie she rated, The Matrix, was rated by about a million people. The chooser compares those two numbers and starts from the SMALLER end — Alice — then expands outward. Step through it.
Implementation
Planner.estimateEnds
the planner's only real cost question: how many nodes does each end pull in?
1def estimate_ends(pattern):2 # cardinality estimate per BOUND end of the pattern3 user_end = stats.out_degree(pattern.user, ':RATED') # Alice ~124 if pattern.where: # WHERE m.title = '...'5 movie_end = stats.selectivity(pattern.where) # pins ~16 else:7 movie_end = stats.in_degree(pattern.movie, ':RATED') # raters8 return {user_end, movie_end} # the numbers on the badges
Planner.pickAnchor
no join planner for an expand — just pick the smaller end, SEEK it once, expand outward
1def pick_anchor(pattern):2 ends = estimate_ends(pattern)3 # anchor on the SMALLEST estimated end — the few, not the many4 anchor = argmin(ends, key=estimate)5 start = index_seek(anchor) # the one index lookup6 # expand AWAY from the anchor; direction follows the choice7 return expand(start, ':RATED', toward=other_end(pattern, anchor))
Not sure what to ask? Tap a question — the staff engineer answers in the chat panel.