Vnodes flatten the lumpy ring

Build a wide-column store (Cassandra / DynamoDB family) (13 scenes)

Scene 05 · Vnodes flatten the lumpy ring

Give each physical server many small ring positions; arcs become uniform and deaths spread their load.

Previously

The ring is great in theory, but with one position per server those arcs were wildly unequal — and worse, a dead server dumped its entire arc onto a single clockwise neighbour. We need to chop each server's stake into many small pieces.

Diagram

Same 8-server ring as scene 4. Each physical server now owns many small wedges scattered around the ring — its **vnodes**. Wedges of the same colour belong to the same physical server (a vnode is a virtual ring position, NOT a virtual machine). The bar at the top reports load variance — how lumpy the arcs are. When a server dies its wedges grey out, and the next-clockwise wedges (now owned by many different servers, not one) absorb the load.

Diagram labels: a vnode is a virtual ring position (not a VM); load variance shrinks as vnodes per server goes up.

Same 8 servers as scene 4, but each one now holds 4 small wedges scattered around the ring. The variance bar up top is what to watch: each server's total arc is far closer to the ideal share of 360° / 8 = 45°, and no single neighbour is on the hook for a whole server's worth of keys anymore.

Implementation

tokensForNode

deterministic vnode positions per physical server

def tokensForNode(nodeId, vnodeCount):
    out = []
    for i in range(vnodeCount):
        # pure arithmetic, no rng: same answer every render
        t = ((nodeId * 1009 + i * 2017) * 137) % 360
        out.append(t)
    return sorted(out)

# ring tokens = union of tokensForNode(n, k) for every node n
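
The union described in that last comment can be turned into a lookup table. What follows is a minimal sketch, not the scene's own code: `build_ring` and `owner_of` are hypothetical names, and the rule that a key belongs to the first token at or clockwise after its angle is an assumption chosen to match the clockwise-successor hand-off used later in this scene. `tokens_for_node` restates the arithmetic above so the block runs on its own.

```python
from bisect import bisect_left

def tokens_for_node(node_id, vnode_count):
    """Same arithmetic as tokensForNode above, restated so this block is self-contained."""
    return sorted(((node_id * 1009 + i * 2017) * 137) % 360 for i in range(vnode_count))

def build_ring(node_ids, vnode_count):
    """Return (token, node_id) pairs in clockwise order: the union of every node's tokens."""
    return sorted((t, n) for n in node_ids for t in tokens_for_node(n, vnode_count))

def owner_of(ring, angle):
    """Assumed convention: a key belongs to the first token at or after its angle."""
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, angle % 360)
    return ring[i % len(ring)][1]  # past the last token, wrap around to the first

ring = build_ring(range(8), 4)
print(len(ring))              # 8 servers x 4 vnodes = 32 ring positions
print(owner_of(ring, 100))    # physical server that owns a key hashed to 100 degrees
```

Storing the owner next to each token keeps the lookup to one binary search; a real store would hash keys into a much larger token space than 0..359, which this sketch keeps only to mirror the diagram.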

loadVariance

variance of arc sizes shrinks as vnodeCount grows

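def loadVariance(nodes, vnodeCount):
    arcs = []   # (owner, size) per ring slice
    for token in sorted(all_tokens(nodes, vnodeCount)):
        arcs.append((owner(token), arc_size(token)))
    per_node_total = sum_by_owner(arcs)
    # relative spread: standard deviation as a fraction of the mean
    return stddev(per_node_total) / mean(per_node_total)

# all_tokens, owner, arc_size, sum_by_owner, stddev and mean are
# helpers assumed to exist elsewhere; they are not shown in this scene.
# law of large numbers: each node's stake is a sum of
# vnodeCount random arcs, so the relative spread returned above
# (stddev / mean, not the variance itself) shrinks ~ 1 / sqrt(vnodeCount).
# 1 -> ~50% ; 4 -> ~20% ; 64 -> ~5%.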
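
The shrinking spread can be checked with a small simulation. This is a sketch under stated assumptions rather than the scene's implementation: tokens are drawn uniformly at random (the comment above speaks of random arcs), each token owns the arc stretching back to the previous token, and `relative_spread` and `average_spread` are hypothetical names. The exact percentages depend on how tokens are placed and how many servers share the ring, so treat the printed numbers as illustrative rather than as a check of the figures quoted above.

```python
import random
from statistics import mean, pstdev

def relative_spread(node_count, vnode_count, rng):
    """stddev / mean of per-node ring share, with tokens drawn uniformly at random."""
    ring = sorted((rng.random() * 360.0, n)
                  for n in range(node_count) for _ in range(vnode_count))
    share = [0.0] * node_count
    for i, (token, owner) in enumerate(ring):
        prev_token = ring[i - 1][0]            # i == 0 wraps to the last token
        share[owner] += (token - prev_token) % 360.0
    return pstdev(share) / mean(share)

def average_spread(node_count, vnode_count, trials=200, seed=1):
    """Average the spread over many random rings to smooth out luck."""
    rng = random.Random(seed)
    return mean(relative_spread(node_count, vnode_count, rng) for _ in range(trials))

for k in (1, 4, 64):
    print(k, round(average_spread(8, k), 3))
```

The trend to look for is that each jump in vnodes per server lowers the printed spread, with diminishing returns as the count grows.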

absorbDeadNode

where a dead server's load goes

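def absorbDeadNode(dead):
    for vnode in dead.vnodes:
        # each greyed wedge hands its keys to its own clockwise
        # successor, usually a different physical node than the
        # previous wedge's successor when vnodeCount is high.
        successor = ring.successor(vnode.token + 1)
        # a successor that also belongs to the dead server cannot
        # take keys, so keep walking clockwise to a live owner
        while successor.owner is dead:
            successor = ring.successor(successor.token + 1)
        successor.adopt(vnode.keys)

# ring, vnode.token, vnode.keys and successor.owner are assumed
# shapes for the scene's ring model, which is not shown here.
# 1 token   -> 1 successor crushed
# 64 tokens -> 64 hand-offs of ~1/64 of the load each, spread across
#              the surviving servers (at most 7 on this 8-server ring)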
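
To see where a dead server's load lands, the hand-off can be simulated on a ring of plain tuples. This is a minimal sketch, assuming random token placement and at least one surviving server; `build_ring` and `absorbed_load` are hypothetical names and are not part of the scene's ring model.

```python
import random
from collections import Counter

def build_ring(node_count, vnode_count, seed=7):
    """(token, owner) pairs in clockwise order; random tokens are an assumption."""
    rng = random.Random(seed)
    return sorted((rng.random() * 360.0, n)
                  for n in range(node_count) for _ in range(vnode_count))

def absorbed_load(ring, dead):
    """Map each surviving node to the degrees of arc it inherits from `dead`."""
    inherited = Counter()
    size = len(ring)
    for i, (token, owner) in enumerate(ring):
        if owner != dead:
            continue
        arc = (token - ring[i - 1][0]) % 360.0   # arc this dead wedge owned
        j = (i + 1) % size
        while ring[j][1] == dead:                # skip wedges that died with it
            j = (j + 1) % size                   # assumes some other node survives
        inherited[ring[j][1]] += arc
    return inherited

for k in (1, 64):
    load = absorbed_load(build_ring(8, k), dead=0)
    biggest = max(load.values()) / sum(load.values())
    print(k, len(load), round(biggest, 2))  # vnodes, survivors hit, largest share
```

With one token per server the single successor takes the whole arc; with many tokens the same total is split among several survivors, so the largest share printed for the high vnode count should be well below 1.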
