Build a vector database (Pinecone / Weaviate / pgvector style) (15 scenes)
Scene 07 · Express lanes — layers and ef_search
Skip-list layers add express-lane nodes that cross huge distances in one hop, then descend for local refinement — roughly log search, with ef_search trading beam width for recall until it plateaus.
Previously
A flat graph hops one short link at a time; HNSW's promoted upper layers are express lanes that cross the space in a few big hops, then descend for precise local steps — roughly logarithmic search, with ef_search as the recall-vs-latency dial. But all that speed and recall has a hidden bill: HNSW keeps every full vector AND every edge in memory.
Scene 07
Express lanes — layers and ef_search
Diagram
The same 14-song map, now drawn as stacked translucent planes. The TOP plane holds only 2-3 promoted nodes joined by long edges — the express lanes that cover big distances in one hop. The BOTTOM plane holds all 14 songs with dense local links for fine steps. The search enters at the top, takes one big hop near the query's region, drops a layer, and finishes with small careful hops. The mini-chart on the right is the recurring recall-vs-latency frontier; the live HNSW dot climbs as the candidate beam widens, then stalls.
Last scene the graph was flat — one short hop at a time. Picture searching a billion points that way: from a random start you'd take thousands of tiny steps just to cross the map. The fix is borrowed straight from a skip list. Most songs live only on the BOTTOM layer with their close local links. But a few are PROMOTED to a sparse TOP layer and joined by long edges — think of an express train that skips every local stop, gets you across town in one ride, and then you switch to the local for the last block. Watch the search enter at the top, take one big express hop into the query's region, drop down a layer, and finish with small careful hops — landing on Midnight Drive, Synth Dawn, and Lonely Synth, the query's true neighbors.
Implementation
HNSW.search
enter at the top, descend layer by layer to the query
1def search(query, k):2 ep = entry_point # a promoted top-layer node3 # express lanes: greedy-hop down the sparse upper layers,4 # keeping just the single best node at each level5 for level in range(top_layer, 0, -1):6 ep = search_layer(query, ep, ef=1, level)7 # bottom layer: dense local refinement with the full beam8 candidates = search_layer(query, ep, ef=ef_search, level=0)9 return candidates.nearest(k)
HNSW.searchLayer
greedy best-first walk with a beam of width ef
1def search_layer(query, ep, ef, level):2 visited = {ep}3 candidates = pq([ep]) # frontier to explore4 found = pq([ep]) # best ef so far5 while candidates:6 c = candidates.pop_nearest(query)7 if dist(c, query) > found.farthest_dist():8 break # nothing closer left to find9 for n in neighbors(c, level):10 if n not in visited:11 visited.add(n)12 candidates.push(n); found.push(n)13 found.trim_to(ef) # keep only ef closest14 return found
HNSW.insert
exponential decay promotes a few nodes — the express lanes
1def insert(node):2 # most nodes stay at level 0; a rare few are promoted high,3 # which is what builds the sparse upper express lanes4 level = floor(-ln(rand()) * mL)5 ep = entry_point6 for lvl in range(top_layer, level, -1):7 ep = search_layer(node, ep, ef=1, lvl)8 for lvl in range(min(level, top_layer), -1, -1):9 nbrs = search_layer(node, ep, ef=ef_construction, lvl)10 node.link(select_M(nbrs), lvl) # M edges per level11 if level > top_layer: entry_point = node
Not sure what to ask? Tap a question — the staff engineer answers in the chat panel.