Build a vector database (Pinecone / Weaviate / pgvector style) (15 scenes)
Scene 04 · IVF — carve the space into cells
k-means splits the space into Voronoi cells; at query time scan only the nprobe cells nearest the query — turning an all-N scan into a fraction of it. The first explicit recall-for-speed trade.
Previously
Flat touched all N vectors; IVF carves the space into cells and touches only the nprobe cells near the query, buying a 10-100× speedup for a sliver of recall. But that sliver has a specific, repeatable shape — and it's worth its own scene, because it's the gotcha that bites every junior who ships IVF.
Scene 04
IVF — carve the space into cells
Diagram
The same song map, now divided into colored neighborhoods. k-means plants a handful of centroids; every song joins its nearest centroid's cell (the colored region around it). The query 'Now Playing' sits in the middle cell. At search time IVF only opens the few cells closest to the query (drawn in full color) and skips the rest (greyed) — so a song in a skipped cell can't be returned, no matter how close it actually is. The little chart on the right is the recurring recall-vs-speed plot: Flat sits at perfect recall but slow; the IVF dot sits to its right — faster, but below 1.0 recall.
Flat's whole problem was that it compared the query against every single song. IVF — short for *inverted file index* — fixes that with one upfront step: it runs a clustering pass called *k-means* that drops a handful of marker points (centroids) into the space, and every song joins whichever centroid it's nearest to. The colored region a centroid owns is its *cell*. Watch the centroids drop in and the boundaries snap shut — the map is now carved into a few neighborhoods. The query 'Now Playing' lands squarely in the middle cell, alongside Midnight Drive, Synth Dawn, and Study Beats. (A common rule of thumb sizes the number of cells around 4·√N to 16·√N — small enough that each holds a real neighborhood.)
Implementation
IVF.train
one upfront k-means pass carves the space into cells
1def train(vectors, nlist):2 # Lloyd's algorithm: nlist marker points3 centroids = kmeans(vectors, k = nlist)4 # inverted lists: cell_id -> vectors in that cell5 lists = {c: [] for c in range(nlist)}6 for v in vectors:7 c = argmin(dist(v, centroids)) # nearest cell8 lists[c].append(v)9 return centroids, lists
IVF.coarseQuantize
pick which cells to open for this query
1def probe_set(query, centroids, nprobe):2 # rank cells by centroid distance to the query3 ordered = sort(4 range(nlist),5 key = lambda c: dist(query, centroids[c]),6 )7 # open only the nprobe nearest; skip the rest8 return ordered[:nprobe]
IVF.search
scan only the opened cells, then keep the top-k
1def search(query, centroids, lists, nprobe, k):2 cells = probe_set(query, centroids, nprobe)3 candidates = []4 for c in cells: # not all nlist cells5 for v in lists[c]: # only the opened lists6 candidates.append((dist(query, v), v))7 return smallest_k(candidates, k)
Not sure what to ask? Tap a question — the staff engineer answers in the chat panel.