IVF — carve the space into cells — Build a vector database (Pinecone / Weaviate / pgvector style)

Build a vector database (Pinecone / Weaviate / pgvector style) (15 scenes)

Scene 04 · IVF — carve the space into cells

k-means splits the space into Voronoi cells; at query time scan only the nprobe cells nearest the query — turning an all-N scan into a fraction of it. The first explicit recall-for-speed trade.

Previously

Flat touched all N vectors; IVF carves the space into cells and touches only the nprobe cells near the query, buying a 10-100× speedup for a sliver of recall. But that sliver has a specific, repeatable shape — and it's worth its own scene, because it's the gotcha that bites every junior who ships IVF.

Scene 04

IVF — carve the space into cells

Watch

Diagram

The same song map, now divided into colored neighborhoods. k-means plants a handful of centroids; every song joins its nearest centroid's cell (the colored region around it). The query 'Now Playing' sits in the middle cell. At search time IVF only opens the few cells closest to the query (drawn in full color) and skips the rest (greyed) — so a song in a skipped cell can't be returned, no matter how close it actually is. The little chart on the right is the recurring recall-vs-speed plot: Flat sits at perfect recall but slow; the IVF dot sits to its right — faster, but below 1.0 recall.

Sources

Flat's whole problem was that it compared the query against every single song. IVF — short for *inverted file index* — fixes that with one upfront step: it runs a clustering pass called *k-means* that drops a handful of marker points (centroids) into the space, and every song joins whichever centroid it's nearest to. The colored region a centroid owns is its *cell*. Watch the centroids drop in and the boundaries snap shut — the map is now carved into a few neighborhoods. The query 'Now Playing' lands squarely in the middle cell, alongside Midnight Drive, Synth Dawn, and Study Beats. (A common rule of thumb sizes the number of cells around 4·√N to 16·√N — small enough that each holds a real neighborhood.)

Implementation

IVF.train

one upfront k-means pass carves the space into cells

1def train(vectors, nlist):
2    # Lloyd's algorithm: nlist marker points
3    centroids = kmeans(vectors, k = nlist)
4    # inverted lists: cell_id -> vectors in that cell
5    lists = {c: [] for c in range(nlist)}
6    for v in vectors:
7        c = argmin(dist(v, centroids))  # nearest cell
8        lists[c].append(v)
9    return centroids, lists

IVF.coarseQuantize

pick which cells to open for this query

1def probe_set(query, centroids, nprobe):
2    # rank cells by centroid distance to the query
3    ordered = sort(
4        range(nlist),
5        key = lambda c: dist(query, centroids[c]),
6    )
7    # open only the nprobe nearest; skip the rest
8    return ordered[:nprobe]

IVF.search

scan only the opened cells, then keep the top-k

1def search(query, centroids, lists, nprobe, k):
2    cells = probe_set(query, centroids, nprobe)
3    candidates = []
4    for c in cells:           # not all nlist cells
5        for v in lists[c]:    # only the opened lists
6            candidates.append((dist(query, v), v))
7    return smallest_k(candidates, k)

Not sure what to ask? Tap a question — the staff engineer answers in the chat panel.

PreviousThe Flat index — the 100%-recall baseline NextThe boundary miss — one lonely point