Build a vector database (Pinecone / Weaviate / pgvector style) (15 scenes)
Scene 15 · Design your vector database
Capstone: pick the index, filter, hybrid, and sharding for RAG vs recommendation vs semantic cache vs billion-on-a-budget — each knob traceable to the scene that justified it.
Previously
The same primitives — vector, metric, index, filter, hybrid, shards — configure radically different systems. The capstone is choosing them deliberately for a real workload, with every knob traceable to the scene that justified it, and the trilemma as the compass.
Diagram
The capstone canvas. Four workload cards are docked across the top — each carries a one-line constraints summary (corpus size, latency budget, RAM, how much recall it can forgive). One card is live at a time. Down the side sits the palette of every knob the arc earned: index type, metric, hybrid, sharding, re-rank. As you choose knobs, the verifier panel grades each one against the LIVE workload — green 'fits', amber 'wasteful' (you're paying on an axis this workload doesn't constrain), red 'violation' (it breaks a hard limit) — and every verdict cites the scene that justifies it.
Diagram labels: "four workloads, one toolkit" and "every knob traces back to a scene".
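The verifier's three verdicts can be sketched in a few lines. This is a deliberately coarse, axis-level illustration, not the canvas's actual logic: the `Knob`, `Workload`, and `grade` names are hypothetical, each knob is reduced to one trilemma axis it relieves and one it pays with, and the scene citation is left out because the scene numbers are not stated here.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    FITS = "green"
    WASTEFUL = "amber"
    VIOLATION = "red"


@dataclass(frozen=True)
class Knob:
    # Hypothetical model: one axis the knob buys headroom on, one it pays with.
    name: str
    relieves: str  # "recall", "latency", or "memory"
    costs: str     # "recall", "latency", or "memory"


@dataclass(frozen=True)
class Workload:
    # Hypothetical model: the axes this workload cannot give ground on.
    name: str
    hard_limits: frozenset


def grade(knob: Knob, workload: Workload) -> Verdict:
    """Grade one knob against the live workload."""
    # Red: the knob pays on an axis the workload treats as a hard limit.
    if knob.costs in workload.hard_limits:
        return Verdict.VIOLATION
    # Amber: the knob helps on an axis this workload does not constrain.
    if knob.relieves not in workload.hard_limits:
        return Verdict.WASTEFUL
    # Green: it relieves a constrained axis without touching a hard limit.
    return Verdict.FITS


# Illustrative knob and workloads (assumed, not taken from the four cards).
pq = Knob("product quantization", relieves="memory", costs="recall")
memory_bound = Workload("memory-bound", frozenset({"memory"}))
recall_bound = Workload("recall-bound", frozenset({"recall"}))
latency_bound = Workload("latency-bound", frozenset({"latency"}))

for workload in (memory_bound, recall_bound, latency_bound):
    print(f"{workload.name}: {grade(pq, workload).value}")
```

The same knob gets three different verdicts depending on which card is live. A real verifier would compare numbers (bytes, milliseconds, recall@k) rather than axis names.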
Here is everything the arc built, in one place. Four workload cards are docked across the top — *RAG over docs*, *Recommendation*, *Semantic cache*, *Billion-on-a-budget* — and each card states its real constraints: how many vectors, how tight the latency budget, how much RAM, and how much being wrong actually costs. Down the side is the palette: the *index* choices (Flat, IVF, HNSW, IVFPQ), the *metric* choice, *hybrid* (dense + sparse fused by RRF), *sharding*, and *re-rank*. Every one of those knobs was earned in an earlier scene. The job now is not to learn anything new — it's to read each workload's constraints off the *trilemma* (recall, latency, memory) and pick the honest configuration. The same toolkit; four different right answers.
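Reading the memory axis off a card mostly comes down to arithmetic. The sketch below uses common back-of-envelope formulas: float32 storage for Flat, roughly `2 * M` four-byte neighbor links per vector for HNSW, and one code plus an 8-byte id per vector for IVFPQ. It ignores centroids, codebooks, and allocator overhead, and the function names and example numbers are assumptions for illustration, not figures from the workload cards.

```python
def flat_bytes(num_vectors: int, dim: int) -> int:
    """Flat index: every vector stored as float32 (4 bytes per component)."""
    return num_vectors * dim * 4


def hnsw_bytes(num_vectors: int, dim: int, m_links: int) -> int:
    """HNSW: full float32 vectors plus about 2*M four-byte links per vector."""
    return num_vectors * (dim * 4 + m_links * 2 * 4)


def ivfpq_bytes(num_vectors: int, code_bytes: int) -> int:
    """IVFPQ: one PQ code plus an 8-byte id per vector (centroids ignored)."""
    return num_vectors * (code_bytes + 8)


def fits_in_ram(bytes_needed: int, ram_budget_bytes: int) -> bool:
    """True when the estimated footprint stays within the RAM budget."""
    return bytes_needed <= ram_budget_bytes


# Assumed example: one billion 768-dimensional vectors, 128 GiB of RAM.
n, d = 1_000_000_000, 768
budget = 128 * 1024**3

estimates = {
    "Flat": flat_bytes(n, d),
    "HNSW (M=32)": hnsw_bytes(n, d, m_links=32),
    "IVFPQ (64-byte codes)": ivfpq_bytes(n, code_bytes=64),
}
for name, size in estimates.items():
    print(f"{name}: {size / 1e9:,.0f} GB, fits={fits_in_ram(size, budget)}")
```

Under these assumptions only the quantized index fits, which is the trilemma in miniature: the memory budget is met by spending recall, and a re-rank step is one way to buy some of that recall back.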