All scenes
Build a distributed search engine (Elasticsearch / OpenSearch style)
12 scenes · ~84 min · build the primitive
Build your own distributed search engine (Elasticsearch / OpenSearch style)
Five million books, a search box, and a 100 ms budget. Build the engine from the inverted index up — segment, refresh, shard, replica, scatter-gather, BM25 — and feel why every guarantee that lives across shards is paid for in either an extra round trip or a small lie about the rankings.
- 01Five million books, under 100 ms5M books, a search box, and a 100ms budget — the naive scan-every-row plan never crosses the line, no matter the cores or the disk.~7 min
- 02The inverted index — term to docsFlip the relation: a precomputed map from each word to the doc IDs it appears in turns O(documents) Boolean queries into O(matches) sorted-list merges.~7 min
- 03Segments — many small immutable indexesLucene appends a fresh tiny inverted index per write batch and merges in the background — readers never lock, deletes are bit flips, mutability is an asynchronous receipt.~7 min
- 04Refresh, flush, translog — three cadencesRefresh = visible to search; flush = survives a crash; translog bridges the gap. Three cadences, three durability properties, one famous source of confusion.~7 min
- 05Shards — hash(routing) mod NEach document lives on shard = hash(routing) mod number_of_primary_shards — and that mod is exactly why you cannot change the shard count after you create the index.~7 min
- 06Replicas help reads, not writesReplicas linearly add read capacity and HA but do not speed up writes — every primary fans every write out to all in-sync replicas before acking.~7 min
- 07Scatter, reduce, fetch_search is two round trips: scatter top-(from+size) doc-ids+scores from every shard, reduce, then fetch only the global winners — which is where the 10,000 page wall comes from.~7 min
- 08BM25 — TF saturation × IDF × length normBM25 ranks each match by IDF × saturating-TF / length-norm — and the saturation knob k1 is exactly what stops a keyword-stuffed doc from winning the page.~7 min
- 09Distributed scoring is approximateEvery shard computes IDF from its own local corpus, so the same document scores differently depending on where it lives — dfs_query_then_fetch buys correctness for one extra round trip.~7 min
- 10Top-N aggregations can miss the winnerEach shard returns its own top shard_size buckets and the coordinator merges — a term that is 11th on every shard is invisible in a top-10 result, and shard_size is the knob.~7 min
- 11Operational sharp edgesRefresh storm, mapping explosion, hot shard, deep pagination — the four most expensive Elasticsearch outages are misuses of knobs the earlier scenes already introduced.~7 min
- 12Design your search clusterCapstone: pick e-commerce, logs, or security analytics and configure shards, replicas, refresh, scoring, and ILM — the verifier traces every ✓/✗ back to the scene that earned it.~7 min