Build your own distributed search engine (Elasticsearch / OpenSearch style)

Five million books, a search box, and a 100 ms budget. Build the engine from the inverted index up — segment, refresh, shard, replica, scatter-gather, BM25 — and feel why every guarantee that lives across shards is paid for in either an extra round trip or a small lie about the rankings.

01
Five million books, under 100 ms
5M books, a search box, and a 100ms budget — the naive scan-every-row plan never crosses the line, no matter the cores or the disk.
~7 min
02
The inverted index — term to docs
Flip the relation: a precomputed map from each word to the doc IDs it appears in turns O(documents) Boolean queries into O(matches) sorted-list merges.
~7 min
03
Segments — many small immutable indexes
Lucene appends a fresh tiny inverted index per write batch and merges in the background — readers never lock, deletes are bit flips, mutability is an asynchronous receipt.
~7 min
04
Refresh, flush, translog — three cadences
Refresh = visible to search; flush = survives a crash; translog bridges the gap. Three cadences, three durability properties, one famous source of confusion.
~7 min
05
Shards — hash(routing) mod N
Each document lives on shard = hash(routing) mod number_of_primary_shards — and that mod is exactly why you cannot change the shard count after you create the index.
~7 min
06
Replicas help reads, not writes
Replicas linearly add read capacity and HA but do not speed up writes — every primary fans every write out to all in-sync replicas before acking.
~7 min
07
Scatter, reduce, fetch
_search is two round trips: scatter top-(from+size) doc-ids+scores from every shard, reduce, then fetch only the global winners — which is where the 10,000 page wall comes from.
~7 min
08
BM25 — TF saturation × IDF × length norm
BM25 ranks each match by IDF × saturating-TF / length-norm — and the saturation knob k1 is exactly what stops a keyword-stuffed doc from winning the page.
~7 min
09
Distributed scoring is approximate
Every shard computes IDF from its own local corpus, so the same document scores differently depending on where it lives — dfs_query_then_fetch buys correctness for one extra round trip.
~7 min
10
Top-N aggregations can miss the winner
Each shard returns its own top shard_size buckets and the coordinator merges — a term that is 11th on every shard is invisible in a top-10 result, and shard_size is the knob.
~7 min
11
Operational sharp edges
Refresh storm, mapping explosion, hot shard, deep pagination — the four most expensive Elasticsearch outages are misuses of knobs the earlier scenes already introduced.
~7 min
12
Design your search cluster
Capstone: pick e-commerce, logs, or security analytics and configure shards, replicas, refresh, scoring, and ILM — the verifier traces every ✓/✗ back to the scene that earned it.
~7 min