Build a distributed search engine (Elasticsearch / OpenSearch style)
12 scenes · ~84 min · build the primitive

Build your own distributed search engine (Elasticsearch / OpenSearch style)

Five million books, a search box, and a 100 ms budget. Build the engine from the inverted index up — segment, refresh, shard, replica, scatter-gather, BM25 — and feel why every guarantee that lives across shards is paid for in either an extra round trip or a small lie about the rankings.

  1. 01
  2. 02
  3. 03
  4. 04
  5. 05
  6. 06
  7. 07
  8. 08
  9. 09
  10. 10
  11. 11
  12. 12
  1. 01
    Five million books, under 100 ms
    5M books, a search box, and a 100ms budget — the naive scan-every-row plan never crosses the line, no matter the cores or the disk.
    ~7 min
  2. 02
    The inverted index — term to docs
    Flip the relation: a precomputed map from each word to the doc IDs it appears in turns O(documents) Boolean queries into O(matches) sorted-list merges.
    ~7 min
  3. 03
    Segments — many small immutable indexes
    Lucene appends a fresh tiny inverted index per write batch and merges in the background — readers never lock, deletes are bit flips, mutability is an asynchronous receipt.
    ~7 min
  4. 04
    Refresh, flush, translog — three cadences
    Refresh = visible to search; flush = survives a crash; translog bridges the gap. Three cadences, three durability properties, one famous source of confusion.
    ~7 min
  5. 05
    Shards — hash(routing) mod N
    Each document lives on shard = hash(routing) mod number_of_primary_shards — and that mod is exactly why you cannot change the shard count after you create the index.
    ~7 min
  6. 06
    Replicas help reads, not writes
    Replicas linearly add read capacity and HA but do not speed up writes — every primary fans every write out to all in-sync replicas before acking.
    ~7 min
  7. 07
    Scatter, reduce, fetch
    _search is two round trips: scatter top-(from+size) doc-ids+scores from every shard, reduce, then fetch only the global winners — which is where the 10,000 page wall comes from.
    ~7 min
  8. 08
    BM25 — TF saturation × IDF × length norm
    BM25 ranks each match by IDF × saturating-TF / length-norm — and the saturation knob k1 is exactly what stops a keyword-stuffed doc from winning the page.
    ~7 min
  9. 09
    Distributed scoring is approximate
    Every shard computes IDF from its own local corpus, so the same document scores differently depending on where it lives — dfs_query_then_fetch buys correctness for one extra round trip.
    ~7 min
  10. 10
    Top-N aggregations can miss the winner
    Each shard returns its own top shard_size buckets and the coordinator merges — a term that is 11th on every shard is invisible in a top-10 result, and shard_size is the knob.
    ~7 min
  11. 11
    Operational sharp edges
    Refresh storm, mapping explosion, hot shard, deep pagination — the four most expensive Elasticsearch outages are misuses of knobs the earlier scenes already introduced.
    ~7 min
  12. 12
    Design your search cluster
    Capstone: pick e-commerce, logs, or security analytics and configure shards, replicas, refresh, scoring, and ILM — the verifier traces every ✓/✗ back to the scene that earned it.
    ~7 min