Build a distributed search engine (Elasticsearch / OpenSearch style) (12 scenes)
Scene 04 · Refresh, flush, translog — three cadences
Refresh = visible to search; flush = survives a crash; translog bridges the gap. Three cadences, three durability properties, one famous source of confusion.

Sealing a segment to disk costs hundreds of milliseconds; real users want sub-second visibility AND a guarantee they won't lose acknowledged writes. Three clocks let visibility and durability run on different cadences, with a write-ahead log bridging the gap.

Diagram
Top row: three cadence timers — refresh (visibility), flush (durability), translog (write-ahead log). Left: the in-memory IndexWriter buffer with the docs you just PUT, and below it the append-only translog ribbon. Right: the immutable segment stack — segments are searchable as soon as refresh seals them, but only become durable on disk when flush advances segments_N.
[Interactive diagram. Timers: refresh · 1s; flush · auto (12s cycle); translog · fsync … durable on ack. Panels: IndexWriter buffer (RAM): not searchable, not durable. Translog (append-only): empty. Immutable segment stack, newest on top (no merge in flight). Caption: watch one doc cross all three checkpoints.]
A new book is indexed. Step 1: it lands in the IndexWriter buffer and the translog. Step 2: at the next refresh tick, the buffer becomes a new searchable segment. Step 3: at flush, the segment is committed to disk and the translog is truncated.
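The timing of those three steps can be worked out with a little arithmetic. The sketch below assumes both timers fire at fixed multiples of their interval, using the 1s refresh and 12s flush values shown in the diagram. Real engines may also flush on other triggers such as translog size, so treat the flush timer as a simplification. The helper name `next_tick_ms` is hypothetical.

```python
def next_tick_ms(now_ms: int, interval_ms: int) -> int:
    """Return the first tick strictly after now_ms for a timer that
    fires at multiples of interval_ms."""
    return (now_ms // interval_ms + 1) * interval_ms


# Values taken from the diagram; the purely timer-driven flush is an assumption.
REFRESH_INTERVAL_MS = 1_000
FLUSH_INTERVAL_MS = 12_000

indexed_at_ms = 300  # the doc is acked here (Step 1)
searchable_at_ms = next_tick_ms(indexed_at_ms, REFRESH_INTERVAL_MS)  # Step 2
committed_at_ms = next_tick_ms(indexed_at_ms, FLUSH_INTERVAL_MS)     # Step 3

# How long the doc is acked but invisible to search.
invisible_for_ms = searchable_at_ms - indexed_at_ms
# How long the translog is the only crash-safe copy of the doc.
translog_only_for_ms = committed_at_ms - indexed_at_ms

print(searchable_at_ms, committed_at_ms, invisible_for_ms, translog_only_for_ms)
```

In this run the doc waits 700 ms to become searchable and depends on the translog for 11.7 s, which is the gap the write-ahead log exists to cover.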
Implementation
index(doc)
append to the in-memory buffer and the translog; when durability is 'request', ack only after the translog fsync
def index(doc):
    buffer.append(doc)             # in-memory, not searchable yet
    translog.append(op(doc))       # WAL row
    if durability == 'request':
        translog.fsync()           # acked write is durable
    return ack
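To make the write path concrete, here is a minimal runnable sketch of the same idea using a real file and `os.fsync`. The `Translog` and `Shard` classes, the JSON-lines format, and the `read_ops` and `demo` helpers are all assumptions made for illustration; they are not the on-disk format or API of any real engine.

```python
import json
import os
import tempfile


class Translog:
    """Append-only log of operations, one JSON object per line (toy format)."""

    def __init__(self, path, durability="request"):
        self.durability = durability
        self._file = open(path, "ab")

    def append(self, op):
        self._file.write(json.dumps(op).encode("utf-8") + b"\n")
        if self.durability == "request":
            self.sync()  # pay the fsync cost before the caller sees an ack

    def sync(self):
        self._file.flush()             # Python buffer -> OS page cache
        os.fsync(self._file.fileno())  # OS page cache -> storage device

    def close(self):
        self.sync()
        self._file.close()


class Shard:
    def __init__(self, translog):
        self.buffer = []  # in-memory, not searchable yet
        self.translog = translog

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append({"op": "index", "doc": doc})
        return {"result": "created"}  # the ack


def read_ops(path):
    """Read back every operation currently in the log file."""
    with open(path, "rb") as f:
        return [json.loads(line) for line in f]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "translog.jsonl")
        log = Translog(path, durability="request")
        shard = Shard(log)
        shard.index({"title": "Dune"})
        ops = read_ops(path)  # already on disk by the time index() returned
        log.close()
        return ops


print(demo())
```

With any other durability value, `append` skips the per-request sync, so an ack can arrive before the operation reaches the device. That trades a window of possible loss for lower write latency.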
refresh()
seal the buffer into a fresh in-memory segment — visibility, not durability
def refresh():                         # every refresh_interval
    if buffer.empty():
        return
    seg = open_new_in_memory_segment()
    for doc in buffer.drain():
        seg.add_to_inverted_index(doc)
    seg.seal()                         # searchable now
    live_segments.append(seg)          # NOT on disk yet
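A toy version of refresh shows why a document is invisible until the buffer is sealed. The sketch below assumes documents are dicts with a `title` field and that tokenizing means lowercasing and splitting on whitespace. The `Segment` class and the `search` helper are hypothetical stand-ins for a real inverted index.

```python
from collections import defaultdict


class Segment:
    """A tiny inverted index: term -> list of local doc ids."""

    def __init__(self):
        self.postings = defaultdict(list)
        self.docs = []
        self.sealed = False

    def add(self, doc):
        if self.sealed:
            raise RuntimeError("sealed segments are immutable")
        doc_id = len(self.docs)
        self.docs.append(doc)
        for term in set(doc["title"].lower().split()):
            self.postings[term].append(doc_id)

    def seal(self):
        self.postings = dict(self.postings)  # freeze: no new terms appear on lookup
        self.sealed = True

    def search(self, term):
        return [self.docs[i] for i in self.postings.get(term.lower(), [])]


def refresh(buffer, live_segments):
    """Drain the buffer into one new sealed segment and publish it."""
    if not buffer:
        return
    seg = Segment()
    for doc in buffer:
        seg.add(doc)
    buffer.clear()
    seg.seal()
    live_segments.append(seg)  # searchable from this point on


def search(live_segments, term):
    """Queries only ever look at sealed segments, never at the buffer."""
    hits = []
    for seg in live_segments:
        hits.extend(seg.search(term))
    return hits


buffer = [{"title": "Dune Messiah"}]
live_segments = []
before = search(live_segments, "dune")  # buffer is not consulted
refresh(buffer, live_segments)
after = search(live_segments, "dune")
print(before, after)
```

Search never reads the buffer, so the only way a document becomes visible is for a refresh to publish a new segment. Nothing in this step touches durable storage.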
flush()
IndexWriter.commit() — segments_N advances, translog truncates
def flush():                           # every flush_threshold
    refresh()                          # drain any pending buffer
    index_writer.commit()              # fsync segments_N
    translog.rotate_and_truncate()     # WAL no longer needed
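The reason flush can safely truncate the translog becomes clearer when a crash is simulated. This is a minimal in-memory sketch under loud assumptions: plain lists stand in for the committed segments and the fsynced translog, and `ToyShard` and `crash_and_recover` are hypothetical names. It models only the rule that recovery equals the last commit plus a replay of the translog.

```python
class ToyShard:
    def __init__(self):
        self.buffer = []         # RAM only, lost on crash
        self.live_segments = []  # searchable, possibly uncommitted, lost on crash
        self.committed = []      # stand-in for segments referenced by segments_N
        self.translog = []       # stand-in for the fsynced write-ahead log

    def index(self, doc):
        self.buffer.append(doc)
        self.translog.append(doc)

    def refresh(self):
        if self.buffer:
            self.live_segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        self.refresh()
        self.committed = [list(seg) for seg in self.live_segments]
        self.translog.clear()  # everything logged is now in a commit

    def searchable(self):
        return [doc for seg in self.live_segments for doc in seg]

    def crash_and_recover(self):
        """Keep only what a crash would leave behind, then replay the log."""
        recovered = ToyShard()
        recovered.committed = [list(seg) for seg in self.committed]
        recovered.live_segments = [list(seg) for seg in self.committed]
        for doc in self.translog:  # replay ops logged since the last commit
            recovered.index(doc)
        recovered.refresh()
        return recovered


shard = ToyShard()
shard.index("a")
shard.flush()    # "a" is committed, translog is empty
shard.index("b")
shard.refresh()  # "b" is searchable but not committed
shard.index("c") # "c" is only in the buffer and the translog

recovered = shard.crash_and_recover()
print(recovered.searchable(), recovered.committed, recovered.translog)
```

After recovery all three documents are searchable again even though only "a" was ever committed. "b" and "c" come back from the translog, which stays untruncated until the next flush.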