Build Raft — consensus you can defend (12 scenes)
Scene 08 · Snapshots — compact without violating consistency
Each replica snapshots independently at its own applied index. The pair (lastIncludedIndex, lastIncludedTerm) stands in for the truncated prefix in the AppendEntries consistency check, so Log Matching survives compaction. InstallSnapshot ships that prefix to far-behind followers.
Previously
Scene 7 made cluster membership safe to change: every transitional decision still passes through a majority that overlaps both the old and the new rosters. So now the cluster can grow and shrink without breaking Election Safety. The next operational reality is that the log itself keeps growing — and truncating it has to preserve Log Matching just as carefully as reconfiguration preserved Election Safety.
Scene 08
Snapshots — compact the log without breaking Log Matching
Diagram
Three Raft servers in a row. Each cell shows role badge, currentTerm, votedFor, and a horizontal log strip — at most 8 entries are visible, and a '+N more' chip stands in for the rest. commitIndex (▲, the highest log index known to be replicated on a majority) and lastApplied (▽, the highest index fed into the state machine) sit beneath each strip. RPCs are color-coded: AppendEntries blue, RequestVote amber, InstallSnapshot violet — the fourth RPC kind in Raft, used only when a follower has fallen below the leader's truncation point.
log already overflowing
Left alone, the log grows without bound, and every replica would have to keep all of it on disk forever. That is untenable for a long-running cluster. The solution: occasionally take a snapshot of the state machine's current state, then throw away the log entries that built up to it. Here are three servers, all caught up at applied index 200, with a log strip that's already overflowing. Watch the next three captions: they introduce the three new terms this scene needs, namely the snapshot, the (lastIncludedIndex, lastIncludedTerm) pair, and the InstallSnapshot RPC.
Implementation
Replica.takeSnapshot
local operation: serialize state machine, persist metadata, truncate log prefix
    def takeSnapshot(self):
        # 1. serialize state machine at lastApplied
        snap = self.stateMachine.serialize()
        snap.lastIncludedIndex = self.lastApplied
        snap.lastIncludedTerm = self.log[self.lastApplied].term
        # 2. include latest committed configuration in the snapshot
        snap.config = self.latestCommittedConfig()
        # 3. fsync the snapshot file durably
        persist(snap)
        # 4. truncate log prefix: entries 1..lastIncludedIndex go away
        self.log.discardThrough(snap.lastIncludedIndex)
        # NOTE: no RPC, no quorum, no leader involvement.
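To make the "truncate but keep the boundary" idea concrete, here is a minimal sketch of a log that can drop its prefix while still answering the AppendEntries consistency check at the snapshot point. It assumes 1-based log indices. The names CompactedLog, Entry, lastIndex, and matches are hypothetical; discardThrough and termAt are modeled on the calls used in the scene's pseudocode, not taken from a real library.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entry:
    term: int
    command: str


@dataclass
class CompactedLog:
    """Hypothetical log whose prefix 1..lastIncludedIndex lives only in a snapshot."""
    lastIncludedIndex: int = 0
    lastIncludedTerm: int = 0
    # entries[0] holds log index lastIncludedIndex + 1
    entries: List[Entry] = field(default_factory=list)

    def lastIndex(self) -> int:
        return self.lastIncludedIndex + len(self.entries)

    def termAt(self, index: int) -> Optional[int]:
        """Term stored at `index`, or None if the log cannot answer."""
        if index == self.lastIncludedIndex:
            # Answered from snapshot metadata, not from a real entry.
            return self.lastIncludedTerm
        if index < self.lastIncludedIndex or index > self.lastIndex():
            return None  # compacted away, or not received yet
        return self.entries[index - self.lastIncludedIndex - 1].term

    def discardThrough(self, index: int) -> None:
        """Drop entries up to and including `index`, remembering its term."""
        if index <= self.lastIncludedIndex:
            return  # already compacted at least this far
        term = self.termAt(index)
        if term is None:
            raise ValueError("cannot discard past the end of the log")
        self.entries = self.entries[index - self.lastIncludedIndex:]
        self.lastIncludedIndex = index
        self.lastIncludedTerm = term

    def matches(self, prevLogIndex: int, prevLogTerm: int) -> bool:
        """AppendEntries consistency check against the retained log."""
        return self.termAt(prevLogIndex) == prevLogTerm


log = CompactedLog(entries=[Entry(1, "a"), Entry(1, "b"), Entry(2, "c"), Entry(3, "d")])
log.discardThrough(3)
print(log.lastIncludedIndex, log.lastIncludedTerm, log.lastIndex())
print(log.matches(3, 2), log.matches(2, 1))
```

After compaction, a check at prevLogIndex 3 is still answerable from the metadata pair, while a check at index 2 is not, because that entry now exists only inside the snapshot.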
Leader.replicateTo(follower)
AppendEntries when in-range; InstallSnapshot when below truncation
    def replicateTo(self, F):
        if self.nextIndex[F] >= self.logStartIndex:
            # in-range: ordinary AppendEntries
            prev = self.nextIndex[F] - 1
            send(AppendEntries(
                term=self.currentTerm, prevLogIndex=prev,
                prevLogTerm=self.termAt(prev),
                entries=self.log[self.nextIndex[F]:],
                leaderCommit=self.commitIndex), to=F)
        else:
            # F has fallen below our truncation point
            send(InstallSnapshot(
                term=self.currentTerm, leaderId=self.id,
                lastIncludedIndex=self.snap.lastIncludedIndex,
                lastIncludedTerm=self.snap.lastIncludedTerm,
                data=self.snap.bytes, done=True), to=F)
            # AppendEntries resume next tick from lastIncludedIndex+1
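The closing comment in replicateTo implies some bookkeeping once the follower acknowledges the snapshot. The sketch below shows one way that could look, assuming the leader tracks nextIndex and matchIndex per follower as in standard Raft and that logStartIndex equals lastIncludedIndex + 1. LeaderProgress, chooseRpc, and onSnapshotAck are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class LeaderProgress:
    """Hypothetical per-follower replication state kept by the leader."""
    logStartIndex: int  # first index still present in the leader's log
    nextIndex: Dict[str, int] = field(default_factory=dict)
    matchIndex: Dict[str, int] = field(default_factory=dict)

    def chooseRpc(self, follower: str) -> str:
        # Same branch condition as replicateTo in the scene.
        if self.nextIndex[follower] >= self.logStartIndex:
            return "AppendEntries"
        return "InstallSnapshot"

    def onSnapshotAck(self, follower: str, lastIncludedIndex: int) -> None:
        # The follower now holds everything through lastIncludedIndex.
        # max() keeps a stale or duplicate ack from moving progress backwards.
        self.matchIndex[follower] = max(
            self.matchIndex.get(follower, 0), lastIncludedIndex)
        self.nextIndex[follower] = max(
            self.nextIndex.get(follower, 1), lastIncludedIndex + 1)


progress = LeaderProgress(
    logStartIndex=201,
    nextIndex={"B": 201, "C": 50},
    matchIndex={"B": 200, "C": 49},
)
print(progress.chooseRpc("B"), progress.chooseRpc("C"))
progress.onSnapshotAck("C", 200)
print(progress.chooseRpc("C"), progress.nextIndex["C"])
```

Once the ack is processed, the lagging follower's nextIndex sits exactly at logStartIndex, so the next tick takes the ordinary AppendEntries branch with prevLogIndex equal to lastIncludedIndex.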
Follower.handleInstallSnapshot
adopt if past commitIndex; the metadata is the synthetic prevLog
    def handleInstallSnapshot(self, msg):
        if msg.term < self.currentTerm:
            return Reject
        self.stepDownIfHigherTerm(msg.term)
        if msg.lastIncludedIndex <= self.commitIndex:
            return Ok  # snapshot is older than what we have; ignore
        # 1. install the snapshot bytes into the state machine
        self.stateMachine.restore(msg.data)
        # 2. record the synthetic prevLog: any future AppendEntries with
        #    prevLogIndex == lastIncludedIndex matches lastIncludedTerm
        self.log.resetTo(msg.lastIncludedIndex, msg.lastIncludedTerm)
        self.commitIndex = msg.lastIncludedIndex
        self.lastApplied = msg.lastIncludedIndex
        return Ok
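Resetting the whole log is safe, but it can throw away entries the follower already holds past the snapshot point. The receiver rules in the Raft paper describe an alternative: if the follower's entry at lastIncludedIndex has the same term as the snapshot, keep the entries that follow it. The sketch below illustrates that variant over a plain index-to-term dictionary. The function name installIntoLog and the dictionary representation are assumptions made for this example, not part of the scene's code.

```python
from typing import Dict


def installIntoLog(log: Dict[int, int],
                   lastIncludedIndex: int,
                   lastIncludedTerm: int) -> Dict[int, int]:
    """Return the follower's log (index -> term) after installing a snapshot.

    The entry at lastIncludedIndex is kept as a sentinel carrying the
    snapshot's term, so later consistency checks at that index succeed.
    """
    if log.get(lastIncludedIndex) == lastIncludedTerm:
        # Same index and term: the prefixes agree (Log Matching),
        # so the suffix after the snapshot point can be retained.
        kept = {i: t for i, t in log.items() if i > lastIncludedIndex}
    else:
        # Missing or conflicting entry: discard the entire log.
        kept = {}
    kept[lastIncludedIndex] = lastIncludedTerm
    return kept


follower_log = {198: 3, 199: 3, 200: 4, 201: 4, 202: 4}
print(installIntoLog(follower_log, 200, 4))
print(installIntoLog(follower_log, 200, 5))
```

The first call keeps indices 201 and 202 because the follower already agrees with the snapshot at index 200. The second call drops everything, since a term mismatch at the boundary means the follower's suffix cannot be trusted.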