Build Raft — consensus you can defend (12 scenes)
Scene 07 · Membership changes — joint consensus and single-server
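A naive swap can create two disjoint majorities, which lets two leaders win the same term and violates Election Safety. Single-server changes (the production default) keep every majority of the old configuration overlapping every majority of the new one, because the two configurations differ by exactly one server (N versus N±1). Joint consensus, C_old,new, instead requires agreement from a majority of the old configuration and a majority of the new configuration for the duration of the transition.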
Previously
Scene 6 closed the safety story for a cluster whose membership is fixed: Election Safety, Leader Append-Only, Log Matching, Leader Completeness, State Machine Safety — five named invariants, one proof. So now the cluster cannot disagree with itself, as long as nobody changes who is IN the cluster. The natural next question is the operational one: how do we change cluster membership — add a server, replace a dead one, migrate to bigger machines — without breaking that same safety chain on the first try?
Scene 07
Membership changes — growing and shrinking the cluster safely
Diagram
Up to 7 Raft servers in a row, each with role badge, currentTerm, and a log strip. Configuration entries (the special log entries that change cluster membership) carry a small marker so you can distinguish them from regular client commands. A vertical dashed wall renders only in the naive-swap frame, separating the two disjoint majorities that elect simultaneously. Arrows are AppendEntries (blue, the replication / heartbeat message) and RequestVote (amber, the candidate's request for a vote).
old roster = {S1..S5}
Suppose your 3-server cluster needs to become a 5-server cluster — or, as drawn here, your 5-server cluster {S1..S5} needs to become {S1, S2, S3, S6, S7}. The obvious plan is to update everyone at once: tell every server about the new roster, restart each one, done. Here's why that breaks. Watch one beat past the restart — that's where the bug lives.
Implementation
Operator.naiveSwap (BAD)
the broken plan: restart each server with the new config at its own pace
def naiveSwap(oldCluster, newCluster):
    # oldCluster and newCluster are sets of server handles
    for server in oldCluster | newCluster:
        # operator restarts each server independently
        server.config = newCluster
        server.restart()
    # PROBLEM: during this loop, some servers see C_old while
    # others see C_new. Two disjoint majorities can form and
    # each elect a leader for the same term: Election Safety broken.
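The failure described in the comments of naiveSwap can be checked by brute force. The sketch below is an illustration only, not part of the tutorial's implementation: `majority_size` and `disjoint_quorums` are hypothetical helper names, and the model assumes that mid-swap at least one candidate counts votes against C_old while another counts against C_new, with each server granting one vote per term.

```python
from itertools import combinations

def majority_size(n):
    """Smallest number of servers that forms a majority of n."""
    return n // 2 + 1

def disjoint_quorums(c_old, c_new):
    """Return (old_quorum, new_quorum) pairs that share no server.

    Each pair is a way for two candidates to win the same term:
    one counting votes against c_old, the other against c_new.
    """
    pairs = []
    for q_old in combinations(sorted(c_old), majority_size(len(c_old))):
        for q_new in combinations(sorted(c_new), majority_size(len(c_new))):
            if not set(q_old) & set(q_new):
                pairs.append((q_old, q_new))
    return pairs

# Rosters from this scene (hypothetical string IDs for the servers).
C_OLD = {"S1", "S2", "S3", "S4", "S5"}
C_NEW = {"S1", "S2", "S3", "S6", "S7"}

if __name__ == "__main__":
    for q_old, q_new in disjoint_quorums(C_OLD, C_NEW):
        print("old quorum", q_old, "| new quorum", q_new)
```

Any non-empty result means the swap is unsafe. For example, {S3, S4, S5} is a majority of C_old and {S1, S6, S7} is a majority of C_new, and the two share no server.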
Leader.addServerSingleStep
production default: learner first, then a single-entry config change
def addServer(self, newServerId):
    # 2015 Ongaro fix: gate on current-term commit
    if not self.hasCommittedEntryInCurrentTerm():
        return Defer  # wait for the no-op appended on election to commit
    # 1. add as LEARNER: replicates, does not vote
    self.learners.add(newServerId)
    self.streamAppendEntriesUntilCaughtUp(newServerId)
    # 2. propose single-entry config change C_new = voters ∪ {new}
    entry = ConfigEntry(
        kind='add-voter', server=newServerId,
        Cnew=self.voters | {newServerId})
    self.appendAndReplicate(entry)
    # safety: |majority(N)| + |majority(N+1)| = N+2 > N+1, so any
    # majority of the old voters and any majority of the new voters
    # share at least one server. No disjoint majorities.
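The overlap claim in the safety comment can be verified exhaustively for small clusters. This is a minimal sketch under stated assumptions: `majorities` and `always_overlap` are hypothetical helpers, server IDs are plain strings, and only minimum-size majorities are enumerated, which is sufficient because every larger majority contains one of them.

```python
from itertools import combinations

def majorities(servers):
    """All minimum-size majorities of a configuration, as sets."""
    need = len(servers) // 2 + 1
    return [set(q) for q in combinations(sorted(servers), need)]

def always_overlap(c_old, c_new):
    """True if every majority of c_old intersects every majority of c_new."""
    return all(a & b for a in majorities(c_old) for b in majorities(c_new))

def check_single_server_changes(max_n=6):
    """Check add-one and remove-one changes for cluster sizes 1..max_n."""
    for n in range(1, max_n + 1):
        voters = {f"S{i}" for i in range(1, n + 1)}
        grown = voters | {f"S{n + 1}"}
        if not always_overlap(voters, grown):   # add one server
            return False
        if not always_overlap(grown, voters):   # remove one server
            return False
    return True

if __name__ == "__main__":
    print("single-server changes overlap:", check_single_server_changes())
    three = {"S1", "S2", "S3"}
    five = three | {"S4", "S5"}
    # Jumping from 3 to 5 voters in one step loses the guarantee:
    # {S1, S2} is a majority of three, {S3, S4, S5} a majority of five.
    print("3 -> 5 in one step overlaps:", always_overlap(three, five))
```

The second check shows why the change is limited to one server per configuration entry: growing from three voters to five in a single step admits disjoint majorities, while going through four voters does not.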