Build Kafka (13 scenes)
Scene 05 · Durability is four knobs, not one
acks, min.insync.replicas, RF, unclean — and how 'all' silently means one.
Previously
ISR decides commit, the controller decides leadership. Now the four knobs that decide what 'durable' actually means — and how a wrong combination turns 'acks=all' into 'acks=one' without any error.
Scene 05
Durability is four knobs, not one
Diagram
A producer on the left sending writes to a partition that has N replicas across brokers. Four knobs are exposed as side controls — acks (0/1/all), min.insync.replicas, RF (replication factor), unclean.leader.election. The diagram lights up the failure path that opens when the chosen combination silently degrades — for example, ISR shrinking to {leader} when min.insync.replicas=1.
Default config: RF=3, min.insync.replicas=2, acks=all, unclean.leader.election=off. Producer sends, all three replicas append, leader acks. The downstream reader sees a fully replicated log — a leader crash here loses zero acked writes.
Implementation
Producer.send(record, acks=all)
the producer-side wait loop — block until the broker acks
1def send(record):2 leader = metadata.leaderFor(record.partition)3 resp = leader.produce(record, acks=cfg.acks)4 if cfg.acks == 0:5 return # fire-and-forget, no wait6 if cfg.acks == 1:7 return resp # leader-only ack8 # acks == 'all': leader replies only after the9 # broker-side commit gate has accepted the record.10 return resp
Broker.onProduce(record)
the commit gate — reject before silently degrading
1def onProduce(record):2 self.log.append(record) # leader appends first3 if cfg.acks != 'all':4 return Ack() # no ISR check on acks=0/15 if len(ISR) < cfg.min_insync_replicas:6 raise NotEnoughReplicas(7 isr=len(ISR), required=cfg.min_insync_replicas,8 )9 waitForFetch(record.offset, replicas=ISR)10 self.hw = min(leo[r] for r in ISR)11 return Ack()
Controller.maybeUncleanElect(partition)
the gate that lets a stale replica win an empty-ISR election
1def maybeUncleanElect(partition):2 if len(ISR) > 0:3 return promote(pickFromISR())4 if not cfg.unclean_leader_election_enable:5 partition.state = UNAVAILABLE6 return None # KIP-106 default7 candidate = pickAnyLiveReplica()8 candidate.epoch += 1 # bumps leader epoch9 return promote(candidate)
Not sure what to ask? Tap a question — the staff engineer answers in the chat panel.