Build a CDC pipeline (Debezium + outbox) (12 scenes)
Scene 01 · The dual-write trap
The service writes to its DB and then publishes to Kafka, and a crash between those two writes leaves the two systems permanently inconsistent. Four scenarios, four divergences, one structural fix.
Diagram
A Service node sits center-left. Two arrows leave it — one down to a DB cylinder, one right to a Kafka topic strip — and a dotted box around the service is the only transaction it has; the two arrows visibly escape that box. **Dual-write** — two writes to two systems (the DB and Kafka here) with no shared commit, so any crash between them produces a permanent skew. **Atomicity boundary** — the region inside which a set of writes either all commit or all roll back together; the diagram shows it covering ONLY the DB write, not the publish. A vertical timeline above the arrows carries four crash markers (T1..T4); the side panel reads out 'DB says X, Kafka says Y' for the selected scenario.
Watch the healthy baseline. The service writes user.balance=100 to its DB, then publishes BalanceUpdated to Kafka: two separate writes to two separate systems. That pair of writes without a shared transaction is the dual-write problem. As long as nothing crashes, the DB and Kafka agree.
Implementation
Service.handleSignup(event)
the dual-write — two writes, no shared transaction
def handleSignup(event):
    tx = db.begin()
    tx.execute('UPDATE users SET balance=100 ...')
    tx.commit()                      # leg 1: DB
    kafka.publish('BalanceUpdated',  # leg 2: Kafka
                  {'balance': 100})
    return ok
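The crash window between the two legs can be reproduced with in-memory stand-ins. This is a minimal sketch, not the service's real code: `FakeDB`, `FakeKafka`, `Crash`, and the `crash_after_commit` flag are hypothetical names introduced only for illustration.

```python
class FakeDB:
    """In-memory stand-in for the service database (hypothetical)."""
    def __init__(self):
        self.balance = None

    def commit_balance(self, value):
        self.balance = value


class FakeKafka:
    """In-memory stand-in for a Kafka topic (hypothetical)."""
    def __init__(self):
        self.events = []

    def publish(self, topic, payload):
        self.events.append((topic, payload))


class Crash(Exception):
    """Simulates the process dying at a chosen point."""


def handle_signup(db, kafka, crash_after_commit=False):
    db.commit_balance(100)  # leg 1: the DB write is durable from here on
    if crash_after_commit:
        raise Crash("process died between the two legs")
    kafka.publish("BalanceUpdated", {"balance": 100})  # leg 2: Kafka
    return "ok"


def run_t1():
    db, kafka = FakeDB(), FakeKafka()
    try:
        handle_signup(db, kafka, crash_after_commit=True)
    except Crash:
        pass  # nothing rolls the DB back; the commit already happened
    return db.balance, kafka.events


print(run_t1())  # (100, [])
```

The DB holds balance=100 while the topic is empty, and nothing in the service remembers that a publish is still owed.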
Service.handleSignup_with_retry(event)
the 'obvious' fix — wrap publish in retry, walk into T3
def handleSignup_with_retry(event):
    tx = db.begin()
    tx.execute('UPDATE users SET balance=100 ...')
    tx.commit()
    for attempt in range(MAX_RETRIES):
        try:
            kafka.publish('BalanceUpdated',
                          {'balance': 100})
            return ok
        except AckTimeout:
            continue  # broker may have it already
    raise PublishFailed
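To see why the retry loop produces a duplicate, consider a broker that stores the event but loses the acknowledgement. The sketch below assumes a hypothetical `LossyAckBroker` test double; it is not a real Kafka client, and the exception classes are defined locally for the example.

```python
MAX_RETRIES = 3


class AckTimeout(Exception):
    """The producer did not receive an acknowledgement in time."""


class PublishFailed(Exception):
    """All retry attempts were exhausted."""


class LossyAckBroker:
    """Hypothetical broker double: stores every event, drops the first N acks."""
    def __init__(self, lose_first_n_acks=1):
        self.events = []
        self._acks_to_lose = lose_first_n_acks

    def publish(self, topic, payload):
        self.events.append((topic, payload))  # the event IS stored
        if self._acks_to_lose > 0:
            self._acks_to_lose -= 1
            raise AckTimeout("ack lost in transit")


def publish_with_retry(broker, topic, payload):
    for _ in range(MAX_RETRIES):
        try:
            broker.publish(topic, payload)
            return "ok"
        except AckTimeout:
            continue  # the caller cannot tell "not stored" from "ack lost"
    raise PublishFailed(topic)


broker = LossyAckBroker()
result = publish_with_retry(broker, "BalanceUpdated", {"balance": 100})
print(result, len(broker.events))  # ok 2
```

The call reports success, yet the topic now carries the same event twice: the retry turned a lost event into a duplicate one.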
// Four crash windows, one root cause
no atomicity boundary spans both legs
# T1: commit OK, publish dies   -> event lost
# T2: publish OK, commit fails  -> phantom event
# T3: both OK, ack lost, retry  -> duplicate event
# T4: two writers race          -> DB and Kafka
#                                  disagree on order
#
# All four are the same bug: the service has one
# transaction (the DB tx); the publish escapes it.
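One way to pull the event inside the atomicity boundary is the outbox approach named in the title: record the event as a row in the same database transaction as the business write, and let a separate relay (for example, Debezium reading the database log) carry it to Kafka. The sketch below shows only the transactional half, using SQLite from the Python standard library as a stand-in database. The `users` and `outbox` schemas and the `crash_before_commit` flag are assumptions made for illustration.

```python
import json
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, topic TEXT, payload TEXT)"
    )
    conn.execute("INSERT INTO users (id, balance) VALUES (1, 0)")
    conn.commit()
    return conn


def handle_signup(conn, crash_before_commit=False):
    try:
        with conn:  # commits on normal exit, rolls back on exception
            conn.execute("UPDATE users SET balance = 100 WHERE id = 1")
            conn.execute(
                "INSERT INTO outbox (topic, payload) VALUES (?, ?)",
                ("BalanceUpdated", json.dumps({"balance": 100})),
            )
            if crash_before_commit:
                raise RuntimeError("simulated crash inside the transaction")
    except RuntimeError:
        return "rolled back"
    return "ok"


def snapshot(conn):
    balance = conn.execute("SELECT balance FROM users WHERE id = 1").fetchone()[0]
    pending = conn.execute("SELECT COUNT(*) FROM outbox").fetchone()[0]
    return balance, pending


healthy = make_db()
print(handle_signup(healthy), snapshot(healthy))  # ok (100, 1)

crashed = make_db()
print(handle_signup(crashed, crash_before_commit=True), snapshot(crashed))
# rolled back (0, 0)
```

Either both rows exist or neither does, so the state and the pending event can no longer diverge. Delivery from the outbox to Kafka is a separate concern that this sketch does not cover.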