Build a CDC pipeline (Debezium + outbox) (12 scenes)
Scene 12 · Design canvas — compose your CDC pipeline
Capstone: pick a workload (search index / audit log / microservice events / read model), configure the six slots, fire failures. The verifier traces each absorbed or broken outcome back to the scene that introduced the responsible component.
Previously
Every piece is on the table — log, slot, snapshot, schema registry, outbox, cleanup, partitioning, idempotency. The capstone move: compose them into a pipeline for a real workload and trace each choice back to the failure it prevents.
Scene 12
Design canvas — compose your CDC pipeline
Diagram
Workload selector at the top picks one of four targets. The 2x3 slot grid in the middle is your pipeline: **source mode** (direct-CDC of business tables vs **outbox**, the dedicated table written in the same transaction), **snapshot mode** (initial / never / no_data — what history Debezium replays before streaming), **cleanup** (hard-delete / **tombstone**+**log compaction** / date-partition / INSERT+DELETE same-tx), **partition key** (which value goes through hash(key) mod N), **schema policy** (none / BACKWARD / BACKWARD-with-aliases / FULL — what the **schema registry** rejects), and **sink dedup** (none / by-event-id / by-(table,pk,**LSN**) — what makes the consumer **idempotent**). The injector at the bottom-left fires one of six failures the prior scenes introduced; the verifier card on the right traces it through your slots and either marks it ABSORBED (with the responsible slot) or BROKEN (with a pointer back to the scene number that introduced the missing component). Glossary refresh: **dual-write** is the unsafe pattern of committing to the DB then publishing to Kafka with no atomicity boundary spanning both. The **outbox** closes that gap by making the event part of the same transaction. A **replication slot** is the DB-side cursor (an **LSN**, monotonic byte offset) that prevents WAL recycling — a stalled slot fills the disk. **Snapshot** stitches history to the live stream at a single LSN seam. The **schema registry** decides whether a DDL change registers as a compatible new version. **Tombstone** + **log compaction** is Kafka's null-payload + cleanup.policy=compact retention story. Debezium delivers **at-least-once**; an **idempotent consumer** dedupes on a stable key.
Default canvas: microservice-events with outbox + INSERT+DELETE same-tx + aggregate_id + BACKWARD-with-aliases + by-event-id dedup. The verifier cycles three injections — watch each get absorbed and which slot does the absorbing.
Implementation
Verifier.evaluate
dispatch on injection, ask the slot that owns it
1def evaluate(slots, workload, injection):2 match injection.id:3 case 'slot-grows':4 return broken(sceneRef=5) # no slot covers this5 case 'rename-column':6 return checkSchema(slots, workload)7 case 'snapshot-restart':8 return checkSnapshotSeam(slots, workload)9 case 'hard-delete-race':10 return checkOutboxCleanup(slots)11 case 'consumer-replay':12 return checkSinkDedup(slots)13 case 'cross-key-order':14 return checkPartitionKey(slots)
smallestViablePipeline
the recommended slots for each workload
1def smallestViablePipeline(workload):2 match workload:3 case 'search-index':4 # idempotent by document id; DDL churn is the risk5 return Slots(sourceMode='direct-cdc',6 schemaPolicy='BACKWARD-with-aliases',7 sinkDedup='by-table-pk-lsn')8 case 'audit-log':9 # history IS the product — never drop it10 return Slots(sourceMode='outbox',11 snapshotMode='initial',12 partitionKey='aggregate_id',13 sinkDedup='by-event-id')14 case 'microservice-events':15 # canonical: outbox + aliases + per-aggregate order16 return Slots(sourceMode='outbox',17 schemaPolicy='BACKWARD-with-aliases',18 partitionKey='aggregate_id',19 cleanup='insert-delete-same-tx',20 sinkDedup='by-event-id')21 case 'read-model':22 # outbox is the deeper fix; aliases the band-aid23 return Slots(sourceMode='outbox',24 schemaPolicy='BACKWARD-with-aliases',25 sinkDedup='by-event-id')
failureAbsorbedBy[]
the persistent reference: failure → slot that closes it
1# failure → absorbed-by slot (origin scene)2# ---------------------------------------------------------------3# slot-grows → (none — operational) (scene 5)4# rename-column → sourceMode=outbox (scene 8)5# rename-column (band-aid) → schemaPolicy=aliases (scene 7)6# snapshot-restart → sinkDedup=by-event-id (scene 11)7# audit-log + never → snapshotMode=initial (scene 6)8# hard-delete-race → cleanup=insert+del-tx (scene 9)9# consumer-replay → sinkDedup (any) (scene 11)10# cross-key-order → partitionKey=aggregate (scene 10)11# tenant_id / random key → (broken, no absorber) (scene 10)12# direct-cdc + DDL churn → (broken, see outbox) (scene 8)