Build a CDC pipeline (Debezium + outbox) (12 scenes)
Scene 09 · Outbox cleanup — pick your poison
Hard-delete after emit, tombstone + log compaction, partition-by-date drop — three trade-offs across WAL noise, race risk, and operational complexity. INSERT+DELETE same-tx collapses cleanup into the write.
Previously

The outbox closes the dual-write gap — but every business write now also writes a row whose entire purpose is to be deleted, so the table will grow forever unless we pick a cleanup strategy. Three strategies are on the table; each loses something different.

Diagram
Three vertical lanes show the same write workload aged from t=0 to t=30d under three cleanup strategies: (1) hard-delete after the connector emits, (2) **tombstone** + **log compaction**, (3) date-partitioned drop. Each lane shows outbox-table rows, op=d events per hour in the WAL, and Kafka topic size at 1h / 24h / 30d.

Term refresh:
**Outbox**: a dedicated table written inside the same transaction as the business write, born to die.
**Single Message Transform (SMT)**: a per-message Kafka Connect plugin that reshapes events between connector and topic (Outbox Event Router maps aggregate_id → key, payload → value, and can filter op=d).

New terms:
**Tombstone**: a Kafka message with a null value that signals log compaction to drop the prior value for that key (this is the Kafka log-compaction tombstone, a null payload, not a soft-delete row).
**Log compaction**: Kafka's per-key 'keep latest' retention policy; when enabled on a topic, older values for the same key are eventually removed.
[Interactive diagram: "Outbox cleanup · pick your poison". Time scrubber shown at t = 0d, with stops at t=0, 1h, 1d, 1w, 30d. Each lane also plots topic size at 1h / 24h / 30d.]
Lane 1, Hard-delete after emit: outbox rows 1,500; WAL noise 18,000/hr op=d. Spurious op=d events; race window if consumer lags.
Lane 2, Tombstone + log compaction: outbox rows 600; WAL noise 0/hr op=d.
Lane 3, Partition by date, drop month…: outbox rows 0; WAL noise 0/hr op=d. Operationally simple, but needs a date-partitioned table from day one.
Three lanes, one workload. Watch each lane age across 30 days — the outbox row count, the op=d noise in the WAL, and the Kafka topic size diverge. Lane 1 stays small but pumps op=d events; lane 2 stays small via tombstones + compaction; lane 3 climbs daily and drops at month-end.
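To make the tombstone and log-compaction definitions concrete, here is a minimal in-memory sketch of the per-key 'keep latest' rule. It is a toy model, not Kafka's implementation: the `compact` function and the `order-1` / `order-2` keys are hypothetical, and it removes tombstones immediately, whereas a real broker compacts in the background and keeps tombstones around for a configurable period (`delete.retention.ms`) before removing them.

```python
from typing import Dict, List, Optional, Tuple

# (key, value); a value of None marks a tombstone
Record = Tuple[str, Optional[str]]


def compact(log: List[Record]) -> List[Record]:
    """Keep the latest record per key; drop keys whose latest record is a tombstone."""
    latest: Dict[str, Tuple[int, Optional[str]]] = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # a later offset overwrites an earlier one
    # Preserve the original offset order of the surviving records.
    survivors = sorted(latest.items(), key=lambda item: item[1][0])
    return [(key, value) for key, (_, value) in survivors if value is not None]


log = [
    ("order-1", "created"),
    ("order-2", "created"),
    ("order-1", "shipped"),
    ("order-2", None),  # tombstone for order-2
]
compacted = compact(log)
print(compacted)
```

After compaction only the latest value for `order-1` survives; `order-2` disappears entirely because its latest record is a tombstone. A consumer that had not yet read `order-2`'s value before compaction ran would never see it, which is the race the lane 2 listing warns about.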
Implementation
Connector.hardDeleteAfterEmit
naive lane 1: connector emits, then deletes the row
def hardDeleteAfterEmit(row):
    # kafka, db and route are assumed handles: producer, DB connection, topic router
    kafka.produce(
        topic=route(row),
        key=row.aggregate_id,
        value=row.payload,
    )
    # race: if the consumer lags, op=d may arrive
    # before the consumer has read op=c
    db.execute(
        'DELETE FROM outbox WHERE id = %s',
        (row.id,),  # query parameters passed as a sequence
    )
    # WAL now records op=d; Outbox SMT must filter
Connector.tombstoneAndCompact
lane 2: emit a null-valued tombstone; Kafka compacts per key
# topic configured cleanup.policy=compact
def tombstoneAndCompact(key):
    kafka.produce(
        topic=outbox_topic,
        key=key,
        value=None,  # tombstone
    )
    # log compaction will eventually drop prior
    # values for this key (per-key 'keep latest')
    # race: lagging consumer may miss the value
    # before compaction reclaims it
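Lane 3 has no listing in this scene, so here is a hedged sketch of what date-partitioned cleanup could look like. It assumes PostgreSQL declarative partitioning, a parent table named `outbox` declared with `PARTITION BY RANGE` on a timestamp column, and a monthly `outbox_YYYY_MM` naming scheme; all three are assumptions for illustration, not details given by the scene. The helpers only build SQL strings, so nothing here touches a database.

```python
from datetime import date
from typing import Tuple


def month_bounds(day: date) -> Tuple[date, date]:
    """Return the first day of day's month and the first day of the next month."""
    start = day.replace(day=1)
    if start.month == 12:
        end = date(start.year + 1, 1, 1)
    else:
        end = date(start.year, start.month + 1, 1)
    return start, end


def partition_name(day: date) -> str:
    # Hypothetical naming scheme: outbox_YYYY_MM
    return f"outbox_{day.year:04d}_{day.month:02d}"


def create_partition_sql(day: date) -> str:
    """DDL for the monthly partition that covers the given day."""
    start, end = month_bounds(day)
    return (
        f"CREATE TABLE IF NOT EXISTS {partition_name(day)} "
        f"PARTITION OF outbox "
        f"FOR VALUES FROM ('{start.isoformat()}') TO ('{end.isoformat()}')"
    )


def drop_partition_sql(day: date) -> str:
    """DDL that discards a whole month of outbox rows in one statement."""
    return f"DROP TABLE IF EXISTS {partition_name(day)}"


print(create_partition_sql(date(2024, 12, 15)))
print(drop_partition_sql(date(2024, 12, 15)))
```

Cleanup becomes one DDL statement per month instead of one DELETE per row, which is why the diagram shows no op=d noise for this lane. The cost is the one the diagram names: the outbox has to be created as a partitioned table from day one.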
Service.insertDeleteSameTx
lane 1 collapsed: cleanup folded into the write path
def insertDeleteSameTx(event):
    db.execute('BEGIN')
    db.execute(
        'INSERT INTO outbox(...) VALUES (...)',
        event,
    )
    db.execute(
        'DELETE FROM outbox WHERE id = %s',
        (event.id,),  # query parameters passed as a sequence
    )
    db.execute('COMMIT')
    # WAL has both records; after commit Debezium emits
    # op=c then op=d, in that order; sink ignores op=d
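The "sink ignores op=d" step can be sketched as a small filter over Debezium-style change events. This is an illustration under assumptions: the events are plain dicts shaped like the Debezium envelope (`op`, `before`, `after`), the `payload` column name comes from the SMT description earlier in this scene, and `outbox_payload` is a hypothetical helper. With the Outbox Event Router SMT in place, this filtering would normally happen in the connector rather than in the sink.

```python
from typing import Any, Dict, Optional


def outbox_payload(event: Dict[str, Any]) -> Optional[Any]:
    """Return the payload for create events; return None for anything else."""
    if event.get("op") != "c":
        return None  # op=d (and any other op) carries no new business event
    after = event.get("after") or {}
    return after.get("payload")


# One INSERT + DELETE transaction as the connector would emit it: op=c, then op=d.
events = [
    {
        "op": "c",
        "before": None,
        "after": {"id": "e1", "aggregate_id": "order-1", "payload": "created"},
    },
    {
        "op": "d",
        "before": {"id": "e1", "aggregate_id": "order-1", "payload": "created"},
        "after": None,
    },
]

delivered = [p for p in map(outbox_payload, events) if p is not None]
print(delivered)
```

Only the create event reaches the business handler; the delete that follows it is dropped without side effects, so the same-transaction cleanup stays invisible downstream.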