Build a CDC pipeline (Debezium + outbox) (12 scenes)
Scene 10 · Ordering is per key, never global
Kafka guarantees order within a partition; partition by aggregate_id keeps per-aggregate events ordered while accepting that cross-aggregate order is never preserved.
Previously

Cleanup is sorted; the outbox emits one event per business action — but those events are about to be sharded across Kafka partitions, and the question of which events stay in order matters as soon as there is more than one partition.

Scene 10
Ordering is per key, never global
Diagram
Left: the **Debezium connector** emitting one **change event** per row in the **outbox** (the dedicated table whose entire purpose is to be published). Center: 3 **partitions** — each is a single ordered slice of the topic; order is preserved WITHIN a partition, never across partitions. Right: a single consumer reading all three. The router shows hash(key) mod N — change the key on the slider and watch the lanes redraw.
Debezium c…aggregate=o…Debezium c…aggregate=o…Debezium c…aggregate=o…Debezium c…aggregate=o…hash(aggregate_id)mod 3PARTITIONS · 3P0→ ConsumerP1→ ConsumerP2→ ConsumerCONSUMER GROUP · 1 CONSUMERreads all partitions; per-partition order preservedC0owns P0, P1, P2
Watch the connector emit change events from the outbox. Each event's aggregate_id is hashed to a partition — events for the same aggregate (same color) always land on the same lane, and within that lane they keep commit order.
Implementation
KafkaProducer.partitionFor(key)
Hash the key, mod numPartitions — that's it.
1def partition_for(key, num_partitions):
2 if key is None:
3 return sticky_round_robin() # no key = no order promise
4 # murmur2 in real Kafka; any stable hash works
5 h = murmur2(serialize(key))
6 # aggregate_id -> one lane per order (correct)
7 # tenant_id -> one lane per tenant (collapse)
8 # random/none -> any lane every time (kill order)
9 return (h & 0x7fffffff) % num_partitions
Connector.recordToKey(record, mode)
Outbox SMT picks the field that becomes the message key.
1def record_to_key(record, mode):
2 # record came from the outbox row Debezium just read
3 if mode == 'aggregate_id':
4 return record.aggregate_id # unit of consistency
5 if mode == 'tenant_id':
6 return record.tenant_id # coarser than aggregate
7 if mode == 'random':
8 return uuid4() # different key every event
9 raise ValueError('unknown key mode')
Consumer.canAssumeOrder(a, b)
Per-key ordering is the only contract you may rely on.
1def can_assume_order(a, b):
2 # 'a happened-before b' is sound iff they share a key.
3 if key_of(a) != key_of(b):
4 return False # different lanes, or hash collision
5 # same key -> same partition -> commit-order preserved
6 return True