Build a CDC pipeline (Debezium + outbox) (12 scenes)
Scene 10 · Ordering is per key, never global
Kafka guarantees order within a partition; partition by aggregate_id keeps per-aggregate events ordered while accepting that cross-aggregate order is never preserved.
Previously
Cleanup is sorted; the outbox emits one event per business action — but those events are about to be sharded across Kafka partitions, and the question of which events stay in order matters as soon as there is more than one partition.
Scene 10
Ordering is per key, never global
Diagram
Left: the **Debezium connector** emitting one **change event** per row in the **outbox** (the dedicated table whose entire purpose is to be published). Center: 3 **partitions** — each is a single ordered slice of the topic; order is preserved WITHIN a partition, never across partitions. Right: a single consumer reading all three. The router shows hash(key) mod N — change the key on the slider and watch the lanes redraw.
Watch the connector emit change events from the outbox. Each event's aggregate_id is hashed to a partition — events for the same aggregate (same color) always land on the same lane, and within that lane they keep commit order.
Implementation
KafkaProducer.partitionFor(key)
Hash the key, mod numPartitions — that's it.
1def partition_for(key, num_partitions):2 if key is None:3 return sticky_round_robin() # no key = no order promise4 # murmur2 in real Kafka; any stable hash works5 h = murmur2(serialize(key))6 # aggregate_id -> one lane per order (correct)7 # tenant_id -> one lane per tenant (collapse)8 # random/none -> any lane every time (kill order)9 return (h & 0x7fffffff) % num_partitions
Connector.recordToKey(record, mode)
Outbox SMT picks the field that becomes the message key.
1def record_to_key(record, mode):2 # record came from the outbox row Debezium just read3 if mode == 'aggregate_id':4 return record.aggregate_id # unit of consistency5 if mode == 'tenant_id':6 return record.tenant_id # coarser than aggregate7 if mode == 'random':8 return uuid4() # different key every event9 raise ValueError('unknown key mode')
Consumer.canAssumeOrder(a, b)
Per-key ordering is the only contract you may rely on.
1def can_assume_order(a, b):2 # 'a happened-before b' is sound iff they share a key.3 if key_of(a) != key_of(b):4 return False # different lanes, or hash collision5 # same key -> same partition -> commit-order preserved6 return True