Build a Message Queue (RabbitMQ / SQS) (14 scenes)
Scene 08 · Dead-letter queue — the escape valve
After N failed deliveries, the broker routes the cell to a sibling DLQ and the main queue keeps moving. maxReceiveCount is the knob, and it fails in two opposite ways: set too low or set too high.
Previously
A badge that reaches the threshold is the broker's only signal that this message is poison rather than transient. So we route it to a sibling channel, a quarantine, and let ops decide when (and whether) to redrive it.
Diagram
Two stacked strips: the MAIN queue above (head on the left, workers on the right), and the DLQ below (no workers — quarantine). Each cell carries a delivery-count badge in its top-right corner. The labeled vertical arrow shows the maxReceiveCount threshold; when a cell's badge hits it, the broker routes the cell down into the DLQ instead of back to the main head. The redrive button is a manual op — the DLQ does not drain on its own.
Same poison cell from the last scene — but now the broker watches the badge. Workers grab it, fail, the count ticks 1, 2, 3. At maxReceiveCount=3 the cell slides down the chute into the DLQ. The main queue keeps going; the workers can finally drain the healthy cells behind it.
Implementation
Broker.nack(cell, requeue=true)
bump the delivery count; route to the DLQ once it reaches maxReceiveCount
def nack(cell, requeue=True):
    cell.deliveryCount += 1
    if cell.deliveryCount >= maxReceiveCount:
        # quarantine: sibling queue, no workers attached
        dlq.append(cell)
        return
    if requeue:
        # back to the head, next worker grabs it
        mainQueue.pushFront(cell)
    # else: drop silently (rare; opt-in)
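To see the nack path end to end, here is a minimal runnable sketch. It is an illustration under assumptions, not the broker's real implementation: the `Cell` and `Broker` classes, the `receive`/`ack` methods, and the `drain` worker loop are hypothetical names, and both queues are plain `collections.deque` objects.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Cell:
    body: str
    delivery_count: int = 0  # the badge in the diagram


@dataclass
class Broker:
    max_receive_count: int = 3
    main_queue: deque = field(default_factory=deque)
    dlq: deque = field(default_factory=deque)  # no workers attached

    def receive(self):
        """Hand the head cell to a worker, or None if the queue is empty."""
        return self.main_queue.popleft() if self.main_queue else None

    def ack(self, cell):
        """Success: the cell already left the queue, so nothing to do."""

    def nack(self, cell, requeue=True):
        cell.delivery_count += 1
        if cell.delivery_count >= self.max_receive_count:
            self.dlq.append(cell)  # quarantine
            return
        if requeue:
            self.main_queue.appendleft(cell)  # back to the head
        # else: the cell is dropped (opt-in)


def drain(broker, is_poison):
    """Single hypothetical worker: nack poison cells, ack the rest."""
    processed = []
    while (cell := broker.receive()) is not None:
        if is_poison(cell):
            broker.nack(cell)
        else:
            broker.ack(cell)
            processed.append(cell.body)
    return processed


broker = Broker(max_receive_count=3)
broker.main_queue.extend([Cell("poison"), Cell("a"), Cell("b")])
processed = drain(broker, lambda c: c.body == "poison")
print(processed, [c.body for c in broker.dlq])
```

The poison cell is delivered three times, lands in the DLQ with a badge of 3, and only then do the healthy cells behind it get processed. Without the threshold check, the `drain` loop would never terminate.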
Broker.redrive()
manual op: drain the DLQ back to the main tail
def redrive():
    # ops triggers this AFTER fixing the underlying bug
    while not dlq.empty():
        cell = dlq.popFront()
        cell.deliveryCount = 0  # fresh attempts budget
        mainQueue.pushBack(cell)
    # DLQ does NOT drain on its own; quarantine is sticky
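The redrive step can be sketched as a standalone function. This is a hedged illustration: cells are modeled as plain dicts with hypothetical `body` and `delivery_count` keys, both queues are `collections.deque` objects, and the return value (a count of moved cells) is an addition for convenience.

```python
from collections import deque


def redrive(dlq, main_queue):
    """Move every quarantined cell to the tail of the main queue.

    Returns the number of cells moved. Call this only after the
    underlying bug is fixed, or the cells will fail their way back here.
    """
    moved = 0
    while dlq:
        cell = dlq.popleft()
        cell["delivery_count"] = 0  # fresh attempts budget
        main_queue.append(cell)     # tail: behind traffic already waiting
        moved += 1
    return moved


main_queue = deque([{"body": "healthy", "delivery_count": 0}])
dlq = deque([
    {"body": "p1", "delivery_count": 3},
    {"body": "p2", "delivery_count": 3},
])
moved = redrive(dlq, main_queue)
order = [cell["body"] for cell in main_queue]
print(moved, order)
```

Redriven cells go to the tail, not the head, so a redrive that turns out to be premature does not immediately block healthy traffic again.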
Broker.config — the symmetric knob
the same threshold expresses two opposite failure modes
# threshold = 1: trigger-happy
# one nack -> DLQ, no retry budget at all
config.maxReceiveCount = 1
# a 200ms DB blip at 1000 msg/s ->
# ~200 healthy messages quarantined per blip

# threshold = 100: slow quarantine
# poison cell loops 99 extra times
config.maxReceiveCount = 100
# workers pinned; healthy traffic starves
# sweet spot in practice: 3..10
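The two numbers in the comments above come from simple arithmetic, worked through below. The function names are hypothetical, and the first function rests on a simplifying assumption: every message delivered during the blip fails exactly once, which with a threshold of 1 is enough to quarantine it.

```python
def healthy_quarantined(rate_per_s, blip_ms):
    """Threshold = 1: every delivery that fails during the blip goes to the DLQ."""
    return rate_per_s * blip_ms // 1000


def extra_poison_deliveries(max_receive_count):
    """Deliveries a poison cell consumes beyond the first before quarantine."""
    return max_receive_count - 1


print(healthy_quarantined(1000, 200))   # 1000 msg/s * 0.2 s
print(extra_poison_deliveries(100))     # threshold = 100
print(extra_poison_deliveries(3))       # low end of the 3..10 range
```

At a threshold of 3, a poison cell costs only two wasted deliveries, while a healthy message still gets two retries to ride out a transient failure.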