Build a workflow engine (Temporal / Airflow / Cadence style) (13 scenes)
Scene 07 · Durable timers: sleep 30 days on zero compute
A thread that sleeps for a month dies on the first crash; a durable timer records the wait as an event, so the workflow goes dormant until the engine fires the wake-up.
Previously

We've now survived crashes at every step of a FAST order. But real orders aren't fast — ORDER #1001 must wait 7 days for the 'rate this product' email. A thread that sleeps for a week dies on the first crash in that week. So the engine records the wait as a durable event in the history and the workflow goes completely dormant, using zero compute, until the engine fires the wake-up event. Time becomes a first-class durable object: the timer.

Diagram
TOP: a thread-based sleep(7d) that holds a real worker — a crash during the week erases the in-RAM countdown and the order stalls. BOTTOM: a durable timer — the wait is recorded as a TimerStarted event in ORDER #1001's history, the engine owns the countdown, and the workflow box goes dark: dormant, using essentially zero compute. When the deadline arrives the engine fires TimerFired and wakes the workflow via the task queue. The 'crashes during the wait' slider shows the thread model losing the timer while the durable model survives every restart.
[Diagram: ORDER #1001, wait 7d before the "rate this product" email. Top panel: THREAD-BASED sleep(7d), a real worker is held for the whole wait (worker compute while waiting: HIGH). Bottom panel: DURABLE TIMER, the wait is an event the engine owns; the workflow is dormant and the engine tracks the deadline (worker compute while waiting: ≈ 0). Event history (append-only), where the timer lives instead of in a thread: WorkflowStarted (order=1001), ActivityCompleted (ShipPackage ok), TimerStarted (rate-email +7d).]
ORDER #1001 has shipped. Now it must wait 7 days before sending the "rate this product" email. The obvious way — call sleep(7 days) — holds a real worker process for the whole week, which is what the TOP model shows. But the engine has a better way. When your workflow code asks to wait, the engine doesn't block a thread; it appends a **TimerStarted** event to ORDER #1001's history and the workflow box goes completely dark: **dormant** — holding no worker and burning essentially zero compute, because the only thing tracking the deadline now is the engine itself. When the countdown hits zero, the engine appends a **TimerFired** event and wakes the workflow via the task queue. That whole mechanism — a wait recorded as a TimerStarted/TimerFired event pair that the engine owns — is a **durable timer**: time is stored as an event in history, not held in a sleeping thread. Watch the durable model start the wait and go dormant, then fire on schedule.
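The TimerStarted/TimerFired pair described above can be sketched as a tiny in-memory simulation. This is a toy under stated assumptions, not how any particular engine implements it: `ToyEngine`, `start_timer`, and `tick` are hypothetical names, time is a plain number of simulated seconds, and a real engine would persist the history to durable storage rather than a dict.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    kind: str       # "TimerStarted" or "TimerFired"
    fire_at: float  # deadline, in simulated seconds


@dataclass
class ToyEngine:
    """Hypothetical in-memory engine; a real one would write to durable storage."""
    history: dict = field(default_factory=dict)     # wf_id -> append-only list of Event
    schedule: dict = field(default_factory=dict)    # wf_id -> fire_at
    task_queue: list = field(default_factory=list)  # workflows waiting to be resumed

    def start_timer(self, wf_id, now, duration):
        fire_at = now + duration
        self.history.setdefault(wf_id, []).append(Event("TimerStarted", fire_at))
        self.schedule[wf_id] = fire_at  # the engine owns the countdown

    def tick(self, now):
        """Fire every timer whose deadline has passed."""
        for wf_id, fire_at in list(self.schedule.items()):
            if fire_at <= now:
                del self.schedule[wf_id]
                self.history[wf_id].append(Event("TimerFired", fire_at))
                self.task_queue.append(wf_id)  # wake the workflow


DAY = 24 * 60 * 60
engine = ToyEngine()
engine.start_timer("order-1001", now=0, duration=7 * DAY)

engine.tick(now=4 * DAY)                  # 4 days in: nothing fires, no worker is held
mid_wait_queue = list(engine.task_queue)  # snapshot taken while the workflow is dormant

engine.tick(now=7 * DAY)                  # deadline reached: TimerFired is appended
kinds = [e.kind for e in engine.history["order-1001"]]
```

Between the two `tick` calls the only state that exists for the order is one history event and one schedule entry, which is the sense in which the wait costs essentially zero compute.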
Implementation
Workflow.waitForRating
two ways to wait — one dies on crash, one records an event
async def waitForRating(order):
    ship(order)
    # BROKEN alternative: held thread, countdown lives in RAM
    #   sleep(days=7)              # dies on first crash
    # DURABLE: yields, recording a TimerStarted event
    await workflow.sleep(days=7)   # returns control; goes dormant
    send_rating_email(order)
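One way to picture "returns control; goes dormant" is a workflow written as a generator that the engine re-runs against recorded history. The sketch below assumes a single timer per workflow and uses plain strings as events; `wait_for_rating`, `replay`, and the command tuples are illustrative names, not the API of any real workflow SDK.

```python
def wait_for_rating(order_id):
    """Workflow as a generator: each yield is a command handed to the engine."""
    yield ("activity", "ship")
    yield ("sleep", 7)  # days; the engine decides whether to park here
    yield ("activity", "send_rating_email")


def replay(workflow, history, log):
    """Re-run the workflow from the top against the recorded history.

    Returns "dormant" if it parks on an unfired timer, otherwise "done".
    """
    timer_started = "TimerStarted" in history
    timer_fired = "TimerFired" in history
    for kind, arg in workflow:
        if kind == "sleep":
            if timer_fired:
                continue  # the wait is already satisfied: move past the sleep
            if not timer_started:
                history.append("TimerStarted")
            return "dormant"  # stop executing; no thread is held
        elif kind == "activity":
            marker = f"ActivityCompleted:{arg}"
            if marker not in history:  # run each side effect only once
                log.append(arg)
                history.append(marker)
    return "done"


history, log = [], []
first = replay(wait_for_rating(1001), history, log)   # ships, then parks on the timer
history.append("TimerFired")                          # the engine fires the deadline
second = replay(wait_for_rating(1001), history, log)  # resumes right after the sleep
```

The second replay starts from the top of the function, but the history tells it that shipping already happened and the timer already fired, so the only new work is the rating email.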
Engine.onTimerStarted
persist the deadline to history, then unload the workflow
def onTimerStarted(wf_id, duration):
    fire_at = now() + duration
    history.append(wf_id, TimerStarted(fire_at))
    schedule.add(wf_id, fire_at)   # engine owns the countdown
    unload(wf_id)                  # dormant: no worker held
Engine.fireTimers
deadline (or recovery) appends TimerFired and re-queues
def fireTimers():  # also runs on recovery after a crash
    for wf_id, fire_at in schedule.due(now()):
        history.append(wf_id, TimerFired())
        task_queue.put(wf_id)      # wake it
        replay(wf_id)              # resumes right after the sleep
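The "also runs on recovery" case can be sketched by rebuilding the pending timers from history alone, since the in-memory schedule is gone after a crash. This is a minimal illustration that assumes at most one open timer per workflow and represents events as `(kind, fire_at)` tuples; `pending_timers` and `recover_and_fire` are hypothetical helpers.

```python
def pending_timers(histories):
    """Return {wf_id: fire_at} for every TimerStarted with no TimerFired after it."""
    pending = {}
    for wf_id, events in histories.items():
        for kind, fire_at in events:
            if kind == "TimerStarted":
                pending[wf_id] = fire_at
            elif kind == "TimerFired":
                pending.pop(wf_id, None)
    return pending


def recover_and_fire(histories, now):
    """After a restart: rebuild the schedule from history, then fire anything overdue."""
    woken = []
    for wf_id, fire_at in sorted(pending_timers(histories).items()):
        if fire_at <= now:
            histories[wf_id].append(("TimerFired", fire_at))
            woken.append(wf_id)
    return woken


DAY = 86_400
histories = {
    "order-1001": [("TimerStarted", 7 * DAY)],   # came due while the engine was down
    "order-1002": [("TimerStarted", 30 * DAY)],  # still waiting
    "order-1003": [("TimerStarted", 2 * DAY), ("TimerFired", 2 * DAY)],  # already fired
}

woken = recover_and_fire(histories, now=8 * DAY)        # engine comes back on day 8
woken_again = recover_and_fire(histories, now=8 * DAY)  # a second pass fires nothing new
still_pending = pending_timers(histories)
```

Because TimerFired is itself recorded in history, running recovery twice does not wake the same workflow twice; the overdue timer fires late, but it is never lost.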