Build a workflow engine (Temporal / Airflow / Cadence style)
13 scenes · ~91 min · build the primitive
A function that survives crashes, restarts, and re-deploys — and still finishes. Build a durable execution engine where workflow code is replayed deterministically from an event history, activities retry with exponential backoff, sagas compensate on failure, and the same workflow definition runs identically a year later. Internalize why 'just retry the cron job' breaks at the second step.
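As a rough illustration of the replay idea in the description above, here is a minimal in-memory sketch. The names `Replayer`, `step`, `charge_card`, `send_email`, and `order_workflow` are hypothetical and do not correspond to the API of Temporal, Airflow, or Cadence; a real engine would persist the history durably rather than keep it in a Python list.

```python
from typing import Any, Callable


class Replayer:
    """Hypothetical replay context (an assumption for illustration only)."""

    def __init__(self, history: list[Any]) -> None:
        self.history = history  # stands in for a durable, append-only event history
        self._cursor = 0        # index of the next step within the history

    def step(self, fn: Callable[[], Any]) -> Any:
        if self._cursor < len(self.history):
            # Replay: the step already ran, so hand back the recorded result.
            result = self.history[self._cursor]
        else:
            # First execution: run the side effect, then record its result.
            result = fn()
            self.history.append(result)
        self._cursor += 1
        return result


charges: list[str] = []  # stands in for the payment provider's ledger


def charge_card() -> str:
    charges.append("charge")
    return "receipt-1"


def send_email() -> str:
    return "email-sent"


def order_workflow(ctx: Replayer, crash_after_charge: bool = False) -> str:
    receipt = ctx.step(charge_card)
    if crash_after_charge:
        raise RuntimeError("simulated crash after step one")
    ctx.step(send_email)
    return receipt


history: list[Any] = []

# First run crashes after the charge has been recorded.
try:
    order_workflow(Replayer(history), crash_after_charge=True)
except RuntimeError:
    pass

# Second run replays from the top; the charge is read from history, not redone.
result = order_workflow(Replayer(history))
```

After both runs, `charges` holds a single entry: the second run re-executes the workflow code from the top but takes the charge result from the history. The sketch only works because `order_workflow` calls its steps in the same order on every run, which is the determinism requirement the description mentions.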
- 01 · The crash that charges you twice · ~7 min
  A plain function keeps its progress in RAM, so a crash after step one erases it, and a naive cron retry re-runs from the top and charges the card a second time.
- 02 · The event history is the source of truth · ~7 min
  Stop trusting RAM: append every step's result to a durable, append-only event history, so a crash that wipes memory leaves the record of what already happened intact.
- 03 · Replay: re-run the code against the history · ~7 min
  To resume, the engine re-runs your code from the top and hands back the recorded results instead of redoing them, which only works if the code is deterministic.
- 04 · Activities: quarantine for side effects · ~7 min
  Replay re-runs workflow code, so every side effect must move into an activity whose result is recorded; replay hands the result back instead of charging again.
- 05 · Task queues: workers pull, so redeploys are safe · ~7 min
  The engine never pushes work; stateless workers pull tasks from a queue, so a redeploy is just 'no worker for a moment' and the task simply waits to be picked up.
- 06 · Retries and exponential backoff · ~7 min
  The engine retries a failed activity on its own, widening the gap between attempts so a sick downstream can recover instead of being pinned down by a retry storm.
- 06a · Idempotency keys: the last hole in the double-charge · ~7 min
  An activity can run twice if it succeeds then crashes before recording; a stable idempotency key lets the downstream recognize the repeat and refuse the second charge.
- 07 · Durable timers: sleep 30 days on zero compute · ~7 min
  A thread that sleeps for a month dies on the first crash; a durable timer records the wait as an event, so the workflow goes dormant until the engine fires the wake-up.
- 08 · Signals and queries: the workflow as an actor · ~7 min
  A running workflow is an addressable actor: a signal delivers external input durably and can change its path; a query reads its state without mutating it.
- 09 · The saga: compensate partial failure in reverse · ~7 min
  You can't wrap steps across services in one transaction; a saga gives each step an undo and runs the compensators for completed steps in reverse: a refund, not a rollback.
- 10 · Child workflows and ContinueAsNew · ~7 min
  Child workflows isolate sub-units with their own histories, and ContinueAsNew restarts an endless workflow with a fresh history but the same ID before it hits the limit.
- 11 · Versioning: the year-later replay · ~7 min
  A year-old in-flight execution can't be hot-fixed; a version gate routes old executions down the old path and new ones down the new path, so both replay deterministically.
- 12 · Design canvas: choose the right engine · ~7 min
  Match each workload to its model (code-as-workflow replay, a DAG scheduler, or a state machine) and defend every choice with the scene that taught the requirement.
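
Two of the mechanisms named in the scene list, the widening retry gap (scene 06) and reverse-order compensation (scene 09), can be sketched in a few lines. This is a simplified illustration under stated assumptions: `backoff_delays`, `run_saga`, and the step names are hypothetical, the delays are computed rather than slept, and jitter, durability, and retry limits per error type are left out.

```python
from typing import Callable

Step = tuple[Callable[[], None], Callable[[], None]]  # (action, compensator)


def backoff_delays(attempts: int, initial: float = 1.0,
                   factor: float = 2.0, cap: float = 60.0) -> list[float]:
    """Delay before each retry: initial * factor**i, clamped to cap."""
    return [min(cap, initial * factor ** i) for i in range(attempts)]


def run_saga(steps: list[Step]) -> bool:
    """Run actions in order; on failure, undo completed steps in reverse."""
    completed: list[Callable[[], None]] = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)  # only completed steps get compensated
    except Exception:
        for compensate in reversed(completed):
            compensate()
        return False
    return True


log: list[str] = []


def ship_order() -> None:
    raise RuntimeError("carrier unavailable")  # simulated failure at step three


ok = run_saga([
    (lambda: log.append("reserve"), lambda: log.append("release")),
    (lambda: log.append("charge"), lambda: log.append("refund")),
    (ship_order, lambda: log.append("cancel-shipment")),
])

delays = backoff_delays(5, cap=10.0)
```

Here the third step fails, so the saga runs `refund` and then `release`, and never runs `cancel-shipment` because shipping did not complete. The delay list doubles from 1 second until it reaches the 10-second cap.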