Build a workflow engine (Temporal / Airflow / Cadence style) (13 scenes)
Scene 11 · Versioning: the year-later replay
A year-old in-flight execution can't be hot-fixed; a version gate routes old executions down the old path and new ones down the new path, so both replay deterministically.
Previously

A version gate lets ORDER #1001, started a year ago under v1, wake and replay safely alongside v2 executions. You now hold the full toolkit: history, replay, activities, workers, retries, idempotency, timers, signals, sagas, children, versioning. The last question isn't a new mechanism — it's judgment: for a real workload, is this code-as-workflow replay model even the right tool, or is a DAG scheduler or a state machine the better fit?

Diagram
Two executions of the order workflow replay through ONE deployed codebase (v2 inserts a FraudCheck between ChargeCard and ReserveInventory). ORDER #1001 is in flight: it started under v1, is still running on its timer, and its history has no FraudCheck. Without a version gate, #1001 replays the v2 path, expects a FraudCheck that isn't in its history, diverges, and throws the non-determinism error from scene 3. The getVersion gate reads the version recorded in each execution's OWN history and routes #1001 down the v1 branch and #1002 down the v2 branch, so both replay deterministically.

In-flight execution: a run started under older code that is still alive and carries frozen expectations in its history.
Version gate (getVersion/patching): the if-branch that reads each execution's recorded version and sends old runs down the old path and new runs down the new one.

Editing live workflow code is a breaking change, not a hot-fix; gates accumulate as cruft until old executions drain out of retention.
[Diagram: DEPLOY v2, one codebase, two live executions. Toggle: VERSION GATE (off state labeled "version gate OFF: old run replays new …").
Deployed code (v2): ChargeCard($42), FraudCheck(), ReserveInventory(), Ship(), Email(). Legend: blue = workflow (replayed), red = activity (runs once, recorded).
ORDER #1001 (started v1, still in flight): Start v1 (replayed), ChargeCard ✓ $42 (replayed), Timer 30d (executing for real).
ORDER #1002 (started v2, today): Start v2 (replayed), ChargeCard ✓ $42 (replayed), FraudCheck ✓ (executing for real).
The replay head grays replayed steps and stops at the first un-recorded one; with the gate on, each run replays cleanly on its routed branch.]
ORDER #1001 was started a year ago under v1; #1002 today under v2. Both replay through this one deployed codebase.
It's a year later. ORDER #1001 charged the card $42, then went to sleep on its 30-day timer, and it's STILL alive, waiting to wake. Today you ship v2 of the order workflow: it inserts a FraudCheck between ChargeCard and ReserveInventory. Here's the trap you can't see in a normal service. There is only ONE deployed codebase now, v2, and BOTH executions replay through it. But #1001's recorded history was written under v1: it contains no FraudCheck event, because that step didn't exist when it ran. A run like #1001, started under older code and still alive, carrying expectations frozen into its history, is an **in-flight execution**. #1002, started today, ran against v2 from its first step, so its history already carries FraudCheck. Watch both executions appear on the strip: same code ahead of them, but two different histories behind them.
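The trap can be reproduced in a few lines. This is a minimal sketch, not the engine's real replay loop: the helper `replay_ungated` and the list-of-strings event encoding are assumptions made for illustration, and the two histories are copied from the diagram with the Start events left out. It compares the commands the v2 code would issue, with no gate, against each run's recorded history.

```python
class NonDeterminismError(Exception):
    """Raised when replayed code issues a command the history did not record."""


# Command order of the deployed v2 code, as listed in the diagram.
V2_COMMANDS = ["ChargeCard", "FraudCheck", "ReserveInventory", "Ship", "Email"]


def replay_ungated(commands, history):
    """Match commands against history in order; return the commands still to run live.

    Hypothetical helper: assumes the history is never longer than the command list.
    """
    for position, recorded in enumerate(history):
        issued = commands[position]
        if issued != recorded:
            raise NonDeterminismError(
                f"step {position}: code issued {issued}, history recorded {recorded}"
            )
    return commands[len(history):]


history_1001 = ["ChargeCard", "Timer"]       # written under v1, no FraudCheck
history_1002 = ["ChargeCard", "FraudCheck"]  # written under v2

print(replay_ungated(V2_COMMANDS, history_1002))
try:
    replay_ungated(V2_COMMANDS, history_1001)
except NonDeterminismError as err:
    print("diverged:", err)
```

In this sketch #1002 replays cleanly and continues from ReserveInventory, while #1001 diverges at step 1, where the code asks for FraudCheck and the history holds Timer.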
Implementation
Worker.replay
re-run the code, match each command against history
def replay(execution):
    history = execution.history  # frozen, append-only
    cursor = Cursor(history)
    for cmd in run_workflow(execution):
        recorded = cursor.next_event()
        if recorded is None:
            return execute_live(cmd)  # caught up
        if cmd.kind != recorded.kind:
            raise NonDeterminismError(cmd, recorded)
        feed_back(cmd, recorded.result)  # don't re-run
Workflow.orderV2
the one deployed codebase both runs replay
def order_workflow(order):
    charge_card(order, 42)
    v = get_version('fraud', min=1, max=2)
    if v >= 2:
        fraud_check(order)
    reserve_inventory(order)
    ship(order)
    send_email(order)
Engine.getVersion
read the version from THIS run's own history
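def get_version(change_id, min, max):
    marker = history.find(change_id)
    if marker is not None:
        return marker.version  # what this run committed to
    # is_replaying() is an assumed helper: true while recorded events remain past this point
    if is_replaying():
        return min  # history predates the gate, so stay on the old path
    # first time, running live: record max so future replays agree
    record(VersionMarker(change_id, max))
    return max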
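To watch the gate route both orders, here is a self-contained simulation. It is a sketch under stated assumptions rather than the engine's actual code: the `Execution` class, the string-encoded events, and the `Version:fraud=2` marker format are all hypothetical, and the 30-day timer is assumed to sit right after the gate. One more assumption is worth flagging: when replay reaches the gate and the history holds no marker at that position, the sketch returns the minimum version, which is what keeps #1001 on the v1 path.

```python
from dataclasses import dataclass, field


@dataclass
class Execution:
    history: list = field(default_factory=list)  # recorded events, oldest first
    cursor: int = 0                              # replay position in history

    def replaying(self):
        return self.cursor < len(self.history)

    def step(self, kind):
        """Replay one step if it is recorded; otherwise run it live and record it."""
        if self.replaying():
            recorded = self.history[self.cursor]
            if recorded != kind:
                raise RuntimeError(
                    f"non-determinism: code issued {kind}, history has {recorded}"
                )
        else:
            self.history.append(kind)
        self.cursor += 1

    def get_version(self, change_id, min_version, max_version):
        prefix = f"Version:{change_id}="
        if self.replaying():
            recorded = self.history[self.cursor]
            if recorded.startswith(prefix):
                self.cursor += 1
                return int(recorded[len(prefix):])  # what this run committed to
            return min_version  # history predates the gate: old path
        # Running live for the first time: record max so future replays agree.
        self.history.append(f"{prefix}{max_version}")
        self.cursor += 1
        return max_version


def order_workflow(ex):
    ex.step("ChargeCard")
    if ex.get_version("fraud", 1, 2) >= 2:
        ex.step("FraudCheck")
    ex.step("Timer")  # assumed position of the 30-day timer
    ex.step("ReserveInventory")
    ex.step("Ship")
    ex.step("Email")
    return ex.history


order_1001 = Execution(history=["ChargeCard", "Timer"])  # started under v1
order_1002 = Execution(history=["ChargeCard", "Version:fraud=2", "FraudCheck"])

print(order_workflow(order_1001))
print(order_workflow(order_1002))
```

Both runs go through the same `order_workflow` function. In this sketch #1001 finishes without ever issuing FraudCheck, while #1002 reads its own marker and stays on the v2 branch.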