Build a gRPC-style RPC framework (14 scenes)
Scene 07 · Retries: idempotency and a token budget
Only idempotent methods are safe to retry automatically, and even then a token-bucket budget must cap the retries; otherwise a brownout turns into a self-sustaining retry storm.
Previously
We learned to abort doomed work with deadlines and cancellation. Now the mirror problem: work that failed and might be worth re-sending. But scene 1 warned that we often can't tell whether the first attempt already ran, so a retry is a loaded gun.
Diagram
On the left, a client calls a single backend (GreeterService) that is in a brownout (slow or failing). The arrows are the original call plus one per retry. The big top meter is the OFFERED LOAD on the backend: 1.0× means exactly its capacity; past ~4× it turns into a storm and the backend flatlines, latching a METASTABLE badge when it stays down after the original trigger clears.
Idempotency: a method is idempotent when re-running it has no extra effect (greet), and its retry arrows are green (safe). A non-idempotent method (charge) turns them red because every retry risks doing the work twice.
Retry budget (token bucket): the lower-left bucket holds tokens. Each failure drains one token and each success refills tokenRatio tokens. Once tokens drop below half the bucket, retries PAUSE and the offered-load meter caps near 1.0×.
The backend is browning out — slow and dropping calls. With no budget and 3 retries per failed call, every client re-sends at once. Watch the OFFERED-LOAD meter on the backend climb. Partway through, the original slowness clears (the trigger turns off) — but the load DOESN'T drop, because the retries are now feeding themselves. That self-sustaining overload, where the service stays down after its own cause is gone, is a *retry storm*: blind retries pile on exactly when a service can least afford it. This scene is about the two things that make retrying safe.
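The climb on the meter can be estimated with a little arithmetic. The sketch below assumes each attempt fails independently with probability p_fail, which is a simplification (in a real brownout the failure rate rises with load). The helper name offered_load_multiplier is hypothetical and not part of the framework.

```python
def offered_load_multiplier(p_fail: float, max_retries: int) -> float:
    """Expected attempts per original call when every failure is retried.

    Attempt k (0-based) is only sent if the previous k attempts all failed,
    which happens with probability p_fail ** k under the independence assumption.
    """
    return sum(p_fail ** k for k in range(max_retries + 1))


if __name__ == "__main__":
    # 3 retries per failed call, as in the scene.
    for p in (0.0, 0.5, 1.0):
        print(f"p_fail={p:.1f} -> {offered_load_multiplier(p, 3):.3f}x")
```

With 3 retries, a healthy backend is offered 1× the original traffic, a backend failing half its calls about 1.875×, and a backend failing every call 4×. The worse the backend gets, the more traffic unbudgeted retries send it.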
Implementation
Client.callWithRetry
the retry loop wrapping every outbound call
def callWithRetry(method, req):
    # RpcError is an assumed exception type that carries the failing status.
    attempt = 0
    status = None
    while attempt < maxAttempts:            # slider: retries + 1
        status = send(method, req)
        if status == OK:
            budget.onSuccess()              # refill by tokenRatio
            return status
        if status not in retryableStatusCodes:
            raise RpcError(status)          # not retryable, e.g. not UNAVAILABLE
        budget.onFailure()                  # drain one token
        if not budget.allow():              # bucket below half: stop retrying
            raise RpcError(status)
        sleep(backoffWithJitter(attempt))
        attempt += 1
    raise RpcError(status)                  # all attempts used up
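The loop calls backoffWithJitter but the scene does not show its body. One common choice is exponential backoff with full jitter, sketched below. The base and cap values are assumptions chosen for illustration, not values taken from the framework.

```python
import random


def backoffWithJitter(attempt: int, base: float = 0.1, cap: float = 5.0) -> float:
    """Seconds to sleep after failed attempt number `attempt` (0-based).

    The ceiling doubles with each attempt up to `cap`; the actual delay is
    drawn uniformly from [0, ceiling] so that clients which failed together
    do not all re-send at the same instant.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0.0, ceiling)


if __name__ == "__main__":
    for attempt in range(4):
        print(f"attempt {attempt}: sleep {backoffWithJitter(attempt):.3f}s")
```

Jitter spreads retries out in time but does not reduce how many are sent, which is why the loop still needs the budget check.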
RetryBudget.allow
token bucket capping retries as a fraction of traffic
tokens = maxTokens  # full bucket

def onFailure():
    global tokens  # rebind the module-level counter, not a local
    tokens = max(0, tokens - 1)

def onSuccess():
    global tokens
    tokens = min(maxTokens, tokens + tokenRatio)

def allow():
    if not enabled:
        return True  # no budget: never pause
    return tokens >= maxTokens / 2
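The same bucket can be wrapped in a small class to watch it pause and resume retries. This is a sketch under assumed settings: max_tokens=10 and token_ratio=0.5 are illustrative defaults, and the class layout is ours, not the framework's.

```python
class RetryBudget:
    """Token bucket that pauses retries once it falls below half full."""

    def __init__(self, max_tokens: float = 10, token_ratio: float = 0.5,
                 enabled: bool = True):
        self.max_tokens = max_tokens
        self.token_ratio = token_ratio
        self.enabled = enabled
        self.tokens = max_tokens  # start with a full bucket

    def onFailure(self) -> None:
        # Each failure drains one whole token, never going below zero.
        self.tokens = max(0, self.tokens - 1)

    def onSuccess(self) -> None:
        # Each success refills a fraction of a token, never above the cap.
        self.tokens = min(self.max_tokens, self.tokens + self.token_ratio)

    def allow(self) -> bool:
        if not self.enabled:
            return True  # no budget: never pause
        return self.tokens >= self.max_tokens / 2


if __name__ == "__main__":
    budget = RetryBudget()
    for _ in range(6):
        budget.onFailure()
    print(budget.tokens, budget.allow())   # 4 False
    budget.onSuccess()
    budget.onSuccess()
    print(budget.tokens, budget.allow())   # 5.0 True
```

Because a failure costs a full token and a success returns only token_ratio of one, retries stay enabled only while successes outnumber failures by a wide margin. That asymmetry is what caps retries as a fraction of traffic.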
Method.execute
why a replay is safe only for an idempotent method
# greet is idempotent: re-running returns the same value
def greet(name):
    return 'hello ' + name

# charge is NOT: each call moves money
def charge(name, amount):
    account[name].balance -= amount  # replay double-bills
    return receipt()

# the leak: a retry can't tell if the first attempt ran
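The difference shows up as soon as a call is replayed. The sketch below simulates a retry whose first attempt actually ran by invoking the method twice. The Account class, the starting balance, and the replay helper are all hypothetical stand-ins, and charge returns the new balance instead of a receipt to keep the example self-contained.

```python
from dataclasses import dataclass


@dataclass
class Account:
    balance: int


# Hypothetical ledger with one customer.
account = {"ada": Account(balance=100)}


def greet(name: str) -> str:
    # Idempotent: no state changes, same answer every time.
    return "hello " + name


def charge(name: str, amount: int) -> int:
    # Not idempotent: every call moves money.
    account[name].balance -= amount
    return account[name].balance


def replay(fn, *args):
    """Call fn twice, as happens when the first attempt ran but its reply was lost."""
    first = fn(*args)
    second = fn(*args)
    return first, second


if __name__ == "__main__":
    print(replay(greet, "ada"))        # ('hello ada', 'hello ada')
    print(replay(charge, "ada", 30))   # (70, 40)
```

Replaying greet is invisible to the caller, while replaying charge bills the customer 60 for a 30 purchase.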