Build a CDN (13 scenes)
Scene 11 · Hit ratio — the headline and the diagnostic ladder
Request hit ratio vs byte hit ratio, and the 5-step ladder when it crashes: Vary cardinality → TTL config → purge frequency → bypass rules → cookie key.
Previously

Every lever we've built — TTL, Vary, purge, shield, bypass — moves one number, origin RPS, and the headline that summarizes them all is hit ratio; this is the dashboard you read, and the ladder you walk when it's low.

Diagram
Top: three big metric tiles. **Request hit ratio** is the fraction of requests served from cache; **byte hit ratio** is the fraction of bytes served from cache. The two diverge when small objects cache well but big ones don't. Origin RPS is the number every other lever pushes or pulls, and the one that climbs when caching fails. Middle: a Sankey-style flow where incoming requests split into a fat green HIT branch and a thin red MISS branch; the MISS branch fans into five named buckets: Vary explosion, no-store/no-cache, recent purge, bypass rule, cookie in cache key. The active incident's bucket inflates and throbs. Right: a 5-step diagnostic ladder; selecting the step that matches the incident lights up the corresponding bucket as the root cause.
Dashboard state (healthy): request hit ratio 92.0%, byte hit ratio 88.0%, origin RPS 38.0.
Request flow, HIT vs MISS (band thickness ∝ share of traffic): incoming 100% splits into HIT 92.0% (served from edge cache) and MISS 8.0%.
Miss buckets, each 20% of misses: Vary explosion, no-store / no-cache header, recent purge, bypass rule, cookie in cache key.
Diagnostic ladder (5 steps):
Step 1: Inspect Vary. Vary on cookies / UA fragment… (Vary cardinality, scene 8)
Step 2: Read response headers. no-store / no-cache forces re… (TTL / Cache-Control, scene 4)
Step 3: Check purge log. Recent purge cleared the work… (purge frequency, scene 7)
Step 4: Audit bypass rules. Cookie / query rule bypassing… (bypass rules, scene 10)
Step 5: Examine cache key. Cookie pinned into the cache … (cookie cache key, cookie footgun)
Legend: request hit ratio is the fraction of requests served from cache; byte hit ratio is the fraction of bytes, which large objects swing.
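To make the divergence between the two ratios concrete, here is a minimal sketch with made-up traffic: 98 small assets that hit and 2 large video segments that miss. The `LogLine` record, the `hit_ratios` helper, and the byte sizes are illustrative assumptions, not values from the dashboard.

```python
from collections import namedtuple

# Hypothetical log record: cache status plus response size in bytes.
LogLine = namedtuple("LogLine", ["cache", "bytes"])

def hit_ratios(logs):
    """Return (request_hit_ratio, byte_hit_ratio) for a list of LogLine."""
    hits = [l for l in logs if l.cache == "HIT"]
    total_bytes = sum(l.bytes for l in logs)
    request_ratio = len(hits) / len(logs) if logs else 0.0
    byte_ratio = sum(l.bytes for l in hits) / total_bytes if total_bytes else 0.0
    return request_ratio, byte_ratio

# Illustrative traffic: 98 small icons hit, 2 large video segments miss.
logs = [LogLine("HIT", 10_000)] * 98 + [LogLine("MISS", 5_000_000)] * 2

request_ratio, byte_ratio = hit_ratios(logs)
print(f"request hit ratio: {request_ratio:.1%}")  # 98.0%
print(f"byte hit ratio:    {byte_ratio:.1%}")     # 8.9%
```

Only 2% of requests miss, yet those two responses carry about 91% of the bytes, so the origin still ships almost all of the traffic volume.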
How do you tell if your CDN is earning its money? The first number you check is what fraction of requests it serves from its own cache vs forwarding to origin — that fraction is the **hit ratio**, and it splits into two: request hit ratio (what users feel) and byte hit ratio (what the CFO pays for). Healthy dashboard shown: request hit ratio 92%, byte hit ratio 88%, origin RPS low; the Sankey is mostly green-HIT and the thin red MISS share is spread evenly across the five buckets — no single failure mode dominates.
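One way to express "no single failure mode dominates" is to compute each bucket's share of misses and flag a bucket only when it exceeds a chosen threshold. The sketch below assumes every miss has already been labelled with one of the five buckets; the label strings, the helper names, and the 50% threshold are hypothetical choices for illustration.

```python
from collections import Counter

# Bucket names follow the diagram; how a miss gets its label is assumed.
BUCKETS = ["vary", "no_store", "purge", "bypass", "cookie_key"]

def miss_shares(miss_labels):
    """Return {bucket: share of all misses} for a list of bucket labels."""
    counts = Counter(miss_labels)
    total = sum(counts[b] for b in BUCKETS)
    return {b: (counts[b] / total if total else 0.0) for b in BUCKETS}

def dominant_bucket(shares, threshold=0.5):
    """Return the bucket whose share exceeds the threshold, else None."""
    bucket, share = max(shares.items(), key=lambda kv: kv[1])
    return bucket if share > threshold else None

# Healthy: misses spread evenly, 20% per bucket.
healthy = miss_shares(BUCKETS * 4)
# Incident: a Vary explosion swamps the other buckets.
incident = miss_shares(["vary"] * 16 + BUCKETS)

print(dominant_bucket(healthy))   # None
print(dominant_bucket(incident))  # vary
```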
Implementation
Dashboard.computeHitRatio
the two headline numbers, computed from edge log lines
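def computeHitRatio(edge_logs):
    """Return (request_hit_ratio, byte_hit_ratio) from parsed edge log lines.

    Each entry needs a `.cache` status ('HIT' or 'MISS') and a `.bytes` size.
    """
    edge_logs = list(edge_logs)  # iterated several times below
    hits = sum(1 for l in edge_logs if l.cache == 'HIT')
    misses = sum(1 for l in edge_logs if l.cache == 'MISS')
    requests = hits + misses
    request_hit_ratio = hits / requests if requests else 0.0

    bytes_from_cache = sum(l.bytes for l in edge_logs if l.cache == 'HIT')
    total_bytes = sum(l.bytes for l in edge_logs)
    byte_hit_ratio = bytes_from_cache / total_bytes if total_bytes else 0.0

    # The two ratios diverge when one class of large objects keeps missing.
    return request_hit_ratio, byte_hit_ratio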
Operator.diagnose
the 5-step ladder, ordered cheapest-investigation-first
def diagnose(metrics):
    # Walk the ladder top to bottom; the first threshold tripped names the suspect.
    if metrics.vary_cardinality_per_url > 5:
        return 'vary'      # one curl, read the Vary header
    if metrics.ttl_avg < 60 or metrics.no_store_pct > 0.1:
        return 'ttl'       # curl an asset, read Cache-Control
    if metrics.purge_rate_per_hour > 10:
        return 'purge'     # check CI logs for zone-wide purges
    if metrics.bypass_pct > 0.2:
        return 'bypass'    # audit /api/* style rules
    if metrics.cookie_in_key:
        return 'cookie'    # strip cookies for asset paths
    return 'investigate_origin'  # nothing on the ladder tripped
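As a usage sketch, the same ladder can be restated as an ordered table and run against a metrics snapshot. The thresholds and field names are copied from `diagnose`; the `LADDER` table, the `walk_ladder` name, and the snapshot values are hypothetical and only illustrate how the first tripped step wins.

```python
from types import SimpleNamespace

# Same thresholds as diagnose, restated as an ordered table so the
# ladder order is explicit. Field names mirror the metrics used there.
LADDER = [
    ("vary",   lambda m: m.vary_cardinality_per_url > 5),
    ("ttl",    lambda m: m.ttl_avg < 60 or m.no_store_pct > 0.1),
    ("purge",  lambda m: m.purge_rate_per_hour > 10),
    ("bypass", lambda m: m.bypass_pct > 0.2),
    ("cookie", lambda m: m.cookie_in_key),
]

def walk_ladder(metrics):
    """Return the first ladder step whose check trips, top to bottom."""
    for name, tripped in LADDER:
        if tripped(metrics):
            return name
    return "investigate_origin"

# Hypothetical snapshot: everything healthy except a CI job purging too often.
snapshot = SimpleNamespace(
    vary_cardinality_per_url=2,
    ttl_avg=3600,
    no_store_pct=0.01,
    purge_rate_per_hour=40,
    bypass_pct=0.05,
    cookie_in_key=False,
)
print(walk_ladder(snapshot))  # purge
```

Because the walk stops at the first match, an earlier step masks later ones: if Vary cardinality were also high in this snapshot, the result would be `vary` and the purge problem would only surface after that was fixed.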