Build a CDN (13 scenes)
Scene 10 · Bypass — when caching is wrong, the CDN still earns its keep
Auth and per-user routes must bypass the cache, and the CDN still pays for itself there: TLS termination at the POP, anycast routing, DDoS absorption.
Previously
Shield helps the cacheable traffic, but plenty of traffic is per-user. What does the CDN do for those routes? A route like /api/me must never be cached, because its response differs per user; the CDN instead routes it straight through to origin. That straight-through pattern is called bypass.
Diagram
Two parallel lanes run from the user through the POP to origin. The top lane is /static/logo.png: cacheable, so it hits the cache cell and returns in milliseconds. The bottom lane is /api/me: per-user and configured to bypass the cache, so the request visibly arcs around the cache box and proxies through to origin. The POP chips below the cache list what the POP still does on every request regardless of caching: TLS termination, anycast routing, DDoS filtering.
**bypass**: a configured route where the POP forwards every request to origin without consulting the cache (also called pass-through).
Sources
- [RFC] RFC 9111 §3.5: Storing Responses to Authenticated Requests
- [doc] MDN: Cache-Control (private vs public)
- [code] Sidekiq issue #5936: cookie leaking via CDN
- [code] HackerOne: ThisData insecure Cache-Control
- [doc] Cloudflare: Bypass cache on cookie
- [doc] Cloudflare: DDoS protection overview
- [blog] gbHackers: Cache deception attack
Even on the bypass lane (no cache lookup), the POP still does TLS termination, anycast routing, and DDoS filtering on every request. That's the half of the CDN's value learners forget.
Two requests in parallel: /static/logo.png hits cache and returns in 5 ms; /api/me bypasses cache and returns in 80 ms; both go through the same POP. Notice the chips on the POP — TLS termination, anycast routing, and DDoS filtering happen on EVERY request, not just cacheable ones.
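The two lanes depend on a rule that classifies each path before any cache lookup happens. The following is a minimal sketch of such a rule, assuming glob-style patterns; `BYPASS_PATTERNS` and `lane` are hypothetical names, and the pattern list is illustrative rather than taken from a real CDN configuration.

```python
from fnmatch import fnmatchcase

# Hypothetical rule list: illustrative patterns, not a real CDN config.
BYPASS_PATTERNS = ("/api/*", "/admin/*")


def matches_bypass_rule(path: str, patterns=BYPASS_PATTERNS) -> bool:
    """Return True when the path falls under any configured bypass pattern."""
    # fnmatchcase is case-sensitive, and '*' also matches '/' characters,
    # so "/api/*" covers nested paths such as "/api/users/7".
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def lane(path: str) -> str:
    """Name the lane from the diagram that a request path would take."""
    return "bypass" if matches_bypass_rule(path) else "cache"


print(lane("/static/logo.png"))  # cache
print(lane("/api/me"))           # bypass
```

Either lane still passes through the same POP, so TLS termination, anycast routing, and DDoS filtering apply to both.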
Implementation
POP.shouldCache
edge classifier — decide cache vs bypass per response
def shouldCache(req, response):
    # a missing Cache-Control header is treated as an empty directive list
    cc = parse(response.headers.get('Cache-Control', ''))
    # 'private' = shared caches MUST NOT store
    if 'private' in cc: return False
    if 'no-store' in cc: return False
    # explicit bypass rule (e.g. /api/*, /admin/*)
    if matches_bypass_rule(req.path): return False
    # 'public' overrides the auth-skip default
    if 'public' in cc: return True
    if 'Authorization' in req.headers: return False
    return cc.max_age is not None
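The classifier above leaves the Cache-Control parsing step undefined. A self-contained sketch of the same decision order might look like the following. The names `parse_cache_control` and `should_cache`, the prefix-based bypass list, and the use of plain case-sensitive dicts for headers are all assumptions made for illustration; the naive comma split also ignores quoted directive values that contain commas.

```python
def parse_cache_control(header: str) -> dict:
    """Parse a Cache-Control header into {directive: value or None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def should_cache(path, request_headers, response_headers,
                 bypass_prefixes=("/api/", "/admin/")) -> bool:
    """Decide whether a shared cache may store this response."""
    cc = parse_cache_control(response_headers.get("Cache-Control", ""))
    # 'private' and 'no-store' forbid storage in a shared cache.
    if "private" in cc or "no-store" in cc:
        return False
    # Explicit bypass rules win over any header the origin sends.
    if path.startswith(tuple(bypass_prefixes)):
        return False
    # 'public' allows storage even for authenticated requests.
    if "public" in cc:
        return True
    if "Authorization" in request_headers:
        return False
    return cc.get("max-age") is not None


print(should_cache("/static/logo.png", {}, {"Cache-Control": "public, max-age=3600"}))  # True
print(should_cache("/api/me", {}, {"Cache-Control": "public, max-age=60"}))             # False
```

Checking the bypass rule before `public` means a misconfigured origin header on /api/me cannot turn a per-user route into a cacheable one.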
POP.handle
what the POP still does on every request, cache or not
def handle(connection):
    # 1. terminate TLS at the edge, so the handshake avoids a round trip to origin
    req = tls.terminate(connection)
    # 2. anycast already routed the user to THIS POP via BGP
    # 3. volumetric DDoS filter runs before any app logic
    if ddos.shouldDrop(req): return
    # bypass routes never consult the cache: proxy straight to origin
    if matches_bypass_rule(req.path):
        return origin.fetch(req)
    cached = cache.lookup(cache_key(req))
    if cached and not cached.expired:
        return cached.response  # HIT
    response = origin.fetch(req)  # MISS path
    if shouldCache(req, response):
        cache.store(cache_key(req), response)
    return response
POP.cacheKey
the leak path — key is URL only unless you opt in to Vary
def cache_key(req, vary=()):
    # default: method + scheme + host + path + query
    parts = [req.method, req.url]
    # cookies / auth enter the key only via explicit Vary.
    # Vary is a response header, so the caller passes the header names
    # recorded from a previously stored response for this URL.
    for axis in vary:
        parts.append(req.headers.get(axis, ''))
    return hash(tuple(parts))  # a list is unhashable, so freeze it first

# misconfigured-public on /api/me:
#   user A -> store(key('/api/me'), A.profile)
#   user B -> lookup(key('/api/me')) -> A.profile   # LEAK
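The leak described in the comments can be reproduced with a toy in-memory cache. This is a sketch under simplifying assumptions: the cache is a plain dict, `serve` is a hypothetical helper that stands in for the lookup-then-fetch path, and the session cookie values are made up.

```python
def cache_key(method, url, request_headers, vary=()):
    """Build a key from method and URL, plus any request headers named in vary."""
    parts = [method, url]
    for name in vary:
        parts.append(request_headers.get(name, ""))
    return tuple(parts)


def serve(cache, url, request_headers, profile, vary=()):
    """Return the cached body if present; otherwise store this user's profile."""
    key = cache_key("GET", url, request_headers, vary)
    if key not in cache:
        cache[key] = profile  # stands in for an origin fetch followed by a store
    return cache[key]


alice = {"Cookie": "session=alice"}
bob = {"Cookie": "session=bob"}

# URL-only key: Bob receives the entry stored for Alice.
url_only = {}
serve(url_only, "/api/me", alice, "alice-profile")
leaked = serve(url_only, "/api/me", bob, "bob-profile")
print(leaked)  # alice-profile

# Key that includes the Cookie header: each user gets a separate entry.
keyed = {}
serve(keyed, "/api/me", alice, "alice-profile", vary=("Cookie",))
isolated = serve(keyed, "/api/me", bob, "bob-profile", vary=("Cookie",))
print(isolated)  # bob-profile
```

Keying on Cookie only separates the entries. For a per-user route such as /api/me, the approach this scene describes is still to bypass the cache entirely.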