Build a Service Mesh (Envoy / Istio style)
13 scenes · ~91 min · build the primitive

Every microservice request crosses two proxies. This curriculum covers what those proxies do: routing, load balancing, timeouts and retry budgets, circuit breakers, outlier detection, token-bucket rate limits, mTLS with workload identity, and a control plane that streams config to all of them. Build it in the order the production problems show up, and feel why Envoy plus a control plane has eaten the east-west world.

  1. 01
    Fifty services, fifty broken retry policies
    Every team picks its own retry, timeout, breaker, and mTLS library. One slow dependency turns into a fleet-wide outage.
    ~7 min
  2. 02
    The sidecar — one proxy per pod
    Put a small proxy next to every service. App talks to localhost; cross-service traffic flows sidecar to sidecar. The fleet of sidecars is the service mesh.
    ~7 min
  3. 03
    L4 vs L7 — bytes or requests
    An L4 proxy forwards opaque TCP bytes; an L7 proxy parses HTTP and can act on path, method, and headers. The mesh is L7 for everything that follows.
    ~7 min
  4. 04
    Listener and route — bind and match
    A listener accepts on a port; an ordered route table picks a destination on the first match. Reorder the rules and the same request lands somewhere else.
    ~7 min
  5. 05
    Cluster and load balancing — pick one of many
    Behind one destination name is a cluster of replicas; round-robin, least-request, or ring-hash decides who serves each request. Round-robin is the wrong default under heterogeneous latency.
    ~7 min
  6. 06
    Timeout and retry budget — bounded patience
    Naive multi-hop retries multiply at every hop: with, e.g., 3 attempts per hop across 5 hops, a failing backend sees 3^5 = 243x the load. A retry budget caps total retries as a fraction of normal traffic so retries can't become the outage.
    ~7 min
  7. 07
    Circuit breaker — the state machine
    Closed → open → half-open. Fast-fail to a known-broken dependency and periodically probe for recovery, so callers stop wasting resources on guaranteed failures.
    ~7 min
  8. 07a
    Outlier detection — eject one bad replica
    Don't trip the whole cluster; pull just the misbehaving replica from the pool. Passive detection (watching real 5xx responses) catches what active /healthz probes miss.
    ~7 min
  9. 08
    Rate limiting — the token bucket
    Per-client token bucket: each request takes a token; an empty bucket returns 429. Local limiting is cheap but drifts across sidecars; global limiting stays exact via a coordinator.
    ~7 min
  10. 09
    mTLS — identity for both sides
    TLS proves the server, mTLS proves both. The control plane mints short-lived certs that carry a stable workload identity — pod IPs are not identities.
    ~7 min
  11. 10
    Control plane and data plane — config over gRPC
    Sidecars (data plane) handle traffic; the control plane (Istiod) streams listener/route/cluster/cert config via xDS. Kill the control plane and traffic keeps flowing.
    ~7 min
  12. 11
    Trace and span — stitching one user request
    Every sidecar emits a span tagged with the trace id from `traceparent`. One hop that forgets to copy the header onto its outbound request breaks the trace silently; RED metrics keep flowing regardless.
    ~7 min
  13. 12
    Design canvas — configure the mesh
    Four workloads, every knob from the prior scenes. Each verifier note cites the scene that earned it.
    ~7 min
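
Scene 04's claim that reordering rules changes where a request lands can be sketched in a few lines. This is a minimal illustration, not Envoy's route matcher: the `Route` type, the prefix-only matching, and the cluster names `orders-v1` and `orders-v2` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    prefix: str   # path prefix to match (assumption: prefix matching only)
    cluster: str  # hypothetical destination name


def pick_cluster(routes: list[Route], path: str) -> Optional[str]:
    """Return the cluster of the first route whose prefix matches, else None."""
    for route in routes:
        if path.startswith(route.prefix):
            return route.cluster
    return None


# Same two rules, two orders.
specific_first = [Route("/api/v2", "orders-v2"), Route("/api", "orders-v1")]
catch_all_first = list(reversed(specific_first))
```

With the specific rule first, `/api/v2/items` reaches `orders-v2`; with the broad `/api` rule first, the same path is captured by `orders-v1` and the `/api/v2` rule is never consulted.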
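
Scene 06's 243x figure and the retry budget that bounds it can be worked through as follows. The sketch assumes the figure comes from 3 attempts at each of 5 hops (the outline does not spell this out), and the `RetryBudget` class, its `retry_percent` parameter, and the lack of a sliding time window are simplifications for illustration rather than any proxy's actual implementation.

```python
def worst_case_amplification(attempts_per_hop: int, hops: int) -> int:
    """Each hop multiplies the attempts made by the hop above it."""
    return attempts_per_hop ** hops


class RetryBudget:
    """Permit retries only up to a percentage of the requests seen so far.

    Simplification: counts are cumulative; a real budget would use a
    sliding window or a count of currently active requests.
    """

    def __init__(self, retry_percent: int):
        self.retry_percent = retry_percent
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def try_retry(self) -> bool:
        allowed = self.requests * self.retry_percent // 100
        if self.retries < allowed:
            self.retries += 1
            return True
        return False  # budget exhausted: fail instead of retrying


budget = RetryBudget(retry_percent=20)
for _ in range(100):
    budget.record_request()
granted = sum(budget.try_retry() for _ in range(100))
```

Under these assumptions the unbudgeted worst case is 3^5 = 243 attempts per user request, while the budgeted caller grants only 20 retries for 100 requests no matter how many calls fail.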
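
The closed, open, half-open cycle from scene 07 fits in a small state machine. This is a single-threaded sketch under stated assumptions: it trips on consecutive failures, admits exactly one probe while half-open, and takes an injectable `clock` so the timing can be tested. The names `failure_threshold` and `open_seconds` are hypothetical, not taken from Envoy or Istio configuration.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitBreaker:
    def __init__(self, failure_threshold=3, open_seconds=5.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.open_seconds = open_seconds
        self.clock = clock
        self.state = State.CLOSED
        self.failures = 0  # consecutive failures while closed
        self.opened_at = 0.0

    def allow_request(self) -> bool:
        if self.state is State.CLOSED:
            return True
        if self.state is State.OPEN:
            if self.clock() - self.opened_at >= self.open_seconds:
                self.state = State.HALF_OPEN  # admit a single probe
                return True
            return False  # fast-fail: dependency is known-broken
        return False  # HALF_OPEN: the probe is already in flight

    def record_success(self) -> None:
        self.failures = 0
        self.state = State.CLOSED

    def record_failure(self) -> None:
        self.failures += 1
        if self.state is State.HALF_OPEN or self.failures >= self.failure_threshold:
            self.state = State.OPEN
            self.opened_at = self.clock()
```

A caller checks `allow_request()` before each call and reports the outcome with `record_success()` or `record_failure()`; a failed probe reopens the breaker and restarts the wait.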
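
Scene 08's per-client token bucket, in its local (single-process) form, might look like the sketch below. The class names, the `capacity` and `refill_per_second` parameters, and the injectable `clock` are illustrative assumptions; a global limiter would replace the in-memory dictionary with a call to a shared coordinator.

```python
import time


class TokenBucket:
    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.tokens = float(capacity)  # start full
        self.last = clock()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Refill lazily based on elapsed time, never above capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """One bucket per client id; returns an HTTP status code."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self.buckets = {}

    def check(self, client_id: str) -> int:
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.refill_per_second, self.clock)
            self.buckets[client_id] = bucket
        return 200 if bucket.try_acquire() else 429
```

Because each sidecar would hold its own dictionary, the fleet-wide limit is only approximately enforced, which is the drift the scene description refers to.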