Build a gRPC-style RPC framework
14 scenes · ~98 min · build the primitive

Most microservices talk over RPC, and the framework you ship shapes many of the system's failure modes. Build an RPC framework with a codec, streams, deadlines, cancellation, retries, interceptors, and load-aware client-side balancing, and feel why gRPC ate the polyglot RPC market and why Thrift and JSON-over-HTTP linger.

  1. 01
    Calling a function on another machine
    A remote call is dressed up to look like a local one — but unlike a local call it can fail after the server already did the work, so the caller can't tell if it happened.
    ~7 min
  2. 02
    TCP gives you bytes, not messages
    TCP is a byte stream, not a message stream: two messages can arrive glued or split. A length prefix is what lets the receiver read exactly one message back.
    ~7 min
  3. 03
    The schema: field numbers, not field names
    A schema keys each field by a stable number, not its name — so an old reader can skip a field it doesn't know and still decode the rest. Never reuse a field number.
    ~7 min
  4. 04
    One RPC, one stream, one pipe for many
    HTTP/2 multiplexes many independent streams over one long-lived TCP connection; gRPC maps one RPC to one stream. A hundred calls share one pipe, not a hundred sockets.
    ~7 min
  5. 05
    Four call shapes from one stream
    Unary and the three streaming shapes are the same stream — only the number and direction of DATA frames differ. Streaming is a consequence of the transport, not a bolt-on.
    ~7 min
  6. 06
    Deadlines, not timeouts
    A timeout is relative and resets at each hop; a deadline is absolute and propagates the time remaining, so a downstream service never works for a caller that already gave up.
    ~7 min
  7. 06a
    Cancellation: an event, not a clock
    A deadline fires from a clock; cancellation fires from an event. Both ride one Context object down the chain, so aborting a parent stops all the doomed downstream work.
    ~7 min
  8. 07
    Retries: idempotency and a token budget
    Only an idempotent method is safe to auto-retry, and even then a token-bucket budget must cap retries — or a brownout turns into a self-sustaining retry storm.
    ~7 min
  9. 08
    Interceptors: the middleware onion
    An interceptor wraps every call as one composable layer, so auth, metrics, tracing, and the retry policy are written once around the handler instead of per method.
    ~7 min
  10. 09
    The L4 pinning trap: balance requests, not connections
    HTTP/2's one long-lived connection means an L4 load balancer pins every RPC on that connection to one backend. Client-side balancing discovers all backends and picks one per request.
    ~7 min
  11. 10
    Flow control: a slow reader slows the writer
    The receiver advertises a window of credit; the sender may only send DATA up to it. A slow reader stops granting credit, so the producer pauses instead of OOMing.
    ~7 min
  12. 10a
    Head-of-line blocking and the QUIC fix
    HTTP/2 fixed app-layer head-of-line blocking, but all streams share one in-order TCP pipe, so one lost packet stalls them all. QUIC orders delivery per stream rather than per connection, so a lost packet stalls only the streams whose data it carried.
    ~7 min
  13. 11
    mTLS and propagating identity
    TLS encrypts the channel and authenticates the server; mTLS authenticates both peers, each presenting a workload identity. The original caller's identity must propagate across hops, just like a deadline.
    ~7 min
  14. 12
    Design canvas: configure the RPC stack
    Pick codec, transport, deadline, retry, balancing, and security for four named workloads — each defended by the scene that taught it. gRPC inside the fleet, REST at the edge.
    ~7 min
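
The length prefix from scene 02 can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the 4-byte big-endian header and the names `frame` and `Deframer` are assumptions made for the example. (gRPC's own wire format also puts a one-byte compression flag in front of a four-byte length; that flag is left out here.)

```python
import struct

# Assumed layout for this sketch: a 4-byte big-endian length, then the payload.
HEADER = struct.Struct(">I")


def frame(payload: bytes) -> bytes:
    """Prefix one message with its length so the receiver can find its end."""
    return HEADER.pack(len(payload)) + payload


class Deframer:
    """Reassembles whole messages from arbitrarily split or glued chunks."""

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Add bytes from the socket; return every message now complete."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # the body has not fully arrived yet
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]  # keep any bytes of the next message
        return messages
```

Feeding `frame(b"hello") + frame(b"world!")` to a `Deframer` three bytes at a time yields the two messages intact, whatever the chunk boundaries were.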
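
Scene 06's distinction between timeouts and deadlines can be made concrete with a small sketch. It assumes that machines do not share a clock, so the wire carries the time remaining and each hop rebuilds an absolute deadline on its own monotonic clock. The names `Context`, `to_wire`, and `from_wire` are hypothetical, chosen for the example.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when work is attempted after the caller's deadline has passed."""


class Context:
    """Carries one absolute deadline, measured on the local clock."""

    def __init__(self, deadline, clock=time.monotonic):
        self.deadline = deadline
        self._clock = clock

    @classmethod
    def with_timeout(cls, seconds, clock=time.monotonic):
        # A relative timeout becomes an absolute deadline exactly once.
        return cls(clock() + seconds, clock)

    def remaining(self):
        return self.deadline - self._clock()

    def check(self):
        if self.remaining() <= 0:
            raise DeadlineExceeded("caller already gave up")


def to_wire(ctx):
    """Value to send downstream: seconds left, never a fresh timeout."""
    return max(ctx.remaining(), 0.0)


def from_wire(remaining_seconds, clock=time.monotonic):
    """Rebuild the deadline on the receiving hop's own clock."""
    return Context.with_timeout(remaining_seconds, clock)
```

If the first hop spends 0.4 s of a 1 s budget before calling downstream, the next hop starts with roughly 0.6 s, not a new full second.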
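
The retry rule from scene 07 combines two checks: the method must be idempotent, and a shared budget must still have tokens. The sketch below is one possible shape under assumed parameters (a failure costs one token, a success refunds a fraction, retries stop once the bucket falls to half or below); `RetryBudget` and `call_with_retries` are illustrative names, and backoff between attempts is omitted for brevity.

```python
class RetryBudget:
    """Token bucket shared by all calls to one backend (assumed policy)."""

    def __init__(self, max_tokens=10.0, token_ratio=0.1):
        self.max_tokens = max_tokens
        self.token_ratio = token_ratio
        self.tokens = max_tokens

    def on_success(self):
        # Successes refill the bucket slowly.
        self.tokens = min(self.max_tokens, self.tokens + self.token_ratio)

    def on_failure(self):
        # Every failure drains a whole token.
        self.tokens = max(0.0, self.tokens - 1.0)

    def can_retry(self):
        # During a brownout the bucket empties and retries stop.
        return self.tokens > self.max_tokens / 2


def call_with_retries(fn, *, idempotent, budget, max_attempts=3):
    """Call fn(); retry on ConnectionError only if it is safe and affordable."""
    attempt = 0
    while True:
        attempt += 1
        try:
            result = fn()
        except ConnectionError:
            budget.on_failure()
            if not idempotent or attempt >= max_attempts or not budget.can_retry():
                raise
            continue
        budget.on_success()
        return result
```

Because the budget is shared across calls, a backend that keeps failing quickly exhausts it, and later calls fail fast instead of multiplying load.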
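
The middleware onion from scene 08 reduces to function composition. In this sketch an interceptor is assumed to be a function that takes the next handler and returns a wrapped handler; `chain`, `logging_interceptor`, and `auth_interceptor` are hypothetical names, and the token check stands in for real authentication.

```python
def logging_interceptor(log):
    """Outer layer: records entry and exit around whatever it wraps."""
    def wrap(next_handler):
        def handle(request):
            log.append("log:before")
            try:
                return next_handler(request)
            finally:
                log.append("log:after")
        return handle
    return wrap


def auth_interceptor(log):
    """Inner layer: rejects the call before the handler ever runs."""
    def wrap(next_handler):
        def handle(request):
            log.append("auth:check")
            if request.get("token") != "secret":  # placeholder credential check
                raise PermissionError("unauthenticated")
            return next_handler(request)
        return handle
    return wrap


def chain(handler, *interceptors):
    """Wrap handler so the first interceptor listed is the outermost layer."""
    for interceptor in reversed(interceptors):
        handler = interceptor(handler)
    return handler
```

With `chain(handler, logging_interceptor(log), auth_interceptor(log))`, a request passes through logging, then auth, then the handler, and unwinds in the reverse order, so each concern is written once rather than per method.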