Build Kafka (13 scenes)
Scene 09 · Design canvas — pick the knobs
Capstone: apply scenes 2-8 to a fresh problem and articulate the trade-off you took.
Previously

You've seen every knob — log shape, partitions, replication, durability, compaction, leader epoch, rebalance, exactly-once. Time to assemble them for a real workload, and tilt the throughput / durability / simplicity triangle deliberately rather than by default.

Diagram
A budget canvas: pick partition count, replication factor, min.insync.replicas, acks policy, group strategy, and EOS on/off from the controls on the left. The center renders the resulting cluster topology: brokers, replicas, consumer assignments. The right side lights up warning chips when your choices conflict (e.g. acks=all + min.insync.replicas=1 + unclean.leader.election.enable=true is a silent loss path).
Click-stream pipeline · 100k events/sec
Two consumers: a real-time dashboard (latency-sensitive) and a daily warehouse loader (throughput-sensitive). Lost events lose revenue; downtime is unacceptable. Pick the knobs.
CLUSTER: RF=2 · partitions=3
B1 · 1P · p0
B2 · 1P · p1
B3 · 1P · p2
SINGLE GROUP (DASHBOARD + WAREHOUSE TOGETHER) · 3 consumers: c1 owns p0 · c2 owns p1 · c3 owns p2
EOS LAYER: OFF · PID (producer id) · Epoch (txn epoch) · Group-Gen (rebalance gen)
TRADE-OFFS: Throughput 1/5 · Durability 1/5 · Complexity 1/5
WARNINGS
acks=1: leader-only ack. A leader crash before fetch loses the write. Scene 4.
100k/sec across 3 partitions ≈ 33k/sec each: single-broker hotspot.
This is the architecture-interview question. The defaults — P=3, RF=2, MIR=1, acks=1, single group, EOS=off — are wrong on purpose. Read the warnings, watch the trade-off bars, and feel where the design is exposed before you start tuning.
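The warning chips can be approximated as a small rule check over the chosen knobs. The sketch below is illustrative only: the `Knobs` class, the `warnings_for` function, and the 10k events/sec per-partition budget are hypothetical names and numbers, and the default for unclean leader election is assumed to be off because the scene does not state it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Knobs:
    partitions: int
    replication_factor: int
    min_insync_replicas: int
    acks: str                 # "0", "1", or "all"
    unclean_election: bool
    events_per_sec: int


# Hypothetical per-partition budget; a real limit depends on hardware and message size.
PARTITION_BUDGET_PER_SEC = 10_000


def warnings_for(k: Knobs) -> list[str]:
    """Return one human-readable warning per conflicting or risky choice."""
    out = []
    if k.acks != "all":
        out.append(f"acks={k.acks}: followers do not confirm the write, "
                   "so a leader crash can lose it")
    if k.acks == "all" and k.min_insync_replicas == 1 and k.unclean_election:
        out.append("silent loss path: acks=all with MIR=1 and unclean election on")
    if k.min_insync_replicas > k.replication_factor:
        out.append("MIR exceeds RF: acks=all writes can never succeed")
    per_partition = k.events_per_sec / k.partitions
    if per_partition > PARTITION_BUDGET_PER_SEC:
        out.append(f"hotspot: {per_partition:.0f} events/sec per partition")
    return out


# The scene's deliberately wrong defaults: P=3, RF=2, MIR=1, acks=1.
defaults = Knobs(partitions=3, replication_factor=2, min_insync_replicas=1,
                 acks="1", unclean_election=False, events_per_sec=100_000)
for w in warnings_for(defaults):
    print(w)
```

Under these assumptions the defaults trip two rules, the acks one and the per-partition load one, which lines up with the two warnings shown on the canvas.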
Implementation
producer.properties
producer-side dials — durability contract & retry shape
bootstrap.servers=broker-1:9092,broker-2:9092
# The acks dial. Pick one; a later duplicate key overrides an earlier one.
# acks=0    fire-and-forget
# acks=1    leader-only ack
# acks=all  all in-ISR replicas (required when idempotence is enabled)
acks=all

# PID + seq# dedupe retries
enable.idempotence=true
# cross-session epoch fencing
transactional.id=clickstream-tx-1

# batch-level; cheap throughput
compression.type=lz4
# fill batches before sending
linger.ms=20
max.in.flight.requests.per.connection=5
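To make the "PID + seq# dedupe" comment concrete, here is a toy model of what a partition leader does with retried writes. It is a simplified sketch, not the broker's code: real Kafka tracks sequence numbers per batch and per partition, while the hypothetical `PartitionLog` class below tracks them per record.

```python
class PartitionLog:
    """Toy partition leader that dedupes by (producer id, sequence number)."""

    def __init__(self):
        self.records = []
        self.last_seq = {}   # producer id -> last appended sequence number

    def append(self, pid: int, seq: int, value: str) -> str:
        last = self.last_seq.get(pid, -1)
        if seq <= last:
            # Retry of a write that already landed: acknowledge, do not append again.
            return "duplicate"
        if seq != last + 1:
            # A gap means an earlier write is missing: reject to preserve order.
            return "out_of_order"
        self.records.append(value)
        self.last_seq[pid] = seq
        return "appended"


log = PartitionLog()
print(log.append(pid=7, seq=0, value="click-a"))  # appended
print(log.append(pid=7, seq=1, value="click-b"))  # appended
print(log.append(pid=7, seq=1, value="click-b"))  # duplicate (retry after a lost ack)
print(log.append(pid=7, seq=3, value="click-d"))  # out_of_order (seq 2 never arrived)
print(log.records)                                # ['click-a', 'click-b']
```

The retry is acknowledged without a second append, which is why a producer can retry aggressively without creating duplicates in the log.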
server.properties (broker)
broker-side dials — replica policy & election safety
broker.id=1
log.dirs=/var/lib/kafka/data

# <RF> and <MIR> are placeholders for the values chosen on the canvas
default.replication.factor=<RF>
# floor on the ISR; rule of thumb: MIR = RF - 1
min.insync.replicas=<MIR>
# prefer unavailability over silent loss
unclean.leader.election.enable=false

# ISR eviction threshold
replica.lag.time.max.ms=30000
# 7d default; per-topic overrides
log.retention.ms=604800000
# not compact: events expire
log.cleanup.policy=delete
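The interaction between acks, the ISR, and min.insync.replicas reduces to a small predicate. The sketch below is a simplified model with hypothetical function names (`can_accept_write`, `tolerated_isr_losses`); it ignores timeouts and leader election and only captures the rule that acks=all writes are rejected once the ISR shrinks below MIR.

```python
def can_accept_write(acks: str, isr_size: int, min_insync_replicas: int) -> bool:
    """Would the leader accept a produce request right now?"""
    if isr_size < 1:
        return False  # no in-sync leader: the partition is offline
    if acks == "all":
        # The floor is only enforced for acks=all producers.
        return isr_size >= min_insync_replicas
    return True  # acks=0 and acks=1 only need a live leader


def tolerated_isr_losses(rf: int, mir: int) -> int:
    """Replicas that can fall out of the ISR before acks=all writes are rejected."""
    return rf - mir


# Rule of thumb from the config above: MIR = RF - 1, here RF=3 and MIR=2.
for isr in (3, 2, 1):
    print(isr, can_accept_write("all", isr, 2))
print(tolerated_isr_losses(rf=3, mir=2))
```

With RF=3 and MIR=2, one replica can drop out and writes continue; a second loss makes the partition read-only for acks=all producers, trading availability for durability.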
kafka-topics --create (per-topic)
per-topic overrides — what the brief actually demands
kafka-topics.sh --bootstrap-server broker-1:9092 \
  --create --topic clickstream \
  --partitions <P> --replication-factor <RF> \
  --config min.insync.replicas=<MIR> \
  --config cleanup.policy=delete \
  --config retention.ms=604800000 \
  --config compression.type=producer \
  --config segment.ms=3600000

# downstream consumer config (warehouse / dashboard):
# return only committed transactional records; records from aborted transactions are skipped
isolation.level=read_committed
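What read_committed means for a consumer can be illustrated with a toy filter over a log. This is a sketch under simplifying assumptions: the `Record` class and `read_committed` function are hypothetical, and transaction state is passed in as a plain dictionary rather than being derived from the control markers a real broker writes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Record:
    offset: int
    value: str
    txn: Optional[str] = None   # None means a non-transactional write


def read_committed(log, txn_state):
    """txn_state maps a transaction id to "committed", "aborted", or "open"."""
    visible = []
    for rec in log:
        state = "committed" if rec.txn is None else txn_state[rec.txn]
        if state == "open":
            # Nothing at or past the first record of an open transaction is
            # exposed yet (the last stable offset).
            break
        if state == "committed":
            visible.append(rec.value)
        # Records from aborted transactions are skipped.
    return visible


log = [
    Record(0, "a", txn="t1"),
    Record(1, "b", txn="t2"),
    Record(2, "c"),
    Record(3, "d", txn="t3"),
    Record(4, "e"),
]
state = {"t1": "committed", "t2": "aborted", "t3": "open"}
print(read_committed(log, state))  # ['a', 'c']
```

Note that record "e" is held back even though it is non-transactional: an open transaction ahead of it delays everything behind it, which matters for the latency-sensitive dashboard consumer.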