Build a columnar OLAP store (ClickHouse / Druid style) (13 scenes)
Scene 13 · Design canvas — pick a workload, ship a schema
Every knob — sort key, partition, skip index, MV, denormalization — is a workload-shaped choice. The right observability schema is wrong for BI.
Previously

We've now seen every lever — layout, encoding, execution, parts, merge, sort key, sparse index, skip index, MVs, denormalization. The last move is to face a real workload and pick the configuration end-to-end.

Scene 13
Design canvas: pick a workload, ship a schema
Diagram
Top: WORKLOAD ARCHETYPE cards — a workload archetype is the recurring shape of writes + queries that dictates which knobs matter (observability / BI / event analytics). Middle: the engine config you'd ship — **sort key** (Scene 8), **partition** (Scene 6), **skip indexes** (Scene 10), **materialized view** rollups (Scene 11), **denormalization** vs dictGet (Scene 12), shards × replicas. Right: live simulation — insert rate, **parts** vs threshold (Scene 6), **query p99** (Scene 8/9), merge queue depth, MV drift. Bottom: status banner — green / yellow / red — with a chip naming the scene whose insight diagnoses the failure (and 'wrong tool' when the archetype doesn't fit at all).
WORKLOADACTIVEObservability / metricscardhigh · per-service tagswriteappend-heavy, time-orderedquerytime range × service filterBI dashboardscardmedium · dim joinswritenightly batch + tricklequerygroup-by, top-N, p95Event analytics / clickstreamcardvery high · user_id, urlwritestreaming event firehosequeryfunnels, retention, cohortENGINE CONFIGsort key(service, ts)partitionmonthlyskip indexeshost:set(100)trace_id:bloom_filterMV rollupsper-minute service latency (AggregatingMergeTree)denormalizationOFF · joined at queryshards × replicas4 shards × 2 replicasSIMULATIONinsert rate195.0k rows/sparts vs threshold64 / 300query p99180 msmerge queue2 pendingMV drift0%HEALTHYall meters below thresholds — design fits the workloadEvery knob traces back to a named earlier scene.
Workload A — Observability. Defaults are pre-loaded: ORDER BY (service, ts), partition monthly. The verifier walks green and the chips trace every knob back to an earlier scene. This is what twelve scenes of column-store thinking cash out to.
Implementation
Stop A — Observability schema
(service, ts) prefix · monthly partition · skip indexes on host/trace_id
1CREATE TABLE events (
2 service LowCardinality(String),
3 host LowCardinality(String),
4 ts DateTime CODEC(Delta, ZSTD),
5 trace_id String,
6 metric Float64,
7 INDEX idx_host host TYPE set(100) GRANULARITY 4,
8 INDEX idx_trace trace_id TYPE bloom_filter GRANULARITY 4
9) ENGINE = MergeTree
10ORDER BY (service, ts)
11PARTITION BY toYYYYMM(ts)
12SETTINGS index_granularity = 8192;
Stop B — BI dashboards schema
fact table + AggregatingMergeTree MV · LowCardinality dim columns
1CREATE TABLE events (
2 service LowCardinality(String),
3 region LowCardinality(String),
4 ts DateTime CODEC(Delta, ZSTD),
5 customer_id UInt64,
6 revenue Decimal(18, 2)
7) ENGINE = MergeTree
8ORDER BY (service, ts)
9PARTITION BY toYYYYMM(ts);
10
11CREATE MATERIALIZED VIEW daily_revenue_mv
12ENGINE = AggregatingMergeTree
13ORDER BY (toDate(ts), region) AS
14SELECT toDate(ts) AS d, region,
15 sumState(revenue) AS revenue_state
16FROM events GROUP BY d, region;
Stop D — Anti-pattern: per-row UPDATE
verifier REFUSES — every UPDATE rewrites a whole part
1CREATE TABLE inventory (
2 sku_id UInt64,
3 qty Int32,
4 price Decimal(18, 2),
5 updated DateTime
6) ENGINE = MergeTree
7ORDER BY (sku_id)
8PARTITION BY toYYYYMM(updated);
9
10-- Per-row UPDATE arrives from the OLTP app:
11ALTER TABLE inventory UPDATE qty = qty - 1
12 WHERE sku_id = 42; -- rewrites every part touching `qty`
13
14-- Verifier REFUSES: parts are immutable; one new part per write.
15-- Merges cannot keep up → 'too many parts' fires in minutes.
16-- Right answer: use Postgres, or stage in PG and CDC to ClickHouse.
Wrong knobs — yellow / red on stops A & B
leading with ts → useless sparse index · hourly → too many parts
1CREATE TABLE events_bad (
2 service LowCardinality(String),
3 ts DateTime,
4 metric Float64
5) ENGINE = MergeTree
6ORDER BY (ts)
7-- Leading with ts: every WHERE service=... scans every
8-- granule; sparse index prunes nothing.
9PARTITION BY toStartOfHour(ts);
10-- Hourly × shards → tiny-part explosion within minutes;
11-- INSERTs throw 'too many parts', merges never catch up.