Build a columnar OLAP store (ClickHouse / Druid style) (13 scenes)
Scene 11 · Materialized views are INSERT triggers
A column-store MV is not a refresh — it's a trigger over the incoming block. Direct INSERT into the target silently bypasses it and drifts.
Previously

Skip indexes prune at read time. The bigger lever is to PRE-COMPUTE the answer at write time — turn a billion-row scan into a two-hundred-row scan by materializing the aggregation as data lands.

Scene 11
Materialized views are INSERT triggers
Diagram
LEFT is the source table `events` — wide parts of raw rows. MIDDLE is the **materialized view** (a stored SELECT + a per-block trigger): it fires on every INSERT into the source and runs its SELECT against the INCOMING BLOCK only, not the whole table. RIGHT is the target rollup `events_by_country_hourly` — narrow pre-aggregated parts that the dashboard reads. The dashed bypass arrow at the bottom appears when something writes straight to the target without the MV running; the warn banner appears when the MV was attached AFTER data already existed in the source.
SOURCETARGET (rollup)eventsraw rows · wide partsINSERT block · 1.0M rowsMATERIALIZED VIEW · INSERT TRIGGERruns SELECT against incoming block ONLY:
CREATE MATERIALIZED VIEW events_by_country_mv
TO events_by_country_hourly AS
SELECT toStartOfHour(ts) AS hour,
       country,
       count() AS events
FROM events
GROUP BY hour, country;
events_by_country_hourlypre-aggregated · narrow partsINSERT → events fires MV → SELECT over incoming block → appends rollup → target.
An INSERT lands in `events`. The MV's SELECT runs against the just-arrived 1M-row block and appends ~5 rollup rows to `events_by_country_hourly`. The dashboard reads the rollup in 2 ms instead of scanning the source for 30 s.
Implementation
Source.on_insert_to_source
INSERT into source fires every attached MV against the incoming block
1def on_insert_to_source(source, block):
2 writePart(source, block) # immutable part on source
3 for mv in source.attached_mvs: # post-INSERT trigger
4 rollup = mv.select.run(block) # block-only, not whole table
5 writePart(mv.target, rollup) # appends to target rollup
Backfill.direct_insert_to_target
writes straight into the target — the trigger never fires
1def direct_insert_to_target(target, rows):
2 # writes a part on the TARGET table
3 writePart(target, rows)
4 # nothing inspects target.attached_mvs — MVs trigger off SOURCE
5 # rollup now contains rows the source has no record of
6 return # silent drift: target > source
MaterializedView.create
trigger attaches NOW — only future inserts pass through it
1def create_mv(source, select_stmt, target, populate=False):
2 mv = Mv(select_stmt, target)
3 source.attached_mvs.append(mv) # active from this moment on
4 if populate: # opt-in one-shot backfill
5 mv.select.run(scan(source)) >> writePart(target)
6 # rows already in `source` were inserted BEFORE mv existed —
7 # they never ran through select_stmt. Backfill by hand:
8 # INSERT INTO target SELECT ... FROM source WHERE ts < cutoff
9 return mv