Build a distributed logging stack (ELK / Loki) (12 scenes)
Scene 04 · String vs map — fields and labels
A log line is either a string parsed at read-time or a typed map parsed at write-time, and the two systems we'll meet attach different names to the same idea — fields in ELK, labels in Loki.
Previously

The agent survived the backend outage by spilling to disk — but every line it spilled was still just a string. Before we ask the backend to index this stuff, we have to decide whether a log line is text or a typed record.

Diagram
Left half (unstructured): the raw log line as a STRING; a regex cursor scrubs left-to-right looking for `user_id=42`, and a clock badge ticks the read-side cost, which every query pays again.
Right half (structured): the same line as a JSON MAP with named keys; `user_id` resolves via an O(1) hash lookup with no read clock, but a small write-side clock badge appears, because the JSON had to be marshalled at emit time.
Below (revealed at slider position 2): a 4-row, 3-column Rosetta-stone panel. Column 1 is the concept; column 2 is what ELK calls it; column 3 is what Loki calls it. Row 2 ("body") greys out the Loki cell with a `not in index` tag, which is a tease, not the topic.
UNSTRUCTURED · raw line · regex parse at read time · write fast · read slow
2026-05-09 12:00:01 ERROR [api] user 42 checkout failed
regex: /user_id=([^\s]+)/
STRUCTURED · JSON map · parse at write time
{"ts": "2026-05-09T12:0…", "level": "ERROR" ← label, "service": "api" ← label, "env": "prod" ← label, "user_id": "42", "event": "checkout_failed"}
Unstructured: the line is a STRING. The query 'find user 42' runs a regex over every character: the read clock ticks; the write …
One line, emitted twice. On the LEFT, it lands as an opaque string; the regex cursor scrubs across it looking for `user_id=42`, and the read-clock ticks — every character is work. On the RIGHT, the same line is a JSON map; `user_id` is a key, the hash lookup is instant, and the read-clock stays silent. Watch the asymmetry before we touch a slider.
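The `← label` markers in the structured panel hint at how the two systems divide the same record. As a rough sketch, assuming `level`, `service`, and `env` are the chosen labels (as marked in the diagram): an Elasticsearch document keeps every key as a field, while a Loki push payload carries the labels as the stream and the rest of the record as the line body. `to_elk_doc` and `to_loki_stream` are hypothetical helper names, the `stream` / `values` shape follows Loki's JSON push format, and the timestamp is a placeholder.

```python
import json

LABEL_KEYS = ("level", "service", "env")  # assumed label set, per the diagram

def to_elk_doc(record):
    """ELK view: every key in the record is a field of the document."""
    return dict(record)

def to_loki_stream(record, ts_ns):
    """Loki view: label keys name the stream; the rest travels as the line."""
    labels = {k: record[k] for k in LABEL_KEYS if k in record}
    body = {k: v for k, v in record.items() if k not in LABEL_KEYS}
    # Each value is a [timestamp-in-nanoseconds-as-string, line] pair.
    return {"stream": labels, "values": [[str(ts_ns), json.dumps(body)]]}

record = {
    "level": "ERROR", "service": "api", "env": "prod",
    "user_id": "42", "event": "checkout_failed",
}
print(to_elk_doc(record))
print(to_loki_stream(record, ts_ns=1700000000000000000))  # placeholder timestamp
```

In this sketch `user_id` stays queryable as a field on the ELK side, but on the Loki side it lives only inside the line body.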
Implementation
App.emit_unstructured
the line is built as a STRING — work deferred to read time
    import logging
    import re

    log = logging.getLogger("app")

    def handle_request(uid, path, code):
        # printf-style: format the values into one opaque string
        line = f"user_id={uid} path={path} status={code}"
        log.info(line)

    # later, every query pays the parse cost again:
    def query(term):
        # scan_log_files() is assumed to yield stored lines one at a time
        for line in scan_log_files():
            if re.search(term, line):  # regex runs over every stored line
                yield line
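To make the read-side cost concrete, here is a minimal, self-contained sketch of the scan above. The in-memory `LINES` list stands in for whatever `scan_log_files()` would read, and the `scan` helper with its `touched` counter is a hypothetical addition for illustration.

```python
import re

# Hypothetical stand-in for the files scan_log_files() would read.
LINES = [
    "user_id=42 path=/checkout status=500",
    "user_id=7 path=/cart status=200",
    "user_id=42 path=/cart status=200",
]

def scan(pattern, lines):
    """Return matching lines plus how many lines the regex had to touch."""
    compiled = re.compile(pattern)
    touched = 0
    hits = []
    for line in lines:
        touched += 1  # every stored line is examined on every query
        if compiled.search(line):
            hits.append(line)
    return hits, touched

hits, touched = scan(r"user_id=42\b", LINES)
print(len(hits), "hits after scanning", touched, "lines")
```

The number of lines touched grows with the size of the log, not with the number of matches, which is the cost the read clock in the diagram is counting.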
App.emit_structured
slog/zap-style: the line is a typed MAP — paid once at write
    import json
    import logging

    log = logging.getLogger("app")

    def handle_request(uid, path, code):
        # named slots, typed values: JSON-marshalled once, at emit time
        log.info(json.dumps({
            "event": "checkout_failed",
            "user_id": uid,
            "path": path,
            "status": code,
        }))

    # later, a query is a hash lookup on the key (O(1) on average);
    # index is assumed to be a prebuilt {key: {value: [lines]}} mapping
    def query(key, value):
        return index[key].get(value, [])
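The `index` that `query` reads from is not defined above. One way it could be built, as a sketch only: parse each JSON line once at ingest and file it under every key/value pair it carries. The `ingest` function and the nested `defaultdict` layout are assumptions for illustration, not a description of how Elasticsearch or Loki store data internally.

```python
import json
from collections import defaultdict

# index[key][value] -> list of records that carry that key/value pair.
# Assumes flat records whose values are hashable (strings, numbers).
index = defaultdict(lambda: defaultdict(list))

def ingest(raw_line):
    """Parse one JSON log line and file it under each of its fields."""
    record = json.loads(raw_line)  # the parse cost is paid here, once
    for key, value in record.items():
        index[key][value].append(record)
    return record

def query(key, value):
    # Two dict lookups, independent of how many lines were ingested.
    return index[key].get(value, [])

ingest(json.dumps({"event": "checkout_failed", "user_id": 42, "status": 500}))
ingest(json.dumps({"event": "cart_viewed", "user_id": 7, "status": 200}))

print(query("user_id", 42))
```

The trade is the one the diagram shows: `ingest` does extra work on every write so that `query` never has to rescan the stored lines.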