Build a Prometheus-style time-series database (12 scenes)
Scene 04 · XOR crushes adjacent floats
Adjacent float64s share most of their IEEE-754 bits. XOR them and the leftover bits are tiny — an unchanged value costs 1 bit.
Previously

Timestamps shrank because they follow a regular clock. Floats from the same series have their own kind of structure: the bits, not the values, are nearly stationary.

Diagram
Top: two float pills, prev and curr. Middle: their IEEE-754 64-bit representations stacked, with bits XOR-ing into a third row — leading and trailing zeros render gray, meaningful bits in the middle render emerald. Bottom: the encoded output bits — `0` for unchanged, `10` for a reused window, `11` + 5-bit leading + 6-bit length + meaningful bits for a fresh window.
Figure (interactive): XOR of two float64 samples, with the exponent (11 bits) and mantissa (52 bits) marked. Opening frame: prev = 42.0 and curr = 42.0, so prev XOR curr is 64 zero bits and the encoded output is the single control bit `0`. Three pairs follow; the XOR row cancels almost everything when prev ≈ curr. Counters: naive cost 64 bits / point, encoded so far 0 bits.
Two adjacent scrapes of the same gauge usually report nearly identical numbers. XOR their bit patterns and most bits cancel, leaving a short run of differing bits between long runs of zeros, which is cheap to encode. This is XOR encoding. Watch three pairs walk through it.
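To see the cancellation outside the diagram, here is a minimal Python sketch. The helper names `float64_bits` and `xor_bits` and the three sample pairs are illustrative assumptions, not necessarily the pairs the diagram animates.

```python
import struct

def float64_bits(x: float) -> int:
    """Reinterpret a Python float as its 64-bit IEEE-754 pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def xor_bits(prev: float, curr: float) -> str:
    """Return prev XOR curr as a 64-character binary string."""
    return format(float64_bits(prev) ^ float64_bits(curr), "064b")

# Hypothetical sample pairs: unchanged, small change, slightly larger change.
pairs = [(42.0, 42.0), (42.0, 42.5), (42.0, 43.0)]
for prev, curr in pairs:
    print(f"{prev} ^ {curr} = {xor_bits(prev, curr)}")
```

For 42.0 and 42.5 the XOR has a single 1 at bit 46 (counting from the least significant bit); every other position cancels.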
Implementation
Encoder.encodeXorValue
dispatch on the XOR; one of three control prefixes
def encodeXorValue(prev, curr, prev_window):
    xor = float64_bits(prev) ^ float64_bits(curr)
    if xor == 0:
        write_bit(0)                      # control '0': value unchanged
        return prev_window
    lead, length, body = meaningfulBits(xor)
    if fits_in(prev_window, lead, length):
        write_bits(0b10, 2)               # control '10': reuse previous window
        # Align to the previous window's trailing zeros, not the current ones,
        # because the decoder shifts back by prev_window.trail.
        write_bits(xor >> prev_window.trail, prev_window.length)
        return prev_window
    write_bits(0b11, 2)                   # control '11': fresh window
    write_bits(lead, 5)                   # 5-bit leading-zero count
    write_bits(length & 63, 6)            # 6-bit length; 64 does not fit, so it is stored as 0
    write_bits(body, length)
    return Window(lead, length)           # implies trail = 64 - lead - length
Encoder.meaningfulBits
trim leading and trailing zeros from the XOR
def meaningfulBits(xor):
    # adjacent float64s share sign/exponent/high mantissa,
    # so xor has long runs of zeros at both ends.
    lead = min(clz64(xor), 31)            # leading zeros, clamped to fit the 5-bit field
    trail = ctz64(xor)                    # count trailing zeros
    length = 64 - lead - trail            # meaningful bit count
    body = (xor >> trail) & ((1 << length) - 1)
    return (lead, length, body)
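The pseudocode leaves `clz64` and `ctz64` undefined. A minimal Python sketch, assuming arbitrary-precision ints restricted to nonzero 64-bit values, can build both on `int.bit_length`. The snake_case name `meaningful_bits` is hypothetical, and clamping the leading-zero count to 31 is an assumption made here so that the count fits the 5-bit field.

```python
from typing import Tuple

def clz64(x: int) -> int:
    """Leading zeros of a nonzero value viewed as 64 bits."""
    return 64 - x.bit_length()

def ctz64(x: int) -> int:
    """Trailing zeros of a nonzero value: isolate the lowest set bit."""
    return (x & -x).bit_length() - 1

def meaningful_bits(xor: int) -> Tuple[int, int, int]:
    """Return (lead, length, body) for a nonzero 64-bit XOR."""
    lead = min(clz64(xor), 31)   # assumption: clamp so lead fits in 5 bits
    trail = ctz64(xor)
    length = 64 - lead - trail   # bits between the two zero runs
    body = xor >> trail          # meaningful bits, right-aligned
    return lead, length, body

# 42.5 XOR 42.25 differs only in mantissa bits 46 and 45.
print(meaningful_bits(0x4045400000000000 ^ 0x4045200000000000))  # (17, 2, 3)
```

When the difference sits in the low mantissa bits, the true leading-zero count exceeds 31. The clamp then widens the body with a few extra zero bits instead of overflowing the header field.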
Decoder.decodeXorValue
symmetric reader; control prefix selects the path
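def decodeXorValue(prev, prev_window):
    if read_bit() == 0:                   # control '0': value unchanged
        return prev, prev_window
    if read_bit() == 0:                   # control '10': reuse previous window
        body = read_bits(prev_window.length)
        xor = body << prev_window.trail
        return from_bits(float64_bits(prev) ^ xor), prev_window
    lead = read_bits(5)                   # control '11': fresh window
    length = read_bits(6) or 64           # a stored 0 stands for 64; a zero-length body never occurs
    body = read_bits(length)
    xor = body << (64 - lead - length)
    # Hand the new window back so a later '10' sample can reuse it.
    return from_bits(float64_bits(prev) ^ xor), Window(lead, length)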
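To check that the three control paths round-trip, the following is a self-contained Python sketch of the scheme above. The `BitBuffer` class, the `(lead, trail)` tuple used as the window, the `encode`/`decode` names, and the sample values are all hypothetical. The first sample is assumed to be stored raw and is not counted in the encoded size.

```python
import struct

def float64_bits(x):
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def from_bits(b):
    return struct.unpack(">d", struct.pack(">Q", b))[0]

class BitBuffer:
    """Toy bit stream: a list of 0/1 ints plus a read cursor."""
    def __init__(self):
        self.bits, self.pos = [], 0

    def write_bits(self, value, n):
        for i in range(n - 1, -1, -1):       # most significant bit first
            self.bits.append((value >> i) & 1)

    def read_bits(self, n):
        value = 0
        for _ in range(n):
            value = (value << 1) | self.bits[self.pos]
            self.pos += 1
        return value

def encode(buf, prev, curr, window):
    """Append curr to buf; window is (lead, trail) or None."""
    xor = float64_bits(prev) ^ float64_bits(curr)
    if xor == 0:
        buf.write_bits(0, 1)                 # control '0'
        return window
    lead = min(64 - xor.bit_length(), 31)    # clamp to the 5-bit field
    trail = (xor & -xor).bit_length() - 1
    if window is not None and lead >= window[0] and trail >= window[1]:
        buf.write_bits(0b10, 2)              # control '10': reuse window
        buf.write_bits(xor >> window[1], 64 - window[0] - window[1])
        return window
    length = 64 - lead - trail
    buf.write_bits(0b11, 2)                  # control '11': fresh window
    buf.write_bits(lead, 5)
    buf.write_bits(length & 63, 6)           # 64 is stored as 0
    buf.write_bits(xor >> trail, length)
    return (lead, trail)

def decode(buf, prev, window):
    """Read one value from buf; returns (value, window)."""
    if buf.read_bits(1) == 0:
        return prev, window
    if buf.read_bits(1) == 1:                # fresh window header
        lead = buf.read_bits(5)
        length = buf.read_bits(6) or 64
        window = (lead, 64 - lead - length)
    lead, trail = window
    body = buf.read_bits(64 - lead - trail)
    return from_bits(float64_bits(prev) ^ (body << trail)), window

samples = [42.0, 42.0, 42.5, 42.25, 42.5, -1.5]   # hypothetical gauge readings
buf, window = BitBuffer(), None
for prev, curr in zip(samples, samples[1:]):
    window = encode(buf, prev, curr, window)
encoded_len = len(buf.bits)

decoded, window = [samples[0]], None
for _ in samples[1:]:
    value, window = decode(buf, decoded[-1], window)
    decoded.append(value)
print(decoded == samples, encoded_len, "bits vs", 64 * (len(samples) - 1))
```

The sketch keeps the window as `(lead, trail)` rather than `(lead, length)` so that neither side has to recompute the shift. Either representation carries the same information, since lead + length + trail = 64.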