Build a B-tree storage engine (SQLite-style) (11 scenes)
Scene 06 · Indexes — a second B-tree
Each CREATE INDEX builds a second B-tree; reads do TWO descents, but every write must update every index.
Previously
We've covered the table B-tree end-to-end — one tree per table, growing on insert and not shrinking on delete. CREATE INDEX builds another one beside it, and that 'beside it' is where both reads and writes get more interesting.
Scene 06
Indexes — a second B-tree
Diagram
Two B-trees side by side: the table B-tree (rowid → row) on the left, the index B-tree on email (email → rowid) on the right. The dashed cross arrow shows that an index hit yields a rowid that must be re-looked-up in the table tree. The write-amplification meter below counts B-trees touched per INSERT.
SELECT WHERE email='alice@x' fires. Watch the cursor descend the email index, find rowid=42, then cross to the table tree and descend again to fetch the row.
Implementation
SELECT WHERE email = ?
two-tree read path: index descent, then table descent
1def select_by_email(email):2 # 1. Walk the index B-tree; leaf cells = (email, rowid).3 rowid = index_tree.find(email) # ~3 page reads4 if rowid is None:5 return None6 # 2. Walk the table B-tree by rowid to fetch the row.7 return table_tree.find(rowid) # ~3 more page reads
INSERT INTO users(...)
write path fans out across the table tree + every index
1def insert(row):2 # 1. Always: insert into the table B-tree (keyed by rowid).3 table_tree.insert(row.rowid, row)4 # 2. For EVERY secondary index, insert (col_value, rowid).5 for idx in secondary_indexes: # runs N times6 # index leaf cell = (col_value, rowid) — no payload.7 idx.tree.insert(8 key = (row[idx.col], row.rowid),9 payload = None,10 )11 # B-trees touched per INSERT = 1 + N (write amplification).
Tree.insert (per B-tree)
what every one of those N+1 trees runs on each INSERT
1def insert(key, payload):2 # 1. Descend interior pages to the target leaf.3 leaf = root4 while leaf.kind == 'interior':5 leaf = leaf.child_for(key) # ~log_B(rows) page reads6 # 2. Insert the new cell in key order on the leaf.7 leaf.insert_cell(key, payload)8 # 3. If the leaf overflows, split and propagate upward.9 if leaf.bytes_used > PAGE_SIZE:10 split_and_propagate(leaf) # may grow the tree