Build a B-tree storage engine (SQLite-style) (11 scenes)
Scene 01 · Find the row — why a file scan won't do
A SQL engine has to answer point, range, and indexed lookups against ONE file — the only family of structures that can do all three is an ordered tree.
Scene 01
Find the row — why a file scan won't do
Diagram
On the left, **users.db** — one file holding every **row** of the **users table**, drawn as scattered colored cells. Each cell's badge is the row's **primary key (PK)** — its rowid. On the right, three queries fire against the file: a **point** lookup by id, a **range** scan over an id band, and an **indexed-column** lookup by email. The cost meter underneath reports rows scanned per query.
users.db — one file, every row of the users table
point: one row by PK
Three queries fire one after another — point, range, indexed. Watch the cursor walk every row in the file for each one. The counter tells you the cost.
Implementation
select_row(file, target_id)
unsorted users.db: every query walks every cell
1def select_row(file, target_id):2 # no order to exploit — read the cells in disk order3 for cell in file.cells: # O(N) page reads4 if cell.rowid == target_id:5 return cell.row # point hit (still scanned)6 return None # range / indexed: full file
insert_row_into_sorted(file, row)
sorted-flat users.db: log-N to find, but N to make room
1def insert_row_into_sorted(file, row):2 # file is kept in PK order on disk3 pos = bisect_left(file.cells, row.rowid) # O(log N)4 # make room at pos by sliding the tail one slot right5 for i in range(len(file.cells) - 1, pos - 1, -1):6 file.cells[i + 1] = file.cells[i] # O(N) writes7 file.cells[pos] = row8 file.length += 1