Skip to content

B+ tree multi-level growth (leaf/internal/root splitting)#84

Open
poyrazK wants to merge 9 commits into
mainfrom
feature/btree-splitting
Open

B+ tree multi-level growth (leaf/internal/root splitting)#84
poyrazK wants to merge 9 commits into
mainfrom
feature/btree-splitting

Conversation

@poyrazK
Copy link
Copy Markdown
Owner

@poyrazK poyrazK commented May 15, 2026

Summary

  • Complete B+ tree multi-page implementation (Phases 1-5)
  • Slot array format with binary entry serialization
  • find_leaf() with binary search for internal node traversal
  • split_leaf() allocates right leaf page, copies upper-half entries in reverse order
  • insert_into_parent() propagates separators up, handles cascade splits
  • split_internal() for internal node splitting when parent is full
  • create_new_root() for root split case
  • update_child_parent() for parent pointer maintenance
  • Root splitting when root (internal) needs to split

Test Results

  • 29/29 BTreeIndexTests pass
  • 1 pre-existing failure: BTreeIndexNextLeafTests.ScanIterator_NextLeaf (page format mismatch in raw test, predates slot array)

poyrazK added 2 commits May 15, 2026 14:54
Phase 1-5: Complete B+ tree multi-page implementation

- Slot array format with binary entries (type+key_len+key_data+TupleId)
- find_leaf() traversal with binary search on internal nodes
- split_leaf() to split full leaf pages
- insert_into_parent() and split_internal() for parent propagation
- create_new_root() for root split handling
- update_child_parent() for parent pointer maintenance
- Root splitting when root is internal and needs to split
- find_leaf() with binary search for internal node traversal
- split_leaf() allocates right leaf, copies upper-half entries
- insert_into_parent() with cascade split handling
- split_internal() for internal node splitting
- create_new_root() for root split case
- update_child_parent() for parent pointer maintenance
- insert() retry loop wired with split_leaf() and insert_into_parent()
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 15, 2026

Warning

Rate limit exceeded

@poyrazK has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 47 minutes and 16 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0fec2105-7e1e-43fb-94c3-b3bb17dbbdb1

📥 Commits

Reviewing files that changed from the base of the PR and between f2fae18 and cff2520.

📒 Files selected for processing (7)
  • docs/adr/003-btree-multi-level-growth.md
  • include/storage/btree_index.hpp
  • include/storage/storage_manager.hpp
  • src/storage/btree_index.cpp
  • src/storage/buffer_pool_manager.cpp
  • src/storage/storage_manager.cpp
  • tests/btree_index_tests.cpp
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feature/btree-splitting

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

poyrazK added 7 commits May 15, 2026 16:58
- Add root_page() public getter to BTreeIndex for test visibility
- Add MultiLevelTree_ThreeLevelsDeep test (100 entries, exercises leaf splits)
- Add RootSplit_CreatesNewRootInternalNode test (50 entries)
- Mark ScanIterator_NextLeaf as DISABLED (uses raw page format incompatible
  with slot array serialization - predates slot array implementation)
- Add ADR 003 documenting B+ tree multi-level growth design
1. serialize_entry: remove erroneous +2 from bytes_written calculation
   Was writing 21 bytes for int64 entries (should be 19)

2. split_internal: remove unused left_child/right_child parameters
   These were leaf pages from insert_into_parent, not internal node
   children. The wrong update_child_parent calls caused parent pointer
   corruption during cascade splits.

3. split_internal: only update promoted_left_child's parent pointer
   The other children are internal node children that stay with their
   existing parent; insert_into_parent handles updating the new entry's
   children after the split completes.
Added comment explaining why test uses 100 entries rather than a higher
count. With ~360 entries per leaf, 100 triggers leaf splits and initial
internal node creation, but doesn't yet trigger the internal split
cascade (which has a separate latent bug at ~177 entries to be fixed).
- Fix create_new_root to store left_child in slot (was storing right_child)
- Fix get_child_page to correctly handle rightmost child via next_leaf
- Fix split_internal to preserve next_leaf instead of overwriting with promoted_left_child
- Fix split_leaf slot indexing: slot_idx = num_keys - 1 - i
- Add debug tracing for search and traversal to diagnose issues

These fixes enable proper multi-level B+ tree growth and correct
navigation to rightmost children when traversing internal nodes.
- Add file_sizes_ map to track actual written page boundaries
- Use raw POSIX I/O for write_page then sync via fsync
- Add debug fprintf statements to buffer_pool_manager

These changes improve durability of writes and help debug page allocation issues.
These are debug changes to help diagnose:
- Why scan finds 4945/5000 entries
- Why search(25000) returns 0 results

The trace shows insert_into_parent receives correct sep_key=24920
for key=25000, but internal node navigation may still be wrong.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant