B+tree | CΛTΞИCOΔΞ

/biː-plʌs-triː/

noun — "optimized B-tree variant for database indexing."

B+ tree is an extension of the B-tree data structure designed to optimize range queries, sequential access, and storage utilization in database systems and file systems. Unlike standard B-trees, all actual data records in a B+ tree reside in the leaf nodes, while internal nodes store only keys for routing searches. Leaf nodes are linked sequentially, forming a linked list that enables efficient in-order traversal and range scans, which are critical for queries that retrieve multiple contiguous records.

Technically, a B+ tree of order m has internal nodes containing up to m–1 keys and m child pointers, just like a standard B-tree. Leaf nodes contain the actual data entries along with pointers to the next leaf node, providing a natural sequence for range-based operations. When insertion causes a leaf or internal node to overflow, the node splits, and the median key is promoted to the parent, maintaining balance and logarithmic search performance. Deletion may trigger node merging or redistribution to prevent underflow while preserving the tree’s structure.

In workflow terms, consider a relational database storing transaction records indexed by transaction_date. A query requesting all transactions between two dates benefits from the B+ tree structure: the search quickly navigates to the starting leaf node and then follows the linked leaf nodes sequentially, retrieving all matching records efficiently without additional tree traversal.

A simplified code example of a B+ tree leaf traversal for a range query in pseudocode:

function rangeQueryBPlusTree(startKey, endKey):
    node = searchLeaf(root, startKey)
    results = []
    while node is not null and node.keys[0] <= endKey:
        for i in 0..node.numKeys-1:
            if startKey <= node.keys[i] <= endKey:
                results.append(node.values[i])
        node = node.nextLeaf
    return results

This demonstrates how the sequential leaf linking allows contiguous data retrieval efficiently, a major advantage over standard B-trees for range queries and full scans.

B+ trees are widely used in modern databases (MySQL InnoDB, PostgreSQL), file systems (NTFS, ext4), and key-value stores, providing predictable O(logn) search, insertion, and deletion times while enabling high-performance sequential and range operations. Their separation of routing keys from data entries and linked leaf nodes makes them particularly suitable for disk-based storage, where minimizing disk I/O is critical.

Conceptually, a B+ tree is like a multi-tiered filing system where each internal folder contains only labels directing you to subfolders, and all actual documents are stored in sequentially linked filing cabinets at the bottom level, making both precise lookups and batch reads efficient.

See B-tree, Index, Database.

Structure

Organization

Database