Introduction to B Trees
B Trees are a type of self-balancing search tree in which nodes can have multiple children. They are designed to work well on systems that read and write large blocks of data, such as databases and file systems. B Trees maintain sorted data and allow searches, sequential access, insertions, and deletions in logarithmic time.
Key Characteristics of B Trees
- Balanced Tree: B Trees are balanced, meaning that the path from the root to any leaf is of the same length.
- Multiple Children: Each node can have more than two children.
- Sorted Order: The keys within a node are sorted.
- Node Capacity: Each node can contain a certain number of keys, typically defined by a parameter called the minimum degree (t).
Structure of a B Tree
A B Tree of minimum degree t
has the following properties:
- Every node has at most
2t - 1
keys. - Every node (except root) has at least
t - 1
keys. - The root has at least one key.
- All leaves appear on the same level.
- A non-leaf node with
k
keys hask + 1
children.
Example of a B Tree
Consider a B Tree of minimum degree t = 3
:
Basic Operations on B Trees
- Search
Searching in a B Tree is similar to searching in a binary search tree but involves more comparisons due to multiple keys per node.
Algorithm:
- Start at the root node.
- Perform a binary search within the node.
- If the key is found, return it.
- If the key is not found and the node is a leaf, return null.
- If the key is not found and the node is not a leaf, recursively search the appropriate child.
Example:
class BTreeNode: def __init__(self, t, leaf=False): self.t = t # Minimum degree self.leaf = leaf self.keys = [] self.children = [] def search(node, key): i = 0 while i < len(node.keys) and key > node.keys[i]: i += 1 if i < len(node.keys) and key == node.keys[i]: return node if node.leaf: return None return search(node.children[i], key)
- Insertion
Insertion in a B Tree involves finding the appropriate leaf node and inserting the key while maintaining the B Tree properties. If a node overflows (i.e., exceeds 2t - 1
keys), it is split.
Algorithm:
- Start at the root.
- If the root is full, split it and adjust the tree.
- Traverse to the appropriate child node.
- Insert the key in the appropriate position in the leaf node.
- If the leaf node overflows, split it and propagate the split upwards if necessary.
Example:
def insert_non_full(node, key): i = len(node.keys) - 1 if node.leaf: node.keys.append(0) while i >= 0 and key < node.keys[i]: node.keys[i + 1] = node.keys[i] i -= 1 node.keys[i + 1] = key else: while i >= 0 and key < node.keys[i]: i -= 1 i += 1 if len(node.children[i].keys) == 2 * node.t - 1: split_child(node, i) if key > node.keys[i]: i += 1 insert_non_full(node.children[i], key) def split_child(parent, i): t = parent.children[i].t new_node = BTreeNode(t, parent.children[i].leaf) parent.children[i].keys, new_node.keys = parent.children[i].keys[:t-1], parent.children[i].keys[t:] if not parent.children[i].leaf: parent.children[i].children, new_node.children = parent.children[i].children[:t], parent.children[i].children[t:] parent.children.insert(i + 1, new_node) parent.keys.insert(i, parent.children[i].keys.pop(t-1))
- Deletion
Deletion in a B Tree is more complex than insertion and involves several cases to ensure the tree remains balanced.
Algorithm:
- If the key is in a leaf node, remove it directly.
- If the key is in an internal node:
- If the child preceding the key has at least
t
keys, replace the key with the predecessor. - If the child following the key has at least
t
keys, replace the key with the successor. - If both children have fewer than
t
keys, merge the key and the two children.
- If the child preceding the key has at least
- If the key is not in the node, ensure the appropriate child has at least
t
keys before descending.
Example:
def delete(node, key): t = node.t i = 0 while i < len(node.keys) and key > node.keys[i]: i += 1 if i < len(node.keys) and key == node.keys[i]: if node.leaf: node.keys.pop(i) else: if len(node.children[i].keys) >= t: node.keys[i] = get_predecessor(node, i) delete(node.children[i], node.keys[i]) elif len(node.children[i + 1].keys) >= t: node.keys[i] = get_successor(node, i) delete(node.children[i + 1], node.keys[i]) else: merge(node, i) delete(node.children[i], key) else: if node.leaf: return if len(node.children[i].keys) < t: fill(node, i) delete(node.children[i], key) def get_predecessor(node, i): current = node.children[i] while not current.leaf: current = current.children[-1] return current.keys[-1] def get_successor(node, i): current = node.children[i + 1] while not current.leaf: current = current.children[0] return current.keys[0] def merge(node, i): child = node.children[i] sibling = node.children[i + 1] child.keys.append(node.keys.pop(i)) child.keys.extend(sibling.keys) if not child.leaf: child.children.extend(sibling.children) node.children.pop(i + 1) def fill(node, i): t = node.t if i != 0 and len(node.children[i - 1].keys) >= t: borrow_from_prev(node, i) elif i != len(node.children) - 1 and len(node.children[i + 1].keys) >= t: borrow_from_next(node, i) else: if i != len(node.children) - 1: merge(node, i) else: merge(node, i - 1) def borrow_from_prev(node, i): child = node.children[i] sibling = node.children[i - 1] child.keys.insert(0, node.keys[i - 1]) if not child.leaf: child.children.insert(0, sibling.children.pop()) node.keys[i - 1] = sibling.keys.pop() def borrow_from_next(node, i): child = node.children[i] sibling = node.children[i + 1] child.keys.append(node.keys[i]) if not child.leaf: child.children.append(sibling.children.pop(0)) node.keys[i] = sibling.keys.pop(0)
Practical Exercises with B Trees
Exercise 1: Implement a B Tree Insertion
Task: Implement the insert
function for a B Tree.
Solution:
class BTree: def __init__(self, t): self.root = BTreeNode(t, True) self.t = t def insert(self, key): root = self.root if len(root.keys) == 2 * self.t - 1: new_root = BTreeNode(self.t, False) new_root.children.append(self.root) split_child(new_root, 0) self.root = new_root insert_non_full(self.root, key)
Exercise 2: Implement a B Tree Search
Task: Implement the search
function for a B Tree.
Solution:
def search(node, key): i = 0 while i < len(node.keys) and key > node.keys[i]: i += 1 if i < len(node.keys) and key == node.keys[i]: return node if node.leaf: return None return search(node.children[i], key)
Exercise 3: Implement a B Tree Deletion
Task: Implement the delete
function for a B Tree.
Solution:
def delete(node, key): t = node.t i = 0 while i < len(node.keys) and key > node.keys[i]: i += 1 if i < len(node.keys) and key == node.keys[i]: if node.leaf: node.keys.pop(i) else: if len(node.children[i].keys) >= t: node.keys[i] = get_predecessor(node, i) delete(node.children[i], node.keys[i]) elif len(node.children[i + 1].keys) >= t: node.keys[i] = get_successor(node, i) delete(node.children[i + 1], node.keys[i]) else: merge(node, i) delete(node.children[i], key) else: if node.leaf: return if len(node.children[i].keys) < t: fill(node, i) delete(node.children[i], key)
Conclusion
B Trees are powerful data structures that provide efficient insertion, deletion, and search operations. They are particularly useful in database and file system implementations due to their ability to handle large blocks of data efficiently. Understanding B Trees and their operations is crucial for any programmer working with large datasets or developing systems that require efficient data retrieval and storage.
In the next module, we will explore Graphs, another fundamental data structure with wide-ranging applications in computer science.
Data Structures Course
Module 1: Introduction to Data Structures
Module 2: Lists
Module 3: Stacks
- Introduction to Stacks
- Basic Operations with Stacks
- Stack Implementation
- Applications of Stacks
- Exercises with Stacks
Module 4: Queues
- Introduction to Queues
- Basic Operations with Queues
- Circular Queues
- Priority Queues
- Exercises with Queues
Module 5: Trees
Module 6: Graphs
- Introduction to Graphs
- Graph Representation
- Graph Search Algorithms
- Shortest Path Algorithms
- Applications of Graphs
- Exercises with Graphs