The size of a node in a B-Tree is as large as the size of a whole page on disk. This is because we want to read as much data as possible in a single disk seek.
A large branching factor reduces the height of the tree and the no of disk accesses required to find any key.
The essence of a B-Tree, is that given an element i
, it will search through an array,
if the element is not found, it gives the location of another node, where this search can be
done based on the condition that the location contains all elements in a reduced set of which
i
is a member.
So, it recursively reduces the search space by a factor of t
until the element is found.
Node(x) Attributes:
n
, the no of keys stored in the node currentlykeys
, an array of sizen
stored sorted and in ascending orderleaf?
, whether the node is a leaf nodechildren
, an array of sizen+1
pointers to the child nodes. Leaf nodes have this as NIL
Properties:
- Given a node
x
, any key has a child nodex.c
i
- All leaves have the same height.
-
t
is the minimum degree for a B-Tree.t >= 2
It is the minimum no of child nodes that a node can have For non-root nodes: 2.1.t-1
<=x.n
<=2t - 1
2.2t
<=size(x.c)
<=2t
For root nodes: 3.11
<=x.n
<=2t - 1
3.22
<=size(x.c)
<=2t
- The height of a tree is given by:
h <= log(((n + 1) / 2), t)
search(T.root, k)
, where T
is the B-Tree, and k
is the key we are searching for.
Result = (Node, index)
, where Node is the pointer to the node in which the key resides, and index is the index of the key in the node, such that, Node.key[index] = k
Overview:
- Find appropriate index where the key is likely to be found
- Check if the key is found
- Recursively search the subtree if not a leaf node. Since we are already at the index where the key can be found, we can look at the corresponding child node.
We insert a new node in the appropriate lead node. Since, simply inserting it at the leaf node would break the properties (3.2.1) of a B-Tree, we split the node at the median key, and move it up to the parent node. We perform this split operation recursively on the parent node, until the parent node is NIL.
This introduces two parts to the insertion algorithm - search for right place, node split. To perform the operation in a single pass, we split each full node we come along the way, so that we are assured that that each node's parent is not full. This way, we avoid doing two passes on the tree for insertion.
Note: Handle the case where the root node is full separately before traversing the tree.
- If x is a leaf node:
- Copy over keys and children to
i+1
, wherei
is the index where the key needs to be inserted. - Assign key and child pointer new values at
i
. - else
- Find index
i
where the node needs to be inserted - Split child node if full
- Recurse to child
Given a non-full internal node x
, and index i
such that x.ci
is a full child(n = 2t - 1).
- Create empty node. It is intended to be at the same level as the full node.
- Copy keys and children to new node
- Reduce
n
of original node, to indicate new node size; (delete extra keys,pointers) - Resize current node, by moving the child node pointers and keys right by one index.
- Assign key in current node the median key from old child
- Increment size of B-Tree node
Given a tree rooted at x
, key to remove from the tree - k
.
Given that child y
that precedes k
such that x.keys[i] = k
and x.children[i] = y
.
Given that child z
that succeeds k
such that x.keys[i] = k
and x.children[i+1] = z
.
- If
k
is inx
- If
x
is a lead node: 1. Deletek
fromx
. - If
x
is an internal node: 1. Ify.n >= t
:- Find
k'
such thatk' = predecessor(k)
. - Recursively delete
k'
. - Replace
k
byk'
inx
. 2. Ify.n < t
- Find
k'
such thatk' = successor(k)
. - Recursively delete
k'
. - Replace
k
byk'
inx
. 3. Ify.n <= z.n < t
: - Merge
k
and all ofz
intoy
such thatx.keys[i] != k
andx.children[i+1] != z
andy.n = 2t - 1
. - Free
z
- Recursively delete
k
fromy
- Find
- Else(
k
not inx
): - Find
i
such thatx.keys[i-1] < k < x.keys[i]
- If
x.children[i].n < t
and (x.children[i-1].n >= t
orx.children[i+1].n >= t
) 1. Move key fromx
down tox.children[i]
2. Move key fromx.children[i]
's immediate left or right sibling up intox
. 3. Move appropriate child pointer from sibling intox.children[i]
. - If
x.children[i].n < t
and (x.children[i-1].n < t
andx.children[i+1].n < t
): 1. Mergex.children[i]
with one sibling 2. Move a key fromx
down into merged node to become median node. - Recurse
- Predecessor: In the sorted list of the BTree's keys, get the predecessor of the given key according to the sorted order. This is done by recursing to the left-most leaf node of the left sub-tree.
- Successor: In the sorted list of the BTree's keys, get the successor of the given key according to the sorted order. This is done by recursing to the right-most leaf node of the right sub-tree.
- MergeChildren: When both children are low on keys(
n < t
), merge the child nodes into a single node with the key from the node as the median key.
- B-Tree can have sparse levels. Since the cost of search is O(log(h)), the cost of search will be less if the no of levels/height is low with dense nodes.