Note: The passage is originally written in Chinese and translated into English via ChatGPT.
B-Tree, LLRB
Hug usually smoothly transitions and introduces new concepts, but the transitions written in the notes can be a bit verbose, so I will provide a concise version here.
Limitations of BST
- First, let’s introduce two properties of BST: height and average depth.
- Height: Determines the worst-case runtime to find a node.
 - Average depth: Determines the average-case runtime to find a node.
 
 
- Now let’s look at various properties of BSTs.
- Worst case: $\Theta(N)$ height
 - Best case: $\Theta(\log N)$ height
 - Random trees have $\Theta(\log N)$ average depth and height (bushy).
 
 - Random trees perform well, but the problem arises when we keep inserting data from the right side, leading to imbalance, such as:
- add(“01-Jan-2019, 10:31:00”)
 - add(“01-Jan-2019, 18:51:00”)
 
 
B-Trees / 2-3 trees / 2-3-4 trees
To address the issue of imbalance, we keep inserting new keys into existing nodes and perform node splitting. This gives rise to a new type of tree called B-trees.
B-trees should be called juicy trees.
Definition
- B-trees are self-balancing trees and can be classified into different types:
- B-trees of order L=3 (like the ones used today) are also called a 2-3-4 tree or a 2-4 tree (as shown in the figure).
 - B-trees of order L=2 are also called a 2-3 tree (as shown in the figure).
 - B-trees of larger L are used in practice for databases, etc.
 
 
Construction Mechanism
(node-split)
- 
- Insert into a leaf until the number of keys exceeds L.
 
 - 
- In case of excessive keys, send the second (1st) node up and split the node.
 
 
Examples:
Properties
(or Invariants)
Based on the construction mechanism mentioned above, we can derive some properties of B-trees. B-trees are balanced in that:
- All leaves must be the same distance from the source.
- Consider: When a root split occurs, the height increases by 1, and all nodes descend one level.
 - If there is no root split, the height remains the same.
 
 - A non-leaf node with k items must have exactly k+1 children.
 
Runtime Analysis
- Height: $\Theta(\log N)$ for both worst-case and best-case scenarios.
- In the best case, each node has L items (full), so the height grows with $\Theta(\log_{L+1}N)$.
 - In the worst case, each node has only 1 item, so the height grows with $\Theta(\log_2N)$, the same as a BST.
 
 - Operation runtime: Both 
contains()andadd()have a complexity of $O(\log N)$.- Worst case for 
contains(): $C = (\text{number of layers}) \log_{L+1}{N} \cdot (\text{nodes per layer}) 1 \cdot (\text{work per node}) L$ - Worst case for 
add(): It involves some split operations, but the simplified complexity remains the same. 
 - Worst case for 
 
Limitations of B-Trees
Can you guess why the code implementation is not included above?
Left Leaning Red-Black Trees
Definition
A BST with left glue links that represents a 2-3 tree is often called a “Left Leaning Red Black Binary Search Tree” or LLRB.
In other words, LLRB is a BST that corresponds to a specific B-Tree. Some remarks about LLRB trees:
- LLRBs are normal BSTs!
 - There is a one-to-one correspondence between an LLRB and an equivalent 2-3 tree.
 - The red color is just a convenient representation. Red links don’t have any special functionality.
 
Properties
Similar to BSTs, LLRB trees have the following properties:
- All leaves must be the same distance from the source (counting only black links).
 - No node has two red links.
 
Construction Mechanism
- In principle, LLRB trees are based on 2-3 trees, where the smaller item in a node is moved down. For example:
 
- However, it is not practical to implement a 2-3 tree and then convert it to an LLRB tree. In the code
 
, LLRB trees are implemented directly using red links, rotation, and color flipping:
- 
Always use a red link when inserting (analogous to adding items to a node in a 2-3 tree).
 - 
When inserting items on the right, use
rotateLeft().
 - 
When inserting items on the left twice, use
rotateRight().
 - 
A new rule allows representing temporary 4-nodes as BST nodes with two red links.
 - 
In case of temporary 4-nodes, use
flipColors(). - 
If a rotation or color flip operation causes an additional violation, fix it.
 
Code Implementation
The following code includes the implementation of these methods, including the modified Node class, put(), the newly added rotate methods, flipColors(), and isRed().
 | 
 | 
Runtime Analysis
- Height: $O(logN)$
 contains(): $O(logN)$insert(): $O(logN)$- $O(logN)$ to add the new node
 - $O(logN)$ for rotation and color flip operations per insert (After simplification and ignoring constant terms, the time complexity remains O(log N).)
 
Summary of search trees
逻辑梳理
Cited from the slides [lec 18, 19sp]
- 
Binary search trees** are simple, but they are subject to imbalance.
 - 
2-3 Trees (B Trees) are balanced, but painful to implement and relatively slow.
 - 
LLRBs insertion is simple to implement (but delete is hard).
- Works by maintaining mathematical bijection with a 2-3 trees.
 
 - 
Java’s TreeMap is a red-black tree (not left leaning).
- 
Maintains correspondence with 2-3-4 tree (is not a 1-1 correspondence).
 - 
Allows glue links on either side (see Red-Black Tree).
 - 
More complex implementation, but significantly (?) faster.
 
 - 
 
Complexity/runtime对比
WC = worst case
| BST | B-Trees | LLRB | |
|---|---|---|---|
| Height | $O(logN)$ | $\Theta(logN)$ | $O(logN)$ | 
| WC:$O(N)$ | WC:$O(log N)$ | WC:$O(log N)$ | |
contains() | 
$O(logN)$ | $O(log N)$ | $O(logN)$ | 
| WC:$O(N)$ | WC:$O(log N)$ | WC:$O(log N)$ | |
insert() | 
$O(logN)$ | $O(log N)$ | $O(logN)$ | 
| WC:$O(N)$ | WC:$O(log N)$ | WC:$O(log N)$ | 
- Explanation: B-trees and LLRB trees are self-balanced, which prevents the extreme case of a BST degenerating into a linked list. This property makes them faster compared to regular BSTs.
 
In summary:
- B-trees achieve self-balancing to avoid the worst-case complexity of regular BSTs.
 - LLRB trees not only inherit the self-balancing property of B-trees but also retain the ease of implementation characteristic of regular BSTs.