The trie: searching by prefix
Imagine typing into a search box and seeing suggestions appear after each keystroke. Behind that is often a [[trie|trie]] (say "try"), a tree where each edge is a single character and each path from the root spells out a prefix. Words that share a prefix share the same path until they diverge. To check whether a word is present, you walk down one character at a time — so a lookup costs O(L) where L is the length of the word, *no matter how many words the trie holds*.
Trie holding {"cat", "car", "can", "dog"}
(root)
/ \
c d
| |
a o
/ | \ |
t r n g*
* * *
(* marks the end of a real word)
Lookup "car": root -> c -> a -> r , and r is marked => found, O(3).struct TrieNode {
TrieNode* child[26] = {nullptr}; // one slot per lowercase letter
bool isEnd = false; // true if a word ends here
};
void insert(TrieNode* root, const string& w) {
TrieNode* cur = root;
for (char ch : w) {
int i = ch - 'a';
if (!cur->child[i]) cur->child[i] = new TrieNode();
cur = cur->child[i];
}
cur->isEnd = true; // mark the final node
}The payoff is prefix queries, which a hash table cannot do well. To autocomplete "ca", walk to the "ca" node, then collect every word in the subtree below it. A hash table can tell you if "ca" is *exactly* a key, but it has no idea which keys *start with* "ca". That is why tries power autocomplete, spell-checkers, and IP routing tables.
Union-Find: are these connected?
[[union-find|Union-find]] (also called a disjoint-set structure) maintains a collection of groups and answers one question astonishingly fast: "are x and y in the same group?" It supports two operations — union(x, y) merges two groups, and find(x) returns a representative for x's group. Two elements are connected exactly when they share a representative. The idea: store each group as a tree of parent pointers, with the root acting as the group's name.
Elements 0..5. parent[i] points up; a root points to itself.
Start: each its own group After union(0,1), union(2,3), union(1,3):
0 1 2 3 4 5 3
^ ^ ^ ^ ^ ^ / | \
(each is its own root) 1 2 (and 0 hangs under 1)
|
0
find(0) walks 0 -> 1 -> 3 ; find(2) walks 2 -> 3 ; same root => connected.Two small optimizations make this nearly free. Union by rank always hangs the shorter tree under the taller one, so trees stay shallow. Path compression re-points every node you touch during a find() straight at the root, flattening the tree as a side effect. Together they make each operation run in near-O(1) time (formally, the inverse Ackermann function α(n), which is below 5 for any n you will ever meet).
int parent[N], rnk[N];
void init(int n){ for(int i=0;i<n;i++){ parent[i]=i; rnk[i]=0; } }
int find(int x){ // path compression
if (parent[x] != x) parent[x] = find(parent[x]);
return parent[x];
}
void unite(int a, int b){ // union by rank
a = find(a); b = find(b);
if (a == b) return;
if (rnk[a] < rnk[b]) swap(a, b);
parent[b] = a;
if (rnk[a] == rnk[b]) rnk[a]++;
}What to learn next
Two structures into a big field. A few more are worth knowing by name so you recognize them when a problem calls for them, and can look up the details. They all answer some version of "summarize or update a range of values quickly."
- Segment tree — supports both range queries (sum/min/max over a[i..j]) and point updates in O(log n); a workhorse for problems that mix reads and writes over intervals.
- Fenwick tree (binary indexed tree) — a lighter, smaller cousin for prefix sums with point updates, also O(log n); fewer features than a segment tree but tiny to code.
- Balanced-tree libraries — in real code you rarely hand-roll a balanced tree. Reach for std::map / std::set (ordered, O(log n)) or std::unordered_map / std::unordered_set (hashed, average O(1)); they are the balanced tree and hash table you already understand, battle-tested.