Intermediate

Trees

The branching data structure under databases, filesystems, autocompletes, and priority queues. Nine trees, from the plain binary search tree through the self-balancing workhorses to the disk-friendly B-trees and the range-query structures competitive programmers swear by.

data-structuressearchfundamentals

What is Trees?

The 60-second primer

A tree is a graph with no cycles and exactly one path between any two nodes — that constraint buys you something powerful: a natural hierarchy. Every tree has a root, children branch from it, and the structure repeats recursively until you hit leaves. That branching shape is why trees show up everywhere: every time you need to search a sorted set, index a database, autocomplete a word, or answer 'what's the sum of this range?', a tree is almost certainly doing the work.

The nine trees here span three distinct contracts. The BST is the foundation — put smaller keys left, larger right, and you get O(log n) search on average. AVL and red-black trees fix the worst-case problem by keeping the tree balanced, guaranteeing O(log n) at all times; the red-black tree is the one you'll find in every production standard library. The heap abandons the search property entirely and optimises for one thing: finding the minimum or maximum in O(1). The trie does something completely different — it keys on characters, not comparisons, making prefix queries blazing fast. Finally, B-trees and B+ trees go wide instead of deep to match the block size of a disk, and segment and Fenwick trees are specialised range-query machines used everywhere from competitive programming to databases.

You do not need to implement all nine from scratch. You need to know what each one promises and what it costs to keep that promise. O(log n) is the whole game — and it only holds when the tree stays balanced. Once that's clear, everything else is a matter of which invariant you're willing to enforce.

Where this shows up

  • Database and filesystem indexes — every B+ tree index in PostgreSQL, MySQL (InnoDB), and SQLite lets the engine jump to a row in O(log n) disk reads instead of scanning the whole table; ext4 and NTFS directory structures are B-trees too.
  • Language standard librariesstd::map and std::set in C++, TreeMap and TreeSet in Java, and Python's sortedcontainers are all backed by red-black trees; you get O(log n) insert, delete, and lookup with in-order iteration for free.
  • Operating system schedulers — the Linux kernel's Completely Fair Scheduler (CFS) stores runnable tasks in a red-black tree keyed by virtual runtime; the scheduler always picks the leftmost node in O(log n).
  • Priority queues, Dijkstra, and task schedulers — binary heaps power Python's heapq, Java's PriorityQueue, and every implementation of Dijkstra's algorithm; OS job queues and event loops use the same structure.
  • Autocomplete, spell-check, and IP routing — tries back the autocomplete in your IDE and search bar; spell-checkers do prefix and edit-distance search over a trie; internet routers do longest-prefix matching on a trie of CIDR blocks.
  • Competitive programming and analytics range queries — segment trees answer 'sum / min / max over range [l, r]' with point updates in O(log n); Fenwick trees (BITs) do the same for prefix sums with half the code and half the memory.
  • Compilers and expression evaluation — parse trees and abstract syntax trees are the canonical internal representation of every compiler and interpreter; expression evaluation, constant folding, and code generation all walk a tree.

Balanced vs unbalanced: why O(log n) is the whole game

A perfect binary tree of n nodes has height ⌊log₂ n⌋ — that's why balanced trees give O(log n) operations. But insert keys in sorted order into a plain BST and the tree degrades into a linked list: height becomes n, and every search is O(n). Balance is not optional — it's the difference between a data structure and a slow list. AVL trees enforce a strict height difference of at most 1 between subtrees. Red-black trees enforce a looser colour invariant that guarantees height ≤ 2 log₂(n + 1). The trade-off: AVL trees rebalance more eagerly (better for read-heavy workloads); red-black trees rebalance lazily (better for write-heavy workloads, which is why they dominate in practice).

Side-by-side

How they compare

The same concepts, on the same axes. Use this as a map; the individual pages are the territory.

01Binary Search Tree
Self-balancing?
No
Search / Insert
O(log n) avg / O(n) worst
Ordered traversal?
Yes (in-order)
Best for
Teaching the search property; small data sets where balance is unlikely to degrade.
02AVL Tree
Self-balancing?
Yes — strict height balance
Search / Insert
O(log n) guaranteed
Ordered traversal?
Yes (in-order)
Best for
Read-heavy workloads where lookup speed matters more than insert overhead.
03Red-Black Tree
Self-balancing?
Yes — colour invariant
Search / Insert
O(log n) guaranteed
Ordered traversal?
Yes (in-order)
Best for
General-purpose ordered maps and sets; the default in most standard libraries.
04Binary Heap
Self-balancing?
N/A (not a search tree)
Search / Insert
Search O(n) / Insert O(log n)
Ordered traversal?
No
Best for
Priority queues, Dijkstra's algorithm, and any 'give me the smallest/largest fast' workload.
05Trie (Prefix Tree)
Self-balancing?
N/A
Search / Insert
O(L) by key length
Ordered traversal?
Yes (lexicographic)
Best for
Autocomplete, spell-check, IP routing, and any query keyed on string prefixes.
06B-Tree
Self-balancing?
Yes — wide, multi-key nodes
Search / Insert
O(log n) disk reads
Ordered traversal?
Yes (in-order)
Best for
Filesystem metadata, database indexes where every level is a disk read.
07B+ Tree
Self-balancing?
Yes — data only in leaves
Search / Insert
O(log n) disk reads
Ordered traversal?
Yes — leaves linked for range scans
Best for
Database range queries; the dominant structure behind SQL table indexes.
08Segment Tree
Self-balancing?
N/A (static structure)
Search / Insert
O(log n) query + update
Ordered traversal?
No
Best for
Range sum / min / max / GCD with point or range updates; competitive programming and analytics.
09Fenwick Tree (BIT)
Self-balancing?
N/A (static structure)
Search / Insert
O(log n) prefix query + update
Ordered traversal?
No
Best for
Prefix sums with point updates; simpler and faster constant than segment trees for sums only.

Decision guide

Which one should you use?

A practical tour of when each algorithm wins.

How to pick

  • Need an ordered map or set with O(log n) insert, delete, lookup, and in-order iterationred-black tree. It's already in your standard library (std::map, TreeMap, SortedDict). Reach for it first.
  • Read-heavy workload where the tightest possible height mattersAVL tree. AVL enforces strict height balance, so lookups hit fewer levels — at the cost of more rotations on insert. Worth it only when reads dominate heavily.
  • Need the minimum or maximum element in O(1), and you'll be inserting and removing frequentlybinary heap. It's an array, it's cache-friendly, and it powers every priority queue and heap-based shortest-path algorithm.
  • Keys are strings and queries are prefix-basedtrie. Autocomplete, spell-check, IP longest-prefix matching, and phone-book style lookups all fit naturally. Each lookup is O(L) on key length, independent of how many keys are stored.
  • Data lives on disk or in a block-addressable storeB+ tree. Its wide fan-out means the tree is only 3–4 levels deep even for billions of rows; every level is a disk read, so minimising height is everything. If you need range scans, B+ trees beat B-trees because the leaves form a linked list.
  • You need range aggregate queries (sum, min, max, GCD) with updatessegment tree. It supports both point updates and range updates with lazy propagation in O(log n), and it can hold any associative aggregate.
  • You only need prefix sums with point updates and every nanosecond countsFenwick tree (BIT). It's half the code and half the memory of a segment tree, with better cache behaviour. If you find yourself needing range min/max or range updates, switch to a segment tree.

The shape of the problem tells you the tree

Ask three questions: Is the key a string or a prefix? → trie. Is the data on disk? → B+ tree. Do you need range aggregates? → segment or Fenwick tree. If none of those apply and you need an ordered collection, red-black tree. If you only need the extreme element fast, heap. The BST is for learning; the others are for shipping.

Concepts in this track

9 concepts, in order

Each links to a concept page with its own explanation, prototype, and quiz.

01

Binary Search Tree

Smaller keys go left, larger keys go right. Every search, insert, and delete is one walk down the tree — fast when balanced, a linked list when not.

Beginner10 mintry it
02

AVL Tree

A BST that refuses to lean. Track every node's balance factor and rotate the instant it hits ±2 — guaranteeing O(log n) forever.

Intermediate11 mintry it
03

Red-Black Tree

Looser balancing through five color rules. Fewer rotations than AVL — which is why it's the tree inside your language's standard map and the Linux kernel.

Advanced12 mintry it
04

Binary Heap

A complete binary tree where every parent beats its children. Stored in a plain array, it makes the smallest (or largest) item O(1) to peek and O(log n) to pull.

Beginner10 mintry it
05

Trie (Prefix Tree)

A tree keyed by characters, not whole values. Shared prefixes share a path — the structure behind autocomplete, spell-check, and IP routing tables.

Intermediate10 mintry it
06

B-Tree

A wide, shallow tree where each node holds many keys. Built so one disk read fetches a whole node — the index structure under most databases and filesystems.

Advanced12 mintry it
07

B+ Tree

A B-tree that keeps all real data in the leaves and links those leaves in a chain — so range scans walk a list instead of re-descending the tree.

Advanced12 mintry it
08

Segment Tree

A tree of ranges over an array. Answer 'sum/min/max of indices l..r' and update any element, both in O(log n) — the competitive-programmer's range engine.

Advanced11 mintry it
09

Fenwick Tree (BIT)

Prefix sums with updates in O(log n) using nothing but an array and a bit trick. Smaller and simpler than a segment tree when all you need is sums.

Advanced11 mintry it

Related tracks

If this one clicks, try these next.