Overview
What this concept solves
Radix sort sorts by one digit at a time — and somehow ends up sorted overall. Pass 1 sorts the list by the ones digit. Pass 2 sorts the result by the tens. Pass 3 by the hundreds. After d passes (where d is the number of digits), the array is fully sorted. The magic is that each pass uses a stable sub-sort (counting sort, almost always), so the work of earlier passes survives each later pass.
This is the least-significant-digit (LSD) variant — the one almost always meant by 'radix sort.' Sort from the ones up, and you finish when the most-significant digit pass is done. The alternative, MSD radix, sorts from the top digit down and is the right shape for strings, but LSD is simpler to reason about and dominates for fixed-width integer keys.
The complexity is O(d · (n + k)) where d is the number of digits and k is the radix (10 for decimal, 256 for byte-radix). When d is small (fixed-width integers) and k is small (any reasonable radix), radix sort runs in linear time — and it really does, in practice. It's the sort of choice for billion-element integer datasets where O(n log n) comparison sorts get crushed by the log factor.
Mechanics
How it works
One pass (the d-th digit)
- Use a stable inner sort — usually counting sort — keyed on the d-th digit of each number.
- Walk the current list left-to-right. For each value, extract digit d and drop the value into
bucket[digit](one of 10 buckets in base 10). - Walk buckets 0 → 9 in order, appending their contents to the new list.
- Because the inner sort is stable, two values with the same d-th digit preserve their previous relative order — which encodes the result of all earlier passes.
All passes
- Find the maximum value to know how many digits you need:
d = digits(max). - For
p = 0tod - 1: run one pass keyed on digit at positionp(ones, tens, hundreds, …). - After the last pass, the array is sorted.
Why stability is the load-bearing property
Imagine pass 1 (ones digit) finishes, then pass 2 (tens digit) re-orders by tens. Two values with the same tens digit must keep the order they had after pass 1 — otherwise pass 1's work is wasted. Stable counting sort guarantees that. If you used quicksort as the inner pass, radix would silently corrupt earlier passes' work.
MSD vs LSD
LSD radix sorts the bottom digit first; you finish in d passes regardless of input. MSD sorts the top digit first, then recursively sorts each bucket — finishes early when buckets become single elements. MSD is the right shape for variable-length strings (it can stop at common prefixes), LSD is the right shape for fixed-width integers. Most textbook 'radix sort' is LSD.
Interactive prototype
Run it. Break it. Tune it.
Sandboxed simulation embedded right in the page. No setup, no install.
About this simulation
Pass 1 sorts by the ones digit (drop each number into bucket 0–9, then collect). Pass 2 sorts by the tens. Pass 3 by the hundreds. After the highest digit, the list is fully sorted. Press Play to watch every pass, Step / Back to walk each placement, Shuffle for a new list.
Hands-on
Try these on your own
Open the prototype above, run each experiment, predict the answer, then verify.
Play through all the digit passes
Press Play. Watch the Sorting by stat change: Ones, then Tens, then Hundreds. After each pass, the new list is the contents of buckets 0 → 9 in order. Notice the list isn't fully sorted until the highest digit pass completes — earlier passes look almost random.
Step into one pass
Use Step during a pass. Each step either reads a value from the list (orange highlight on the digit being read), drops it into the matching bucket, or — at the end — collects all buckets into the new list. Watch the underlined digit in the chip — that's the one being keyed on.
Verify the stability invariant
Pick two values with the same tens digit — say 42 and 47. After pass 2 (tens), both should land in bucket 4. Check that 42 and 47 keep the relative order they had after pass 1 (which sorted by ones digit). That's stability surviving the pass — exactly what makes radix sort correct.
Shuffle and watch d grow with magnitude
Press Shuffle. If the random max is 87, only 2 passes run (ones, tens). If it's 752, 3 passes run. d depends on the max value, not n. That's why radix is so good on fixed-width keys: d is bounded by the key width, so the work is linear in n.
In practice
When to use it — and what you give up
When it's the right tool
- Fixed-width integer keys at scale — IPv4 addresses, timestamps, 64-bit hashes, SSNs, phone numbers. d is constant, total work is O(n).
- Sorting strings of bounded length — use MSD radix to descend into common prefixes.
- External sorting of integers — radix can be implemented with sequential I/O; great for disk-backed sorts.
- GPU sorting — the per-digit counting step parallelises beautifully across thousands of threads.
- When O(n log n) is hitting the log-factor wall — at n = 10⁹, log₂(n) ≈ 30; that 30× factor is what radix sort eliminates.
When to reach for something else
- Floating-point keys — workable but tricky (need to handle sign bit, exponent). Comparison sort is simpler.
- Variable-length keys with no max — MSD can recurse forever; comparison sorts handle this directly.
- Small n — comparison sort's constants beat radix's allocations and bucket maintenance for n < ~100.
- Highly skewed distributions — most buckets are empty; the per-pass overhead dominates.
Pros
- Linear in n for fixed-width keys — O(d · (n + k)) with d and k constant.
- Stable — naturally, because the inner counting sort is stable.
- Beats comparison sorts at very large n — when log n itself becomes a noticeable constant factor.
- Parallelisable per-digit — counting and bucketing can be sharded across threads or GPU cores.
- No comparisons needed — sidesteps the n log n lower bound the same way counting sort does.
Cons
- O(n + k) extra memory — buckets and the inner counting sort's count array.
- Bad for variable-length or unbounded-range keys — d grows with the input.
- Per-pass constant factor isn't tiny — for small n, quicksort/Timsort win.
- Less general than comparison sorts — strings, floats, and custom comparators all need work.
Reference
Code & further reading
A minimal reference implementation and pointers worth bookmarking.
// LSD radix sort for non-negative integers.
// Stable, O(d · (n + b)) where d = digit count, b = radix.
function radixSort(a: number[], radix = 10): number[] {
if (a.length === 0) return a;
const max = Math.max(...a);
let out = a.slice();
for (let exp = 1; exp <= max; exp *= radix) {
out = countingSortByDigit(out, exp, radix);
}
return out;
}
function countingSortByDigit(a: number[], exp: number, b: number): number[] {
const n = a.length;
const count = new Array(b).fill(0);
const out = new Array(n);
// Tally by the current digit.
for (let i = 0; i < n; i++) count[Math.floor(a[i] / exp) % b]++;
// Prefix-sum into end positions.
for (let v = 1; v < b; v++) count[v] += count[v - 1];
// Place right-to-left so the inner sort is stable.
for (let i = n - 1; i >= 0; i--) {
const d = Math.floor(a[i] / exp) % b;
out[--count[d]] = a[i];
}
return out;
}
// Example
radixSort([170, 45, 75, 90, 802, 24, 2, 66]);
// → [2, 24, 45, 66, 75, 90, 170, 802]References & further reading
6 sources- Paperdl.acm.org
Herman Hollerith — An Electric Tabulating System (1889)
Hollerith invented mechanical radix sort for the 1890 US Census. The card-sorting machines that became IBM directly implemented this algorithm — sorting one digit at a time, one pass per digit.
- Bookmitpress.mit.edu
CLRS — Introduction to Algorithms, §8.3 Radix Sort
Includes the proof that LSD radix is correct iff the inner sort is stable, and the running-time analysis as a function of d, n, k.
- Bookalgs4.cs.princeton.edu
Sedgewick & Wayne — Algorithms §5.1 String Sorts
Covers both LSD and MSD radix sort, with the application to variable-length strings that LSD struggles with. The canonical undergraduate treatment.
- Paperresearch.nvidia.com
Duane Merrill & Andrew Grimshaw — High Performance and Scalable Radix Sorting (2011)
NVIDIA paper on GPU radix sort. Shows why radix wins on massively parallel hardware: each digit pass shards perfectly across thousands of threads.
- Articleen.wikipedia.org
Wikipedia — Radix sort
Walks through both LSD and MSD variants, the negative-number handling tricks, and the floating-point radix sort variant (sort by bit pattern with sign-bit flip).
- Bookwww-cs-faculty.stanford.edu
Engineering radix sort — Knuth, TAOCP Vol. 3 §5.2.5
Knuth's chapter analyses the optimal radix choice (it's surprisingly large — often 256 for modern hardware) and the cache-miss tradeoffs.
Knowledge check
Did the prototype land?
Quick questions, answers revealed on submit. No scoring saved.
question 01 / 03
Why does LSD radix sort require the inner per-digit sort to be stable?
question 02 / 03
Sorting 1 billion 64-bit integers with byte-radix (k = 256). About how many passes does LSD radix sort do?
question 03 / 03
When is MSD radix sort preferable to LSD?
0/3 answered
Was this concept helpful?
Tell us what worked, or what to improve. We read every note.