Overview
What this concept solves
Bucket sort splits the value range into a handful of buckets, sorts each bucket on its own, then concatenates them. If the input is roughly uniform across the range, the buckets are small and the per-bucket sorts are fast — so fast that the total work approaches O(n). It's the sort of choice when you know your data is, or can be made, roughly evenly distributed.
The three phases are mechanical. Scatter: for each value, compute its bucket (e.g. v / range) and append it there. Sort: sort each bucket on its own — usually with insertion sort, since buckets are small. Gather: walk buckets 0 → B-1 in order and concatenate. Because the buckets are ordered by value range and each bucket is sorted internally, the concatenation is sorted.
The expected runtime is O(n + n²/B + B) — linear when B is Θ(n) and the data is uniform. The catch is skew: if all the input lands in one bucket, you're really just running insertion sort on the whole input, and the algorithm degenerates to O(n²). Bucket sort and counting sort are cousins — counting sort uses one bucket per value, bucket sort uses one bucket per range — but bucket sort is the right tool for continuous values (floats from a uniform distribution) where you can't afford a count slot per distinct value.
Mechanics
How it works
Phase 1 — Scatter
- Create B empty buckets covering the value range.
- For each value
vin the input: append it tobucket[floor(v / range_size)]. - After the pass, every value is in the bucket for its value range — but not sorted within the bucket yet.
Phase 2 — Sort each bucket
- Walk buckets 0 → B-1.
- Sort each non-empty bucket with any sort. Insertion sort is the typical choice because buckets are expected to be small (~n/B elements each).
- After this phase, each bucket is sorted internally; the bucket order is already correct.
Phase 3 — Gather
- Walk buckets 0 → B-1 in order.
- Append each bucket's contents to the output.
- Done.
Why bucket sort is near-linear when data is uniform
Pick B = n buckets. Then the expected bucket size is 1, the per-bucket insertion sort is O(1) on expectation, and the total work is O(n) for scatter + O(n) for the n single-element sorts + O(n) for gather = O(n). Even with B = n/10, expected bucket size is 10 and insertion sort on size-10 takes ~100 ops per bucket, n/10 buckets = 10n ops. Still O(n).
Skew is bucket sort's enemy
If 90% of your values land in bucket 0, bucket sort isn't sorting them — insertion sort is, on a list of 0.9n elements, in O((0.9n)²). The whole algorithm degenerates to insertion sort. The fix is either to pick bucket boundaries adaptively (quantile-based), to use a comparison sort instead of insertion as the inner sort (recursive bucket sort), or to switch to radix when the data isn't uniform.
Interactive prototype
Run it. Break it. Tune it.
Sandboxed simulation embedded right in the page. No setup, no install.
About this simulation
Phase 1 scatters each value into the bucket for its range. Phase 2 sorts each bucket individually. Phase 3 concatenates buckets 0 → B-1 to produce the sorted output. Press Play to watch the three phases, Step / Back to walk each placement, Shuffle for a new list.
Hands-on
Try these on your own
Open the prototype above, run each experiment, predict the answer, then verify.
Watch the three phases run
Press Play. The Phase stat shows Setup → Scatter → Sort → Gather → Done. In Scatter, each value moves from the input list into its range-bucket. In Sort, each non-empty bucket sorts internally (the bucket border turns solid when it's being worked on). In Gather, the buckets concatenate in order into the output row.
Step the scatter phase
Use Step to walk Phase 1. Each step reads one input value (orange highlight) and drops it into the matching bucket (the bucket border lights up). Notice that scatter is O(n) — one pass, one division, one append per value.
Watch a bucket sort itself
Step into Phase 2. Each non-empty bucket gets sorted on its own. For a bucket with one value, nothing happens — already sorted. For a bucket with two or three values, the inner sort runs in microseconds; that's why bucket sort approaches O(n) when buckets stay small.
Shuffle and look for skew
Press Shuffle. If the random values cluster in one range (e.g. all in 0–19), watch one bucket fill up while others stay empty. That's bucket sort's worst case: the heavy bucket dominates the inner sort. Real implementations either pick bucket boundaries adaptively or fall back to a comparison sort when skew is detected.
In practice
When to use it — and what you give up
When it's the right tool
- Uniform-ish floating-point data — like sorting random doubles in [0, 1). The textbook bucket-sort use case.
- Sorting random samples — Monte Carlo outputs, random hashes from a uniform hash function, evenly-spread sensor readings.
- As an internal step in distributed sorts — Hadoop's Terasort scatters records into B partitions (buckets), each node sorts its bucket locally, then results gather. That's bucket sort on a cluster.
- As external sort's bucket-scatter step — when sorting more data than fits in RAM and you know the rough distribution.
When to reach for something else
- Skewed data — bucket sort degenerates to insertion sort on the heavy bucket. Use a comparison sort instead.
- Discrete integer values in a small range — use counting sort. Same idea but with one bucket per value.
- Fixed-width integer keys at scale — use radix. More robust to skew.
- You don't know the value range — bucket sort needs
min, max, range_sizeup front.
Pros
- Near-linear expected runtime on uniform data — O(n) when B = Θ(n).
- Naturally parallel — each bucket's sort is independent. Used in distributed sorts (Hadoop, Spark).
- Stable if the inner sort is stable (insertion is).
- Adaptive to data density — if you know the distribution, you can size buckets to keep them balanced.
- Conceptually simple — three phases, easy to reason about, easy to debug.
Cons
- O(n²) worst case when all values land in one bucket.
- Requires knowing the value range in advance.
- Memory-heavy — B buckets plus per-bucket overhead.
- Not great for integers — counting and radix sort do the same job with better guarantees.
- Bucket boundary choice matters a lot — bad boundaries kill performance silently.
Reference
Code & further reading
A minimal reference implementation and pointers worth bookmarking.
// Bucket sort — expected O(n) on uniform data, O(n²) worst case.
function bucketSort(a: number[], minVal: number, maxVal: number, B: number): number[] {
const buckets: number[][] = Array.from({ length: B }, () => []);
const range = (maxVal - minVal) / B;
// Phase 1: scatter into buckets.
for (const v of a) {
const idx = Math.min(B - 1, Math.floor((v - minVal) / range));
buckets[idx].push(v);
}
// Phase 2: sort each bucket (insertion sort is typical).
for (const bucket of buckets) bucket.sort((x, y) => x - y);
// Phase 3: gather in order.
const out: number[] = [];
for (const bucket of buckets) out.push(...bucket);
return out;
}
// Example — 9 values in [0, 100), 5 buckets of width 20.
bucketSort([29, 25, 3, 49, 9, 37, 21, 43, 17], 0, 100, 5);
// → [3, 9, 17, 21, 25, 29, 37, 43, 49]References & further reading
6 sources- Bookmitpress.mit.edu
CLRS — Introduction to Algorithms, §8.4 Bucket Sort
The expected-time analysis assuming uniform distribution. Shows how the n²/B² expected-work-per-bucket bound flattens to O(n) when B = Θ(n).
- Articleopendsa-server.cs.vt.edu
OpenDSA — Bucket Sort
Interactive textbook chapter with visualisations and code; pairs nicely with the prototype above.
- Papersortbenchmark.org
Hadoop TeraSort — Owen O'Malley (2008)
Yahoo's record-setting TeraSort uses bucket-sort-by-partition across a cluster. Real industrial bucket sort at petabyte scale.
- Docsgithub.com
Spark — RangePartitioner source
Spark samples the input to determine quantile-based bucket boundaries before partitioning. A production fix to bucket-sort's skew problem.
- Articleen.wikipedia.org
Wikipedia — Bucket sort
Covers the variants (generic bucket sort, ProxmapSort, postman's sort) and the relationship to counting and radix sort.
- Articlebrilliant.org
Brilliant — Bucket Sort
Clean walkthrough with the average-case analysis and a discussion of the right number of buckets B as a function of n.
Knowledge check
Did the prototype land?
Quick questions, answers revealed on submit. No scoring saved.
question 01 / 03
Bucket sort's expected runtime is O(n) when the data is uniform. What happens when the data is heavily skewed?
question 02 / 03
How does bucket sort relate to counting sort?
question 03 / 03
Why is bucket sort a natural fit for distributed sorting frameworks like Hadoop's TeraSort?
0/3 answered
Was this concept helpful?
Tell us what worked, or what to improve. We read every note.