Bucket Sort — Sorting

Overview

What this concept solves

Bucket sort splits the value range into a handful of buckets, sorts each bucket on its own, then concatenates them. If the input is roughly uniform across the range, the buckets are small and the per-bucket sorts are fast — so fast that the total work approaches O(n). It's the sort of choice when you know your data is, or can be made, roughly evenly distributed.

The three phases are mechanical. Scatter: for each value, compute its bucket (e.g. v / range) and append it there. Sort: sort each bucket on its own — usually with insertion sort, since buckets are small. Gather: walk buckets 0 → B-1 in order and concatenate. Because the buckets are ordered by value range and each bucket is sorted internally, the concatenation is sorted.

The expected runtime is O(n + n²/B + B) — linear when B is Θ(n) and the data is uniform. The catch is skew: if all the input lands in one bucket, you're really just running insertion sort on the whole input, and the algorithm degenerates to O(n²). Bucket sort and counting sort are cousins — counting sort uses one bucket per value, bucket sort uses one bucket per range — but bucket sort is the right tool for continuous values (floats from a uniform distribution) where you can't afford a count slot per distinct value.

Mechanics

How it works

Phase 1 — Scatter

Create B empty buckets covering the value range.
For each value v in the input: append it to bucket[floor(v / range_size)].
After the pass, every value is in the bucket for its value range — but not sorted within the bucket yet.

Phase 2 — Sort each bucket

Walk buckets 0 → B-1.
Sort each non-empty bucket with any sort. Insertion sort is the typical choice because buckets are expected to be small (~n/B elements each).
After this phase, each bucket is sorted internally; the bucket order is already correct.

Phase 3 — Gather

Walk buckets 0 → B-1 in order.
Append each bucket's contents to the output.
Done.

Why bucket sort is near-linear when data is uniform

Pick B = n buckets. Then the expected bucket size is 1, the per-bucket insertion sort is O(1) on expectation, and the total work is O(n) for scatter + O(n) for the n single-element sorts + O(n) for gather = O(n). Even with B = n/10, expected bucket size is 10 and insertion sort on size-10 takes ~100 ops per bucket, n/10 buckets = 10n ops. Still O(n).

Skew is bucket sort's enemy

If 90% of your values land in bucket 0, bucket sort isn't sorting them — insertion sort is, on a list of 0.9n elements, in O((0.9n)²). The whole algorithm degenerates to insertion sort. The fix is either to pick bucket boundaries adaptively (quantile-based), to use a comparison sort instead of insertion as the inner sort (recursive bucket sort), or to switch to radix when the data isn't uniform.

Interactive prototype

Run it. Break it. Tune it.

Sandboxed simulation embedded right in the page. No setup, no install.

simulation › Bucket Sort

About this simulation

Phase 1 scatters each value into the bucket for its range. Phase 2 sorts each bucket individually. Phase 3 concatenates buckets 0 → B-1 to produce the sorted output. Press Play to watch the three phases, Step / Back to walk each placement, Shuffle for a new list.

Hands-on

Try these on your own

Open the prototype above, run each experiment, predict the answer, then verify.

try 01

Watch the three phases run

Press Play. The Phase stat shows Setup → Scatter → Sort → Gather → Done. In Scatter, each value moves from the input list into its range-bucket. In Sort, each non-empty bucket sorts internally (the bucket border turns solid when it's being worked on). In Gather, the buckets concatenate in order into the output row.

try 02

Step the scatter phase

Use Step to walk Phase 1. Each step reads one input value (orange highlight) and drops it into the matching bucket (the bucket border lights up). Notice that scatter is O(n) — one pass, one division, one append per value.

try 03

Watch a bucket sort itself

Step into Phase 2. Each non-empty bucket gets sorted on its own. For a bucket with one value, nothing happens — already sorted. For a bucket with two or three values, the inner sort runs in microseconds; that's why bucket sort approaches O(n) when buckets stay small.

try 04

Shuffle and look for skew

Press Shuffle. If the random values cluster in one range (e.g. all in 0–19), watch one bucket fill up while others stay empty. That's bucket sort's worst case: the heavy bucket dominates the inner sort. Real implementations either pick bucket boundaries adaptively or fall back to a comparison sort when skew is detected.

In practice

When to use it — and what you give up

When it's the right tool

Uniform-ish floating-point data — like sorting random doubles in [0, 1). The textbook bucket-sort use case.
Sorting random samples — Monte Carlo outputs, random hashes from a uniform hash function, evenly-spread sensor readings.
As an internal step in distributed sorts — Hadoop's Terasort scatters records into B partitions (buckets), each node sorts its bucket locally, then results gather. That's bucket sort on a cluster.
As external sort's bucket-scatter step — when sorting more data than fits in RAM and you know the rough distribution.

When to reach for something else

Skewed data — bucket sort degenerates to insertion sort on the heavy bucket. Use a comparison sort instead.
Discrete integer values in a small range — use counting sort. Same idea but with one bucket per value.
Fixed-width integer keys at scale — use radix. More robust to skew.
You don't know the value range — bucket sort needs min, max, range_size up front.

Pros

Near-linear expected runtime on uniform data — O(n) when B = Θ(n).
Naturally parallel — each bucket's sort is independent. Used in distributed sorts (Hadoop, Spark).
Stable if the inner sort is stable (insertion is).
Adaptive to data density — if you know the distribution, you can size buckets to keep them balanced.
Conceptually simple — three phases, easy to reason about, easy to debug.

Cons

O(n²) worst case when all values land in one bucket.
Requires knowing the value range in advance.
Memory-heavy — B buckets plus per-bucket overhead.
Not great for integers — counting and radix sort do the same job with better guarantees.
Bucket boundary choice matters a lot — bad boundaries kill performance silently.

Reference

Code & further reading

A minimal reference implementation and pointers worth bookmarking.

bucket-sort.ts

// Bucket sort — expected O(n) on uniform data, O(n²) worst case.
function bucketSort(a: number[], minVal: number, maxVal: number, B: number): number[] {
  const buckets: number[][] = Array.from({ length: B }, () => []);
  const range = (maxVal - minVal) / B;

  // Phase 1: scatter into buckets.
  for (const v of a) {
    const idx = Math.min(B - 1, Math.floor((v - minVal) / range));
    buckets[idx].push(v);
  }

  // Phase 2: sort each bucket (insertion sort is typical).
  for (const bucket of buckets) bucket.sort((x, y) => x - y);

  // Phase 3: gather in order.
  const out: number[] = [];
  for (const bucket of buckets) out.push(...bucket);
  return out;
}

// Example — 9 values in [0, 100), 5 buckets of width 20.
bucketSort([29, 25, 3, 49, 9, 37, 21, 43, 17], 0, 100, 5);
// → [3, 9, 17, 21, 25, 29, 37, 43, 49]

References & further reading

6 sources

Knowledge check

Did the prototype land?

Quick questions, answers revealed on submit. Sign in to save your best score.

question 01 / 03

Bucket sort's expected runtime is O(n) when the data is uniform. What happens when the data is heavily skewed?

question 02 / 03

How does bucket sort relate to counting sort?

question 03 / 03

Why is bucket sort a natural fit for distributed sorting frameworks like Hadoop's TeraSort?

0/3 answered