Topological Sort

Overview

What this concept solves

A topological sort lays out the nodes of a DAG in a line so every edge points forward. If A must happen before B, then A sits to the left of B in the order. It's the right answer to 'what order should I do these tasks in, given that some require others?' — and the wrong question to even ask if your graph has a cycle.

There are two standard algorithms, both O(V + E) and both built on a traversal you already know. Kahn's algorithm repeatedly finds a node with zero remaining prerequisites (in-degree zero), emits it, then decrements its successors' in-degrees. It's the BFS-flavoured one — easy to write iteratively, easy to parallelise (all in-degree-zero nodes can run at the same time). DFS post-order runs DFS and appends each node to a list the moment its DFS call returns; reverse the list and you have a topological order. It's the DFS-flavoured one — three extra lines on top of plain DFS.

If you've ever waited for npm install to compute its install order, watched make skip a build step, queued tasks in Airflow, run a Spark DAG, or written a course-prerequisite recommender — you've used a topological sort. The same idea shows up in symbol resolution inside compilers, dependency injection containers, and database query planners.

Mechanics

How it works

Kahn's algorithm — the BFS-style one

Compute the in-degree of every node (how many edges point at it).
Put every in-degree-zero node into a queue — these have no prerequisites.
Loop: dequeue a node, emit it to the output, then for each of its outgoing edges decrement the neighbour's in-degree. If a neighbour's in-degree drops to zero, enqueue it.
When the queue empties, you have a valid order — unless the output count is less than V, which means there was a cycle.

DFS post-order — the DFS-style one

Run DFS over the whole graph (loop over every node and DFS into the unvisited ones).
When dfs(node) is about to return, push node onto a 'finished' stack.
Reverse the finished stack — that's a valid topological order.
To also detect cycles, use the white/grey/black classification — a grey-to-grey edge during DFS means there's a back edge, hence a cycle.

There can be many valid orders

Topological order isn't unique. If there are two in-degree-zero nodes at once, you can pick either — both produce valid orders. That's why Kahn's algorithm parallelises so well: at every step, all the currently in-degree-zero nodes are work that can run concurrently. Build systems like Bazel exploit this directly to drive parallel builds.

If the graph has a cycle, there is no order

A topological sort requires a Directed Acyclic Graph. If the graph has any cycle, both algorithms detect it: Kahn's because the output ends up shorter than V (the cycle's in-degrees never reach zero), DFS because of a grey-to-grey back edge. Production code should always check and surface a clear 'circular dependency' error — that's the kind of bug that turns a build system into an infinite loop.

Interactive prototype

Run it. Break it. Tune it.

Sandboxed simulation embedded right in the page. No setup, no install.

simulation › Topological Sort

About this simulation

A 7-node DAG with two algorithm tabs. Kahn's algorithm (BFS) pulls zero-in-degree nodes from a queue. DFS (reverse post-order) runs DFS and prepends each node when its subtree finishes. Both produce a valid build order — switch tabs to compare. Try cycle loads a graph with a cycle to show both algorithms correctly refusing.

Hands-on

Try these on your own

Open the prototype above, run each experiment, predict the answer, then verify.

try 01

Walk Kahn's algorithm (BFS)

Stay in the Kahn's algorithm tab and press Play. Watch the in-degree counter above each node count down as edges are consumed. Every node enters the queue the moment its in-degree hits zero — the width of the queue at any step is the maximum parallelism your build system could exploit then.

try 02

Walk DFS (reverse post-order)

Switch to the DFS (reverse post-order) tab. Now there's no queue — there's a call stack. DFS dives deep, then on the way back prepends each node to the result. The final order is different from Kahn's but still satisfies every edge.

try 03

Compare the two orders

Both Kahn and DFS produce valid topological orders, but usually different ones. Run each through completion and look at the build-order row — every edge u → v in the graph has u before v in both, but the specific positions can differ wherever the graph offers a choice.

try 04

Try cycle to see both detectors fire

Press Try cycle. In Kahn's tab the queue drains but some nodes never reach in-degree zero — cycle detected. Switch to DFS tab: the moment a back edge points to a node still on the call stack, the algorithm halts the same way. Same diagnosis, two different mechanisms.

In practice

When to use it — and what you give up

When it's the right tool

Build systems — make, Bazel, Webpack, CMake all topo-sort their targets to determine compile and link order.
Workflow schedulers — Airflow, Luigi, Dagster, GitHub Actions arrange tasks in a DAG and run them in topological order, fanning out where possible.
Course / curriculum planners — university degree-tracking tools sort courses by prerequisite so the student gets a feasible schedule.
Spreadsheet recalculation — when a cell changes, Excel topo-sorts the dependent cells so each one updates after its inputs.
Package managers — pip, npm, apt, cargo resolve dependencies into a topological install order.
Compilers — type checking, symbol resolution, and IR optimisation passes all topo-sort dependency graphs (use-def, call graph).

When to reach for something else

Your graph might have cycles — topo sort can't help. Use strongly connected components to collapse the cycles into super-nodes; the resulting condensation graph is a DAG and that can be topo-sorted.
You need shortest paths in the DAG — topo-sort the DAG first, then relax edges in topological order. Linear-time DAG shortest paths beat Dijkstra here.
You only care about reachability, not order — a plain DFS or BFS sweep is simpler and just as fast.

Pros

Linear time — O(V + E) for both Kahn's and DFS. Touches every node and every edge once.
Parallelism falls out naturally — Kahn's algorithm exposes every concurrently-ready node at once. Build systems exploit this for free parallelism.
Detects cycles for free — Kahn's via the output-count check, DFS via the back-edge test. No second pass needed.
Simple to implement — Kahn's is ~10 lines; DFS post-order is plain DFS plus one push on return.
Enables linear DAG shortest paths — once you have a topological order, you can compute single-source shortest paths in O(V + E), no priority queue needed.

Cons

DAG-only — the moment there's a cycle, both algorithms refuse. You need a 'detect-and-report' fallback in production.
Order isn't unique — fine for most uses, but if you depend on a stable order across runs you need a tiebreaker rule.
Doesn't optimise anything — it gives you a valid order, not the best one. Critical-path scheduling on top of topo-sort is the next step.

Reference

Code & further reading

A minimal reference implementation and pointers worth bookmarking.

topological-sort.ts

// Kahn's algorithm — BFS-flavoured topological sort with cycle detection.
function topoSortKahn(adj: Map<string, string[]>): string[] {
  // Step 1: compute in-degrees.
  const inDeg = new Map<string, number>();
  for (const u of adj.keys()) inDeg.set(u, 0);
  for (const nbs of adj.values())
    for (const v of nbs) inDeg.set(v, (inDeg.get(v) ?? 0) + 1);

  // Step 2: seed the queue with every zero-in-degree node.
  const queue: string[] = [];
  for (const [u, d] of inDeg) if (d === 0) queue.push(u);

  const order: string[] = [];
  while (queue.length > 0) {
    const u = queue.shift()!;
    order.push(u);
    for (const v of adj.get(u) ?? []) {
      const d = (inDeg.get(v) ?? 0) - 1;
      inDeg.set(v, d);
      if (d === 0) queue.push(v);
    }
  }

  // Step 3: cycle check.
  if (order.length !== adj.size)
    throw new Error("cycle detected — no topological order exists");
  return order;
}

// DFS post-order — same answer, three extra lines on top of DFS.
function topoSortDfs(adj: Map<string, string[]>): string[] {
  const seen = new Set<string>();
  const out: string[] = [];
  function dfs(u: string) {
    seen.add(u);
    for (const v of adj.get(u) ?? []) if (!seen.has(v)) dfs(v);
    out.push(u);           // on the way back up — post-order
  }
  for (const u of adj.keys()) if (!seen.has(u)) dfs(u);
  return out.reverse();     // reverse post-order is a topological order
}

References & further reading

6 sources

Knowledge check

Did the prototype land?

Quick questions, answers revealed on submit. Sign in to save your best score.

question 01 / 03

What is the key property of a topological order?

question 02 / 03

Kahn's algorithm finishes with `order.length < V`. What does this mean?

question 03 / 03

Why does the reverse of DFS post-order produce a topological sort?

0/3 answered