
Three problems I had to solve to teach algorithms in a browser

What I learned building Subroute — interactive in-browser simulations of system design algorithms. Three engineering problems, three teaching lessons.

system-design, engineering, web-development, learning, javascript

A few months ago, I noticed I was rereading the same ByteByteGo article on rate limiting for the fourth time.

I could recite the token bucket definition. I could draw the diagram. I could explain it in an interview. But if you'd asked me "what happens when the refill rate is half the request rate and the bucket is small?" — I'd have stared into the middle distance, run the math in my head, and given a noncommittal answer.

The diagram wasn't the problem. The diagram was actually pretty good. The problem was that the diagram was static, and the algorithm wasn't.

Token buckets, sliding windows, leaky buckets, LRU caches, garbage collectors, load balancers — these are all systems that do something over time. They have dynamics. Their interesting behavior is what happens at the edges: when load spikes, when memory fills, when one server dies. None of that lives in a diagram. It lives in what happens next.

So I built Subroute — a playground where every algorithm is a live simulation in the browser. You adjust the parameters, you watch it run, you break it. The goal is to skip the "stare at the diagram and imagine" step entirely.

Building it taught me three things I didn't expect.

Problem 1: Time is a liar in the browser

The first thing I learned is that you cannot trust the clock.

A rate limiter is fundamentally about time. Tokens refill at N per second. Windows tick over every X milliseconds. Requests arrive at some rate distribution. The whole concept depends on a consistent forward-marching clock.

My first naive implementation used setInterval(tick, 100). Tick the simulation every 100ms, advance the algorithm, render the new state. It worked beautifully — until I tabbed away to check Slack.

When you background a browser tab, most browsers throttle timers aggressively. A 100 ms setInterval is typically clamped to about one call per second, and sometimes far less often. Then I'd tab back, and the simulation would either freeze, lurch forward in a giant jump, or quietly desync, with the visible state ahead of the algorithm's internal state by minutes of simulated time.

Date.now() had the opposite problem. Real wall-clock time kept ticking, so if I used it as the source of truth, the bucket would silently "refill" 30 seconds worth of tokens the instant the tab regained focus. The simulation would jump, not freeze. Worse for teaching, because you couldn't see what had happened — you just saw the aftermath.
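To make the jump concrete, here is a minimal sketch of a token bucket with lazy refill. The class and parameter names are illustrative assumptions, not Subroute's actual code. The current time is passed in as an argument, so the example can replay a 30-second gap in wall-clock time without waiting for it.

```javascript
// Minimal token bucket with lazy refill (illustrative, not Subroute's code).
// `now` is injected in milliseconds so the caller decides which clock is used.
class TokenBucket {
  constructor({ capacity, refillPerSecond, now }) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity;
    this.lastRefill = now;
  }

  // Add the tokens earned since the last call, capped at capacity.
  refill(now) {
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefill = now;
  }

  // Returns true if the request is accepted.
  tryConsume(now) {
    this.refill(now);
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// Drain a 10-token bucket at t = 0, then "tab away" for 30 seconds.
const bucket = new TokenBucket({ capacity: 10, refillPerSecond: 1, now: 0 });
for (let i = 0; i < 10; i++) bucket.tryConsume(0);
const tokensBeforeGap = bucket.tokens; // 0: the bucket is empty

bucket.refill(30000); // the wall clock says 30 s passed while nothing rendered
const tokensAfterGap = bucket.tokens; // 10: full again in a single step
```

If `now` came from Date.now(), that last refill is what would happen on the first frame after the tab regains focus: the bucket goes from empty to full with no intermediate states to watch.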

The fix took two changes:

  1. A virtual clock. Every algorithm reads from simulationTime, not Date.now(). The simulation owns the concept of "now."
  2. A requestAnimationFrame driver. Instead of setInterval, I increment simulationTime by a controlled delta inside rAF. When the tab backgrounds, rAF pauses cleanly. When it foregrounds, it resumes from where it stopped. No drift, no surprise jumps.
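The two changes above can be sketched roughly as follows. This is one possible shape, not the code Subroute ships: VirtualClock, startDriver, the speed multiplier, and the maxFrameMs clamp are all assumptions made for illustration. The clamp is one way to keep the delta "controlled", so that a very long gap between frames advances simulated time by at most one bounded step.

```javascript
// Hypothetical virtual clock: the simulation's only source of "now".
class VirtualClock {
  constructor({ speed = 1, maxFrameMs = 50 } = {}) {
    this.simulationTime = 0; // simulated milliseconds
    this.speed = speed; // 0.25 = slow motion, 10 = fast-forward
    this.maxFrameMs = maxFrameMs; // assumed upper bound on one frame's delta
  }

  // Advance by one frame's worth of real time, clamped and then scaled.
  advance(realDeltaMs) {
    const clamped = Math.min(Math.max(realDeltaMs, 0), this.maxFrameMs);
    this.simulationTime += clamped * this.speed;
    return this.simulationTime;
  }
}

// Drive the clock from requestAnimationFrame. `raf` is injectable so the
// loop can also be exercised outside a browser.
function startDriver(
  clock,
  onTick,
  raf = (cb) => globalThis.requestAnimationFrame(cb)
) {
  let last = null;
  let running = true;
  function frame(timestamp) {
    if (!running) return;
    if (last !== null) onTick(clock.advance(timestamp - last));
    last = timestamp;
    raf(frame);
  }
  raf(frame);
  return () => {
    running = false; // call the returned function to stop the loop
  };
}

// Two normal frames, then the first frame after a long background pause.
const clock = new VirtualClock();
clock.advance(16);
clock.advance(16);
clock.advance(30000); // clamped to 50 ms instead of jumping 30 s
const afterPause = clock.simulationTime; // 82

clock.speed = 0.25; // what a speed slider would set
clock.advance(16); // adds 4 ms of simulated time
const afterSlowFrame = clock.simulationTime; // 86
```

Every algorithm would then read clock.simulationTime instead of Date.now(), which is also what makes a speed control a one-line change.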

The bonus, which I didn't see coming, is that once time is virtual it becomes adjustable. I can give readers a speed slider: 0.25x to play through a slow burst in detail, 10x to fast-forward through a hundred refills in seconds. The same virtual clock that fixed a bug became the basis for one of the most useful teaching tools in the whole thing.

That pattern kept showing up: the right primitive solves a bug and unlocks a feature you didn't plan for.

Problem 2: Randomness is the enemy of learning

Once the clock was solid, I plugged in a random workload generator — Poisson arrivals, exponential intervals, the standard textbook stuff — and pointed it at five rate-limiting algorithms in parallel.

They all looked identical.

Token bucket, leaky bucket, fixed window, sliding window log, sliding window counter: five legitimately different algorithms with five different trade-offs. Under a steady random workload they were indistinguishable on the chart.

Which makes sense in hindsight: when arrivals are random around a steady average rate, limiters configured for the same limit end up accepting roughly the same fraction of requests over time. The differences only show up under patterns such as bursts, sustained pressure, and mixed traffic.

This is the gap between academic descriptions of algorithms and what they actually do in production. Real traffic isn't uniform random. Real traffic is Zipfian (a few keys dominate everything), bursty (most of the day is quiet, then 10x in a 30-second window), or scan-heavy (a backup job sweeps the entire keyspace once, blowing past every cache).
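As an illustration of what "a few keys dominate everything" means, here is a small sketch of a Zipfian key sampler. The function name, the exponent s, and the injectable rng parameter are assumptions for the example. Key k is drawn with probability proportional to 1 / (k + 1)^s.

```javascript
// Build a sampler over keys 0..n-1 where key k has weight 1 / (k + 1)^s.
// `rng` must return a number in [0, 1); it defaults to Math.random.
function makeZipfSampler(n, s = 1, rng = Math.random) {
  const cdf = []; // cumulative weights
  let total = 0;
  for (let k = 0; k < n; k++) {
    total += 1 / Math.pow(k + 1, s);
    cdf.push(total);
  }
  return function sample() {
    const target = rng() * total;
    // Binary search for the first cumulative weight above the target.
    let lo = 0;
    let hi = n - 1;
    while (lo < hi) {
      const mid = (lo + hi) >> 1;
      if (cdf[mid] > target) hi = mid;
      else lo = mid + 1;
    }
    return lo;
  };
}

// Count how often each of 100 keys is requested over 10,000 draws.
const sampleKey = makeZipfSampler(100, 1);
const counts = new Array(100).fill(0);
for (let i = 0; i < 10000; i++) counts[sampleKey()]++;
```

With s = 1 and 100 keys, key 0 carries a bit under a fifth of the total weight (1 divided by the 100th harmonic number, about 5.19), while the last key carries well under one percent.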

I rewrote the workload generator with two changes:

  1. Named presets instead of free parameters. "Bursty," "scan," "Zipfian," "diurnal" — each one carefully constructed to make the differences between algorithms visible. The scan preset, for instance, is the single best demonstration of why ARC and LIRS exist and why vanilla LRU doesn't survive contact with production.
  2. Seeded RNG. Every preset uses a fixed seed by default, which means everyone who clicks "scan workload" sees the exact same request stream. Reproducible. When someone tells me "the LRU panel looks broken at second 47," I can load the same seed and look at the same second.
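A rough sketch of how named presets and a seeded RNG can fit together. The source does not say which generator Subroute uses; mulberry32 is a well-known small PRNG swapped in here for illustration, and the preset names, seeds, and rate shapes are invented for the example.

```javascript
// mulberry32: a small, fast seeded PRNG returning floats in [0, 1).
// Assumption: used here only as a stand-in for whatever Subroute uses.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical presets: a fixed seed plus a base request rate per tick.
const PRESETS = {
  // Quiet baseline with a 10x burst during ticks 30..59 of each 300-tick cycle.
  bursty: {
    seed: 42,
    rate: (tick) => (tick % 300 >= 30 && tick % 300 < 60 ? 10 : 1),
  },
  steady: { seed: 7, rate: () => 3 },
};

// Requests arriving in each tick: the base rate plus seeded jitter of 0 or 1.
function generate(presetName, ticks) {
  const { seed, rate } = PRESETS[presetName];
  const rng = mulberry32(seed); // fresh generator per run, so runs repeat
  const stream = [];
  for (let tick = 0; tick < ticks; tick++) {
    stream.push(rate(tick) + (rng() < 0.5 ? 0 : 1));
  }
  return stream;
}

// Two independent runs of the same preset produce the same stream.
const runA = generate("bursty", 100);
const runB = generate("bursty", 100);
const identical = runA.every((n, i) => n === runB[i]); // true
```

Because the generator is re-created from the preset's seed on every run, "second 47" refers to the same requests for everyone who loads that preset.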

The seeded-RNG decision turned out to matter for a reason I didn't foresee: it made the simulations shareable. A reader can screenshot a moment and another reader can reproduce it. The simulation becomes a thing two people can have an argument about, which is what learning resources should be.

Problem 3: Side-by-side is a feature, not a layout

The third decision was the one that changed the whole product.

My early prototype showed one algorithm at a time. Pick "token bucket" from a dropdown, watch it run, switch to "leaky bucket," watch that one run separately. This is how most existing resources handle it: one algorithm per page, and you flip between them.

It doesn't work. By the time you've switched from "token bucket" to "leaky bucket" and watched it for ten seconds, you've already forgotten what the token bucket looked like. Comparison becomes a memory exercise.

The change that fixed it sounds trivial: render all five algorithms simultaneously, side by side, on the same canvas. But the implementation matters — they had to share one workload stream, not five independent ones. Otherwise you're back to comparing averages of randomness.

The architecture became:

  • One workload generator, producing a single stream of requests per simulation tick.
  • Five algorithm panels, each subscribing to the same stream as a consumer.
  • Each panel computes its own decision (accept/reject, hit/miss, route/queue) and renders its own state.
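The architecture above can be sketched with two deliberately tiny limiters standing in for the five panels. The onRequest interface, the factory names, and the request timestamps are assumptions for illustration. What matters is that both panels receive exactly the same requests, so any difference in the result comes from the algorithm and not from the workload.

```javascript
// Fixed window: allow `limit` requests per aligned window of `windowMs`.
function fixedWindow(limit, windowMs) {
  let windowStart = 0;
  let count = 0;
  return {
    name: "fixed window",
    onRequest(t) {
      if (t - windowStart >= windowMs) {
        windowStart = Math.floor(t / windowMs) * windowMs;
        count = 0;
      }
      if (count < limit) {
        count++;
        return true;
      }
      return false;
    },
  };
}

// Sliding window log: allow `limit` accepted requests in the last `windowMs`.
function slidingLog(limit, windowMs) {
  const log = []; // timestamps of accepted requests, oldest first
  return {
    name: "sliding window log",
    onRequest(t) {
      while (log.length && log[0] <= t - windowMs) log.shift();
      if (log.length < limit) {
        log.push(t);
        return true;
      }
      return false;
    },
  };
}

// Fan one request stream out to every panel and tally accepted requests.
function runSideBySide(requestTimes, panels) {
  const accepted = Object.fromEntries(panels.map((p) => [p.name, 0]));
  for (const t of requestTimes) {
    for (const panel of panels) {
      if (panel.onRequest(t)) accepted[panel.name]++;
    }
  }
  return accepted;
}

// A burst straddling the 1000 ms boundary: 5 requests before it, 5 after.
const burst = [900, 920, 940, 960, 980, 1000, 1020, 1040, 1060, 1080];
const result = runSideBySide(burst, [
  fixedWindow(5, 1000),
  slidingLog(5, 1000),
]);
// result: { "fixed window": 10, "sliding window log": 5 }
```

On this stream the fixed window lets all ten requests through because its counter resets at t = 1000, while the sliding log rejects the second half of the burst. With two independent random streams, that difference would be much harder to attribute to the algorithms.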

The result is the biggest "aha" moment I've shipped. Hit the scan preset on the cache eviction page and watch four of the policies degrade in real time while ARC and LIRS hold their hit rate. You don't have to explain scan resistance after that. The reader has seen it.
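For readers who want the scan effect in numbers, here is a minimal sketch of plain LRU losing its hot set to a single one-pass scan. The cache size and key names are made up for the example, and ARC and LIRS are not implemented here.

```javascript
// Minimal LRU cache on top of Map, which iterates in insertion order.
class LruCache {
  constructor(capacity) {
    this.capacity = capacity;
    this.map = new Map();
  }

  // Returns true on a hit. On a miss, inserts the key, evicting the
  // least recently used entry if the cache is full.
  access(key) {
    const hit = this.map.has(key);
    if (hit) {
      this.map.delete(key); // re-inserted below to mark it most recent
    } else if (this.map.size >= this.capacity) {
      this.map.delete(this.map.keys().next().value); // oldest entry
    }
    this.map.set(key, true);
    return hit;
  }
}

// Fraction of accesses in `keys` that hit the cache.
function hitRate(cache, keys) {
  let hits = 0;
  for (const k of keys) if (cache.access(k)) hits++;
  return hits / keys.length;
}

const cache = new LruCache(4);
const hot = ["a", "b", "c", "d"];
hitRate(cache, hot); // warm-up: all misses
const before = hitRate(cache, hot); // 1: every hot key is resident
hitRate(cache, ["s1", "s2", "s3", "s4"]); // one-pass scan of cold keys
const after = hitRate(cache, hot); // 0: the scan evicted the whole hot set
```

Four keys that will never be requested again were enough to push out four keys that are requested constantly, which is the failure mode scan-resistant policies are designed to avoid.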

This is the heuristic I now use to evaluate every new simulation: can a reader feel the difference between two algorithms in 30 seconds of playing? If not, it's not done.

What this taught me about teaching

The pattern across all three problems was the same. Each one started as "how do I make this work technically" and ended as "this is what makes the algorithm legible."

A virtual clock fixed a bug, then became the speed slider — and the speed slider is how readers see slow-motion bursts in detail.

Seeded workloads fixed a reproducibility issue, then became the named presets — and the named presets are what reveal the algorithms' actual differences.

Side-by-side rendering fixed a comparison problem, then became the core layout — and the core layout is what turns five separate articles into one playground.

Reading about algorithms tells you what they are. Diagrams tell you what they look like. But intuition — the kind you need to make architecture decisions, debug latency spikes, or answer an interview follow-up — only comes from watching them behave. The simulation is the teacher. The words around it are labels.

What's next

Subroute today covers six topics: rate limiting, cache eviction, cache write policies, garbage collection, memory allocation, and load balancing. The next batch is on the runway: queues, consistent hashing, replication strategies, and the hard one, consensus.

Raft and Paxos are where this approach has the most to prove. Most explanations of consensus are very good at describing the happy path and very bad at conveying what happens during a network partition. That's exactly where a simulation should win: slice the cluster in half, watch the leader election, scrub time backward, see exactly which node thought what when.

If you want to poke the current set, subroute.dev is free and there's no signup. If you've built or operated any of these in production and something in a simulation feels off, I want to hear about it — that's the feedback that makes the next version better than the last.

This is the canonical version of this post. If you found it on Hashnode, dev.to, or Medium, the original lives here on Subroute.
