Beginner · 9 min read · live prototype

The State Machine

Closed lets calls through. Open rejects them all. Half-Open peeks at recovery with a handful of trials. Learn the three-state dance before any of the trip rules.

Overview

What this concept solves

Before you compare 'count vs time vs slow-call' rules, you need the states they all share. Every circuit breaker of this kind (Hystrix, Resilience4j, Polly, your own ten-line one) is the same finite state machine with three positions. The variants differ only in what makes them transition.

Closed is normal. Calls go through, the breaker just watches the results. Open is the trip: every call is rejected instantly, without even being attempted. Half-Open is the careful peek in between: after a cooldown, a small number of trial calls are allowed through. If those trials succeed, the breaker closes again. If they fail, it opens for another cooldown.

Learn this once and the rest of the topic is just five different rules for the question 'when do we leave Closed?'

Mechanics

How it works

The three states

  • Closed — traffic flows. Every call is attempted. The breaker tallies results in some window (last N calls, last N seconds, etc.) and trips when whatever signal it cares about crosses a threshold.
  • Open — every new call is rejected immediately, with an exception or short-circuit error. Nothing reaches the downstream. A timer counts down a cooldown (often a few seconds).
  • Half-Open — the cooldown ended. A fixed number of trial calls (commonly 2–5) are allowed through. Their results decide: enough passes → back to Closed; even one significant failure → back to Open and the cooldown restarts.
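One way to picture the three states is as a tagged union in which each state carries only the bookkeeping it needs. This is a minimal sketch, not any library's model: the names `BreakerState`, `recentResults`, `openUntil`, `trialsLeft` and `admits` are hypothetical.

```typescript
// Hypothetical shapes: each state carries only the data it needs.
type BreakerState =
  | { kind: "CLOSED"; recentResults: boolean[] } // window of outcomes being tallied
  | { kind: "OPEN"; openUntil: number }          // timestamp (ms) when the cooldown ends
  | { kind: "HALF_OPEN"; trialsLeft: number };   // trial calls still to be admitted

// Would a call arriving at time `now` reach the downstream?
function admits(state: BreakerState, now: number): boolean {
  switch (state.kind) {
    case "CLOSED":
      return true;                   // every call is attempted
    case "OPEN":
      return now >= state.openUntil; // rejected until the cooldown elapses,
                                     // then the next call becomes the first trial
    case "HALF_OPEN":
      return state.trialsLeft > 0;   // only the trial calls get through
  }
}
```

Keeping the timer only on Open and the trial budget only on Half-Open makes it hard to read a stale field from the wrong state.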

The three transitions

  1. Closed → Open — the trip. The trip rule is what changes between variants; the transition is the same: stop accepting calls, start the cooldown.
  2. Open → Half-Open — the timer's job. After the cooldown elapses, automatically move to Half-Open so the next call can be a trial.
  3. Half-Open → Closed / Open — the decision. A configurable count of trials decides. Pass → Closed and the window resets. Fail → Open, cooldown again.
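The transitions listed above can be written as a pure function from (state, event) to the next state. This is a sketch for illustration; the event names (`RULE_TRIPPED`, `COOLDOWN_ELAPSED`, `TRIALS_PASSED`, `TRIAL_FAILED`) are made up here and do not come from any library.

```typescript
type State = "CLOSED" | "OPEN" | "HALF_OPEN";

// Hypothetical event names, one per arrow in the state diagram.
type BreakerEvent =
  | "RULE_TRIPPED"      // Closed -> Open
  | "COOLDOWN_ELAPSED"  // Open -> Half-Open
  | "TRIALS_PASSED"     // Half-Open -> Closed
  | "TRIAL_FAILED";     // Half-Open -> Open

// Returns the next state, or the same state when the event
// does not apply in the current state.
function next(state: State, event: BreakerEvent): State {
  if (state === "CLOSED" && event === "RULE_TRIPPED") return "OPEN";
  if (state === "OPEN" && event === "COOLDOWN_ELAPSED") return "HALF_OPEN";
  if (state === "HALF_OPEN" && event === "TRIALS_PASSED") return "CLOSED";
  if (state === "HALF_OPEN" && event === "TRIAL_FAILED") return "OPEN";
  return state;
}

// A full recovery: trip, wait out the cooldown, pass the trials.
const recovery = ["RULE_TRIPPED", "COOLDOWN_ELAPSED", "TRIALS_PASSED"] as const;
let current: State = "CLOSED";
for (const event of recovery) current = next(current, event);
console.log(current); // prints CLOSED
```

Events that do not apply (for example a cooldown timer firing while Closed) are ignored rather than treated as errors, which keeps the function total.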

Why Half-Open at all?

Without Half-Open you'd have to choose between two bad options: stay Open forever (you'd never recover) or jump straight back to Closed and immediately retrip if the downstream is still sick. Half-Open is the cheap probe — a few calls' worth of risk to find out whether the downstream is healthy, instead of unleashing your full traffic on a service that just came back from the dead.

Half-Open is fragile by design

While Half-Open, the breaker is more sensitive than usual — a single bad trial typically re-trips it. That's on purpose. You don't want to declare 'fully recovered' on one lucky call, and you don't want to keep probing if the early evidence says no.
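The strict rule described here (one bad trial re-trips, a full set of good trials closes) can be sketched as a small decision function. The names `Decision` and `decideHalfOpen` are hypothetical, and real libraries may instead judge trials by a failure rate.

```typescript
type Decision = "CLOSED" | "OPEN" | "UNDECIDED";

// `trials` holds the outcomes seen so far in Half-Open (true = success).
function decideHalfOpen(trials: boolean[], trialCount: number): Decision {
  if (trials.some((ok) => !ok)) return "OPEN";      // one bad trial is enough to re-trip
  if (trials.length >= trialCount) return "CLOSED"; // every required trial passed
  return "UNDECIDED";                               // keep probing
}

console.log(decideHalfOpen([true], 3));             // prints UNDECIDED
console.log(decideHalfOpen([true, true, true], 3)); // prints CLOSED
console.log(decideHalfOpen([true, false], 3));      // prints OPEN
```

The asymmetry is deliberate: failure is decided on the first bad result, while success needs the whole trial budget.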

Interactive prototype

Run it. Break it. Tune it.

Sandboxed simulation embedded right in the page. No setup, no install.

About this simulation

The three-state dance you'll see in every variant. Pick a scenario — first trip, recovery succeeds, recovery fails — and step through with Prev / Next, or jump into Free play to send calls yourself. Watch the badge change colour, the arrows light up, and the narration explain each move. No trip rules to learn yet — just the state machine itself.

Hands-on

Try these on your own

Open the prototype above, run each experiment, predict the answer, then verify.

try 01

Walk the first trip

Open the First trip scenario and click Next through every step. Watch the calls land in the window, the badge turn red when the rule trips, and the arrow Closed → Open light up. Note that no call reaches the downstream once Open — every reject is instant.

try 02

Recovery succeeds

Switch to the Recovery succeeds scenario. The breaker is already Open and the cooldown is ticking. Step forward until it elapses and moves to Half-Open. Send the configured number of trial calls — all pass — and watch the transition to Closed. The window starts fresh.

try 03

Recovery fails

Switch to Recovery fails. Same setup, but the very first trial in Half-Open fails. Watch the breaker slam back to Open and restart the cooldown. This is by design — Half-Open is a probe, not a commitment.

try 04

Free play — break it yourself

Open Free play. Click 'healthy call' a few times, then switch to 'failing call' to see exactly how many bad ones it takes to trip with the default rule. While Open, click 'skip wait' to jump straight to Half-Open. The same controls work for every other variant in this topic; only the trip rule changes.

In practice

When to use it — and what you give up

When this is enough on its own

  • Always — every circuit breaker uses these states. The question is only which trip rule sits on top.
  • For learning — implement the states first with a hard-coded rule (e.g. 'fail 3 times in a row'); plug in a real signal once the dance is comfortable.
  • For very small services — a basic Closed/Open/Half-Open with a fixed-count trip rule is often all you need before reaching for a library.

What the libraries call them

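Resilience4j: CLOSED, OPEN, HALF_OPEN (plus special states such as DISABLED and FORCED_OPEN for ops overrides). Polly: Closed, Open, HalfOpen, Isolated. Hystrix: the same three names. Envoy is the odd one out: what it calls 'circuit breaking' is a set of connection and request limits, and its closest analogue to this machine is outlier detection, which ejects a failing host for a period and then returns it to the pool without a separate Half-Open trial phase. Envoy aside, it is different vocabulary for one machine.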

Pros

  • Universal — every variant in this topic and every production library uses the same three states.
  • Tiny — the bookkeeping is literally three enum values and a timestamp.
  • Composable — swap in any trip rule (count, time, latency, %) without changing the states or transitions.
  • Self-recovering — the cooldown + Half-Open probe means the breaker tries to come back without human intervention.
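To make the 'Composable' point concrete, here is a sketch of a pluggable trip rule. The `TripRule` interface and both rule classes are hypothetical names invented for this example; the point is only that the caller of `record` does not care which rule it holds.

```typescript
// Hypothetical interface: a rule sees each outcome and says whether to trip.
interface TripRule {
  record(success: boolean): boolean; // true means "leave Closed now"
  reset(): void;
}

// "Fail N times in a row."
class ConsecutiveFailures implements TripRule {
  private streak = 0;
  private readonly limit: number;
  constructor(limit: number) { this.limit = limit; }
  record(success: boolean): boolean {
    this.streak = success ? 0 : this.streak + 1;
    return this.streak >= this.limit;
  }
  reset(): void { this.streak = 0; }
}

// "At least `ratio` of the last `size` calls failed."
class FailureRate implements TripRule {
  private window: boolean[] = [];
  private readonly size: number;
  private readonly ratio: number;
  constructor(size: number, ratio: number) { this.size = size; this.ratio = ratio; }
  record(success: boolean): boolean {
    this.window.push(success);
    if (this.window.length > this.size) this.window.shift();
    if (this.window.length < this.size) return false; // wait for a full window
    const failed = this.window.filter((ok) => !ok).length;
    return failed / this.size >= this.ratio;
  }
  reset(): void { this.window = []; }
}

// Feeds outcomes to any rule; returns how many calls were seen when it tripped, or -1.
function tripsAfter(rule: TripRule, outcomes: boolean[]): number {
  let seen = 0;
  for (const ok of outcomes) {
    seen++;
    if (rule.record(ok)) return seen;
  }
  return -1;
}
```

A breaker built this way would call `rule.record(...)` while Closed and `rule.reset()` when it closes again; the Open and Half-Open handling stays untouched when the rule is swapped.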

Cons

  • Not a complete strategy — the states tell you when to fail fast, not what response to send to the caller. Pair with a fallback (cached value, default, secondary backend).
  • Half-Open is a thin moment — if you set the trial count to 1, a single unlucky failure during recovery sends you back to Open even if the downstream is healthy.
  • Fixed cooldown can be too eager or too lazy — too short and you slam a struggling downstream; too long and you keep failing requests after it has recovered. That's the problem the Adaptive variant exists to solve.
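The first point above, pairing the breaker with a fallback, can be sketched as a thin wrapper. `withFallback` is a hypothetical helper, not part of any library, and it assumes the protected call rejects both on real failures and on the breaker's own fail-fast error.

```typescript
// Try the protected call; on any rejection, answer from the fallback instead.
async function withFallback<T>(
  protectedCall: () => Promise<T>,
  fallback: () => T,
): Promise<T> {
  try {
    return await protectedCall();
  } catch {
    return fallback(); // cached value, default, or secondary backend
  }
}

// Stand-in for a call guarded by a breaker that is currently Open.
const rejectFast = async (): Promise<string> => {
  throw new Error("circuit open");
};

withFallback(rejectFast, () => "cached price").then((value) => console.log(value));
// prints cached price
```

In a real service you would probably want to distinguish the fail-fast rejection from other errors, for example to log them differently; this sketch treats them the same.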

Reference

Code & further reading

A minimal reference implementation and pointers worth bookmarking.

circuit-breaker-state.ts
// The smallest useful circuit breaker: three states, one rule.
type State = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreaker {
  private state: State = "CLOSED";
  private failures = 0;          // consecutive failures while CLOSED
  private trialsStarted = 0;     // HALF_OPEN trials admitted so far
  private trialsPassed = 0;      // HALF_OPEN trials that have succeeded
  private openUntil = 0;         // epoch ms at which OPEN may start probing

  constructor(
    private failThreshold = 5,   // consecutive fails in CLOSED that trip it
    private cooldownMs = 4000,   // how long OPEN waits before probing
    private trialCount = 3,      // how many HALF_OPEN trials to decide
  ) {}

  async call<T>(work: () => Promise<T>): Promise<T> {
    if (this.state === "OPEN") {
      if (Date.now() < this.openUntil) throw new Error("circuit open: fail fast");
      this.state = "HALF_OPEN";   // cooldown over; this call is the first trial
    }
    // Admit only trialCount trials; reject extras until the trials decide.
    if (this.state === "HALF_OPEN" && ++this.trialsStarted > this.trialCount) {
      throw new Error("circuit half-open: trial limit reached");
    }

    try {
      const result = await work();
      this.onSuccess();
      return result;
    } catch (err) {
      this.onFailure();
      throw err;
    }
  }

  private onSuccess() {
    if (this.state === "HALF_OPEN") {
      if (++this.trialsPassed >= this.trialCount) this.close();
    } else if (this.state === "CLOSED") {
      this.failures = 0;          // a success ends the failure streak
    }
  }

  private onFailure() {
    if (this.state === "OPEN") return;  // late result from before the trip
    if (this.state === "HALF_OPEN") {
      this.trip();                // one bad trial re-opens
      return;
    }
    if (++this.failures >= this.failThreshold) this.trip();
  }

  private trip() {
    this.state = "OPEN";
    this.openUntil = Date.now() + this.cooldownMs;
    this.failures = 0;
    this.trialsStarted = 0;
    this.trialsPassed = 0;
  }

  private close() {
    this.state = "CLOSED";
    this.failures = 0;
    this.trialsStarted = 0;
    this.trialsPassed = 0;
  }
}

Knowledge check

Did the prototype land?

question 01 / 03

What does the breaker do with a call that arrives while it is OPEN?

question 02 / 03

Why does the breaker spend time in HALF-OPEN instead of going straight from OPEN back to CLOSED?

question 03 / 03

While in HALF-OPEN, a single trial call fails. What happens next?
