State Transition Rules Need a Home

When inheriting an unfamiliar system, there's one thing I almost always do: find a core business object — order, ticket, approval, whatever — then open three or four files and try to draw a state diagram in my head.

Not from documentation. If docs exist, they're likely outdated. From code. Read OrderService.cancel(), see if order.Status != Paid. Read OrderService.refund(), see if order.Status == Paid || order.Status == Delivered. Read OrderService.complete(), discover it doesn't check status at all — was that intentional or forgotten? Git blame shows three if-statements written by three different people in different quarters. Each looks reasonable alone, but pieced together, nobody can say how many legal transition paths exist from the paid state.

This takes half an hour. The resulting diagram looks roughly like: 5 states, 7 arrows, 2 question marks — because two edge cases can't be derived from code, and the original author has left.

This half-hour shouldn't be necessary. This diagram should already be in the code.

That's what I want to discuss: why state transition rules in most code don't have a clear home, and what happens when you make them explicit.

Not a Bad Habit — a Good Habit Past Its Expiration

Why do we default to enum + if/else for state management? Not laziness, not lack of skill — this approach is entirely reasonable at the starting line.

When a business object is new, states are few. Order: pending → paid → shipped. Three states, two operations. Adding an if-statement before each operation to check state is the most direct approach. You don't even think of this as "state management" — it's just a few precondition checks, same nature as "parameter can't be empty."

The problem isn't here.

The problem is that business states follow a common pattern: they grow more easily than they shrink. Three states today, five in six months, seven in a year. Each time you add a new state, you naturally add new if-statements in new operations. Each time, entirely reasonable. Nobody says in sprint planning "let's refactor state management" — unless something has already broken.

But ten individually reasonable local decisions, stacked together, can produce an unreasonable global result. This isn't a design failure — it's organizational inertia. If state transitions don't have a designated home, they live in everyone's code, each version slightly different.

The point isn't to criticize if/else, but to recognize there's a threshold you cross quietly if you're not paying attention. On this side, if/else is the best choice. On the other side, it starts producing a particular kind of technical debt: not bad code, but lost design knowledge.

Three Ways Implicit State Machines Decay

When state transition rules are expressed entirely through if/else, there is a state machine in the code, but it's not written out; it has to be derived. To answer "can state X perform operation Y," you can't look in one place. You have to find every check that touches Y and combine them in your head.

This kind of implicit state machine has three decay mechanisms. These are my observations — might not be complete.

First: dispersion.

The same rule gets expressed repeatedly, each time potentially inconsistently. Back to orders: cancel checks if order.Status != Paid, refund checks if order.Status == Paid || order.Status == Delivered. Both express "what the order can do in which state," but using complementary logic — one excludes, one enumerates.

Two months later, someone adds PartiallyShipped. They update the refund check (adding || order.Status == PartiallyShipped) but not the cancel check — because they didn't know a state check for cancellation existed, in another file.
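
A minimal sketch of that divergence, in Go. The Status type and the canCancel and canRefund helpers are hypothetical stand-ins for the checks inside OrderService, and I'm assuming the cancel check means "cancel is allowed unless the order is Paid." Only the two conditions come from the example above.

```go
package main

import "fmt"

// Status and its values are illustrative names, not the real system's.
type Status int

const (
    Pending Status = iota
    Paid
    Delivered
    PartiallyShipped // the state added two months later
)

// canCancel is written as a denylist: everything except Paid passes.
// Nobody edited it when PartiallyShipped was introduced.
func canCancel(s Status) bool { return s != Paid }

// canRefund is written as an allowlist and was updated for the new state.
func canRefund(s Status) bool {
    return s == Paid || s == Delivered || s == PartiallyShipped
}

func main() {
    // The untouched denylist still has an opinion about the new state.
    fmt.Println("cancel partially shipped:", canCancel(PartiallyShipped)) // true
    fmt.Println("refund partially shipped:", canRefund(PartiallyShipped)) // true
}
```

The denylist quietly says yes to a state it has never heard of. Maybe that's the right business answer, maybe not; the point is that nobody was ever asked.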

This isn't something code review easily catches. Unless the reviewer happens to remember "cancel has a state check over there too" — but six months later, nobody remembers.

Second: combinatorial blind spots.

N states × M operations = N×M possible combinations. For example, 6 states × 7 operations = 42. Of these, perhaps a dozen are legal; the remaining thirty-something should be rejected.

The if/else pattern doesn't tell you whether those thirty-something illegal combinations are all covered. How do you check? String together all operations' check logic for an audit? Nobody has done this. I haven't either.

More troubling: someone might add a defensive check that accidentally rejects a legal combination ("let me add a check to be safe"). The motivation is good, but when checks are scattered everywhere, you can't distinguish "this restriction is a business rule" from "a programmer added this just in case."

Third: assumption rot.

Every if carries an implicit assumption. if status == A || status == B assumes "only A and B can perform this operation." This assumption is correct the day it's written.

But assumptions don't have expiration dates. When new state C appears, new operations emerge, business rules shift — old if-statements don't update themselves. Code isn't like documentation — documentation can be called outdated; code keeps running and looks correct.

This is what makes this type of technical debt insidious: not "broken code" but "code still running, but its assumptions are no longer true." When you discover it, it's usually not through review but through a production bug.

Giving Transition Rules a Home

Having covered the problems, let's talk about the cleaner approach.

The core idea is simple: treat state transition rules as an independent thing, not an appendage of each operation.

At the code level, this doesn't require frameworks, libraries, or even design patterns. A table suffices. In Go, a map[State]map[Event]State; in Java, a Map<State, Map<Event, State>>; in Python, a nested dict. Language doesn't matter — the structure does:

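import "fmt"

type State string
type Event string

// transitions is the single source of truth: current state -> event -> next state.
// Anything not listed here is illegal.
var transitions = map[State]map[Event]State{
    "pending":   {"pay": "paid", "cancel": "canceled"},
    "paid":      {"ship": "shipped", "refund": "refunding"},
    "shipped":   {"deliver": "delivered"},
    "delivered": {"refund": "refunding"},
    "refunding": {"complete_refund": "refunded"},
    "canceled":  {}, // terminal
    "refunded":  {}, // terminal
}

// Next returns the state reached from s on event e, or an error if the
// table declares no such transition. On error, s is returned unchanged.
func (s State) Next(e Event) (State, error) {
    nextStates, ok := transitions[s]
    if !ok {
        return s, fmt.Errorf("unknown state: %s", s)
    }
    next, ok := nextStates[e]
    if !ok {
        return s, fmt.Errorf("illegal transition: %s -[%s]→ ???", s, e)
    }
    return next, nil
}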

Not claiming this short snippet is magical. The value isn't in the code itself; it's in externalizing what was in your head.

Before externalizing, state transition rules are a runtime emergent behavior: run the code, see what it does, infer the rules. After externalizing, they become a static design artifact: read it line by line, review it, show it to the product manager — "look, under current business rules, pending can only be paid or canceled, right?"

This table is what I meant by "a home for transition rules." It doesn't need to be big or complex. But it has one key property: anything not declared is illegal. No defensive if-statements needed in every operation, no else fallbacks. The table is your complete answer to "what's a legal transition." Everything outside it is rejected.
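
To make "anything not declared is illegal" concrete, here is a small sketch that counts what the table declares against the full state × event matrix. The audit function is a hypothetical helper of mine; the table is repeated only so the block compiles on its own.

```go
package main

import "fmt"

type State string
type Event string

// Same table as above, repeated so this sketch is self-contained.
var transitions = map[State]map[Event]State{
    "pending":   {"pay": "paid", "cancel": "canceled"},
    "paid":      {"ship": "shipped", "refund": "refunding"},
    "shipped":   {"deliver": "delivered"},
    "delivered": {"refund": "refunding"},
    "refunding": {"complete_refund": "refunded"},
    "canceled":  {},
    "refunded":  {},
}

// audit returns the number of declared transitions and the size of the
// full state x event matrix implied by the table.
func audit(t map[State]map[Event]State) (legal, total int) {
    events := map[Event]bool{}
    for _, byEvent := range t {
        for e := range byEvent {
            events[e] = true
        }
        legal += len(byEvent)
    }
    return legal, len(t) * len(events)
}

func main() {
    legal, total := audit(transitions)
    // Prints: 7 legal, 35 rejected, 42 total
    fmt.Printf("%d legal, %d rejected, %d total\n", legal, total-legal, total)
}
```

Seven states and six distinct events happen to give 42 combinations here. The 35 that aren't in the table are rejected without anyone having written a check for them.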

One more small thing worth mentioning: this table is the best business documentation. New hires wanting to understand "what does the order lifecycle look like" can look at the table. Six months later, when you return to modify this code, look at the table. As long as every state change goes through it, the table doesn't lie and doesn't go stale, because the code executes by it. The gap between documentation and implementation disappears.
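
If you want the diagram itself and not just the table, one option is to generate it. This is a sketch, not a recommendation of any particular tool: toDOT is a hypothetical helper that emits Graphviz DOT text, which assumes you have something on hand to render DOT.

```go
package main

import (
    "fmt"
    "sort"
    "strings"
)

type State string
type Event string

// Same table as above, repeated so this sketch is self-contained.
var transitions = map[State]map[Event]State{
    "pending":   {"pay": "paid", "cancel": "canceled"},
    "paid":      {"ship": "shipped", "refund": "refunding"},
    "shipped":   {"deliver": "delivered"},
    "delivered": {"refund": "refunding"},
    "refunding": {"complete_refund": "refunded"},
    "canceled":  {},
    "refunded":  {},
}

// toDOT renders every declared transition as one DOT edge. Edges are
// sorted because Go map iteration order is not stable.
func toDOT(t map[State]map[Event]State) string {
    var lines []string
    for from, byEvent := range t {
        for e, to := range byEvent {
            lines = append(lines, fmt.Sprintf("  %q -> %q [label=%q];", from, to, e))
        }
    }
    sort.Strings(lines)
    return "digraph order {\n" + strings.Join(lines, "\n") + "\n}\n"
}

func main() {
    fmt.Print(toDOT(transitions))
}
```

The diagram is derived from the same data the code executes, so it can't drift the way a hand-drawn one does.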

Bonus Pickups

After giving transition rules a centralized entry point, some things that were very hard become almost free. Not design goals — incidental benefits.

State change logging. Who changed what from A to B when, triggered by which event. In the if/else pattern, you add logging to every operation — someone always forgets. Now wrap it once in one place — and the event itself is the "why," so logs automatically carry business semantics.
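
A sketch of what "wrap it once" might look like. The Order struct, the Apply function, and the actor parameter are hypothetical names of mine, and the table is cut down to keep the example short. The standard log package adds the timestamp.

```go
package main

import (
    "fmt"
    "log"
)

type State string
type Event string

// A shortened table; the full one works the same way.
var transitions = map[State]map[Event]State{
    "pending":  {"pay": "paid", "cancel": "canceled"},
    "paid":     {"ship": "shipped"},
    "shipped":  {},
    "canceled": {},
}

// Order is a hypothetical minimal business object.
type Order struct {
    ID     string
    Status State
}

// Apply is the only place an order's status changes, so it is also the
// only place that needs to log. The event name carries the "why".
func Apply(o *Order, e Event, actor string) error {
    next, ok := transitions[o.Status][e]
    if !ok {
        log.Printf("order=%s actor=%s rejected: %s -[%s]", o.ID, actor, o.Status, e)
        return fmt.Errorf("illegal transition: %s -[%s]", o.Status, e)
    }
    log.Printf("order=%s actor=%s %s -[%s]-> %s", o.ID, actor, o.Status, e, next)
    o.Status = next
    return nil
}

func main() {
    o := &Order{ID: "o-1", Status: "pending"}
    _ = Apply(o, "pay", "alice")
    if err := Apply(o, "cancel", "bob"); err != nil {
        fmt.Println("error:", err)
    }
}
```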

Side effect management. A state change might need to send notifications, refresh caches, update search indexes. Scattered across handlers, these side effects are "by the way" — sometimes done, sometimes not, depending on which handler. With a centralized entry point, side effects can be registered by transition path, decoupled from business logic:

// Order, notifyCustomer, and updateInventory are assumed to be defined elsewhere.
var sideEffects = map[struct{ From, To State }]func(*Order) error{
    {"paid", "shipped"}:      notifyCustomer,
    {"shipped", "delivered"}: updateInventory,
}
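
One possible way to wire such a map into the single entry point. Everything here beyond the map's shape is an assumption: the named edge type (a readability substitute for the anonymous struct), the Order struct, the Apply function, and a notifyCustomer stub that just records which orders it saw.

```go
package main

import "fmt"

type State string
type Event string

// Order is a hypothetical minimal business object.
type Order struct {
    ID     string
    Status State
}

// edge names the (from, to) pair used as the side-effect key.
type edge struct{ From, To State }

var transitions = map[State]map[Event]State{
    "paid":      {"ship": "shipped"},
    "shipped":   {"deliver": "delivered"},
    "delivered": {},
}

// notified records calls so the stub has an observable effect.
var notified []string

func notifyCustomer(o *Order) error {
    notified = append(notified, o.ID)
    return nil
}

var sideEffects = map[edge]func(*Order) error{
    {"paid", "shipped"}: notifyCustomer,
}

// Apply performs the transition, then runs the side effect registered
// for that exact path, if there is one.
func Apply(o *Order, e Event) error {
    from := o.Status
    next, ok := transitions[from][e]
    if !ok {
        return fmt.Errorf("illegal transition: %s -[%s]", from, e)
    }
    o.Status = next
    if effect, ok := sideEffects[edge{from, next}]; ok {
        return effect(o)
    }
    return nil
}

func main() {
    o := &Order{ID: "o-1", Status: "paid"}
    fmt.Println(Apply(o, "ship"), o.Status, notified)
}
```

This sketch leaves the status changed even when the side effect fails. Whether to roll back, retry, or queue the effect is a separate decision the table doesn't make for you.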

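Idempotency. The same event arrives twice: "paid" receives "pay" a second time. In the transition table, the "paid" state has no "pay" exit, so the second delivery returns an error in one place, with no per-handler deduplication needed. Strictly speaking, this is duplicate rejection rather than idempotency; if a repeat should count as a harmless success, the caller has to decide to treat that particular error as a no-op.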
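
If callers need to tell "you already did that" apart from "that was never allowed," the table has enough information to guess. This is a heuristic of mine, not something the table guarantees: ErrDuplicate is a hypothetical sentinel, and the rule "some state reaches the current state via this event" is an assumption that can misfire if two different paths share an event name.

```go
package main

import (
    "errors"
    "fmt"
)

type State string
type Event string

var transitions = map[State]map[Event]State{
    "pending":  {"pay": "paid", "cancel": "canceled"},
    "paid":     {"ship": "shipped"},
    "shipped":  {},
    "canceled": {},
}

// ErrDuplicate marks an event that appears to have been applied already.
var ErrDuplicate = errors.New("event already applied")

// Next looks up the transition. On a miss it checks whether any state
// reaches s via e; if so, the event is probably a repeat delivery.
func Next(s State, e Event) (State, error) {
    if next, ok := transitions[s][e]; ok {
        return next, nil
    }
    for _, byEvent := range transitions {
        if to, ok := byEvent[e]; ok && to == s {
            return s, ErrDuplicate
        }
    }
    return s, fmt.Errorf("illegal transition: %s -[%s]", s, e)
}

func main() {
    _, err := Next("paid", "pay")
    fmt.Println(errors.Is(err, ErrDuplicate)) // true: safe to treat as a no-op
    _, err = Next("paid", "cancel")
    fmt.Println(errors.Is(err, ErrDuplicate)) // false: a genuine rejection
}
```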

My personal feeling: these cross-cutting concerns are scattered because there's no unified "state change location." Once one exists, they have a place to be. Conversely, the reason you can't do these things well under if/else isn't necessarily carelessness — the structure doesn't support doing them well.

When to Hold Back

A side effect of having a good tool is wanting to use it everywhere. But an important part of design judgment is being able to say "not needed here."

When is an explicit state machine worthwhile? My test: do "illegal transitions" exist as a business concept between states? If all state-to-state transitions are legal (like a user's "bio" field — any value can become any value), a state machine is meaningless. If there's a clear business rule saying "A can't directly become B," that rule should live in code, not comments.

But this isn't a precise standard. A few specific scenarios.

Few and stable states — if/else suffices. A feature flag on/off. Two states, never a third. Adding a transition table is over-design — no need to fight a bool.

Legality depends on extensive dynamic data. Like "can user withdraw" depending on balance, KYC status, risk control results across a dozen conditions. The legality check itself is a complex decision process — cramming it into a transition table doesn't fit. Transition tables answer "which state changes are allowed by business rules," not "are all these conditions met." The latter belongs in a separate check outside Next.
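
A sketch of that split, with every name invented for illustration (Account, canWithdraw, Withdraw, and the tiny account table). The table answers the structural question; the data-dependent conditions live in their own function.

```go
package main

import (
    "errors"
    "fmt"
)

type State string
type Event string

// Account is a hypothetical object with just enough fields for the example.
type Account struct {
    Status       State
    BalanceCents int64
    KYCVerified  bool
}

// The table only says which lifecycle states permit a withdrawal at all.
var transitions = map[State]map[Event]State{
    "active": {"withdraw": "active", "freeze": "frozen"},
    "frozen": {"unfreeze": "active"},
}

// canWithdraw holds the data-dependent conditions. A real version might
// have a dozen of these; none of them belong in the table.
func canWithdraw(a Account, amountCents int64) error {
    switch {
    case !a.KYCVerified:
        return errors.New("kyc not verified")
    case amountCents <= 0:
        return errors.New("amount must be positive")
    case amountCents > a.BalanceCents:
        return errors.New("insufficient balance")
    }
    return nil
}

// Withdraw asks the table first, then the data check.
func Withdraw(a *Account, amountCents int64) error {
    next, ok := transitions[a.Status]["withdraw"]
    if !ok {
        return fmt.Errorf("withdraw not allowed in state %s", a.Status)
    }
    if err := canWithdraw(*a, amountCents); err != nil {
        return err
    }
    a.BalanceCents -= amountCents
    a.Status = next
    return nil
}

func main() {
    a := &Account{Status: "active", BalanceCents: 1000, KYCVerified: true}
    fmt.Println(Withdraw(a, 400), a.BalanceCents)
    a.Status = "frozen"
    fmt.Println(Withdraw(a, 100))
}
```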

The middle ground. Most real scenarios are here: not absolutely simple, not absolutely complex. I personally lean toward the explicit side. The reasoning is straightforward: business states tend to grow. Three states with if/else feels fine today; with five in six months and eight in a year, migration cost isn't linear. You're migrating not just code but implicit assumptions scattered everywhere. Some assumptions you didn't even know you had, until they become bugs during migration.

But this is my preference, not a rule.

The Missing Diagram

Back to the opening scenario — inheriting code.

That half-hour spent reverse-engineering a state diagram. What's missing isn't code — there's plenty of code, every handler has checks. What's missing is an acknowledged design decision.

If, when the system was still simple, someone had sat down for ten minutes and written "how can this object change" as a table, in the code — those subsequently added if-statements would have something to check against. Adding a new operation? Check the table first. Paths not in the table are discussion points: should this be rejected, or should the design be updated?
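
Having something to check against can be literal. Below is a sketch of a consistency check you might run in a unit test; validate is a hypothetical helper, and the partially_shipped entry is a deliberately half-finished change I made up to trigger it.

```go
package main

import (
    "fmt"
    "sort"
)

type State string
type Event string

// validate reports transitions whose target state has no row of its own,
// which usually means someone added a state and stopped halfway.
func validate(t map[State]map[Event]State) []string {
    var problems []string
    for from, byEvent := range t {
        for e, to := range byEvent {
            if _, ok := t[to]; !ok {
                problems = append(problems,
                    fmt.Sprintf("%s -[%s]-> %s: target state has no row", from, e, to))
            }
        }
    }
    sort.Strings(problems) // map iteration order is not stable
    return problems
}

func main() {
    // "partially_shipped" is reachable but never declared as a state.
    table := map[State]map[Event]State{
        "paid":    {"ship": "shipped", "ship_part": "partially_shipped"},
        "shipped": {},
    }
    for _, p := range validate(table) {
        fmt.Println(p)
    }
}
```

The check doesn't decide what partially_shipped should be allowed to do. It only forces the question to be asked in one place.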

Nobody can foresee seven or eight states six months out when writing the first if-statement. But someone can make "transition rules should have a home" a design habit, letting subsequent changes happen in a controlled environment.

Wrong code doesn't necessarily mean dirty data — but a hole in a core business state machine significantly raises the probability of data entering illegal states.

Every business object with scattered if/else statements is maintaining an implicit state machine. The table wasn't written out not because it's unimportant, but because it was too obvious, so obvious that everyone felt "no need to write it, just read the code."

Then the person who read the code left.

A simple state machine contributes more to system correctness than many "enterprise-grade architectures."
