entropik.
§ 02.08 · architecture · stack · orchestration

The Three-Layer Stack

Infrastructure provides context. Skills provide capability. Orchestration provides experience. Keeping these three separate is what lets any one of them evolve without dragging the other two with it.

Why three, not two

The platforms I built before I had the three-layer shape in mind tended to collapse into two. Either the orchestration layer absorbed the capability layer — skills living inside the controller that called them — or the infrastructure layer bled upward, with business logic sitting in the event store or the embedding pipeline. Both of those shapes work for a while. Both get brittle around the time you want to change one of them without changing the others.

Three layers is the minimum number I've found where the independent-evolution property actually holds. Infrastructure at the bottom, providing context and raw substrate. Skills in the middle, providing domain capabilities. Orchestration on top, providing the user's experience. Each layer sits on top of the layer below it without needing to know how that layer is implemented — and more importantly, each layer can be rewritten without forcing a rewrite of its neighbours.

I wish I'd landed on this earlier. The time I wasted untangling an orchestration-skill mashup on one of my platforms would have paid for the entire rewrite twice over.

Layer 1 — Infrastructure

The bottom layer is the part the user never sees and which determines the quality of everything above it. In my head it has six components now, though I reached that set by subtraction rather than design.

The event store is the immutable record of everything that has happened. The context engine assembles working context across three tiers — hot (Redis, <10ms), warm (vectors and summaries, <100ms), cold (full event replay, <500ms). The embedding pipeline handles semantic search across the corpus. The trace infrastructure logs every AI decision and every human response, structured. The optimisation loops — meta-agents — read those traces and propose edits to skills. The eval harnesses measure quality against held-out benchmarks before any proposed edit ships.
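The three-tier lookup can be sketched in a few lines. This is a hypothetical illustration, not the author's implementation: an in-memory dict stands in for Redis on the hot tier, and the warm and cold tiers are modelled as plain callables of increasing cost.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class ContextEngine:
    """Sketch of a tiered context lookup: hot, then warm, then cold."""
    hot: dict = field(default_factory=dict)  # <10ms tier (Redis in the text)
    warm_lookup: Optional[Callable[[str], Optional[str]]] = None  # <100ms: vectors, summaries
    cold_replay: Optional[Callable[[str], Optional[str]]] = None  # <500ms: full event replay

    def get(self, key: str) -> Optional[str]:
        # Try each tier in order of latency; promote hits into the hot tier.
        if key in self.hot:
            return self.hot[key]
        for lookup in (self.warm_lookup, self.cold_replay):
            if lookup is not None:
                value = lookup(key)
                if value is not None:
                    self.hot[key] = value  # promote so the next read is hot
                    return value
        return None
```

The promotion step is the part that matters: a cold replay is expensive exactly once per key, after which the hot tier answers.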

The thing I want to stress is that infrastructure is not glue. It's the most load-bearing layer in the stack, because every skill and every orchestration above it lives on whatever quality of context and traces the infrastructure provides. Skimp here and the upper layers feel flaky for reasons you can't quite trace.

Layer 2 — Skills

Skills are the domain capabilities — the things the platform can actually do. Each one is a markdown recipe that draws on infrastructure's context and produces an output for human review. Written as text, not code. Composable (skills can chain). Versionable (tracked in git, diffable, rollbackable). Auto-optimisable (a meta-agent can edit, test, and commit if quality improves).
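To make the "written as text, not code" point concrete, here is a minimal sketch of a skill stored as a markdown recipe and parsed into something an orchestrator could run. The recipe layout and field names are invented for illustration, not the author's schema.

```python
import re

# A hypothetical skill file: human-readable, git-diffable, editable by a
# domain expert or a meta-agent without touching any other layer.
SKILL_MD = """\
# summarise-thread
Draws on: context.thread_events
Produces: summary for human review

1. Collect the thread's events from context.
2. Draft a summary.
3. Flag open questions for the reviewer.
"""

def parse_skill(md: str) -> dict:
    """Extract the skill's name and numbered steps from its markdown recipe."""
    name = md.splitlines()[0].lstrip("# ").strip()
    steps = re.findall(r"^\d+\.\s+(.*)$", md, flags=re.MULTILINE)
    return {"name": name, "steps": steps}
```

Because the recipe is plain text, "auto-optimisable" reduces to: edit the file, run the evals, commit if quality improved.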

I wrote the rationale for why capabilities belong in markdown in the Skills over Controllers module. In the three-layer view, the thing I want to emphasise is that this layer is the one where domain experts can live — they don't need to touch infrastructure or orchestration, they need to refine recipes. On the platforms where I built a healthy skill layer, I eventually stopped being the bottleneck on new capabilities. That's the point of having a middle layer that reads as text.

Layer 3 — Orchestration

The top layer is the user's interface to the stack. Not "the UI" — more specific than that. Orchestration is the part that manages sessions, routes intent to the right skill or skills, runs proactive monitoring (heartbeats), observes user behaviour to learn preferences, and composes skills into compound workflows when the user's intent needs more than one.

kr8 on this site is an orchestration-layer thing. It doesn't make model calls directly; it decides which skill applies to the current moment, assembles what that skill needs, and takes the result back into the conversation. When people ask whether kr8 is a model or an agent, the answer is neither — it's orchestration. The model call happens one layer down, and kr8 doesn't much care which model it was.

The Gateway / Runtime boundary

Inside the stack there's one boundary that earns more discipline than it might first appear to deserve: the Gateway / Runtime separation. The Gateway is stateful orchestration — sessions, routing, permissions, context assembly. The Runtime is stateless model execution — it receives pre-assembled context, calls the model, returns structured output, and exits. State lives in the Gateway. LLM calls live in the Runtime. Context is the only thing that crosses between them.
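The boundary can be shown in miniature. This is a sketch under stated assumptions: `call_model` is a placeholder for whatever model client the Runtime wraps, and the session shape is invented.

```python
from dataclasses import dataclass, field

def call_model(context: str) -> str:
    # Placeholder for a real model call; swapping models touches only this function.
    return f"echo({context})"

def runtime(context: str) -> str:
    """Stateless side: receive pre-assembled context, call the model, return, exit."""
    return call_model(context)

@dataclass
class Gateway:
    """Stateful side: sessions, routing, context assembly."""
    sessions: dict = field(default_factory=dict)

    def handle(self, session_id: str, message: str) -> str:
        history = self.sessions.setdefault(session_id, [])
        history.append(message)
        # Context is assembled here, in the Gateway; the Runtime never sees raw state.
        context = " | ".join(history)
        reply = runtime(context)  # only pre-assembled context crosses the boundary
        history.append(reply)
        return reply
```

Because the Runtime holds no state, replacing `call_model` with a different provider is a local change, which is exactly the afternoon-sized model swap described below.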

This sounds like bookkeeping. The reason it's load-bearing is that it's what lets you swap the model without touching orchestration, and change how context is assembled without touching skills. I had a platform where the model call was tangled up with session state in the same controller. When I wanted to swap models — a reasonable thing to want, given how fast they were moving — it turned into a three-week refactor. The second platform I built with this separation in mind, I swapped models in an afternoon. The boundary paid for itself immediately.

Layer 0 — Connectors

There's a layer below infrastructure that I sometimes forget to draw because it's so deliberately thin. Connectors are the boundary interfaces — email, forms, chat, APIs, webhooks. Their job is to normalise external data into platform events. Transport, normalise, validate. That's it.
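An afternoon-sized connector really is this small. The event schema below ("source", "type", "payload", "received_at") is an invented example, not the platform's actual format; the shape to notice is that the function transports, normalises, and validates, and does nothing else.

```python
from datetime import datetime, timezone

def webhook_connector(raw: dict) -> dict:
    """Normalise an inbound webhook payload into a platform event. No business logic."""
    if "event" not in raw:  # validate
        raise ValueError("webhook payload missing 'event' field")
    return {  # normalise into the platform's event shape
        "source": "webhook",
        "type": raw["event"],
        "payload": {k: v for k, v in raw.items() if k != "event"},
        "received_at": datetime.now(timezone.utc).isoformat(),
    }
```

The moment a function like this starts branching on what the event means, the skill layer has leaked into transport.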

The design rule I've come to trust is that a connector should be writable in an afternoon. If it isn't, business logic has crept in, and I'm slowly rebuilding my skill layer inside the transport layer — which is exactly the shape I was trying to avoid. When a connector of mine has grown past a day's work, that's usually a signal I haven't quite admitted to myself yet about what actually belongs where.

Why the separation matters more than it sounds

None of the layer boundaries are obvious at the whiteboard. They all look like bureaucratic over-engineering when you're trying to ship the first version of something. The reason I draw them now, deliberately, is that the platforms where I didn't are the ones where each subsequent change got harder, and each model upgrade turned into a refactor, and each new capability bled into the plumbing below it.

The three layers give you the property that upgrading infrastructure doesn't touch skills, adding skills doesn't touch orchestration, and changing how the user experiences the system doesn't touch the model or the context. Whether you need that property depends on how long the platform is going to live. On a platform that runs for six months and dies, it's overhead. On anything I've wanted to keep growing for more than a year, it's the only reason the growth stayed tractable.
