lesson 03 / 07 · 12 min

extensions — wrapping the base loop

memory injection, retrieval, safety, telemetry. the pattern equivalent of middleware.

the middleware pattern, for agents

extensions are the middleware of the agent world. they wrap the base loop (the thing that takes a message, runs the model, maybe calls a skill, and returns a response) and layer in everything that surrounds the core job: memory injection, retrieval, retries, rate limits, caching, telemetry, safety filters, token budgeting. if you've written express middleware or rack middleware in a rails app, this is that, for agent turns.

the reason to call them a pattern, not just "some code you wrap things with": treating them as composable (ctx, next) ⇒ result functions is what lets you add and remove concerns without touching the loop, and lets you reason about each one independently.
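a minimal sketch of that shape in python, to make the signature concrete. the names here (`Ctx`, `Extension`, `compose`, `base_loop`, `shout`) are illustrative assumptions, not an api from this course or from any library, and the base loop is a stand-in for a real model call.

```python
from typing import Any, Callable, Dict, List

Ctx = Dict[str, Any]                    # per-turn context (hypothetical shape)
Next = Callable[[Ctx], str]             # "run the rest of the stack"
Extension = Callable[[Ctx, Next], str]  # the (ctx, next) -> result shape


def base_loop(ctx: Ctx) -> str:
    """stand-in for the real turn: take a message, return a response."""
    return f"echo: {ctx['message']}"


def _bind(ext: Extension, nxt: Next) -> Next:
    """turn one extension plus 'everything inside it' into a plain handler."""
    return lambda ctx: ext(ctx, nxt)


def compose(extensions: List[Extension], base: Next) -> Next:
    """wrap `base` so that extensions[0] ends up outermost."""
    handler = base
    for ext in reversed(extensions):
        handler = _bind(ext, handler)
    return handler


def shout(ctx: Ctx, nxt: Next) -> str:
    """a toy extension: post-process whatever the inner layers return."""
    return nxt(ctx).upper()


plain = compose([], base_loop)      # no extensions: just the loop
loud = compose([shout], base_loop)  # one concern added, loop untouched

print(plain({"message": "hi"}))  # echo: hi
print(loud({"message": "hi"}))   # ECHO: HI
```

adding or removing `shout` changes only the list passed to `compose`; `base_loop` never learns that it was wrapped.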

composition — retrieval, telemetry, safety

[interactive demo: wrap a base loop with three extensions]
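as a rough stand-in for that composition, here is one way the three concerns could look in python. everything in it is assumed for illustration: the toy `DOCS` knowledge base, the `events` list used as a telemetry sink, the "internal note:" redaction rule, and the stack order (telemetry outermost, safety just inside retrieval).

```python
from typing import Any, Callable, Dict, List

Ctx = Dict[str, Any]
Next = Callable[[Ctx], str]
Extension = Callable[[Ctx, Next], str]

# toy knowledge base and telemetry sink (both hypothetical)
DOCS = {"refund": "refunds take 5 days. internal note: approve via admin-panel"}
events: List[str] = []


def base_loop(ctx: Ctx) -> str:
    """stand-in for the model call: answers from whatever context it was given."""
    context = ctx.get("context", [])
    return context[0] if context else "i don't know"


def retrieval(ctx: Ctx, nxt: Next) -> str:
    """before: attach documents whose key appears in the message."""
    hits = [text for key, text in DOCS.items() if key in ctx["message"]]
    return nxt({**ctx, "context": hits})


def telemetry(ctx: Ctx, nxt: Next) -> str:
    """around: record that a turn started and how it ended."""
    events.append("turn.start")
    try:
        result = nxt(ctx)
    except Exception:
        events.append("turn.error")
        raise
    events.append("turn.ok")
    return result


def safety(ctx: Ctx, nxt: Next) -> str:
    """after: drop anything following an internal-note marker."""
    response = nxt(ctx)
    marker = "internal note:"
    return response.split(marker)[0].strip() if marker in response else response


def _bind(ext: Extension, nxt: Next) -> Next:
    return lambda ctx: ext(ctx, nxt)


def compose(extensions: List[Extension], base: Next) -> Next:
    """wrap `base` so that extensions[0] ends up outermost."""
    handler = base
    for ext in reversed(extensions):
        handler = _bind(ext, handler)
    return handler


agent = compose([telemetry, retrieval, safety], base_loop)
print(agent({"message": "how does a refund work?"}))  # refunds take 5 days.
print(events)                                         # ['turn.start', 'turn.ok']
```

each extension touches only its own concern: retrieval edits the context on the way in, safety edits the response on the way out, and telemetry observes both directions without changing either.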

three small rules keep extensions clean and debuggable:

order matters — the onion

[interactive demo: observe the onion] before → before → before → base → after → after → after

outer extensions see everything — including what inner extensions did. telemetry usually goes outermost so it captures retries; safety often goes just inside retrieval so it can inspect what the model produced with context. there's no universal stack, just the one you document for your team.
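the ordering claim can be checked with a small python sketch. it is built on assumptions: `named`, `make_flaky`, `make_telemetry`, and the three-attempt `retry` are hypothetical helpers, and the "model" is a function that fails once and then succeeds. the first part prints the onion; the second shows what telemetry records when it sits outside versus inside the retry layer.

```python
from typing import Any, Callable, Dict, List

Ctx = Dict[str, Any]
Next = Callable[[Ctx], str]
Extension = Callable[[Ctx, Next], str]


def _bind(ext: Extension, nxt: Next) -> Next:
    return lambda ctx: ext(ctx, nxt)


def compose(extensions: List[Extension], base: Next) -> Next:
    """wrap `base` so that extensions[0] ends up outermost."""
    handler = base
    for ext in reversed(extensions):
        handler = _bind(ext, handler)
    return handler


# part 1: make the onion visible
trace: List[str] = []


def named(name: str) -> Extension:
    """an extension that only logs when it runs."""
    def ext(ctx: Ctx, nxt: Next) -> str:
        trace.append(f"{name}:before")
        result = nxt(ctx)
        trace.append(f"{name}:after")
        return result
    return ext


def traced_base(ctx: Ctx) -> str:
    trace.append("base")
    return "ok"


compose([named("telemetry"), named("retrieval"), named("safety")], traced_base)({})
print(" -> ".join(trace))
# telemetry:before -> retrieval:before -> safety:before -> base
#   -> safety:after -> retrieval:after -> telemetry:after (printed on one line)


# part 2: telemetry outside vs inside a retry layer
def make_flaky() -> Next:
    """a base loop that fails on its first call, then succeeds."""
    calls = {"n": 0}
    def base(ctx: Ctx) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            raise RuntimeError("transient failure")
        return "ok"
    return base


def retry(ctx: Ctx, nxt: Next) -> str:
    """try the inner layers up to three times; ctx is mutated on purpose
    so that outer layers can read the attempt count afterwards."""
    last_error: Exception = RuntimeError("no attempts made")
    for attempt in range(1, 4):
        ctx["attempts"] = attempt
        try:
            return nxt(ctx)
        except RuntimeError as err:
            last_error = err
    raise last_error


def make_telemetry(log: List[str]) -> Extension:
    def telemetry(ctx: Ctx, nxt: Next) -> str:
        try:
            result = nxt(ctx)
        except RuntimeError:
            log.append("error")
            raise
        log.append(f"ok after {ctx.get('attempts', 1)} attempt(s)")
        return result
    return telemetry


outer_log: List[str] = []
inner_log: List[str] = []
compose([make_telemetry(outer_log), retry], make_flaky())({})
compose([retry, make_telemetry(inner_log)], make_flaky())({})
print(outer_log)  # ['ok after 2 attempt(s)']
print(inner_log)  # ['error', 'ok after 2 attempt(s)']
```

placed outermost, telemetry reports one turn that succeeded on the second attempt; placed inside retry, it reports each attempt as its own event. neither is wrong, which is why the stack order is something to document rather than assume.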

why this is worth the layer. the alternative is mixing retrieval logic into the agent's own prompt construction, then mixing telemetry into the tool-call code, then mixing safety into the response handler. six months later nobody can reason about the turn because it has twelve responsibilities in one function. extensions are the pattern that keeps the core boring. the agents i've had to be on call for are the ones where the turn function grew responsibilities; the ones i still trust are the ones where every cross-cutting concern lives in its own layer.

when it breaks

next: personas — the bundle of prompt, voice, allowed skills, and guardrails that makes one loop behave like many different agents.
