
spec-kit, in practice

the spec is the artifact. the code is the side-effect.

spec-kit is github’s open-source toolkit for spec-driven development with ai coding agents. you write a spec in markdown — then a plan, then a task list — before the agent writes any code.

spent a few evenings last week using it on a side project. the artifact at every step is markdown, in your repo. the agent reads each file like input and updates it like state.

it’s the closest i’ve come in 2026 to feeling like specs are back.

the move

four steps, four files — after a one-time /speckit.constitution that sets project-wide principles.

each step takes the previous file as input and produces the next. when something goes wrong at step four — wrong shape of code, missing edge case — you don’t argue with the agent. you go back to the file where the misunderstanding lives, fix it, and re-run from there.
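the "fix the file, re-run from there" loop can be sketched in a few lines. this is not spec-kit's code or api: the stage names and the `regenerate` callback are assumptions made up for illustration, with `regenerate` standing in for whatever agent call turns one file into the next.

```python
from typing import Callable

# hypothetical stage names, in pipeline order (not taken from spec-kit itself)
STAGES = ["spec", "plan", "tasks", "implement"]


def stages_to_rerun(fixed_stage: str) -> list[str]:
    """Return every stage downstream of the file that was just fixed by hand."""
    idx = STAGES.index(fixed_stage)  # raises ValueError for an unknown stage
    return STAGES[idx + 1:]


def run_pipeline(
    artifacts: dict[str, str],
    fixed_stage: str,
    regenerate: Callable[[str, str], str],
) -> dict[str, str]:
    """Keep everything up to and including the fixed stage; regenerate the rest in order."""
    keep = STAGES[: STAGES.index(fixed_stage) + 1]
    result = {stage: artifacts[stage] for stage in keep}
    previous = fixed_stage
    for stage in stages_to_rerun(fixed_stage):
        # each stage reads only the file directly upstream of it
        result[stage] = regenerate(stage, result[previous])
        previous = stage
    return result
```

the point of the shape: a hand-edited file is never overwritten, and nothing upstream of it is touched. only the stale downstream files get rebuilt, each from the one before it.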

the artifact, not the chat

i’ve been writing prompts at agents for two years. the chat transcript was where the intent lived. it was unreviewable, ungreppable, and gone when the session ended.

spec-kit’s quiet move is to make the spec the source of truth. it’s a file in the repo. it gets reviewed in a pr. when the agent drifts off the spec, the spec wins. when a teammate joins, they read the spec, not the chat history.

that’s a different relationship to ai-assisted coding than “describe the change and ship the diff.” it’s much closer to how good teams already work: a one-pager goes around, the engineers push back, the doc tightens, then code happens. spec-kit just made the one-pager cheap to draft, because the agent helps you write it.

where it earns its keep

where it doesn’t

the underlying shift

teams have always wanted specs. writing them by hand was expensive enough relative to the perceived value that, outside regulated environments, the practice atrophied. what changed in 2026 is that drafting a spec is five minutes of conversation with an agent instead of an afternoon at a keyboard. the quality of the spec is still on you. but the floor of effort dropped enough that “write the spec first” is finally cheaper than “skip the spec and debug later.”

eval-driven prompt iteration made the same shift at a smaller grain: writing the failing case first used to be friction, then it became the cheapest part of the loop. spec-kit is the same idea, one level coarser. the artifact you wished you had is finally the artifact you can afford.

the spec was always the right move. the agent just made it cheap.

scalable labs·cvr 30091604·github·linkedin·hello@scalable.dk