what a skill is
a skill is the smallest reusable unit in the pattern vocabulary. it's a tool call plus everything around it that the tool call needs to be safe and composable: a name the llm sees, a description that tells it when to reach for the skill, an input check that fires before anything runs, an output check that fires before anything is trusted, and the handler itself. if you've written tool-use code with any sdk, you've written most of a skill. the pattern is about treating it as a first-class object.
skills are defined once, used by many agents. a persona picks a subset it's allowed to use; the skill itself has no idea which persona called it. that's the whole point of the separation.
the shape, runnable
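a minimal sketch of that shape, not a definitive implementation. every name here (Skill, HostContext, bookSlot, invoke, the calendar client, the 5 to 240 minute limit) is an assumption made for illustration; swap in whatever your sdk and validation library provide.

```typescript
// hypothetical names throughout; nothing here comes from a specific sdk.
class ValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ValidationError";
  }
}

// the side-channel: supplied by the host, never shown to the llm
interface HostContext {
  userId: string;
  timezone: string;
  calendar: {
    createEvent(e: { userId: string; startIso: string; durationMin: number }): Promise<{ id: string }>;
  };
}

interface Skill<In, Out> {
  name: string; // what the llm sees
  description: string; // when to reach for it, and when not to
  validateInput(raw: unknown): In; // fires before anything runs
  validateOutput(raw: unknown): Out; // fires before anything is trusted
  run(args: In, ctx: HostContext): Promise<unknown>; // the handler
}

interface BookSlotArgs { startIso: string; durationMin: number }
interface BookSlotResult { event: { id: string; startIso: string; durationMin: number } }

const bookSlot: Skill<BookSlotArgs, BookSlotResult> = {
  name: "book_slot",
  description:
    "book one calendar event for the current user. use only when the user has given " +
    "a start time and a duration. do not use it to look up, move, or cancel events.",
  validateInput(raw) {
    const a = raw as Partial<BookSlotArgs> | null;
    if (!a || typeof a.startIso !== "string" || Number.isNaN(Date.parse(a.startIso))) {
      throw new ValidationError("startIso must be a parseable timestamp");
    }
    if (typeof a.durationMin !== "number" || !Number.isInteger(a.durationMin) ||
        a.durationMin < 5 || a.durationMin > 240) {
      throw new ValidationError("durationMin must be an integer between 5 and 240");
    }
    return { startIso: a.startIso, durationMin: a.durationMin };
  },
  validateOutput(raw) {
    const event = (raw as { event?: Partial<BookSlotResult["event"]> } | null)?.event;
    if (!event || typeof event.id !== "string" || typeof event.startIso !== "string" ||
        typeof event.durationMin !== "number") {
      throw new ValidationError("book_slot returned an unexpected shape");
    }
    return { event: { id: event.id, startIso: event.startIso, durationMin: event.durationMin } };
  },
  async run(args, ctx) {
    // the llm supplied args; the host supplied ctx
    const created = await ctx.calendar.createEvent({ userId: ctx.userId, ...args });
    return { event: { id: created.id, ...args } };
  },
};

// host-side call path: validate, run with context, validate again.
// failures throw, so callers only ever see the one happy-path shape.
async function invoke<In, Out>(skill: Skill<In, Out>, rawArgs: unknown, ctx: HostContext): Promise<Out> {
  const args = skill.validateInput(rawArgs);
  const raw = await skill.run(args, ctx);
  return skill.validateOutput(raw);
}
```

the validators are hand-rolled only to keep the sketch dependency-free; a schema library would do the same job with less code.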
four things to notice, because each one breaks if you forget it:
- the description is a prompt. the llm decides whether to call the skill based on one or two sentences you wrote. treat the description like product copy — it will be read hundreds of times a day, by the model, in a hurry. be specific about when to use it and when not to.
- validate both sides. input validation catches a hallucinated argument before it hits your database. output validation catches the downstream api returning something unexpected. skipping either turns the skill into a foot-gun that fires once a week.
- the context object is the side-channel. auth, db clients, the current user's timezone, feature flags: all the stuff the llm shouldn't know about goes to run's second argument. the llm supplies the semantic arguments; the host supplies the environment.
- return a stable shape. if the skill sometimes returns { event } and sometimes { error }, the llm has to learn two return paths. prefer a single happy-path shape and throw on failure; the loop is already a good error-handling harness.
validation fires before the api call
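a small sketch of the run being described, under stated assumptions: the 240-minute cap, the apiCalls counter standing in for the calendar api, and the attempt helper are all hypothetical, and the handler is synchronous only to keep the example short.

```typescript
// hypothetical stand-ins; a real skill would call an actual calendar client.
class ValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ValidationError";
  }
}

interface BookSlotArgs { startIso: string; durationMin: number }

let apiCalls = 0; // counts how often the "calendar api" is reached

function validateInput(raw: unknown): BookSlotArgs {
  const a = raw as Partial<BookSlotArgs> | null;
  if (!a || typeof a.startIso !== "string") {
    throw new ValidationError("startIso must be a string");
  }
  if (typeof a.durationMin !== "number" || a.durationMin < 5 || a.durationMin > 240) {
    throw new ValidationError("durationMin must be between 5 and 240");
  }
  return { startIso: a.startIso, durationMin: a.durationMin };
}

function run(args: BookSlotArgs) {
  apiCalls += 1; // only reached when validation passed
  return { event: { id: `evt_${apiCalls}`, ...args } };
}

function attempt(durationMin: number): string {
  try {
    const args = validateInput({ startIso: "2030-01-01T10:00:00Z", durationMin });
    return `booked ${run(args).event.id}`;
  } catch (err) {
    if (err instanceof Error && err.name === "ValidationError") {
      return `rejected: ${err.message}`;
    }
    throw err;
  }
}

const nineHours = attempt(540); // the hallucinated 9-hour booking
const callsAfterReject = apiCalls;
const oneHour = attempt(60);

console.log(nineHours, callsAfterReject); // rejected: durationMin must be between 5 and 240 0
console.log(oneHour, apiCalls); // booked evt_1 1
```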
the skill threw ValidationError before run executed. in a real system, that means the hallucinated 9-hour booking never hit your calendar api. change durationMin to 60 and the validation passes; the skill runs end-to-end.
reuse across personas
this is the payoff. book_slot lives in both the scheduler's toolbox and the assistant's toolbox, but each persona wraps it with different guardrails. the assistant needs to confirm before booking; the scheduler doesn't. neither change touches the skill itself.
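one way to picture this, as a sketch rather than a prescribed api: Persona, withConfirmation, makeAssistant, and ConfirmationRequired are hypothetical names, and the confirmation check is reduced to a callback the host would supply.

```typescript
// hypothetical names; the point is that guardrails wrap the skill from outside.
interface BookSlotArgs { startIso: string; durationMin: number }
interface BookSlotResult { event: { id: string; startIso: string; durationMin: number } }
type Handler = (args: BookSlotArgs) => BookSlotResult;

class ConfirmationRequired extends Error {
  constructor() {
    super("booking needs user confirmation first");
    this.name = "ConfirmationRequired";
  }
}

// the shared skill. it has no idea which persona is calling it.
let eventCount = 0;
const bookSlot = {
  name: "book_slot",
  run: (args: BookSlotArgs): BookSlotResult => ({
    event: { id: `evt_${++eventCount}`, ...args },
  }),
};

// a guardrail returns a new handler; the original is left untouched.
function withConfirmation(handler: Handler, isConfirmed: (args: BookSlotArgs) => boolean): Handler {
  return (args) => {
    if (!isConfirmed(args)) throw new ConfirmationRequired();
    return handler(args);
  };
}

interface Persona { name: string; toolbox: { book_slot: Handler } }

// the scheduler books directly
const scheduler: Persona = { name: "scheduler", toolbox: { book_slot: bookSlot.run } };

// the assistant gets the same skill behind a confirmation check
function makeAssistant(isConfirmed: (args: BookSlotArgs) => boolean): Persona {
  return { name: "assistant", toolbox: { book_slot: withConfirmation(bookSlot.run, isConfirmed) } };
}
```

adding or removing a guardrail changes a persona's toolbox entry and nothing else, which is what keeps the skill reusable.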
when it breaks
- skills that depend on other skills. once a skill calls another skill internally, you have an implicit graph the llm can't reason about. push that composition up into the agent loop or into an extension. skills should be leaves.
- overly clever descriptions. every skill description that ends in "use this for most things" will be picked for most things. write narrow descriptions; let the model pick the right one from a clear menu.
- silent schema drift. when a downstream api changes its response and the skill still "works" because validation was optional, you've added a landmine. fail loud. make the eval suite catch it.
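for the schema-drift case, a hedged sketch of what failing loud can look like. parseEvent, OutputValidationError, and the field names are assumptions for illustration; the renamed duration_minutes field is an invented example of drift, not a real api change.

```typescript
// hypothetical output check: strict, non-optional, and specific about what drifted.
class OutputValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "OutputValidationError";
  }
}

interface CalendarEvent { id: string; startIso: string; durationMin: number }

function parseEvent(raw: unknown): CalendarEvent {
  if (typeof raw !== "object" || raw === null) {
    throw new OutputValidationError("calendar api response is not an object");
  }
  const r = raw as Record<string, unknown>;
  const id = r["id"];
  const startIso = r["startIso"];
  const durationMin = r["durationMin"];

  // collect every mismatch so the error names all drifted fields at once
  const problems: string[] = [];
  if (typeof id !== "string") problems.push("id: expected string");
  if (typeof startIso !== "string") problems.push("startIso: expected string");
  if (typeof durationMin !== "number") problems.push("durationMin: expected number");

  if (typeof id !== "string" || typeof startIso !== "string" || typeof durationMin !== "number") {
    throw new OutputValidationError(`calendar api response drifted: ${problems.join("; ")}`);
  }
  return { id, startIso, durationMin };
}

// an eval case can pin this down: a response with a renamed field must throw.
function driftIsCaught(): boolean {
  try {
    parseEvent({ id: "evt_1", startIso: "2030-01-01T10:00:00Z", duration_minutes: 60 });
    return false;
  } catch (err) {
    return err instanceof Error && err.name === "OutputValidationError";
  }
}
```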
next up: extensions, the middleware pattern that wraps the loop around a skill — memory injection, retries, telemetry, safety.