tl;dr. the same query() call ships
three different ways: a one-shot cli, a server endpoint streaming
sse, an ide extension routing permissions through editor ux. the
agent loop doesn't change. what changes is the surface around it —
settings inheritance, permission ux, error handling, and what you
do with the events.
the only thing that's different per surface
if you've gotten this far, you've seen every primitive the sdk exposes. shipping isn't another primitive — it's a question of how those primitives compose with the surface around them. three surfaces, three sets of correct answers.
cli — the developer surface
a script. argv parsing, an env file, no auth, a tty for stdout. the surface where you'd run a refactor, a one-shot review, a release note generator. the agent runs as the developer, with the developer's settings.
#!/usr/bin/env node
// scripts/agent.ts: a one-shot cli wrapper.
// usage: ./agent.ts "do the thing"
// note: the node shebang assumes a runtime that can execute typescript directly;
// otherwise compile first or run the file through a typescript runner.
import { query } from "@anthropic-ai/claude-agent-sdk";

const prompt = process.argv.slice(2).join(" ");
if (!prompt) { console.error("usage: agent.ts <prompt>"); process.exit(1); }

const run = query({
  prompt,
  options: {
    permissionMode: "acceptEdits",
    settingSources: ["project"],
    allowedTools: ["Read", "Edit", "Glob", "Grep", "Bash"],
  },
});

for await (const ev of run) {
  // stream assistant text straight to the terminal.
  if (ev.type === "assistant") {
    for (const b of ev.message.content) {
      if (b.type === "text") process.stdout.write(b.text);
    }
  }
  // a non-success result ends the script with a failing exit code.
  if (ev.type === "result" && ev.subtype !== "success") {
    process.stderr.write(`agent ended: ${ev.subtype}\n`);
    process.exit(2);
  }
}
- settingSources: ["project"]. project conventions matter; user-level claude code preferences usually don't.
- acceptEdits is fine here. the developer is at the terminal. ask-style permissions create unwanted prompts; trust the rules in the project's .claude/settings.json.
- exit codes matter. ci reads the agent's outcome from $?. set the exit code from the result event's subtype.
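the exit-code bullet can be made explicit with a small lookup. a minimal sketch, not the sdk's own behavior: the helper name exitCodeFor and the numeric codes are hypothetical conventions, and the two error subtypes listed are assumptions to check against the subtypes your sdk version actually emits.

```typescript
// hypothetical helper: map a result event's subtype to a process exit code.
// the numbers are an illustrative convention; the sdk does not define them.
const EXIT_CODES = new Map<string, number>([
  ["success", 0],
  ["error_max_turns", 2],        // assumed subtype: the run hit its turn limit
  ["error_during_execution", 3], // assumed subtype: the run failed mid-flight
]);

function exitCodeFor(subtype: string): number {
  // unknown subtypes still fail the pipeline instead of passing silently.
  return EXIT_CODES.get(subtype) ?? 1;
}
```

in the cli loop this would replace the hard-coded process.exit(2) with process.exit(exitCodeFor(ev.subtype)), so a ci job can tell a turn-limit stop from a crash.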
server — the production surface
an http endpoint, probably streaming sse. running on a vps, in a container, behind a load balancer. zero developer context — the agent has to be self-contained. this is where most permission bugs ship to production.
// server-side: an http endpoint that streams sse to the browser.
// the http surface is yours; the agent is the same query() call.
import { Hono } from "hono";
import { streamSSE } from "hono/streaming";
import { query } from "@anthropic-ai/claude-agent-sdk";

const app = new Hono();

app.post("/agent", async (c) => {
  const { prompt } = await c.req.json();
  // reject empty or non-string prompts before starting a run.
  if (typeof prompt !== "string" || !prompt.trim()) {
    return c.json({ error: "prompt must be a non-empty string" }, 400);
  }
  return streamSSE(c, async (stream) => {
    const run = query({
      prompt,
      options: {
        permissionMode: "default",
        // SERVER-SIDE: never inherit user/local settings. project only.
        settingSources: ["project"],
        // canUseTool is your bouncer for anything destructive.
        canUseTool: async (name, input) =>
          name === "Bash"
            ? { behavior: "deny", message: "no shell from the http agent" }
            : { behavior: "allow", updatedInput: input },
      },
    });
    // forward every event as one sse message, named by its event type.
    for await (const ev of run) {
      await stream.writeSSE({ data: JSON.stringify(ev), event: ev.type });
    }
  });
});
export default { fetch: app.fetch, port: 8787 };
- never inherit user / local settings. a server has no developer, no ~/.claude/. the only valid settingSources is ["project"] or omitted.
- canUseTool is mandatory. on a server, "ask the user" is meaningless, and with no callback you get the default permission behavior. write a deliberate one, even if it just denies Bash.
- timeout the run. put a deadline on it: use AbortSignal.timeout() to trigger cancellation, or call run.interrupt() from a setTimeout. an agent that loops in production is an unbounded api bill.
- isolate the working dir. set cwd to a temp folder per request. don't let one user's run see another user's files.
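the timeout and working-dir bullets can be sketched as two small wrappers. this is a sketch under assumptions, not the sdk's api: withDeadline and withRequestDir are hypothetical helper names, the "agent-" directory prefix is arbitrary, and what you do in onTimeout (for example calling run.interrupt()) is left to the caller.

```typescript
import { mkdtemp, rm } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// hypothetical helper: yield events from `source` until `ms` elapses, then call
// `onTimeout` (where the caller would cancel the run) and stop iterating.
async function* withDeadline<T>(
  source: AsyncIterable<T>,
  ms: number,
  onTimeout: () => void,
): AsyncGenerator<T> {
  const it = source[Symbol.asyncIterator]();
  let timer: ReturnType<typeof setTimeout> | undefined;
  // one timer for the whole run, not one per event.
  const expired = new Promise<"expired">((resolve) => {
    timer = setTimeout(() => resolve("expired"), ms);
  });
  try {
    while (true) {
      const next = await Promise.race([it.next(), expired]);
      if (next === "expired") {
        onTimeout();
        return;
      }
      if (next.done) return;
      yield next.value;
    }
  } finally {
    clearTimeout(timer);
  }
}

// hypothetical helper: run `fn` inside a fresh temp directory, then remove it,
// so one request's files are never visible to the next.
async function withRequestDir<R>(fn: (cwd: string) => Promise<R>): Promise<R> {
  const dir = await mkdtemp(join(tmpdir(), "agent-"));
  try {
    return await fn(dir);
  } finally {
    await rm(dir, { recursive: true, force: true });
  }
}
```

inside the handler, the idea would be to create the run within withRequestDir, pass the directory as cwd, and iterate withDeadline(run, 60_000, () => run.interrupt()) instead of the run itself. the 60 second budget is only an example value.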
ide — the editor surface
a vs code extension, a jetbrains plugin, a custom editor. the agent has rich ux: showInputBox, showWarningMessage, output channels, diff views. permission ux belongs in the editor, not in stdout.
// vs code extension: drive the same sdk, surface events in an output channel.
// the agent doesn't know it's in an editor; the editor knows it has an agent.
import * as vscode from "vscode";
import { query } from "@anthropic-ai/claude-agent-sdk";

export function activate(ctx: vscode.ExtensionContext) {
  // one output channel for the extension's lifetime, disposed with it.
  const out = vscode.window.createOutputChannel("scalable agent");
  ctx.subscriptions.push(
    out,
    vscode.commands.registerCommand("scalable.agent.run", async () => {
      const prompt = await vscode.window.showInputBox({ prompt: "agent prompt" });
      if (!prompt) return;
      out.show();
      const run = query({
        prompt,
        options: {
          cwd: vscode.workspace.workspaceFolders?.[0]?.uri.fsPath,
          permissionMode: "acceptEdits",
          settingSources: ["project", "user"],
          // route ask-permission through the editor, not stdin.
          canUseTool: async (name, input) => {
            const ok = await vscode.window.showWarningMessage(
              `Run ${name}? ${JSON.stringify(input).slice(0, 80)}`,
              "allow", "deny",
            );
            // a dismissed dialog resolves to undefined and is treated as deny.
            return ok === "allow"
              ? { behavior: "allow", updatedInput: input }
              : { behavior: "deny", message: "user denied" };
          },
        },
      });
      for await (const ev of run) {
        if (ev.type === "assistant") {
          for (const b of ev.message.content) {
            if (b.type === "text") out.append(b.text);
          }
        }
      }
    }),
  );
}
- canUseTool is your permission ux. the user is one click away. surface tool calls in the editor's native dialogs, not in a terminal.
- cwd is the workspace root. set it to vscode.workspace.workspaceFolders?.[0]?.uri.fsPath; anchoring the agent to the open workspace prevents accidental writes elsewhere.
- project + user settings ok here. unlike a server, an ide extension runs as the developer. inheriting user-level claude code preferences is part of why they installed the extension.
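the permission-ux bullet comes down to two pure steps: turn a tool call into a short dialog message, and turn the dialog's answer into a decision. a minimal sketch, assuming the allow/deny shapes used in the extension code above; describeToolCall, toPermissionResult, and the PermissionDecision type are hypothetical local names, written without the vscode import so they can be tested outside an editor.

```typescript
// local stand-in mirroring the allow/deny shapes returned from canUseTool above.
type PermissionDecision =
  | { behavior: "allow"; updatedInput: Record<string, unknown> }
  | { behavior: "deny"; message: string };

// hypothetical helper: one-line summary of a tool call for a permission dialog,
// truncated so a large input cannot blow up the prompt.
function describeToolCall(toolName: string, input: Record<string, unknown>, max = 80): string {
  const json = JSON.stringify(input);
  const shown = json.length > max ? json.slice(0, max - 1) + "…" : json;
  return `Run ${toolName}? ${shown}`;
}

// hypothetical helper: anything other than an explicit "allow" is a deny,
// including undefined, which is what a dismissed dialog resolves to.
function toPermissionResult(
  choice: string | undefined,
  input: Record<string, unknown>,
): PermissionDecision {
  return choice === "allow"
    ? { behavior: "allow", updatedInput: input }
    : { behavior: "deny", message: "user denied" };
}
```

keeping these two steps separate from the dialog call means the deny-by-default rule can be unit tested without launching the editor.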
what doesn't change
everything you've learned about tool selection, permissions, hooks, sub-agents, system prompts, and skills applies in all three surfaces. the agent loop doesn't care whether the next event is going to a tty, an sse stream, or an output channel. that's the point of the sdk: one loop, many surfaces.
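one way to picture "one loop, many surfaces" is a loop that only talks to a tiny sink interface. this is a sketch, not something the sdk provides: Sink, drain, and the loose LoopEvent type are hypothetical names covering just the two event shapes the examples above read.

```typescript
// loose stand-ins for the assistant and result events used in the examples above.
type Block = { type: string; text?: string };
type LoopEvent = {
  type: string;
  subtype?: string;
  message?: { content: Block[] };
};

// hypothetical interface: the only part that differs per surface.
interface Sink {
  text(chunk: string): void;
  done(subtype: string): void;
}

// the loop itself is identical for a tty, an sse stream, or an output channel.
async function drain(events: AsyncIterable<LoopEvent>, sink: Sink): Promise<void> {
  for await (const ev of events) {
    if (ev.type === "assistant") {
      for (const b of ev.message?.content ?? []) {
        if (b.type === "text" && b.text !== undefined) sink.text(b.text);
      }
    }
    if (ev.type === "result") sink.done(ev.subtype ?? "unknown");
  }
}
```

a cli sink would write chunks to stdout and set the exit code in done; an editor sink would append to an output channel. the loop body never changes.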
operational checklist
- api key handling. always env-driven (ANTHROPIC_API_KEY). for ide extensions, store via the editor's secret storage; never check in.
- cost guardrails. read result.total_cost_usd and result.usage on every run. log the totals; cap with rate-limiting at the surface, not in the loop.
- observability. the event stream is the trace. ship it to your logs as ndjson, with a session id field, and you've got per-turn replay for free.
- model pinning. production agents pin a specific model id (e.g. claude-sonnet-4-6). never use the family alias for prod; ship the explicit version, bump on a calendar.
- replay your own runs. save the prompt + the events to disk. when an agent does something weird, replaying with the same model and same input is the fastest path to a fix.
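the observability and cost bullets fit in two small functions. a minimal sketch under assumptions: toNdjsonLine and costOf are hypothetical helpers, the log record's field names (session_id, ts, event) are one possible layout, and the only sdk field relied on is total_cost_usd on the result event, as named in the checklist.

```typescript
// loose stand-in for an sdk event: a type tag plus whatever else it carries.
type AgentEvent = { type: string; [key: string]: unknown };

// hypothetical helper: one ndjson line per event, tagged with a session id
// and a timestamp so a run can be replayed in order.
function toNdjsonLine(sessionId: string, ev: AgentEvent, at: Date = new Date()): string {
  return JSON.stringify({ session_id: sessionId, ts: at.toISOString(), event: ev }) + "\n";
}

// hypothetical helper: the run's cost from a result event, undefined otherwise.
function costOf(ev: AgentEvent): number | undefined {
  const cost = ev["total_cost_usd"];
  return ev.type === "result" && typeof cost === "number" ? cost : undefined;
}
```

appending each line to a per-session file gives you the replay log; summing costOf over result events per user or per hour is the number a surface-level rate limit would act on.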