lesson 08 / 08 · 14 min

shipping — cli, server, ide

packaging an agent as a cli, embedding it in a backend, or driving it from an ide. the diff to production.

tl;dr. the same query() call ships three different ways: a one-shot cli, a server endpoint streaming sse, an ide extension routing permissions through editor ux. the agent loop doesn't change. what changes is the surface around it — settings inheritance, permission ux, error handling, and what you do with the events.

the only thing that's different per surface

if you've gotten this far, you've seen every primitive the sdk exposes. shipping isn't another primitive — it's a question of how those primitives compose with the surface around them. three surfaces, three sets of correct answers.
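
one way to keep those per-surface answers in a single place is a small lookup table. this is a sketch, not an sdk feature: `profileFor`, `SurfaceProfile`, and the `permissionUx` labels are hypothetical names, and the values simply mirror the three examples in this lesson.

```typescript
// hypothetical helper, not part of the sdk. the types are local stand-ins.
type Surface = "cli" | "server" | "ide";
type SettingSource = "project" | "user" | "local";

interface SurfaceProfile {
  permissionMode: "default" | "acceptEdits";
  settingSources: SettingSource[];
  // how permission decisions get made on this surface (labels are made up here):
  // "preapproved" = allowedTools list, "policy" = code decides, "dialog" = a human decides.
  permissionUx: "preapproved" | "policy" | "dialog";
}

const PROFILES: Record<Surface, SurfaceProfile> = {
  cli: { permissionMode: "acceptEdits", settingSources: ["project"], permissionUx: "preapproved" },
  server: { permissionMode: "default", settingSources: ["project"], permissionUx: "policy" },
  ide: { permissionMode: "acceptEdits", settingSources: ["project", "user"], permissionUx: "dialog" },
};

// returns a copy so callers cannot mutate the shared table.
function profileFor(surface: Surface): SurfaceProfile {
  const p = PROFILES[surface];
  return { ...p, settingSources: [...p.settingSources] };
}
```

the table makes the differences reviewable in one diff: only the ide profile reads user settings, and only the server runs in the default permission mode.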

cli — the developer surface

a script. argv parsing, an env file, no auth, a tty for stdout. the surface where you'd run a refactor, a one-shot review, a release note generator. the agent runs as the developer, with the developer's settings.

a one-shot cli — settings come from the project on disk
#!/usr/bin/env node
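// scripts/agent.ts: a one-shot cli wrapper.
// usage: ./agent.ts "do the thing"
// note: running a .ts file under plain node assumes built-in type stripping and
// an esm package (for top-level await); otherwise run it through a ts runner.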
import { query } from "@anthropic-ai/claude-agent-sdk";

const prompt = process.argv.slice(2).join(" ");
if (!prompt) { console.error("usage: agent.ts <prompt>"); process.exit(1); }

const run = query({
  prompt,
  options: {
    permissionMode: "acceptEdits",
    settingSources: ["project"],
    allowedTools: ["Read", "Edit", "Glob", "Grep", "Bash"],
  },
});

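for await (const ev of run) {
  // print the text blocks of each assistant message; other block types are skipped.
  if (ev.type === "assistant") {
    for (const b of ev.message.content) {
      if (b.type === "text") process.stdout.write(b.text);
    }
  }
  // anything other than a successful result becomes a non-zero exit code.
  if (ev.type === "result" && ev.subtype !== "success") {
    process.stderr.write(`\nagent ended: ${ev.subtype}\n`);
    process.exit(2);
  }
}
process.stdout.write("\n"); // end the output cleanly for the shell prompt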
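
the script above exits with 2 for every failure. if callers such as ci jobs need to tell failures apart, the mapping can live in a small pure function. a sketch under assumptions: `exitCodeFor` is a hypothetical helper, the subtype name `error_max_turns` and the code 3 are assumptions, and the sdk's result type is the place to check the real list of subtypes.

```typescript
// hypothetical helper: maps a result subtype to a process exit code.
function exitCodeFor(subtype: string): number {
  switch (subtype) {
    case "success":
      return 0;
    case "error_max_turns": // assumed subtype name: the run hit its turn limit
      return 3; // arbitrary code chosen for this sketch
    default:
      return 2; // same code the script above uses for any failure
  }
}
```

in the loop, `process.exit(exitCodeFor(ev.subtype))` would replace the hard-coded `process.exit(2)`.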

server — the production surface

an http endpoint, probably streaming sse. running on a vps, in a container, behind a load balancer. zero developer context — the agent has to be self-contained. this is where most permission bugs ship to production.
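
one way to keep permission bugs out of production is to write the policy as a pure function, deny by default, and unit-test it before it goes anywhere near `canUseTool`. a minimal sketch: `decide`, `Decision`, and `SERVER_ALLOWED` are hypothetical names, the read-only tool list is an assumption, and the result shape copies the one used in this lesson's examples.

```typescript
// local stand-in for the result shape canUseTool returns in this lesson.
type Decision =
  | { behavior: "allow"; updatedInput: Record<string, unknown> }
  | { behavior: "deny"; message: string };

// hypothetical allowlist: read-only tools only. adjust per endpoint.
const SERVER_ALLOWED: readonly string[] = ["Read", "Glob", "Grep"];

// deny by default: a tool that is not listed is refused, including tools
// added to the agent later that nobody remembered to review.
function decide(name: string, input: Record<string, unknown>): Decision {
  if (SERVER_ALLOWED.includes(name)) {
    return { behavior: "allow", updatedInput: input };
  }
  return { behavior: "deny", message: `${name} is not allowed from the http agent` };
}
```

wired into the sdk it would look like `canUseTool: async (name, input) => decide(name, input)`. the difference from a denylist is the failure mode: a forgotten tool gets refused instead of silently allowed.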

hono + sse — the canonical streaming agent endpoint
// server-side: an http endpoint that streams sse to the browser.
// the http surface is yours; the agent is the same query() call.
import { Hono } from "hono";
import { streamSSE } from "hono/streaming";
import { query } from "@anthropic-ai/claude-agent-sdk";

const app = new Hono();

app.post("/agent", async (c) => {
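  // reject malformed bodies before starting an agent run.
  const body = await c.req.json().catch(() => null);
  const prompt = body?.prompt;
  if (typeof prompt !== "string" || !prompt.trim()) {
    return c.json({ error: "prompt must be a non-empty string" }, 400);
  }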
  return streamSSE(c, async (stream) => {
    const run = query({
      prompt,
      options: {
        permissionMode: "default",
        // SERVER-SIDE: never inherit user/local settings. project only.
        settingSources: ["project"],
        // canUseTool is your bouncer for anything destructive.
        canUseTool: async (name, input) =>
          name === "Bash"
            ? { behavior: "deny", message: "no shell from the http agent" }
            : { behavior: "allow", updatedInput: input },
      },
    });

    for await (const ev of run) {
      await stream.writeSSE({ data: JSON.stringify(ev), event: ev.type });
    }
  });
});

// bun-style export; on node, serve app.fetch with @hono/node-server instead.
export default { fetch: app.fetch, port: 8787 };
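
for reference, this is roughly what each event looks like on the wire. the sketch follows the standard event stream format (an `event:` line, one or more `data:` lines, a blank line to end the frame). `toSSEFrame` and `parseSSEFrame` are hypothetical helpers written for illustration, not hono or sdk functions.

```typescript
// builds one sse frame. payloads containing newlines are split across
// several data: lines, as the event stream format requires.
function toSSEFrame(event: string, data: string): string {
  const dataLines = data
    .split("\n")
    .map((line) => `data: ${line}`)
    .join("\n");
  return `event: ${event}\n${dataLines}\n\n`;
}

// the format drops a single leading space after the colon, if present.
function stripOneSpace(value: string): string {
  return value.startsWith(" ") ? value.slice(1) : value;
}

// parses a single frame back into its event name and data payload.
// comments, id:, and retry: fields are ignored in this sketch.
function parseSSEFrame(frame: string): { event: string; data: string } {
  let event = "message"; // default event type when no event: field is present
  const data: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = stripOneSpace(line.slice(6));
    else if (line.startsWith("data:")) data.push(stripOneSpace(line.slice(5)));
  }
  return { event, data: data.join("\n") };
}
```

because the endpoint is a POST, the browser's `EventSource` (which only issues GET requests) can't call it directly. a client would read the response body with `fetch`, split the stream on blank lines, and parse each frame along these lines.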

ide — the editor surface

a vs code extension, a jetbrains plugin, a custom editor. the editor, not the agent, owns the rich ux: showInputBox, showWarningMessage, output channels, diff views. permission ux belongs in the editor, not in stdout.

vs code — canUseTool routes through showWarningMessage
// vs code extension: drive the same sdk, surface events in an output channel.
// the agent doesn't know it's in an editor; the editor knows it has an agent.
import * as vscode from "vscode";
import { query } from "@anthropic-ai/claude-agent-sdk";

export function activate(ctx: vscode.ExtensionContext) {
  // one channel for the extension's lifetime; creating it per run would leak channels.
  const out = vscode.window.createOutputChannel("scalable agent");
  ctx.subscriptions.push(
    out,
    vscode.commands.registerCommand("scalable.agent.run", async () => {
      const prompt = await vscode.window.showInputBox({ prompt: "agent prompt" });
      if (!prompt) return;
      out.show();

      const run = query({
        prompt,
        options: {
          cwd: vscode.workspace.workspaceFolders?.[0]?.uri.fsPath,
          permissionMode: "acceptEdits",
          settingSources: ["project", "user"],
          // route ask-permission through the editor, not stdin.
          canUseTool: async (name, input) => {
            const ok = await vscode.window.showWarningMessage(
              `Run ${name}? ${JSON.stringify(input).slice(0, 80)}`,
              "allow", "deny",
            );
            return ok === "allow"
              ? { behavior: "allow", updatedInput: input }
              : { behavior: "deny", message: "user denied" };
          },
        },
      });

      for await (const ev of run) {
        if (ev.type === "assistant") {
          for (const b of ev.message.content) {
            if (b.type === "text") out.append(b.text);
          }
        }
      }
    }),
  );
}

what doesn't change

everything you've learned about tool selection, permissions, hooks, sub-agents, system prompts, and skills applies on all three surfaces. the agent loop doesn't care whether the next event goes to a tty, an sse stream, or an editor panel. that's the point of the sdk: one loop, many surfaces.
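
the "one loop, many surfaces" idea can be sketched as a single drain function that takes the destination as a parameter. `drainText`, `Sink`, and the event types below are hypothetical local stand-ins; the sdk's real message types carry more fields and more variants.

```typescript
// minimal event shapes, just enough for the text-printing loops in this lesson.
interface ContentBlock {
  type: string;
  text?: string;
}
interface AgentEvent {
  type: string;
  message?: { content: ContentBlock[] };
}

// a sink is wherever the text ends up: a tty, an output channel, a test buffer.
type Sink = (text: string) => void;

// forwards the text blocks of assistant events to the sink and ignores the rest.
async function drainText(events: AsyncIterable<AgentEvent>, sink: Sink): Promise<void> {
  for await (const ev of events) {
    if (ev.type !== "assistant" || !ev.message) continue;
    for (const block of ev.message.content) {
      if (block.type === "text" && typeof block.text === "string") sink(block.text);
    }
  }
}
```

the cli would pass `(t) => process.stdout.write(t)` and the extension `(t) => out.append(t)`. the loop body is identical in both, which also means it can be tested once against a fake event stream.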

operational checklist

the harness is the product. in eight lessons we've never taught you a prompt-engineering trick or a clever model parameter. the value of the sdk is the harness: permissions, hooks, sub-agents, the event stream. when an agent behaves well in production, that's almost always why.