lesson 08 / 08 · 18 min

shipping to real aws

what changes when you leave the sandbox. cli, iam, costs, and the smallest deploy that works.

what the sandbox hid

every lesson so far ran your code in a module worker, transpiling typescript in-browser against a mocked sdk. that was a teaching choice: it let you write real-shape agent code without an aws account. the shape is correct. what the sandbox left out is the infrastructure production needs around that code: iam, containers, network, billing.

this lesson walks the actual deploy. thirty minutes if you have an aws account handy, longer if you don't.

prerequisites

  1. aws account: in a region where bedrock and agentcore are available. us-east-1, us-west-2, and eu-west-1 are the safe bets as of writing; check the current regional availability list before you commit.
  2. aws cli, configured: run aws configure with credentials that can create iam roles, ecr repos, and agentcore resources. a dev account is fine; don't use your root user.
  3. node.js 20+: the typescript sdk requires node 20 and up. match the runtime's node version when you build the container.
  4. docker: for building the container image agentcore runtime will run. podman works too.
  5. agentcore cli (optional): install with npm install -g @aws/agentcore. it scaffolds a new project with your choice of framework (strands, langgraph, vercel ai sdk) and language. agentcore create saves you the Dockerfile and IAM boilerplate below.
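
the node requirement in item 3 is easy to check before you build. a minimal sketch, assuming you pass in the version string yourself (for example the output of node --version) and the tag from your Dockerfile's FROM line. the helper names are hypothetical, not part of any agentcore tooling.

```typescript
// Hypothetical preflight helpers; not part of the AgentCore CLI or SDK.

// Extracts the major version from strings such as "v20.11.1" or "20.11.1".
function nodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version.trim());
  if (!match) {
    throw new Error(`unrecognised node version: ${version}`);
  }
  return Number(match[1]);
}

// True when the version satisfies the minimum major (20, per the list above).
function meetsMinimum(version: string, minimumMajor = 20): boolean {
  return nodeMajor(version) >= minimumMajor;
}

// True when local node and the container base image share a major version.
// imageTag is the tag part of the FROM line, e.g. "20-slim" for node:20-slim.
function matchesImage(version: string, imageTag: string): boolean {
  const tagMatch = /^(\d+)/.exec(imageTag);
  return tagMatch !== null && Number(tagMatch[1]) === nodeMajor(version);
}
```

running matchesImage against the tag you plan to build from catches the case where you compile on one major version and run on another.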

install

npm install bedrock-agentcore

that's the production sdk. the real import paths and class names match what you've been using: import { BedrockAgentCoreApp } from "bedrock-agentcore/runtime". the mock in this course exists to mirror that surface.

for lower-level access you also want the aws sdk for javascript v3 clients: @aws-sdk/client-bedrock-agentcore-control for the control plane (creating runtimes and memories) and @aws-sdk/client-bedrock-agentcore for the data plane (invoking agents, managing sessions).

the agent, unchanged


the only line the sandbox couldn't run is app.run(). in production that boots a fastify server on port 8080 which agentcore runtime invokes over the wire. everything else is byte-for-byte the same.

create the runtime

resources come first. you need a runtime (to host the agent) and typically a memory (to give it state). both are one-shot creations — usually done via the aws console the first time, then codified in cloudformation or cdk.

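# flag names follow the CreateMemory / CreateAgentRuntime api shapes;
# confirm with `aws bedrock-agentcore-control <command> help` before running.

# create a memory resource (resource names take letters, digits, and underscores)
aws bedrock-agentcore-control create-memory \
  --name "my_agent_memory" \
  --event-expiry-duration 30

# create the agent runtime (references the ecr image you'll push next)
aws bedrock-agentcore-control create-agent-runtime \
  --agent-runtime-name "my_agent" \
  --agent-runtime-artifact '{"containerConfiguration":{"containerUri":"<account>.dkr.ecr.<region>.amazonaws.com/my-agent:latest"}}' \
  --network-configuration '{"networkMode":"PUBLIC"}' \
  --role-arn "arn:aws:iam::<account>:role/AgentCoreRuntimeRole"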

the role needs three things: a trust policy that lets the agentcore service assume it, permission to read from the memory resource, and whatever outbound permissions the agent itself needs (bedrock invoke, s3 read, whatever your tools reach). the aws docs have a starter policy document; start from it rather than hand-writing one.
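
the first of those three, the trust policy, is the piece people most often get wrong. a minimal sketch of what it looks like, assuming the service principal bedrock-agentcore.amazonaws.com and the two condition keys used in the aws starter policy. buildTrustPolicy is a hypothetical helper, and the output should be compared against the current docs before use.

```typescript
// Sketch of the trust policy on the runtime's execution role.
// Assumption: principal and condition keys mirror the AWS starter policy;
// verify against the current documentation.
interface TrustPolicy {
  Version: "2012-10-17";
  Statement: Array<{
    Effect: "Allow";
    Principal: { Service: string };
    Action: "sts:AssumeRole";
    Condition: {
      StringEquals: { "aws:SourceAccount": string };
      ArnLike: { "aws:SourceArn": string };
    };
  }>;
}

// Hypothetical helper: fills in the account and region placeholders.
function buildTrustPolicy(accountId: string, region: string): TrustPolicy {
  if (!/^\d{12}$/.test(accountId)) {
    throw new Error("accountId must be a 12-digit aws account id");
  }
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { Service: "bedrock-agentcore.amazonaws.com" },
        Action: "sts:AssumeRole",
        // The conditions restrict who can make the service assume the role.
        Condition: {
          StringEquals: { "aws:SourceAccount": accountId },
          ArnLike: {
            "aws:SourceArn": `arn:aws:bedrock-agentcore:${region}:${accountId}:*`,
          },
        },
      },
    ],
  };
}
```

write JSON.stringify(buildTrustPolicy("<account>", "<region>"), null, 2) to a file and pass it to aws iam create-role with --assume-role-policy-document file://trust.json. the memory and outbound permissions go in a separate permissions policy attached to the same role.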

build and push the container

# dockerfile — ~10 lines, nothing exotic
cat > Dockerfile <<'EOF'
FROM node:20-slim
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY dist ./dist
EXPOSE 8080
CMD ["node", "dist/agent.js"]
EOF

# build for linux/arm64 (the architecture agentcore runtime expects)
docker buildx build --platform linux/arm64 -t my-agent:latest --load .

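# create the ecr repository (first time only), then log in and push
aws ecr create-repository --repository-name my-agent --region <region>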
aws ecr get-login-password --region <region> | docker login --username AWS \
  --password-stdin <account>.dkr.ecr.<region>.amazonaws.com
docker tag my-agent:latest <account>.dkr.ecr.<region>.amazonaws.com/my-agent:latest
docker push <account>.dkr.ecr.<region>.amazonaws.com/my-agent:latest

if you used agentcore create to scaffold, the Dockerfile above is already generated — agentcore deploy bundles the build, push, and runtime-update steps into one command.

now point the runtime at the image (via update-agent-runtime if it exists, or recreate) and wait a minute or two for cold start. the runtime is live.
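
the image uri appears in three commands above, and a typo in any of them points the runtime at nothing. a small sketch that builds it in one place, assuming a unique tag per build (a git sha, say) rather than latest so each deploy is addressable. ecrImageUri and ImageRef are hypothetical names.

```typescript
// Hypothetical helper: builds the ECR image URI used in the commands above.
interface ImageRef {
  account: string; // 12-digit aws account id
  region: string; // e.g. "us-east-1"
  repository: string; // e.g. "my-agent"
  tag: string; // e.g. a short git sha
}

function ecrImageUri({ account, region, repository, tag }: ImageRef): string {
  if (!/^\d{12}$/.test(account)) {
    throw new Error("account must be a 12-digit aws account id");
  }
  // Docker tag rules: letters, digits, "_", ".", "-"; at most 128 characters;
  // must not start with "." or "-".
  if (!/^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$/.test(tag)) {
    throw new Error(`invalid image tag: ${tag}`);
  }
  return `${account}.dkr.ecr.${region}.amazonaws.com/${repository}:${tag}`;
}
```

the same string then feeds docker tag, docker push, and the container uri on the runtime.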

invoke it

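# the payload is a blob: cli v2 needs raw-in-base64-out to accept raw json,
# and the response body is written to the trailing output file.
# confirm the exact flags with `aws bedrock-agentcore invoke-agent-runtime help`.
aws bedrock-agentcore invoke-agent-runtime \
  --agent-runtime-arn "<arn>" \
  --payload '{"prompt": "hello"}' \
  --cli-binary-format raw-in-base64-out \
  --accept "application/json" \
  response.txt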

the response streams server-sent events — the same { event: "message", data: { text: "..." } } shape you've seen throughout. pipe it into your frontend, a chat ui, a batch consumer, whatever calls it.
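
if whatever calls the runtime receives raw sse text, it has to be split back into events. a minimal sketch of that, assuming the standard wire format (an optional event: line, one or more data: lines, a blank line between events) and json in the data field. it parses a complete body; a real client would feed it chunks as they arrive. parseSse and AgentEvent are hypothetical names.

```typescript
// One parsed server-sent event, in the shape described above.
interface AgentEvent {
  event: string;
  data: unknown;
}

// Parses a complete SSE body into events.
// Assumptions: blank-line separators, "event:" and "data:" fields, JSON data.
function parseSse(body: string): AgentEvent[] {
  const events: AgentEvent[] = [];
  for (const block of body.split(/\r?\n\r?\n/)) {
    let event = "message"; // the SSE default when no event field is sent
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        // trim() is slightly looser than the spec, which strips one space only
        dataLines.push(line.slice("data:".length).trim());
      }
      // comment lines (starting with ":") and other fields are ignored
    }
    if (dataLines.length > 0) {
      events.push({ event, data: JSON.parse(dataLines.join("\n")) });
    }
  }
  return events;
}
```

joining the text fields of the parsed events in order reconstructs the full reply.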

what changes vs the sandbox

  1. streaming is real: in the sandbox app.invoke returned an array. in production the caller sees each yield arrive as a separate sse event, so design your ui for that.
  2. cold starts exist: first invocation after idle takes 3–8 seconds. provisioned concurrency warms the container; it also doubles the base cost.
  3. memory has retention: session events expire per the ttl you set on the memory resource. long-term records live until you delete them. plan both.
  4. identity is attached by agentcore: the principal you read in your handler is injected by the runtime based on your inbound auth config, not by whoever called invoke. trust it.
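
the first item is the one that changes calling code. a sketch of the difference, assuming a handler written as an async generator (which the "each yield" wording suggests). handler here is a stand-in that echoes words, not the course's agent, and collect and render are hypothetical names.

```typescript
interface Chunk {
  event: "message";
  data: { text: string };
}

// Stand-in handler: yields one chunk per word of the prompt.
async function* handler(prompt: string): AsyncGenerator<Chunk> {
  for (const word of prompt.split(" ")) {
    yield { event: "message", data: { text: word } };
  }
}

// Sandbox-style consumption: drain the stream, then hand back an array.
async function collect(stream: AsyncIterable<Chunk>): Promise<Chunk[]> {
  const out: Chunk[] = [];
  for await (const chunk of stream) {
    out.push(chunk);
  }
  return out;
}

// Production-style consumption: act on each chunk as soon as it arrives.
async function render(
  stream: AsyncIterable<Chunk>,
  onText: (text: string) => void,
): Promise<void> {
  for await (const chunk of stream) {
    onText(chunk.data.text);
  }
}
```

a ui built around collect shows nothing until the agent finishes; one built around render can paint partial output, which is what the sse transport gives you.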

costs to watch

three axes, billed independently:

  1. runtime minutes: per-second billing while your container is running. idle time between invocations costs nothing unless you've provisioned concurrency.
  2. model tokens: whatever model the agent calls (bedrock, anthropic, openai), billed by that provider. the biggest line item by far.
  3. memory storage: session events are cheap; long-term records (with embeddings) cost more per gb. a chatty agent with no retention policy adds up faster than you'd expect.

set a budget alert before you invoke anything. an agent that gets stuck in a retry loop can chew through a month of token budget in an hour. a cloudwatch alarm on token spend per invocation, plus a per-account budget notification, is the cheapest insurance you can buy.
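
alarms tell you after the money is spent. an in-process cap stops the loop itself. a minimal sketch, assuming each model call can report how many tokens it used. TokenBudget and retryWithinBudget are hypothetical, and this complements a billing alarm rather than replacing one.

```typescript
// Hypothetical per-invocation token cap.
class TokenBudget {
  private used = 0;
  private readonly limit: number;

  constructor(limit: number) {
    if (!Number.isFinite(limit) || limit <= 0) {
      throw new Error("limit must be a positive number");
    }
    this.limit = limit;
  }

  // Records spend and throws once the cap is crossed, which ends any loop.
  spend(tokens: number): void {
    if (tokens < 0) {
      throw new Error("tokens must not be negative");
    }
    this.used += tokens;
    if (this.used > this.limit) {
      throw new Error(`token budget exceeded: ${this.used} > ${this.limit}`);
    }
  }

  get remaining(): number {
    return Math.max(0, this.limit - this.used);
  }
}

// Result of one model call; the token count is assumed to come from the provider.
interface Attempt<T> {
  ok: boolean;
  tokens: number;
  value?: T;
}

// Retries up to maxAttempts, charging every attempt against the budget.
async function retryWithinBudget<T>(
  budget: TokenBudget,
  maxAttempts: number,
  attempt: () => Promise<Attempt<T>>,
): Promise<T> {
  for (let i = 0; i < maxAttempts; i++) {
    const result = await attempt();
    budget.spend(result.tokens); // failed attempts cost tokens too
    if (result.ok) {
      return result.value as T;
    }
  }
  throw new Error(`gave up after ${maxAttempts} attempts`);
}
```

create one budget per invocation so a single bad request cannot spend more than its cap, whatever the retry count says.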

the pre-production checklist

  1. iam role scoped: the runtime role holds only the permissions the agent needs. "AmazonBedrockFullAccess" is not a permissions boundary.
  2. secrets in parameter store / secrets manager: not env vars, not baked into images. the runtime can read them at startup through the role.
  3. alarms on error rate + p95 latency: cloudwatch dashboards are free, alarms are cheap. set them before you need them.
  4. a rollback plan: previous image tag retained in ecr, update-agent-runtime points at it. rolling back should be one command, not a war room.
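
item 4 only works if you can name the previous tag without thinking. a sketch of that lookup, assuming you have already flattened the repository's images into tag and push-time pairs (aws ecr describe-images reports both). previousTag and PushedImage are hypothetical names.

```typescript
// One image in the repository; pushedAt is an ISO 8601 timestamp.
interface PushedImage {
  tag: string;
  pushedAt: string;
}

// Returns the tag pushed immediately before currentTag, or undefined when
// currentTag is the oldest image or is not in the list.
function previousTag(
  images: PushedImage[],
  currentTag: string,
): string | undefined {
  // Sort a copy oldest-first so the caller's array is left untouched.
  const ordered = [...images].sort(
    (a, b) => Date.parse(a.pushedAt) - Date.parse(b.pushedAt),
  );
  const index = ordered.findIndex((image) => image.tag === currentTag);
  return index > 0 ? ordered[index - 1]?.tag : undefined;
}
```

this relies on unique tags per build. if every push overwrites latest, there is no previous tag to find.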

that's the whole course

you now know the seven primitives, how they compose, what the real sdk looks like, and how to ship an agent that uses them. the mock was a ladder; you're off it now.

the agentcore docs are the canonical reference. the sdk moves fast, and anything you read here was fresh when written but won't be forever. treat this course as the mental model. treat the docs as the source of truth.
