what the runtime does
the runtime is where your agent lives. in production, app.run() boots a fastify
server on port 8080; agentcore routes each invocation to your configured
handler and streams events back to the caller over sse as you yield them.
here in the sandbox we skip the server and call the handler directly —
same typescript, same yields, fewer moving parts.
the runtime isn't opinionated about how you generate those events. strands, langgraph, the vercel ai sdk, your own loop — all fine. the runtime's job is to serve, scale, and observe. your job is to yield.
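to make "streams events back to the caller over sse" concrete, here is a minimal sketch of how one { event, data } pair could be framed as a server-sent event. the AgentEvent type and toSseFrame helper are hypothetical names, not part of the sdk, and the exact fields the real runtime writes on the wire are an assumption; only the frame layout (field lines ended by a blank line) comes from the sse format itself.

```typescript
// Hypothetical event shape, mirroring the { event, data } pair used in this lesson.
interface AgentEvent {
  event: string;
  data: unknown; // anything JSON-serializable
}

// Serialize one event as an SSE frame: an "event:" line, a "data:" line,
// then a blank line that marks the end of the frame.
// JSON.stringify without an indent argument never emits raw newlines,
// so the payload always fits on a single "data:" line.
function toSseFrame({ event, data }: AgentEvent): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

const frame = toSseFrame({ event: "delta", data: { text: "hi" } });
console.log(JSON.stringify(frame)); // JSON-escaped so the newlines are visible
```

each yield from the handler would turn into one such frame, which is why the caller can act on an event before the handler has finished.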
anatomy
- new BedrockAgentCoreApp({...}): the app instance. takes an invocationHandler and (in production) becomes the server target.
- invocationHandler.process: an async generator (or async function); this is the handler. one per app. optionally paired with requestSchema (a zod schema) for inbound validation.
- payload: a plain object, whatever the caller sent. agentcore doesn't enforce a schema unless you give one; you own it.
- yield event: stream an event to the caller as sse. the sdk expects { event, data }, where data is any json-serializable value.
- app.run(): starts the fastify server; production only. in the sandbox we call app.invoke(payload) to exercise the handler directly.
the trace shows the runtime lifecycle: app.init when the app object is
constructed, app.entrypoint_registered when the handler is wired up,
app.invoke when it's called, app.yield for each event, and app.done when
the handler returns.
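the lifecycle above can be imitated with a small stand-in. this is a sketch, not the sdk: SketchApp is a hypothetical class that only mirrors the option names used in this lesson (invocationHandler.process) and records the five trace labels in the order described. how the real BedrockAgentCoreApp emits them is not shown here.

```typescript
type AgentEvent = { event: string; data: unknown };
type Handler = (payload: unknown) => AsyncGenerator<AgentEvent>;

// Hypothetical stand-in for the app object; it only records trace labels.
class SketchApp {
  readonly trace: string[] = [];
  private readonly handler: Handler;

  constructor(options: { invocationHandler: { process: Handler } }) {
    this.trace.push("app.init");
    this.handler = options.invocationHandler.process;
    this.trace.push("app.entrypoint_registered");
  }

  // Call the handler directly and collect what it yields, sandbox-style.
  async invoke(payload: unknown): Promise<AgentEvent[]> {
    this.trace.push("app.invoke");
    const events: AgentEvent[] = [];
    for await (const ev of this.handler(payload)) {
      this.trace.push("app.yield");
      events.push(ev);
    }
    this.trace.push("app.done");
    return events;
  }
}

const app = new SketchApp({
  invocationHandler: {
    process: async function* (payload) {
      yield { event: "start", data: payload };
      yield { event: "end", data: null };
    },
  },
});

app.invoke({ prompt: "hi" }).then(() => console.log(app.trace.join("\n")));
```

a handler with two yields produces two app.yield lines between app.invoke and app.done; the first two labels appear once, at construction time.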
the payload is just an object
agentcore passes the request body through verbatim. there's no required
schema. most teams settle on something like
{ prompt: "...", sessionId: "...", tools: [...] },
but that's convention, not enforcement. if you want enforcement, attach
a requestSchema (zod) to the invocationHandler
and the runtime validates every call for you.
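to show what "enforcement" means for the conventional payload shape, here is a hand-rolled check. it stands in for a zod schema so the example has no dependencies; parsePayload, InvokePayload, and the rule that tools is a list of strings are assumptions made for illustration. in real code the equivalent zod schema, attached as requestSchema, would do this job.

```typescript
// Conventional payload shape from the text; the runtime itself does not enforce it.
interface InvokePayload {
  prompt: string;
  sessionId?: string;
  tools?: string[];
}

type ParseResult =
  | { ok: true; value: InvokePayload }
  | { ok: false; error: string };

const isString = (v: unknown): v is string => typeof v === "string";
const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every(isString);

// Validate an untrusted payload and return either a typed value or an error.
function parsePayload(input: unknown): ParseResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, error: "payload must be an object" };
  }
  const { prompt, sessionId, tools } = input as Record<string, unknown>;
  if (!isString(prompt) || prompt.length === 0) {
    return { ok: false, error: "prompt must be a non-empty string" };
  }
  const value: InvokePayload = { prompt };
  if (sessionId !== undefined) {
    if (!isString(sessionId)) return { ok: false, error: "sessionId must be a string" };
    value.sessionId = sessionId;
  }
  if (tools !== undefined) {
    if (!isStringArray(tools)) return { ok: false, error: "tools must be an array of strings" };
    value.tools = tools;
  }
  return { ok: true, value };
}

console.log(parsePayload({ prompt: "hello", sessionId: "s-1" }));
console.log(parsePayload({ sessionId: "s-1" }));
```

returning a result object instead of throwing keeps the rejection path explicit, so a handler can decide whether a bad payload becomes an error event or an exception.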
try editing the invoke call: switch mode to
"upper" or "reverse", drop the prompt entirely,
or pass an unknown mode. each path produces a different trace, which is
exactly what you'd see in cloudwatch for the real runtime.
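the sandbox handler itself isn't reproduced here, so the following is a guess at its general shape: a sketch of a generator that branches on mode. the "echo" default, the event names result and error, and the choice to yield an error event (rather than throw) for a missing prompt or unknown mode are all assumptions.

```typescript
type AgentEvent = { event: string; data: unknown };

// Hypothetical handler: transforms the prompt according to the requested mode.
async function* handle(
  payload: { prompt?: string; mode?: string },
): AsyncGenerator<AgentEvent> {
  const { prompt, mode = "echo" } = payload; // "echo" default is an assumption
  if (prompt === undefined) {
    yield { event: "error", data: { message: "missing prompt" } };
    return;
  }
  switch (mode) {
    case "echo":
      yield { event: "result", data: prompt };
      break;
    case "upper":
      yield { event: "result", data: prompt.toUpperCase() };
      break;
    case "reverse":
      yield { event: "result", data: prompt.split("").reverse().join("") };
      break;
    default:
      yield { event: "error", data: { message: `unknown mode: ${mode}` } };
  }
}

// Drain a handler into an array, the way a direct sandbox call would.
async function collect(events: AsyncIterable<AgentEvent>): Promise<AgentEvent[]> {
  const out: AgentEvent[] = [];
  for await (const ev of events) out.push(ev);
  return out;
}

collect(handle({ prompt: "abc", mode: "reverse" })).then((events) =>
  console.log(events),
);
```

each of the four paths (known mode, default mode, missing prompt, unknown mode) yields a different event sequence, which is what makes the traces differ.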
yield vs return
two styles work. the common one is an async generator that yields events one at a time — the runtime streams each to the caller as an sse event as it appears. the alternative is a plain async function that returns a single value at the end. generators are the norm for agent workloads because you want partial progress visible (text deltas, tool calls, intermediate thoughts) without waiting for the full turn.
look at the traces side by side. the first shows three
app.yield lines; the second shows a single
app.return. both are valid, both are seen by the runtime —
but only the first gives the caller progressive output.
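the two styles can be put next to each other in a few lines. this is a sketch under assumptions: runHandler is a hypothetical adapter, not the runtime's dispatch code, and it only records the trace labels mentioned above. it tells the styles apart by checking whether the handler's result is async-iterable.

```typescript
type AgentEvent = { event: string; data: unknown };

// Style 1: async generator; events leave one at a time.
async function* streaming(): AsyncGenerator<AgentEvent> {
  yield { event: "delta", data: "hel" };
  yield { event: "delta", data: "lo" };
  yield { event: "end", data: null };
}

// Style 2: plain async function; a single value at the end.
async function oneShot(): Promise<AgentEvent> {
  return { event: "result", data: "hello" };
}

function isAsyncIterable(v: unknown): v is AsyncIterable<AgentEvent> {
  return typeof v === "object" && v !== null && Symbol.asyncIterator in v;
}

// Hypothetical adapter: run either style and record what a trace might show.
async function runHandler(
  handler: () => AsyncGenerator<AgentEvent> | Promise<AgentEvent>,
): Promise<string[]> {
  const trace: string[] = [];
  const result = handler();
  if (isAsyncIterable(result)) {
    for await (const ev of result) trace.push(`app.yield ${ev.event}`);
  } else {
    const ev = await result;
    trace.push(`app.return ${ev.event}`);
  }
  return trace;
}

runHandler(streaming).then((t) => console.log(t));
runHandler(oneShot).then((t) => console.log(t));
```

the generator produces three app.yield entries and the plain function a single app.return, matching the two traces described above.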
in the sandbox, app.invoke() collects all events into an
array and hands them to you at the end. in production, the real runtime
streams them to the http caller as server-sent events: the
caller sees each one arrive, not all at once. don't build a mental model
where the whole list materializes before anyone sees the first event.
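the difference between collecting and streaming shows up in the order of side effects. in this sketch (all names are hypothetical) the same generator is consumed twice: once event by event, once after draining it into an array. a shared log records when each event is produced and when the consumer sees it.

```typescript
// Produces three events and logs the moment each one is created.
async function* handler(log: string[]): AsyncGenerator<number> {
  for (let i = 1; i <= 3; i++) {
    log.push(`produced ${i}`);
    yield i;
  }
}

// Streaming consumer: sees each event as soon as it is yielded.
async function consumeStreaming(log: string[]): Promise<void> {
  for await (const n of handler(log)) log.push(`seen ${n}`);
}

// Collecting consumer: waits for the handler to finish, then looks at the list.
async function consumeCollected(log: string[]): Promise<void> {
  const all: number[] = [];
  for await (const n of handler(log)) all.push(n);
  for (const n of all) log.push(`seen ${n}`);
}

const streamLog: string[] = [];
const collectLog: string[] = [];
consumeStreaming(streamLog).then(() => console.log(streamLog.join(", ")));
consumeCollected(collectLog).then(() => console.log(collectLog.join(", ")));
```

the streaming log interleaves (produced 1, seen 1, produced 2, ...), while the collected log lists every "produced" line before the first "seen". the production runtime behaves like the first; the sandbox array behaves like the second.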
when it breaks
uncaught exceptions propagate. there's no implicit retry, no swallowed
stack traces — the runtime records the failure, surfaces it to the caller,
and moves on. what you see below is the same shape you'd see in
observability: the events that made it out before the crash, then an
app.error marker, then the exception itself.
flip fail to false and run again. the trace
collapses down to the happy path — three yields and a done — with no
error event. this is the shape of the contract: one end-of-turn marker
either way, success or failure.
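the failure shape can be reproduced with a try/catch around the event loop. a minimal sketch, assuming a hypothetical runWithTrace helper and a flaky handler with a fail switch like the one in the sandbox. unlike the real runtime, this helper records the exception in the trace instead of also surfacing it to a caller.

```typescript
type AgentEvent = { event: string; data: unknown };

// Hypothetical handler that crashes after two events when fail is true.
async function* flaky(fail: boolean): AsyncGenerator<AgentEvent> {
  yield { event: "step", data: 1 };
  yield { event: "step", data: 2 };
  if (fail) throw new Error("boom");
  yield { event: "step", data: 3 };
}

// Records every yield, then exactly one end-of-turn marker:
// "app.done" on success, "app.error" (plus the exception) on failure.
async function runWithTrace(events: AsyncIterable<AgentEvent>): Promise<string[]> {
  const trace: string[] = [];
  try {
    for await (const ev of events) trace.push(`app.yield ${ev.event}`);
    trace.push("app.done");
  } catch (err) {
    trace.push("app.error");
    trace.push(err instanceof Error ? `${err.name}: ${err.message}` : String(err));
  }
  return trace;
}

runWithTrace(flaky(true)).then((t) => console.log(t.join("\n")));
runWithTrace(flaky(false)).then((t) => console.log(t.join("\n")));
```

with fail set to true, the two events that made it out are kept, followed by the marker and the exception; with fail set to false, the trace is three yields and a done.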
what we skipped
- session context: each invocation is stateless as far as the runtime cares. state lives in memory, covered in the next lesson.
- tool calls: yielding a tool-call event doesn't call the tool. that's gateway's job (lesson 4).
- framework integration: strands, langgraph, the vercel ai sdk all plug in here as generators of events. the runtime doesn't care which.
next: memory — the state that survives the turn.