what you get for free
every lesson so far, the trace panel has been showing you exactly what agentcore's observability surface looks like. the runtime traces the invocation; memory traces reads and writes; gateway traces tool calls; identity traces principal reads and token issuance; browser and code interpreter trace their own events. you haven't written a single line of observability code.
in production, these events become opentelemetry spans. cloudwatch gets them automatically; so do x-ray, datadog, honeycomb — anywhere that speaks otel. the same trace you see here is the trace your on-call engineer sees at 2am.
anatomy of an automatic trace
read top to bottom. app.init constructs the app. each
primitive emits its own *.init. app.invoke
opens a span for this invocation; everything nested under it is attached
as child spans. app.yield and app.done close
the span. in cloudwatch this becomes a waterfall you can click through.
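to make the nesting rule concrete, here is a minimal sketch that groups a flat event list into setup events and per-invocation children. the event names app.init, app.invoke, app.yield and app.done come from the lesson; memory.init and gateway.init follow the *.init pattern; memory.read and gateway.call are hypothetical names used only for illustration, and the array shape is an assumption rather than the runtime's actual format.

```javascript
// Hypothetical flat event list, in the order a trace panel would show it.
// memory.read and gateway.call are made-up names for illustration.
const sampleEvents = [
  "app.init",
  "memory.init",
  "gateway.init",
  "app.invoke",
  "memory.read",
  "gateway.call",
  "app.yield",
  "app.done",
];

// Group events the way the lesson describes: everything before app.invoke
// is setup, everything between app.invoke and app.done is a child of that
// invocation, and app.done closes it.
function groupByInvocation(events) {
  const setup = [];
  const invocations = [];
  let open = null;

  for (const name of events) {
    if (name === "app.invoke") {
      open = { name, children: [], closed: false };
      invocations.push(open);
    } else if (open === null) {
      setup.push(name);
    } else if (name === "app.done") {
      open.closed = true;
      open = null;
    } else {
      open.children.push(name);
    }
  }
  return { setup, invocations };
}

const grouped = groupByInvocation(sampleEvents);
console.log(JSON.stringify(grouped, null, 2));
```

the grouped structure is the same shape a waterfall view draws: one parent per invocation, with the primitive events hanging off it.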
adding your own spans
automatic tracing covers the primitives. your agent's logic —
the retrieval step, the reranking, the generation call, the custom
business rule — doesn't trace itself. that's what span
is for.
each await span(name, attrs, fn) emits a
span.start event with the attributes you pass, runs
fn, then emits span.end when it returns. the
attributes (model name, topK, anything) become searchable tags in
cloudwatch. "show me every invocation where the reranker model was
cohere-rerank-3 and latency was over 500ms" is a query you can actually
write.
what happens when spans fail
span emits span.error when fn throws, records
the exception type and message, then lets the error propagate. in
production this becomes a marked error span in the waterfall, plus a
cloudwatch metric you can alert on. you don't need try/catch to get
error observability — you need it only when you want to recover.
what cloudwatch adds on top
- traces: the waterfall view. every span nested under its parent, timings drawn to scale. click a span to see its attributes and exception details. one view per invocation.
- metrics: aggregated counters and histograms, such as p50/p95/p99 latency per span name, error rates per tool, and memory event throughput. no extra instrumentation is needed; metrics are derived from the same spans.
- logs: anything you console.log (or write through the context-scoped log) ends up in cloudwatch logs, tagged with the trace id. jumping from a trace to its logs is one click.
- cost attribution: model tokens, sandbox minutes, memory storage, all billed per invocation and visible in the same trace. you can answer "which agent flow costs the most per turn" without leaving the console.
what to actually watch
three signals worth an alarm: app.error rate (errors per
invocation); p95 of the app.invoke → app.done
span (total latency); tool call failure rate (filtered on
gateway.error). everything else can wait until it matters.
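to show what those three numbers mean arithmetically, here is a small sketch that computes them from a flat event list. in practice cloudwatch derives them for you; this is only an illustration. the event shape ({ type, invocationId, ts }), the gateway.call event name, and the nearest-rank percentile method are assumptions made for the example.

```javascript
// Nearest-rank percentile over an ascending-sorted array.
function percentile(sorted, p) {
  if (sorted.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Compute the three alarm-worthy signals from raw events.
function summarize(events) {
  const startedAt = new Map();
  const latencies = [];
  let invocations = 0;
  let appErrors = 0;
  let toolCalls = 0;
  let toolErrors = 0;

  for (const e of events) {
    if (e.type === "app.invoke") {
      invocations += 1;
      startedAt.set(e.invocationId, e.ts);
    } else if (e.type === "app.done" && startedAt.has(e.invocationId)) {
      latencies.push(e.ts - startedAt.get(e.invocationId));
    } else if (e.type === "app.error") {
      appErrors += 1;
    } else if (e.type === "gateway.call") {
      toolCalls += 1; // hypothetical event name for a tool call
    } else if (e.type === "gateway.error") {
      toolErrors += 1;
    }
  }

  latencies.sort((a, b) => a - b);
  return {
    errorRate: invocations ? appErrors / invocations : 0,
    p95LatencyMs: percentile(latencies, 95),
    toolFailureRate: toolCalls ? toolErrors / toolCalls : 0,
  };
}

// Made-up sample: four invocations, one of which fails on a tool call.
const sample = [
  { type: "app.invoke", invocationId: "a", ts: 0 },
  { type: "gateway.call", invocationId: "a", ts: 10 },
  { type: "app.done", invocationId: "a", ts: 120 },
  { type: "app.invoke", invocationId: "b", ts: 1000 },
  { type: "gateway.call", invocationId: "b", ts: 1010 },
  { type: "gateway.error", invocationId: "b", ts: 1050 },
  { type: "app.error", invocationId: "b", ts: 1060 },
  { type: "app.invoke", invocationId: "c", ts: 2000 },
  { type: "app.done", invocationId: "c", ts: 2300 },
  { type: "app.invoke", invocationId: "d", ts: 3000 },
  { type: "gateway.call", invocationId: "d", ts: 3010 },
  { type: "app.done", invocationId: "d", ts: 3080 },
];

console.log(summarize(sample));
```

with so few samples the p95 is simply the slowest completed invocation; on real traffic the percentile is what keeps one slow outlier from paging anyone.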
next: shipping this to real aws.