lesson 03 / 08 · 14 min

memory — state that survives the turn

session memory vs long-term memory. events, strategies, retrieval across sessions.

what memory is for

without memory every invocation is a goldfish. the runtime doesn't keep state between turns — each invoke starts from scratch. memory is the primitive that makes a conversation feel continuous, a user feel recognized, a workflow feel like it remembers what happened five minutes ago.

agentcore memory is a managed service. you don't stand up redis, you don't pick an embedding store, you don't design a schema. you create events; the service handles persistence, retention, and (with strategies, below) distillation into long-term knowledge.

two kinds, one api

  1. session memory: short-term. everything in this session, such as the turns you just took, the user's last message, and what the agent just said. scoped to a session id, cleared when the session ends.
  2. long-term memory: survives sessions. name, preferences, recurring facts, distilled summaries. retrieved via search/semantic lookup, not listed by position.

both kinds write the same shape: an event with a role, content, and whatever metadata you attach. the difference is how you read — listEvents() gives you the current session in order; recall(query) finds matches across all sessions.
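
a rough sketch of that event shape, to make "role, content, metadata" concrete. the field names sessionId and timestamp and the makeEvent helper are assumptions for illustration, not the sdk's types.

```typescript
// hypothetical event shape; only role, content, and metadata come from the lesson text.
type Role = "user" | "assistant";

interface MemoryEvent {
  sessionId: string;                    // assumed: which session the event belongs to
  role: Role;
  content: string;
  metadata: { [key: string]: string };  // free-form tags you attach
  timestamp: number;                    // assumed: epoch milliseconds
}

// builds an event with empty metadata and the current time unless told otherwise
function makeEvent(
  sessionId: string,
  role: Role,
  content: string,
  metadata: { [key: string]: string } = {},
  timestamp: number = Date.now()
): MemoryEvent {
  return { sessionId, role, content, metadata, timestamp };
}

const event = makeEvent("session-1", "user", "i take my coffee black", { channel: "web" }, 0);
console.log(JSON.stringify(event));
```

session memory and long-term memory would both be fed by events like this one; only the read path differs.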

note. high-level memory helpers are still rolling out in the typescript sdk. the shape you see here is what the first-class Memory wrapper will expose. in production today, use the lower-level @aws-sdk/client-bedrock-agentcore commands (CreateEvent, ListEvents, RetrieveMemoryRecords) directly, and switch to the wrapper once it lands.

the smallest possible memory

write three events, read them back
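
a minimal sketch of this exercise, assuming an in-memory stand-in for the Memory wrapper. the class below is a mock written for illustration (synchronous, no network); the real wrapper's constructor and method signatures may differ.

```typescript
// in-memory stand-in for the lesson's Memory wrapper; not the real sdk.
type Role = "user" | "assistant";

interface MemoryEvent {
  sessionId: string;
  role: Role;
  content: string;
}

class Memory {
  readonly trace: string[] = [];        // records each operation, in order
  private sessionId: string;
  private events: MemoryEvent[] = [];

  constructor(sessionId: string) {
    this.sessionId = sessionId;
    this.trace.push("memory.init");
  }

  createEvent(role: Role, content: string): MemoryEvent {
    const event: MemoryEvent = { sessionId: this.sessionId, role, content };
    this.events.push(event);
    this.trace.push("memory.create");
    return event;
  }

  listEvents(): MemoryEvent[] {
    this.trace.push("memory.list");
    return this.events.slice();         // a copy, in insertion order
  }
}

const memory = new Memory("session-1");
memory.createEvent("user", "hi, i'm ada");
memory.createEvent("assistant", "hello ada");
memory.createEvent("user", "what's my name?");

const history = memory.listEvents();
for (const e of history) console.log(`${e.role}: ${e.content}`);
console.log(memory.trace.join(" -> "));
```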

the trace shows memory.init when you construct the Memory object, then memory.create for each event, then memory.list when you ask for the session's history. that's the whole protocol.

memory inside an entrypoint

real use looks like this: every turn the entrypoint writes the user's message, reads back the session so far, decides what to do, and writes its own reply. two consecutive invokes simulate a two-turn conversation without any session-management infrastructure on your side.

two-turn chat, shared Memory across invokes
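
a minimal sketch of the two-turn chat, assuming a mock Memory shared across invokes. handler, the "my name is" marker, and this version of rememberedName are hypothetical stand-ins for the lesson's entrypoint; a real entrypoint would be async and call the service.

```typescript
// mock store shared by both invokes; not the real sdk.
type Role = "user" | "assistant";
interface MemoryEvent { sessionId: string; role: Role; content: string; }

class Memory {
  private events: MemoryEvent[] = [];
  createEvent(sessionId: string, role: Role, content: string): void {
    this.events.push({ sessionId, role, content });
  }
  listEvents(sessionId: string): MemoryEvent[] {
    return this.events.filter((e) => e.sessionId === sessionId);
  }
}

const MARKER = "my name is ";

// dumb substring "recall": finds the word after "my name is" in user turns
function rememberedName(events: MemoryEvent[]): string | null {
  let name: string | null = null;
  for (const e of events) {
    if (e.role !== "user") continue;
    const at = e.content.toLowerCase().indexOf(MARKER);
    if (at === -1) continue;
    const word = e.content.slice(at + MARKER.length).split(/[\s.,!?]/)[0];
    if (word) name = word;              // later mentions overwrite earlier ones
  }
  return name;
}

// one turn: write the user message, read the session, decide, write the reply
function handler(memory: Memory, sessionId: string, message: string): string {
  memory.createEvent(sessionId, "user", message);
  const name = rememberedName(memory.listEvents(sessionId));
  const asksName = message.toLowerCase().indexOf("what's my name") !== -1;
  let reply: string;
  if (asksName) reply = name ? `your name is ${name}.` : "you haven't told me yet.";
  else reply = name ? `nice to meet you, ${name}.` : "hello.";
  memory.createEvent(sessionId, "assistant", reply);
  return reply;
}

const memory = new Memory();
const first = handler(memory, "chat-1", "hi, my name is Ada.");
const second = handler(memory, "chat-1", "what's my name?");
console.log(first);
console.log(second);
```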

the agent's "recall" here is a dumb substring match — no embeddings, no model — but the shape is the thing that matters: the handler treats the session as its short-term memory, reads it fresh each turn, writes back what happened. replace rememberedName with an llm call and you have a real conversational agent.

across sessions

what about "the user mentioned a peanut allergy yesterday"? different session id, different chat. the agent can still reach it via recall — memory is partitioned by session, but queries can cross the boundary.

yesterday's allergy, today's snack recommendation
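
a minimal sketch of this scenario, assuming a mock Memory whose recall is a plain substring scan over every session. the session ids and the snack choices are made up for illustration.

```typescript
// mock store with a session-scoped read and a cross-session read; not the real sdk.
type Role = "user" | "assistant";
interface MemoryEvent { sessionId: string; role: Role; content: string; }

class Memory {
  private events: MemoryEvent[] = [];

  createEvent(sessionId: string, role: Role, content: string): void {
    this.events.push({ sessionId, role, content });
  }

  // only the given session, in insertion order
  listEvents(sessionId: string): MemoryEvent[] {
    return this.events.filter((e) => e.sessionId === sessionId);
  }

  // every session, case-insensitive substring match
  recall(query: string): MemoryEvent[] {
    const needle = query.toLowerCase();
    return this.events.filter((e) => e.content.toLowerCase().indexOf(needle) !== -1);
  }
}

const memory = new Memory();
memory.createEvent("yesterday", "user", "heads up: i have a peanut allergy");
memory.createEvent("today", "user", "suggest a snack");

const todayEvents = memory.listEvents("today");
const hits = memory.recall("peanut");

// the allergy lives in another session, but recall still surfaces it
const snack = hits.length > 0 ? "apple slices" : "peanut butter toast";
console.log(`today has ${todayEvents.length} event(s); recall found ${hits.length}; snack: ${snack}`);
```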

listEvents() on the today session returns only today's events. recall("peanut") scans all sessions and finds yesterday's. in production, recall uses semantic search (embeddings + vector similarity) rather than substring matching; the interface is the same, the recall quality is better.

strategies — the part we skipped

agentcore memory supports strategies: background processes that distill short-term events into long-term records. examples: extract user preferences ("ada likes dark roast"), summarize long conversations into a few sentences, build an entity graph of people mentioned. you define a strategy once; the service runs it across your events and makes the output available to recall.

the mock doesn't simulate strategies — they're a server-side behavior that's hard to fake honestly. the important thing is to know they exist, and to design your createEvent calls so the raw material is there: explicit roles, clean content, metadata that strategies can key off.

what memory isn't. it's not your application database. it's not where you store billing records, session tokens, or anything you'd put behind a foreign key. memory is for the agent's understanding of the conversation — the stuff an llm would benefit from seeing. everything operational stays in your own store.

what this looks like in production

  1. one Memory resource per agent: created via the aws console or cli, identified by an arn. your code gets a handle and starts writing events.
  2. session ids are yours to choose: usually a stable per-user or per-conversation id. the same id across invokes means the same session.
  3. strategies run asynchronously: long-term records appear a little after the events that produced them. don't build flows that assume instant distillation.
  4. retention is configurable: session memory has a ttl; long-term memory is retained indefinitely unless you delete it. design for both.
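
one way to respect point 3 is to treat long-term records as "maybe not there yet" and fall back to the raw session events. a sketch under that assumption; recallWithFallback and both record types are hypothetical, and the matching is substring-only where the service would use semantic search.

```typescript
// hypothetical shapes for distilled records and raw events
interface LongTermRecord { text: string; }
interface RawEvent { sessionId: string; content: string; }

interface RecallResult {
  source: "long-term" | "raw-events" | "none";
  matches: string[];
}

function contains(text: string, query: string): boolean {
  return text.toLowerCase().indexOf(query.toLowerCase()) !== -1;
}

// prefer distilled records; if the strategy has not produced any yet,
// scan the session's raw events instead of assuming the fact is unknown
function recallWithFallback(
  query: string,
  records: LongTermRecord[],
  sessionEvents: RawEvent[]
): RecallResult {
  const distilled = records.filter((r) => contains(r.text, query)).map((r) => r.text);
  if (distilled.length > 0) return { source: "long-term", matches: distilled };

  const raw = sessionEvents.filter((e) => contains(e.content, query)).map((e) => e.content);
  if (raw.length > 0) return { source: "raw-events", matches: raw };

  return { source: "none", matches: [] };
}

const events: RawEvent[] = [{ sessionId: "user-42", content: "i like dark roast" }];

// right after the event is written: no distilled record yet
const before = recallWithFallback("dark roast", [], events);
// later: the strategy has produced a record
const after = recallWithFallback("dark roast", [{ text: "ada likes dark roast" }], events);
console.log(before.source, after.source);
```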

next: gateway — turning apis into tools your agent can call.
