How Agent Memory Really Works

People talk about agent memory as if the agent has a mind.

Most of the time, it does not.

What it has is a context window.

That sounds like a small distinction, but it changes how I design AI products. If I believe the agent “remembers,” I will blame the model when it forgets. If I understand that the product is rebuilding state into a prompt, I will look at the architecture.

Memory is usually reconstruction

For many tools, each new request starts from a stateless model call.

The product collects recent messages, system instructions, tool definitions, maybe retrieved notes, maybe user profile data, and sends them into the model. The model answers with confidence because it sees a stitched-together slice of history.

That can feel like memory.

But it is closer to bringing a worker a folder before each meeting.

If the folder is missing the important page, the worker does not know it is missing. It just works from the folder.

The context window is not a diary

A larger context window helps, but it is not a diary.

Once the window fills up, something has to be left out. Usually the earliest material goes first, or a summary replaces the raw history. Either way, the agent does not get a clean warning that part of its past vanished.

It just keeps answering.

This is why users end up repeating themselves. They are not only fighting a forgetful model. They are fighting a product that has not made the memory boundary visible.

Real memory needs product decisions

If you want memory to be real, you need to decide what deserves durability.

Some facts should be stored explicitly: name, plan, workspace, project, preference.

Some things should be stored as evidence: decisions, failed attempts, approvals, bug causes.

Some things should expire.

Some things should never be remembered unless the user approves.

The model cannot cleanly infer all of this from a chat transcript. The product has to own the state model.

The useful framing

When an agent seems to remember, ask three questions:

What source did this memory come from?
How did it get selected into the current context?
What happens when it becomes wrong?

If the product cannot answer those questions, it does not really have memory. It has a long prompt and a good illusion.

That illusion is useful for demos.

For production agent products, it is not enough.

Memory is usually reconstruction

The context window is not a diary

Real memory needs product decisions

The useful framing

Think this is wrong?