Can Agents Really Self-Improve?

“Self-improving agent” is one of those phrases that sounds bigger than it usually is.

I do think agents can improve over time.

I do not think that automatically means they are evolving in the science-fiction sense.

The useful version is more boring: the system captures feedback, turns it into reusable procedure, tests whether the procedure helps, and loads it when a similar task appears.

That is still valuable.

Improvement needs memory plus evaluation

An agent cannot improve just because it stores more history.

History can be garbage.

The system needs to know which past behavior was good, which was bad, and which context made it relevant.

That means feedback.

It also means evaluation. If the agent writes a new rule for itself, does the rule improve the next task or just make the prompt longer?

Without that loop, “self-improvement” becomes accumulation. The agent keeps adding notes until the context gets heavier and the behavior gets less predictable.

Skills are a practical unit

This is why I like the skill abstraction.

A skill can package a learned workflow:

when to use it;
what steps to follow;
what tools to call;
what examples to reference;
what mistakes to avoid.

If an agent can create or update skills from successful work, that starts to look like useful self-improvement.

But the hard part is not writing the skill. The hard part is deciding whether the skill is correct, general, and safe to reuse.

Offline evolution is more credible

The version I trust more is offline evolution.

Let the agent run tasks. Capture traces. Review failures. Extract candidate rules. Test them against a benchmark or a replay set. Promote only the rules that help.

That is not as magical as an agent spontaneously becoming smarter during a chat.

It is also much closer to engineering.

Production systems need rollback. They need versioning. They need evidence that a new behavior is better than the old one.

Self-improvement without those controls is just mutation.

My current stance

Agents can learn operational habits.

They can learn which tools work, which file patterns matter, which errors are common, and which steps should be repeated.

But I would not call it real self-improvement until the system can answer:

what changed;
why it changed;
which tasks improved;
which tasks regressed;
how to roll it back.

That is the bar I care about.