Continual learning for AI agents
Most discussions of continual learning in AI focus on one thing: updating model weights. But for AI agents, learning can happen at three distinct layers: the model, the harness, and the context. Understanding the difference changes how you think about building systems that improve over time.
The three main layers
The three main layers of agentic systems are:
- Model: the model weights themselves.
- Harness: the scaffolding around the model that powers every instance of the agent. This is the code that drives the agent, plus any instructions or tools that are baked in and always present.
- Context: additional context (instructions, skills) that lives outside the harness and can be used to configure it.
Example #1: Mapping this to a coding agent like Claude Code:
- Model: claude-sonnet, etc.
- Harness: Claude Code
- User context: CLAUDE.md, /skills, mcp.json
Example #2: Mapping this to OpenClaw:
- Model: many
- Harness: Pi + some other scaffolding
- Agent context: SOUL.md, skills from clawhub
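As a rough sketch, this decomposition can be expressed as a simple data model. The field names here are illustrative, not any framework's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class AgentSystem:
    """Illustrative decomposition of an agentic system into three layers."""
    model: str                                    # the model weights, e.g. "claude-sonnet"
    harness: str                                  # the code that drives the agent
    context: dict = field(default_factory=dict)   # configurable context outside the harness

# Example #1 expressed in this model:
claude_code = AgentSystem(
    model="claude-sonnet",
    harness="Claude Code",
    context={"instructions": "CLAUDE.md", "skills": "/skills", "tools": "mcp.json"},
)
```

The key point the structure makes visible: the model and harness are shared across all instances of the agent, while the context field is what varies per deployment.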
When we talk about continual learning, most people jump immediately to the model. But in reality, an AI system can learn at all three of these levels.
Continual learning at the model layer
This is what most people mean by continual learning: updating the model weights themselves.
Techniques here include supervised fine-tuning (SFT) and reinforcement learning (e.g. GRPO).
A central challenge here is catastrophic forgetting — when a model is updated on new data or tasks, it tends to degrade on things it previously knew. This is an open research problem.
When people do train models for a specific agentic system (e.g. you could view OpenAI's Codex models as being trained for their Codex agent), they largely do this for the agentic system as a whole. In theory you could do this at a more granular level (e.g. a LoRA per user), but in practice it is mostly done at the agent level.
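To make this concrete, here is a minimal sketch of turning agent traces into SFT training data. The trace schema (prompt, response, evaluation score) is hypothetical, not any platform's actual format:

```python
# Hypothetical traces: each has the task prompt, the agent's final
# response, and an outcome score from some evaluation step.
traces = [
    {"prompt": "Fix the failing test", "response": "...patch...", "score": 0.9},
    {"prompt": "Rename the module", "response": "...diff...", "score": 0.2},
]

def traces_to_sft_examples(traces, min_score=0.5):
    """Keep only successful traces, shaped as prompt/completion pairs."""
    return [
        {"prompt": t["prompt"], "completion": t["response"]}
        for t in traces
        if t["score"] >= min_score
    ]

examples = traces_to_sft_examples(traces)
```

Filtering on the score is the important step: it points the weight update at behavior you actually want to reinforce, rather than at everything the agent happened to do.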
Continual learning at the harness layer
As defined earlier, the harness refers to the code that drives the agent, as well as any instructions or tools that are always part of it.
As harnesses have become more popular, several papers have explored how to optimize them.
A recent one is Meta-Harness: End-to-End Optimization of Model Harnesses.
The core idea is a loop: run the agent over a set of tasks, evaluate the results, and store the resulting logs on a filesystem. A coding agent then reads these traces and suggests changes to the harness code.
Similar to continual learning for models, this is usually done at the agent level. You could in theory do this at a more granular level (e.g. learn a different code harness per user).
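A minimal sketch of that loop, with toy stand-ins for the agent, the evaluator, and the coding agent (none of these are the paper's actual interfaces):

```python
import json
import tempfile
from pathlib import Path

def run_and_log(agent, evaluate, tasks, log_dir: Path):
    """Run the agent over tasks, score each run, and persist traces to disk."""
    log_dir.mkdir(parents=True, exist_ok=True)
    for i, task in enumerate(tasks):
        trace = agent(task)  # full execution path for this task
        record = {"task": task, "trace": trace, "score": evaluate(task, trace)}
        (log_dir / f"trace_{i}.json").write_text(json.dumps(record))

def suggest_harness_changes(coding_agent, log_dir: Path):
    """Hand all stored traces to a coding agent that proposes harness edits."""
    records = [json.loads(p.read_text()) for p in sorted(log_dir.glob("trace_*.json"))]
    return coding_agent(records)

# Toy stand-ins so the loop is runnable end to end:
with tempfile.TemporaryDirectory() as d:
    run_and_log(
        agent=lambda task: f"did {task}",
        evaluate=lambda task, trace: 1.0 if "did" in trace else 0.0,
        tasks=["task A", "task B"],
        log_dir=Path(d),
    )
    suggestion = suggest_harness_changes(
        coding_agent=lambda records: f"reviewed {len(records)} traces",
        log_dir=Path(d),
    )
```

In a real setup the coding agent's output would be a proposed patch to the harness code, which you would review and apply before the next round of tasks.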
Continual learning at the context layer
“Context” sits outside the harness and is used to configure it: instructions, skills, even tools. This is also commonly referred to as memory.
The same kinds of context can exist inside the harness as well (e.g. the harness may have a base system prompt or built-in skills). The distinction is whether it ships with the harness or is part of the configuration.
Learning context can be done at several different levels.
Learning context can be done at the agent level: the agent has a persistent “memory” and updates its own configuration over time. A great example is OpenClaw, whose SOUL.md gets updated over time.
Learning context is more commonly done at the tenant level (user, org, team, etc.). In this case each tenant gets its own context that is updated over time. Examples include Hex’s Context Studio, Decagon’s Duet, and Sierra’s Explorer.
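A tenant-level context store can be sketched in a few lines. The tenant IDs and the free-text "insight" format here are hypothetical, chosen for illustration:

```python
from collections import defaultdict

class TenantContextStore:
    """Per-tenant context: each user/team/org gets its own evolving instructions."""

    def __init__(self):
        self._contexts = defaultdict(list)

    def learn(self, tenant_id: str, insight: str):
        """Record a new insight for one tenant without affecting the others."""
        self._contexts[tenant_id].append(insight)

    def context_for(self, tenant_id: str) -> str:
        """Render the tenant's accumulated context for injection into a prompt."""
        return "\n".join(self._contexts[tenant_id])

store = TenantContextStore()
store.learn("org:acme", "Prefer SQL over pandas for reports.")
store.learn("user:alice", "Alice wants terse answers.")
```

The isolation is the point: an insight learned for one org never leaks into another tenant's prompts.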
You can also mix and match! An agent could have agent-level context updates, user-level context updates, AND org-level context updates.
These updates can be done in two ways:
- After the fact, in an offline job. Similar to harness updates: run over a batch of recent traces to extract insights and update context. This is what OpenClaw calls “dreaming”.
- In the hot path, as the agent is running. The agent can decide to (or the user can prompt it to) update its memory while it is working on the core task.
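The offline flavor can be sketched as a small batch job. The trace fields and the insight extractor below are made up for illustration:

```python
memory = []

def offline_memory_update(traces, extract_insights, memory):
    """'Dreaming'-style offline job: mine recent traces for durable insights."""
    for insight in extract_insights(traces):
        if insight not in memory:  # deduplicate against existing memory
            memory.append(insight)

# Toy extractor: remember the task behind every failed trace.
offline_memory_update(
    traces=[{"task": "deploy", "ok": False}, {"task": "lint", "ok": True}],
    extract_insights=lambda ts: [
        f"struggled with: {t['task']}" for t in ts if not t["ok"]
    ],
    memory=memory,
)
```

The hot-path flavor is the same append, just triggered mid-task by the agent or the user rather than by a scheduled job over recent traces.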
Another dimension to consider here is how explicit the memory update is. Is the user prompting the agent to remember, or is the agent remembering based on core instructions in the harness itself?
Comparison
To summarize: the model layer learns via weight updates, the harness layer via changes to the agent's code, and the context layer via updates to instructions, skills, and memory. Model and harness learning are usually applied at the agent level; context learning is most often done per tenant, and can also happen in the hot path.
Traces are the core
All of these flows are powered by traces - the full execution path of what an agent did. LangSmith is our platform that (among other things) helps collect traces.
You can then use these traces in a variety of different ways.
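What a trace actually contains varies by platform. A minimal, hypothetical shape is an ordered list of model and tool steps:

```python
# A minimal, hypothetical trace: the ordered steps an agent took on one task.
trace = {
    "task": "summarize the repo",
    "steps": [
        {"type": "model_call", "input": "...", "output": "I will list files first."},
        {"type": "tool_call", "tool": "ls", "output": "README.md src/"},
        {"type": "model_call", "input": "...", "output": "This repo contains ..."},
    ],
}

def tool_calls(trace):
    """Pull out just the tool invocations, e.g. to audit what the agent touched."""
    return [s["tool"] for s in trace["steps"] if s["type"] == "tool_call"]
```

Whatever the exact schema, the same trace records can feed all three learning flows below: training data for the model, evidence for harness edits, and raw material for context extraction.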
If you want to update the model, you can collect traces and then work with someone like Prime Intellect to train your own model.
If you want to improve the harness, you can use the LangSmith CLI and LangSmith Skills to give a coding agent access to these traces. This pattern is how we improved Deep Agents (our open source, model-agnostic, general-purpose base harness) on Terminal-Bench.
If you want to learn context over time (at the agent, user, or org level), then your agent harness needs to support this. Deep Agents, our harness of choice, supports this in a production-ready way. See the documentation there for examples of how to do user-level memory, background learning, and more.