Moving fast with agents without losing comprehension
Addy Osmani wrote a great post last week on comprehension debt, the hidden cost of AI-generated code. The core idea: AI generates code far faster than humans can evaluate it, and that gap quietly hollows out the team's understanding of their own codebase.
It resonated with me, but what struck me most is a specific asymmetry in how the industry is responding. Most guidance around working with agents optimises for agent comprehension: context files, MCP servers, documented skills, feeding in the right information so the agent can reason about your codebase. There's far less conversation about the equally important problem: making sure humans still understand the system the agent is changing.
We're optimising for agent comprehension while human comprehension quietly erodes. That gap is what's made me think carefully about how I've been working, and what actually needs to be in place before you can move fast without losing the understanding that keeps a codebase healthy.
The thing reviews were actually doing
Reviews aren't just quality assurance. They're how understanding spreads across a team. When someone reads your code carefully enough to approve it, they're building a mental model of what changed and why. That's the mechanism by which a team stays collectively oriented to its own codebase.
Agents put this mechanism under pressure, not by making code worse, but by generating it faster than the review process was designed to handle. Sometimes moving fast and trusting the agent is the right call, especially in well-covered, well-understood parts of the codebase. But when it goes wrong the consequences compound. Each poorly-understood change makes the next review less meaningful as you're reasoning about new code against a mental model that's already drifting.
What I've learned from trying
My initial instinct when I ran into this was process. Break large agent changesets into smaller sequenced MRs, each telling a coherent part of the story, each individually deployable, like a slow-motion replay after a fast-forward session. There's something to it. A large MR where I reorganised commits to be reviewed one by one got merged without friction. Making changes legible and telling a coherent story is always the right instinct.
But I also have five stacked MRs on a legacy codebase sitting in draft. I understand what the changes do, but I don't trust the existing test coverage to catch the side effects and functional behaviour that could break. Without that confidence there's an implicit expectation of manual verification underneath the whole thing, and that's asking a reviewer to carry the risk you haven't dealt with.
Process can make changes more legible. It can't substitute for a safety net that isn't there.
What comprehension actually needs to look like now
Comprehension can't mean line-by-line review; that's no longer feasible, and pretending otherwise just means some reviews are theatre. But it's not nothing either. I think it works at three levels.
The first is behavioural: does it work as expected? This is where test coverage becomes the most important investment a team can make: real coverage of real behaviour, across the paths users actually take, alongside type safety that catches errors at compile time. If the compiler and test suite are doing their job, reviewers don't need to trace every line. The places where coverage is thin, or where teams have been relying on manual testing, are exactly the places where agent velocity stops being speed and starts being negligence.
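As a minimal sketch of what I mean by behavioural coverage (the function and discount codes here are hypothetical, purely for illustration): the assertions check outcomes a user would observe, not implementation details, so they keep protecting you even when an agent rewrites the internals.

```python
# Hypothetical example: behavioural tests assert on user-visible outcomes,
# not on how the function is implemented internally.

def apply_discount(total: float, code: str) -> float:
    """Return the order total after applying a discount code."""
    rates = {"WELCOME10": 0.10, "VIP20": 0.20}
    return round(total * (1 - rates.get(code, 0.0)), 2)

# Paths users actually take: a valid code, and an unknown one.
assert apply_discount(100.0, "WELCOME10") == 90.0
assert apply_discount(100.0, "TYPO") == 100.0
```

Tests like these are what let a reviewer skip tracing every line: if the behaviour users depend on is pinned down, the diff can be evaluated at the level of decisions instead of statements.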
The second is architectural: do we broadly understand how the changes work, and can we update our mental model of the system? This is something agents can help with directly. Ask the agent to summarise the meaningful decisions in a changeset, not the mechanical changes but the choices a human needs to evaluate: what alternatives were considered, where the non-obvious decisions are, what the author would flag in a code walkthrough. Use that as the basis for your MR description. I've packaged this into an agent skill you can drop into your own workflow: it produces a structured MR description and a recommended commit structure that you can review and use to make agent-generated changesets more legible to reviewers.
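A rough sketch of the shape of that prompt (the wording and function name here are illustrative, not the actual skill): wrap the changeset in a request for decisions rather than mechanics, and feed the answer into your MR description.

```python
# Hypothetical sketch: turn a changeset diff into a "decisions" prompt for
# an agent, so its summary can seed the MR description.

DECISIONS_PROMPT = """Summarise the meaningful decisions in this changeset.
Skip the mechanical edits; focus on what a human needs to evaluate:
- which alternatives were considered, and why they were rejected
- where the non-obvious choices are
- what the author would flag in a code walkthrough

Diff:
{diff}
"""

def build_decisions_prompt(diff: str) -> str:
    """Produce the prompt whose answer seeds the MR description."""
    return DECISIONS_PROMPT.format(diff=diff)
```

The point of the structure is that the agent's output arrives pre-sorted into the things a reviewer actually needs to weigh, rather than a changelog they could have read from the diff themselves.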
The third is standards: does the code meet the conventions the team has agreed on? Linting handles a lot of this automatically, and anything you can push into a linter is one less thing a human reviewer needs to spend attention on. For the things linting can't catch, I've written before about agent skills. If your standards are documented well enough to guide the agent writing the code, they're documented well enough to guide an agent reviewing it too.
Show your working
Good authorship has always mattered. It matters more now. The reviewer wasn't in your agent session, and they have no ambient understanding of what you were trying to do, what tradeoffs you considered, or what decisions the agent made that you consciously kept. That context doesn't transfer through the diff; you have to transfer it deliberately.
That means flagging the architectural decisions that actually need human eyes, not just describing what changed but why. It means thinking carefully about commit structure so the story of the change is legible before someone even reads the code. It means writing a description that demonstrates you understood what the agent produced, because if you can't explain it clearly there's a risk you've switched to passive delegation.
The Anthropic study Addy cites found that engineers who used AI for passive delegation, just letting it produce code without staying actively engaged, scored significantly lower on comprehension tests than those who used it as a thinking tool. The agent doesn't replace the engineer. It's a tool, and you still need to understand what it's doing and why, not just that it works. That understanding is what your reviewer deserves: guide them toward it rather than leaving them to reconstruct it from scratch.
Not every change carries the same risk or requires the same depth of review, and being explicit about that is part of good authorship too. Ship / Show / Ask is a useful frame for this, calibrating the level of review based on the nature of the change and the trust already established with your team.
What fast actually requires
The five MRs sitting in draft aren't blocked by process or by my understanding of the code. They're blocked because the safety net isn't there. That's the first obligation: fix it before you ship, not after.
But a solid test suite without the authorship work just means your reviewer can confirm nothing broke. That's not the same as understanding what changed, or why, or what the agent decided that you consciously kept. The agent gives you velocity. What makes that velocity real is being able to explain what you built and why, not just that it works.
DEV Community
https://dev.to/alexocallaghan/moving-fast-with-agents-without-losing-comprehension-49fk
