Architecture Is the Missing Layer in AI Harness Engineering
Originally published in longer form on Substack. This DEV version is adapted for software engineers and platform practitioners who want the practical takeaway quickly.
Most AI harness work focuses on execution.
That makes sense. Teams need better context management, tool access, workflow boundaries, verification, memory, and sub-agent coordination. Without those pieces, coding agents become unreliable fast.
But there is a different failure mode that those harness improvements do not solve:
an agent can operate inside a well-designed execution harness and still produce the wrong architecture.
That is the missing layer.
The Real Problem Is Not Just Code Quality
Ask an agent to design a small SaaS product and it will often produce something that is technically coherent and operationally excessive at the same time.
You get things like:
- microservices where a monolith would do
- Kubernetes where managed PaaS is the obvious fit
- heavyweight observability and rollout machinery for a team with no real platform capacity
- provider choices that quietly add lock-in or operational burden
- reliability mechanisms sized for a much larger organization
None of that is necessarily irrational.
It is just architecture optimized for an imaginary team.
That is what happens when the harness governs what the agent can see and do, but not what kinds of systems it is allowed to design.
What the Harness Usually Misses
Most organizations already have architectural constraints, whether they write them down well or not:
- cost ceilings
- preferred cloud/SaaS providers
- approved deployment models
- auth and identity boundaries
- operational limits
- compliance expectations
- explicit exclusions
The problem is that these often live in:
- docs
- ADRs
- wiki pages
- tribal memory
- architecture review meetings
That is not enough for agent-driven workflows.
If those constraints are not machine-readable and enforceable, the agent is still reasoning inside an underconstrained design space.
What I Mean by "Architecture Inside the Harness"
The core idea is simple:
The harness should not only manage execution. It should also constrain architecture.
In practice, that means three pieces:
1. A pattern registry
Architectural knowledge has to live somewhere reusable.
A pattern in the registry can encode:
- what constraints it supports
- what NFR thresholds it can satisfy
- what it provides and requires
- what config decisions it exposes
- what cost and adoption trade-offs it carries
That turns architecture knowledge from conversation into versioned policy.
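As a rough sketch of what a versioned registry entry could encode (all field names here are hypothetical illustrations, not the actual arch-compiler schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Pattern:
    """One versioned entry in a hypothetical pattern registry."""
    name: str
    version: str
    supports_constraints: frozenset  # e.g. {"low_cost", "no_ops_team"}
    nfr_thresholds: dict             # e.g. {"max_rps": 50}
    provides: frozenset              # capabilities this pattern offers
    requires: frozenset              # capabilities it depends on
    config_decisions: tuple          # decisions it exposes to the spec
    tradeoffs: str                   # cost / adoption notes for reviewers

# An illustrative entry: a monolith deployed on a managed PaaS.
monolith_on_paas = Pattern(
    name="monolith-on-paas",
    version="1.2.0",
    supports_constraints=frozenset({"low_cost", "no_ops_team"}),
    nfr_thresholds={"max_rps": 50},
    provides=frozenset({"http_api", "background_jobs"}),
    requires=frozenset({"managed_postgres"}),
    config_decisions=("region", "instance_size"),
    tradeoffs="cheap to run; vertical scaling ceiling",
)
```

Because each entry carries a version, a change to a pattern's trade-offs or thresholds is itself a reviewable diff.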
2. A deterministic architecture compiler
The compiler takes a canonical spec and selects patterns based on explicit rules.
The key property is determinism.
Given the same inputs, it should produce the same outputs. That gives teams something they can actually review and approve. It also makes architectural change visible as a diff instead of as implementation drift discovered too late.
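A toy illustration of the determinism property (a minimal selector I am inventing for the example, not the real compiler; the rule set and field names are assumptions):

```python
def compile_architecture(spec: dict, registry: list) -> list:
    """Deterministically select patterns: filter by explicit rules,
    then sort by (name, version) so the output never depends on
    registry iteration order."""
    selected = [
        p for p in registry
        if spec["constraints"] <= p["supports_constraints"]
        and spec["expected_rps"] <= p["max_rps"]
    ]
    return sorted(selected, key=lambda p: (p["name"], p["version"]))

registry = [
    {"name": "k8s-microservices", "version": "2.0",
     "supports_constraints": {"large_platform_team"}, "max_rps": 100_000},
    {"name": "monolith-on-paas", "version": "1.2",
     "supports_constraints": {"low_cost", "no_ops_team"}, "max_rps": 50},
]
spec = {"constraints": {"low_cost"}, "expected_rps": 10}

# Same inputs, same outputs -- even if the registry is reordered.
assert compile_architecture(spec, registry) == \
       compile_architecture(spec, list(reversed(registry)))
```

The explicit filter rules are what a review board actually approves; the sort is what makes the output stable enough to diff.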
3. Workflow rules around the compiler
The compiler alone is not enough.
You also need workflow discipline that tells the agent:
- when to compile
- when planning has surfaced a real architecture change
- when re-approval is required
- when implementation is allowed to proceed
That is what turns architecture from documentation into a control point.
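One way to sketch such a control point: hash the compiled architecture and gate the workflow on whether that hash still matches what was approved. (The gate states and hashing scheme are my own illustration, not a prescribed design.)

```python
import hashlib
import json

def arch_hash(compiled: dict) -> str:
    """Stable hash of a compiled architecture via canonical JSON."""
    return hashlib.sha256(
        json.dumps(compiled, sort_keys=True).encode()
    ).hexdigest()

def gate(compiled: dict, approved_hash) -> str:
    """Workflow rule: what is the agent allowed to do next?"""
    if approved_hash is None:
        return "request-approval"        # first compile: needs sign-off
    if arch_hash(compiled) != approved_hash:
        return "request-re-approval"     # planning surfaced a real change
    return "proceed-to-implementation"   # inside the approved contract

arch = {"topology": "monolith", "db": "managed-postgres"}
approved = arch_hash(arch)

assert gate(arch, approved) == "proceed-to-implementation"
assert gate({**arch, "topology": "microservices"}, approved) == "request-re-approval"
```

The point is not the hashing mechanism; it is that "may implementation proceed?" becomes a mechanical question rather than a judgment call the agent makes for itself.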
Why Determinism Matters
At the architecture layer, the problem is not mainly creativity. It is governance.
That is why deterministic behavior matters more than people often expect.
It gives you:
- reproducibility
- auditability
- explicit assumptions
- explicit exclusions
- a recompile-and-diff path when constraints change
For senior engineers and platform teams, that is much more useful than a model producing a plausible design summary in slightly different words each time.
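Because compilation is deterministic, a constraint change surfaces as an ordinary diff between two compiled outputs. A toy illustration (the architecture fields are hypothetical):

```python
import difflib
import json

before = {"topology": "monolith", "db": "managed-postgres", "auth": "oidc"}
after = {"topology": "monolith", "db": "managed-postgres", "auth": "oidc",
         "cache": "managed-redis"}  # a raised latency NFR pulled in a cache

# Canonical JSON plus a unified diff: the architecture change is reviewable
# the same way a code change is.
diff = list(difflib.unified_diff(
    json.dumps(before, indent=2, sort_keys=True).splitlines(),
    json.dumps(after, indent=2, sort_keys=True).splitlines(),
    fromfile="architecture@v1", tofile="architecture@v2", lineterm="",
))
print("\n".join(diff))
```

A reviewer approves (or rejects) that diff before any implementation work is allowed to react to it.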
A Concrete Example
I used this approach in a Bird ID application workflow.
The product itself was simple: users upload bird photos, an AI model identifies likely species, and results are stored in per-user history.
The important part was not the feature list. It was the operating context:
- hosted PaaS backend
- managed Postgres
- OIDC for auth
- object storage for uploads
- low traffic
- strong cost sensitivity
- no real ops team
Once those became compiler inputs, the architecture was constrained mechanically rather than conversationally.
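To make "compiler inputs" concrete, the operating context above might be expressed something like this (a sketch with invented field names, not the actual spec format used in the case study):

```python
# Hypothetical machine-readable spec for the Bird ID workflow.
bird_id_spec = {
    "runtime": "hosted-paas",
    "database": "managed-postgres",
    "auth": "oidc",
    "uploads": "object-storage",
    "nfr": {"expected_rps": 5, "cost_ceiling_usd_month": 50},
    "exclusions": ["kubernetes", "service-mesh", "self-hosted-db"],
    "ops_capacity": "none",
}

def rejects(spec: dict, pattern_name: str) -> bool:
    """A pattern on the exclusion list is mechanically out of bounds,
    no matter how technically valid it is in isolation."""
    return pattern_name in spec["exclusions"]

assert rejects(bird_id_spec, "kubernetes")
```

Once the exclusions and ceilings are data, rejecting an over-built design stops being a debate and becomes a lookup.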
That made it much easier to reject patterns that would have been technically valid but wrong for the project:
- heavyweight deployment patterns
- overly complex topology choices
- infrastructure layers that added operational cost without real payoff
The downstream effect mattered too. The approved architecture could then be handed to planning and implementation as an explicit contract instead of a loose design memo.
The Real Deliverable Is Not Better Documentation
The main output of this style of harnessing is not prettier architecture docs.
The real output is an enforceable boundary between architecture and implementation.
That boundary matters because implementation agents are good at creating drift quickly.
If the architecture says:
- OAuth2/OIDC with PKCE
- hosted PaaS
- managed Postgres
- monolithic service topology
then implementation should not quietly reintroduce:
- server-side session state
- new provider choices
- new persistence layers
- unnecessary distributed complexity
Without a hard boundary, those changes show up as "implementation details." In practice, they are architecture changes.
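A hard boundary can start as something very simple: check an implementation plan's components against the approved contract. (This is a sketch; a real check would inspect code, dependencies, and infrastructure definitions, and the component names here are invented.)

```python
# Components the approved architecture explicitly rules out.
FORBIDDEN_COMPONENTS = {
    "server-side-sessions",   # contract says stateless OAuth2/OIDC + PKCE
    "second-database",        # contract says one managed Postgres
    "self-managed-queue",     # no ops team to run it
}

def check_plan(plan_components: set) -> list:
    """Return violations: anything on the forbidden list is an
    architecture change, not an implementation detail."""
    return sorted(plan_components & FORBIDDEN_COMPONENTS)

# An implementation plan that quietly reintroduces session state fails.
violations = check_plan({"http-routes", "upload-handler", "server-side-sessions"})
assert violations == ["server-side-sessions"]
```

Any non-empty violation list routes the plan back through architecture re-approval instead of letting it ship as "just implementation."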
What Platform Teams Should Take From This
If you are building internal agent workflows, the practical lesson is:
do not stop at context engineering.
Context engineering improves what the agent can see. Tool engineering improves what the agent can do. But neither is enough to keep the system architecture aligned with actual team constraints.
Platform teams need something stronger:
- explicit architecture inputs
- deterministic architecture selection
- approval and re-approval boundaries
- implementation workflows that are forced to stay inside the contract
That is what architecture inside the harness gives you.
Closing
The value of a harness is not only that it makes agents more capable.
The value is that it bounds the solution space so capability is applied in the right direction.
If the architecture layer stays implicit, fast agents will simply accelerate architectural drift.
If the architecture layer becomes explicit, reviewable, and enforceable, then agent speed becomes much easier to trust.
That is the argument: architecture is the missing layer in AI harness engineering.
Links
- Longer Substack version: https://inetgas.substack.com/p/ai-harness-engineering-at-the-architecture
- Architecture Compiler: https://github.com/inetgas/arch-compiler
- Bird ID case study: https://github.com/inetgas/arch-compiler-ai-harness-in-action