# Claude Code Architecture Explained: Agent Loop, Tool System, and Permission Model (Rust Rewrite Analysis)
## Claude Code Deep Dive (Part 1): Architecture Overview and the Core Agent Loop
Claude Code’s leaked source code weighs in at over 510,000 lines of TypeScript—far too large to analyze directly.
Interestingly, a community-driven Rust rewrite reduced that complexity to around 20,000 lines, while still preserving the core functionality.
Starting from this simplified version makes one thing much clearer:
> What does an AI agent system *actually need* to work?
## Why Start with the Rust Rewrite?
On March 31, 2026, Claude Code’s full source was unintentionally exposed due to an npm packaging mistake.
The package @anthropic-ai/claude-code v2.1.88 included a 59.8MB source map file, which allowed anyone to reconstruct the original TypeScript codebase.
To clarify:
- The official GitHub repo always existed
- But it only contained compiled bundles and documentation
- The readable source code was not normally accessible
### The Problem with the Original Codebase
Most analyses focused on the leaked TypeScript code:
- 510K+ lines
- QueryEngine alone: ~46K lines
- 40+ tools
- Complex plugin system
The result: too much detail, not enough clarity.
### Why the Rust Version Is More Useful
Shortly after the leak, developer Sigrid Jin (of the instructkr community) first built a Python clean-room version, then pushed a Rust implementation: **claw-code**.
👉 Project overview: [claw-code](https://claw-code.codes/)
This version:
- ~20K lines of Rust
- Retains core functionality:
  - Agent loop
  - Tool system
  - Permission control
  - Prompt system
  - Session management
  - MCP protocol
  - Sub-agents
The key benefit:
> Rewriting forces simplification. What remains is what actually matters.
## Architecture Overview: A 6-Module System
The Rust implementation is structured into six modules:
```plaintext
claw-code/
├── runtime/            # Core runtime: loop, permissions, config, session, prompt
├── api/                # LLM client, SSE streaming, OAuth
├── tools/              # Tool registry and execution
├── commands/           # Slash commands (/help, /cost)
├── compat-harness/     # TS → Rust compatibility layer
└── rusty-claude-cli/   # CLI, REPL, terminal rendering
```
These modules form a layered architecture:
```plaintext
CLI / REPL (User Interaction)
─────────────────────────────
MCP Protocol · Sub-agents (Extension Layer)
─────────────────────────────
API Client · Session Management (Communication Layer)
─────────────────────────────
System Prompt · Config (Context Layer)
─────────────────────────────
Agent Loop · Tools · Permissions (Core Layer)
```
### A Key Design Decision
The runtime module defines interfaces, not implementations:
- `ApiClient` → LLM communication
- `ToolExecutor` → tool execution
Concrete implementations live at the top (CLI layer).
This enables:
- Mock implementations for testing
- Real implementations for production
- Zero changes to core logic
Testability is built into the architecture—not added later.
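As a rough Python sketch of that idea (the Rust code would use traits; the method name `complete` and the message shape here are illustrative, not the real API), core logic can be written against a protocol and exercised with a mock:

```python
from typing import Protocol


class ApiClient(Protocol):
    # Interface the runtime depends on; the CLI layer supplies the real one.
    def complete(self, messages: list[dict]) -> dict: ...


class MockApiClient:
    # Test double: returns canned responses instead of calling an LLM.
    def __init__(self, responses: list[dict]):
        self.responses = list(responses)

    def complete(self, messages: list[dict]) -> dict:
        return self.responses.pop(0)


def run_once(client: ApiClient, messages: list[dict]) -> dict:
    # Core logic sees only the interface, never a concrete client.
    return client.complete(messages)


mock = MockApiClient([{"role": "assistant", "content": "4"}])
reply = run_once(mock, [{"role": "user", "content": "2+2?"}])
```

Because `run_once` only knows the protocol, swapping the mock for a production client changes nothing in the core.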
## The Core: An 88-Line Agent Loop
If you only read one file, read this:
`conversation.rs`
The entire agent loop is implemented in ~88 lines.
### Runtime State: Simpler Than Expected
```plaintext
AgentRuntime {
    session            # message array (the only state)
    api_client         # LLM interface
    tool_executor      # tool execution
    permission_policy  # access control
    system_prompt
    max_iterations
    usage_tracker
}
```
The surprising part:
> The only state is a message array.
No explicit state machine. No workflow graph.
### The Core Loop: `run_turn()`
Here’s the simplified logic:
```python
def run_turn(user_input):
    session.messages.append(UserMessage(user_input))
    iterations = 0

    while True:
        iterations += 1
        if iterations > max_iterations:
            raise Error("Max iterations exceeded")

        response = api_client.stream(system_prompt, session.messages)
        assistant_message = parse_response(response)
        session.messages.append(assistant_message)

        tool_calls = extract_tool_uses(assistant_message)
        if not tool_calls:
            break

        for tool_name, input in tool_calls:
            permission = authorize(tool_name, input)
            if permission == Allow:
                result = tool_executor.execute(tool_name, input)
                session.messages.append(ToolResult(result))
            else:
                session.messages.append(
                    ToolResult(deny_reason, is_error=True)
                )
```
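To make the shape of the loop concrete, here is a self-contained toy version. The scripted client and the `calc` tool are stand-ins for the real API client and tool executor, and the permission check is omitted:

```python
def run_turn(user_input, client, tools, max_iterations=10):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_iterations):
        msg = client(messages)              # ask the (scripted) model
        messages.append(msg)
        calls = msg.get("tool_calls", [])
        if not calls:                       # no tool calls -> turn is done
            return messages
        for name, arg in calls:
            result = tools[name](arg)       # execute and feed the result back
            messages.append({"role": "tool", "content": result})
    raise RuntimeError("Max iterations exceeded")


# Scripted stand-in for the LLM: first requests the calc tool, then answers.
script = iter([
    {"role": "assistant", "tool_calls": [("calc", "2+2")]},
    {"role": "assistant", "content": "The answer is 4"},
])
history = run_turn("What is 2 + 2?", lambda msgs: next(script),
                   tools={"calc": lambda expr: str(eval(expr))})
```

After the run, `history` holds four messages: user input, the tool-calling assistant turn, the tool result, and the final answer, which is exactly the trace walked through below.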
---
## A Concrete Example
User asks:
> “What is 2 + 2?”
Execution flow:
| Step | Message State | Description |
| ------ | -------------------------- | ------------------------ |
| Start | `[User("2+2")]` | User input |
| API #1 | + Assistant (calls tool) | Model decides to compute |
| Tool | + ToolResult("4") | Tool executes |
| API #2 | + Assistant("Answer is 4") | Final answer |
| End | Loop exits | No more tool calls |
Termination condition:
> The model decides to stop calling tools.
---
## Key Design Insight #1: Messages = State
Instead of managing state explicitly:
* The system stores everything as messages
* The full state is reconstructible from history
Benefits:
* Easy persistence (save session)
* Easy replay (debugging)
* Easy compression (context trimming)
> One append-only structure solves multiple problems.
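A quick sketch of why this matters in practice (field names here are invented for illustration): persistence, replay, and trimming are all operations on one plain list.

```python
import json

session = [
    {"role": "user", "content": "2+2"},
    {"role": "assistant", "content": "4"},
]

# Persist: the message list *is* the full state, so saving is one dump.
saved = json.dumps(session)

# Replay: reload the history and the agent is back where it was.
restored = json.loads(saved)

# Compress: context trimming is just slicing the same structure.
trimmed = restored[-1:]
```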
---
## Key Design Insight #2: Errors Are Feedback
When a tool is denied:
* The system does **not** crash
* It returns an error as a `ToolResult`
This is fed back to the model.
Result:
* The model adapts
* Chooses alternative strategies
> Failure becomes part of the reasoning loop.
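A minimal sketch of the deny path, assuming a `ToolResult`-style dict with an `is_error` flag (the exact shape is illustrative): the denial is appended like any other message, so the next API call sees it.

```python
def execute_or_deny(name, arg, allowed, tools, messages):
    # A denied call does not raise: it becomes an error-flagged tool result
    # that is appended to history and fed back to the model.
    if name not in allowed:
        messages.append({"role": "tool",
                         "content": f"Permission denied for tool: {name}",
                         "is_error": True})
        return
    messages.append({"role": "tool", "content": tools[name](arg),
                     "is_error": False})


messages = []
execute_or_deny("bash", "rm -rf /", allowed={"read_file"}, tools={},
                messages=messages)
```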
---
## Tool System: 18 Tools, One Pattern
The Rust version implements 18 built-in tools in a unified structure.
---
### Three Layers
```plaintext
1. Tool Registry   → defines schema and permissions
2. Dispatcher      → routes tool calls
3. Implementation  → executes logic
```
### Tool Specification
```json
{
  "name": "bash",
  "description": "Execute shell commands",
  "input_schema": {
    "command": "string",
    "timeout": "number?"
  },
  "required_permission": "DangerFullAccess"
}
```

This schema is passed directly to the LLM.
---
### Why JSON Schema Matters
* Decouples LLM from implementation
* Enables language-agnostic tools
* Standardizes interfaces
> Schema = contract
---
### Dispatcher Pattern
```python
def execute_tool(name, input):
    match name:
        case "bash":
            return run_bash(input)
        case "read_file":
            return run_read(input)
        # ...
```
Adding a tool:
- Define input struct
- Implement logic
- Add one dispatch line
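In Python the same pattern collapses to a dict-based registry, so "add one dispatch line" is literally one entry. The tool names and schemas here are illustrative, not the real registry:

```python
# Registry: tool name -> (input schema, implementation).
TOOLS = {
    "echo":    ({"text": "string"},   lambda inp: inp["text"]),
    "reverse": ({"text": "string"},   lambda inp: inp["text"][::-1]),
}


def execute_tool(name, inp):
    # Dispatcher: route the call, fail loudly on unknown tools.
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    _schema, impl = TOOLS[name]
    return impl(inp)


result = execute_tool("echo", {"text": "hello"})
```

Adding a third tool means adding one more `TOOLS` entry; the dispatcher never changes.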
### Sub-Agent Design
Sub-agents reuse the same runtime:
```python
runtime = AgentRuntime(
    session=new_session,
    tool_executor=restricted_tools,
    permission=high,
    prompter=None,
)
```

Key constraint:
* Sub-agents cannot spawn sub-agents
This prevents recursion loops.
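One simple way to enforce this rule is a flag on the runtime that caps nesting depth at one. This is a guess at the mechanism for illustration, not the actual Rust code:

```python
class AgentRuntime:
    def __init__(self, is_subagent: bool = False):
        self.is_subagent = is_subagent

    def spawn_subagent(self) -> "AgentRuntime":
        # Sub-agents may not spawn further sub-agents: depth is capped at 1,
        # so runaway recursive spawning is impossible by construction.
        if self.is_subagent:
            raise PermissionError("sub-agents cannot spawn sub-agents")
        return AgentRuntime(is_subagent=True)


root = AgentRuntime()
child = root.spawn_subagent()
```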
---
## Permission System: Minimal but Complete
The system uses **5 permission levels**:
* ReadOnly
* WorkspaceWrite
* DangerFullAccess
* Prompt
* Allow
---
### Core Logic
```python
if current >= required:
    allow
elif one_level_gap:
    ask_user
else:
    deny
```
### Design Insight: Gradual Escalation
Instead of:
- All-or-nothing access
It uses:
> Controlled escalation
- Small gap → ask user
- Large gap → deny
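As a sketch, the gap-based rule can be modeled with an ordered enum. The three access levels are taken from the list above; the `authorize` helper and the numeric ordering are illustrative assumptions:

```python
from enum import IntEnum


class Level(IntEnum):
    # Ordered so comparisons express "at least this much access".
    READ_ONLY = 0
    WORKSPACE_WRITE = 1
    DANGER_FULL_ACCESS = 2


def authorize(current: Level, required: Level) -> str:
    if current >= required:
        return "allow"        # already privileged enough
    if required - current == 1:
        return "ask_user"     # one-level gap: escalate to the user
    return "deny"             # large gap: refuse outright


# From ReadOnly, each target level yields a different decision.
decisions = [authorize(Level.READ_ONLY, r) for r in Level]
```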
### Sub-Agent Safety Model
Sub-agents:
- Have high permission
- But no user prompt interface
Result:
- Allowed within scope
- Automatically blocked outside
> Two mechanisms combine into precise control.
## Part 1 Summary
Claude Code’s core reduces to three components:
```plaintext
Agent Loop   → execution engine
Tool System  → action layer
Permissions  → safety control
```
Key principles:
- Messages are the only state
- LLM decides when to stop
- Tools are schema-driven
- Errors are part of reasoning
- Permissions are incremental
## Final Thought
After stripping away 500K lines of code, what remains is surprisingly small:
> A loop, a tool interface, and a permission system.
That’s enough to build a functional AI agent.
But making it robust, scalable, and safe—that’s where the real complexity begins.
## Next Part
Claude Code Deep Dive (Part 2): Context Engineering and Design Patterns
- Prompt construction
- Config merging
- Context compression
- Practical design takeaways
## References
- Claw Code (Rust rewrite): https://github.com/instructkr/claw-code
- Project site: https://claw-code.codes/
- Claude Code official repo: https://github.com/anthropics/claude-code