Vulnerability Research Is Cooked

Simon Willison Blog, by Simon Willison, April 3, 2026


3rd April 2026 - Link Blog

Vulnerability Research Is Cooked. Thomas Ptacek's take on the sudden and enormous impact the latest frontier models are having on the field of vulnerability research.

Within the next few months, coding agents will drastically alter both the practice and the economics of exploit development. Frontier model improvement won’t be a slow burn, but rather a step function. Substantial amounts of high-impact vulnerability research (maybe even most of it) will happen simply by pointing an agent at a source tree and typing “find me zero days”.

Why are agents so good at this? A combination of baked-in knowledge, pattern-matching ability, and brute force:

You can't design a better problem for an LLM agent than exploitation research.

Before you feed it a single token of context, a frontier LLM already encodes supernatural amounts of correlation across vast bodies of source code. Is the Linux KVM hypervisor connected to the hrtimer subsystem, workqueue, or perf_event? The model knows.

Also baked into those model weights: the complete library of documented "bug classes" on which all exploit development builds: stale pointers, integer mishandling, type confusion, allocator grooming, and all the known ways of promoting a wild write to a controlled 64-bit read/write in Firefox.

Vulnerabilities are found by pattern-matching bug classes and constraint-solving for reachability and exploitability. Precisely the implicit search problems that LLMs are most gifted at solving. Exploit outcomes are straightforwardly testable success/failure trials. An agent never gets bored and will search forever if you tell it to.
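To make the "testable success/failure trials" point concrete, here is a minimal toy sketch in Python. It is not from Thomas's article and it is not how a coding agent works internally. It only illustrates the shape of the loop: propose a candidate, run it against the target, and let a pass/fail oracle decide. Every name in it (`toy_parse_length`, `crashes`, `search`, `two_byte_inputs`) is hypothetical, and the bug is planted on purpose in a made-up parser.

```python
from itertools import product
from typing import Callable, Iterable, Iterator, Optional


def toy_parse_length(data: bytes) -> int:
    """Hypothetical target: a made-up parser with a deliberately planted bug."""
    if len(data) < 2:
        return 0
    declared = data[0]   # first byte claims how long the payload is
    payload = data[1:]
    # Planted bug: the declared length is trusted without checking it
    # against the real payload size, so a large value indexes out of range.
    return payload[declared - 1] if declared else 0


def crashes(target: Callable[[bytes], int], candidate: bytes) -> bool:
    """Pass/fail oracle: True if the target raises IndexError on this input."""
    try:
        target(candidate)
    except IndexError:
        return True
    return False


def search(target: Callable[[bytes], int],
           candidates: Iterable[bytes]) -> Optional[bytes]:
    """Try candidates in order and return the first one the oracle flags."""
    for candidate in candidates:
        if crashes(target, candidate):
            return candidate
    return None


def two_byte_inputs(alphabet: tuple = (0, 1, 2, 3)) -> Iterator[bytes]:
    """Enumerate every 2-byte input over a tiny alphabet (assumed search space)."""
    for pair in product(alphabet, repeat=2):
        yield bytes(pair)


finding = search(toy_parse_length, two_byte_inputs())
print(finding)  # b'\x02\x00': declared length 2, but only 1 payload byte
```

The search here is a dumb exhaustive enumeration. The argument in the quote is that an agent replaces that enumeration with informed guesses about bug classes and reachability, while the oracle stays just as cheap and unambiguous.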

The article was partly inspired by this episode of the Security Cryptography Whatever podcast, where David Adrian, Deirdre Connolly, and Thomas interviewed Anthropic's Nicholas Carlini for 1 hour 16 minutes.

I just started a new tag here for ai-security-research - it's up to 11 posts already.

Original source

Simon Willison Blog

https://simonwillison.net/2026/Apr/3/vulnerability-research-is-cooked/#atom-everything
