ExploitBench: Claude Mythos Preview and GPT-5.5 Develop Real Browser Exploits Autonomously

Anthropic

Research | official + media | 2 sources | ~1 min

Carnegie Mellon University researchers published ExploitBench, a benchmark that tests AI models on real-world V8 JavaScript engine vulnerabilities across 16 capability tiers. Anthropic's Claude Mythos Preview led all models, scoring 9.90/16 with hints and 9.55/16 autonomously, and achieved arbitrary code execution on 21 of 41 tested vulnerabilities. OpenAI's GPT-5.5 scored 5.51. The researchers found that 'reaching arbitrary code execution is an emerging frontier capability.'

Why it matters

ExploitBench is the first systematic benchmark to demonstrate that frontier AI models can operate as 'fairly competent' browser security researchers, autonomously constructing working exploits against hardened targets. Mistral's CEO cited the findings in a French parliamentary hearing, warning against giving AI systems with these capabilities access to military codebases.

Importance: 4/5

Frontier AI autonomously exploiting browser vulnerabilities: a landmark security benchmark with direct policy implications, including parliamentary testimony.

claude-mythos cybersecurity benchmark red-teaming security

Sources

official ExploitBench: Evaluating AI Models on Real-World Browser Vulnerability Exploitation (arXiv)

media New benchmark shows Claude Mythos and GPT-5.5 can develop real browser exploits — The Decoder