ExploitBench: Claude Mythos Preview and GPT-5.5 Develop Real Browser Exploits Autonomously
Anthropic
Carnegie Mellon University researchers published ExploitBench, a benchmark that tests AI models on real-world vulnerabilities in the V8 JavaScript engine across 16 capability tiers. Anthropic's Claude Mythos Preview led all models, scoring 9.90/16 with hints and 9.55/16 autonomously, and achieved arbitrary code execution on 21 of the 41 tested vulnerabilities. OpenAI's GPT-5.5 scored 5.51/16. The researchers found that 'reaching arbitrary code execution is an emerging frontier capability.'
Why it matters
ExploitBench is the first systematic benchmark to demonstrate that frontier AI models can operate as 'fairly competent' browser security researchers, autonomously constructing working exploits against hardened targets. Mistral's CEO cited the findings in a French parliamentary hearing, warning against giving AI systems with these capabilities access to military codebases.
Importance: 4/5
Frontier AI autonomously exploiting browser vulnerabilities: a landmark security benchmark with direct policy implications, including parliamentary testimony