Google DeepMind's AI Co-Mathematician Reaches 48% on FrontierMath Tier 4

Google DeepMind

Research | official + media | 2 sources | ~1 min read

Google DeepMind presents an interactive agentic workbench that supports the full cycle of mathematical research: brainstorming, literature search, computational exploration, formal proof development, and theory building. The system maintains a stateful, asynchronous workspace that tracks uncertainty, records failed hypotheses, and tells the user when its reasoning stalls. On FrontierMath Tier 4, the benchmark's hardest tier of research-level problems, it scores 48%, a new state of the art among all AI systems evaluated. In early real-world trials, it helped researchers resolve open problems and surface overlooked references.
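To make the "stateful workspace" idea concrete, here is a minimal sketch of how such state might be tracked. It is not DeepMind's implementation: every name in it (`Workspace`, `Hypothesis`, `stall_threshold`, and so on) is hypothetical, the confidence value is a stand-in for whatever uncertainty estimate the real system keeps, and the asynchronous side is left out entirely.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    OPEN = "open"
    FAILED = "failed"
    SUPPORTED = "supported"


@dataclass
class Hypothesis:
    statement: str
    confidence: float            # hypothetical 0..1 uncertainty estimate
    status: Status = Status.OPEN
    note: str = ""               # why it failed, if it did


@dataclass
class Workspace:
    """Illustrative only: keeps hypotheses, failures, and a stall counter."""
    hypotheses: list = field(default_factory=list)
    steps_since_progress: int = 0
    stall_threshold: int = 3     # hypothetical cutoff, not from the paper

    def propose(self, statement, confidence):
        h = Hypothesis(statement, confidence)
        self.hypotheses.append(h)
        return h

    def record_failure(self, h, note):
        # Failed hypotheses are kept, not deleted, so they can be reviewed.
        h.status = Status.FAILED
        h.note = note
        self.steps_since_progress += 1

    def record_support(self, h, confidence):
        h.status = Status.SUPPORTED
        h.confidence = confidence
        self.steps_since_progress = 0

    def failed(self):
        return [h for h in self.hypotheses if h.status is Status.FAILED]

    def is_stalled(self):
        # Signal to the user when too many steps pass without progress.
        return self.steps_since_progress >= self.stall_threshold
```

In this toy version, a caller would check `is_stalled()` after each step and hand control back to the mathematician, along with the list from `failed()`, instead of continuing to search silently.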

Why it matters

48% on FrontierMath Tier 4 is a concrete SOTA milestone. It suggests that agentic scaffolding, and not raw model capability alone, can materially advance mathematical discovery.

Importance: 3/5

New SOTA at 48% on FrontierMath Tier 4; stateful agentic math workbench applied to real research problems; Google DeepMind frontier lab.

agents reasoning mathematics rl benchmark

Sources

official AI Co-Mathematician: Accelerating Mathematicians with Agentic AI — arXiv

media AI Co-Mathematician on HuggingFace Daily Papers