Sakana AI Releases Fugu: Multi-LLM Orchestrator Achieving SoTA on SWE-Bench Pro

Sakana AI

Research | official + media | 2 sources | ~1 min read

Sakana AI published the Fugu Technical Report (arXiv 2606.21228, revised June 23, 2026). Fugu is a family of orchestrator models trained, via fine-tuning, evolutionary algorithms, and reinforcement learning (RL), to coordinate an adaptive team of specialized LLMs and to devise an agent scaffold tailored to each query. There are two variants: Fugu, which balances performance and latency, and Fugu-Ultra, which targets maximum quality. The family achieves state-of-the-art results on SWE-Bench Pro, Terminal Bench, LiveCodeBench, and GPQA-Diamond among publicly accessible models.
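To make the orchestration idea concrete, here is a minimal Python sketch of per-query scaffold selection. It is not Sakana AI's implementation: the specialist functions, the `plan_scaffold` keyword rule, and all names are hypothetical stand-ins. In Fugu the planning role is played by a trained orchestrator model, not a hand-written rule.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for specialist LLMs: each maps a prompt to a reply.
Specialist = Callable[[str], str]


def reasoner(prompt: str) -> str:
    return f"[reasoner] analysis of: {prompt}"


def coder(prompt: str) -> str:
    return f"[coder] patch for: {prompt}"


def reviewer(prompt: str) -> str:
    return f"[reviewer] checked: {prompt}"


SPECIALISTS: Dict[str, Specialist] = {
    "reasoner": reasoner,
    "coder": coder,
    "reviewer": reviewer,
}


def plan_scaffold(query: str) -> List[str]:
    """Pick an ordered team of specialists for this query.

    Assumption: a toy keyword rule stands in for the learned orchestrator.
    """
    q = query.lower()
    if any(word in q for word in ("bug", "fix", "test", "code")):
        return ["reasoner", "coder", "reviewer"]
    return ["reasoner", "reviewer"]


def run(query: str) -> str:
    """Execute the scaffold, feeding each specialist's output to the next."""
    output = query
    for name in plan_scaffold(query):
        output = SPECIALISTS[name](output)
    return output


if __name__ == "__main__":
    print(plan_scaffold("Fix the failing test in utils.py"))
    print(run("Why is the sky blue?"))
```

The point of the sketch is only the shape of the control flow: the plan differs per query, and the specialists behind it are interchangeable.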

Why it matters

Fugu directly addresses vendor lock-in and frontier LLM fragmentation by learning to compose specialist models rather than relying on a single provider. Achieving SoTA on hard benchmarks like GPQA-Diamond and SWE-Bench Pro without a monolithic model is a meaningful architectural result.

Importance: 3/5

SoTA results on SWE-Bench Pro and GPQA-Diamond from a multi-LLM orchestrator challenge the assumption that frontier-level performance requires monolithic models.

multi-agent coding-agent reinforcement-learning software-engineering

Sources

official Sakana Fugu Technical Report — arXiv

media Sakana AI blog: Fugu release