Today, Sakana AI launched Sakana Fugu. It is a multi-agent orchestration system that behaves like one model. You send a request to a single endpoint. Fugu decides how to handle it internally. It solves a task directly when that is enough. It also assembles and coordinates a team of expert models when needed. The complexity of a multi-agent system never reaches your code.
TL;DR
Fugu delivers a multi-agent system behind one OpenAI-compatible API.
Fugu Ultra leads most published coding and reasoning benchmarks.
The orchestrator beats the individual models it coordinates.
Opt-out and provider routing target compliance and single-vendor risk.
Routing is proprietary, so per-query model selection stays hidden.
What is Sakana Fugu
Fugu is itself a language model. It is trained to call other LLMs in an agent pool. That pool includes instances of itself, called recursively. Fugu manages model selection, delegation, verification, and synthesis internally.
Instead of hard-coded roles or workflows, Fugu learns how to coordinate. It decides when to delegate and how agents should communicate. It then combines their work into one answer. From the outside, you call a single model. Inside, a coordinated system of experts does the work.
Sakana AI frames this as a hedge against single-vendor dependency. If one provider restricts access, Fugu routes around the disruption. The research team cites recent export controls on Anthropic’s Fable and Mythos models as motivation. Over time, newer models can be folded into the pool.
Fugu and Fugu Ultra: Two Models, One API
Fugu ships in two variants, both behind one OpenAI-compatible API:
Fugu balances strong performance with low latency. It is a default for everyday coding, code review, and chatbots. It also fits tools like Codex. You can opt specific agents out of its pool. That helps teams meet data, privacy, and compliance requirements.
Fugu Ultra is tuned for maximum answer quality on hard, multi-step problems. It coordinates a deeper pool of expert agents. Its pool is fixed, so opt-out is not available. The current model ID is fugu-ultra-20260615.
The Research Behind the Orchestrator
Fugu builds on two ICLR 2026 papers Trinity and the Conductor on learned orchestration.
TRINITY uses a lightweight evolved coordinator across several turns. It assigns Thinker, Worker, or Verifier roles to delegate work adaptively. Conductor is trained with reinforcement learning. It discovers natural-language coordination strategies and focused prompts for diverse LLM pools.
Together, they show systems can learn to assemble and route agents per task. That replaces hand-designed workflows.
Interactive Explainer
Sakana Fugu — Orchestration Simulator
An illustrative walkthrough of how Fugu routes a request: decide, assign Thinker / Worker / Verifier roles, coordinate an agent pool, then synthesize one answer.
Illustrative
1 · Task type
2 · Model
Fugu balances quality and latency. You can opt agents out of its pool for data, privacy, or compliance needs.
3 · Agent pool
Opus 4.8
Gemini 3.1 Pro
GPT 5.5
Fugu (recursive self‑call)
Toggle a provider off to opt it out. Fugu Ultra uses the full fixed pool.
4 · Resilience event
No active restriction. The pool is intact.
POST /v1/chat/completions model = fugu · OpenAI‑compatible · single endpoint
Ready.
Orchestration trace
Press Run orchestration to watch the routing steps.
Agent activity
Synthesized answer
Awaiting run…
This widget is an educational illustration of Fugu’s documented mechanics — a single OpenAI‑compatible endpoint, a swappable agent pool, opt‑out controls, role assignment (Thinker / Worker / Verifier), and routing around a restricted provider. It does not call any live API. Real model selection and coordination are proprietary and not exposed by Sakana AI. Benchmark figures referenced in the article come from Sakana AI’s published materials.
Contains information related to marketing campaigns of the user. These are shared with Google AdWords / Google Ads when the Google Ads and Google Analytics accounts are linked together.
90 days
__utma
ID used to identify users and sessions
2 years after last activity
__utmt
Used to monitor number of Google Analytics server requests
10 minutes
__utmb
Used to distinguish new sessions and visits. This cookie is set when the GA.js javascript library is loaded and there is no existing __utmb cookie. The cookie is updated every time data is sent to the Google Analytics server.
30 minutes after last activity
__utmc
Used only with old Urchin versions of Google Analytics and not with GA.js. Was used to distinguish between new sessions and visits at the end of a session.
End of session (browser)
__utmz
Contains information about the traffic source or campaign that directed user to the website. The cookie is set when the GA.js javascript is loaded and updated when data is sent to the Google Anaytics server
6 months after last activity
__utmv
Contains custom information set by the web developer via the _setCustomVar method in Google Analytics. This cookie is updated every time new data is sent to the Google Analytics server.
2 years after last activity
__utmx
Used to determine whether a user is included in an A / B or Multivariate test.
18 months
_ga
ID used to identify users
2 years
_gali
Used by Google Analytics to determine which links on a page are being clicked
30 seconds
_ga_
ID used to identify users
2 years
_gid
ID used to identify users for 24 hours after last activity
24 hours
_gat
Used to monitor number of Google Analytics server requests when using Google Tag Manager