What Is Sakana Fugu? Sakana AI's Orchestration Model
•10 min read
Sakana Fugu launched June 22, 2026 as one model that commands a pool of frontier LLMs. Fugu Ultra scored 73.7 on SWE-Bench Pro. Here's what it means for builders.
On June 22, 2026, Tokyo-based Sakana AI shipped something unusual: a single model whose main job is to command other models. Sakana Fugu is a multi-agent orchestration system delivered as one OpenAI-compatible API, and it routes each task across a swappable pool of frontier LLMs (Sakana AI, Sakana Fugu, June 2026). The pitch is not just raw capability. It is independence. If one provider restricts access tomorrow, Fugu routes around it. For anyone running AI in production, that's a different kind of promise than "our model scored higher." Here's what's real and what's marketing.
Key Takeaways
In June 2026, Sakana Fugu launched as one model that orchestrates a pool of frontier LLMs behind a single API, built on two ICLR 2026 papers (Sakana AI, 2026).
Fugu Ultra scored 73.7 on SWE-Bench Pro, ahead of Opus 4.8 (69.2), Gemini 3.1 Pro (54.2), and GPT-5.5 (58.6) on Sakana's own table.
The real selling point is vendor independence — and 81% of enterprise execs are at least somewhat concerned about AI vendor dependency (Zapier, 2026).
It's not available in the EU or EEA at launch, and the "matches Anthropic's Fable 5" claim is Sakana's framing, not a head-to-head benchmark.
What Exactly Is Sakana Fugu?
Sakana Fugu is a language model trained to orchestrate other language models. As of June 2026, Sakana AI describes it as "a multi-agent system as a model" — you send one request to one endpoint, and Fugu decides whether to answer directly or assemble a team of expert models to handle the job (Sakana AI, Sakana Fugu, June 2026). It handles model selection, delegation, verification, and synthesis for you. The orchestration disappears behind a single API call.
That last part is the trick. Most multi-agent systems are hand-built workflows: you wire up the steps, pick the models, and maintain the glue. Fugu learns the coordination instead of being told it.
As of June 2026, Sakana Fugu delivers a full multi-agent orchestration system as a single foundation model, exposed through one OpenAI-compatible API that dynamically assembles and coordinates expert models per task (Sakana AI, Sakana Fugu, June 2026). For builders, this means you get team-of-models behavior without building or maintaining the team yourself.
The engineering underneath comes from two ICLR 2026 papers. TRINITY uses a lightweight evolved coordinator that assigns Thinker, Worker, and Verifier roles across turns. Conductor is trained with reinforcement learning to discover its own natural-language coordination strategies rather than following a prescribed script (Sakana AI, Sakana Fugu, June 2026). In Sakana's own words, "instead of using domain knowledge to prescribe team organization, Fugu learns to dynamically assemble agents." The company is co-founded by David Ha and Llion Jones, one of the authors of the original 2017 Transformer paper (Nikkei Asia, June 2026).
So is it a real model or a clever wrapper? It's a model that wraps models. Both things are true, and that tension is the whole story.
How Good Is Fugu Ultra, Really?
On Sakana's published benchmarks, Fugu Ultra leads a strong field. As of June 2026, the company reports Fugu Ultra at 73.7 on SWE-Bench Pro, ahead of Claude Opus 4.8 (69.2), Gemini 3.1 Pro (54.2), and GPT-5.5 (58.6) (Sakana AI, Sakana Fugu, June 2026). It also tops the table on TerminalBench 2.1, LiveCodeBench, and Humanity's Last Exam. These are Sakana's own numbers, but they're corroborated cell-for-cell by MarkTechPost's independent transcription of the same table (MarkTechPost, June 2026).
Now the caveat, because it matters. Sakana says its models "stand shoulder-to-shoulder with Anthropic's Fable 5 and Mythos Preview" (Sakana AI, Fugu release, June 2026). But Fable 5 and Mythos are not in the benchmark table, and they're not in Fugu's agent pool, because they aren't publicly accessible. So that comparison is framing, not a measured head-to-head. Treat the "beats Fable 5" headlines you'll see as marketing, and trust the numbers against Opus 4.8, Gemini 3.1 Pro, and GPT-5.5, which are real models Fugu was actually scored against.
There's a structural reason the numbers look good. Fugu can route a coding task to whichever pool model is strongest at coding, then have a verifier check the work. An orchestrator that picks the best tool for each step should, in principle, beat any single tool. The open question is whether that edge holds on your messy real-world tasks, not curated benchmarks.
Why Does the Vendor Lock-In Angle Matter?
The benchmarks get the headlines, but vendor independence is the actual product thesis. As of June 2026, 81% of enterprise executives say they are at least somewhat concerned about depending on a specific AI vendor, and 29% are "very concerned" (Zapier, 2026 Enterprise AI Survey of 500 US execs). Sakana built Fugu to answer exactly that anxiety: if a provider gets export-controlled, repriced, or pulled, the orchestrator reroutes to another model in the pool without you rewriting anything.
Sakana is blunt about the risk it's selling against. The company warns that "relying on a single company's APIs for critical infrastructure, finance, or governance is a material vulnerability," and that access "can shift or disappear overnight due to changing regulatory boundaries, export controls, and foreign policies" (Sakana AI, Fugu release, June 2026).
The concern isn't hypothetical, and the data backs the fear. Migrations are genuinely painful when you're locked in.
Sources: Zapier 2026 Enterprise AI Survey (500 US execs); Parallels 2026 State of Cloud Computing Survey (540 IT pros).
The migration data is the part that stings. Of enterprises that tried to move between AI platforms, only 42% described the migration as smooth — meaning 58% hit failures or far more effort than they expected (Zapier, 2026). A separate Parallels survey of 540 IT pros found 94% of IT leaders concerned about vendor lock-in, roughly half of them "very concerned" (Parallels, February 2026). Fugu's swappable pool is a direct bet on that pain.
What Does Sakana Fugu Cost and How Do You Use It?
Adoption friction is low by design. Fugu exposes an OpenAI-compatible API across the Chat Completions and Responses endpoints, so you point an existing client at the Fugu endpoint with no SDK migration (Sakana AI, Sakana Fugu, June 2026). On price, subscriptions run $20/month (Standard), $100/month (Pro, ~10x usage), and $200/month (Max, ~20x usage). Pay-as-you-go on Fugu Ultra is $5 per million input tokens and $30 per million output tokens, with cached input at $0.50.
As of June 2026, Sakana Fugu charges one blended rate per token rather than stacking a separate fee for each model it calls internally, with Fugu Ultra priced at $5 per million input tokens and $30 per million output tokens (Sakana AI, Sakana Fugu, June 2026). For teams, this means orchestration cost is predictable even though many models may run behind a single request.
A few practical notes before you commit. Pricing rises to $10/$45 per million tokens for context above 272K tokens, so long-context jobs cost more. The base Fugu model lets you opt specific agents out of the pool for compliance or privacy reasons; Fugu Ultra uses a fixed pool with no opt-out. And the catch that will stop some readers cold: Fugu is not available in the EU or EEA at launch. Multiple outlets report this is a temporary GDPR-alignment restriction tied to the black-box routing architecture, though I'd confirm it on Sakana's console before planning around it, since it wasn't stated on every official page (The Decoder, June 2026).
Should You Actually Adopt Sakana Fugu?
Here's my honest read as someone who builds AI systems for clients. Fugu is most interesting if you've already felt the lock-in pain and you don't want to build your own model router. As of June 2026, the value isn't a single benchmark win — it's that one endpoint gives you best-of-pool routing plus a hedge against any one provider disappearing (Sakana AI, Fugu release, June 2026). That combination is genuinely hard to assemble yourself.
But weigh the trade-off honestly. The same black-box routing that makes Fugu convenient also makes it opaque. When a request fans out to several models and a verifier synthesizes the answer, you lose direct visibility into which model produced what — and that's a real problem for debugging, auditing, and regulated workloads. You're also adding a new vendor dependency in the name of reducing vendor dependency. Fugu hedges against losing any one pool model, but you're now locked into Sakana's orchestration layer.
So who should try it? Teams running coding, code review, and research workloads outside the EU, who value resilience over full transparency, and who want frontier-adjacent quality without managing a model zoo. Teams with strict audit requirements, EU data residency, or a need to know exactly which model touched each request should wait and watch. The smart move right now is a scoped pilot on a non-critical workload, measured against your current single-model setup on your own tasks — not Sakana's benchmarks.
Frequently Asked Questions
Is Sakana Fugu just a wrapper around other models?
It's more than a router but less than a fully independent frontier model. Fugu is itself a trained language model that learns coordination strategies and assigns Thinker, Worker, and Verifier roles across a pool of LLMs (Sakana AI, 2026). It wraps models, but the orchestration logic is learned, not hand-coded.
Does Fugu really beat Claude Fable 5?
No verified head-to-head exists. Sakana says Fugu Ultra stands "shoulder-to-shoulder" with Fable 5, but Fable 5 isn't in Fugu's benchmark table or its agent pool because it isn't publicly accessible (Sakana AI, 2026). The measured wins are against Opus 4.8, Gemini 3.1 Pro, and GPT-5.5.
How much does Sakana Fugu cost?
Subscriptions run $20, $100, and $200 per month for Standard, Pro, and Max tiers. Pay-as-you-go on Fugu Ultra is $5 per million input tokens and $30 per million output tokens, rising to $10/$45 above 272K tokens of context (Sakana AI, 2026). Cached input is $0.50 per million.
Can I use Sakana Fugu in the European Union?
Not at launch. As of June 2026, Sakana states it does not provide Fugu services to users in EU or EEA member states, reportedly a temporary restriction tied to GDPR and its data-routing design (The Decoder, 2026). No EU timeline has been announced.
What's the difference between Fugu and Fugu Ultra?
Fugu balances performance and latency for everyday coding, code review, and chat, and lets you opt agents out of the pool. Fugu Ultra maximizes answer quality on hard, multi-step problems like Kaggle competitions and paper reproduction, using a fixed pool with no opt-out (Sakana AI, 2026).
The Bottom Line
Sakana Fugu is the most concrete answer yet to a question every serious AI team is now asking: what happens if our model provider pulls the rug? By packaging multi-agent orchestration as a single OpenAI-compatible API, Sakana turns "best-of-pool routing plus vendor resilience" into something you can adopt in an afternoon. The benchmarks are strong against the public frontier, the pricing is transparent, and the lock-in thesis is backed by hard survey data.
The reservations are just as real: black-box routing, a new dependency on Sakana itself, and no EU access. If you build or operate AI systems and you've felt the squeeze of single-vendor risk, Fugu is worth a scoped pilot this quarter — run it against your current stack on your own tasks, and let the results, not the launch-day headlines, make the call.