Skip to main content
Back to blog

Sakana Fugu Release: Model Orchestration Is Becoming the Product

Sakana AI's Fugu and Fugu Ultra release turns multi-model orchestration into a single API product, with lessons for AI DevOps and Clanker Cloud.

Sakana AI's Fugu release is a clean signal that the AI market is moving past one-model fandom.

The company launched Sakana Fugu and Fugu Ultra on June 22, 2026 as a generally available product. The pitch is simple: call one OpenAI-compatible API, and behind that endpoint Fugu decides whether to answer directly or coordinate a pool of expert models. Sakana says Fugu is itself a language model trained to call other LLMs, including instances of itself recursively.

That last part is the point. Fugu is not just a router. It is a productized orchestration layer.

What Was Released

At launch there are two variants.

Fugu is the everyday model, tuned for strong performance with lower latency. It is meant for coding tools, review workflows, chat services, and interactive use. Teams can also opt specific agents out of the pool for data, privacy, or compliance reasons.

Fugu Ultra is the quality-first model. It coordinates a deeper pool of expert agents for harder, longer, higher-stakes tasks such as paper reproduction, cybersecurity analysis, literature review, patent investigation, and data-science research.

The benchmark claims are aggressive. Sakana says the Fugu models outperform publicly accessible frontier models across several engineering, science, reasoning, and agentic benchmarks, while Fugu Ultra is positioned near Fable 5 and Mythos Preview. The caveat matters: baseline scores are provider-reported, and Fable/Mythos are not in the Fugu pool because they are not publicly accessible.

Still, the shape is credible. Long, messy work benefits from planning, execution, checking, and synthesis. A trained system that knows when to split work and verify it should be better when the task deserves the extra cost.

What Experts Are Saying

The expert signal is mostly about orchestration, not raw score worship.

David Ha said Sakana has used Fugu internally for research and coding, and framed the future of AI as "collective intelligence." His useful point is test-time scaling: when Fugu calls itself recursively, it can read prior output and launch corrective workflows.

Robert Tjarko Lange described Fugu as more than an argmax over a model pool. That is a good phrase because it separates Fugu from shallow model picking. The product is trying to learn query sequences, roles, and collaboration patterns.

Sakana's own early users echo the pattern. A software engineer reported deeper code review findings. A cybersecurity engineer said Fugu kept a scoped assessment inside bounds while producing evidence and retest steps. An enterprise platform executive called out persona stability in long sessions.

My opinion: those are the right things to measure. Agents fail in the boring middle of the task: losing scope, dropping evidence, forgetting constraints, or finishing without a usable handoff.

Why This Matters for Clanker Cloud

Clanker Cloud is built around a similar systems view.

The model is not the product. The operating layer is the product.

For cloud operations, that means live infrastructure context, local credentials, MCP tools, cost and deployment state, review-before-apply plans, and enough audit trail to understand what the agent did. Fugu handles model orchestration behind one API. Clanker Cloud handles infrastructure orchestration around real systems.

The overlap is obvious. As models become swappable and orchestration becomes normal, teams will care less about which model is temporarily winning a leaderboard. They will care whether the whole system can inspect reality, route intelligently, preserve trust boundaries, and stop before unsafe action.

That is where Novlabs.ai's systems engineering research behind Clanker Cloud points: agentic software needs a control plane, not just a bigger model.

My Take

I like this release because it treats AI capability as a runtime system.

I am also cautious. A hidden orchestration layer can hide cost, provenance, failure modes, and tool boundaries. The enterprise version of Fugu needs excellent observability: which agents ran, what evidence they saw, what was excluded, and why the final answer should be trusted.

The good news is that this is the right debate. AI products are growing up. The frontier is no longer only model size. It is coordination, governance, and useful work that survives contact with real workflows.

Sources

Next step

Ask Clanker Cloud what your cluster is doing

Install the local app, connect your kubeconfig, and turn cluster state, workload health, cost context, and safe next steps into one readable answer.

Download Clanker CloudRead about the agentic-native cloud