
Sakana Fugu Architecture: The Orchestrator Is the New Foundation Model

Sakana Fugu's architecture treats orchestration as the model: learned routing, roles, verification, recursive calls, and multi-agent workflows through one API.

The most interesting thing about Sakana Fugu is not that it calls many models.

Lots of systems call many models. Most are glorified routing tables, hand-written agent chains, or a pile of prompts with a nice name.

Fugu is more ambitious. Its architecture treats orchestration itself as the learned object. The release page says Fugu is a language model trained to call other LLMs in an agent pool, including recursive calls to itself. Users call one API. Inside, Fugu decides when to delegate, which model should work, when to verify, and how to synthesize the result.

That is a real architectural shift.

The Research Base: Trinity and Conductor

Sakana ties Fugu to two ICLR 2026 research lines: Trinity and Conductor.

Trinity uses a lightweight evolved coordinator to orchestrate LLMs over multiple turns. It assigns roles like Thinker, Worker, and Verifier: one model plans, another executes, another checks the answer. The coordinator learns how to spend a limited turn budget across those roles.
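To make the role loop concrete, here is a minimal sketch of role-tagged turns under a fixed budget. Everything in it is an assumption for illustration: `run_with_budget`, `fixed_cycle`, and `fake_agent` are hypothetical names, the agent is a stub rather than a real model, and the hand-written `fixed_cycle` policy stands in for the part Trinity actually learns.

```python
from typing import Callable, List, Tuple

# Hypothetical types: an agent maps (role, prompt) to text, and a policy
# picks the next role from the transcript so far.
Turn = Tuple[str, str]
Agent = Callable[[str, str], str]
RolePolicy = Callable[[List[Turn]], str]

ROLES = ("thinker", "worker", "verifier")


def run_with_budget(task: str, agent: Agent, pick_role: RolePolicy,
                    max_turns: int = 6) -> List[Turn]:
    """Run role-tagged turns until a verifier passes or the budget runs out."""
    transcript: List[Turn] = []
    for _ in range(max_turns):
        role = pick_role(transcript)
        history = "\n".join(f"[{r}] {text}" for r, text in transcript)
        output = agent(role, f"{task}\n{history}")
        transcript.append((role, output))
        if role == "verifier" and output.startswith("PASS"):
            break
    return transcript


def fixed_cycle(transcript: List[Turn]) -> str:
    # Placeholder policy: a learned coordinator would choose here instead.
    return ROLES[len(transcript) % len(ROLES)]


def fake_agent(role: str, prompt: str) -> str:
    # Canned replies so the sketch runs without any model API.
    return {"thinker": "plan: split the task",
            "worker": "draft answer",
            "verifier": "PASS"}[role]


transcript = run_with_budget("summarize the incident", fake_agent, fixed_cycle)
print([role for role, _ in transcript])
```

The interesting design question sits in `pick_role`: with a hard cap on turns, every verification pass costs a turn that could have gone to more work, which is exactly the trade-off a learned coordinator has to manage.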

Conductor is different but adjacent. It learns natural-language coordination strategies with reinforcement learning. Instead of hard-coding a workflow, Conductor can generate custom instructions, choose communication structures, and decide which prior subtasks each agent should see.
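One way to picture that kind of coordination is as a plan where each step carries its own instruction and an explicit list of which earlier outputs it may see. The sketch below is an assumed data shape, not Conductor's actual format: `Step`, `run_plan`, and the stub agents are hypothetical, and the plan is written by hand where Conductor would generate it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical agent signature: (instruction, visible prior outputs) -> text.
AgentFn = Callable[[str, List[str]], str]


@dataclass
class Step:
    agent: str                      # which agent in the pool runs this step
    instruction: str                # custom natural-language instruction
    sees: List[int] = field(default_factory=list)  # indices of visible earlier steps


def run_plan(plan: List[Step], agents: Dict[str, AgentFn]) -> List[str]:
    """Execute steps in order, passing each agent only the outputs it may see."""
    outputs: List[str] = []
    for index, step in enumerate(plan):
        if any(j < 0 or j >= index for j in step.sees):
            raise ValueError(f"step {index} can only see earlier steps")
        visible = [outputs[j] for j in step.sees]
        outputs.append(agents[step.agent](step.instruction, visible))
    return outputs


def make_stub(name: str) -> AgentFn:
    # Stub agent so the sketch runs without any model API.
    def stub(instruction: str, visible: List[str]) -> str:
        return f"{name} did '{instruction}' with {len(visible)} prior output(s)"
    return stub


agents = {name: make_stub(name) for name in ("reader", "coder", "critic")}
plan = [
    Step("reader", "extract the method"),
    Step("coder", "implement it", sees=[0]),
    Step("critic", "review the code only", sees=[1]),  # critic never sees step 0
]
results = run_plan(plan, agents)
```

The `sees` field is the part worth noticing: the communication structure is data, so a coordinator can change who reads what without changing any code.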

Fugu is the product expression of that research direction.

Why This Beats Static Routing

Static routing says: coding goes to model A, long context goes to model B, math goes to model C.

That is useful, but shallow.

Hard tasks do not stay inside one category. A production incident can require logs, Kubernetes knowledge, cost awareness, source-code inspection, change history, and a final plan for humans. A research task can require paper reading, implementation, experiment design, result interpretation, and critique.
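A small sketch shows where the table breaks. The model names and categories below are the placeholders from the routing example above, and `route_static` is a hypothetical helper, not any product's API.

```python
from typing import Set

# Placeholder routing table: one category maps to exactly one model.
STATIC_ROUTES = {
    "coding": "model-a",
    "long_context": "model-b",
    "math": "model-c",
}


def route_static(categories: Set[str]) -> str:
    """Pick a model from the table; only works when the task has one category."""
    if len(categories) != 1:
        raise ValueError(
            f"static table cannot route a task spanning {sorted(categories)}"
        )
    (category,) = categories
    return STATIC_ROUTES[category]


print(route_static({"math"}))

# A production incident touches several categories at once, so the table
# has no single right answer for it.
try:
    route_static({"coding", "long_context"})
except ValueError as error:
    print(error)
```

A real router would probably pick one category anyway, which hides the problem rather than solving it: the other parts of the task still go to a model that was not chosen for them.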

Fugu decides the work shape at runtime. Maybe one model is enough. Maybe it needs a planner, a worker, and a verifier. Maybe it needs to call itself after seeing a flawed partial answer.
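The self-call case can be sketched as plain recursion with a depth cap. This is only an illustration of the control flow under stated assumptions: `orchestrate`, `solve`, and `check` are hypothetical names, the stubs are canned, and how Fugu actually decides to call itself is learned and not something this code reproduces.

```python
from typing import Callable

Solver = Callable[[str], str]
Checker = Callable[[str, str], bool]


def orchestrate(task: str, solve: Solver, check: Checker,
                depth: int = 0, max_depth: int = 2) -> str:
    """Solve a task, re-entering the orchestrator if the answer fails a check."""
    answer = solve(task)
    if check(task, answer) or depth >= max_depth:
        return answer
    # Recursive self-call: hand the flawed attempt back as extra context.
    revised_task = f"{task}\nPrevious attempt was flawed: {answer}"
    return orchestrate(revised_task, solve, check, depth + 1, max_depth)


def stub_solve(task: str) -> str:
    # Canned behavior: only gets it right once it has seen a flawed attempt.
    return "4" if "flawed" in task else "5"


def stub_check(task: str, answer: str) -> bool:
    return answer == "4"


final_answer = orchestrate("What is 2 + 2?", stub_solve, stub_check)
print(final_answer)
```

The `max_depth` cap matters as much as the recursion: without it, a checker that never passes would turn one request into an unbounded chain of calls.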

David Ha's take is that recursive self-calls let Fugu build corrective workflows when it spots mistakes. Robert Tjarko Lange framed the system as test-time scaling that reasons not only about which model to use, but also about how to sequence the calls.

The Failure Surface

My opinion: Fugu is exciting because it makes orchestration a first-class model capability. It is also risky for exactly the same reason.

When a workflow is hand-written, at least you can inspect it. Learned orchestration can be harder to explain. If the subtasks, model choices, and context filters stay hidden, users may get a polished final answer without knowing what produced it.

That is manageable only if the product exposes operational metadata:

  • Which agents were used.
  • What role each agent played.
  • What context each agent saw.
  • Where verification happened.
  • How much the orchestration cost.

Without that, multi-agent intelligence becomes a black box with a bigger bill.
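As a sketch of what that metadata could look like, the record below maps one field to each item in the list. The shape is an assumption for illustration: `AgentCall`, `OrchestrationTrace`, and every field name are hypothetical, and nothing here describes what Fugu actually returns.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AgentCall:
    agent: str               # which agent was used
    role: str                # what role it played
    context_ids: List[str]   # what context it saw, by reference
    is_verification: bool    # whether this call was a verification step
    cost_usd: float          # what this call cost


@dataclass
class OrchestrationTrace:
    calls: List[AgentCall] = field(default_factory=list)

    def total_cost(self) -> float:
        """Sum the cost of every call in the workflow."""
        return sum(call.cost_usd for call in self.calls)

    def verification_steps(self) -> List[int]:
        """Return the positions in the workflow where verification happened."""
        return [i for i, call in enumerate(self.calls) if call.is_verification]


trace = OrchestrationTrace(calls=[
    AgentCall("model-a", "planner", ["user_prompt"], False, 0.25),
    AgentCall("model-b", "worker", ["user_prompt", "plan"], False, 0.5),
    AgentCall("model-a", "verifier", ["draft"], True, 0.25),
])
print(trace.total_cost(), trace.verification_steps())
```

Recording context by reference rather than by value keeps the trace small and avoids copying sensitive inputs into logs, while still letting a reviewer check which agent saw what.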

The Clanker Cloud Parallel

This is where Fugu's architecture connects directly to Clanker Cloud.

Infrastructure agents need orchestration too, but around tools, credentials, live state, and human approval. A Clanker Cloud workflow may ask one model to summarize an incident, another to inspect code, MCP tools to read cluster state, and the user to approve any high-impact change.
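The human-approval step in that kind of workflow can be sketched as a gate in front of execution. This is a generic illustration, not Clanker Cloud's API: `PlannedAction`, `run_with_approval`, and the callbacks are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PlannedAction:
    description: str
    high_impact: bool


def run_with_approval(actions: List[PlannedAction],
                      approve: Callable[[PlannedAction], bool],
                      apply: Callable[[PlannedAction], None]) -> List[str]:
    """Apply actions in order, skipping high-impact ones the user rejects."""
    applied: List[str] = []
    for action in actions:
        if action.high_impact and not approve(action):
            continue  # rejected by the human, so it never runs
        apply(action)
        applied.append(action.description)
    return applied


plan = [
    PlannedAction("summarize the incident", high_impact=False),
    PlannedAction("scale the deployment down", high_impact=True),
    PlannedAction("open a follow-up issue", high_impact=False),
]

executed: List[str] = []
applied = run_with_approval(
    plan,
    approve=lambda action: False,  # stand-in for a user who declines
    apply=lambda action: executed.append(action.description),
)
print(applied)
```

The gate sits outside the models on purpose: whichever agent proposed the change, the decision to run a high-impact action stays with the person reviewing the plan.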

The architectural principle is the same: the valuable system is the coordinator plus the evidence layer, not only the strongest model in the pool.

Clanker Cloud keeps that evidence close to the user. Credentials stay local. Agents get structured infrastructure context. Plans are reviewable before execution. Novlabs.ai is researching this systems engineering layer because model intelligence without operational boundaries is not enough.

My Take

Fugu's architecture is a glimpse of the next default agent stack: learned orchestration above a changing pool of models.

The winning systems will make that complexity legible. If Fugu exposes enough trace, cost, privacy, and verification detail, it can make multi-agent workflows feel like one model call without pretending they are simple internally.

That is the frontier: not one model to do everything, but one trustworthy operating layer to coordinate the right work.
