12 min read | 2026-04-24 | Last updated 2026-07-14 | Clanker Cloud Editorial Team

MCP Server for Kubernetes and Cloud Infrastructure: How Clanker Cloud Works with Any AI Agent

Clanker Cloud's MCP server connects AI agents like Claude Code, OpenClaw, and Codex to live Kubernetes and cloud infrastructure.


The Model Context Protocol is solving a coordination problem that has bothered infrastructure engineers since AI coding agents became practical: every agent needs a custom integration to query live systems, and those integrations are brittle, bespoke, and hard to maintain. MCP gives that problem a standard answer. Clanker Cloud implements that answer for Kubernetes and cloud infrastructure.

This article is for engineers who know what MCP is and want to connect their AI agents — Claude Code, OpenClaw, Codex, Hermes, or something custom — to real clusters and cloud accounts.

1. What MCP Is and Why Infrastructure Teams Care

Anthropic introduced the Model Context Protocol in late 2024. The core idea: define a standard way for AI models to call external tools and read data sources, so any MCP-compatible client can talk to any MCP-compatible server without custom glue code.

Before MCP, connecting an AI agent to your infrastructure meant writing a Slack bot that wrapped kubectl, a webhook that forwarded GitHub Actions events to an LLM endpoint, or a collection of shell scripts that an agent could invoke. Each integration was one-off. When you switched agents or upgraded your AI provider, you rewrote the integrations.

MCP is to AI-agent integrations what REST APIs were to web integrations. When REST became the dominant pattern, you stopped writing custom SOAP envelopes and started reading an OpenAPI spec and making HTTP calls. MCP does the same for AI tools: define the interface once, implement it once, and any compliant client can use it.

By 2026, MCP support is native in Claude Desktop, Claude Code, OpenClaw, Codex, Cursor, and a growing list of agent runtimes. The protocol specifies two transports:

HTTP transport — a persistent server on a local port; multiple agents can connect simultaneously and sessions persist across requests.
stdio transport — the agent spawns the MCP server as a subprocess; one session per process, preferred for desktop integrations.

Both transports expose the same tool schema. The agent calls a tool, gets a structured response, and uses it to continue its reasoning loop.
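To make the tool call concrete: MCP messages are JSON-RPC 2.0, and a tool invocation uses the tools/call method with the tool name and its arguments. The Python sketch below builds such a request. The helper name build_tool_call is made up for illustration, and a real client library would normally assemble this message for you.

```python
import json
from typing import Any


def build_tool_call(request_id: int, tool: str, arguments: dict[str, Any]) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request asking an MCP server to run one tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,          # lets the client match the response to this request
        "method": "tools/call",    # MCP method name for invoking a tool
        "params": {"name": tool, "arguments": arguments},
    }


if __name__ == "__main__":
    # clanker_version takes no arguments, so the arguments object is empty.
    print(json.dumps(build_tool_call(1, "clanker_version", {}), indent=2))
```

The same message shape travels over either transport; only the framing around it differs.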

2. Clanker Cloud as the MCP Server for Infrastructure

Clanker Cloud is a local-first desktop app for infrastructure operations. The open-source CLI at github.com/bgdnvk/clanker ships with a built-in MCP server that exposes your connected providers — AWS, GCP, Azure, Kubernetes (EKS, GKE, AKS), Cloudflare, Hetzner, DigitalOcean, and GitHub — through a standard MCP interface. Credentials and AI keys stay on your machine.

Install via Homebrew:

brew tap clankercloud/tap && brew install clanker

Then start the MCP server:

# HTTP transport — best for multiple agents or persistent sessions
clanker mcp --transport http --listen 127.0.0.1:39393

# stdio transport — best for Claude Desktop integration
clanker mcp --transport stdio
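Before pointing an agent at the HTTP transport, it can help to confirm that something is actually listening. The following is a minimal sketch using only the Python standard library; port_is_open is a hypothetical helper, and it only checks that a TCP connection succeeds, not that the listener speaks MCP.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1:39393 is the listen address used in the HTTP command above.
    print("MCP HTTP port reachable:", port_is_open("127.0.0.1", 39393))
```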

The server reads your local credentials (~/.aws/credentials, ~/.kube/config, and so on) and exposes three MCP tools:

Tool 1: `clanker_version`

Returns the CLI version and the connection status of each configured provider. Useful for agent health checks and for verifying that the MCP server has the expected providers before running infrastructure queries.

{
  "tool": "clanker_version",
  "result": {
    "version": "0.9.4",
    "providers": {
      "aws": "connected",
      "kubernetes": "connected (prod-cluster)",
      "gcp": "not configured"
    }
  }
}
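An agent could use this result as a preflight check. The sketch below assumes the result has the shape shown in the sample above, with status strings that begin with "connected" for usable providers; the function names are illustrative and not part of the Clanker CLI.

```python
def connected_providers(version_result: dict) -> list[str]:
    """List provider names whose status string starts with 'connected'."""
    providers = version_result.get("providers", {})
    return sorted(
        name for name, status in providers.items() if status.startswith("connected")
    )


def missing_providers(version_result: dict, required: set[str]) -> set[str]:
    """Return the required providers that are not currently connected."""
    return required - set(connected_providers(version_result))


if __name__ == "__main__":
    sample = {
        "version": "0.9.4",
        "providers": {
            "aws": "connected",
            "kubernetes": "connected (prod-cluster)",
            "gcp": "not configured",
        },
    }
    print(connected_providers(sample))
    print(missing_providers(sample, {"aws", "gcp"}))
```

Running the check first lets an agent stop early with a clear message instead of sending questions to a provider that is not configured.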

Tool 2: `clanker_route_question`

Takes a natural language question, routes it to the appropriate provider, executes the necessary API calls or kubectl commands against your live cluster, and returns a structured answer. This is the primary tool for read operations. Example inputs:

"which pods are using more than 80% of their memory limit right now?"
"what changed in production in the last 30 minutes?"
"are all services in the production namespace healthy?"
"what is the current instance type for prod-cluster?"

For multi-provider questions, the routing logic fans out in parallel. See the deep research use case for the full cross-account scanning capability.

Tool 3: `clanker_run_command`

Executes an infrastructure command — a kubectl apply, a Terraform plan, an AWS CLI command — but with a hard guard: Maker Mode approval is required. The agent can prepare and propose a command, but execution is blocked until the operator explicitly approves. An agent calling clanker_run_command receives a pending-approval response, not an execution confirmation, until that approval arrives.
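On the agent side, the important property is that a pending response is never treated as a completed change. The sketch below shows one way an agent loop might branch on the response. The status field and its values are hypothetical, since this article does not show the response schema; check the actual tool schema before relying on them.

```python
from enum import Enum


class CommandState(Enum):
    # Hypothetical status values; the real response schema may differ.
    PENDING_APPROVAL = "pending_approval"
    APPROVED = "approved"
    REJECTED = "rejected"


def next_agent_action(response: dict) -> str:
    """Decide what the agent should do after calling clanker_run_command."""
    try:
        state = CommandState(response.get("status"))
    except ValueError:
        # Missing or unknown status: assume nothing ran and keep waiting.
        return "wait_for_operator"
    if state is CommandState.APPROVED:
        return "report_result"
    if state is CommandState.REJECTED:
        return "abandon_change"
    return "wait_for_operator"
```

Defaulting to "wait_for_operator" for anything unrecognized keeps the agent from assuming a mutation happened when it did not.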

3. Agent Setup: OpenClaw, Claude Code, Codex, Hermes

OpenClaw

OpenClaw (68,000+ GitHub stars, MIT license, Node.js/TypeScript) registers the server in one line:

openclaw mcp set clanker-cloud --url http://127.0.0.1:39393

HTTP transport is recommended because OpenClaw may have multiple concurrent task threads; the persistent server handles session overlap cleanly.

Claude Code

Claude Code reads project-scoped MCP server definitions from a .mcp.json file in the project root (Claude Desktop uses the same mcpServers shape in its claude_desktop_config.json):

{
  "mcpServers": {
    "clanker-cloud": {
      "command": "clanker",
      "args": ["mcp", "--transport", "stdio"]
    }
  }
}

Claude Code spawns a clanker mcp subprocess per session. The subprocess reads your local kubeconfig and cloud credentials directly, so credentials never leave your machine or pass through Anthropic's API. Codex takes the same command and args, declared under an mcp_servers table in its TOML configuration file rather than in an mcpServers JSON block.

Hermes via Ollama

Hermes (hermes3:70b, MIT license via NousResearch) runs locally via Ollama:

ollama pull hermes3:70b

Configure the MCP endpoint in your Hermes agent config to point to http://127.0.0.1:39393. Hermes or Gemma 4 (gemma4:31b via Ollama) combined with the local Clanker MCP server can keep the agent's model prompts and responses on-device without an external AI API call. Raw provider credentials also stay in the local credential chain. Live Kubernetes and cloud-provider queries still reach those providers, while account traffic and optional hosted features retain separate paths. See the for-agents page for the full compatible agent list.

4. Four Concrete Use Cases

4.1 Claude Code Mid-Session Infrastructure Query

A developer is writing a deployment script in Claude Code and needs to implement back-pressure logic based on pod memory limits. Rather than alt-tabbing to a dashboard, they ask Claude Code:

"Which pods are using more than 80% of their memory limit right now?"

Claude Code calls clanker_route_question. The MCP server runs the equivalent of kubectl top pods --all-namespaces cross-referenced against resource limit definitions from pod specs, then returns a structured list of pods above the threshold. Claude Code incorporates the live answer into the script, so back-pressure thresholds are based on actual cluster state.
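The cross-referencing step amounts to dividing each pod's memory usage by its limit and keeping the ratios above the threshold. Here is a minimal sketch of that arithmetic, assuming usage and limits have already been collected as Kubernetes quantity strings keyed by pod name. The function names are illustrative, and the parser covers only the common memory suffixes.

```python
# Binary and decimal suffixes used by Kubernetes memory quantities (subset).
_SUFFIXES = {
    "Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4,
    "k": 1000, "M": 1000**2, "G": 1000**3, "T": 1000**4,
}


def parse_memory(quantity: str) -> int:
    """Convert a Kubernetes memory quantity such as '512Mi' to bytes."""
    for suffix, factor in _SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # no suffix: plain byte count


def pods_over_threshold(
    usage: dict[str, str], limits: dict[str, str], threshold: float = 0.80
) -> list[tuple[str, float]]:
    """Return (pod, usage/limit) pairs above threshold, highest ratio first."""
    over = []
    for pod, used in usage.items():
        limit = limits.get(pod)
        if limit is None:
            continue  # a pod without a memory limit has no ratio to report
        ratio = parse_memory(used) / parse_memory(limit)
        if ratio > threshold:
            over.append((pod, round(ratio, 2)))
    return sorted(over, key=lambda item: item[1], reverse=True)
```

Pods with no memory limit are skipped rather than reported, because "percent of limit" is undefined for them.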

This is the pattern that vibe-coding-to-production workflows depend on: the AI coding agent and the infrastructure query agent share the same context window, so generated code reflects live system state.

4.2 OpenClaw HEARTBEAT.md Autonomous Monitoring

OpenClaw's HEARTBEAT.md is a checklist of autonomous checks the agent runs on a schedule (default: every 30 minutes). A production cluster HEARTBEAT.md might include:

- [ ] Are all services in the production namespace healthy?
- [ ] Is any pod in CrashLoopBackOff?
- [ ] Is memory utilization above 85% on any node?

OpenClaw calls clanker_route_question for each check. If redis-cache returns a Degraded status — the kind of signal visible in the live demo — OpenClaw marks the check failed, constructs an alert with the pod name and error state, and triggers the configured alerting channel. Natural-language checks replace writing and maintaining custom Prometheus alerting rules for every condition.
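To illustrate how a checklist like this can turn into a sequence of questions, the sketch below extracts the unchecked items from checklist text. It assumes only the "- [ ]" format shown above and is not a description of how OpenClaw itself parses HEARTBEAT.md.

```python
import re

# Matches "- [ ] text" and "- [x] text"; group 1 is the box, group 2 the text.
_CHECK_LINE = re.compile(r"^\s*-\s*\[( |x|X)\]\s*(.+?)\s*$")


def heartbeat_questions(markdown: str) -> list[str]:
    """Return the text of every unchecked '- [ ]' item, in document order."""
    questions = []
    for line in markdown.splitlines():
        match = _CHECK_LINE.match(line)
        if match and match.group(1) == " ":
            questions.append(match.group(2))
    return questions


if __name__ == "__main__":
    checklist = (
        "- [ ] Are all services in the production namespace healthy?\n"
        "- [ ] Is any pod in CrashLoopBackOff?\n"
        "- [ ] Is memory utilization above 85% on any node?\n"
    )
    for question in heartbeat_questions(checklist):
        print(question)
```

Each returned string could then be passed as the question argument of a clanker_route_question call.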

4.3 Hermes Incident Correlation

A GitHub Actions workflow deploys a new checkout API version at 14:32 UTC. At 14:38, Hermes detects a latency spike and calls clanker_route_question: "What changed in the last 30 minutes in production?"

Clanker queries Kubernetes event logs, recent pod restarts, and ConfigMap changes, returning a timeline that includes the 14:32 deployment. Hermes correlates the deployment with the latency spike and creates a Jira ticket with the deployment SHA, affected service, latency delta, and a suggested rollback command. The rollback is prepared via clanker_run_command but is not executed without operator approval.
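The correlation step reduces to a time-window filter: keep the changes that landed in the 30 minutes before the spike and rank the most recent first. A minimal sketch follows, assuming the timeline has already been parsed into (timestamp, description) pairs; the function name and the sample entries are illustrative.

```python
from datetime import datetime, timedelta, timezone


def changes_before(
    spike_at: datetime,
    changes: list[tuple[datetime, str]],
    window: timedelta = timedelta(minutes=30),
) -> list[tuple[datetime, str]]:
    """Return changes inside [spike_at - window, spike_at], most recent first."""
    start = spike_at - window
    hits = [change for change in changes if start <= change[0] <= spike_at]
    return sorted(hits, key=lambda change: change[0], reverse=True)


if __name__ == "__main__":
    utc = timezone.utc
    # Arbitrary date; the times mirror the scenario described above.
    spike = datetime(2026, 1, 1, 14, 38, tzinfo=utc)
    timeline = [
        (datetime(2026, 1, 1, 13, 50, tzinfo=utc), "configmap update"),
        (datetime(2026, 1, 1, 14, 32, tzinfo=utc), "deploy checkout-api"),
    ]
    for when, what in changes_before(spike, timeline):
        print(when.isoformat(), what)
```

A time window only narrows the list of suspects; it does not prove that a change caused the spike.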

For teams building agent-driven incident response, this pattern is documented in the AI DevOps for teams guide.

4.4 Codex Infrastructure Code Generation from Live State

Codex is generating a Terraform configuration to resize the production EKS cluster. Before writing instance_type and desired_size, it calls clanker_route_question: "What instance type and node count is prod-cluster currently using, and what is its average CPU utilization over the last 7 days?"

Clanker returns live values from the EKS API and CloudWatch. Codex writes Terraform with accurate current values as baseline, then applies sizing logic from its prompt — producing a terraform plan that reflects actual cluster state rather than stale documentation.
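The article does not specify what sizing logic Codex applies, so the following is only one plausible example: scale the node count so that average CPU utilization lands near a target. The function name, the 60% default target, and the minimum node count are all assumptions made for illustration.

```python
import math


def suggested_node_count(
    current_nodes: int,
    avg_cpu: float,
    target_cpu: float = 0.60,
    min_nodes: int = 1,
) -> int:
    """Estimate the node count that would bring average CPU near target_cpu.

    avg_cpu and target_cpu are fractions between 0 and 1.
    """
    if not 0 < target_cpu <= 1:
        raise ValueError("target_cpu must be in (0, 1]")
    if current_nodes < 1 or avg_cpu < 0:
        raise ValueError("current_nodes must be >= 1 and avg_cpu >= 0")
    # Total CPU demand in node-equivalents, divided by the per-node target.
    # Rounding first avoids an off-by-one from floating-point noise.
    needed = math.ceil(round(current_nodes * avg_cpu / target_cpu, 6))
    return max(min_nodes, needed)
```

For example, six nodes averaging 30% CPU with a 60% target suggests three nodes, which Codex could then write into desired_size for review.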

5. Security Model: Why Local-Only MCP Is Different

Cloud-hosted AI infrastructure APIs force a trade-off: either hand an agent direct service account credentials (risky) or route queries through a vendor proxy (which exposes your cluster topology to that vendor). Clanker Cloud's MCP server avoids both by listening on 127.0.0.1, the loopback interface only. Key security properties:

Credential isolation: The MCP server reads ~/.aws/credentials, ~/.kube/config, and other local credential files. Those credentials are used to make API calls from your machine and are never serialized into MCP responses, logged, or transmitted externally.
No telemetry: The MCP server does not phone home. No usage tracking, no query logging, no session metadata sent to Clanker Cloud's backend.
BYOK model calls: When clanker_route_question invokes an AI model, that call goes directly from your machine to your configured AI provider (Anthropic, OpenAI, Google, Cohere, or a local Ollama instance) — not through a Clanker proxy. Model costs are billed directly by the provider.
Maker Mode enforcement: clanker_run_command does not execute without an explicit approval event. An agent can prepare a change; execution is blocked until the operator confirms.
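The loopback-only property is easy to verify in a wrapper script before the server starts. The sketch below checks a host:port listen string with the Python standard library. is_loopback_listen_address is a hypothetical helper, not part of the Clanker CLI.

```python
import ipaddress


def is_loopback_listen_address(listen: str) -> bool:
    """Return True if a 'host:port' listen string binds to loopback only."""
    host, _, port = listen.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {listen!r}")
    host = host.strip("[]")  # allow bracketed IPv6 such as [::1]:39393
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal; accept only the conventional loopback hostname.
        return host == "localhost"


if __name__ == "__main__":
    for candidate in ("127.0.0.1:39393", "0.0.0.0:39393", "[::1]:39393"):
        print(candidate, is_loopback_listen_address(candidate))
```

An address such as 0.0.0.0 binds every interface, which would make the server reachable from other machines on the network.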

The combination gives agents full read access to infrastructure state and the ability to prepare change plans, while write execution stays under human control. See the full security model in the docs and the FAQ for how Maker Mode interacts with automated agent loops.

6. What Agents Can and Cannot Do

Agents can: query live cluster state via clanker_route_question; read topology, resource utilization, event logs, cost signals, and recent changes; prepare and propose commands via clanker_run_command; check provider connectivity via clanker_version; fan out across multiple connected providers (AWS + K8s + GitHub in one session).

Agents cannot (without explicit operator action): execute kubectl apply, terraform apply, or any mutating command without Maker Mode approval; access credentials directly; reach the MCP server from outside the local machine; make changes silently.

The boundary between read-only agent autonomy and approval-gated write execution is what makes Clanker Cloud practical for production infrastructure. For the full workflow, see the demo.

7. MCP as the REST API of AI-Infrastructure Integrations

Pre-MCP, connecting an AI agent to infrastructure meant: write a custom tool definition for each agent runtime, wrap it in a function call schema, handle authentication inside the tool, parse the output back into the agent's expected format, and repeat per agent. The result was N agents × M infrastructure providers, each cell requiring its own integration.

With MCP, that matrix collapses. Clanker Cloud implements MCP once, and any MCP-compatible agent gets access to all connected providers. The agent calls clanker_route_question with a natural language question and gets a structured answer — no Kubernetes API semantics, no AWS SDK authentication, no GCP service account management needed on the agent side.
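The collapse can be put in numbers. The sketch below is a simplified count, assuming one bespoke integration per agent and provider pair before MCP, and one client registration per agent plus one server-side adapter per provider after it. The figures use the four agents and eight providers named in this article.

```python
def integrations_without_mcp(agents: int, providers: int) -> int:
    """Every agent needs its own integration for every provider."""
    return agents * providers


def integrations_with_mcp(agents: int, providers: int) -> int:
    """Each agent registers the server once; the server adapts each provider once."""
    return agents + providers


if __name__ == "__main__":
    agents, providers = 4, 8
    print("without MCP:", integrations_without_mcp(agents, providers))
    print("with MCP:", integrations_with_mcp(agents, providers))
```

The gap widens as either side grows, because one count is a product and the other is a sum.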

Before REST, integrating with a third-party service meant reading vendor-specific protocol specs and writing custom serialization code. After REST, you read an OpenAPI spec and made HTTP calls. MCP is doing that for AI-tool integrations. Clanker Cloud's role is the infrastructure adapter: translating AI agent queries into live infrastructure calls and returning structured results, with local-credential and approval-gate safety intact.

8. Building Custom Agents That Call the Clanker MCP Server

The Clanker MCP server is a standard MCP server. Any MCP client library can connect to it.

Python (using the mcp package):

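import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

# Same endpoint path as the Node.js example below.
MCP_URL = "http://127.0.0.1:39393/mcp"


async def query_infra(question: str) -> str:
    # The streamable HTTP client yields read/write streams plus a session-id getter.
    async with streamablehttp_client(MCP_URL) as (read, write, _get_session_id):
        async with ClientSession(read, write) as session:
            await session.initialize()  # MCP handshake before any tool call
            result = await session.call_tool(
                "clanker_route_question",
                {"question": question},
            )
            # Tool results arrive as a list of content items; take the first text item.
            return result.content[0].text


if __name__ == "__main__":
    answer = asyncio.run(query_infra("are there any pods in CrashLoopBackOff?"))
    print(answer)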

Node.js (using the @modelcontextprotocol/sdk package):

import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

const client = new Client({ name: "infra-agent", version: "1.0.0" });
const transport = new StreamableHTTPClientTransport(new URL("http://127.0.0.1:39393/mcp"));
await client.connect(transport);

const result = await client.callTool({
  name: "clanker_route_question",
  arguments: { question: "what is the memory utilization of the production namespace?" }
});
console.log(result.content[0].text);

Both patterns work with any reasoning loop — simple scripts, LangChain agents, or custom frameworks. The MCP server handles infrastructure API complexity; your agent code handles task orchestration.

Full API reference and code examples are in the Clanker Cloud documentation. Start with a free beta account — $0 by default, with optional paid support tiers and BYOK model usage billed directly by your AI provider.

FAQ

What is an MCP server for Kubernetes?

A process that implements the Model Context Protocol and exposes Kubernetes operations — pod status, event logs, resource utilization — as MCP tools. An AI agent calls those tools in natural language rather than constructing raw Kubernetes API requests. Clanker Cloud's MCP server covers Kubernetes alongside AWS, GCP, Azure, and other providers through the same interface.

How does the Clanker Cloud MCP server differ from a cloud-hosted AI infrastructure API?

The Clanker Cloud MCP server runs on 127.0.0.1. Credentials are read from local files and used to make API calls from your machine. No credential material or infrastructure topology passes through Clanker Cloud's servers. Cloud-hosted AI infrastructure APIs proxy those calls through the vendor's infrastructure — a trust boundary that many security and compliance requirements cannot accommodate.

Which AI agents support MCP in 2026?

Claude Desktop, Claude Code, OpenClaw, Codex, Cursor, and Hermes all support MCP natively. Any agent built on Python or Node.js can add MCP client support using the mcp or @modelcontextprotocol/sdk packages. The protocol is open and vendor-neutral.

Can agents make changes to my infrastructure through the MCP server?

Agents can prepare and propose changes via clanker_run_command, but execution requires explicit Maker Mode approval. Read operations via clanker_route_question are unrestricted. The write gate is enforced by the server rather than by the agent: a clanker_run_command call returns a pending-approval response until the operator approves.

What does BYOK mean for MCP-based infrastructure queries?

Model calls use your own API key — configured in the Clanker Cloud desktop app — and are billed directly by your AI provider (Anthropic, OpenAI, Google, Cohere, or a local Ollama instance running Gemma 4 or Hermes). No Clanker Cloud markup, no proxy. The subscription covers the platform; model costs are separate and transparent.

Get Started

Install the CLI, start the MCP server, and point your agent at http://127.0.0.1:39393:

brew tap clankercloud/tap && brew install clanker
clanker mcp --transport http --listen 127.0.0.1:39393

Create a free account to connect the desktop app to your cloud providers. Browse the full documentation for provider setup, MCP tool schemas, and agent integration guides. If you are building autonomous agents that need live infrastructure data, the for-agents reference covers the full MCP surface in detail.

For teams moving from AI-generated code to production deployments, see the vibe-coding-to-production guide and the AI DevOps for teams overview.
