AI Infrastructure Assistant with Local Credentials: Why It Matters and How Clanker Cloud Delivers

AI infrastructure assistants split into two models: hosted (credentials go to the vendor) and local-first (everything stays on your machine).

The category of AI infrastructure assistant is maturing fast. Tools can now answer "why is checkout latency spiking?", enumerate all IAM roles with administrator access, and generate a reviewed change plan before anything touches production. The capability gap between these tools has narrowed considerably. The trust gap has not.

Every AI infrastructure assistant makes one fundamental choice: where do your credentials go when the tool needs to run a query? The answer to that question determines your security posture, your compliance exposure, and whether a vendor breach becomes your breach.

This article defines the two emerging models — hosted and local-first — explains what each means for credential custody, and shows exactly how Clanker Cloud implements the local-first model in practice.


1. The Credential Trust Problem: Why It Matters in 2026

AWS IAM keys, kubeconfig files, and GCP service account JSON are among the most sensitive assets a company holds. Whoever controls those credentials can enumerate your infrastructure, exfiltrate data, escalate privileges, and delete resources. This is not a theoretical concern — it is the core of most cloud security incidents.

Most hosted AI DevOps tools work in one of two ways: they install an agent on your infrastructure that sends telemetry to the vendor's cloud, or they ask you to paste credentials directly into a SaaS platform. Either way, your credentials or the data they unlock lives in someone else's environment.

When you grant an AI assistant access to your cloud accounts, you are trusting more than the tool. You are trusting the vendor's security posture, their breach response, their data retention policies, and the access controls governing which employees can view your data — plus every sub-processor in their supply chain.

High-profile credential leaks frequently trace back to third-party tooling rather than direct infrastructure breaches. In the CircleCI incident of January 2023, customer secrets were exfiltrated from CircleCI's systems — every team using CircleCI had to rotate credentials immediately, regardless of whether their own infrastructure was hardened. The Codecov breach in 2021 involved a compromised uploader script that stole environment variables and credentials from CI pipelines. In both cases, the attack surface was a trusted third-party tool, not the affected team's own infrastructure.

The pattern is consistent: when credentials leave your controlled environment, the perimeter expands to include every system they touch.


2. The Two Models: Hosted vs Local-First

The AI infrastructure assistant market is splitting along a clear architectural line:

Approach              | How credentials are handled                      | Example tools
Hosted AI assistant   | Credentials stored or proxied on vendor platform | AWS Q, Datadog Bits AI, most hosted AI copilots
Local-first assistant | Credentials read from local machine, never leave | Clanker Cloud, Clanker CLI

The hosted model is not inherently reckless — for teams already comfortable with a vendor's security posture, the convenience may outweigh the risk. But it is a deliberate trust decision that should be made explicitly, not by default.

The local-first model draws a hard boundary: the tool runs on your machine, reads credentials from the same secure storage your other local tools use, and makes API calls directly from your machine. The vendor's infrastructure is never in the data path.


3. How Most Hosted AI Assistants Actually Handle Your Credentials

It is worth being specific about what "hosted" means in practice, because the word covers several different architectures.

AWS Q for Infrastructure integrates natively with AWS resources. Queries are processed within AWS's AI infrastructure, and AWS has access to your account data to generate answers. For teams already operating entirely within AWS and comfortable with that boundary, this may be acceptable.

GitHub Copilot Workspace is primarily code-focused and does not perform live infrastructure queries. It does not connect to your running clusters or cloud accounts. This makes it a non-candidate for infrastructure operations, though it is useful for the code side of vibe-coding-to-production workflows.

Datadog Bits AI is a capable AI assistant for infrastructure analysis — but it operates on data that is already flowing into Datadog's hosted platform. The agent installed on each host sends metrics, traces, and logs to Datadog's cloud. Bits AI then reasons over that data. This is an effective architecture for teams already committed to Datadog, but it means your infrastructure topology, service relationships, and cost signals all live on Datadog's servers before the AI ever sees them.

The Linear vs Notion analogy is useful here. Datadog Bits AI is like Notion: the data lives on the vendor's servers, and the AI queries it there. Clanker Cloud is like Linear: a native desktop app that syncs when needed but keeps your working data on your machine.


4. Clanker Cloud's Local-First Model: How It Actually Works

Clanker Cloud is a desktop app for macOS, Windows, and Linux. It reads credentials from the same locations your existing tools already use: ~/.aws/credentials (or environment variables), ~/.kube/config, GCP application default credentials via the cloud SDK, and other standard credential chains. There is no credential import step, no "connect your cloud account" OAuth flow, and no credential storage in Clanker's infrastructure.
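
To make the "standard locations" concrete, here is a minimal sketch that reports which of those local credential sources exist on a machine. It is not Clanker Cloud's implementation; the function name `local_credential_sources` and its return format are hypothetical. The environment variables and paths are the well-known ones used by the AWS CLI, kubectl, and gcloud, and the GCP default path shown applies to Linux and macOS.

```python
import os
from pathlib import Path


def local_credential_sources(env=None, home=None):
    """Return labels for the standard local credential sources that exist.

    Hypothetical helper for illustration; it only checks for presence and
    never reads or prints secret values.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    found = []

    # AWS: environment variables, then the shared credentials file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        found.append("aws:env")
    aws_file = Path(env.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials"))
    if aws_file.is_file():
        found.append(f"aws:{aws_file}")

    # Kubernetes: KUBECONFIG may list several files separated by os.pathsep.
    kube_paths = env.get("KUBECONFIG", str(home / ".kube" / "config")).split(os.pathsep)
    for path in kube_paths:
        if path and Path(path).is_file():
            found.append(f"kube:{path}")

    # GCP: explicit key file, else the application default credentials file
    # written by the cloud SDK (Linux/macOS location).
    gcp_override = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if gcp_override:
        adc = Path(gcp_override)
    else:
        adc = home / ".config" / "gcloud" / "application_default_credentials.json"
    if adc.is_file():
        found.append(f"gcp:{adc}")

    return found
```

Calling `local_credential_sources()` with no arguments inspects the current user's environment and home directory, which is roughly the view any local tool has of your credentials.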

When you ask "show me all EC2 instances in us-east-1," the request goes: your machine → AWS API. Clanker Cloud makes the API call from your machine using your local credentials. The response comes back to your machine. Nothing is proxied through Clanker's servers.

This is the same trust model as running aws ec2 describe-instances in your terminal. The difference is that Clanker Cloud parses the output, correlates it with other data sources (costs, logs, related resources), and surfaces a structured answer in plain English.

The four-step workflow reflects this:

  1. ASK — Query live infrastructure in plain English: incidents, topology, cost, recent changes
  2. INSPECT — Scan resources, trace dependencies, inspect topology without console-hopping
  3. PLAN — Generate a reviewed change plan; see intended impact before anything executes
  4. APPLY — Maker Mode: explicit approval-gated execution; changes only happen when you confirm

For AI DevOps teams that need to move fast without compromising on security controls, this combination — local credentials plus explicit approval gates — gives you the speed of plain-English queries without surrendering custody of your most sensitive assets.
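
The PLAN and APPLY steps above amount to an approval gate: a change is described first and can only run after someone explicitly confirms it. The sketch below illustrates that pattern in general terms. `ChangePlan`, `ApprovalRequired`, and `apply_plan` are hypothetical names, and this is not how Maker Mode is actually implemented.

```python
from dataclasses import dataclass, field


class ApprovalRequired(Exception):
    """Raised when a plan is applied before it has been approved."""


@dataclass
class ChangePlan:
    """A proposed change; its actions are zero-argument callables."""
    description: str
    actions: list = field(default_factory=list)
    approved: bool = False

    def approve(self):
        # In a real tool this would be an explicit operator confirmation.
        self.approved = True


def apply_plan(plan):
    """Run the plan's actions in order, but only if it was approved."""
    if not plan.approved:
        raise ApprovalRequired(f"plan not approved: {plan.description}")
    return [action() for action in plan.actions]
```

The design choice worth noting is that the gate lives in `apply_plan`, not in the caller: code that forgets to ask for approval fails loudly instead of silently making a change.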


5. Plain-English Query Examples with Actual Output

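These examples all run against your infrastructure directly from your machine. Nothing is routed through Clanker's servers to generate these answers; the only other party involved is the AI model provider you configure, or no one at all if you run a local model.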

IAM audit:

Query: "show me all IAM roles with AdministratorAccess"

Clanker Cloud reads your local ~/.aws/credentials, calls the IAM API, and returns a list of roles with AdministratorAccess attached — with direct links to each role in the AWS console.
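
For comparison, the same question can be answered by hand against the IAM API. The sketch below is an assumption about how such a query could be written, not Clanker Cloud's code. It takes any client object exposing `list_entities_for_policy` (for example a boto3 IAM client, if boto3 is installed) and follows the API's `Marker` pagination. It only finds roles with the AWS managed AdministratorAccess policy attached; inline policies granting equivalent access are not detected.

```python
ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def roles_with_admin_access(iam_client):
    """Return sorted names of roles with the managed admin policy attached."""
    roles = []
    kwargs = {}
    while True:
        page = iam_client.list_entities_for_policy(
            PolicyArn=ADMIN_POLICY_ARN,
            EntityFilter="Role",  # ignore users and groups
            **kwargs,
        )
        roles.extend(role["RoleName"] for role in page.get("PolicyRoles", []))
        if not page.get("IsTruncated"):
            return sorted(roles)
        # More results remain: continue from the marker the API returned.
        kwargs = {"Marker": page["Marker"]}
```

With boto3 available, usage would look like `roles_with_admin_access(boto3.client("iam"))`, which resolves credentials from the same local chain the AWS CLI uses.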

S3 public access check:

Query: "which S3 buckets have public access enabled?"

A direct S3 API call from your machine. Results are grouped by account and region, flagged by severity.
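
One way to approximate this check yourself is to look at each bucket's public access block settings. The sketch below assumes you have already collected those settings into a dictionary (for instance from the S3 `GetPublicAccessBlock` API); the function name and input shape are hypothetical. A bucket it flags is a candidate for review, not proof of public exposure, since bucket policies, ACLs, and account-level settings also matter.

```python
PUBLIC_ACCESS_BLOCK_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def buckets_missing_public_access_block(configs):
    """Return sorted bucket names whose public access block is incomplete.

    `configs` maps bucket name to its PublicAccessBlockConfiguration dict,
    or to None when the bucket has no such configuration at all.
    """
    flagged = []
    for name, config in configs.items():
        # A missing config, or any flag that is absent or False, is flagged.
        if not config or not all(config.get(flag) for flag in PUBLIC_ACCESS_BLOCK_FLAGS):
            flagged.append(name)
    return sorted(flagged)
```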

Kubernetes resource limits:

Query: "show me all pods running without resource limits"

Reads your local kubeconfig, queries the Kubernetes API, and returns a table of pods missing CPU or memory limits — exactly the kind of finding that surfaces as a reliability risk in the Deep Research scan.
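
The underlying check is simple enough to sketch. Assuming you have a pod list in the JSON shape returned by `kubectl get pods -A -o json`, the hypothetical function below reports each container that lacks a CPU or memory limit. It illustrates the finding itself, not Clanker Cloud's scanner.

```python
def pods_missing_limits(pod_list):
    """Return (namespace, pod, container, missing_resources) tuples.

    `pod_list` is a dict in the Kubernetes PodList JSON shape.
    """
    findings = []
    for pod in pod_list.get("items", []):
        meta = pod["metadata"]
        for container in pod["spec"].get("containers", []):
            # `resources` or `limits` may be absent or null on a container.
            limits = (container.get("resources") or {}).get("limits") or {}
            missing = [res for res in ("cpu", "memory") if res not in limits]
            if missing:
                findings.append(
                    (meta.get("namespace", "default"), meta["name"], container["name"], missing)
                )
    return findings
```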

Live incident diagnosis:

Query: "why is checkout-api latency spiking?"
Answer: "checkout-api is the hottest synchronous service in this path. 
redis is degraded, so more reads are falling through to orders-postgres. 
orders-api and billing-worker still look healthy, so the blast radius 
is mostly checkout."

This answer comes from reading live infrastructure state — pod metrics, service topology, dependency relationships — all from your machine using your local credentials. The live demo at clankercloud.ai/demo shows this interaction against a real environment.


6. For Regulated Industries: GDPR, HIPAA, and SOC 2 Implications

Security-conscious teams in regulated industries face a more acute version of this problem. The credential trust question is not just about security hygiene — it has direct compliance implications.

GDPR Article 44 restricts the transfer of personal data to third countries. If your infrastructure topology exposes EU user data (service names, database schemas, log formats that reveal PII patterns), sending that topology to a vendor's AI platform may constitute a data transfer requiring appropriate safeguards. A local-first tool removes the tool vendor from that question: infrastructure data is never sent to the vendor's platform.

HIPAA requires covered entities to execute Business Associate Agreements with any vendor that may process protected health information or PHI-adjacent data. Infrastructure telemetry from a healthcare application can qualify. If an AI tool is analyzing your infrastructure and that infrastructure serves PHI, the vendor becomes a business associate. Local-first tools that never receive your infrastructure data are not business associates.

SOC 2 Type II auditors ask specifically where cloud credentials are stored and who has access. An honest answer of "in a third-party SaaS platform, subject to that vendor's controls" requires compensating controls and documentation. "On the engineer's machine, accessed only by the local desktop app" is a cleaner answer.

Local-first does not eliminate your compliance obligations, but it significantly narrows the scope of vendor assessment required for your AI infrastructure tooling.


7. MCP and Agent Security: Local Server, Local Credentials

Clanker Cloud exposes an MCP (Model Context Protocol) server that AI agents can use to query your infrastructure. The security model here is worth understanding precisely.

The MCP server runs on 127.0.0.1 — localhost only, not exposed to the network. An agent running on the same machine can connect to it; a cloud-hosted agent cannot. Credential isolation extends to agent-driven workflows.

clanker mcp --transport http --listen 127.0.0.1:39393
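
If you script the server startup, it can be worth asserting that the listen address really is loopback before launching. The helper below is a small sketch using Python's standard `ipaddress` module; the function name is hypothetical and it is not part of the Clanker CLI.

```python
import ipaddress


def is_loopback_listen_address(listen):
    """Return True if a 'host:port' listen string binds to loopback only."""
    host, _, _port = listen.rpartition(":")
    host = host.strip("[]")  # allow bracketed IPv6 such as [::1]:39393
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal. Treat only the conventional name as loopback;
        # this assumes 'localhost' resolves to a loopback address.
        return host == "localhost"
```

For example, `is_loopback_listen_address("127.0.0.1:39393")` is true, while `"0.0.0.0:39393"` would expose the server on every interface and is rejected.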

The three MCP tools are:

  • clanker_version — returns version info
  • clanker_route_question — routes a plain-English infrastructure question
  • clanker_run_command — executes infrastructure operations (write-gated by Maker Mode)
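
MCP tool invocations are JSON-RPC 2.0 messages using the `tools/call` method. The sketch below builds such a message for one of the tools listed above. The argument name `question` is an assumption for illustration, since the tool's actual input schema is not given here, and a real client would first complete the MCP initialization handshake before sending it.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Serialize an MCP 'tools/call' request as a JSON-RPC 2.0 message."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


# Hypothetical arguments: the real schema should be read from the server's
# tool listing rather than assumed.
example = build_tool_call(
    1,
    "clanker_route_question",
    {"question": "show me all pods running without resource limits"},
)
```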

Read and write operations are separated. Agents — OpenClaw, Claude Code, Hermes, or any MCP-compatible agent — can query your infrastructure freely: enumerate resources, check configurations, correlate events. They cannot make changes without operator confirmation. clanker_run_command requires Maker Mode approval before any write executes; an agent proposes the change, the operator approves or rejects it.

For teams experimenting with autonomous agent workflows, the for-ai-agents documentation covers the full MCP surface, including how to configure OpenClaw's HEARTBEAT.md autonomous check pattern against a live cluster.


8. BYOK: What "Bring Your Own Keys" Actually Means for Cost and Trust

Most hosted AI DevOps tools bundle AI model access into their platform pricing. This creates two problems: you do not know which model is running your queries, and you are paying a markup on top of the underlying model cost.

Clanker Cloud is BYOK by design. You bring your own API keys for whichever AI provider you use — Anthropic, OpenAI, Google, Cohere, or local models via Ollama. Those keys go directly from your machine to the AI provider. Clanker's infrastructure is not in that path either.

Available models as of April 2026:

  • Anthropic: Claude Opus 4.6 (claude-opus-4-6), Claude Sonnet 4.6 (claude-sonnet-4-6)
  • OpenAI: GPT-5.4 Thinking, GPT-5.4 Pro, GPT-5.4 mini; open-weight gpt-oss-120b, gpt-oss-20b
  • Google: Gemini 3.1 Pro (gemini-3.1-pro-preview), Gemini 3 Flash
  • Cohere: Command A (cohere.command-a-03-2025) — 256K context, open-weights
  • Local via Ollama: Gemma 4 (gemma4:31b, gemma4:26b, gemma4:e4b), Hermes (hermes3:70b, hermes3:8b)

The trust implication of BYOK extends beyond cost: when your AI model key goes directly to Anthropic or OpenAI, those providers' data handling policies apply. When it goes through a middleware SaaS platform, you are subject to both the AI provider's policies and the middleware platform's policies.

Using Gemma 4 via Ollama takes this further — the model runs on your machine, your queries never leave your local environment, and the only network traffic is the API calls to your cloud providers. For teams with strict data residency requirements, this is the most defensible configuration.
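
To see why a local model keeps queries on the machine, it helps to look at what a request to Ollama involves. The sketch below targets Ollama's default local endpoint (`http://localhost:11434/api/generate`) using only the standard library. It assumes Ollama is running locally and that the model tag named above has been pulled; it shows the general shape of such a call, not how Clanker Cloud talks to Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port


def build_ollama_request(prompt, model="gemma4:31b"):
    """Build (but do not send) a non-streaming generate request."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(prompt, model="gemma4:31b"):
    """Send the prompt to the local Ollama server and return its answer."""
    with urllib.request.urlopen(build_ollama_request(prompt, model)) as resp:
        return json.loads(resp.read())["response"]
```

The request never targets anything other than `localhost`, which is the property that matters for data residency.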

Full documentation on model configuration and BYOK setup is at docs.clankercloud.ai.


9. When Hosted Tools Make Sense

The case for local-first is strong, but not universal.

AWS Q makes sense for teams fully committed to the AWS ecosystem, already using AWS-native security controls (SCPs, IAM Identity Center, CloudTrail), and comfortable with AWS's shared responsibility model.

Datadog Bits AI makes sense when your metrics, traces, and logs are already centralized in Datadog. Bits AI adds an AI layer on top of that investment without introducing new data flows — the credential custody question is settled by the existing Datadog agent deployment.

GitHub Copilot makes sense for code-first workflows where live infrastructure queries are not needed. Teams on fully managed platforms (Heroku, Render, Railway) may not require direct infra access at all.

For teams starting fresh in 2026, or re-evaluating their AI tooling stack, local-first is the lower-risk default.


10. FAQ

Does Clanker Cloud store my AWS credentials anywhere?

No. Clanker Cloud reads credentials from your local credential chain — ~/.aws/credentials, environment variables, or the AWS SDK credential provider chain — the same sources used by the AWS CLI and other local tools. Credentials are never transmitted to Clanker's infrastructure.

What happens to my AI model API keys?

Your AI model keys go directly from your machine to the AI provider (Anthropic, OpenAI, Google, etc.). Clanker Cloud does not proxy these requests. You can verify this by inspecting network traffic from the app — all AI API calls will resolve directly to the provider's endpoints.

Can Clanker Cloud make changes to my infrastructure without my approval?

No. All write operations require explicit Maker Mode approval. The tool generates a reviewed plan showing the intended changes before anything executes. You approve each operation individually. Agents using the MCP interface are subject to the same approval gate — clanker_run_command will not execute writes without operator confirmation.

Is the local-first model compatible with team workflows?

Yes. Each team member runs the desktop app on their machine with their own credentials. This is consistent with standard IAM best practices: each operator has individual credentials with appropriate permissions, and there is no shared credential that lives in a third-party platform. For team-level configuration and access management, see the account portal.


Get Started

Clanker Cloud is in free beta. One-minute setup: download the desktop app, point it at your existing credential files, and start querying your infrastructure in plain English.

Download and create your account — no credential import, no agent rollout, no vendor access to your cloud keys.

For team deployment patterns, see AI DevOps for teams. For MCP integration and agent security patterns, see the for-agents documentation. For the path from prototype to production-grade infrastructure, see vibe-coding-to-production.

Questions are answered in the FAQ.

Next step

Run a local security and drift review

Use Clanker Cloud to inspect live cloud and Kubernetes state with local credentials, then review findings before any infrastructure change runs.
