14 min read · 2026-04-24 · Last updated 2026-07-14 · Clanker Cloud Editorial Team

AI Infrastructure Assistant with Local Credentials: Why It Matters and How Clanker Cloud Delivers

How local desktop credential custody differs from direct BYOK and hosted AI infrastructure paths, and what each boundary means for security reviews.


The category of AI infrastructure assistant is maturing fast. Tools can now answer "why is checkout latency spiking?", enumerate all IAM roles with administrator access, and generate a reviewed change plan before anything touches production. The capability gap between these tools has narrowed considerably. The trust gap has not.

Every AI infrastructure assistant makes one fundamental choice: where do your credentials go when the tool needs to run a query? The answer to that question determines your security posture, your compliance exposure, and whether a vendor breach becomes your breach.

This article compares hosted and local desktop credential paths and explains how Clanker Cloud's normal desktop workflow differs from its direct BYOK and hosted features. "Local credentials" is a custody claim about raw cloud credentials, not a promise that every prompt, result, file, or collaboration feature stays on the machine.

1. The Credential Trust Problem: Why It Matters in 2026

AWS IAM keys, kubeconfig files, and GCP service account JSON are among the most sensitive assets a company holds. Whoever controls those credentials can enumerate your infrastructure, exfiltrate data, escalate privileges, and delete resources. This is not a theoretical concern — it is the core of most cloud security incidents.

Most hosted AI DevOps tools work in one of two ways: they install an agent on your infrastructure that sends telemetry to the vendor's cloud, or they ask you to paste credentials directly into a SaaS platform. Either way, your credentials or the data they unlock lives in someone else's environment.

When you grant an AI assistant access to your cloud accounts, you are trusting more than the tool. You are trusting the vendor's security posture, their breach response, their data retention policies, and the access controls governing which employees can view your data — plus every sub-processor in their supply chain.

High-profile credential leaks frequently trace back to third-party tooling rather than direct infrastructure breaches. In the CircleCI incident disclosed in January 2023, customer secrets were exfiltrated from CircleCI's systems, and CircleCI advised every customer to rotate the secrets stored on the platform, regardless of how well their own infrastructure was hardened. The Codecov breach in 2021 involved a compromised Bash Uploader script that stole environment variables and credentials from CI pipelines. In both cases, the attack surface was a trusted third-party tool, not the affected team's own infrastructure.

The pattern is consistent: when credentials leave your controlled environment, the perimeter expands to include every system they touch.

2. The Two Models: Hosted vs Local-First

The AI infrastructure assistant market is splitting along a clear architectural line:

Approach	How credentials are handled	Example tools
Hosted AI assistant	Credentials stored or proxied on vendor platform	AWS Q, Datadog Bits AI, most hosted AI copilots
Local desktop assistant	Raw credentials read from the local machine; other content follows the selected model and service route	Clanker Cloud desktop, Clanker CLI

The hosted model is not inherently reckless — for teams already comfortable with a vendor's security posture, the convenience may outweigh the risk. But it is a deliberate trust decision that should be made explicitly, not by default.

The local desktop model draws a hard boundary around raw credentials and direct cloud-provider calls: the tool reads the same local credential chain as other desktop tools and makes those calls from your machine. That boundary does not automatically cover model inference, sandboxes, voice, account services, or opt-in remote control; each has its own data path.

3. How Most Hosted AI Assistants Actually Handle Your Credentials

It is worth being specific about what "hosted" means in practice, because the word covers several different architectures.

AWS Q for Infrastructure integrates natively with AWS resources. Queries are processed within AWS's AI infrastructure, and AWS has access to your account data to generate answers. For teams already operating entirely within AWS and comfortable with that boundary, this may be acceptable.

GitHub Copilot Workspace is primarily code-focused: it does not connect to your running clusters or cloud accounts, so it cannot perform live infrastructure queries. That rules it out for infrastructure operations, though it is useful for the code side of vibe-coding-to-production workflows.

Datadog Bits AI is a capable AI assistant for infrastructure analysis — but it operates on data that is already flowing into Datadog's hosted platform. The agent installed on each host sends metrics, traces, and logs to Datadog's cloud. Bits AI then reasons over that data. This is an effective architecture for teams already committed to Datadog, but it means your infrastructure topology, service relationships, and cost signals all live on Datadog's servers before the AI ever sees them.

The key comparison is therefore not simply desktop versus browser. It is the complete route for each data category: raw credentials, infrastructure evidence, prompts, files, audio, remote-control messages, and account metadata can have different destinations even within one product.

4. Clanker Cloud's Local-First Model: How It Actually Works

Clanker Cloud is a desktop app for macOS, Windows, and Linux. In the normal desktop provider workflow, it reads credentials from the same locations your existing tools already use: ~/.aws/credentials (or environment variables), ~/.kube/config, GCP application default credentials via the cloud SDK, and other standard credential chains. There is no credential import step for that workflow, and raw cloud credentials are not stored in Clanker's infrastructure.
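To make "the same locations your existing tools already use" concrete, here is a minimal sketch that reports which standard credential sources exist on a machine without reading any secret values. It is not Clanker Cloud's implementation; the function name `find_local_credential_sources` is hypothetical, and the paths are the usual Linux and macOS defaults (Windows uses different locations for gcloud).

```python
import os
from pathlib import Path


def find_local_credential_sources(home: Path, env: dict) -> dict:
    """Report which standard credential sources exist. Never reads secret values."""
    found = {"aws": [], "kubernetes": [], "gcp": []}

    # AWS: environment variables are checked before the shared credentials file.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        found["aws"].append("env:AWS_ACCESS_KEY_ID")
    aws_file = Path(env.get("AWS_SHARED_CREDENTIALS_FILE", home / ".aws" / "credentials"))
    if aws_file.is_file():
        found["aws"].append(str(aws_file))

    # Kubernetes: KUBECONFIG may list several files; the default is ~/.kube/config.
    kube_paths = env.get("KUBECONFIG", str(home / ".kube" / "config")).split(os.pathsep)
    found["kubernetes"] += [p for p in kube_paths if p and Path(p).is_file()]

    # GCP: an explicit key file first, then gcloud's application default credentials.
    gcp_explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if gcp_explicit and Path(gcp_explicit).is_file():
        found["gcp"].append(gcp_explicit)
    adc = home / ".config" / "gcloud" / "application_default_credentials.json"
    if adc.is_file():
        found["gcp"].append(str(adc))
    return found


if __name__ == "__main__":
    report = find_local_credential_sources(Path.home(), dict(os.environ))
    for provider, sources in report.items():
        print(provider, sources or "none found")
```

Running a check like this before a security review shows exactly which local files and variables any desktop tool on the machine could pick up.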

When you ask "show me all EC2 instances in us-east-1," the cloud-provider request goes: your machine → AWS API. Clanker Cloud makes that API call from your machine using your local credentials, and the response returns to your machine. The AWS credential and provider API call are not proxied through Clanker's servers. If a model is asked to analyze selected results, the subsequent route depends on whether you configured local Ollama, direct desktop BYOK, or Standard hosted inference.

This is the same trust model as running aws ec2 describe-instances in your terminal. The difference is that Clanker Cloud parses the output, correlates it with other data sources (costs, logs, related resources), and surfaces a structured answer in plain English.

Boundaries outside the normal desktop provider workflow

Direct desktop BYOK: selected prompts and infrastructure evidence go directly to the chosen model provider under that provider's terms. With local Ollama, the model prompt can remain on the machine.
Standard hosted inference: prompt and selected context pass through the Clanker Cloud Google Cloud control plane to the Google Gemini Developer API.
Standard sandboxes and hosted voice: commands, files, runtime data, audio, transcripts, or generated speech use shared global Cloudflare services as applicable. Account-region preference is not a Standard residency guarantee.
Opt-in web-to-desktop remote control: instructions, responses, status, device identifiers, and limited diagnostics pass through the Clanker Cloud portal and a Google Cloud relay. This path is not end-to-end encrypted, and protected and sovereign accounts currently reject it.
Account services: authentication, account, security, and audit metadata use the Google Cloud control plane.
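One way to keep these boundaries straight during a review is to write them down as data. The sketch below encodes the routes listed above in a small lookup and reports which data categories leave the machine for a given configuration. The mode and feature keys (`local_ollama`, `remote_control`, and so on) and the function names are hypothetical labels for this illustration, not product settings.

```python
LOCAL = "local machine"

# Where prompts and selected evidence go, per model route (from the list above).
PROMPT_ROUTE = {
    "local_ollama": LOCAL,
    "direct_byok": "selected model provider (direct)",
    "standard_hosted": "Clanker Cloud Google Cloud control plane -> Google Gemini Developer API",
}

# Where optional hosted features send their data (from the list above).
FEATURE_ROUTE = {
    "sandbox": "shared global Cloudflare services",
    "hosted_voice": "shared global Cloudflare services",
    "remote_control": "Clanker Cloud portal and Google Cloud relay (not end-to-end encrypted)",
    "account_services": "Google Cloud control plane",
}


def data_routes(model_mode: str, enabled_features: list) -> dict:
    """Map each data category to its destination for one configuration."""
    routes = {
        # Raw credentials are read locally in the normal desktop provider workflow.
        "raw_cloud_credentials": LOCAL,
        "prompt_and_selected_evidence": PROMPT_ROUTE[model_mode],
    }
    for feature in enabled_features:
        routes[feature] = FEATURE_ROUTE[feature]
    return routes


def off_machine(routes: dict) -> list:
    """Return the data categories whose destination is not the local machine."""
    return sorted(name for name, dest in routes.items() if dest != LOCAL)


if __name__ == "__main__":
    config = data_routes("standard_hosted", ["remote_control", "account_services"])
    for category in off_machine(config):
        print(f"{category}: {config[category]}")
```

The point of the table form is that "local credentials" shows up as one row among several, which matches the custody claim made at the start of this article.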

Business and Enterprise purchase or region selection starts onboarding; it does not activate a protected environment. Regulated-data, residency, DPA, BAA, CJIS, and similar commitments require signed terms and a separately verified active environment. Review the current Security, Privacy, and Subprocessor and Transfer Register pages before choosing a mode.

The four-step desktop workflow keeps credential use local and makes every change explicit:

ASK — Query live infrastructure in plain English: incidents, topology, cost, recent changes
INSPECT — Scan resources, trace dependencies, inspect topology without console-hopping
PLAN — Generate a reviewed change plan; see intended impact before anything executes
APPLY — Maker Mode: explicit approval-gated execution; changes only happen when you confirm

For AI DevOps teams that need to move fast without compromising on security controls, this combination — local credentials plus explicit approval gates — gives you the speed of plain-English queries without surrendering custody of your most sensitive assets.

5. Plain-English Query Examples with Actual Output

The provider API calls in these examples run directly from your machine and do not expose raw provider credentials to Clanker. Generating or summarizing an answer can still send selected results to the configured model route: local Ollama, a direct BYOK provider, or Standard hosted inference.

IAM audit:

Query: "show me all IAM roles with AdministratorAccess"

Clanker Cloud reads your local ~/.aws/credentials, calls the IAM API, and returns a list of roles with AdministratorAccess attached — with direct links to each role in the AWS console.
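As an illustration of the filtering step, the sketch below takes data shaped like the `AttachedPolicies` list returned by the IAM `ListAttachedRolePolicies` API and picks out roles with the AWS managed AdministratorAccess policy. The function name and the sample roles are hypothetical, and the API calls themselves are omitted. This is not Clanker Cloud's code, and it only catches the managed policy attached directly: inline or custom policies that grant equivalent access need a separate check.

```python
# ARN of the AWS managed AdministratorAccess policy.
ADMIN_POLICY_ARN = "arn:aws:iam::aws:policy/AdministratorAccess"


def roles_with_admin_access(attached_by_role: dict) -> list:
    """Return role names that have the managed AdministratorAccess policy attached.

    attached_by_role maps a role name to its AttachedPolicies list, where each
    entry is a dict with "PolicyName" and "PolicyArn" keys.
    """
    return sorted(
        role
        for role, policies in attached_by_role.items()
        if any(p.get("PolicyArn") == ADMIN_POLICY_ARN for p in policies)
    )


# Hypothetical sample data standing in for real API responses.
sample = {
    "break-glass-admin": [
        {"PolicyName": "AdministratorAccess", "PolicyArn": ADMIN_POLICY_ARN},
    ],
    "ci-deployer": [
        {"PolicyName": "AmazonS3ReadOnlyAccess",
         "PolicyArn": "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"},
    ],
    "unused-role": [],
}

if __name__ == "__main__":
    print(roles_with_admin_access(sample))
```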

S3 public access check:

Query: "which S3 buckets have public access enabled?"

A direct S3 API call from your machine. Results are grouped by account and region, flagged by severity.
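A rough sketch of what such a check might evaluate: S3's public access block configuration has four boolean flags, and a bucket with any of them disabled (or with no configuration at all) deserves a closer look. The function name and sample buckets are hypothetical, and fetching the configuration is left out. A disabled flag does not by itself prove a bucket is public, since account-level settings, bucket policies, and ACLs also matter, so treat the output as a review list rather than a verdict.

```python
from typing import Optional

# The four flags in an S3 PublicAccessBlockConfiguration.
PAB_FLAGS = (
    "BlockPublicAcls",
    "IgnorePublicAcls",
    "BlockPublicPolicy",
    "RestrictPublicBuckets",
)


def buckets_needing_review(pab_by_bucket: dict) -> dict:
    """Map bucket name -> list of public-access-block flags that are off or absent.

    A value of None means the bucket has no public access block configuration.
    """
    findings = {}
    for bucket, config in pab_by_bucket.items():
        settings: dict = config or {}
        disabled = [flag for flag in PAB_FLAGS if not settings.get(flag, False)]
        if disabled:
            findings[bucket] = disabled
    return findings


# Hypothetical sample data standing in for real API responses.
sample_buckets: dict = {
    "logs-archive": dict.fromkeys(PAB_FLAGS, True),
    "marketing-assets": None,
    "legacy-uploads": {**dict.fromkeys(PAB_FLAGS, True), "BlockPublicPolicy": False},
}

if __name__ == "__main__":
    for name, flags in buckets_needing_review(sample_buckets).items():
        print(name, "->", ", ".join(flags))
```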

Kubernetes resource limits:

Query: "show me all pods running without resource limits"

Reads your local kubeconfig, queries the Kubernetes API, and returns a table of pods missing CPU or memory limits — exactly the kind of finding that surfaces as a reliability risk in the Deep Research scan.
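The underlying check is simple enough to sketch. Assuming a pod list in the JSON shape produced by `kubectl get pods -A -o json`, the hypothetical function below lists containers that lack a CPU or memory limit. It ignores init containers and any defaults a LimitRange might inject, and it is an illustration rather than the product's scan logic.

```python
import json


def containers_missing_limits(pod_list: dict) -> list:
    """Return (namespace, pod, container, missing_limits) for each offending container."""
    findings = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        for container in pod.get("spec", {}).get("containers", []):
            limits = (container.get("resources") or {}).get("limits") or {}
            missing = [res for res in ("cpu", "memory") if res not in limits]
            if missing:
                findings.append(
                    (meta.get("namespace", "default"), meta.get("name", ""),
                     container.get("name", ""), missing)
                )
    return findings


# Hypothetical sample standing in for real cluster output.
SAMPLE_JSON = """
{"items": [
  {"metadata": {"namespace": "shop", "name": "checkout-api-1"},
   "spec": {"containers": [
     {"name": "app", "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
     {"name": "sidecar", "resources": {"limits": {"cpu": "100m"}}}]}},
  {"metadata": {"namespace": "shop", "name": "billing-worker-1"},
   "spec": {"containers": [{"name": "worker"}]}}
]}
"""

if __name__ == "__main__":
    for namespace, pod, container, missing in containers_missing_limits(json.loads(SAMPLE_JSON)):
        print(f"{namespace}/{pod} [{container}] missing: {', '.join(missing)}")
```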

Live incident diagnosis:

Query: "why is checkout-api latency spiking?"
Answer: "checkout-api is the hottest synchronous service in this path. 
redis is degraded, so more reads are falling through to orders-postgres. 
orders-api and billing-worker still look healthy, so the blast radius 
is mostly checkout."

This answer starts by reading live infrastructure state — pod metrics, service topology, dependency relationships — from your machine using your local credentials. Selected evidence is then processed according to the configured model route. The live demo at clankercloud.ai/demo shows the interaction; it should not be treated as proof that every deployment uses the same data boundary.

6. For Regulated Industries: GDPR, HIPAA, and SOC 2 Implications

Security-conscious teams in regulated industries face a more acute version of this problem. The credential trust question is not just about security hygiene — it has direct compliance implications.

GDPR Article 44 regulates transfers of personal data to third countries. If infrastructure evidence contains personal data, sending it to a cloud-model provider or hosted Clanker Cloud feature may require a lawful transfer mechanism and documented processor terms. A local-only model route can reduce transfers, but local credential custody alone does not remove GDPR obligations or establish residency.

HIPAA requires covered entities and business associates to assess whether a service creates, receives, maintains, or transmits protected health information on their behalf and to execute a Business Associate Agreement when required. That role depends on the facts and data flow; it should not be inferred solely from a "local-first" label. Local-only processing can reduce vendor handling, but it does not remove the customer's HIPAA duties. Do not submit PHI to Clanker Cloud hosted features unless a BAA and an approved protected environment are both in effect; Standard is not that environment.

SOC 2 Type II auditors commonly examine credential storage, access, change approval, logging, retention, and vendor management. Keeping raw credentials on the operator's machine can simplify one part of that control narrative, but hosted model, relay, sandbox, voice, and account paths remain in scope for the relevant risk assessment and evidence.

Local-only configurations may narrow parts of a vendor assessment. They do not eliminate legal, security, retention, access-control, incident-response, or transfer obligations, and no product mode creates compliance by itself.

7. MCP and Agent Security: Local Server, Local Credentials

Clanker Cloud exposes an MCP (Model Context Protocol) server that AI agents can use to query your infrastructure. The security model here is worth understanding precisely.

The MCP server runs on 127.0.0.1 — localhost only, not exposed to the network. An agent running on the same machine can connect to it; a cloud-hosted agent cannot. Credential isolation extends to agent-driven workflows.

clanker mcp --transport http --listen 127.0.0.1:39393
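Because the isolation claim rests on the listen address, it can be worth guarding against an accidental change to something like 0.0.0.0. The helper below is a hypothetical pre-flight check, not part of the Clanker CLI: it parses a host:port string of the kind passed to --listen and reports whether the host is a loopback address.

```python
import ipaddress


def is_loopback_listen_address(listen: str) -> bool:
    """Return True if a "host:port" listen string binds only to loopback."""
    host, _, port = listen.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {listen!r}")
    host = host.strip("[]")  # allow bracketed IPv6 such as [::1]:39393
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as not provably local.
        return False


if __name__ == "__main__":
    for candidate in ("127.0.0.1:39393", "[::1]:39393", "0.0.0.0:39393"):
        print(candidate, is_loopback_listen_address(candidate))
```

A wrapper script could refuse to start the server when this returns False, which keeps the localhost-only assumption explicit instead of implied.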

The three MCP tools are:

clanker_version — returns version info
clanker_route_question — routes a plain-English infrastructure question
clanker_run_command — executes infrastructure operations (write-gated by Maker Mode)

Read and write operations are separated. Agents — OpenClaw, Claude Code, Hermes, or any MCP-compatible agent — can query your infrastructure freely: enumerate resources, check configurations, correlate events. They cannot make changes without operator confirmation. clanker_run_command requires Maker Mode approval before any write executes; an agent proposes the change, the operator approves or rejects it.
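The read/write separation can be pictured as a small dispatcher. The sketch below uses the three tool names from the list above, but everything else (the `dispatch` function, the `run` and `ask_operator` callbacks, the return strings) is hypothetical and says nothing about how Clanker Cloud implements Maker Mode. As a simplification, it asks for approval on every clanker_run_command call rather than only on writes.

```python
from typing import Callable

READ_ONLY_TOOLS = {"clanker_version", "clanker_route_question"}
APPROVAL_GATED_TOOLS = {"clanker_run_command"}


def dispatch(
    tool: str,
    argument: str,
    run: Callable[[str, str], str],
    ask_operator: Callable[[str], bool],
) -> str:
    """Run read-only tools directly; require operator approval for gated tools."""
    if tool in READ_ONLY_TOOLS:
        return run(tool, argument)
    if tool in APPROVAL_GATED_TOOLS:
        # The agent only proposes; a human decides whether anything executes.
        if not ask_operator(f"Agent requests {tool}: {argument!r}. Approve?"):
            return "rejected by operator"
        return run(tool, argument)
    raise ValueError(f"unknown tool: {tool}")


if __name__ == "__main__":
    def fake_run(tool: str, argument: str) -> str:
        return f"ran {tool}"

    def deny_all(message: str) -> bool:
        return False

    print(dispatch("clanker_route_question", "list pods", fake_run, deny_all))
    print(dispatch("clanker_run_command", "scale checkout-api", fake_run, deny_all))
```

The design choice worth noting is that the approval callback lives outside the agent's control: the agent can call the tool as often as it likes, but nothing gated runs unless the operator-side function returns True.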

For teams experimenting with autonomous agent workflows, the for-ai-agents documentation covers the full MCP surface, including how to configure OpenClaw's HEARTBEAT.md autonomous check pattern against a live cluster.

8. BYOK: What "Bring Your Own Keys" Actually Means for Cost and Trust

Most hosted AI DevOps tools bundle AI model access into their platform pricing. This creates two problems: you may not know which model is answering your queries, and you may be paying a markup on top of the underlying model cost.

In direct desktop BYOK mode, you bring your own key for the selected AI provider. That key and the selected prompt and context go directly from your machine to that provider; Clanker's hosted inference path is not used for that request. Standard hosted inference is a separate mode that passes prompt and selected context through the Clanker Cloud Google Cloud control plane to the Google Gemini Developer API.

Available models as of April 2026:

Anthropic: Claude Opus 4.6 (claude-opus-4-6), Claude Sonnet 4.6 (claude-sonnet-4-6)
OpenAI: GPT-5.4 Thinking, GPT-5.4 Pro, GPT-5.4 mini; open-weight gpt-oss-120b, gpt-oss-20b
Google: Gemini 3.1 Pro (gemini-3.1-pro-preview), Gemini 3 Flash
Cohere: Command A (cohere.command-a-03-2025) — 256K context, open-weights
Local via Ollama: Gemma 4 (gemma4:31b, gemma4:26b, gemma4:e4b), Hermes (hermes3:70b, hermes3:8b)

The trust implication of BYOK extends beyond cost: when your AI model key goes directly to Anthropic or OpenAI, those providers' data handling policies apply. When it goes through a middleware SaaS platform, you are subject to both the AI provider's policies and the middleware platform's policies.

Using a supported model through local Ollama can keep the model prompt on your machine. Cloud-provider calls still go to those providers, and ordinary account, security, download, or update traffic may still use Clanker Cloud services. Teams with strict residency requirements must validate the complete route and contract, not only the model endpoint.
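For reference, a local Ollama call looks roughly like the sketch below. It builds (but does not send) a request to Ollama's default local endpoint, http://localhost:11434/api/generate, and refuses any URL whose host is not local. The function name and the host check are illustrative assumptions rather than Clanker Cloud behavior; the model tag is one of those listed above.

```python
import json
import urllib.request
from urllib.parse import urlparse

# Ollama's default local API endpoint for single-shot generation.
OLLAMA_URL = "http://localhost:11434/api/generate"
LOCAL_HOSTS = ("localhost", "127.0.0.1", "::1")


def build_ollama_request(model: str, prompt: str, url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a POST request for a local Ollama server, rejecting non-local hosts."""
    host = urlparse(url).hostname
    if host not in LOCAL_HOSTS:
        raise ValueError(f"refusing non-local model endpoint: {host}")
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_ollama_request("hermes3:8b", "Summarize these pod findings: ...")
    print(request.get_method(), request.full_url)
    # To actually send it (requires a running Ollama server with the model pulled):
    # with urllib.request.urlopen(request) as response:
    #     print(json.loads(response.read())["response"])
```

A check like this covers only the model endpoint. As the paragraph above notes, provider calls and account traffic follow their own routes and need separate validation.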

Full documentation on model configuration and BYOK setup is at docs.clankercloud.ai.

9. When Hosted Tools Make Sense

The case for local-first is strong, but not universal.

AWS Q makes sense for teams fully committed to the AWS ecosystem, already using AWS-native security controls (SCPs, IAM Identity Center, CloudTrail), and comfortable with AWS's shared responsibility model.

Datadog Bits AI makes sense when your metrics, traces, and logs are already centralized in Datadog. Bits AI adds an AI layer on top of that investment without introducing new data flows — the credential custody question is settled by the existing Datadog agent deployment.

GitHub Copilot makes sense for code-first workflows where live infrastructure queries are not needed. Teams on fully managed platforms (Heroku, Render, Railway) may not require direct infra access at all.

For teams starting fresh in 2026, or re-evaluating their AI tooling stack, local-first is the lower-risk default for credential custody, provided the model and feature routes are reviewed as well.

10. FAQ

Does Clanker Cloud store my AWS credentials anywhere?

In the normal desktop provider workflow, Clanker Cloud reads credentials from your local credential chain — ~/.aws/credentials, environment variables, or the AWS SDK credential provider chain — the same sources used by the AWS CLI and other local tools. Raw cloud credentials are not transmitted to Clanker's infrastructure. This does not mean that selected results or other hosted-feature data remain local.

What happens to my AI model API keys?

In direct desktop BYOK mode, the model key and request go directly from your machine to the selected provider. Local Ollama can keep the model prompt on-device. Standard hosted inference is different: it uses the Clanker Cloud Google Cloud control plane and Google Gemini Developer API. Verify the selected mode, provider terms, and network path before sending sensitive content.

Can Clanker Cloud make changes to my infrastructure without my approval?

No. All write operations require explicit Maker Mode approval. The tool generates a reviewed plan showing the intended changes before anything executes. You approve each operation individually. Agents using the MCP interface are subject to the same approval gate — clanker_run_command will not execute writes without operator confirmation.

Is the local-first model compatible with team workflows?

Yes. Each team member runs the desktop app on their machine with their own credentials. This is consistent with standard IAM best practices: each operator has individual credentials with appropriate permissions, and there is no shared credential that lives in a third-party platform. For team-level configuration and access management, see the account portal.

Get Started

Pricing note (July 14, 2026): The free-beta reference in this April 2026 article is historical. See current Pricing. A Business or Enterprise purchase starts onboarding and does not activate a protected environment.

Download the desktop app, use your existing local credential chain, and start querying your infrastructure in plain English after reviewing the data path for the model and features you enable.

Download and create your account — no credential import or agent rollout is required for the normal desktop provider workflow; hosted service boundaries are described above.

For team deployment patterns, see AI DevOps for teams. For MCP integration and agent security patterns, see the for-agents documentation. For the path from prototype to production-grade infrastructure, see vibe-coding-to-production.

Questions are answered in the FAQ.

Next step

Give your agent live infrastructure context

Download Clanker Cloud, expose the local MCP surface, and let coding agents work from current cloud, Kubernetes, GitHub, and cost state instead of guesses.


Byline

Clanker Cloud Editorial Team writes about local-first infrastructure, multi-cloud operations, AI-assisted incident response, and safer workflows for builders and infrastructure teams.