Natural Language Kubernetes Management: The Full Day-2 Workflow

Manage Kubernetes clusters in plain English — inspect resources, analyze costs, check policy, and apply changes with approval gates.

Managing a Kubernetes cluster after initial deployment — day-2 operations — is where kubectl fluency becomes a persistent tax on engineering time. You need the right flags, resource types, field selectors, and jq expressions to get a useful answer from the API server. For a question as simple as "which pods are running without resource limits?" you would normally write:

kubectl get pods --all-namespaces -o json | jq '
  .items[] |
  select(any(.spec.containers[]; .resources.limits == null)) |
  {name: .metadata.name, namespace: .metadata.namespace}'

That is reasonable to ask of a senior engineer. It is slow to construct from memory during an incident, and unrealistic to expect from a product manager trying to understand cluster resource hygiene.

Natural language Kubernetes management changes this. Not by replacing kubectl — it runs kubectl under the hood — but by adding a plain-English interface on top of your existing cluster context. You ask; the tool translates, executes against the live API, and returns a structured answer.

This article covers the full day-2 management lifecycle: inspection, cost visibility, scaling state, policy posture, topology, and controlled changes. For troubleshooting specifically (OOMKilled, CrashLoopBackOff, ImagePullBackOff), see Article 117. This article covers the management workflow that happens before and around incidents.


What Natural Language K8s Management Actually Means

Natural language Kubernetes management is not a chatbot that knows Kubernetes documentation. It is a tool that holds your kubeconfig, queries your live cluster, and returns answers grounded in actual API state — not training data.

Clanker Cloud reads your local ~/.kube/config. When you ask a question, it routes that question to the appropriate kubectl commands or K8s API calls, executes them against your real cluster, and returns a readable answer. Every response is backed by a live API call — no cached state, no guessing.

The four-step workflow applies directly to K8s management:

  1. ASK — query live cluster state in plain English
  2. INSPECT — scan resources, trace dependencies, surface topology
  3. PLAN — generate a reviewed change plan before any modification
  4. APPLY — Maker Mode executes only after explicit operator approval

Queries are instant and non-destructive. Changes require explicit approval. Natural language K8s management in Clanker Cloud is not autonomous — it is a precision interface for faster, clearer human decisions.

The guide to an AI DevOps workflow for teams covers how this fits into a broader practice.


Inspection Queries: Understanding What Is Running

The first category of natural language K8s management is pure inspection — understanding what the cluster is actually running right now. These queries have no side effects.

Resource hygiene queries:

  • "show me all pods with CPU above 80% in namespace production"
  • "which pods are running without resource limits?"
  • "show me all services that have no health checks configured"
  • "list all ConfigMaps that are not referenced by any pod"
  • "which deployments have been running for more than 60 days without a rollout?"

The last query is particularly useful for identifying deployment drift — services running from an image that predates your current security or dependency baselines. A 60-day-old deployment in a fast-moving codebase is often a forgotten service nobody owns.
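
One way the 60-day check could be approximated is by reading the Progressing condition on each Deployment, whose lastUpdateTime moves when a rollout completes. This is a minimal sketch under that assumption, not Clanker Cloud's actual logic; the deployment names and timestamps are hypothetical, and the input is shaped like the output of kubectl get deployments -o json.

```python
from datetime import datetime, timedelta, timezone

def last_rollout_time(deployment):
    """Return the lastUpdateTime of the Progressing condition, or None."""
    for cond in deployment.get("status", {}).get("conditions", []):
        if cond.get("type") == "Progressing":
            stamp = datetime.strptime(cond["lastUpdateTime"], "%Y-%m-%dT%H:%M:%SZ")
            return stamp.replace(tzinfo=timezone.utc)
    return None

def stale_deployments(deployment_list, now, max_age_days=60):
    """Names of deployments whose last rollout is older than max_age_days."""
    cutoff = now - timedelta(days=max_age_days)
    stale = []
    for dep in deployment_list.get("items", []):
        rolled = last_rollout_time(dep)
        if rolled is not None and rolled < cutoff:
            stale.append(dep["metadata"]["name"])
    return stale

# Hypothetical sample data, not taken from a real cluster.
SAMPLE_DEPLOYMENTS = {"items": [
    {"metadata": {"name": "legacy-reports"},
     "status": {"conditions": [
         {"type": "Progressing", "lastUpdateTime": "2025-01-10T08:00:00Z"}]}},
    {"metadata": {"name": "checkout-api"},
     "status": {"conditions": [
         {"type": "Progressing", "lastUpdateTime": "2025-05-20T08:00:00Z"}]}},
]}

NOW = datetime(2025, 6, 1, tzinfo=timezone.utc)
print(stale_deployments(SAMPLE_DEPLOYMENTS, NOW))
```

Passing the current time in as a parameter keeps the function deterministic, which makes the threshold easy to test.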

Clanker Cloud translates these into combinations of kubectl get, kubectl describe, kubectl top (for live CPU and memory figures, which requires metrics-server in the cluster), and field-selector expressions. You get back a structured list: pod names, namespaces, current CPU usage, and whether each resource is above your threshold.
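
As an illustration of what sits behind a hygiene query such as "which pods are running without resource limits?", the filter below walks a pod list shaped like kubectl get pods --all-namespaces -o json. It is a sketch, not the tool's implementation, and the pod names in the sample are hypothetical.

```python
def pods_without_limits(pod_list):
    """Return (namespace, name) for pods where any container has no limits."""
    missing = []
    for pod in pod_list.get("items", []):
        containers = pod.get("spec", {}).get("containers", [])
        # A pod is reported once, even if several containers lack limits.
        if any(not c.get("resources", {}).get("limits") for c in containers):
            meta = pod["metadata"]
            missing.append((meta.get("namespace", "default"), meta["name"]))
    return missing

# Hypothetical sample data, not taken from a real cluster.
SAMPLE_PODS = {"items": [
    {"metadata": {"name": "checkout-api-0", "namespace": "production"},
     "spec": {"containers": [
         {"name": "app", "resources": {"limits": {"cpu": "500m"}}}]}},
    {"metadata": {"name": "billing-worker-0", "namespace": "production"},
     "spec": {"containers": [
         {"name": "app", "resources": {}},
         {"name": "sidecar", "resources": {"limits": {"memory": "64Mi"}}}]}},
]}

print(pods_without_limits(SAMPLE_PODS))
```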


Cost Queries: Understanding What It Costs

Cost visibility in Kubernetes is notoriously difficult. Cloud provider billing is at the node level, not the pod level. To understand per-namespace or per-service cost, you need to correlate node capacity, resource requests, actual usage, and per-instance pricing — a calculation most teams simply do not do on a regular basis.

Natural language cost queries make this routine:

  • "which namespaces are spending the most this month?"
  • "show me all idle deployments — pods with less than 5% CPU average"
  • "which PersistentVolumeClaims are not mounted by any pod?"
  • "what's the monthly cost breakdown by namespace?"

That last query, run against a representative production cluster, might return:

production:   $356/month
ml-jobs:      $187/month
staging:       $89/month
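
A breakdown like this could be derived in several ways. The sketch below uses one simple, assumed method: splitting a known monthly node bill across namespaces in proportion to CPU requests. The node cost, core count, and pod data are hypothetical, and a real allocation would also weigh memory, storage, and actual usage.

```python
from collections import defaultdict

def parse_cpu(quantity):
    """Convert a Kubernetes CPU quantity such as "500m" or "2" to cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)

def cost_by_namespace(pod_list, monthly_node_cost, total_node_cores):
    """Allocate node cost to namespaces by requested CPU cores."""
    cost_per_core = monthly_node_cost / total_node_cores
    totals = defaultdict(float)
    for pod in pod_list.get("items", []):
        namespace = pod["metadata"]["namespace"]
        for container in pod["spec"]["containers"]:
            requests = container.get("resources", {}).get("requests", {})
            totals[namespace] += parse_cpu(requests.get("cpu", "0")) * cost_per_core
    return {ns: round(cost, 2) for ns, cost in totals.items()}

# Hypothetical sample data, not taken from a real cluster or bill.
SAMPLE_COST_PODS = {"items": [
    {"metadata": {"name": "checkout-api-0", "namespace": "production"},
     "spec": {"containers": [{"resources": {"requests": {"cpu": "2"}}}]}},
    {"metadata": {"name": "orders-api-0", "namespace": "production"},
     "spec": {"containers": [{"resources": {"requests": {"cpu": "500m"}}}]}},
    {"metadata": {"name": "preview-0", "namespace": "staging"},
     "spec": {"containers": [{"resources": {"requests": {"cpu": "250m"}}}]}},
]}

print(cost_by_namespace(SAMPLE_COST_PODS, monthly_node_cost=640.0, total_node_cores=16))
```

Cores that no pod has requested are left unallocated here; spreading that idle capacity across namespaces is a policy choice the sketch does not make.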

The idle deployment query is especially valuable. A worker pool averaging 3% CPU over 30 days but running 4 replicas burns approximately $140/month in pure waste — the kind of finding that surfaces immediately in a Deep Research scan but requires manual analysis through standard kubectl tooling.

Unmounted PersistentVolumeClaims are another common waste category. A PVC provisioned for a deleted pod continues to accrue storage costs. This query cross-references kubectl get pvc --all-namespaces with pod volume mounts — a query that takes seconds to ask and minutes to write by hand.
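
The cross-reference described above can be sketched as a set difference between the claims that exist and the claims that pods reference. The inputs are assumed to be the parsed output of kubectl get pvc --all-namespaces -o json and kubectl get pods --all-namespaces -o json; the names in the sample are hypothetical.

```python
def unmounted_pvcs(pvc_list, pod_list):
    """Return sorted (namespace, name) of PVCs that no pod references."""
    in_use = set()
    for pod in pod_list.get("items", []):
        namespace = pod["metadata"]["namespace"]
        for volume in pod.get("spec", {}).get("volumes", []):
            claim = volume.get("persistentVolumeClaim")
            if claim:
                # A pod can only mount claims from its own namespace.
                in_use.add((namespace, claim["claimName"]))
    all_claims = {
        (pvc["metadata"]["namespace"], pvc["metadata"]["name"])
        for pvc in pvc_list.get("items", [])
    }
    return sorted(all_claims - in_use)

# Hypothetical sample data, not taken from a real cluster.
SAMPLE_PVCS = {"items": [
    {"metadata": {"namespace": "production", "name": "orders-data"}},
    {"metadata": {"namespace": "ml-jobs", "name": "old-training-set"}},
]}
SAMPLE_PVC_PODS = {"items": [
    {"metadata": {"namespace": "production", "name": "orders-postgres-0"},
     "spec": {"volumes": [
         {"name": "data", "persistentVolumeClaim": {"claimName": "orders-data"}},
         {"name": "tmp", "emptyDir": {}}]}},
]}

print(unmounted_pvcs(SAMPLE_PVCS, SAMPLE_PVC_PODS))
```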


Scaling Queries: Understanding Capacity State

Scaling decisions in Kubernetes require understanding current replica counts, HPA configuration, and actual vs desired state across deployments. These are day-2 questions that operators ask constantly but rarely have fast answers to.

Natural language scaling queries:

  • "how many replicas does checkout-api currently have?"
  • "are any deployments currently at their HPA maximum?"
  • "show me deployments where actual replica count differs from desired"
  • "which services scaled up in the last 24 hours?"

The HPA maximum query is useful before making changes. If a deployment is already at its HPA ceiling with high CPU, scaling it down is the wrong move. The autoscaler has reached maxReplicas and cannot add pods, so first decide whether to raise the ceiling, and check that node capacity and namespace quota would allow the extra replicas.
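
The HPA ceiling check, along with the desired-versus-actual comparison from the query list above, could be sketched as follows. The inputs are assumed to be parsed output from kubectl get hpa -o json and kubectl get deployments -o json; the sample objects are hypothetical.

```python
def hpas_at_max(hpa_list):
    """Names of HPAs whose current replica count has reached maxReplicas."""
    return [
        hpa["metadata"]["name"]
        for hpa in hpa_list.get("items", [])
        if hpa.get("status", {}).get("currentReplicas", 0) >= hpa["spec"]["maxReplicas"]
    ]

def replica_mismatches(deployment_list):
    """Return (name, desired, ready) where ready replicas differ from desired."""
    mismatches = []
    for dep in deployment_list.get("items", []):
        desired = dep["spec"].get("replicas", 1)
        # readyReplicas is omitted from status when it is zero.
        ready = dep.get("status", {}).get("readyReplicas", 0)
        if ready != desired:
            mismatches.append((dep["metadata"]["name"], desired, ready))
    return mismatches

# Hypothetical sample data, not taken from a real cluster.
SAMPLE_HPAS = {"items": [
    {"metadata": {"name": "checkout-api"},
     "spec": {"maxReplicas": 10}, "status": {"currentReplicas": 10}},
    {"metadata": {"name": "orders-api"},
     "spec": {"maxReplicas": 6}, "status": {"currentReplicas": 3}},
]}
SAMPLE_SCALE_DEPLOYMENTS = {"items": [
    {"metadata": {"name": "checkout-api"},
     "spec": {"replicas": 10}, "status": {"readyReplicas": 10}},
    {"metadata": {"name": "billing-worker"},
     "spec": {"replicas": 4}, "status": {"readyReplicas": 3}},
]}

print(hpas_at_max(SAMPLE_HPAS))
print(replica_mismatches(SAMPLE_SCALE_DEPLOYMENTS))
```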

"Show me deployments where actual replica count differs from desired" surfaces pods stuck in a pending or terminating state — a condition that means your cluster is not in a healthy steady state even if no alarms have fired.


Policy and Security Queries: Understanding Security Posture

Security posture in Kubernetes is often invisible until an audit or an incident. These queries bring it into the regular management workflow:

  • "which pods are running as root?"
  • "which services are exposed as NodePort?"
  • "are there any privileged containers in production?"
  • "show me all ServiceAccounts with cluster-admin binding"
  • "which namespaces have no ResourceQuota set?"

The cluster-admin binding query is a common security finding. ServiceAccounts bound to cluster-admin have unrestricted access to the entire cluster. In a well-configured cluster there should be zero or exactly one for a documented system component. Finding five or six across production namespaces is not unusual, and it is the kind of drift that accumulates without anyone noticing.

NodePort exposure is another common finding. Services exposed as NodePort bypass your ingress layer and are reachable from any node's external IP. In most production configurations, NodePort services should be rare or nonexistent.

Namespaces without ResourceQuota set can cause noisy-neighbor problems — a misbehaving workload can consume unbounded CPU and memory, starving other pods in the same cluster.

These queries translate to kubectl get pods -o json, kubectl get svc, and kubectl get clusterrolebindings with the appropriate field selectors and jq expressions. Clanker Cloud executes them against your live cluster without you constructing the query chain.
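
Two of these checks are simple enough to sketch directly. The functions below assume parsed output from kubectl get clusterrolebindings -o json and kubectl get svc --all-namespaces -o json; the binding, account, and service names are hypothetical.

```python
def cluster_admin_service_accounts(binding_list):
    """Return (namespace, name) of ServiceAccounts bound to cluster-admin."""
    found = []
    for binding in binding_list.get("items", []):
        if binding.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        # subjects may be missing or null on a binding with no subjects.
        for subject in binding.get("subjects") or []:
            if subject.get("kind") == "ServiceAccount":
                found.append((subject.get("namespace"), subject["name"]))
    return found

def nodeport_services(service_list):
    """Return (namespace, name) of Services with type NodePort."""
    return [
        (svc["metadata"]["namespace"], svc["metadata"]["name"])
        for svc in service_list.get("items", [])
        if svc.get("spec", {}).get("type") == "NodePort"
    ]

# Hypothetical sample data, not taken from a real cluster.
SAMPLE_BINDINGS = {"items": [
    {"metadata": {"name": "ci-deployer-admin"},
     "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
     "subjects": [
         {"kind": "ServiceAccount", "namespace": "production", "name": "ci-deployer"},
         {"kind": "User", "name": "alice"}]},
    {"metadata": {"name": "viewers"},
     "roleRef": {"kind": "ClusterRole", "name": "view"},
     "subjects": [
         {"kind": "ServiceAccount", "namespace": "staging", "name": "dashboard"}]},
]}
SAMPLE_SERVICES = {"items": [
    {"metadata": {"namespace": "production", "name": "checkout-api"},
     "spec": {"type": "ClusterIP"}},
    {"metadata": {"namespace": "production", "name": "debug-proxy"},
     "spec": {"type": "NodePort"}},
]}

print(cluster_admin_service_accounts(SAMPLE_BINDINGS))
print(nodeport_services(SAMPLE_SERVICES))
```

A namespaced RoleBinding can also reference the cluster-admin ClusterRole, so a fuller audit would likely scan kubectl get rolebindings --all-namespaces as well.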

For teams building toward production from AI-assisted development, these security posture queries pair well with the vibe coding to production workflow — catching policy drift before it becomes an audit finding.


Topology Queries: Understanding Dependencies

Topology queries are the hardest to answer with raw kubectl because they require correlating multiple resource types: Services, Deployments, Endpoints, NetworkPolicies, and ideally runtime traffic data.

Natural language topology queries:

  • "what talks to the orders-postgres database?"
  • "show me the dependency graph for the checkout service"
  • "which services have no upstream callers?"
  • "map all internal services in production namespace"

"What talks to the orders-postgres database?" matters before any database change — schema migration, instance resize, or failover test. In the Clanker Cloud demo environment, orders-postgres receives traffic from checkout-api, orders-api, and billing-worker. If session-cache is degraded and redis reads are falling through, orders-postgres sees elevated qps from checkout-api specifically — a topology-aware answer that explains latency patterns without manual tracing through service mesh logs.

"Which services have no upstream callers?" identifies orphaned services consuming resources but receiving no traffic. Combined with the idle CPU query from the cost section, this reliably surfaces services ready for decommission.

For a deeper look at how AI agents use topology context to make better infrastructure decisions, the agent-facing documentation covers the MCP surface in detail.
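
Once caller-to-callee edges are known, the two topology questions above reduce to simple graph lookups. The sketch below assumes the edges have already been collected from some source such as a service mesh or NetworkPolicies; how Clanker Cloud gathers them is not described here. The edge list is hypothetical, loosely following the demo services named in this section.

```python
def callers_of(edges, target):
    """Sorted list of services with an edge pointing at target."""
    return sorted({src for src, dst in edges if dst == target})

def services_without_callers(services, edges):
    """Sorted list of services that never appear as a callee."""
    called = {dst for _, dst in edges}
    return sorted(svc for svc in services if svc not in called)

# Hypothetical (caller, callee) edges for illustration only.
EDGES = [
    ("checkout-api", "orders-postgres"),
    ("orders-api", "orders-postgres"),
    ("billing-worker", "orders-postgres"),
    ("checkout-api", "session-cache"),
]
SERVICES = ["checkout-api", "orders-api", "billing-worker",
            "orders-postgres", "session-cache"]

print(callers_of(EDGES, "orders-postgres"))
print(services_without_callers(SERVICES, EDGES))
```

Services that receive traffic only from outside the cluster also show up as having no callers in this model, so a decommission candidate list would need to exclude anything fronted by an Ingress or load balancer.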


Plan-Before-Apply: Natural Language Changes with Approval Gates

Inspection queries are non-destructive. When natural language K8s management moves from reading state to changing it, the model changes.

Clanker Cloud implements a plan-before-apply model for all cluster changes. A natural language change request generates a plan for operator review. The plan shows exactly what will change, the current state it is changing from, the estimated cost impact, and a risk assessment based on downstream dependencies. Changes execute only after explicit approval.

Here is what this looks like for a scaling change:

Request: "scale billing-worker down from 4 to 2 replicas"

PLAN:
Resource:          billing-worker (Deployment, namespace: production)
Change:            replicas 4 → 2
Current CPU avg:   3.1% (30-day window)
Current mem avg:   11.8%
Est. cost change:  -$62/month
Risk:              LOW — no downstream callers identified
Approve? [yes/no]

The plan answers the three questions any operator should ask before a scaling change: Is this workload actually idle? What does the change cost? Who depends on it?

A 3.1% CPU average over 30 days is genuine idleness. No downstream callers means the blast radius is contained to the deployment itself. The $62/month saving is concrete. The operator approves with full information.
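
The shape of such a plan, and the gate in front of it, can be sketched in a few lines. Everything here is an assumption made for illustration: the ScalePlan fields, the rule that any known caller raises the risk level, and the per-replica cost of $31, which is simply the $62 saving divided by the two replicas removed. The command that the approved plan maps to is a standard kubectl scale invocation.

```python
from dataclasses import dataclass

@dataclass
class ScalePlan:
    deployment: str
    namespace: str
    current_replicas: int
    desired_replicas: int
    monthly_cost_change: float
    risk: str

def build_scale_plan(deployment, namespace, current, desired,
                     cost_per_replica, callers):
    """Assemble a reviewable plan; nothing is executed here."""
    # Hypothetical risk rule: any known caller means a human should look closer.
    risk = "LOW" if not callers else "REVIEW"
    cost_change = (desired - current) * cost_per_replica
    return ScalePlan(deployment, namespace, current, desired, cost_change, risk)

def approved_command(plan, approved):
    """Return the kubectl argv for an approved plan, or None when not approved."""
    if not approved:
        return None
    return ["kubectl", "scale", f"deployment/{plan.deployment}",
            "-n", plan.namespace, f"--replicas={plan.desired_replicas}"]

PLAN = build_scale_plan("billing-worker", "production", 4, 2,
                        cost_per_replica=31.0, callers=[])
print(PLAN)
print(approved_command(PLAN, approved=False))
print(approved_command(PLAN, approved=True))
```

Keeping plan construction separate from command generation means the plan can be displayed, logged, or rejected without any path to the cluster.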

This is the same philosophy as terraform plan before terraform apply. kubectl offers kubectl diff and --dry-run=server to preview the object-level change, but neither reports cost impact or downstream dependencies. Clanker Cloud adds that fuller plan step to the Kubernetes change workflow.

For CI/CD pipelines, the open-source Clanker CLI supports the --maker and --apply flags for scripted approval patterns:

clanker ask "scale billing-worker to 2 replicas" --maker --apply

This is appropriate for pre-approved automation patterns in non-production environments. Production changes should go through the interactive approval flow.


Multi-Cluster Queries Without Context Switching

Multi-cluster environments are where kubectl ergonomics break down fastest. Managing three clusters (EKS us-east-1, GKE europe-west2, EKS ap-southeast-1) means maintaining separate kubeconfig contexts and remembering to switch before every command. A query against the wrong context is a common source of confusion and occasionally a production incident.

Natural language multi-cluster management eliminates context switching:

"show me pod health across all three clusters"

Clanker Cloud queries all three simultaneously and returns a unified view of pod health. No kubectl config use-context. No risk of running a command against prod when you meant staging.
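
A fan-out of this kind can be approximated with kubectl's global --context flag, which targets a context for a single command without changing the active one. The sketch below is an assumption about how such a query might be structured, not Clanker Cloud's implementation. The fetch function is injectable so the aggregation can be exercised without a cluster.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor

def fetch_pods(context):
    """Run kubectl against one kubeconfig context and parse the pod list."""
    result = subprocess.run(
        ["kubectl", "--context", context, "get", "pods",
         "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)

def summarize(pod_list):
    """Count pods by status.phase."""
    counts = {}
    for pod in pod_list.get("items", []):
        phase = pod.get("status", {}).get("phase", "Unknown")
        counts[phase] = counts.get(phase, 0) + 1
    return counts

def pod_health(contexts, fetch=fetch_pods):
    """Query every context in parallel and return {context: phase counts}."""
    with ThreadPoolExecutor(max_workers=max(1, len(contexts))) as pool:
        # pool.map preserves input order, so results line up with contexts.
        results = list(pool.map(fetch, contexts))
    return {ctx: summarize(pods) for ctx, pods in zip(contexts, results)}
```

With real clusters, the context names would come from kubectl config get-contexts -o name, and pod_health would be called with that list.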

The Clanker Cloud demo shows this in a live environment. The same model applies to multi-cloud: queries fan out across AWS, GCP, Azure, and Kubernetes clusters simultaneously without switching consoles.


kubectl to Natural Language: Equivalents Table

This table maps common kubectl commands to their natural language equivalents in Clanker Cloud. Both work against the same live cluster state.

Each entry shows the kubectl command first, with its natural language equivalent indented below it.

kubectl get pods -n production --field-selector status.phase!=Running
    "show me all non-running pods in namespace production"

kubectl top pods -n production --sort-by=cpu
    "show me pods with highest CPU usage in production"

kubectl get hpa --all-namespaces
    "are any deployments currently at their HPA maximum?"

kubectl get pods --all-namespaces -o json | jq '.items[] | select(any(.spec.containers[]; .resources.limits == null))'
    "which pods are running without resource limits?"

kubectl get svc --all-namespaces | grep NodePort
    "which services are exposed as NodePort?"

kubectl get clusterrolebindings -o json | jq '.items[] | select(.roleRef.name == "cluster-admin") | .subjects[]? | select(.kind == "ServiceAccount")'
    "show me all ServiceAccounts with cluster-admin binding"

kubectl get pvc --all-namespaces, cross-referenced with pod volumes
    "which PersistentVolumeClaims are not mounted by any pod?"

kubectl get pods --all-namespaces -o json | jq '.items[] | select(.spec.securityContext.runAsUser == 0)'
    "which pods are running as root?"

kubectl rollout history deployment/checkout-api -n production
    "which deployments have been running more than 60 days without a rollout?"

kubectl get pods -n production -o json | jq '.items[] | select(any(.spec.containers[]; .securityContext.privileged == true))'
    "are there any privileged containers in production?"

The run-as-root filter only catches pods that set UID 0 explicitly at the pod level. Containers that set it in their own securityContext, or that inherit root from their image, need an additional check.

The kubectl commands are shown not to suggest you should avoid them — they are precise and reliable — but to illustrate what Clanker Cloud is actually running on your behalf. Every natural language answer is backed by equivalent API calls.


Model Selection for K8s Queries

Clanker Cloud is BYOK — bring your own AI keys. For Kubernetes management:

  • Routine inspection (pod status, replica counts, resource lists): Gemma 4 via Ollama (gemma4:27b) runs locally and is free.
  • Cross-namespace analysis (topology graphs, multi-cluster cost correlation): Claude Opus 4.6 or GPT-5.4 Thinking for deeper reasoning.
  • Policy and security queries: Hermes 3 (hermes3:70b via Ollama) handles structured RBAC and security context analysis well.

AI costs are billed directly by your provider with no markup. Routine queries using a local Gemma 4 model cost nothing. The full pricing breakdown covers platform tiers separately from AI model costs.


FAQ

What is natural language Kubernetes management?

Natural language Kubernetes management is a plain-English interface on top of your existing kubectl access and kubeconfig. You ask questions about your live cluster (resource state, costs, scaling, security posture) and the tool translates those questions into kubectl commands or Kubernetes API calls, executes them against your real cluster, and returns structured answers. It does not replace kubectl; it adds a faster interface on top of it.

Does natural language K8s management make changes automatically?

No. In Clanker Cloud, inspection queries execute immediately, but changes follow a plan-before-apply model. A natural language change request generates a plan showing the exact resource modification, current state, cost impact, and risk level. Changes execute only after the operator explicitly approves. Maker Mode is the mechanism — it is off by default and requires opt-in for each change.

What kubectl commands does Clanker Cloud run under the hood?

Clanker Cloud routes natural language queries to the appropriate kubectl commands. Simple queries like "how many replicas does checkout-api have?" map to kubectl get deployment checkout-api -n production -o jsonpath='{.spec.replicas}'. Complex topology queries correlate across multiple resource types. The Clanker documentation covers query routing in detail.

Can I use natural language K8s management across multiple clusters?

Yes. Clanker Cloud fans out queries across all configured kubeconfig contexts simultaneously. "Show me pod health across all three clusters" returns a unified view across the EKS and GKE clusters without switching contexts.

What AI models can I use for Kubernetes queries?

Clanker Cloud supports BYOK: Claude Opus 4.6, GPT-5.4 Thinking, Gemini 3.1 Pro, Cohere Command A, Hermes 3 via Ollama, and Gemma 4 via Ollama for fully local free inference. See the FAQ for model configuration details.


Get Started

Natural language Kubernetes management is available in Clanker Cloud starting with the free beta. Connect your kubeconfig, ask your first inspection query, and see live cluster state in plain English within a minute of setup.

Download Clanker Cloud — free beta, macOS, Windows, and Linux. BYOK. Local credentials. No agent rollout required.
