AI for DevOps in 2026: kubectl, AWS, and GitHub Through One Agent

DevOps and SRE work is the part of software where AI agents have the most leverage and the most blast radius. A misread kubectl delete at 2am ends careers. A correctly-read tail of pod logs saves the on-call. The right agent shape isn't "AI for ops" as a separate product — it's a careful terminal-native agent that uses the tools you already use, under the auth you already have.

The wrong shape, briefly

Most "AI for DevOps" tools in 2024–2025 asked you to paste cluster credentials, AWS access keys, and a GitHub PAT into a SaaS dashboard. The dashboard then proxied your kubectl and aws calls through the vendor's backend. This is the wrong shape twice over: it expands the credential blast radius, and it depends on a third party staying up when your incident is live.

The right shape, briefly

Run the agent on your machine. Let it use the CLIs you already have installed — kubectl, aws, gh, ssh — with the auth your local environment is already configured for. Every command goes through the approval gate. The credential surface doesn't grow. The agent has no special access; it has your access.
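A minimal sketch of that gate in Python, assuming the agent represents each command as an argument list. The names run_gated, approve, and runner are illustrative, not AZMX AI's actual interface. The command is shown verbatim and nothing runs unless the approval callback returns True. The child process inherits your environment, so it uses whatever auth your shell already has.

```python
import shlex
import subprocess
from typing import Callable, Optional, Sequence


def run_gated(
    argv: Sequence[str],
    approve: Callable[[str], bool],
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Optional[subprocess.CompletedProcess]:
    """Show the exact command, then run it only if the human approves.

    `approve` and `runner` are injected so the gate can be exercised
    without a real cluster or cloud account (hypothetical design).
    """
    shown = shlex.join(argv)  # the exact command line, shell-quoted
    if not approve(shown):
        return None  # declined: nothing was executed
    # No shell=True: the argument list is passed through as-is,
    # under the caller's existing environment and credentials.
    return runner(list(argv), capture_output=True, text=True, check=False)
```

In an interactive session, approve could be as simple as a function that prints the command and returns True only when you type "y".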

What this looks like in practice

Kubernetes

"The checkout pod is restarting": the agent runs kubectl get pods -n checkout, finds the pod whose last state is OOMKilled, runs kubectl logs --previous on it to read the output from just before the kill, and proposes a memory-limit bump in the deployment YAML as a per-hunk diff you accept. Then kubectl apply -f, gated. You see the exact apply, click, it lands.

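The "find the OOMKilled one" step can be sketched as a pure function over the JSON that kubectl get pods -n checkout -o json returns. This assumes the standard pod status layout (status.containerStatuses[].lastState.terminated.reason). The function name find_oom_killed is hypothetical.

```python
from typing import Any, Dict, List, Tuple


def find_oom_killed(pod_list: Dict[str, Any]) -> List[Tuple[str, str, int]]:
    """Return (pod, container, restart_count) for containers whose
    previous run was terminated with reason OOMKilled."""
    hits: List[Tuple[str, str, int]] = []
    for pod in pod_list.get("items") or []:
        pod_name = pod.get("metadata", {}).get("name", "")
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for status in statuses:
            # lastState.terminated is absent if the container never restarted
            terminated = (status.get("lastState") or {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                hits.append(
                    (pod_name, status.get("name", ""), status.get("restartCount", 0))
                )
    return hits
```

Each hit names the pod and container to pass to kubectl logs --previous, a read that needs no approval under the rules described later.
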
AWS

"Why is this EC2 instance unreachable": the agent runs aws ec2 describe-instances and aws ec2 describe-security-groups, notices the instance's security group has no inbound rule for port 443, proposes an aws ec2 authorize-security-group-ingress, you read the command, you approve. When SSH can't reach the instance, the agent connects through SSM Session Manager instead.

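A rough sketch of the security-group check, assuming the JSON shape that aws ec2 describe-security-groups returns (IpPermissions entries with IpProtocol, FromPort, ToPort). The helper names and the example group ID and CIDR are hypothetical. This sketch only checks ports and ignores which source ranges each rule allows.

```python
from typing import Any, Dict, List


def allows_tcp_port(group: Dict[str, Any], port: int) -> bool:
    """True if any inbound rule on the group covers the given TCP port."""
    for perm in group.get("IpPermissions") or []:
        protocol = perm.get("IpProtocol")
        if protocol == "-1":  # "-1" means all protocols and all ports
            return True
        if protocol != "tcp":
            continue
        low, high = perm.get("FromPort"), perm.get("ToPort")
        if low is not None and high is not None and low <= port <= high:
            return True
    return False


def propose_ingress(group_id: str, port: int, cidr: str) -> List[str]:
    """Build (not run) the mutating command for the human to review."""
    return [
        "aws", "ec2", "authorize-security-group-ingress",
        "--group-id", group_id,
        "--protocol", "tcp",
        "--port", str(port),
        "--cidr", cidr,
    ]
```

The check is a read and the proposal is only a list of arguments. Nothing changes in the account until the proposed command passes the approval gate.
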
GitHub

"Open a PR for this fix": the agent stages the change as a per-hunk diff, runs gh pr create with a body it drafted, you read the body, you approve. The PR opens. You can then ask the agent to address review comments in the same loop.

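One way to keep the drafted body reviewable is to build the gh pr create call as data and render it for the human before anything runs. This sketch uses gh's --base, --title, and --body flags. The function name, the default base branch, and the returned structure are assumptions made for illustration.

```python
import shlex
from typing import Dict, List, Union


def plan_pull_request(
    title: str, body: str, base: str = "main"
) -> Dict[str, Union[List[str], str]]:
    """Return the argv to run plus a shell-quoted preview for approval."""
    argv = ["gh", "pr", "create", "--base", base, "--title", title, "--body", body]
    return {
        "argv": argv,                 # what would be executed if approved
        "preview": shlex.join(argv),  # what the human reads first
        "body": body,                 # shown separately so it is easy to review
    }
```

Because the preview is produced with shlex.join, splitting it again yields exactly the arguments that will be run, so what you approve is what executes.
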
SSH fleet

"Tail the nginx log on web-04": the agent reads your ~/.ssh/config, opens an SSH session through the picker, and runs the tail. The auth is your SSH key, which is never copied and never leaves your machine.

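A host picker of this kind could be fed by a small parser over ~/.ssh/config. The sketch below is an assumption about how such a picker might work, not AZMX AI's implementation. It handles only whitespace-separated "Keyword value" lines, skips wildcard patterns, and ignores Include and Match blocks.

```python
from typing import Dict, List


def parse_ssh_hosts(config_text: str) -> Dict[str, Dict[str, str]]:
    """Map each concrete Host alias to its hostname, user, and port, if set."""
    hosts: Dict[str, Dict[str, str]] = {}
    current: List[str] = []  # aliases the following options apply to
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key == "host":
            # Keep concrete aliases only; patterns like * or !name are not pickable
            current = [a for a in value.split() if not any(c in a for c in "*?!")]
            for alias in current:
                hosts.setdefault(alias, {})
        elif key == "match":
            current = []  # Match blocks are out of scope for this sketch
        elif key in ("hostname", "user", "port"):
            for alias in current:
                # ssh uses the first value it finds for each option
                hosts[alias].setdefault(key, value)
    return hosts
```

The parser reads host metadata only. Key material is never opened: the session itself is started by the ssh binary, which handles authentication as it always does.
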
The four-stripe principle

An AI-for-DevOps agent worth using has four stripes:

Your auth. No new paste-in tokens.
Approval gate. Every cluster/account-modifying command shown before it fires.
Reads encouraged. The agent should freely run kubectl get, aws describe-* calls, and gh issue list. Those are read-only, and they're where context comes from.
Local first. No SaaS in the loop. The agent dies with your laptop. That's the point.
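The split between gated writes and encouraged reads can be sketched as a conservative classifier: anything not recognized as a read is treated as needing approval. The verb lists below are illustrative assumptions, not a complete policy, and the sketch assumes the verb appears right after the tool name with no global flags in between.

```python
from typing import Sequence

# Illustrative allowlists; a real policy would be longer and reviewed by the team.
KUBECTL_READ_VERBS = {"get", "describe", "logs", "top", "explain", "version"}
AWS_READ_PREFIXES = ("describe-", "list-", "get-")
GH_READ_ACTIONS = {"list", "view", "status"}


def is_read_only(argv: Sequence[str]) -> bool:
    """True only for commands recognized as reads; unknown means gated."""
    if len(argv) < 2:
        return False
    tool = argv[0]
    if tool == "kubectl":
        return argv[1] in KUBECTL_READ_VERBS
    if tool == "aws":
        # aws <service> <operation>, e.g. aws ec2 describe-instances
        return len(argv) >= 3 and argv[2].startswith(AWS_READ_PREFIXES)
    if tool == "gh":
        # gh <noun> <action>, e.g. gh issue list
        return len(argv) >= 3 and argv[2] in GH_READ_ACTIONS
    return False
```

Defaulting to "not a read" means a missing entry costs one extra approval click instead of an unreviewed change. Note that a read can still expose sensitive data (some aws get-* operations return secrets), so a stricter policy might gate those too.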

Where this still struggles

Multi-step incident response with strict rollback semantics still benefits from a human in the loop on every step, and there the agent's biggest contribution is reading logs faster than you can. That's fine: context-gathering is half of incident work, and an agent that surfaces the right kubectl describe at the right moment earns its keep.

Cost analysis ("why did my AWS bill jump") works very well — it's mostly read tools and aggregation. The agent can drive the whole investigation and present a summary, no destructive verbs needed.
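The aggregation step might look like the sketch below. It assumes each period is one ResultsByTime entry from aws ce get-cost-and-usage, grouped by service with the UnblendedCost metric. The function names are hypothetical, and Decimal is used so currency amounts are not distorted by floating point.

```python
from decimal import Decimal
from typing import Any, Dict, List, Tuple


def costs_by_service(period: Dict[str, Any]) -> Dict[str, Decimal]:
    """Sum UnblendedCost per service for one Cost Explorer time period."""
    totals: Dict[str, Decimal] = {}
    for group in period.get("Groups") or []:
        service = group["Keys"][0]
        amount = Decimal(group["Metrics"]["UnblendedCost"]["Amount"])
        totals[service] = totals.get(service, Decimal("0")) + amount
    return totals


def biggest_increases(
    previous: Dict[str, Decimal], current: Dict[str, Decimal], top: int = 3
) -> List[Tuple[str, Decimal]]:
    """Rank services by cost increase between two periods, largest first."""
    zero = Decimal("0")
    deltas = {
        service: current.get(service, zero) - previous.get(service, zero)
        for service in set(previous) | set(current)
    }
    ranked = sorted(deltas.items(), key=lambda item: item[1], reverse=True)
    return [(service, delta) for service, delta in ranked[:top] if delta > 0]
```

Everything here is a read followed by arithmetic, which is why this kind of investigation can run end to end without any destructive command.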

What we ship

AZMX AI is a native 7 MB terminal + editor + agent that calls your local CLIs under approval. GitHub through the gh CLI, Kubernetes through kubectl, AWS through aws and SSM, your SSH fleet through the SSH picker (⌘⇧S). The auth is whatever your shell already has. No SaaS in the middle. No new credentials.

Download AZMX AI · How approval gates make this safe
