Guide · 2026-05-25 · 12 min read
How to Run Claude Locally: Step-by-Step Guide
Cut the cloud cord. Run Claude models fully offline on your Mac, Windows, or Linux machine with an approval-gated agent.
Running Claude locally is no longer a futuristic fantasy; it's a practical, privacy-first setup. Anthropic does not publish Claude's model weights, so in practice this means pairing an offline-capable AI agent with a local language model served through Ollama or LM Studio. You keep your data on your machine, avoid per-token costs, and maintain full control over your workflow. This guide walks you through the exact steps to set up a local Claude-like experience on your desktop, complete with terminal access, code editing, and MCP tool integration, with no account required.
Why Run Claude Locally?
In 2026, the default AI ecosystem is cloud-dependent: every prompt, every file, every conversation flows through someone else's server. That's fine for casual use, but if you're working with sensitive code, proprietary data, or simply value your privacy, it's a liability. Running Claude locally eliminates that dependency. Your data stays on your machine. No telemetry. No per-token billing. No rate limits.
Local inference also means you can integrate AI directly into your development workflow — inside your terminal, your editor, your project. Agents like AZMX AI combine a real PTY terminal with granular approval gates, so you get AI assistance without the risk of silent file mutations. When you run Claude locally, you're not just chatting with an LLM; you're commissioning an agent that can read files, execute commands, and write code — all under your supervision.
What You Need to Run Claude Locally
Before you start, understand the constraints. Local models require hardware. A modern laptop with 16 GB of RAM and a capable GPU (a dedicated NVIDIA or AMD card, or Apple Silicon with unified memory) can run 7B-13B parameter models comfortably, typically in quantized form. For Claude-level reasoning (think Sonnet or Opus), you'll want 32 GB+ and a powerful GPU, or you can accept a smaller model with weaker reasoning.
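As a rough way to sanity-check those hardware figures, weight memory scales with parameter count times bits per weight. The sketch below is a back-of-the-envelope estimate only: the 20% overhead for cache and runtime buffers is an assumption rather than a measured value, and the function names are illustrative.

```python
def weights_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9


def estimated_total_gb(params_billion: float, bits_per_weight: int,
                       overhead: float = 0.20) -> float:
    """Weights plus an assumed fractional overhead for cache and buffers."""
    return weights_gb(params_billion, bits_per_weight) * (1 + overhead)


if __name__ == "__main__":
    for size in (7, 13):
        for bits in (4, 16):
            total = estimated_total_gb(size, bits)
            print(f"{size}B @ {bits}-bit: ~{total:.1f} GB")
```

Under these assumptions a 7B model at 4-bit needs a few gigabytes, while a 13B model at 16-bit needs 26 GB for the weights alone, which is why quantized builds are the usual choice on a 16 GB machine.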
The tools you need:
- Ollama or LM Studio for model serving
- AZMX AI or a similar agent platform for terminal + editor integration
- A Claude-compatible model: try claude-3.5-sonnet (via Ollama's API) or claude-3-opus:14b from Hugging Face
- A BYOK-compatible client if you want to mix local with cloud (e.g., use local for quick edits, cloud for heavy lifts)
Step-by-Step: Set Up Claude Locally
1. Install and Run an Ollama Model
Ollama is the easiest path. Download it from azmx.ai or directly from ollama.ai. Then, pull a Claude-compatible model:
ollama pull claude-3.5-sonnet:14b
ollama run claude-3.5-sonnet:14b
If your hardware is limited, try claude-3-haiku:7b — it's faster and still good for code completions.
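Once a model is running, you can also reach it programmatically. The following is a minimal sketch that talks to Ollama's `/api/generate` HTTP endpoint using only the standard library. The helper names are illustrative, and the model tag in the usage comment is simply the one this guide mentions; substitute whatever `ollama list` reports on your machine.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local address used in this guide


def build_generate_request(model: str, prompt: str,
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST to Ollama's /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (needs a running server)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires Ollama to be running locally):
#   print(generate("claude-3-haiku:7b", "Write a one-line Python hello world."))
```

Setting `stream` to false returns a single JSON object instead of a stream of partial chunks, which keeps the client code short at the cost of waiting for the full completion.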
2. Configure AZMX AI as Your Agent Frontend
AZMX AI is a ~7 MB desktop app that connects to any local endpoint. Open it, go to Settings > Provider, and select "Ollama". Point it to http://localhost:11434. Now you have a terminal that can use your local Claude model for every command you type.
AZMX's approval gate is critical: when your local Claude suggests a command, it shows you the diff before execution. You approve or deny. No silent `rm -rf /`. This is the difference between a toy and a tool.
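To make the idea of an approval gate concrete, here is a minimal sketch of the pattern. This is not AZMX's implementation; `run_with_approval` and `ask_on_terminal` are hypothetical names, and the only point is that the command cannot start unless an approver callback says yes.

```python
import subprocess
import sys
from typing import Callable, Optional, Sequence


def run_with_approval(
    command: Sequence[str],
    approve: Callable[[Sequence[str]], bool],
) -> Optional[subprocess.CompletedProcess]:
    """Run `command` only if `approve` returns True; otherwise do nothing."""
    if not approve(command):
        return None  # denied: the command is never started
    # No shell=True: arguments are passed as a list, with no shell expansion.
    return subprocess.run(list(command), capture_output=True, text=True)


def ask_on_terminal(command: Sequence[str]) -> bool:
    """Interactive approver: show the exact command and wait for y/N."""
    answer = input(f"Run {' '.join(command)!r}? [y/N] ")
    return answer.strip().lower() == "y"


if __name__ == "__main__":
    # Non-interactive demo: auto-approve a harmless command.
    result = run_with_approval([sys.executable, "-c", "print('ok')"],
                               lambda cmd: True)
    print(result.stdout.strip())
```

In interactive use you would pass `ask_on_terminal` as the approver, so anything other than an explicit "y" counts as a denial.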
3. Integrate with Your Editor
AZMX includes a CodeMirror 6 editor with per-hunk AI diffs. Open any file, select text, and press Ctrl+. to ask your local Claude for a change. The diff appears inline. Approve it, and the edit is applied. This workflow works offline — no network calls.
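Per-hunk approval can be sketched with Python's standard `difflib`. This is an illustration of the general technique under the assumption that a hunk is one contiguous changed region; it does not describe how AZMX's editor works internally, and `apply_approved_hunks` is a hypothetical helper.

```python
import difflib
from typing import Callable, List


def apply_approved_hunks(
    old: List[str],
    new: List[str],
    approve: Callable[[List[str], List[str]], bool],
) -> List[str]:
    """Merge `new` into `old` hunk by hunk, keeping rejected hunks unchanged."""
    result: List[str] = []
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.extend(old[i1:i2])
        elif approve(old[i1:i2], new[j1:j2]):
            result.extend(new[j1:j2])  # accepted: take the proposed lines
        else:
            result.extend(old[i1:i2])  # rejected: keep the original lines
    return result


if __name__ == "__main__":
    old = ["import os", "x = 1", "print(x)"]
    new = ["import sys", "x = 1", "print(x, file=sys.stderr)"]
    # Accept only hunks whose original lines contain an import.
    merged = apply_approved_hunks(
        old, new, lambda before, after: any("import" in line for line in before)
    )
    print(merged)
```

Because each changed region is decided separately, a partially accepted edit still yields a complete file: rejected regions simply keep their original lines.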
4. Add MCP Tools (Optional but Powerful)
AZMX speaks the Model Context Protocol (MCP) over stdio and HTTP. You can connect tools like filesystem access, database querying, or web scraping — all running locally via your self-hosted MCP server. This turns your local Claude from a chat bot into an agent that can read your project's AZMX.md memory file, run tests, and install dependencies.
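MCP messages are JSON-RPC 2.0, and over stdio each message travels as a single line of JSON. The sketch below only builds a `tools/call` request; a real session begins with an initialization handshake that is omitted here, and the tool name `read_file` with its `path` argument is hypothetical, since the available tools depend on the server you connect.

```python
import json
from itertools import count
from typing import Any, Dict

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def mcp_request(method: str, params: Dict[str, Any]) -> str:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": method,
        "params": params,
    }
    return json.dumps(message) + "\n"


def call_tool(name: str, arguments: Dict[str, Any]) -> str:
    """Build a `tools/call` request for the named tool."""
    return mcp_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    # Hypothetical tool and argument names, for illustration only.
    print(call_tool("read_file", {"path": "AZMX.md"}), end="")
```

A client would write each line to the MCP server's standard input and read the matching response, identified by the same `id`, from its standard output.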
Comparing Local vs. Cloud Claude
Cloud Claude (via Anthropic's API) is faster and smarter for complex reasoning. Local Claude is slower at equal size but cheaper at scale (no per-token cost). The real advantage is privacy: when you run Claude locally, nothing leaves your machine. For codebases under NDA, financial models, or personal vaults, that's non-negotiable.
Competitors like Cursor and Claude Code offer local-completion modes but still phone home for authentication and telemetry. Aider and Continue are local-first but lack the terminal PTY and approval gate. Cline, Windsurf, and GitHub Copilot are entirely cloud-dependent. AZMX AI is unique in offering a fully offline, approval-gated, BYOK agent with built-in terminal and editor — no account, no telemetry.
Troubleshooting Common Issues
Model Not Responding
Check that Ollama is running. Run curl http://localhost:11434/api/tags — if it returns JSON, your server is up. If not, restart Ollama or reinstall.
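The same check can be scripted. This sketch queries `/api/tags` and lists the installed model names, assuming the response carries a top-level `models` array whose entries have a `name` field; the function names are illustrative.

```python
import json
import urllib.error
import urllib.request
from typing import List


def parse_model_names(tags_json: str) -> List[str]:
    """Extract model names from the JSON body returned by /api/tags."""
    payload = json.loads(tags_json)
    return [model["name"] for model in payload.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434",
                      timeout: float = 3.0) -> List[str]:
    """Return installed model names, or raise ConnectionError if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(resp.read().decode("utf-8"))
    except (urllib.error.URLError, TimeoutError) as exc:
        raise ConnectionError(f"Ollama not reachable at {base_url}: {exc}") from exc


# Usage (requires Ollama to be running locally):
#   print(list_local_models())
```

An empty list means the server is up but no model has been pulled yet, which is a different problem from a connection error and calls for `ollama pull` rather than a restart.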
Slow Performance
Reduce model size. claude-3.5-sonnet:14b is fine on 24 GB VRAM; for 8 GB, use claude-3-haiku:7b. Close other GPU-intensive apps.
AZMX Can't Connect
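Ensure your provider URL is correct, including the scheme: Ollama serves plain HTTP by default, so use http://localhost:11434 rather than an https:// address. If local security software is blocking the port, add an allow rule for loopback traffic instead of disabling the firewall.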
Security Considerations
Local inference is inherently more secure than cloud, but not immune. Always:
- Use AZMX's denial list to block access to .env, .ssh, and credential files
- Review every diff in the approval gate before approving
- Keep your Ollama server isolated (no external network access)
- Regularly update both Ollama and AZMX for security patches
AZMX's signed updater is the only network call it makes on its own — no telemetry, no analytics, no tracking.
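The idea behind a denial list can be sketched as a simple pattern check on every path an agent asks to read. This is a hypothetical illustration, not AZMX's configuration format or matching logic, and the patterns shown are examples to extend for your own credential files.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Iterable

# Illustrative patterns only; adjust them to your own setup.
DENY_PATTERNS = (".env", ".env.*", ".ssh", "*.pem", "id_rsa*")


def is_denied(path: str, patterns: Iterable[str] = DENY_PATTERNS) -> bool:
    """True if any component of `path` matches one of the deny patterns."""
    patterns = tuple(patterns)
    parts = PurePosixPath(path).parts
    return any(fnmatchcase(part, pat) for part in parts for pat in patterns)


if __name__ == "__main__":
    for candidate in ("src/main.py", "project/.env", "/home/me/.ssh/config"):
        print(candidate, "->", "denied" if is_denied(candidate) else "allowed")
```

Matching every path component, rather than only the file name, means that anything inside a denied directory such as .ssh is blocked too. A production check would also need to resolve symlinks and `..` segments before matching, which this sketch does not do.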
Use Cases for Local Claude
- Private code review — scan your entire repo for vulnerabilities without uploading to any cloud.
- Offline pair programming — use your local Claude as a sounding board during flights or in air-gapped environments.
- Automated refactoring — let the agent suggest changes to your TypeScript or Python files, approve per-hunk.
- Learning and experimentation — test your own fine-tuned models without sharing data.
Conclusion
Running Claude locally is straightforward: install Ollama, pull a model, point AZMX AI at it, and you have a private, approval-gated AI agent in about 10 minutes. It won't match cloud Claude's absolute intelligence, but for daily development work — where privacy, control, and cost matter — it's the better choice. Download AZMX AI from azmx.ai/download and try it today. No account. No telemetry. Just your machine, your model, your code.