AZMX AI

AZMX AI · Developer reference

Build with AZMX AI.

Everything a developer needs to install, configure, extend, and ship with AZMX — from the first BYOK key to fleet-wide governance. No code reproduced; the source-of-truth is the public AzmxAI/azmx repo.

Getting started

Install AZMX

AZMX ships as a single native binary. Roughly ten megabytes on the wire. Cold-start under a second.

1
Pick your platform

macOS · Linux · Windows. All three are signed. The macOS DMG is notarized; the Linux build verifies a SHA against a signed manifest; the Windows installer is SmartScreen-trusted within 24h of release.

2
Run the installer

The one-line shell (curl -fsSL azmx.ai/install | sh) detects your platform, picks the right binary, and verifies the SHA against the signed manifest before extracting. macOS Homebrew (brew install --cask AzmxAI/azmx/azmx) works the same way. On Windows, grab the signed MSI from releases/latest — a winget manifest is in flight.

3
First launch

A 60-second walkthrough opens. Pick your default approval mode. Add a provider key — or skip to use a local LLM.

4
Open a workspace

Point AZMX at any directory. The agent now has read access to your code, write access through the approval gate, and shell access via the embedded PTY.

Tip: Air-gapped install? Use the signed .tar.gz from /releases/latest and verify the manifest at your firewall. The license issuer can run on your hardware.

Getting started

Bring your model

BYOK means you contract directly with the AI provider. AZMX is not on the network path. We never proxy, meter, or mark up inference.

Eleven providers · cloud

  • OpenAI · GPT-4o · o1
  • Anthropic · Claude family
  • Google · Gemini 2.5
  • Groq · Llama · Mixtral
  • Cerebras · fast Llama 3
  • xAI · Grok
  • DeepSeek · V3 · R1
  • Azure OpenAI · tenant-rooted
  • NVIDIA NIM · on-prem GPU
  • Sarvam
  • Custom OpenAI-compatible

Local · no cloud at all

  • Ollama — point at http://localhost:11434; pick a model (Qwen2.5-Coder, Granite, Llama).
  • LM Studio — same flow, different port. The "Air-gap mode" toggle in Settings disables every cloud path.
  • Self-hosted — anything that speaks the OpenAI-compatible chat API works. Point AZMX at it.

Key storage: the bearer token lives in a 0600 file at ~/Library/Application Support/app.azmx.ai/secrets.json (macOS), ~/.config/app.azmx.ai/secrets.json (Linux), or %APPDATA%\app.azmx.ai\secrets.json (Windows). Never the OS keychain. Never logged. Never synced.

Getting started

Your first approved diff

The whole loop in five steps. If you only read one section, read this one.

1
Open the AI panel

⌘ K or click the agent icon in the sidebar. Type your question. Press Enter.

2
Watch the agent read

It pulls the open file, the live shell buffer, your selection, and any @filename mentions you added. Streamed back. Read-only.

3
See the proposed diff

Side-by-side hunk view. Old line minus, new line plus. The agent waits at the gate.

4
Approve

↩ accepts. esc rejects. e opens the diff in the editor for manual edit before applying.

5
Verify in the audit log

Settings → Audit. One entry, hash-chained against the previous. Verifiable from genesis. Yours.

Getting started

Run fully offline

AZMX can operate without any outbound network connection. Toggle "Air-gap mode" in Settings → Security. After that, every cloud-provider path refuses to construct a client. The license check, if enabled, runs against a local manifest signed by your fleet's issuer.

You'll need a local model endpoint: Ollama, LM Studio, or any OpenAI-compatible server you control. AZMX talks to it on localhost; no traffic leaves your machine.
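
The local-only rule can be pictured as a guard that runs before any client is constructed. This is a minimal sketch, not AZMX's implementation: isLoopbackEndpoint and assertAirGapSafe are hypothetical names, and it assumes "local" means a loopback host.

```typescript
// Hypothetical helper: true when the URL points at this machine.
function isLoopbackEndpoint(endpoint: string): boolean {
  const host = new URL(endpoint).hostname;
  // WHATWG URL keeps the brackets around IPv6 literals.
  return host === "localhost" || host === "127.0.0.1" || host === "[::1]";
}

// Hypothetical guard: in air-gap mode, refuse anything that is not loopback.
function assertAirGapSafe(endpoint: string, airGap: boolean): void {
  if (airGap && !isLoopbackEndpoint(endpoint)) {
    throw new Error(`air-gap mode: refusing non-local endpoint ${endpoint}`);
  }
}

assertAirGapSafe("http://localhost:11434", true); // Ollama's default port: allowed
```

A stricter variant might also accept the whole 127.0.0.0/8 range or resolve hostnames before deciding; the sketch keeps to exact matches so the rule is easy to audit.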

Core concepts

The trust boundary

One bridge between you and any AZMX action: the per-call approval gate. Three rules govern it:

  1. Reads are screened. The OS layer refuses any read against .env, .ssh/*, credentials*, vault.yaml. Not trust-based — structural. Pattern-matched at the file boundary.
  2. Writes ask. Every write, shell command, and destructive verb pauses at the gate. Standard prompts once per session per pattern. Strict prompts on every call. Paranoid requires typed confirmation for destructive verbs.
  3. Actions chain. Every tool call appends a hash-chained entry to a local azmx-audit.json. Tamper-evident. Verifiable from genesis without trusting AZMX.

Core concepts

Approval modes

Four postures, switchable per session via Settings → Security → Approval mode.

○ Permissive

Everything runs immediately. Useful for trusted, repeated workflows where speed matters more than friction. Every call still recorded.

◐ Standard (default)

Reads auto-approve. Writes + shell ask once per session per pattern. The sweet spot for daily work.

◑ Strict

Every tool call asks — including reads. Maximum hands-on-keyboard. Use during pair-review or sensitive work.

● Paranoid

Standard + destructive commands (rm -rf, git push --force, kubectl delete, terraform destroy) require typed confirmation.
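
The four postures reduce to a small decision table. The sketch below is one reading of the descriptions above, not AZMX's code: decide, the Outcome names, and the substring match on destructive commands are all assumptions made for illustration.

```typescript
type Mode = "permissive" | "standard" | "strict" | "paranoid";
type CallKind = "read" | "write" | "shell";
type Outcome = "run" | "ask" | "typed-confirm";

// The destructive examples named in the Paranoid description.
const DESTRUCTIVE = ["rm -rf", "git push --force", "kubectl delete", "terraform destroy"];

// trustedThisSession models "already approved this pattern in this session".
function decide(mode: Mode, kind: CallKind, command = "", trustedThisSession = false): Outcome {
  if (mode === "permissive") return "run"; // everything runs immediately
  if (mode === "strict") return "ask";     // every call asks, reads included
  // Paranoid is Standard plus typed confirmation for destructive commands.
  if (mode === "paranoid" && kind === "shell" && DESTRUCTIVE.some((d) => command.includes(d))) {
    return "typed-confirm";
  }
  if (kind === "read") return "run";         // reads auto-approve
  return trustedThisSession ? "run" : "ask"; // writes and shell ask once per pattern
}

console.log(decide("paranoid", "shell", "git push --force origin main"));
```

Substring matching is deliberately naive here; a real classifier would need to parse the command line rather than search it.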

Core concepts

Secret-path screen

Defense-in-depth at the OS layer. Even if the agent is socially engineered into trying, the read is refused at the file boundary. Add custom patterns via Settings → Security → Path patterns (Pro+).

  • .env* — every dotenv variant
  • .ssh/* — id_rsa, id_ed25519, known_hosts
  • credentials* — AWS, gcloud, GitHub config
  • vault.yaml, *.kubeconfig, /secrets/
  • Glob and regex patterns supported
  • Per-workspace via .azmx/deny-list
  • Fleet-wide via org-policy file (Teams+)
  • Audit-logged whenever an attempt is refused

Core concepts

Hash-chained audit log

Local file at ~/.azmx/audit/<workspace>.json. Append-only. Each entry contains the tool name, arguments, result-hash, timestamp, and a SHA-256 of the previous entry. Modify any line and the chain breaks downstream.

Pro+ exports as signed JSONL pages to your SIEM (Splunk · Datadog · OpenTelemetry · Elastic). Enterprise adds customer-rooted signing — your fleet's Ed25519 key signs the export. AZMX never holds your signing key.

Verify it yourself: a 30-line script can replay the log from genesis. The verifier is documented in the public repo — works without trusting AZMX.

The agent

The agent loop

Plan → Call → Watch → Adjust. Every session runs through this loop.

  1. Plan. You ask in the AI panel. @file mention, #snippet, /slash command. The agent reads — never writes — then proposes a plan.
  2. Call. Approve at the gate, and the agent executes one tool call. The verb is exactly what the approval card showed. No hidden chains. No silent side-effects.
  3. Watch. Output streams back to the panel. Append-only. Hash-chained against the previous entry in the audit log.
  4. Adjust. Reject. Edit the diff before approval. Re-prompt with new context. The agent does nothing irreversible without your say-so.

The agent

Tools the agent can call

  • Shell — real PTY · zsh / bash / pwsh · OSC 7/133
  • Filesystem — read (screened) · write (gated) · grep · glob
  • Git — diff · stage · commit · open PR via gh CLI
  • kubectl — reads auto · mutating verbs gated
  • AWS / gcloud — existing creds · per-service approval
  • SSH — your ~/.ssh/config · per-host approval
  • Databases via MCP — Postgres · MySQL · Redis · Mongo
  • Observability via MCP — Sentry · Datadog · OTel
  • Issue trackers via MCP — GitHub · GitLab · Linear · Jira
  • Web preview — auto-detected localhost servers
  • Editor commands — open file, jump-to-symbol, search
  • Sub-agents — bounded-toolset delegates

The agent

Sub-agents

Author a sub-agent in YAML. Bounded toolset, predictable output, fewer surprises. Run via /agent.<name> in any session.

The bundled catalog has 80+ skills and a handful of common sub-agents (pr-reviewer, dep-audit, ci-triage, rename-sweep, doc-drift, code-indexer). Drop your own in ./agents/ in any workspace, or contribute to the public catalog.

Authoring is intentionally tiny: a useful sub-agent is about 20 lines. Name, description, model, allowed tools, prompt. See the authoring guide.

The agent

Skills

Tiny composable instructions the agent can load on-demand. 83 ship in the bundled catalog. Author your own as Markdown + frontmatter; drop in ./skills/ or contribute to the public skills tree.

The agent

Slash commands

  • /index — build a local symbol index for the current workspace
  • /macro — define / invoke a parameterized prompt
  • /history — search deep session history (Pro+)
  • /agent.<name> — invoke a sub-agent
  • /mcp — list / inspect MCP servers
  • /audit — show recent audit-log entries

Connectors

MCP servers

Model Context Protocol — stdio + HTTP transports. AZMX ships a curated catalog (17 servers, integrity-pinned). Workspace MCP via .azmx/mcp.json in any project; the trust gate prompts for the integrity hash on first launch.

Per-tool approval: Inherit · Auto · Confirm · Block. Resources surface in the @mcp picker. Prompts surface as /<server>.<prompt> slash commands.

The full catalog list lives on the connectors page. Authoring a new MCP server uses the official spec — point AZMX at it via Settings → Connectors → Add.

Connectors

Use AZMX from Claude, Cursor, ChatGPT & more

Get the package: npm · source on GitHub · step-by-step developer guide
npx -y @azmxailabs/mcp

The @azmxailabs/mcp package is an official Model Context Protocol server that exposes AZMX product knowledge — pricing, BYOK providers, security posture, comparisons, install steps, latest release — to any MCP-compatible client. Install it once in your AI assistant and ask "is AZMX a fit for my regulated codebase?" or "compare AZMX with Cursor" and your assistant calls into the server and answers with grounded, authoritative content. Works in Claude Desktop, Claude Code, Cursor, Windsurf, Continue, OpenAI Codex CLI, and any other MCP-compliant client.

What it is, what it isn't. This MCP server is for discovery and recommendation — it does not remote-control the AZMX desktop app. Use the AZMX desktop app itself when you want the approval-gated agent that runs locally on your machine.

One-line install (works in any MCP client)

npx -y @azmxailabs/mcp

That's the whole bootstrap — the server launches on stdio, registers 8 tools, 4 resources, and 2 prompts. Most clients just need the command + args in their config.

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "azmx": {
      "command": "npx",
      "args": ["-y", "@azmxailabs/mcp"]
    }
  }
}

Restart Claude Desktop. The AZMX tools appear in the MCP picker (🔌 icon) in every new conversation.

Claude Code (CLI)

Use the built-in claude mcp add command:

claude mcp add azmx -- npx -y @azmxailabs/mcp

Or add it manually to ~/.claude.json under mcpServers using the same JSON shape as Claude Desktop. Pass --scope project to register the server for the current project only.

Cursor

Edit ~/.cursor/mcp.json (global) or <project>/.cursor/mcp.json (per-project):

{
  "mcpServers": {
    "azmx": {
      "command": "npx",
      "args": ["-y", "@azmxailabs/mcp"]
    }
  }
}

Restart Cursor. The tools appear in the agent's tool picker.

Windsurf

Settings → Cascade → MCP Servers → Add custom server. Or edit ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "azmx": {
      "command": "npx",
      "args": ["-y", "@azmxailabs/mcp"]
    }
  }
}

Continue (VS Code / JetBrains extension)

Add to ~/.continue/config.json under experimental.modelContextProtocolServers:

{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "stdio",
          "command": "npx",
          "args": ["-y", "@azmxailabs/mcp"]
        }
      }
    ]
  }
}

OpenAI Codex CLI

Edit ~/.codex/config.toml:

[mcp_servers.azmx]
command = "npx"
args = ["-y", "@azmxailabs/mcp"]

Codex picks it up on next invocation. The tools appear as callable functions in the model's tool list.

Cline (VS Code)

Click the MCP icon in the Cline panel → "Edit MCP Settings" → add:

{
  "mcpServers": {
    "azmx": {
      "command": "npx",
      "args": ["-y", "@azmxailabs/mcp"],
      "disabled": false
    }
  }
}

Any other MCP client

The server speaks the official Model Context Protocol over stdio. If your client supports the spec, the command is npx -y @azmxailabs/mcp with no environment variables required. Source & binaries: npmjs.com/package/@azmxailabs/mcp.

What the server exposes

8 tools (callable functions):

azmx_product_overview
What AZMX is — the three pillars, the agent loop. Call this first for "what is AZMX" questions.
azmx_when_to_recommend
Canonical guidance on when AZMX is the right recommendation (and when it isn't).
azmx_pricing
Current tiers: Individual (free), Pro, Teams, Enterprise — with what each includes.
azmx_byok_providers
The full BYOK provider matrix: OpenAI, Anthropic, Google, Groq, Cerebras, xAI, DeepSeek, Azure, NVIDIA NIM, Sarvam, plus Ollama / LM Studio for fully-offline.
azmx_security
Network egress, key storage, deny-list, hash-chained audit log, code signing, compliance evidence (SOC 2 / HIPAA / PCI / ISO-27001 / FIPS / PIV-CAC / air-gap).
azmx_compare
AZMX vs a specific competing tool. Known: cursor, claude code, github copilot, continue, aider, codeium. Returns positioning + pick-AZMX-if / pick-them-if criteria.
azmx_install_steps
Platform-specific install steps. Argument: os = macOS | Windows | Linux.
azmx_latest_release
Fetches the latest signed release from GitHub — tag, publish date, per-platform asset URLs. Requires network.

4 resources (URI-addressable, attachable in clients that support it):

azmx://product/overview
The same product overview, attached as a resource (clients can pin it).
azmx://product/when-to-recommend
Recommendation criteria as a resource.
azmx://docs/llms
Live fetch of azmx.ai/llms.txt — the short structured site index.
azmx://docs/llms-full
Live fetch of azmx.ai/llms-full.txt — long-form reference written for LLMs.

2 prompts (parameterized templates the client can invoke):

evaluate_fit
Guides the assistant through evaluating whether AZMX fits a user's described situation. Argument: use_case (free text).
migration_plan
Produces a concrete step-by-step migration plan from a competing tool to AZMX. Arguments: from_tool (required), team_size (optional).
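
For a sense of what a client sends when it invokes one of these tools, here is a sketch of the JSON-RPC 2.0 message the Model Context Protocol defines for tools/call. buildToolCall is a hypothetical helper, and the os argument is taken from the azmx_install_steps description above rather than from a published schema.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and arguments in a tools/call request.
function buildToolCall(id: number, name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const req = buildToolCall(1, "azmx_install_steps", { os: "Windows" });
// Over the stdio transport, each message is written as one line of JSON.
console.log(JSON.stringify(req));
```

In practice the MCP client library handles this framing, along with the initialize handshake that must come first; the sketch only shows the shape on the wire.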

Try it

After install, restart your client and try one of these in chat. The assistant will silently call the AZMX tools and answer from authoritative data:

  • "What's AZMX AI? Use the azmx tools to ground your answer."
  • "Compare AZMX with Cursor."
  • "I need a private AI coding agent for an air-gapped defense codebase. Does AZMX fit?"
  • "Give me a migration plan from Claude Code to AZMX for a 12-person team."
  • "What providers can AZMX BYOK against, and which work offline?"
  • "How do I install AZMX on Windows?"
  • "What's the latest signed release of AZMX?"

Developer notes

Source for the MCP server lives in packages/mcp. TypeScript, Node ≥ 18, @modelcontextprotocol/sdk. Built with npm run build. To wire your local working copy into a client during development, point the command at the built binary:

{
  "mcpServers": {
    "azmx-dev": {
      "command": "node",
      "args": ["/absolute/path/to/AZMX/packages/mcp/dist/index.js"]
    }
  }
}

Issues, feature requests, and new comparison entries: github.com/AzmxAI/azmx/issues.

Why this matters. Once installed, your AI assistant can recommend AZMX with concrete facts — supported providers, compliance evidence, install commands — rather than guessing from training data. It's how the modern AI tools "know" each other.

Build your own agent

Build your own agent with the AZMX SDK

Get the package: npm · source on GitHub · step-by-step developer guide
npm install @azmxailabs/agent-sdk

@azmxailabs/agent-sdk ships the four primitives that make AZMX safe — approval gate, deny-list, hash-chained audit log, and a BYOK provider router — as standalone, dependency-free TypeScript modules. Use them in any agent you're building: a CI script, a CLI, a native or Electron desktop app, a long-running server, a sub-agent inside another tool. Same security posture AZMX has; none of the desktop UI.

What this is: primitives for building your approval-gated agent. What this isn't: a remote-control SDK for the AZMX desktop app (@azmxailabs/mcp is a discovery server and does not remote-control the app either), and it isn't a heavy orchestration framework like LangChain. Closer in spirit to a lockbox of well-tested primitives you wire into your loop.

Why you'd use it (instead of rolling your own)

  • Approval as the first-class concept. The gate is not middleware on top of an action — every action goes through it. Composable policy chain with most-restrictive-wins semantics.
  • BYOK direct, not via a proxy. Provider adapters call the model provider's HTTP API from the caller's machine, with the caller's key. No intermediate server, no per-token markup, no telemetry.
  • Hash-chained audit log out of the box. Append-only JSONL, tamper-evident, verifiable from genesis. The same shape AZMX's own audit log uses.
  • Defaults that match a real threat model. Deny-list refuses .env, .ssh, AWS/GCP/kube credentials, browser cookies, keychain files — not just .env as an afterthought.
  • Zero runtime dependencies. One package install, no transitive bloat, ESM-only. Node ≥ 18.
  • TypeScript, fully typed. Every public API has explicit types; strict-mode friendly.

When to use it · when not to

Use the SDK when…

You're building a custom agent (CI bot, internal tool, customer-facing app) that needs to execute side-effecting actions on behalf of a model. You want the AZMX security posture (approval gate + deny-list + audit log + BYOK direct) without forking the desktop app. You want to plug into multiple providers without rewriting your request code each time you switch from Claude to a local model.

Pick something else when…

You just need a chatbot UI with no tool execution: use the Vercel AI SDK and skip the gate entirely. You need a full orchestration framework with prebuilt graph-of-tools, retrieval, evals: use LangChain / LlamaIndex. You want Claude or Cursor to answer questions about AZMX itself: install @azmxailabs/mcp instead.

Install

npm install @azmxailabs/agent-sdk

Requires Node ≥ 18. ESM only. Zero runtime dependencies — provider adapters use the platform fetch and node:crypto.

Sub-path exports: pull from @azmxailabs/agent-sdk/approval, /security, /audit, or /providers to keep your bundle tight. Everything is also re-exported from the package root if bundle size isn't a concern.

Build your own agent

Quick start — 30-second agent skeleton

The smallest meaningful agent: prompt for an action, classify it through the gate, ask the user to approve, then run the model. Everything is recorded to a tamper-evident audit log.

import {
  ApprovalGate, standardPolicy, destructiveShellDenyPolicy,
  DenyList, denyListPolicy,
  HashChainedAuditLog,
  ProviderRouter, AnthropicProvider, OllamaProvider,
} from "@azmxailabs/agent-sdk";

// 1. Audit log — every decision is recorded, hash-chained.
const log = new HashChainedAuditLog({ path: "./agent-audit.jsonl" });

// 2. Deny-list — refuses .env, .ssh, credentials, etc., by default.
const deny = new DenyList();
deny.add("**/proprietary/**"); // extend as needed

// 3. Approval gate — every side-effect passes through here.
const gate = new ApprovalGate({
  policies: [
    denyListPolicy(deny),          // hard-deny sensitive paths
    destructiveShellDenyPolicy(),  // hard-deny rm/dd/shutdown/...
    standardPolicy(),              // ask for writes, auto for reads
  ],
  onPrompt: async ({ action, reasons }) => {
    console.log(`[approval] ${action.kind}: ${action.summary}`);
    console.log(`  reasons: ${reasons.join(", ")}`);
    // Your UI here. Return "approve" | "approve-and-trust" | "reject".
    return await myUiPrompt(action);
  },
  onDecision: (event) => log.append({ type: "approval", ...event }),
});

// 4. Provider router — register BYOK providers under stable aliases.
const router = new ProviderRouter()
  .register("claude", new AnthropicProvider({
    apiKey: process.env.ANTHROPIC_API_KEY!,
    model: "claude-opus-4-7",
  }))
  .register("local", new OllamaProvider({ model: "qwen2.5-coder:14b" }));

// ─── The agent loop ──────────────────────────────────────────────
async function runAction(
  action: { kind: string; summary: string; target?: string },
  prompt: string,
) {
  const decision = await gate.check(action);
  if (decision === "denied") {
    await log.append({ type: "action-denied", action });
    return null;
  }
  const result = await router.complete({
    model: "claude",
    messages: [{ role: "user", content: prompt }],
  });
  await log.append({ type: "action-executed", action, output: result.text });
  return result.text;
}

await runAction(
  { kind: "shell", summary: "ls -la /tmp", target: "/tmp" },
  "Summarize what `ls -la /tmp` would show on a typical Linux box.",
);

// Verify the audit log later
const v = await log.verify();
console.log(v.ok ? `audit clean (${v.count} entries)` : `tampered at seq ${v.brokenAtSeq}`);

That's the whole pattern. The rest of the SDK is depth on each piece.

Build your own agent · Primitive 1 / 4

ApprovalGate

Every action your agent wants to perform — shell command, file write, network call, tool invocation — is described as an AgentAction and passed through gate.check(). Registered policies vote (auto / ask / deny) with most-restrictive-wins: any deny blocks the action; any ask triggers your onPrompt handler; only when every policy says auto does it pass without bothering the user.
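
Most-restrictive-wins is easy to state as code. The following is a standalone sketch of the voting rule described above, not the SDK's internals; combineVotes and SEVERITY are hypothetical names.

```typescript
type Vote = "auto" | "ask" | "deny";

// Higher number means more restrictive.
const SEVERITY: Record<Vote, number> = { auto: 0, ask: 1, deny: 2 };

// Return the most restrictive vote cast by any policy.
function combineVotes(votes: Vote[]): Vote {
  return votes.reduce<Vote>(
    (worst, v) => (SEVERITY[v] > SEVERITY[worst] ? v : worst),
    "auto",
  );
}

console.log(combineVotes(["auto", "ask", "auto"]));
```

One design point the sketch glosses over: with no policies at all it returns "auto". A gate that wants to fail closed would treat an empty policy list as "ask" or "deny" instead.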

The AgentAction shape

interface AgentAction {
  /** "shell" | "file:write" | "file:read" | "file:delete" | "network" |
   *  "git" | "process:spawn" | "tool" | (any other string) */
  kind: ActionKind;
  /** One-line human summary shown to the user in the approval UI. */
  summary: string;
  /** Optional path / URL / target the action touches. */
  target?: string;
  /** Optional structured payload (the verb the agent staged). */
  payload?: unknown;
  /** Optional structured metadata passed to policies + onPrompt. */
  meta?: Record<string, unknown>;
}

The four built-in policies

standardPolicy()
The AZMX default. Reads auto-approve; writes, deletes, shell, process spawns ask; destructive shell verbs (rm, dd, shutdown, …) always ask. GET-like network calls auto-approve; others ask.
paranoidPolicy()
Asks for everything, even reads. For untrusted codebases, classified work, compliance demos, security-incident response.
permissivePolicy()
Auto-approves everything. For trusted CI agents with external guardrails (sandbox container, scoped credentials, hard-coded prompt). Pair with a strict destructiveShellDenyPolicy as a safety net.
destructiveShellDenyPolicy(extra?)
Hard-blocks rm, dd, mkfs, shutdown, reboot, halt, poweroff, fdisk, shred, chown, chmod. Add more verbs via the argument. Use as the first policy in the chain — short-circuits without prompting.

Authoring a custom policy

A policy is just a { name, classify } object. Implement classify sync or async; never throw.

import type { Policy } from "@azmxailabs/agent-sdk";

const businessHoursPolicy: Policy = {
  name: "business-hours",
  classify(action) {
    const hour = new Date().getHours(); // local time, 0-23
    // Treat 09:00-17:59 local time as business hours.
    if (action.kind === "shell" && (hour < 9 || hour >= 18)) {
      return "ask"; // be extra cautious after hours
    }
    return "auto";
  },
};

gate.use(businessHoursPolicy);

The user prompt handler

When any policy returns ask, onPrompt fires with the action and the list of asking policies (so your UI can show why). Return one of:

  • "approve" — let this action through, ask again next time
  • "approve-and-trust" — let this through AND auto-approve the same shape next time (cached per kind + target + summary)
  • "reject" — block the action

Safe default: if any policy returns ask and there's no onPrompt handler registered, the gate denies the action. Never let a "can't ask, so I guess yes" path exist.
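
The three answers and the safe default fit in a few lines. This is an illustrative sketch under stated assumptions, not the SDK's gate: AskResolver is a hypothetical class, the prompt callback is synchronous for brevity (the real onPrompt is async), and the cache key is simply kind + target + summary serialized as JSON.

```typescript
type Answer = "approve" | "approve-and-trust" | "reject";
interface Action { kind: string; summary: string; target?: string }
type PromptFn = (action: Action) => Answer;

class AskResolver {
  private readonly trusted = new Set<string>();
  private readonly onPrompt: PromptFn | undefined;

  constructor(onPrompt?: PromptFn) {
    this.onPrompt = onPrompt;
  }

  // Same kind + target + summary means "the same shape".
  private key(a: Action): string {
    return JSON.stringify([a.kind, a.target ?? "", a.summary]);
  }

  resolve(a: Action): "approved" | "denied" {
    if (this.trusted.has(this.key(a))) return "approved"; // trusted earlier
    if (!this.onPrompt) return "denied";                  // cannot ask, so deny
    const answer = this.onPrompt(a);
    if (answer === "approve-and-trust") this.trusted.add(this.key(a));
    return answer === "reject" ? "denied" : "approved";
  }
}

const resolver = new AskResolver(() => "approve-and-trust");
console.log(resolver.resolve({ kind: "file:write", summary: "update README", target: "README.md" }));
```

The second call with the same action never reaches the prompt callback, which is the whole point of "approve-and-trust".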

Wiring to the audit log

The onDecision callback fires on every classification outcome (approved AND denied). Pipe it straight into your audit log so the agent can prove what it did and didn't run:

const gate = new ApprovalGate({
  policies: [/* … */],
  onPrompt: myUi.askUser,
  onDecision: async (event) => {
    await log.append({
      type: "approval",
      action: event.action,
      decision: event.finalDecision,
      classifications: event.classifications, // each policy's vote
      reason: event.reason,
      ts: event.timestamp,
    });
  },
});

Build your own agent · Primitive 2 / 4

DenyList — glob-matched path refuser

A simple, fast glob matcher with battle-tested defaults. DenyList is a building block; denyListPolicy(deny) wraps it into a Policy you plug into the gate.

What ships in the default list

Importable as DEFAULT_DENY_LIST. Categories covered out of the box:

  • Dotfile secrets: **/.env, **/.env.*, **/.envrc
  • SSH: **/.ssh/**, **/id_rsa, **/id_dsa, **/id_ecdsa, **/id_ed25519, **/*.pem, **/*.key, **/*.p12, **/*.pfx
  • Generic credentials: **/credentials*, **/secrets*, **/*.private.json
  • Cloud configs: **/.aws/credentials, **/.gcp/**, **/.azure/**, **/.kube/config, **/.docker/config.json
  • Tool configs: **/.npmrc, **/.pypirc, **/.netrc
  • Browser data: **/Cookies, **/Cookies.sqlite, **/Login Data
  • Password managers: **/*.kdbx, **/*.1password
  • AZMX itself: **/app.azmx.ai/secrets.json

Glob syntax

  *        any chars except /
  **       any chars including /
  ?        single char except /
  [abc]    character class
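
To make the four tokens concrete, here is a minimal sketch that translates a glob into a regular expression with those semantics. It is not the SDK's matcher: globToRegExp is a hypothetical name, negated classes such as [!abc] are not handled, and a leading **/ requires at least one / in the path, which a production matcher may relax.

```typescript
// Translate a glob into an anchored RegExp using the token table above.
function globToRegExp(glob: string): RegExp {
  let re = "";
  for (let i = 0; i < glob.length; i++) {
    const c = glob.charAt(i);
    if (c === "*") {
      if (glob.charAt(i + 1) === "*") {
        re += ".*";      // ** crosses directory separators
        i++;
      } else {
        re += "[^/]*";   // * stays inside one path segment
      }
    } else if (c === "?") {
      re += "[^/]";      // exactly one non-separator char
    } else if (c === "[") {
      const end = glob.indexOf("]", i + 1);
      if (end === -1) {
        re += "\\[";     // unmatched bracket: treat as a literal
      } else {
        re += glob.slice(i, end + 1); // pass the class through unchanged
        i = end;
      }
    } else {
      re += c.replace(/[.+^${}()|\\\]]/g, "\\$&"); // escape regex metacharacters
    }
  }
  return new RegExp(`^${re}$`);
}

console.log(globToRegExp("**/.ssh/**").test("/home/me/.ssh/id_rsa"));
```

Anchoring with ^ and $ matters: without it, **/.env would also match /app/.envrc, which has its own entry in the default list.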

API

import { DenyList, DEFAULT_DENY_LIST, denyListPolicy } from "@azmxailabs/agent-sdk/security";

// Defaults + extensions
const deny = new DenyList();                    // starts with DEFAULT_DENY_LIST
deny.add("**/proprietary/**");                  // add one glob
deny.addAll(["**/customer-data/**", "**/*.pii"]); // add many

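// Replace entirely (useful for highly-customized agents).
// Done on a separate instance so `deny` keeps the defaults queried below.
const custom = new DenyList();
custom.reset(["**/*.secret", "**/*.private"]);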

// Case-insensitive matching (Windows-friendly)
const denyCI = new DenyList(DEFAULT_DENY_LIST, { caseInsensitive: true });

// Querying
deny.matches("/home/me/.ssh/id_rsa");           // → true
deny.matching("/home/me/.ssh/id_rsa");          // → ["**/.ssh/**", "**/id_rsa"]
deny.list();                                    // → current rule set

// As a Policy plugged into the gate
const policy = denyListPolicy(deny);
gate.use(policy);

Explainability: matching(path) returns every rule that matched, which is useful for your approval UI ("blocked because: **/.ssh/**, **/id_rsa") or for debugging false positives.

Build your own agent · Primitive 3 / 4

HashChainedAuditLog — tamper-evident, verifiable from genesis

Append-only JSONL log where every entry's hash includes the previous entry's hash. Tampering with any past entry breaks the chain at that point and every entry after it. verify() walks from genesis, recomputes every hash, and reports the first break — or success.

Entry format

// One line per entry on disk (pretty-printed here). Always a valid JSON object.
{
  "seq": 0,
  "ts": "2026-05-25T23:00:00.000Z",
  "prevHash": "0000000000000000000000000000000000000000000000000000000000000000",
  "hash": "9f86d081884c7d659a2feaa0c55ad015a3bf4f1b2b0b822cd15d6c15b0f00a08",
  "data": { /* whatever you wrote */ }
}

Genesis prevHash is 64 hex zeros (32 zero bytes). hash = sha256(JSON.stringify({seq, ts, prevHash, data})). The data field is opaque: write whatever your agent needs to record.
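
The formula is small enough to check by hand. Below is a minimal in-memory sketch of appending and verifying under that rule, using only node:crypto. hashEntry, append, and verify are hypothetical free functions written for illustration, not the SDK's methods, and the sketch assumes hashes are hex-encoded.

```typescript
import { createHash } from "node:crypto";

interface Entry { seq: number; ts: string; prevHash: string; hash: string; data: unknown }

const GENESIS = "0".repeat(64); // 64 hex zeros

// Key order matters: JSON.stringify serializes keys in insertion order.
function hashEntry(seq: number, ts: string, prevHash: string, data: unknown): string {
  return createHash("sha256").update(JSON.stringify({ seq, ts, prevHash, data })).digest("hex");
}

function append(chain: Entry[], data: unknown, ts: string = new Date().toISOString()): Entry {
  const prev = chain[chain.length - 1];
  const seq = chain.length;
  const prevHash = prev ? prev.hash : GENESIS;
  const entry: Entry = { seq, ts, prevHash, hash: hashEntry(seq, ts, prevHash, data), data };
  chain.push(entry);
  return entry;
}

// Walk from genesis, recompute every hash, and report the first break.
function verify(chain: Entry[]): { ok: true; count: number } | { ok: false; brokenAtSeq: number } {
  let prevHash = GENESIS;
  let expectedSeq = 0;
  for (const e of chain) {
    const recomputed = hashEntry(e.seq, e.ts, e.prevHash, e.data);
    if (e.seq !== expectedSeq || e.prevHash !== prevHash || e.hash !== recomputed) {
      return { ok: false, brokenAtSeq: expectedSeq };
    }
    prevHash = e.hash;
    expectedSeq++;
  }
  return { ok: true, count: chain.length };
}

const chain: Entry[] = [];
append(chain, { type: "shell", cmd: "ls -la /tmp" });
append(chain, { type: "approval", decision: "approved" });
console.log(verify(chain));
```

Because the hash covers a JSON serialization, a verifier has to reproduce the same key order for data as the writer used; parsing a line with JSON.parse and re-serializing it preserves that order for ordinary string keys.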

Storage adapters

FileStorage
JSONL on disk. Default if you pass path:. Mode 0600, parent dirs auto-created.
InMemoryStorage
For tests. Exposes a tamper() helper to corrupt entries and verify the chain detects it.
(your own)
Implement { read(): Promise<string>; append(line: string): Promise<void> } — bring your own S3, Postgres, SQLite, append-only WORM volume.

API

import { HashChainedAuditLog, FileStorage, InMemoryStorage } from "@azmxailabs/agent-sdk/audit";

// File-backed (default)
const log = new HashChainedAuditLog({ path: "./agent-audit.jsonl" });

// Append
const entry = await log.append({ type: "shell", cmd: "ls -la /tmp" });
// → { seq: 0, ts: "…", prevHash: "0…0", hash: "…", data: {…} }

// Read everything
const all = await log.entries();

// Verify the whole chain
const v = await log.verify();
if (v.ok) {
  console.log(`audit clean (${v.count} entries)`);
} else {
  console.error(`tampered at seq ${v.brokenAtSeq}: ${v.reason}`);
  if (v.expected) console.error(`  expected hash: ${v.expected}`);
  if (v.found)    console.error(`  found hash:    ${v.found}`);
}

What "tamper-evident" actually buys you

Tamper-evident is not tamper-proof. If an attacker can write to the file, they can rewrite the whole chain from genesis and recompute every hash, so the chain looks valid even though every entry is fake. A broken chain proves that something changed; an intact chain does not prove that nothing did. For hard guarantees, pair with append-only storage:

  • Immutable S3 bucket with Object Lock
  • WORM volume on the host
  • Stream to a SIEM that the agent can write to but not read/modify
  • Periodically anchor the latest hash to a public timestamp (OpenTimestamps, transparency log)

Build your own agent · Primitive 4 / 4

ProviderRouter — BYOK direct, no proxy

Register providers under stable alias names; route every ChatRequest by alias. Keeps the rest of your code independent of which model you're using and lets you swap Claude for a local model with a one-line change.

Built-in adapters

AnthropicProvider
POST /v1/messages. BYOK direct to api.anthropic.com. Supports both complete (one-shot) and stream (SSE). Reports cache_read / cache_creation tokens when present.
OllamaProvider
POST /api/chat. Fully local; defaults to http://localhost:11434. NDJSON streaming.
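
NDJSON streaming means one JSON object per line, but network chunks do not respect line boundaries. The sketch below shows one way to buffer them; NdjsonBuffer is a hypothetical helper, not the SDK's code, and the message.content and done fields follow Ollama's documented /api/chat stream shape, which is worth checking against your Ollama version.

```typescript
// Accumulates raw text chunks and returns one parsed object per complete line.
class NdjsonBuffer {
  private pending = "";

  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    this.pending = lines.pop() ?? ""; // the last piece may be an incomplete line
    return lines.filter((l) => l.trim() !== "").map((l) => JSON.parse(l));
  }
}

// Simulated chunks: the first JSON object is split across two reads.
const buf = new NdjsonBuffer();
let text = "";
for (const chunk of [
  '{"message":{"content":"Hel"},"done":fal',
  'se}\n{"message":{"content":"lo"},"done":true}\n',
]) {
  for (const obj of buf.push(chunk)) {
    text += (obj as { message: { content: string } }).message.content;
  }
}
console.log(text);
```

Keeping the buffer synchronous makes it easy to drop into any read loop, whether the chunks come from a fetch body reader or a test fixture.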

Usage

import {
  ProviderRouter, AnthropicProvider, OllamaProvider,
} from "@azmxailabs/agent-sdk/providers";

const router = new ProviderRouter()
  .register("claude-fast",  new AnthropicProvider({
    apiKey: process.env.ANTHROPIC_API_KEY!,
    model: "claude-haiku-4-5",
  }))
  .register("claude-smart", new AnthropicProvider({
    apiKey: process.env.ANTHROPIC_API_KEY!,
    model: "claude-opus-4-7",
  }), { default: true })
  .register("local",        new OllamaProvider({ model: "qwen2.5-coder:14b" }));

// One-shot
const r = await router.complete({
  model: "claude-fast",
  messages: [
    { role: "system", content: "You are concise." },
    { role: "user",   content: "Hello, briefly." },
  ],
  temperature: 0.2,
  maxTokens: 200,
});
console.log(r.text);
console.log(r.usage);  // → { inputTokens, outputTokens, cacheReadTokens?, ... }
console.log(r.finishReason);

// Streaming
for await (const chunk of router.stream({
  model: "local",
  messages: [{ role: "user", content: "Stream me a haiku." }],
})) {
  process.stdout.write(chunk.delta);
  if (chunk.done) console.log("\n[finish]", chunk.finishReason, chunk.usage);
}
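
Alias routing itself is a small idea: a map from names to providers plus a default. The sketch below shows that idea in isolation. AliasRouter, MiniProvider, and resolve are hypothetical names, the provider is reduced to a synchronous function, and treating the first registration as the default is an assumption, not documented SDK behavior.

```typescript
interface MiniProvider { name: string; complete(prompt: string): string }

class AliasRouter {
  private readonly providers = new Map<string, MiniProvider>();
  private defaultAlias: string | undefined;

  register(alias: string, provider: MiniProvider, opts: { default?: boolean } = {}): this {
    this.providers.set(alias, provider);
    // Assumption: the first registration is the default unless one is flagged.
    if (opts.default || this.defaultAlias === undefined) this.defaultAlias = alias;
    return this; // allows chained .register() calls
  }

  resolve(alias?: string): MiniProvider {
    const key = alias ?? this.defaultAlias;
    const provider = key === undefined ? undefined : this.providers.get(key);
    if (!provider) throw new Error(`no provider registered for alias "${key}"`);
    return provider;
  }
}

const echo = (name: string): MiniProvider => ({ name, complete: (p) => `${name}: ${p}` });
const mini = new AliasRouter()
  .register("fast", echo("small-model"))
  .register("smart", echo("large-model"), { default: true });
console.log(mini.resolve("fast").complete("hello"));
```

Call sites only ever mention the alias, so swapping the model behind "fast" is a change to one register call.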

Authoring a custom adapter

The Provider interface is tiny — three things: a name, complete, and stream. Skeleton:

import type {
  Provider, ChatRequest, ChatResponse, StreamChunk,
} from "@azmxailabs/agent-sdk/providers";

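export class MyProvider implements Provider {
  readonly name = "my-provider";
  private readonly apiKey: string;

  constructor(apiKey: string) {
    this.apiKey = apiKey; // caller's own key; never hard-code it
  }

  async complete(req: ChatRequest): Promise<ChatResponse> {
    const res = await fetch("https://my-api.example.com/v1/chat", {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ /* translate from req */ }),
      signal: req.signal,
    });
    // Surface HTTP failures instead of parsing an error body as a response.
    if (!res.ok) throw new Error(`my-provider: HTTP ${res.status}`);
    const data = await res.json();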
    return {
      text: data.output.text,
      finishReason: data.output.finish_reason ?? "stop",
      usage: { inputTokens: data.usage.in, outputTokens: data.usage.out },
      raw: data,
    };
  }

  async *stream(req: ChatRequest): AsyncIterable<StreamChunk> {
    // your SSE / NDJSON / WebSocket loop here
    // yield { delta: "..." } for each token chunk
    // yield { delta: "", done: true, finishReason: "...", usage: {...} } at end
  }
}

Reference: OllamaProvider is the smallest, ~120 lines including streaming.

BYOK guarantee: the SDK never proxies model traffic. Every request goes from the caller's machine directly to the provider's host, using the caller's API key. AZMX servers never see your keys, prompts, or responses.

Build your own agent · Recipes

Recipes — common patterns

Recipe 1 — Full agent loop (Ask → Propose → Approve → Execute → Record)

The complete AZMX-style loop. Wire it to your UI's prompt input and you have a working approval-gated agent in under 100 lines.

import {
  ApprovalGate, standardPolicy, destructiveShellDenyPolicy,
  DenyList, denyListPolicy,
  HashChainedAuditLog,
  ProviderRouter, AnthropicProvider,
} from "@azmxailabs/agent-sdk";
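import { execSync } from "node:child_process";
import { readFileSync, writeFileSync } from "node:fs";
import { createInterface } from "node:readline/promises";

// Minimal stand-ins for the two names the recipe leaves to you.
// readKey reads one line from the terminal; swap in your UI's input.
const rl = createInterface({ input: process.stdin, output: process.stdout });
const readKey = async () => (await rl.question("")).trim().toLowerCase();
// Placeholder prompt; write your own instructions for the JSON action format.
const SYSTEM_PROMPT_THAT_TELLS_MODEL_TO_OUTPUT_JSON_ACTIONS =
  'Reply only with a JSON array of actions: [{ "kind", "summary", "target"?, "payload"? }].';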

const log = new HashChainedAuditLog({ path: "./agent-audit.jsonl" });
const deny = new DenyList();
const gate = new ApprovalGate({
  policies: [denyListPolicy(deny), destructiveShellDenyPolicy(), standardPolicy()],
  onPrompt: async ({ action, reasons }) => {
    process.stdout.write(
      `\n[approval] ${action.kind} :: ${action.summary}\n  ` +
      `reasons: ${reasons.join(", ")}\n  [a]pprove / [t]rust / [r]eject: `
    );
    const answer = await readKey();
    return answer === "a" ? "approve"
         : answer === "t" ? "approve-and-trust"
         : "reject";
  },
  onDecision: (e) => log.append({ type: "approval", ...e }),
});

const router = new ProviderRouter().register("claude", new AnthropicProvider({
  apiKey: process.env.ANTHROPIC_API_KEY!, model: "claude-opus-4-7",
}));

async function dispatch(action: { kind: string; summary: string; target?: string; payload?: any }) {
  const decision = await gate.check(action);
  if (decision === "denied") return { ok: false, reason: "denied" };

  switch (action.kind) {
    case "shell":      return { ok: true, output: execSync(action.summary).toString() };
    case "file:read":  return { ok: true, output: readFileSync(action.target!, "utf8") };
    case "file:write": writeFileSync(action.target!, action.payload as string); return { ok: true };
    default:           return { ok: false, reason: `unknown kind: ${action.kind}` };
  }
}

// The loop: ask model → it proposes actions → you dispatch each through the gate
async function tick(userPrompt: string) {
  const r = await router.complete({
    model: "claude",
    messages: [
      { role: "system", content: SYSTEM_PROMPT_THAT_TELLS_MODEL_TO_OUTPUT_JSON_ACTIONS },
      { role: "user",   content: userPrompt },
    ],
  });
  const proposed = JSON.parse(r.text) as Array<{ kind: string; summary: string; target?: string; payload?: any }>;
  for (const action of proposed) {
    const result = await dispatch(action);
    await log.append({ type: "result", action, result });
  }
}
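
The tick function above casts JSON.parse(r.text) straight to an action array, which trusts the model to produce well-formed output. A minimal sketch of validating before dispatch follows. parseProposedActions and isProposedAction are hypothetical helpers for illustration, not SDK exports.

```typescript
// Shape the recipe expects from the model; mirrors the dispatch() parameter.
interface ProposedAction {
  kind: string;
  summary: string;
  target?: string;
  payload?: unknown;
}

// Type guard: true only when the required string fields are present.
function isProposedAction(value: unknown): value is ProposedAction {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v["kind"] === "string" &&
    typeof v["summary"] === "string" &&
    (v["target"] === undefined || typeof v["target"] === "string")
  );
}

// Parses model text into actions. Malformed JSON or a non-array yields [],
// and malformed items are dropped, so a bad completion means "nothing to do".
export function parseProposedActions(text: string): ProposedAction[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(isProposedAction);
}
```

Dropping bad items silently is one design choice; rejecting the whole batch and logging the raw text may suit stricter agents better.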

Recipe 2 — CI agent (permissive but audited)

For trusted CI environments where you want speed (no human in the loop) but a full audit trail. The gate auto-approves everything that isn't hard-denied, and every action still flows through the log.

import {
  ApprovalGate, permissivePolicy, destructiveShellDenyPolicy,
  HashChainedAuditLog,
} from "@azmxailabs/agent-sdk";

const log = new HashChainedAuditLog({ path: process.env.CI_AUDIT_PATH! });

const gate = new ApprovalGate({
  policies: [
    destructiveShellDenyPolicy(["git push --force", "kubectl delete"]), // safety net
    permissivePolicy(),
  ],
  // No onPrompt — gate auto-approves anything not hard-denied.
  onDecision: (e) => log.append({ type: "approval", ...e }),
});

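// After the CI run, upload the audit log and verify the chain.
// uploadToS3WithObjectLock is your own helper, not part of the SDK.
await uploadToS3WithObjectLock(process.env.CI_AUDIT_PATH!); // same file the log writes to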
const v = await log.verify();
if (!v.ok) throw new Error(`audit tampered at seq ${v.brokenAtSeq}`);

Recipe 3 — Combine the SDK with the MCP server (let other AI clients drive your agent)

Use @modelcontextprotocol/sdk to expose your approval-gated agent over MCP, so Claude Desktop / Cursor / ChatGPT can call into it. The agent-sdk handles the inside of every tool call.

import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { CallToolRequestSchema, ListToolsRequestSchema } from "@modelcontextprotocol/sdk/types.js";
import { ApprovalGate, standardPolicy, denyListPolicy, DenyList } from "@azmxailabs/agent-sdk";

const gate = new ApprovalGate({
  policies: [denyListPolicy(new DenyList()), standardPolicy()],
  onPrompt: async () => "reject", // headless: deny anything that needs prompting
});

const server = new Server({ name: "my-agent", version: "0.1.0" }, { capabilities: { tools: {} } });

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [{
    name: "run_shell",
    description: "Run a shell command (subject to my approval gate)",
    inputSchema: { type: "object", properties: { cmd: { type: "string" } }, required: ["cmd"] },
  }],
}));

server.setRequestHandler(CallToolRequestSchema, async (req) => {
  if (req.params.name !== "run_shell") throw new Error("unknown tool");
  const cmd = (req.params.arguments as { cmd?: unknown } | undefined)?.cmd;
  if (typeof cmd !== "string") throw new Error("run_shell requires a string cmd");

  const decision = await gate.check({ kind: "shell", summary: cmd });
  if (decision === "denied") {
    return { content: [{ type: "text", text: `denied: ${cmd}` }], isError: true };
  }
  const out = (await import("node:child_process")).execSync(cmd).toString();
  return { content: [{ type: "text", text: out }] };
});

await server.connect(new StdioServerTransport());

Recipe 4 — Test your gate behavior (no fake mocking)

import { ApprovalGate, denyListPolicy, DenyList, destructiveShellDenyPolicy, standardPolicy } from "@azmxailabs/agent-sdk";
import test from "node:test";
import assert from "node:assert/strict";

test("blocks rm regardless of approval", async () => {
  const gate = new ApprovalGate({
    policies: [destructiveShellDenyPolicy(), standardPolicy()],
    onPrompt: async () => "approve",
  });
  const decision = await gate.check({ kind: "shell", summary: "rm -rf /" });
  assert.equal(decision, "denied");
});

test("blocks reading .env even when user approves", async () => {
  const gate = new ApprovalGate({
    policies: [denyListPolicy(new DenyList()), standardPolicy()],
    onPrompt: async () => "approve",
  });
  const decision = await gate.check({
    kind: "file:read", target: "/projects/x/.env", summary: "read .env",
  });
  assert.equal(decision, "denied");
});

Build your own agent · Production

Production checklist

Before shipping an agent built on the SDK, walk this list.

  1. Choose a real threat model. Who can write to your machine? Your env? Your audit log? An approval gate is only as strong as the perimeter you trust. Document the perimeter in SECURITY.md.
  2. Decide on a policy chain. destructiveShellDenyPolicy + denyListPolicy + (standardPolicy or paranoidPolicy). Add custom policies for your business invariants (no production database writes after 6 PM, no external network during PRD review, etc.).
  3. Pair the audit log with append-only storage. File-on-disk is tamper-evident, not tamper-proof. Stream to an immutable S3 bucket with Object Lock, a SIEM the agent can't read, or a WORM volume.
  4. Periodically anchor the chain. Take the latest hash from the log and write it somewhere the agent can't rewrite (OpenTimestamps, a transparency log, or storage in your customer's own tenant) so attackers can't silently rewrite history.
  5. Rotate provider keys. Keep BYOK keys in a secrets manager, not process.env loaded from .env. Make rotation a one-line operation.
  6. Set timeouts + abort signals. Every ChatRequest accepts an AbortSignal. Wire it to a request-level timeout so a runaway provider can't hang the agent.
  7. Cap retries. The SDK does not auto-retry — that's intentional. Add explicit retry with exponential backoff at the caller, capped to e.g. 3 attempts. Never retry on a destructive action.
  8. Sanitize what you log. The audit log data field is opaque — don't write API keys, full prompts with PII, or anything that violates your data retention policy. Hash, redact, or summarize before log.append().
  9. Test your gate. Write tests for the actions you expect to be denied (rm -rf, .env reads, after-hours pushes). Tests are your spec.
  10. Run with paranoid mode in CI. If your CI's gate is paranoid + no onPrompt, the safe default is to deny everything — your agent fails closed if a policy regresses.
  11. Pin SDK version in package.json. The SDK follows semver; minor bumps may change default policies. Pin the exact version for any agent that ships to customers.
One thing the SDK does NOT do: sandbox the action execution. execSync, writeFileSync, and fetch run with the agent process's full privileges. If you need a real sandbox (rootless container, gVisor, macOS App Sandbox), wire it at execution time — the gate only decides "yes or no," it doesn't constrain the "how."
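
Items 6 and 7 in the checklist can be combined in one caller-side wrapper. The sketch below is an assumption-laden illustration: withTimeoutAndRetry is a hypothetical helper, and the call argument stands in for something like (signal) => router.complete({ ...request, signal }). Use it for model calls only, never around a destructive action.

```typescript
// Runs `call` with an AbortSignal that fires after `timeoutMs`, retrying
// failures with exponential backoff up to `maxAttempts` total attempts.
export async function withTimeoutAndRetry<T>(
  call: (signal: AbortSignal) => Promise<T>,
  opts: { timeoutMs: number; maxAttempts: number; baseDelayMs: number },
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), opts.timeoutMs);
    try {
      return await call(controller.signal);
    } catch (err) {
      lastError = err;
    } finally {
      clearTimeout(timer); // always release the timer, success or failure
    }
    if (attempt < opts.maxAttempts) {
      // Backoff doubles each attempt: base, 2x base, 4x base, ...
      const delay = opts.baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

The abort only takes effect if the wrapped call actually honors the signal, which is why the checklist asks you to pass it into every ChatRequest.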

TypeScript types — quick reference

// From @azmxailabs/agent-sdk

type ActionKind = "shell" | "file:write" | "file:read" | "file:delete"
                | "network" | "git" | "process:spawn" | "tool" | (string & {});

interface AgentAction {
  kind: ActionKind;
  summary: string;
  target?: string;
  payload?: unknown;
  meta?: Record<string, unknown>;
}

type PolicyDecision = "auto" | "ask" | "deny";

interface Policy {
  name: string;
  classify(action: AgentAction): PolicyDecision | Promise<PolicyDecision>;
}

type UserDecision = "approve" | "approve-and-trust" | "reject";

interface ChatRequest {
  model: string;
  messages: { role: "system" | "user" | "assistant"; content: string }[];
  temperature?: number;
  maxTokens?: number;
  stop?: string[];
  providerOptions?: Record<string, unknown>;
  signal?: AbortSignal;
}

interface ChatResponse {
  text: string;
  finishReason: string;
  usage?: { inputTokens: number; outputTokens: number; cacheReadTokens?: number; cacheCreationTokens?: number };
  raw?: unknown;
}

interface VerifyResult {
  ok: boolean;
  count?: number;
  brokenAtSeq?: number;
  reason?: string;
  expected?: string;
  found?: string;
}
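
Using the Policy interface above, a custom business-invariant policy (checklist item 2) might look like the sketch below. Several things here are assumptions: the "prod-db" marker, the injected clock, and above all the idea that a policy with no objection returns "auto" and the gate takes the strictest decision across the chain. Check the SDK source for the real combination rule.

```typescript
// Local copies of the quick-reference types so the sketch stands alone.
type PolicyDecision = "auto" | "ask" | "deny";
interface AgentAction {
  kind: string;
  summary: string;
  target?: string;
}
interface Policy {
  name: string;
  classify(action: AgentAction): PolicyDecision | Promise<PolicyDecision>;
}

// Hypothetical policy: deny shell commands that mention the production
// database once the local hour reaches 18:00. The clock is injectable so the
// rule can be tested without waiting for evening.
export function afterHoursProdDbPolicy(now: () => Date = () => new Date()): Policy {
  return {
    name: "after-hours-prod-db",
    classify(action) {
      const touchesProd = action.kind === "shell" && action.summary.includes("prod-db");
      if (touchesProd && now().getHours() >= 18) return "deny";
      return "auto"; // assumption: "auto" means "no objection from this policy"
    },
  };
}
```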

Roadmap signals for v0.2+: tool / function-calling support across providers · an OpenAIProvider (covers OpenAI + OpenAI-compatible APIs: Groq, Cerebras, xAI, DeepSeek, Azure, NVIDIA NIM) · a GoogleProvider for Gemini · an MCPClient for talking to MCP servers · cost tracking middleware. File an issue if you want any of these prioritized.

Connectors

AI providers

Eleven providers, BYOK. Switch at any time from Settings → Models. AZMX never proxies model traffic; your bearer token goes machine → provider directly, on your account, under their privacy policy.

Connectors

Integrations

GitHub, GitLab, Linear, Jira, Slack, Notion, Stripe, Sentry, Datadog, OpenTelemetry — all via MCP or native CLI tools. Full surface area on /connectors.

Workspace

Editor + vim mode

CodeMirror 6 under the hood. Vim mode enabled by default; :wq works. Per-hunk AI diffs let you accept or reject each change individually. Fuzzy quick-open ⌘P, command palette ⌘K.

Workspace

PTY terminal

Real PTY — not a wrapped xterm-in-a-tab. zsh, bash, pwsh. Multi-tab, split panes (⌘D, ⌘⇧D), OSC 7/133 shell integration. The agent reads the live buffer; you approve the commands it runs.

Workspace

Cross-device sync

Pro+. End-to-end encrypted: keys are derived with PBKDF2 and data is sealed with AES-256-GCM. The sync worker stores only ciphertext. The recovery receipt is generated locally and never leaves your device. Three stores sync today (snippets, todos, memory); sessions are next.

Full architecture writeup: /cross-device-sync.
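
For readers unfamiliar with the PBKDF2 + AES-256-GCM pairing, here is a generic sketch using Node's built-in crypto module. It is not AZMX's sync code: the iteration count, salt and nonce sizes, and the seal/unseal names are illustrative assumptions.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

interface Sealed {
  salt: Buffer;
  iv: Buffer;
  tag: Buffer;
  ciphertext: Buffer;
}

// Illustrative iteration count, not AZMX's actual parameter.
const ITERATIONS = 100_000;

// PBKDF2 stretches the passphrase into a 32-byte key (the AES-256 key size).
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return pbkdf2Sync(passphrase, salt, ITERATIONS, 32, "sha256");
}

export function seal(passphrase: string, plaintext: string): Sealed {
  const salt = randomBytes(16);
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return { salt, iv, tag: cipher.getAuthTag(), ciphertext };
}

export function unseal(passphrase: string, box: Sealed): string {
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(passphrase, box.salt), box.iv);
  decipher.setAuthTag(box.tag);
  // final() throws when the tag does not match: wrong key or modified data.
  return Buffer.concat([decipher.update(box.ciphertext), decipher.final()]).toString("utf8");
}
```

A server that holds only salt, iv, tag, and ciphertext cannot read the content without the passphrase, which is the property "stores only ciphertext" relies on.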

Workspace

AZMX.md + memory

Drop an AZMX.md in your workspace root. The agent reads it on every session — house rules, codebase conventions, "never use library X." Plus .azmx/memory for facts the agent remembers across sessions. Both files are explicit; you can read, edit, and version-control them like any other.

Teams + admin

Admin console

Teams tier. Magic-link auth — 15-minute TTL, IP-tied. Five surfaces: seats, members, spend, policy, identity. Live at admin.azmx.ai after your first Teams purchase via Polar.

Full feature surface: /admin.

Teams + admin

SAML + SCIM

SAML 2.0 SP with full XML-DSig verification (SignatureValue + DigestValue). SCIM 2.0 provisioning. Cert + issuer pinning per-IdP. Tested with Okta, Azure AD, Google Workspace, OneLogin, PingFederate.

Teams + admin

Org policy

Push a JSON file via the admin console; every seat picks it up on next launch. Controls DLP egress, provider allowlist, agent sandbox (disable shell, sub-agents, MCP categories), notification webhooks. MDM-friendly. Pre-flight validation via CLI.

Teams + admin

Spend + anomaly alerts

Per-seat, per-provider, with daily / weekly / monthly rollups. CSV export. Anomaly alerts trigger via Slack / Discord / MS Teams when a seat's spend exceeds the z-score threshold or a ratio multiplier (e.g. "10× weekly avg").
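
The two alert conditions can be pictured with a small function. This is a sketch under stated assumptions: isSpendAnomaly is a hypothetical name, the thresholds are caller-supplied, and the real service may compute its baseline differently.

```typescript
// Flags `today` when it sits more than `zThreshold` sample standard deviations
// above the historical mean, or when it exceeds `ratio` times that mean.
export function isSpendAnomaly(
  history: number[],
  today: number,
  opts: { zThreshold: number; ratio: number },
): boolean {
  if (history.length < 2) return false; // not enough data for a deviation
  const mean = history.reduce((a, b) => a + b, 0) / history.length;
  const variance =
    history.reduce((a, b) => a + (b - mean) ** 2, 0) / (history.length - 1);
  const std = Math.sqrt(variance);
  // With zero spread, any increase counts as infinitely unusual.
  const z = std === 0 ? (today > mean ? Infinity : 0) : (today - mean) / std;
  return z > opts.zThreshold || today > opts.ratio * mean;
}
```

The ratio rule catches the "10× weekly avg" case even when a seat's history is noisy enough that the z-score stays low.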

Enterprise

Self-hosted license issuer

Your fleet's Ed25519 keypair signs licenses. AZMX has zero ability to revoke or read your devices. The issuer is a small process you run inside your perimeter. Provisioning is contract-driven.
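
To make the trust model concrete, here is a generic Ed25519 sign-and-verify round trip with Node's crypto module. The license payload is invented for illustration; the real license format and issuer protocol are not described here.

```typescript
import { generateKeyPairSync, sign, verify } from "node:crypto";

// The private key stays with the issuer inside your perimeter; devices only
// need the public key to check a license.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");

// Hypothetical license payload.
const license = Buffer.from(JSON.stringify({ seat: "dev-042", expires: "2026-12-31" }));

// Ed25519 hashes internally, so the digest algorithm argument is null.
const signature = sign(null, license, privateKey);

// A device verifies with the public key alone.
export const licenseValid = verify(null, license, publicKey, signature);

// Any change to the payload invalidates the signature.
const altered = Buffer.from(JSON.stringify({ seat: "dev-042", expires: "2099-12-31" }));
export const alteredValid = verify(null, altered, publicKey, signature);
```

Because only the holder of the private key can produce a valid signature, a vendor without that key cannot mint or replace licenses for the fleet.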

Enterprise

FIPS 140-3

Allowlist evaluator restricts the application to FIPS-approved cryptographic primitives. Required by FedRAMP High and many DoD procurement contexts.

Enterprise

PIV / CAC authentication

X.509 trust evaluator + challenge issuance. RSA-SHA256 + ECDSA-SHA256 verification. Smart-card-native U.S. federal sign-in flow.

Enterprise

Air-gap mode

Local-only AI lock + offline issuer + manual update channel. The entire trust chain verifies inside your perimeter. Updates ship as signed tarballs; you transfer + verify them.

Enterprise

SIEM export

Hash-chained audit log streams as signed JSONL pages. Splunk · Datadog · OpenTelemetry · Elastic. Custom HTTP endpoint also supported. The receiving SIEM verifies signature + chain integrity at intake.
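
Chain-integrity checking at intake can be pictured with a toy verifier. The entry layout and hash input below are assumptions for illustration, not the AZMX page format, and signature checking is left out.

```typescript
import { createHash } from "node:crypto";

interface Entry {
  seq: number;
  data: string;
  prevHash: string;
  hash: string;
}

const GENESIS = "0".repeat(64);

// Illustrative encoding: hash over seq, previous hash, and payload.
function hashEntry(seq: number, data: string, prevHash: string): string {
  return createHash("sha256").update(`${seq}\n${prevHash}\n${data}`).digest("hex");
}

export function appendEntry(chain: Entry[], data: string): Entry[] {
  const last = chain[chain.length - 1];
  const prevHash = last ? last.hash : GENESIS;
  const seq = chain.length;
  return [...chain, { seq, data, prevHash, hash: hashEntry(seq, data, prevHash) }];
}

// Walks from genesis; the first entry whose link or hash fails is reported.
export function verifyChain(chain: Entry[]): { ok: boolean; brokenAtSeq?: number } {
  let prevHash = GENESIS;
  for (const entry of chain) {
    const expected = hashEntry(entry.seq, entry.data, prevHash);
    if (entry.prevHash !== prevHash || entry.hash !== expected) {
      return { ok: false, brokenAtSeq: entry.seq };
    }
    prevHash = entry.hash;
  }
  return { ok: true };
}
```

Editing any entry changes its hash, which breaks the link recorded in every later entry, so a rewrite has to touch the whole tail of the log to go unnoticed.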

Cost & routing

Best-value model router

Every turn picks the cheapest model in your provider pool that still meets the task's quality bar. Routine chats go to fast/cheap models at fractions of a cent; harder edits land on balanced; explicit reasoning escalates only when needed.

The router classifies each turn (chat / edit / reasoning / vision), then walks your available models in cost order. Paid models always beat free-tier siblings on ties: free routes flake under load, and saving two cents isn't worth hitting a TPM cap. Local providers (Ollama, LM Studio) stay explicit-only: never silently routed to, but always honored when you pick them.

Typical savings: 80–93% on a normal coding session compared to running every turn on your selected premium model. Quality is unchanged on hard turns: the router escalates by tier and never drops below the tier a task needs.

Toggle from Settings → Models → Routing strategy: best-value (default) or selected (honor your exact pick, no swap).
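
The selection rules above can be sketched in a few lines. This is an interpretation, not the shipped router: the tier names, the family field, and reading "siblings" as the free and paid routes of the same model family are all assumptions, and turn classification is taken as an input.

```typescript
type Tier = "fast" | "balanced" | "reasoning";

interface RoutedModel {
  id: string;
  family: string; // groups the free and paid routes of one model
  tier: Tier;
  costPerMTok: number;
  free: boolean;
  local: boolean;
}

const TIER_RANK: Record<Tier, number> = { fast: 0, balanced: 1, reasoning: 2 };

// Cheapest model that meets the required tier. Local models are skipped
// (explicit-only), and a free route is dropped when its family has a paid one.
export function pickModel(pool: RoutedModel[], required: Tier): RoutedModel | undefined {
  const eligible = pool.filter(
    (m) => !m.local && TIER_RANK[m.tier] >= TIER_RANK[required],
  );
  const paidFamilies = new Set(eligible.filter((m) => !m.free).map((m) => m.family));
  const candidates = eligible.filter((m) => !(m.free && paidFamilies.has(m.family)));
  candidates.sort((a, b) => a.costPerMTok - b.costPerMTok);
  return candidates[0];
}
```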

Cost & routing

Live cost chip

A small chip in the chat header shows running session spend at all times. Click it for a popover comparing actual cost to a Claude Sonnet baseline, with figures along the lines of "saved 73% · $0.022".

On local-only providers (Ollama / LM Studio) the chip renders $0 · local — same surface, accurate accounting. On the cost-conscious tiers it's the visible side of the work the router does behind the scenes.

  • Per-turn delta — see the exact cost of the conversation you're having
  • Baseline comparison — saved X% vs claude-sonnet-4-6 in the popover
  • Aggregates per-session — no remote tracking, no rollup beyond your machine
  • Survives a model swap mid-session — re-priced per the active model each turn

Cost & routing

Compact mode for low-budget tiers

Groq's on_demand free tier caps at 8,000 tokens per minute. A plain "hi" used to cost a floor of ~16,000 tokens (system prompt + 24-tool registry), so the request bounced. Compact mode drops that floor to ~1,500.

On the low-budget routes AZMX swaps the full system prompt (~2,550 tokens) for a ~250-token compact one, and the 24-tool registry for the 10-tool essentials (read · grep · glob · list · write · edit · multi_edit · bash_run · bash_logs · todo_write). The agent stays functional — heavy tools (sub-agents, code-graph indexers, GPU profilers, document tools, MCP servers) come back the moment you switch to a roomier model.

Activates automatically on detected low-budget models. No user config required.

Cost & routing

Anthropic prompt caching

When the active model is Anthropic, the system prompt + tool registry get cacheControl: ephemeral. Cached tokens are billed at 10% of input price; session input cost drops ~85% on turn 2 and after.

5-minute TTL refreshed by each turn that hits the cache — a normal coding session keeps the cache warm. No user config; activates automatically. Tracked silently by the live cost chip.
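
The arithmetic behind the savings claim is easy to check. In the sketch below the token counts are made up for illustration, and the calculation ignores any surcharge a provider applies when the cache is first written.

```typescript
// Relative input cost of a turn when `cachedTokens` are billed at 10% of the
// input price and `freshTokens` are billed in full (1.0 = no caching).
export function cachedInputCostRatio(cachedTokens: number, freshTokens: number): number {
  const withCache = 0.1 * cachedTokens + freshTokens;
  const withoutCache = cachedTokens + freshTokens;
  return withCache / withoutCache;
}

// Illustrative turn: 9,500 tokens of cached system prompt and tools,
// 500 tokens of new conversation. The ratio is about 0.145, so input cost
// for the turn falls by roughly 85%.
export const exampleRatio = cachedInputCostRatio(9_500, 500);
```

The saving shrinks as the uncached part of the conversation grows, since only the cached prefix gets the discount.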

Self-healing

Automatic fallback on errors

A TPM cap mid-stream used to plant a red error card you had to dismiss and recover from manually. Now the agent silently switches to the next viable model and keeps going. The first sign you'll see is the step strip — Switched to ollama-local after groq/gpt-oss-20b capped — and your answer.

What triggers the auto-switch

Only recoverable errors: TPM caps, RPM caps, 429 rate-limits, "rate-limited upstream", 502/503/504, network blips (Load failed, fetch failed, ECONNRESET, ETIMEDOUT). Auth failures (401/403), DLP egress blocks, budget caps, user-cancelled aborts, and model-not-found errors surface immediately, because a different model wouldn't help.
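
One way to picture the trigger rule is a small classifier. The normalized error shape, the pattern list, and the function name are assumptions for this sketch; real provider errors vary and the shipped matcher may look different.

```typescript
// Hypothetical normalized error; real provider errors carry more fields.
interface ProviderError {
  status?: number;
  message: string;
}

const FATAL_STATUS = new Set([401, 403]);
const RECOVERABLE_STATUS = new Set([429, 502, 503, 504]);
const RECOVERABLE_PATTERNS = [
  /\b(TPM|RPM)\b/, // assumed marker for per-minute caps
  /rate-limited upstream/i,
  /load failed/i,
  /fetch failed/i,
  /ECONNRESET/,
  /ETIMEDOUT/,
];

// True when retrying on another model could plausibly succeed.
// Anything unrecognized is treated as non-recoverable so it surfaces.
export function isRecoverable(err: ProviderError): boolean {
  if (err.status !== undefined && FATAL_STATUS.has(err.status)) return false;
  if (err.status !== undefined && RECOVERABLE_STATUS.has(err.status)) return true;
  return RECOVERABLE_PATTERNS.some((p) => p.test(err.message));
}
```

Defaulting to "not recoverable" keeps unknown failures visible instead of hiding them behind a model swap.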

The fallback ladder

Each step skips models already tried in the turn:

  1. Running local daemon — Ollama or LM Studio probe says available. Hot daemons cost $0 and have no rate limit. This is the strongest fallback for the common Groq-TPM case.
  2. Same-provider cheaper sibling — Sonnet → Haiku, GPT-5.5 → GPT-5.4-mini. Keeps the quality bar high.
  3. Cross-provider paid — cheapest paid model on a provider you have a key for.
  4. Free tier — last resort (OpenRouter :free, Gemma, Sarvam).
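
The ladder above amounts to an ordered search over rungs that skips anything already tried. A minimal sketch, with hypothetical type and function names and a made-up cost field for ordering within a rung:

```typescript
type Rung = "local" | "sibling" | "paid" | "free";

interface Candidate {
  id: string;
  rung: Rung;
  costPerMTok: number;
}

// Rungs in the order the ladder lists them.
const RUNG_ORDER: Rung[] = ["local", "sibling", "paid", "free"];

// Returns the first untried candidate on the highest-priority rung,
// cheapest first within a rung, or undefined when nothing is left.
export function nextFallback(
  candidates: Candidate[],
  tried: Set<string>,
): Candidate | undefined {
  for (const rung of RUNG_ORDER) {
    const options = candidates
      .filter((c) => c.rung === rung && !tried.has(c.id))
      .sort((a, b) => a.costPerMTok - b.costPerMTok);
    if (options.length > 0) return options[0];
  }
  return undefined;
}
```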

Pre-stream and mid-stream

The transport handles errors thrown before the first token streams (auth-reject, pre-stream rate-limit, 5xx on POST). Errors that fire after tokens have started flowing — half-finished assistant message, then a TPM cap — get caught at the SDK boundary and re-issued via a one-shot model override + regenerate(). Either path is capped at 2 total hops per turn so a permanently-broken environment surfaces honestly instead of spinning.

Your picked model never changes. If you picked Groq, you keep Groq for the next turn; only this turn landed on the fallback. The chat header still shows your pick, and only the step strip mentions the swap.

Memory & learning

Episodic memory

When your prompt looks like an error report ("error", "failed", "traceback", "panic", "command not found", and ~30 other markers), AZMX captures (your prompt → the resolution) locally. On a future similar turn, the top-3 matched past cases are injected into the agent's system prompt as EPISODIC MEMORY: "Last time you saw this, the fix was…"

How matching works

Pure-local Jaccard token overlap. No embeddings, no network, no latency surprise. Same-workspace matches get a +0.15 score boost; cross-workspace recall still works but doesn't dominate. Relevance floor 0.08 — we'd rather return zero matches than fill three slots with junk.
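
A sketch of the matching step, using the numbers quoted above. The tokenizer and the choice to apply the relevance floor to the raw overlap before adding the workspace boost are assumptions; the text does not say which order AZMX uses.

```typescript
interface Incident {
  prompt: string;
  resolution: string;
  workspace: string;
}

// Assumed tokenizer: lowercase, split on anything that is not a-z, 0-9, or _.
function tokens(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9_]+/).filter((t) => t.length > 0));
}

// Jaccard similarity: size of the intersection over size of the union.
export function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  a.forEach((t) => {
    if (b.has(t)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

const WORKSPACE_BOOST = 0.15;
const RELEVANCE_FLOOR = 0.08;
const TOP_K = 3;

export function recall(prompt: string, workspace: string, incidents: Incident[]): Incident[] {
  const query = tokens(prompt);
  return incidents
    .map((incident) => ({ incident, overlap: jaccard(query, tokens(incident.prompt)) }))
    .filter((s) => s.overlap >= RELEVANCE_FLOOR) // floor on raw overlap
    .map((s) => ({
      incident: s.incident,
      score: s.overlap + (s.incident.workspace === workspace ? WORKSPACE_BOOST : 0),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, TOP_K)
    .map((s) => s.incident);
}
```

Applying the floor first means a same-workspace incident with no shared words cannot ride in on the boost alone.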

Where it lives

JSONL at ~/.azmx/incidents.jsonl on your machine. Capped at 500 incidents, FIFO; each row is capped at 600 characters of prompt and 1,200 of resolution. Open it, audit it, redact lines, rm to forget. It's plain JSONL on purpose: you own it.

What gets captured

  • Your prompt (first 600 chars, trimmed)
  • The assistant's final text + tool-call markers (e.g. [tool: bash_run]) — no tool arguments, ever
  • Workspace root + timestamp

What never gets captured

  • Anything if your prompt didn't look like an error (everyday "write me X" turns are not stored)
  • Tool call arguments — secrets that might be in args never touch disk via this path
  • Anything that crosses the network — capture is fire-and-forget local file append

Memory & learning

Training corpus (Helpful)

A small "Helpful" footer sits under every assistant message. Click it → the (prompt, response, workspace, model) tuple appends to ~/.azmx/training-corpus.jsonl. Same-machine only. The corpus is yours; we just hand you the collection surface.

Why this matters

A local fine-tune (LoRA on Llama 3.x or similar) needs (prompt, response) pairs. The corpus collected here can feed that. For now the value is just having the data — auditable, deletable, on your machine.

Pre-write redaction

Every row passes through the same DLP scanner the egress guard uses. Detected secrets get masked to [REDACTED-<kind>] at write time — we can't rm lines after the fact without rewriting, so masking at write-time is the defensive choice.
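
Write-time masking can be illustrated with a pair of regex detectors. The detector list and kind labels below are examples chosen for this sketch, not the DLP scanner's actual rules; only the [REDACTED-<kind>] output format comes from the text above.

```typescript
// Example detectors only; a real scanner would carry many more.
const DETECTORS: { kind: string; pattern: RegExp }[] = [
  { kind: "aws-access-key", pattern: /\bAKIA[0-9A-Z]{16}\b/g },
  { kind: "bearer-token", pattern: /\bBearer\s+[A-Za-z0-9._~+\/-]{20,}/g },
];

// Replaces every detected secret with a [REDACTED-<kind>] marker.
export function redact(text: string): string {
  return DETECTORS.reduce(
    (out, d) => out.replace(d.pattern, `[REDACTED-${d.kind}]`),
    text,
  );
}
```

Running redact on each field before the row is serialized means the secret never reaches the file, so nothing needs scrubbing later.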

Why no thumbs-down

A negative signal has a high false-positive rate (users thumbs-down a correct answer to the wrong question). A positive-only corpus is unambiguous training data. Adding negative-signal collection without much better grounding would produce a corpus that's actively harmful to fine-tune on. An honest constraint, not a feature gap.

Memory & learning

Skill distillation (/distill)

When AZMX uses the same N-step tool sequence repeatedly across sessions, you've effectively taught the agent a sub-skill. /distill scans the local log, finds workflows that hit the threshold, and proposes a draft .azmx/skills/<name>.md you can read, edit, accept, or delete.

The threshold

A workflow must be at least 4 tools long AND have happened at least 3 times across recorded turns. Tools-only matching — argument shapes are deliberately discarded so workflows cluster by pattern, not by specific args (and so secrets that might be in args never feed this path).
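
The threshold check can be sketched as a counting pass over recorded turns. This version assumes each turn contributes one whole tool sequence and that sequences must match exactly; the real clustering may be more flexible. findWorkflows is a hypothetical name.

```typescript
const MIN_TOOLS = 4;
const MIN_OCCURRENCES = 3;

// Each turn is reduced to its ordered tool names; arguments are never seen.
export function findWorkflows(turns: string[][]): string[][] {
  const counts = new Map<string, { tools: string[]; n: number }>();
  for (const tools of turns) {
    if (tools.length < MIN_TOOLS) continue; // too short to count as a workflow
    const key = tools.join(">");
    const entry = counts.get(key) ?? { tools, n: 0 };
    entry.n += 1;
    counts.set(key, entry);
  }
  return Array.from(counts.values())
    .filter((e) => e.n >= MIN_OCCURRENCES)
    .map((e) => e.tools);
}
```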

Name inference

Kebab-case stem from the first 4 distinct content words across sample prompts (filler dropped — "the", "a", "my", "please"). Plus a 4-char FNV-1a hash of the tool sequence so two distinct workflows never collide on the same filename.
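
A sketch of the naming rule follows. FNV-1a itself is standard (32-bit offset basis and prime as shown); how the 4-character suffix is cut from the 32-bit value, the separator used to join tool names, and the exact filler list are assumptions here.

```typescript
// 32-bit FNV-1a over character codes (equivalent to bytes for ASCII input).
export function fnv1a32(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

// Filler words named in the text; the real list is presumably longer.
const FILLER = new Set(["the", "a", "my", "please"]);

export function skillName(samplePrompts: string[], toolSequence: string[]): string {
  const words: string[] = [];
  for (const w of samplePrompts.join(" ").toLowerCase().split(/[^a-z0-9]+/)) {
    if (w !== "" && !FILLER.has(w) && words.indexOf(w) === -1) words.push(w);
    if (words.length === 4) break;
  }
  // Assumption: the suffix is the last four hex digits of the hash.
  const hex = ("0000000" + fnv1a32(toolSequence.join(">")).toString(16)).slice(-4);
  return `${words.join("-")}-${hex}`;
}
```

Because the suffix depends only on the tool sequence, two workflows with similar prompts but different tools still get different filenames in most cases; a 16-bit suffix makes collisions unlikely rather than impossible.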

Important

  • AZMX never auto-runs the drafts — they're documentation in .azmx/skills/ like any other skill
  • Existing skills are never overwritten — proposals silently skip when the filename already exists
  • Capture file ~/.azmx/tool-sequences.jsonl is plain JSONL — cat, jq, rm work as expected
  • Auto-notify when a fresh pattern crosses the threshold is on the roadmap; for now /distill is operator-driven

Reference

Troubleshooting

Agent says "provider key invalid"

Settings → Models → re-paste the key. Make sure your provider account has billing enabled. AZMX never modifies the key on read.

Approval gate never appears

Check Settings → Security → Approval mode. If it's Permissive, writes run immediately. Switch to Standard to see the gate.

MCP server fails to launch

Settings → Connectors → click the server → "Show logs." Most failures are missing env vars or a wrong working directory. The error message is verbatim from stderr.

Sync isn't propagating

Settings → Sync → "Sync now." If still blocked, verify both devices have the same recovery receipt + you're on Pro+.

Reference

Glossary

BYOK
Bring Your Own Key. You contract with the AI provider; AZMX is not on the network path between you and inference.
Approval gate
The structural bridge between agent intent and irreversible action. Every write and shell call pauses here.
Audit chain
Hash-chained local log. Each entry references the SHA-256 of the previous one. Tamper-evident, verifiable from genesis.
MCP
Model Context Protocol — the standard transport for letting the agent talk to external services (databases, observability, issue trackers).
Sub-agent
Bounded-toolset delegate. Runs a specific task with predictable output. Authored in YAML, ~20 lines.
Trust floor
The set of properties that don't change between Free and Enterprise: per-call approval, secret-path screen, hash-chained audit log.
Customer-rooted issuer
Enterprise-tier license signing rooted in your fleet's Ed25519 key. AZMX cannot revoke your devices.
Air-gap mode
Mode in which every cloud-provider path refuses to construct a client. The entire trust chain verifies inside your perimeter.

Source-of-truth: the public AzmxAI/azmx repo. Every shipped line maps to a real file or endpoint. Engineering posts on the blog. Release notes on /releases. Architecture writeups on /research.

Build with AZMX. Ship without surrendering.