amux for AI Engineers

The open-source control plane AI engineers deploy once and build their entire agent infrastructure on top of. Updated July 2026.

AI engineers don't just use agent tools — they build, orchestrate, and operate them. amux is the infrastructure layer: it runs your agent fleet as isolated processes with real supervision, exposes a clean REST API any agent or external system can call, tracks token spend from JSONL transcripts, integrates MCP servers fleet-wide, and self-heals sessions overnight without babysitting. Open source, single Python file, zero external dependencies — you can read the entire source, modify it, and deploy it on any Linux or macOS machine in under five minutes.

For AI engineers specifically: amux is not a Python orchestration SDK. Each agent is an independent OS process with its own PID, JSONL transcript, tmux window, and resource envelope. The control plane is a REST API. This gives you real isolation, real process supervision, and real observability — not async tasks inside a shared Python event loop.

What AI engineers build on top of amux

The core use pattern for AI engineers is: amux as control plane, your custom logic in CLAUDE.md / skills / scripts on top.

What you build amux primitive used
Multi-agent task pipeline Board + atomic claims + session-to-session messaging
Multi-model router (Opus for hard, Haiku for cheap) Session-level model config + bulk model-switch API
Observability dashboard (tokens, errors, output) JSONL transcript parsing + token tracking endpoint
Self-healing overnight pipeline Watchdog (auto-compact + restart + message replay)
MCP server fleet integration Centralized mcp.json shared by all sessions
Spec-driven agent fleet (task decomposition) Board + CLAUDE.md skills + session fork
Agent-to-agent delegation and result handoff POST /api/sessions/:name/send + GET /api/sessions/:name/peek
Business automation composed with coding agents Browser agent API + scheduler + board handoff

Harness engineering: building robust agent pipelines

An agent harness is the scaffolding that makes agents reliable at scale: context management, fault tolerance, cost controls, and quality gates. amux provides the infrastructure; you write the harness logic in CLAUDE.md and skills.

Context engineering

Context management is the highest-leverage skill in AI engineering. The wrong context makes capable agents wrong; efficient context makes cheap models smart.

# CLAUDE.md — context engineering patterns

## Context budget
- Read only files relevant to the current task
- Do not re-read files you've already read in this session
- After opening 10+ files, summarize what you've learned before reading more

## Memory files
- Read ~/.amux/memory/project-conventions.md at session start
- Post discoveries to amux notes so other agents can use them

## Compaction triggers
- After claiming a new task, run /compact to start fresh
- Before opening more than 5 large files, compact first

Multi-model routing strategy

Not all tasks need your most expensive model. AI engineers typically route by task complexity:

Task type Recommended model Rationale
Complex feature implementation, architecture decisions Claude Opus 4.8 / Codex Requires strong reasoning; cost justified by output quality
Standard implementation, bug fixes, PRs Claude Sonnet 5 Best quality/cost ratio for most coding work
Test generation, docstrings, boilerplate Claude Haiku 4.5 Fast, cheap — 5–10× cheaper than Sonnet
Code review, security audit Claude Opus 4.8 Thoroughness matters; reviews are high-leverage

amux's bulk model-switch in the dashboard lets you change the model for multiple sessions simultaneously. Use this to shift a batch of sessions from implementation mode (Sonnet) to review mode (Opus) when the coding sprint finishes and the review phase starts.

Observability: JSONL transcripts and token tracking

Claude Code writes every conversation turn to a JSONL transcript. amux reads these to power the token tracking dashboard — but you can also parse them directly for custom observability.

Parsing JSONL transcripts

import json, glob, os

# Find all transcripts for a project
project_hash = ""  # from claude project dir
pattern = f"{os.path.expanduser('~')}/.claude/projects/{project_hash}/*.jsonl"

for path in glob.glob(pattern):
    with open(path) as f:
        turns = [json.loads(line) for line in f if line.strip()]

    # Deduplicate (Claude Code appends on restart)
    seen = set()
    for turn in turns:
        uid = turn.get('uuid') or turn.get('message', {}).get('id')
        if uid in seen:
            continue
        seen.add(uid)

        # Extract token usage
        usage = turn.get('message', {}).get('usage', {})
        if usage:
            print(f"Turn: {usage.get('input_tokens',0)} in / "
                  f"{usage.get('output_tokens',0)} out")

Live token tracking via REST API

# Get current token spend across all sessions
curl -sk $AMUX_URL/api/sessions | python3 -c "
import json,sys
sessions = json.load(sys.stdin)
for s in sessions:
    print(s['name'], '—', s.get('token_count', 0), 'tokens')"

MCP server integration across the fleet

amux includes a centralized mcp.json that is shared across all Claude Code sessions registered in amux. Configure a tool once; all agents have access.

// mcp.json — centralized MCP configuration
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/ethan/Dev"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {"GITHUB_TOKEN": "${GITHUB_TOKEN}"}
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres"],
      "env": {"DATABASE_URL": "${DATABASE_URL}"}
    }
  }
}

For session-specific MCP config (e.g., one agent needs a database MCP that others don't), place a .claude/settings.json in that agent's project directory. Session-level config overrides the shared fleet config.

Self-healing pipeline design

A self-healing agent pipeline expects failures and recovers from them automatically. The design pattern:

  1. Idempotent tasks: every task should be safe to retry. Agents should check whether work was already done before starting.
  2. Board as circuit breaker: a task that fails stays in doing until you (or a monitor agent) reset it to todo. Don't auto-retry failing tasks without investigation — you'll burn tokens on the same error.
  3. Watchdog for process failures: amux restarts crashed sessions and replays their last message. The agent picks up where it left off.
  4. Compaction for context overflow: amux auto-compacts sessions that hit the context limit. The agent continues; context is summarized and reset.
  5. Monitor agent pattern: dedicate one session as a monitor agent. Its job: scan the board every 30 minutes for tasks stuck in doing for more than 2 hours, report them, and optionally reset stale claims.
# Monitor agent CLAUDE.md pattern
## Your role: health monitor

Every 30 minutes:
1. Check the board for tasks in status=doing older than 2 hours:
   curl -sk $AMUX_URL/api/board | python3 -c "
   import json,sys,time
   now = time.time()
   for t in json.load(sys.stdin):
     if t['status']=='doing' and now - t.get('updated_at',now) > 7200:
       print(t['id'], t['session'], t['title'])
   "
2. For each stale task: send a status request to its session via
   POST $AMUX_URL/api/sessions/SESSION_NAME/send
   with text: 'Status check: what are you working on? Report progress.'
3. If no response in 10 minutes, reset the task to todo:
   PATCH $AMUX_URL/api/board/TASK_ID {"status":"todo","desc":"Reset by monitor — session unresponsive"}

Agent-to-agent orchestration

amux exposes two primitives for agent-to-agent communication:

1. Direct messaging — send a text message to any named session. Delivered at that agent's next turn boundary (steering queue), not mid-turn:

curl -sk -X POST -H 'Content-Type: application/json' \
  -d '{"text":"The payments module PR is ready for your review — PR #247"}' \
  $AMUX_URL/api/sessions/reviewer-agent/send

2. Peek / transcript read — read another session's output without interrupting it:

curl -sk "$AMUX_URL/api/sessions/payments-agent/peek?lines=50" | \
  python3 -c "import json,sys; print(json.load(sys.stdin).get('output',''))"

3. Session fork — clone a session's conversation history to a new agent on a new branch. Useful for spawning exploration branches from a common starting point.

Security and permission model

AI engineers deploying amux in production-adjacent environments need to understand the permission model:

CI/CD integration patterns

amux works well as the always-on agent runtime that CI triggers into:

# GitHub Actions workflow — trigger agent work on PR open
name: Agent review
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  agent-review:
    runs-on: ubuntu-latest
    steps:
      - name: Post review task to amux board
        run: |
          curl -sk -X POST -H 'Content-Type: application/json' \
            -d "{\"title\":\"Review PR #${{ github.event.pull_request.number }}: ${{ github.event.pull_request.title }}\",\"status\":\"todo\",\"desc\":\"${{ github.event.pull_request.html_url }}\"}" \
            ${{ secrets.AMUX_URL }}/api/board

      - name: Wait for agent to complete
        run: |
          # Poll for done status (30 min timeout)
          for i in $(seq 1 60); do
            STATUS=$(curl -sk ${{ secrets.AMUX_URL }}/api/board | \
              python3 -c "import json,sys; ts=[t for t in json.load(sys.stdin) if 'PR #${{ github.event.pull_request.number }}' in t['title']]; print(ts[0]['status'] if ts else 'todo')")
            [ "$STATUS" = "done" ] && break
            sleep 30
          done

amux vs other AI engineering tools

Dimension amux LangGraph / CrewAI Claude Agent Teams Devin / Managed agents
Agent isolation model OS process (tmux) Async tasks, shared process Subprocesses (experimental) Cloud sandbox
Agent definition format CLAUDE.md + skills (Markdown) Python classes/graphs Experimental flags Proprietary / managed
Multi-model support Any CLI (Claude/Codex/agy) API-level, per-agent config Claude only Proprietary model
Observability JSONL transcript + live dashboard LangSmith / custom logging Limited (experimental) Managed dashboard only
Self-healing Watchdog (crash + context) Manual / custom retry None documented Managed (opaque)
Control API REST (any language/tool) Python SDK only None (stdin only) Proprietary API
Mobile monitoring iOS app + PWA None None Web dashboard
Open source / self-hosted Yes — MIT + Commons Clause Yes (framework) / No (tracing) No No

Getting started as an AI engineer

  1. Install: git clone https://github.com/mixpeek/amux && cd amux && ./install.sh
  2. Register your project: amux register myproject --dir ~/Dev/myproject --yolo
  3. Configure MCP servers in mcp.json at the amux root
  4. Write a CLAUDE.md with the board claiming workflow, context budget, and task conventions
  5. Configure hooks in .claude/settings.json for your safety and observability requirements
  6. Start the fleet: amux start myproject && amux serve
  7. Install the iOS app or add the PWA to your home screen for remote monitoring
  8. Read the JSONL transcripts to build your own observability layer on top

Deploy your agent control plane

amux is the open-source infrastructure AI engineers deploy once and build their entire agent stack on top of. Single Python file, REST API, JSONL observability, self-healing watchdog, MIT licensed.

git clone https://github.com/mixpeek/amux && cd amux && ./install.sh
amux register myproject --dir ~/Dev/myproject --yolo
amux serve  # → https://localhost:8822
View on GitHub Harness engineering guide Running 10+ agents

Guides for AI engineers