Multi-Agent Team

When a feature spans multiple layers — frontend, backend, database, infrastructure — a single Claude instance working sequentially can be slow. The multi-agent team build spawns multiple Claude instances that work in parallel, coordinating through a contract-first protocol to avoid integration conflicts.

Use multi-agent builds when:

  • The feature spans 2+ independent components (e.g., frontend and backend)
  • Components have well-defined interfaces between them
  • Parallel work would save significant time
  • The feature is complex enough to justify the coordination overhead

Use the standard loop instead when:

  • The task is focused on a single component or layer
  • Changes are tightly coupled and hard to parallelize
  • The task is a bug fix, refactor, or small feature
  • You want lower cost (multi-agent builds run multiple Claude instances simultaneously)

Before using /mx:build-with-agent-team, make sure the prerequisites below are in place.

Agent Teams require Claude Code v2.1.32+. Check with:

claude --version

Enable the experimental Agent Teams flag in your Claude Code settings:

~/.claude/settings.json
{
  "env": {
    "CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS": "1"
  }
}

Without this flag, the command will stop with an error.

Agent Teams support two display modes:

| Mode | How it works | Requirements |
| --- | --- | --- |
| In-process (default) | All teammates run in your main terminal. Use Shift+Down to cycle. | Any terminal |
| Split panes | Each teammate gets its own pane, giving full visibility of all output at once. | tmux or iTerm2 |

The default teammateMode setting is "auto", which uses split panes if you are already inside tmux and in-process mode otherwise.
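The "auto" rule can be pictured with a minimal sketch. It assumes that being "inside tmux" is detected through the TMUX environment variable, which tmux sets in its sessions. The function name and the "split-panes" return label are hypothetical, and iTerm2 detection is left out.

```python
import os


def resolve_teammate_mode(setting="auto", env=None):
    """Return the effective display mode for a teammateMode setting.

    Hypothetical helper: "split-panes" is an illustrative label, not a
    documented configuration value.
    """
    env = os.environ if env is None else env
    if setting != "auto":
        return setting  # an explicit override always wins
    # tmux exports TMUX inside its sessions, so a non-empty value means
    # the current shell is already running under tmux.
    return "split-panes" if env.get("TMUX") else "in-process"
```

Passing the environment as a parameter keeps the rule easy to check without starting tmux.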

To install tmux (recommended for split panes):

# macOS
brew install tmux
# Ubuntu/Debian
sudo apt install tmux
# Fedora
sudo dnf install tmux

To use iTerm2 split panes: install the it2 CLI, then enable the Python API in iTerm2 → Settings → General → Magic → Enable Python API.

To override the display mode, set teammateMode in ~/.claude.json:

~/.claude.json
{
  "teammateMode": "in-process"
}

Or pass per-session:

claude --teammate-mode in-process

Keyboard shortcuts:

| Shortcut | Action |
| --- | --- |
| Shift+Down | Cycle through teammates (wraps back to lead after last) |
| Enter | View a teammate’s session (in-process mode) |
| Escape | Interrupt a teammate’s current turn |
| Ctrl+T | Toggle the shared task list |

The multi-agent team build follows a structured protocol: contract definition, parallel build, pre-completion contract verification, and cross-review.

An agent team consists of:

| Component | Role |
| --- | --- |
| Team lead | Your main Claude Code session: creates the team, spawns teammates, coordinates work |
| Teammates | Separate Claude Code instances that each work on assigned tasks |
| Task list | Shared list of work items that teammates claim and complete |
| Mailbox | Messaging system for direct communication between agents |

Teams and tasks are stored locally, and teammates start with a defined context and permission set:

  • Team config: ~/.claude/teams/{team-name}/config.json
  • Task list: ~/.claude/tasks/{team-name}/
  • Context: Teammates load CLAUDE.md, MCP servers, and skills automatically — but they do not inherit the lead’s conversation history. Include all task-specific context in the spawn prompt.
  • Permissions: Teammates start with the lead’s permission settings. You can change individual teammate modes after spawning, but not at spawn time.

Contract definition is the most important phase. Agents building in parallel will diverge on interfaces unless they agree on contracts first.

  1. Upstream agents spawn first (e.g., database agent, then backend agent)
  2. Each upstream agent’s first task is to define and publish their contract — exact API endpoints, request/response shapes, status codes, error formats
  3. The lead agent verifies each contract for:
    • Exact URLs, including trailing slashes
    • Explicit response shapes (not vague descriptions like “returns data”)
    • Error response formats
    • SSE event formats (if applicable)
  4. The lead forwards verified contracts to downstream agents
  5. Downstream agents spawn only after receiving their upstream contract

This staggered approach prevents the most common multi-agent failure: interface divergence.
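The lead's contract check can be approximated mechanically. The sketch below is a hypothetical linter, not part of mx-workflow: the Endpoint structure and its field names are assumptions made for illustration. It flags the weaknesses listed above: inexact URLs, vague response shapes, missing error formats, and missing status codes.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """One entry in a published contract (hypothetical structure)."""
    method: str
    url: str
    response_shape: object            # dict of field -> type, or a vague string
    error_shape: object = None        # dict describing the error body
    status_codes: list = field(default_factory=list)


def lint_contract(endpoints):
    """Return a list of human-readable problems; an empty list means the contract passes."""
    problems = []
    for ep in endpoints:
        label = f"{ep.method} {ep.url}"
        if not ep.url.startswith("/"):
            problems.append(f"{label}: URL must be an exact path starting with '/'")
        if not isinstance(ep.response_shape, dict) or not ep.response_shape:
            problems.append(f"{label}: response shape must list explicit fields")
        if not isinstance(ep.error_shape, dict) or not ep.error_shape:
            problems.append(f"{label}: error response format is missing")
        if not ep.status_codes:
            problems.append(f"{label}: no status codes listed")
    return problems
```

A contract entry such as `Endpoint("GET", "api/sessions", "returns data")` fails all four checks, while an entry with an explicit path, field map, error body, and status codes passes.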

After contracts are verified and distributed, all agents build in parallel. Each agent owns specific files and directories and must not touch other agents’ code. If an agent needs to deviate from the contract, it notifies the lead first.
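File ownership is easy to check after the fact. The following is a minimal sketch, assuming a hypothetical ownership map of agent names to directory prefixes and a list of files each agent changed. The directory names are examples only.

```python
from pathlib import PurePosixPath

# Hypothetical ownership map: agent name -> directories it owns.
OWNERSHIP = {
    "frontend": ["web/"],
    "backend": ["api/"],
    "database": ["db/migrations/"],
}


def owner_of(path, ownership=OWNERSHIP):
    """Return the agent that owns `path`, or None if nobody does."""
    p = PurePosixPath(path)
    for agent, prefixes in ownership.items():
        for prefix in prefixes:
            # Component-wise comparison, so "website/x" is not under "web/".
            if p.is_relative_to(PurePosixPath(prefix)):
                return agent
    return None


def ownership_violations(agent, changed_files, ownership=OWNERSHIP):
    """Return the changed files that `agent` does not own (unowned files included)."""
    return [f for f in changed_files if owner_of(f, ownership) != agent]
```

Treating unowned files as violations is a design choice in this sketch: it forces every shared file to be assigned to someone before agents start editing.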

Before declaring completion, the lead runs a contract verification:

  • “Backend: what exact curl commands test each endpoint?”
  • “Frontend: what exact fetch URLs are you calling?”

The lead compares responses and flags any mismatches before integration. This catches contract drift that happened during implementation.
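The comparison itself amounts to a set difference over (method, URL) pairs. This sketch assumes the answers from each agent have already been reduced to such pairs. The function names and the result keys are hypothetical.

```python
def normalize(method, url):
    """Upper-case the method and drop any query string.

    Trailing slashes are kept significant, because "/api/sessions" and
    "/api/sessions/" can be different routes.
    """
    return method.upper(), url.split("?", 1)[0]


def contract_drift(backend_routes, frontend_calls):
    """Compare the routes the backend serves with the URLs the frontend calls."""
    served = {normalize(m, u) for m, u in backend_routes}
    called = {normalize(m, u) for m, u in frontend_calls}
    return {
        "called_but_not_served": sorted(called - served),
        "served_but_not_called": sorted(served - called),
    }
```

Anything in `called_but_not_served` is a likely integration bug. Entries in `served_but_not_called` may only be unused endpoints, but a near-identical pair across the two lists usually points to a trailing-slash mismatch.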

Each agent reviews another agent’s integration points. This catches issues at the boundaries between components — the most common source of integration bugs.

Teammates can communicate using two patterns:

| Pattern | When to use |
| --- | --- |
| message | Send to one specific teammate, e.g., “tell the backend agent to add a new endpoint” |
| broadcast | Send to ALL teammates simultaneously. Use sparingly, because token costs scale with team size |

In split-pane mode, click into any pane to interact with that teammate directly. In in-process mode, use Shift+Down to cycle to a teammate and type to send them a message.

The shared task list coordinates work across the team:

  • Task states: pending → in progress → completed
  • Dependencies: Tasks can depend on other tasks. A pending task with unresolved dependencies cannot be claimed until those dependencies are completed. When a dependency completes, blocked tasks unblock automatically.
  • Assignment: The lead can assign tasks explicitly, or teammates can self-claim the next unassigned, unblocked task after finishing their current work.
  • Race prevention: Task claiming uses file locking — no two teammates can claim the same task simultaneously.
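The task-list rules above can be modelled in a few lines. This is an in-memory sketch only: the class and field names are hypothetical, and a `threading.Lock` stands in for the file lock that the real task list uses.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Task:
    id: str
    depends_on: list = field(default_factory=list)
    status: str = "pending"   # pending -> in_progress -> completed
    owner: str = None


class TaskList:
    def __init__(self, tasks):
        self.tasks = {t.id: t for t in tasks}
        self._lock = threading.Lock()  # stand-in for the file lock

    def _unblocked(self, task):
        """A task is claimable only when every dependency is completed."""
        return all(self.tasks[d].status == "completed" for d in task.depends_on)

    def claim_next(self, teammate):
        """Claim the first pending, unassigned, unblocked task, or return None."""
        with self._lock:  # only one teammate can claim at a time
            for task in self.tasks.values():
                if task.status == "pending" and task.owner is None and self._unblocked(task):
                    task.status, task.owner = "in_progress", teammate
                    return task
        return None

    def complete(self, task_id):
        """Mark a task completed; dependents become claimable on the next scan."""
        with self._lock:
            self.tasks[task_id].status = "completed"
```

Because blocking is computed at claim time, no separate unblock step is needed: once a dependency is completed, the next `claim_next` call sees the dependent task as available.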

For complex or risky components, require teammates to plan before implementing:

Spawn an architect teammate to refactor the authentication module.
Require plan approval before they make any changes.

When a teammate finishes planning, it sends a plan approval request to the lead. The lead reviews the plan and either:

  • Approves — the teammate exits plan mode and begins implementation
  • Rejects with feedback — the teammate stays in plan mode, revises, and resubmits

Influencing the lead’s judgment: Give it criteria in your prompt — e.g., “only approve plans that include test coverage” or “reject plans that modify the database schema.”

Use hooks to enforce rules automatically when teammates finish work or tasks change state:

| Hook | Fires when | Use for |
| --- | --- | --- |
| TeammateIdle | A teammate is about to go idle | Exit code 2 → sends feedback and keeps the teammate working |
| TaskCreated | A task is being created | Exit code 2 → prevents creation with feedback |
| TaskCompleted | A task is being marked complete | Exit code 2 → prevents completion with feedback (e.g., “run tests first”) |

Configure these in your project’s hooks config. Examples:

  • “All tasks must have passing tests before completion” — use TaskCompleted to verify
  • “No teammate goes idle without reporting status” — use TeammateIdle to enforce
  • “Tasks must include acceptance criteria” — use TaskCreated to validate
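As an illustration of the last example, a TaskCreated hook could be a small script like the sketch below. It assumes the hook receives a JSON payload on stdin and that the payload carries the task text in a `task_description` field. That field name is an assumption, so check the hooks reference for the actual schema.

```python
import json
import sys


def check_task(payload):
    """Return (exit_code, feedback) for a TaskCreated payload.

    Assumes the description lives under "task_description" (hypothetical field name).
    """
    description = payload.get("task_description", "")
    if "acceptance criteria" not in description.lower():
        return 2, "Task rejected: add an 'Acceptance criteria' section to the description."
    return 0, ""


def run(stdin, stderr):
    """Read the payload, write any feedback to stderr, and return the exit code."""
    code, feedback = check_task(json.load(stdin))
    if feedback:
        print(feedback, file=stderr)
    return code

# When used as a hook command, the script would end with:
#   sys.exit(run(sys.stdin, sys.stderr))
```

Exit code 2 is the blocking signal described in the table above. The feedback is written to stderr on the assumption that this is the stream passed back to the agent.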

Every team build includes a dedicated QA teammate that verifies completed work before tasks can close. The QA agent uses the mx-quality-keeper persona and never writes production code. It does not count toward the implementation team size — if you specify 3 agents, the team will be 3 builders + 1 QA.

  1. Implementation agents complete tasks using a test-first cycle: write the test → confirm it fails → implement the feature → confirm it passes
  2. When a task is marked complete, the QA agent verifies it:
    • Runs lint, type-check, and tests on changed files
    • Checks for new type/lint suppressions
    • Verifies contract conformance (do implementations match the agreed interfaces?)
  3. If verification passes, the task stays completed and dependent tasks unblock
  4. If verification fails, the QA agent messages the owning agent with specific failure details and the task reverts to in-progress

The QA agent enforces a structured retry protocol:

| Attempt | QA action |
| --- | --- |
| 1st failure | Messages owning agent with failure details and required fix |
| 2nd failure | Messages with additional context about why the previous fix did not resolve the issue |
| 3rd failure | Escalates to the lead with a summary of all attempts |

After 3 failed attempts, the lead decides how to proceed — reassign the task, pair with the agent, or descope.
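The retry protocol is a small decision function over the failure count. The sketch below uses hypothetical action labels. The real QA agent follows this protocol through its prompt, not through code like this.

```python
def qa_action(failure_count):
    """Map the number of failed verifications for a task to the QA agent's next step."""
    if failure_count < 1:
        raise ValueError("failure_count starts at 1")
    if failure_count == 1:
        return ("message_owner", "failure details and required fix")
    if failure_count == 2:
        return ("message_owner", "why the previous fix did not resolve the issue")
    # Third failure and beyond: stop retrying and hand the decision to the lead.
    return ("escalate_to_lead", "summary of all attempts")
```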

The QA agent checks:

| Check | What it verifies |
| --- | --- |
| Quality commands | Lint, type-check, tests pass on changed files |
| Suppressions | No new @ts-ignore, eslint-disable, etc. without justification |
| Contract conformance | Implementation matches the agreed API contract (endpoints, shapes, status codes) |
| Spec conformance | Every must-have from the PRD was actually built, wired up, and works as specified, verified per user role (Admin, Customer, etc.). Reports PASS/FAIL/MISS per role and requirement ID. |
| Test coverage | Every implemented feature has a corresponding test written test-first (e2e for user flows, unit for business logic). If an agent skipped the test-first cycle, QA fails the task. |
| Integration points | Cross-boundary calls match (frontend URLs match backend endpoints) |
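The suppression check can be sketched as a scan over the added lines of a unified diff. This is an illustrative approximation, not the QA agent's actual implementation. The pattern list goes beyond the two markers named above (`@ts-expect-error`, `# type: ignore`, and `# noqa` are assumptions), and the sketch does not judge whether a suppression is justified.

```python
import re

# @ts-ignore and eslint-disable come from the table above; the rest are assumed extras.
SUPPRESSION = re.compile(r"@ts-ignore|@ts-expect-error|eslint-disable|#\s*type:\s*ignore|#\s*noqa")


def new_suppressions(unified_diff):
    """Return (file, line_text) for every added line that introduces a suppression."""
    hits, current = [], None
    for line in unified_diff.splitlines():
        if line.startswith("+++ "):
            # File header of the new version, e.g. "+++ b/web/src/api.ts".
            current = line[4:].removeprefix("b/")
        elif line.startswith("+") and SUPPRESSION.search(line):
            hits.append((current, line[1:].strip()))
    return hits
```

Only lines starting with `+` are inspected, so suppressions that already existed, or that the change removes, are not reported.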

Use the TaskCompleted hook to enforce QA verification before tasks can close:

.claude/settings.json
{
  "hooks": {
    "TaskCompleted": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "echo 'Task completion requires QA verification. Message the QA agent to verify this task before marking complete.' >&2 && exit 2"
          }
        ]
      }
    ]
  }
}

This blocks task completion with a feedback message, prompting the teammate to request QA verification first. The QA agent then runs its checks and either confirms the completion or routes a failure.
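A hook that always exits with code 2 rejects every completion attempt, so in practice the gate needs some way to recognise tasks that QA has already signed off. One possible approach is sketched below. It assumes a hypothetical convention in which the QA agent writes a marker file per verified task under `.qa/verified/`, and that the hook payload exposes a `task_id` field. Neither is confirmed by the source.

```python
from pathlib import Path


def marker_exists(task_id, verified_dir=".qa/verified"):
    """Hypothetical convention: QA writes <task_id>.ok after a passing verification."""
    return (Path(verified_dir) / f"{task_id}.ok").is_file()


def qa_gate(payload, is_verified=marker_exists):
    """Return (exit_code, feedback) for a TaskCompleted payload."""
    task_id = str(payload.get("task_id", ""))
    if task_id and is_verified(task_id):
        return 0, ""  # QA signed off, so allow completion
    return 2, (
        f"Task {task_id or '<unknown>'} has no QA sign-off. "
        "Ask the QA agent to verify it before marking it complete."
    )
```

A wrapper script would parse the JSON payload from stdin, print the feedback to stderr, and exit with the returned code. Injecting `is_verified` keeps the decision testable without touching the filesystem.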

/mx:prd "real-time collaboration feature with presence indicators"

The PRD defines the feature’s requirements, scope, and acceptance criteria. This gives the agent team a clear specification to build from.

Either run /mx:plan to generate a plan from the PRD, or write a plan document manually. The plan should include:

  • What components are needed
  • How they interact
  • What the dependencies are between components

Then start the team build with the plan:

/mx:build-with-agent-team <path-to-plan> [num-agents]

The command reads the plan and determines the team structure:

| Team size | When to use |
| --- | --- |
| 2 agents | Clear frontend/backend split |
| 3 agents | Full-stack (frontend, backend, database/infra) |
| 4 agents | Additional concerns (testing, DevOps) |
| 5+ agents | Large systems with many independent modules |

You can specify the team size explicitly, or let the command decide based on the plan. You can also specify a model per teammate (e.g., “Use Sonnet for each teammate”).

The command automatically spawns a QA teammate in addition to the implementation agents. You do not need to account for the QA agent in your team size number.

Sizing guidance: Start with 3–5 teammates. Token costs scale linearly with team size, so only scale up when the work genuinely benefits from parallelism. Three focused teammates often outperform five scattered ones.

After spawning agents, the lead enters Delegate Mode. The lead should not implement code — only coordinate:

  • Verify and forward contracts
  • Resolve conflicts between agents
  • Answer questions from agents
  • Run end-to-end validation after all agents complete

Use Shift+Down to cycle through teammates. In split-pane mode, click into a pane to interact directly.

After all agents report done and the QA agent confirms all tasks have passed verification, the lead runs end-to-end validation:

  1. QA summary clean? — Review the QA agent’s final report — any unresolved failures or escalations?
  2. Can the system start? — Start all services, check for startup errors
  3. Does the happy path work? — Walk through the primary user flow
  4. Do integrations connect? — Verify data flows from frontend through backend to database
  5. Are edge cases handled? — Empty states, error states, loading states

If validation fails, the lead identifies which agent’s domain contains the bug, re-spawns that agent with the specific issue, and re-runs validation after the fix.

Graceful shutdown:

  1. Ask each teammate to shut down: “Ask the [role] teammate to shut down”
  2. The teammate can approve (exits gracefully) or reject with an explanation
  3. Teammates finish their current request/tool call before stopping — this can take time

Team cleanup — after all teammates have shut down:

Clean up the team

Each agent receives a prompt that defines their role, ownership, and constraints:

You are the [ROLE] agent for this build.
## Your Ownership
- You own: [directories/files]
- Do NOT touch: [other agents' files]
## What You're Building
[Relevant section from plan]
## Before You Build (REQUIRED)
- Your FIRST deliverable is your [API contract / schema / interface]
- Send it to the lead via message BEFORE writing implementation code
- Wait for lead to confirm before proceeding
## The Contract You Must Conform To
[Upstream agent's verified contract]
## Cross-Cutting Concerns You Own
[Integration behaviors this agent is responsible for]
## Project Conventions
- Run quality checks after changes (see CLAUDE.md)
- Commit format: <type>(<scope>)[TICKET] <description>
## Before Reporting Done
Run these validations and fix any failures:
1. [specific validation command]
2. [specific validation command]
Do NOT report done until all validations pass.

Common pitfalls:

| Pitfall | Problem | Prevention |
| --- | --- | --- |
| Fully parallel spawn | All agents start at once, interfaces diverge | Stagger spawns: upstream first, contracts verified before downstream |
| Lead over-implementing | Lead starts writing code instead of coordinating | Stay in Delegate Mode and coordinate only |
| Implicit contracts | “Returns sessions” (what does that mean?) | Require exact JSON shapes, URLs, and status codes |
| File conflicts | Two agents edit the same file | Assign clear, non-overlapping file ownership |
| Orphaned cross-cutting concerns | Streaming, URL conventions, error shapes have no owner | Explicitly assign ownership of every cross-cutting concern |
| Lead not waiting | Lead starts implementing instead of delegating | Tell it “Wait for your teammates to complete their tasks” |
| Task status lag | Teammates fail to mark tasks complete, blocking dependents | Check if work is done and update manually or nudge the teammate |

Troubleshooting:

| Problem | Solution |
| --- | --- |
| Teammates not appearing | In in-process mode, press Shift+Down; they may be running but not visible. For split panes, run which tmux to confirm tmux is installed, or check that iTerm2’s it2 CLI is installed. Check that the task is complex enough to warrant a team. |
| Too many permission prompts | Pre-approve common operations in your permission settings before spawning teammates. |
| Teammates stopping on errors | Check their output via Shift+Down or click their pane. Give additional instructions directly, or spawn a replacement. |
| Lead shuts down early | Tell the lead to keep going and wait for teammates to finish. |
| Orphaned tmux sessions | Run tmux ls, then tmux kill-session -t <session-name> to clean up. |
| /resume or /rewind after crash | These do NOT restore in-process teammates. Tell the lead to spawn new teammates. |

Current known limitations of agent teams:

  • No session resumption: /resume and /rewind do not restore in-process teammates
  • One team per session: Clean up the current team before starting a new one
  • No nested teams: Teammates cannot spawn their own teams — only the lead can
  • Lead is fixed: Cannot promote a teammate to lead or transfer leadership
  • Permissions set at spawn: All teammates inherit lead’s mode; change individually after spawning
  • Split panes restricted: Not supported in VS Code terminal, Windows Terminal, or Ghostty
  • Task status can lag: Teammates sometimes fail to mark tasks completed, blocking dependents
  • Shutdown can be slow: Teammates finish their current request before stopping

mx-workflow uses two forms of multi-agent execution. Here is when to use each:

| | Sub-Agents (Task tool) | Agent Teams |
| --- | --- | --- |
| Context | Own context window; results return to caller | Own context window; fully independent |
| Communication | Reports to main agent only | Teammates message each other directly |
| Coordination | Main agent manages all | Shared task list, self-coordination |
| Visibility | Results summarized | Each visible in own pane or via Shift+Down |
| Best for | Focused tasks where only the result matters | Complex work requiring discussion and collaboration |
| Cost | Lower (results summarized back) | Higher (separate Claude instances) |

Use Sub-Agents when work is isolated and agents do not need to coordinate with each other — for example, running code review and test analysis in parallel during /mx:implement.

Use Agent Teams when components depend on each other and agents need to share contracts, resolve conflicts, and verify integration points.