Claude Code Multi-Agent Orchestration System

Claude Code TeammateTool - Source Code Analysis

This is not a proposal. It documents existing but hidden functionality found in the Claude Code v2.1.19 binary, plus speculation on how it could be used.


Executive Summary

TeammateTool already exists in Claude Code. We extracted this from the compiled binary at ~/.local/share/claude/versions/2.1.19 using strings analysis. The feature is fully implemented but gated behind feature flags (I9() && qFB()).


Part 1: What We Found in the Binary

How We Found This

# Location
~/.local/share/claude/versions/2.1.19  # Mach-O 64-bit executable

# Extract strings mentioning TeammateTool
strings ~/.local/share/claude/versions/2.1.19 | grep -i "TeammateTool"

# Extract team_name references
strings ~/.local/share/claude/versions/2.1.19 | grep -i "team_name"

TeammateTool Operations (Confirmed in Source)

| Operation | Purpose |
| --- | --- |
| spawnTeam | Create a new team, become leader |
| discoverTeams | List available teams to join |
| requestJoin | Ask to join an existing team |
| approveJoin | Leader accepts a join request |
| rejectJoin | Leader declines a join request |
| write | Send message to a specific teammate |
| broadcast | Send message to all teammates |
| requestShutdown | Ask a teammate to shut down |
| approveShutdown | Accept shutdown and exit |
| rejectShutdown | Decline shutdown, keep working |
| approvePlan | Leader approves a teammate's plan |
| rejectPlan | Leader rejects a plan with feedback |
| cleanup | Remove team directories |

Error Messages (Verbatim from Binary)

"team_name is required for spawn operation. Either provide team_name in input
 or call spawnTeam first to establish team context."

"team_name is required for broadcast operation. Either provide team_name in input,
 set CLAUDE_CODE_TEAM_NAME, or create a team with spawnTeam first."

"proposed_name is required for requestJoin operation."

"does not exist. Call spawnTeam first to create the team."

Environment Variables (Confirmed)

| Variable | Purpose |
| --- | --- |
| CLAUDE_CODE_TEAM_NAME | Current team context |
| CLAUDE_CODE_AGENT_ID | Agent identifier |
| CLAUDE_CODE_AGENT_NAME | Agent display name |
| CLAUDE_CODE_AGENT_TYPE | Agent role/type |
| CLAUDE_CODE_PLAN_MODE_REQUIRED | Whether plan approval is needed |
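Assuming these variables are injected into each spawned agent's environment, a teammate could read its identity and team context at startup. A minimal sketch, where only the variable names themselves are confirmed:

```typescript
// Minimal sketch: reading agent identity from the confirmed environment variables.
// Only the variable names are confirmed; how Claude Code actually consumes them is not.

const agentContext = {
  teamName: process.env.CLAUDE_CODE_TEAM_NAME ?? null,
  agentId: process.env.CLAUDE_CODE_AGENT_ID ?? null,
  agentName: process.env.CLAUDE_CODE_AGENT_NAME ?? null,
  agentType: process.env.CLAUDE_CODE_AGENT_TYPE ?? null,
  planModeRequired: process.env.CLAUDE_CODE_PLAN_MODE_REQUIRED === "true", // assumed boolean encoding
};

if (!agentContext.teamName) {
  // Matches the binary's guidance: set CLAUDE_CODE_TEAM_NAME or call spawnTeam first.
  console.warn("No team context; spawnTeam would need to run before write/broadcast.");
}
```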

Feature Gating

isEnabled() {
  return I9() && qFB()  // Two feature flags must be true
}

Spawn Backends

| Backend | Terminal | Use Case |
| --- | --- | --- |
| iTerm2 split panes | Native macOS | Visual side-by-side agents |
| tmux windows | Cross-platform | Server/headless |
| In-process | None | Same process, fastest |
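For the tmux backend, spawning presumably means opening a new tmux window that runs a teammate with its team context exported. A rough, unverified sketch (the actual command Claude Code runs, and how it hands off environment variables, are assumptions):

```typescript
import { execFileSync } from "node:child_process";

// Unverified sketch of a tmux-backend spawn: open a new window that launches a
// teammate with its team context in the environment. The `claude <prompt>`
// invocation and the env handoff are assumptions, not extracted behavior.
function spawnTmuxTeammate(teamName: string, agentName: string, prompt: string): void {
  const cmd =
    `CLAUDE_CODE_TEAM_NAME='${teamName}' ` +
    `CLAUDE_CODE_AGENT_NAME='${agentName}' ` +
    `claude '${prompt}'`; // naive quoting, fine for a sketch
  execFileSync("tmux", ["new-window", "-n", agentName, cmd]);
}
```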

File Structure

~/.claude/
├── teams/
│   └── {team-name}/
│       ├── config.json          # Team metadata, members
│       └── messages/            # Inter-agent mailbox
│           └── {session-id}/
└── tasks/
    └── {team-name}/             # Team-scoped tasks
        ├── 1.json
        └── ...
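The strings output only tells us config.json holds "team metadata, members", so the shape below is pure speculation about what such a file might contain:

```typescript
// Purely speculative shape for ~/.claude/teams/{team-name}/config.json.
// Only the file's existence and "team metadata, members" are confirmed.

interface TeamMember {
  agentId: string;    // would map to CLAUDE_CODE_AGENT_ID
  agentName: string;  // would map to CLAUDE_CODE_AGENT_NAME
  agentType: string;  // e.g. "leader" | "worker" — assumed
  sessionId: string;  // messages/ is keyed by {session-id}
}

interface TeamConfig {
  teamName: string;
  createdAt: string;  // assumed ISO timestamp
  leader: string;     // assumed: agentId of the team leader
  members: TeamMember[];
}
```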

Part 2: Speculative Use Cases

Everything below is speculation based on how the API could be used once enabled.

Use Case 1: The Code Review Swarm

Scenario: You open a PR and want thorough review from multiple perspectives.

You: "Review PR #1588 with a full team"

Claude (Leader):
  └── spawnTeam("pr-review-1588")
  └── spawn("security-sentinel", prompt="Review for vulnerabilities")
  └── spawn("performance-oracle", prompt="Check for N+1 queries, memory leaks")
  └── spawn("rails-expert", prompt="Check Rails conventions")
  └── spawn("test-coverage", prompt="Verify test coverage is adequate")

  [All agents work in parallel, each in their own iTerm2 pane]

  Leader polls for completion, aggregates findings:
  └── broadcast("Wrap up, send your findings")
  └── [Collects responses via inbox]
  └── requestShutdown("security-sentinel")
  └── requestShutdown("performance-oracle")
  └── ...
  └── cleanup()

  Leader: "Here's the consolidated review with 3 critical, 5 moderate findings..."

What you'd see: 5 terminal panes, each showing a different agent working. The leader coordinates and synthesizes.


Use Case 2: The Feature Factory

Scenario: Build a complete feature with specialized agents for each layer.

You: "Build user authentication with OAuth"

Claude (Leader):
  └── spawnTeam("auth-feature")

  Phase 1 - Planning:
  └── spawn("architect", prompt="Design the OAuth flow", plan_mode_required=true)
  └── [architect creates plan, sends plan_approval_request]
  └── approvePlan("architect", request_id="...")

  Phase 2 - Implementation (parallel):
  └── spawn("backend-dev", prompt="Implement OAuth controller and models")
  └── spawn("frontend-dev", prompt="Build login UI components")
  └── spawn("test-writer", prompt="Write integration tests", blockedBy=["backend-dev"])

  Phase 3 - Integration:
  └── write("backend-dev", "Frontend is using /auth/callback endpoint")
  └── write("frontend-dev", "Backend expects redirect_uri param")

  Phase 4 - Verification:
  └── spawn("qa-agent", prompt="Run full test suite and verify flow")
  └── broadcast("QA found issues in session handling, please fix")

  Phase 5 - Shutdown:
  └── requestShutdown("backend-dev")
  └── [backend-dev]: approveShutdown()  // Done with work
  └── requestShutdown("frontend-dev")
  └── [frontend-dev]: rejectShutdown(reason="Still fixing CSS")  // Not done
  └── [Leader waits, retries later]

The magic: Agents communicate, block on dependencies, and the leader orchestrates without micromanaging.


Use Case 3: The Bug Hunt Squad

Scenario: A production bug needs investigation from multiple angles.

You: "Users report checkout fails intermittently"

Claude (Leader):
  └── spawnTeam("bug-hunt-checkout")

  Investigation (parallel):
  └── spawn("log-analyst", prompt="Search AppSignal for checkout errors")
  └── spawn("code-archaeologist", prompt="git log -p on checkout paths")
  └── spawn("reproducer", prompt="Try to reproduce in test environment")
  └── spawn("db-detective", prompt="Check for data anomalies in orders table")

  [Agents work independently, report findings to leader]

  log-analyst → write("team-lead", "Found timeout errors correlating with 3rd party API")
  code-archaeologist → write("team-lead", "Recent change to retry logic looks suspicious")
  reproducer → write("team-lead", "Reproduced! Happens when API returns 503")

  Leader synthesizes:
  └── "Root cause: retry logic doesn't handle 503 correctly.
       code-archaeologist, please prepare a fix."
  └── write("code-archaeologist", "Implement exponential backoff for 503 responses")

  [Fix implemented, verified, PR created]
  └── broadcast("Bug fixed, shutting down")
  └── cleanup()

Use Case 4: The Self-Organizing Refactor

Scenario: Large refactoring with automatic work distribution.

You: "Refactor all service objects to use the new BaseService pattern"

Claude (Leader):
  └── spawnTeam("service-refactor")

  Discovery:
  └── spawn("scout", prompt="Find all service objects that need refactoring")
  └── [scout returns list of 47 services]

  Work Distribution:
  └── Creates 47 tasks with TaskCreate
  └── spawn("worker-1", prompt="Refactor services, claim tasks from list")
  └── spawn("worker-2", prompt="Refactor services, claim tasks from list")
  └── spawn("worker-3", prompt="Refactor services, claim tasks from list")

  [Workers autonomously claim tasks via TaskUpdate]
  worker-1: TaskUpdate(taskId="12", status="in_progress", owner="worker-1")
  worker-2: TaskUpdate(taskId="7", status="in_progress", owner="worker-2")

  [If worker-1 crashes, heartbeat timeout releases its task]
  [worker-3 claims the abandoned task]

  Verification:
  └── spawn("verifier", prompt="Run tests after each refactored service")
  └── [verifier monitors completed tasks, runs tests]

  [All 47 tasks complete]
  └── broadcast("All services refactored, final test run passing")
  └── cleanup()

Key insight: Workers self-organize around a shared task queue. No central assignment needed.
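One way a file-based queue under ~/.claude/tasks/{team-name}/ could support this kind of lock-free claiming is exclusive lock-file creation. The mechanism is an assumption, not something visible in the binary:

```typescript
import { openSync, closeSync, writeFileSync } from "node:fs";
import { join } from "node:path";

// Assumed claiming mechanism for a shared file-based task queue:
// creating <taskId>.claim with the 'wx' flag fails if another worker got there
// first, which makes the claim atomic on a local filesystem.
function tryClaimTask(tasksDir: string, taskId: string, workerId: string): boolean {
  const claimPath = join(tasksDir, `${taskId}.claim`);
  try {
    const fd = openSync(claimPath, "wx"); // exclusive create: only one winner
    writeFileSync(fd, JSON.stringify({ owner: workerId, claimedAt: Date.now() }));
    closeSync(fd);
    return true;  // this worker now owns the task
  } catch {
    return false; // someone else claimed it first
  }
}
```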


Use Case 5: The Research Council

Scenario: Evaluate multiple technical approaches before committing.

You: "Should we use Redis or PostgreSQL for our job queue?"

Claude (Leader):
  └── spawnTeam("tech-evaluation")

  └── spawn("redis-advocate", prompt="Make the case FOR Redis. Research benchmarks, patterns.")
  └── spawn("postgres-advocate", prompt="Make the case FOR PostgreSQL. Research benchmarks, patterns.")
  └── spawn("devil-advocate", prompt="Find problems with BOTH approaches in our context.")
  └── spawn("cost-analyst", prompt="Compare operational costs, hosting, maintenance.")

  [Each agent researches independently]

  Debate Phase:
  └── broadcast("Present your findings. Respond to each other's points.")

  redis-advocate → broadcast("Redis is 10x faster for queue operations")
  postgres-advocate → broadcast("But we already run Postgres, no new infrastructure")
  devil-advocate → broadcast("Redis advocate ignores connection pool limits")
  cost-analyst → broadcast("Redis adds $200/mo, Postgres is free")

  Leader synthesizes:
  └── "Recommendation: Use PostgreSQL with SKIP LOCKED pattern.
       Redis performance benefits don't justify operational complexity
       for our 10k jobs/day scale."

  └── cleanup()

Use Case 6: The Deployment Guardian

Scenario: Automated pre-deployment verification with multiple checkpoints.

You: "Deploy to production with full verification"

Claude (Leader):
  └── spawnTeam("deploy-2026-01-23")

  Pre-flight (parallel, all must pass):
  └── spawn("test-runner", prompt="Run full test suite")
  └── spawn("security-scan", prompt="Run Brakeman and bundler-audit")
  └── spawn("migration-check", prompt="Verify migrations are safe and reversible")
  └── spawn("perf-baseline", prompt="Capture current performance metrics")

  [All agents must approveShutdown before proceeding]

  Gate Check:
  └── IF any agent rejectShutdown with failures → abort deployment
  └── ELSE proceed

  Deploy:
  └── spawn("deployer", prompt="Run cap production deploy")

  Post-deploy (parallel):
  └── spawn("smoke-tester", prompt="Hit critical endpoints, verify responses")
  └── spawn("perf-compare", prompt="Compare metrics to baseline")
  └── spawn("log-watcher", prompt="Monitor for error spikes for 5 minutes")

  [If any post-deploy check fails]
  └── broadcast("ROLLBACK REQUIRED")
  └── spawn("rollback-agent", prompt="Execute rollback procedure")

  Success:
  └── "Deployment complete. All checks passed."
  └── cleanup()

Use Case 7: The Living Documentation Team

Scenario: Keep documentation in sync with code changes automatically.

You: "Update all docs affected by the API changes in this PR"

Claude (Leader):
  └── spawnTeam("docs-sync")

  Analysis:
  └── spawn("change-detector", prompt="Identify all API changes in PR #1590")
  └── [Returns: 3 new endpoints, 2 modified, 1 deprecated]

  Documentation (parallel):
  └── spawn("api-docs", prompt="Update OpenAPI spec for changed endpoints")
  └── spawn("readme-updater", prompt="Update README examples")
  └── spawn("changelog-writer", prompt="Add changelog entry")
  └── spawn("migration-guide", prompt="Write migration guide for deprecated endpoint")

  Review:
  └── spawn("docs-reviewer", prompt="Check all doc changes for accuracy and style")
  └── [reviewer sends feedback via write() to specific agents]

  └── cleanup()

Use Case 8: The Infinite Context Window

Scenario: Work on a massive codebase that exceeds context limits.

You: "Understand this entire 500-file codebase and answer questions"

Claude (Leader):
  └── spawnTeam("codebase-brain")

  Specialists (each handles a domain):
  └── spawn("models-expert", prompt="Become expert on app/models/")
  └── spawn("controllers-expert", prompt="Become expert on app/controllers/")
  └── spawn("services-expert", prompt="Become expert on app/services/")
  └── spawn("jobs-expert", prompt="Become expert on app/jobs/")
  └── spawn("tests-expert", prompt="Become expert on test/")

  [Each agent reads and indexes their domain]

  Query Routing:
  You: "How does user authentication work?"

  Leader:
  └── broadcast("Who knows about authentication?")
  └── controllers-expert: "I handle SessionsController"
  └── models-expert: "I handle User model with has_secure_password"
  └── services-expert: "I handle AuthenticationService"

  Leader:
  └── write("controllers-expert", "Explain the login flow")
  └── write("models-expert", "Explain the User auth methods")
  └── write("services-expert", "Explain AuthenticationService")
  └── [Synthesizes responses]

  [Team persists across questions - no re-reading needed]

The breakthrough: Each agent maintains context for their domain. Combined, they "know" the entire codebase.


Part 3: Predicted Interaction Patterns

The Leader Pattern

Leader creates team → Leader spawns workers → Workers report to leader → Leader synthesizes

Most common. One orchestrator, multiple specialists.

The Swarm Pattern

Leader creates team + tasks → Workers self-assign from task queue → Leader monitors

For embarrassingly parallel work. Workers are interchangeable.

The Pipeline Pattern

Agent A (blockedBy: []) → Agent B (blockedBy: [A]) → Agent C (blockedBy: [B])

Sequential processing with handoffs. Each agent waits for predecessor.
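In terms of the blockedBy field used in the examples above, a pipeline is just a readiness check: an agent's task can start only once everything it is blocked by has completed. A small sketch of that logic (the task shape mirrors the examples, not the real TaskCreate/TaskUpdate schema):

```typescript
// Sketch of pipeline scheduling driven by blockedBy dependencies.

interface PipelineTask {
  id: string;
  blockedBy: string[]; // ids of tasks that must finish first
  status: "pending" | "in_progress" | "completed";
}

function readyTasks(tasks: PipelineTask[]): PipelineTask[] {
  const done = new Set(tasks.filter(t => t.status === "completed").map(t => t.id));
  return tasks.filter(
    t => t.status === "pending" && t.blockedBy.every(dep => done.has(dep)),
  );
}

// Example: C waits for B, which waits for A — only A is ready at the start.
const pipeline: PipelineTask[] = [
  { id: "A", blockedBy: [], status: "pending" },
  { id: "B", blockedBy: ["A"], status: "pending" },
  { id: "C", blockedBy: ["B"], status: "pending" },
];
console.log(readyTasks(pipeline).map(t => t.id)); // ["A"]
```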

The Council Pattern

Multiple agents with same task → Each proposes solution → Leader picks best

For decisions where you want diverse perspectives.

The Watchdog Pattern

Worker agent does task → Watcher agent monitors → Watcher can trigger rollback

For critical operations needing safety checks.


Part 4: What Could Go Wrong (And How It's Handled)

| Failure Mode | How the System Handles It |
| --- | --- |
| Agent crashes mid-task | Heartbeat timeout (5 min) releases the task |
| Leader crashes | Workers complete current work, then idle |
| Infinite loop in an agent | requestShutdown → timeout → force kill |
| Deadlocked dependencies | Cycle detection at task creation |
| Agent refuses shutdown | Timeout → forced termination |
| Resource exhaustion | Max agents per team limit |
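The heartbeat recovery in the first row could be as simple as checking the age of a per-worker heartbeat file. The 5-minute window comes from the table above; the file layout and logic are assumptions:

```typescript
import { statSync } from "node:fs";

// Assumed recovery check: if a worker's heartbeat file has not been touched
// within the timeout, its claimed task is treated as abandoned and released
// back to the queue. Only the 5-minute figure comes from the analysis above.
const HEARTBEAT_TIMEOUT_MS = 5 * 60 * 1000;

function isTaskAbandoned(heartbeatPath: string, now: number = Date.now()): boolean {
  try {
    const lastBeat = statSync(heartbeatPath).mtimeMs; // worker touches this file periodically
    return now - lastBeat > HEARTBEAT_TIMEOUT_MS;
  } catch {
    return true; // no heartbeat file at all: treat as abandoned
  }
}
```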

Part 5: Verification Commands

Confirm this exists on your system:

# Check Claude Code version
claude --version

# Find TeammateTool references
strings ~/.local/share/claude/versions/$(claude --version | cut -d' ' -f1) \
  | grep "TeammateTool" | head -5

# Find all operations
strings ~/.local/share/claude/versions/$(claude --version | cut -d' ' -f1) \
  | grep -E "spawnTeam|discoverTeams|requestJoin|approveJoin" | head -20

# Find environment variables
strings ~/.local/share/claude/versions/$(claude --version | cut -d' ' -f1) \
  | grep "CLAUDE_CODE_TEAM" | head -10

Conclusion

The future of Claude Code is multi-agent. The infrastructure exists:

  • 13 TeammateTool operations
  • File-based coordination
  • Three spawn backends
  • Inter-agent messaging
  • Plan approval workflows
  • Graceful shutdown protocol

It's waiting behind feature flags. When enabled, we'll see:

  • Code review swarms
  • Feature development teams
  • Self-organizing refactors
  • Research councils
  • Deployment guardians
  • Distributed codebase understanding

The primitives are there. The creativity is up to us.


Analysis: 2026-01-23 · Claude Code: v2.1.19 · Binary: ~/.local/share/claude/versions/2.1.19

@ruvnet commented Jan 25, 2026

Anthropic keeps ripping off Claude Flow... Here's my recent analysis.

A detailed analysis reveals striking architectural similarities between Claude Flow V3's swarm system and Claude Code's TeammateTool. The terminology differs, but the core concepts, data structures, and workflows are nearly identical.

Similarity score: 92% overlap

| Dimension | Match |
| --- | --- |
| Core concepts | 95% |
| Data structures | 90% |
| Workflow patterns | 93% |
| Terminology | 70% (different words, same meaning) |

See complete report.

Claude Flow v3

@delorenj

I wouldn't say they're ripping it off. I just think you stumbled upon the correct solution first. I wouldn't say that you invented Claude Flow any more than Einstein invented e=mc^2. More like, you discovered the natural progression way before this multi-billion dollar company and all of their engineers did. Whether you created Claude Flow or not, we would have ended up here, because it's just that obvious to me that this is how agentic workflows should work.

@dgtise25

@delorenj Attribution would have gone a long way here. Recognise a pioneer in this space when you see one; thousands of others have.
