speedy_devv · koen_salo

Agent Swarm Orchestration

Four infrastructure layers that stop agent swarms from double-claiming tasks, drifting on field names, and collapsing under merge chaos.


Published Apr 22, 2026 · 12 min read · Real Builds hub

Running multiple AI agents in parallel without falling apart is harder than it looks.

Most swarms fail the same way. Agents double-claim the same task. They invent different field names for the same data. Merges turn into chaos. One agent loops forever and nobody notices. The failures are consistent because the causes are consistent: missing infrastructure.

Four layers fix all of that. This post walks through each one.

Why most swarms break

The single-agent model is easy to reason about. One agent reads a task, builds it, and either finishes or gets stuck. You see what it did. You fix what it missed.

Add a second agent and the problems multiply. Both agents can see the same task queue. Both can grab the same task at the same time. One of them does work the other already started. You now have two half-finished versions of the same feature and no way to know which one to keep.

Add a third agent and field names start diverging. Agent A calls it userId. Agent B calls it user_id. Agent C writes uid. Three agents, three conventions, three branches that won't merge cleanly.

This is not a model quality problem. Claude Code agents are good at writing code. The problem is coordination infrastructure. Without it, even well-prompted agents produce broken swarms.

The four failure modes appear in order:

| Failure | What happens | When it appears |
| --- | --- | --- |
| Double-claiming | Two agents grab the same task | As soon as 2+ agents run |
| Field name drift | Agents invent different names for shared data | First cross-agent feature |
| Merge chaos | Branches conflict because agents wrote to the same files | At merge time |
| Silent looping | One agent repeats the same failed step indefinitely | Long runs |

Fix these four and a swarm becomes reliable. Skip any one and it breaks.

The four layers

Every working swarm shares the same architecture. The names vary. The shape does not.

| Layer | Name | Job |
| --- | --- | --- |
| 01 | Task Graph | Atomic task claims via database |
| 02 | Process Shell | Each agent in its own worktree |
| 03 | Contracts First | Shared interfaces injected before coding starts |
| 04 | Merge Queue | Serialized merges with tiered conflict resolution |

These are not optional. Remove any one and a specific failure mode returns. The task graph stops double-claiming. The process shell stops file conflicts. Contracts stop field drift. The merge queue stops bad code landing on main.

Layer 1: the task graph

The most common fix people try is a markdown plan file. Agents read it, pick a task, update it. In practice this breaks immediately. Two agents read the file at the same time. Both see the same unclaimed task. Both write status: in-progress in parallel. The file has a race condition baked in.

The fix is a database with atomic transactions.

A task graph is a table with one row per task. Each row has a status column: pending, claimed, done, or failed. Agents claim tasks with a SQL transaction that checks and updates in one atomic step:

UPDATE tasks
SET status = 'claimed', agent_id = $1, claimed_at = NOW()
WHERE id = $2 AND status = 'pending'
RETURNING id;

If two agents run this query at the same time with the same task ID, the database serializes them. One gets the row back. One gets nothing. The agent that gets nothing moves on to the next unclaimed task. No race condition. No duplicate work.

The task graph also tracks dependencies. Task B can only be claimed after Task A reaches done. This keeps agents from trying to build a payment form before the payment table exists.
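A dependency-aware claim can reuse the same atomic update. Here is a minimal sketch in Python with SQLite, assuming the schema shown later in the post (`status`, `agent_id`, and a `depends_on` column holding a JSON array of task IDs); `claim_next_ready` is a hypothetical helper name, not part of any framework:

```python
import json
import sqlite3

def claim_next_ready(conn: sqlite3.Connection, agent_id: str):
    """Claim the oldest pending task whose dependencies are all done.

    Returns the claimed task id, or None if nothing is ready.
    """
    rows = conn.execute(
        "SELECT id, depends_on FROM tasks WHERE status = 'pending' ORDER BY id"
    ).fetchall()
    for task_id, depends_on in rows:
        deps = json.loads(depends_on or "[]")
        if deps:
            placeholders = ",".join("?" * len(deps))
            done = conn.execute(
                f"SELECT COUNT(*) FROM tasks WHERE status = 'done' AND id IN ({placeholders})",
                deps,
            ).fetchone()[0]
            if done < len(deps):
                continue  # a dependency has not finished yet; skip this task
        # Atomic claim: the WHERE clause re-checks status, so if a racing
        # agent claimed this row first, rowcount is 0 and we move on.
        cur = conn.execute(
            "UPDATE tasks SET status = 'claimed', agent_id = ? "
            "WHERE id = ? AND status = 'pending'",
            (agent_id, task_id),
        )
        conn.commit()
        if cur.rowcount == 1:
            return task_id
    return None
```

The dependency check and the claim are separate statements here, which is fine: even if the check goes stale, the claim's own `WHERE status = 'pending'` guard keeps it atomic.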

Three columns do most of the work:

| Column | Type | Purpose |
| --- | --- | --- |
| status | enum | pending / claimed / done / failed |
| agent_id | text | Which agent holds the claim |
| depends_on | int[] | Task IDs that must complete first |

A SQLite file on disk is enough for single-machine swarms. Supabase or Postgres works for anything distributed. The database is not the complex part. The transaction pattern is.

Layer 2: process isolation

Agents sharing a working directory fight over the same files. Two agents editing the same file at the same time produce conflicts at best and corrupted output at worst. Git's index can only track one active operation at a time: when two agents both run git add in the same repo simultaneously, one of them fails with an index.lock error.

Git worktrees solve this completely.

A worktree is a separate checkout of the same repository at a different path on disk. Each checkout has its own working directory, its own index, and its own HEAD. The agents share the underlying object store but nothing else.

You create one worktree per agent at the start of a swarm run:

git worktree add ../agent-a-worktree feature/auth
git worktree add ../agent-b-worktree feature/payments
git worktree add ../agent-c-worktree feature/email

Agent A works in agent-a-worktree/. Agent B works in agent-b-worktree/. They never touch each other's directories. No index locks. No file conflicts during the build phase.

The worktree for each agent is pointed at its own branch. When Agent A is done with feature/auth, that branch merges back through the merge queue (Layer 4). The worktree is then cleaned up or reused for the next task.

What each agent gets:

| Resource | Shared | Per-agent |
| --- | --- | --- |
| Git object store | Yes | No |
| Working directory | No | Yes |
| Index | No | Yes |
| HEAD pointer | No | Yes |
| Branch | No | Yes |

This is the layer that makes true parallelism possible. Agents cannot accidentally overwrite each other's work because they are never writing to the same location.
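The setup and teardown can be scripted by the orchestrator. A minimal sketch, assuming one branch per agent; `worktree_setup_commands` and `worktree_cleanup_commands` are hypothetical helpers that only emit the shell commands, so the orchestrator can run and log each one:

```python
def worktree_setup_commands(agents: dict[str, str]) -> list[str]:
    """Emit the git commands that give each agent its own worktree.

    `agents` maps an agent name to the branch it will build on.
    """
    commands = []
    for name, branch in agents.items():
        # -b creates the branch if it does not exist yet
        commands.append(f"git worktree add -b {branch} ../{name}-worktree")
    return commands

def worktree_cleanup_commands(agents: dict[str, str]) -> list[str]:
    """Remove each agent's worktree after its branch has merged."""
    removes = [f"git worktree remove ../{name}-worktree" for name in agents]
    # prune drops stale administrative entries left by crashed agents
    return removes + ["git worktree prune"]
```

Running cleanup at the end of every swarm run, including failed ones, is what keeps stale worktrees from accumulating.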

Layer 3: contracts first

Field name drift is invisible until merge time. Agent A builds an API that returns { userId: "abc" }. Agent B builds a frontend that reads data.user_id. Both work in isolation. At merge time, the frontend reads undefined and the team spends two hours tracing why.

The fix is shared type contracts injected into every agent prompt before coding starts.

A contract is a TypeScript interface (or JSON schema, or plain type definition) that all agents agree to use. You write the contracts before any agent starts:

// contracts/user.ts
export interface User {
  userId: string;
  email: string;
  createdAt: string;
}

export interface ApiResponse<T> {
  data: T;
  error: string | null;
}

Every agent gets these contracts in its system prompt. The orchestrator injects the full contracts file at the top of each agent's context. Agents are instructed to use the defined types and not invent new field names.
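The injection step can be plain string concatenation. A sketch, assuming a single contracts file on disk; `build_system_prompt` and the section headings are illustrative, not a fixed format:

```python
from pathlib import Path

AGENT_INSTRUCTIONS = (
    "Use the field and type names defined in the contracts above exactly. "
    "Do not invent new names for shared data."
)

def build_system_prompt(contracts_path: str, task_brief: str) -> str:
    """Prepend the shared contracts file to an agent's system prompt."""
    contracts = Path(contracts_path).read_text()
    return (
        "## Shared contracts (authoritative, do not deviate)\n"
        f"```typescript\n{contracts}\n```\n\n"
        f"{AGENT_INSTRUCTIONS}\n\n"
        f"## Your task\n{task_brief}"
    )
```

Putting the contracts at the top of the prompt, before the task brief, is deliberate: the shared types should frame the task, not trail it.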

The result is measurable. Without contracts, a six-agent swarm building a SaaS backend produced three variants of the user ID field across six branches. Three of the six branches failed to merge cleanly. The integration quality score (measured by counting type errors across the merged codebase) was 28.

With contracts injected at the start, a four-agent swarm building the same feature used userId everywhere. Zero branches failed at merge. Quality score reached 68.

What changes with contracts:

| Check | Without contracts | With contracts |
| --- | --- | --- |
| User field name | userId / user_id / uid | userId everywhere |
| Branch merge failures | 3 of 6 fail | 0 of 4 fail |
| Quality score | 28 | 68 |
| Merge time | Unpredictable | FIFO, tiered |

The contracts file does not have to be large. Five to ten type definitions covering the shared data models are enough for most features. Add to it as the codebase grows.

Layer 4: the merge queue

Parallel branches are useful until they need to land. Without a queue, the team hits git merge on two branches at the same time, gets conflicts on both, and loses track of which resolution to keep.

A FIFO merge queue serializes landings and handles conflicts in tiers.

Agents push their completed branches to the queue. The queue processes one branch at a time, oldest first. For each branch, it tries four resolution steps:

Tier 1: git merge --no-ff (clean merge, no conflicts)
       ↓ fails
Tier 2: deterministic auto-resolve (whitespace, import order, lock files)
       ↓ fails
Tier 3: LLM resolver per conflicted file (Claude reads both versions, picks one)
       ↓ fails
Tier 4: human review (branch parked, notification sent)

Most merges land at Tier 1 or Tier 2. Tier 3 handles the cases where two agents both modified the same function with different changes. Tier 4 is rare and reserved for conflicts where neither automatic approach is safe.

The key constraint: the LLM resolver in Tier 3 is bounded. It resolves one file at a time. It must produce valid code or it rejects the merge entirely. Prose output is not accepted. A merge that cannot be resolved automatically reaches Tier 4 and parks there until a human reviews it.

This design keeps the queue predictable. Branches land in order. Every landing is logged with the tier it required. Over time, a pattern of Tier 3 conflicts in the same files tells you where the contracts are incomplete.
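The queue loop itself is small. A sketch of the FIFO-plus-tiers shape, with the tiers passed in as callables; in a real run they would wrap `git merge --no-ff`, the deterministic resolvers, and the bounded LLM resolver:

```python
from collections import deque

def run_merge_queue(branches, tiers):
    """Land branches FIFO, trying each resolution tier in order.

    `tiers` is an ordered list of (name, attempt) pairs; each attempt
    takes a branch name and returns True if the merge landed.  A branch
    no tier can land is parked for human review (Tier 4).
    """
    queue = deque(branches)  # oldest first
    log, parked = [], []
    while queue:
        branch = queue.popleft()
        for name, attempt in tiers:
            if attempt(branch):
                log.append((branch, name))  # record which tier landed it
                break
        else:
            parked.append(branch)  # Tier 4: park and notify a human
    return log, parked
```

The per-landing log is what surfaces the pattern mentioned above: repeated Tier 3 entries for the same files are a signal, not just noise.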

The cost math

Swarms cost more than sequential runs. That is true and worth knowing before you run one.

A single Claude Code agent completing a task uses a baseline token count. Add a second agent and you roughly double the tokens (two agents, two context windows). Add parallel coordination overhead and the multiplier rises further.

For complex multi-module tasks:

| Scenario | Tokens | Cost | Quality gain |
| --- | --- | --- | --- |
| Sequential (1 agent) | 1x | $9 | Baseline |
| Swarm (20 agents, 6 hrs) | 6.7x | $60 | +28% quality |
| Time saved | | +$51 | 2 hours faster |

The 3.4x token multiplier for complex tasks produces 28% better output quality, measured by type error count and test pass rate on the merged codebase. For simple tasks the multiplier is higher (3.9x) but the quality gain is larger too (+32%).

The rule is straightforward:

Use a swarm when you have three or more independent modules that can be built in parallel. Auth, payments, and email are a good example. They share types but do not share implementation files. Three agents building in parallel with proper contracts and worktrees finish faster and produce cleaner code than one agent doing all three sequentially.

Do not use a swarm when the work fits in one context window. A single agent with full context is cheaper, simpler to debug, and produces equivalent output for tasks that are inherently sequential.

How to build your own version

You do not need a complex stack to run this. The minimum shape works on one machine.

What you need:

  • A SQLite file (or Postgres if you want multiple machines)
  • git worktree (built into Git, no install needed)
  • A contracts file with your shared types
  • A merge script that implements the four tiers

Start with the task graph. Create a SQLite table with the columns above. Write a small script that lets agents claim tasks atomically. Test it with two agents racing to claim the same task. Only one should succeed.

CREATE TABLE tasks (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  title TEXT NOT NULL,
  status TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending','claimed','done','failed')),
  agent_id TEXT,
  claimed_at DATETIME,
  depends_on TEXT -- JSON array of task IDs
);

Add worktrees next. Write a setup script that creates one worktree per agent before the swarm starts. The script should also clean up worktrees after agents finish. Stale worktrees accumulate fast in long swarm runs.

Write your contracts file before creating any agent prompt. Put it in a shared location that every agent can access. Make it a non-negotiable part of the agent's system prompt.

Build the merge queue last. Start with Tier 1 and Tier 4 only. A clean merge lands immediately. A conflict parks for human review. Add Tier 2 and Tier 3 once you have a sense of what kinds of conflicts come up most in your codebase.

One rule per layer:

  • Task graph: always use transactions. Never a file.
  • Process shell: one worktree per agent. Never a shared working directory.
  • Contracts: inject at the top of every agent prompt. Non-negotiable.
  • Merge queue: never merge two branches simultaneously. Always serialize.

Where else this pattern applies

The four-layer architecture is not specific to feature builds.

Security audits benefit from the same shape. Multiple agents scan different parts of the codebase in parallel, each in its own worktree, each writing findings to a shared task graph. The merge queue combines their reports without duplication.

Content pipelines use it too. Multiple agents draft different sections of a document in parallel. Contracts define the shared outline structure. The merge queue combines sections in the right order.

Performance profiling runs several agents in parallel across different subsystems. Contracts define the shared benchmark format so all reports are comparable. The queue serializes which recommendations land.

The specific tools change. SQLite becomes Postgres. Worktrees become Docker containers. TypeScript contracts become JSON schemas. The four layers stay the same.

Task graph stops double-claiming. Process shell stops file conflicts. Contracts stop drift. Merge queue stops bad code reaching main. That is the whole model.

More in Real Builds

  • KI räumt sich selbst auf
    Three overnight Claude Code workflows where the AI cleans up its own mess: slop-cleaner removes dead code, /heal repairs broken branches, /drift detects pattern drift.
  • GAN Loop
    One agent generates, one tears it apart, and they loop until the score stops rising. GAN Loop implementation with agent definitions and rubric templates.
  • KI-E-Mail-Sequenzen
    One Claude Code command creates 17 lifecycle emails across 6 sequences, wires up Inngest behavioral triggers, and delivers a branched email funnel ready to deploy.
  • KI-Sicherheits-Agents
    Two Claude Code commands launch eight security sub-agents: phase 1 scans SaaS logic for RLS gaps and auth mistakes, phase 2 attempts real attacks to confirm them.
  • Autonomer KI-Schwarm
    An autonomous Claude Code swarm: a 30-minute trigger, an orchestrator, specialist sub-agents in worktrees, and five gates that ship overnight features safely.
  • Distribution Agents
    Four Claude Code agents run on a schedule, write SEO posts, read PostHog, build carousels, and scout Reddit. Copy the definitions and deploy.

