If you landed here after searching for how to orchestrate multiple coding agents with OpenClaw, this article gives you the wiring diagram, actual config files, and the pitfalls the community already tripped over so you don’t have to. The audience is senior engineers who have shipped single-agent bots and are now hitting the ceiling on throughput and code quality.
Why bother with multi-agent coding pipelines?
A single OpenClaw agent is fine for small tasks—formatting a file, answering a question. The moment you ask it to refactor a 12-service monorepo, cognitive overload kicks in. The community's answer was to split the job into specialized agents that talk to each other:
- Scout reads and summarizes the code base.
- Optimizer decides on a better architecture.
- Implementer writes or patches code.
This mirrors human workflows—intern fetches context, senior dev proposes design, regular dev implements—and makes prompts smaller, faster, and cheaper to run.
The "Pi" workflow: Haiku → GPT → Codex
In ClawCloud’s GitHub Discussions (#2134) user pi-shuffle published the so-called Pi workflow. It strings together three models, each selected for its strength:
- Haiku (from the Claude family) produces concise, lossless summaries.
- GPT-4o (128k context) re-architects modules while preserving constraints.
- OpenAI codex-002 cranks out syntactically valid code quickly.
The novelty is not the models but the glue code: how context is passed, how conflicts are resolved, and how the pipeline recovers from partial failure. We’ll rebuild that in OpenClaw 3.6.2 (Node 22.4).
Prerequisites and environment setup
You need Node 22+ and the latest gateway UI. If you prefer cloud, skip to the ClawCloud section later.
```shell
# macOS or Linux
brew install node@22                       # or your package manager
npm create openclaw@latest my-pi-pipeline
cd my-pi-pipeline
npm i openclaw@3.6.2
```
The scaffolder creates:
- gateway.config.mjs – UI and auth
- daemon.config.mjs – agent runtime
- agents/ – where we'll drop three .mjs files
- memory/ – SQLite by default
Defining agent roles in OpenClaw
1. Haiku scout
```javascript
// agents/scout.mjs
export const agent = {
  name: "scout",
  model: "anthropic/claude-haiku",
  systemPrompt: `You are Scout. Your job: walk the repo and output JSON with summaries per directory. Max 800 tokens.`,
  tools: [
    "shell",     // uses ripgrep
    "file.read"
  ],
  memory: {
    namespace: "scout",
    ttl: 86400
  }
};
```
2. GPT optimizer
```javascript
// agents/optimizer.mjs
export const agent = {
  name: "optimizer",
  model: "openai/gpt-4o-128k",
  systemPrompt: `You are Optimizer. Given JSON summaries from Scout, output a migration plan with module boundaries. No code yet.`,
  tools: ["browser.read"],
  deps: ["scout"],   // wait for Scout
  memory: { namespace: "optimizer" }
};
```
3. Codex implementer
```javascript
// agents/implementer.mjs
export const agent = {
  name: "implementer",
  model: "openai/codex-002",
  systemPrompt: `Implementer turns Optimizer plans into PRs. Follow project lint rules. Patch only touched files.`,
  tools: ["git", "shell"],
  deps: ["optimizer"],
  memory: { namespace: "implementer" }
};
```
Key points:
- deps enforces pipeline order without external orchestration.
- Each agent writes to its own memory namespace to avoid collisions. We'll share context explicitly later.
- Tools are regular OpenClaw wrappers—no secret sauce.
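To build intuition for how a `deps` list can be turned into a start order without any external orchestrator, here is a small sketch using a depth-first topological sort. This is our own illustrative helper, not OpenClaw's actual scheduler:

```javascript
// Hypothetical sketch: derive a start order from each agent's `deps`
// via depth-first topological sort. Not OpenClaw's internal scheduler.
function executionOrder(agents) {
  const byName = new Map(agents.map((a) => [a.name, a]));
  const order = [];
  const state = new Map(); // name -> "visiting" | "done"

  function visit(name) {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") {
      throw new Error(`dependency cycle at ${name}`);
    }
    state.set(name, "visiting");
    for (const dep of byName.get(name)?.deps ?? []) visit(dep);
    state.set(name, "done");
    order.push(name); // dependencies are pushed before dependents
  }

  for (const a of agents) visit(a.name);
  return order;
}

const order = executionOrder([
  { name: "implementer", deps: ["optimizer"] },
  { name: "optimizer", deps: ["scout"] },
  { name: "scout" },
]);
// Scout starts first, Implementer last, regardless of declaration order.
```

The cycle check matters: if you accidentally make Scout depend on Implementer, you want a loud error at startup rather than a deadlocked pipeline.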
Context sharing and memory passing
By default, agents can’t see each other’s memory. That prevents prompt bloat but makes hand-off tricky. In 3.6.x you get two knobs:
- memory.query(namespace, pattern) inside a tool call
- event payloads via postMessage between agents
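To see what namespace-scoped queries buy you, here is a toy in-memory store with query(namespace, pattern) semantics. The MemoryStore class is purely illustrative; OpenClaw's real store is SQLite-backed:

```javascript
// Toy namespaced memory store mirroring the query(namespace, pattern)
// shape described above. Illustrative only, not OpenClaw's implementation.
class MemoryStore {
  constructor() {
    this.entries = []; // { namespace, key, value }
  }
  write(namespace, key, value) {
    this.entries.push({ namespace, key, value });
  }
  query(namespace, pattern) {
    const re = new RegExp(pattern);
    return this.entries
      .filter((e) => e.namespace === namespace && re.test(e.key))
      .map((e) => e.value);
  }
}

const mem = new MemoryStore();
mem.write("scout", "summary/src", { files: 120 });
mem.write("scout", "summary/test", { files: 40 });
mem.write("optimizer", "plan/v1", { modules: 3 });

// Cross-namespace reads are explicit: the optimizer names the scout
// namespace instead of seeing its memory implicitly.
const scoutSummaries = mem.query("scout", "^summary/");
// Returns both scout summaries; the optimizer's own entries stay invisible.
```

The explicit namespace argument is the whole point: nothing leaks between agents unless a tool call asks for it by name.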
The community prefers events because they stream and back-pressure is easier. Below is the minimal glue code you need in agents/scout.mjs after grabbing summaries:
```javascript
// agents/scout.mjs (after summaries are built)
import { postMessage } from "openclaw/runtime";
// ...
const summary = await buildSummary(rootDir);
await postMessage("optimizer", { type: "scout/sum", summary });
```
On the optimizer side:
```javascript
// agents/optimizer.mjs
import { onMessage, postMessage } from "openclaw/runtime";
// ...
onMessage("scout/sum", async ({ summary }) => {
  const plan = await makePlan(summary);
  await postMessage("implementer", { type: "opt/plan", plan });
});
```
This pattern is basically a Kafka topic with zero external dependencies—fast enough for a single repo, not for 500 parallel builds. If you need more, see the blackboard pattern below.
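The back-pressure claim is worth making concrete. The bus behaves roughly like a bounded in-process queue: when it fills up, producers wait instead of piling up unbounded messages. The BoundedQueue below is our own sketch of that behavior, not OpenClaw runtime code:

```javascript
// Bounded async queue: producers await push() when the queue is full.
// This is the back-pressure behavior an in-process event bus gives you.
// Illustrative sketch, not the OpenClaw runtime's implementation.
class BoundedQueue {
  constructor(limit) {
    this.limit = limit;
    this.items = [];
    this.waiters = []; // producers blocked on a full queue
  }
  async push(item) {
    while (this.items.length >= this.limit) {
      await new Promise((resolve) => this.waiters.push(resolve));
    }
    this.items.push(item);
  }
  shift() {
    const item = this.items.shift();
    const waiter = this.waiters.shift();
    if (waiter) waiter(); // wake one blocked producer
    return item;
  }
}

const bus = new BoundedQueue(2);
// push() resolves immediately while the queue has room, so no await needed here.
bus.push({ type: "scout/sum", summary: "a" });
bus.push({ type: "scout/sum", summary: "b" });
// A third push would now suspend until the consumer shifts an item.
const msg = bus.shift();
```

When the Implementer is the bottleneck, this is exactly what you want: Scout slows down instead of flooding memory with unread summaries.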
Coordination patterns compared
Three recipes circulate in Slack #multi-agent:
Queue (the simple pipeline you just saw)
- Pros: trivial to reason about, low memory use
- Cons: head-of-line blocking, hard to fan-in results
Blackboard
- Central store (Postgres, Redis, or S3) where each agent posts partial results
- Agents poll or subscribe to changes
- Great for N-way merges—e.g., two scouts covering different repos
- Extra infra and race conditions if you don’t lock correctly
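A toy blackboard makes those race conditions visible. The sketch below uses a plain Map with compare-and-swap semantics, which is the same idea Redis WATCH/MULTI or a Postgres row version gives you in a real deployment; the Blackboard class and field names are illustrative:

```javascript
// Toy blackboard with optimistic locking: a write only lands if the
// caller saw the latest version. Illustrative; a real store would be
// Postgres, Redis, or S3 as described above.
class Blackboard {
  constructor() {
    this.slots = new Map(); // key -> { version, value }
  }
  read(key) {
    return this.slots.get(key) ?? { version: 0, value: null };
  }
  write(key, value, expectedVersion) {
    const current = this.read(key);
    if (current.version !== expectedVersion) {
      return false; // someone else wrote first: re-read, merge, retry
    }
    this.slots.set(key, { version: expectedVersion + 1, value });
    return true;
  }
}

const board = new Blackboard();
const seenA = board.read("summaries"); // scout A reads version 0
const seenB = board.read("summaries"); // scout B also reads version 0

const okA = board.write("summaries", ["repo-a"], seenA.version); // lands
const okB = board.write("summaries", ["repo-b"], seenB.version); // stale, rejected
```

Without the version check, scout B would silently overwrite scout A's result, which is exactly the lost-update bug that shows up when two agents post partial results to the same key.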
Supervisor pattern
- Add a fourth agent (“Foreman”) that schedules tasks to workers
- Dynamic pooling: spin up 10 Implementers for large refactors, kill when idle
- Works well on ClawCloud because you pay per minute per container
- Twice the complexity, state drift is common
For most teams, start with Queue, benchmark, then graduate to Blackboard if you see agents sitting idle.
Putting it all together in daemon.config.mjs
Open the daemon config and register your agents:
```javascript
// daemon.config.mjs
export default {
  agents: [
    import("./agents/scout.mjs"),
    import("./agents/optimizer.mjs"),
    import("./agents/implementer.mjs")
  ],
  runtime: {
    parallelism: 3,
    retries: 2,
    metrics: true
  }
};
```
Then run:
```shell
npx openclaw daemon --config daemon.config.mjs
```
Open the gateway at http://localhost:3000 to watch tokens, message bus stats, and any red flags.
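What retries: 2 means in practice: a failed agent step is re-run up to two extra times before the pipeline gives up. A stand-alone sketch of that policy, written as our own synchronous helper for clarity (the daemon of course awaits async steps):

```javascript
// Re-run a flaky step up to `retries` extra times, mirroring the
// retries: 2 setting in daemon.config.mjs. Hypothetical helper,
// simplified to synchronous calls for illustration.
function withRetries(fn, retries) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return fn(attempt);
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw lastError; // all attempts exhausted
}

// A step that fails twice, then succeeds on the third attempt.
let calls = 0;
const result = withRetries(() => {
  calls++;
  if (calls < 3) throw new Error("transient model error");
  return "ok";
}, 2);
```

With retries: 2, a step that fails three times in a row surfaces the last error to the gateway instead of retrying forever.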
Deploying the same pipeline on ClawCloud
Self-hosting is fun until Friday 5 p.m. when Haiku fails an SSL handshake. ClawCloud runs your agents in Firecracker VMs with auto-retries and secrets management.
- Log in at cloud.claw.co
- New Project → “Pi-workflow”
- Upload the same repo or connect GitHub → pick the node:22-slim image
- Set env vars: OPENAI_KEY, ANTHROPIC_KEY
- Scale to 3 instances (one per agent) or enable the Foreman pattern with 1 + workers
The first cold start usually takes 45-60 s; after that, hand-offs average 300 ms for JSON payloads <5 KB.
Observability and failure modes
Even vanilla pipelines fail. What we’ve seen:
- Token explosions. Codex loops on ESLint warnings. Mitigation: add max_output_tokens in the agent config.
- Schema drift. Optimizer adds fields, Implementer crashes. Use JSON Schema validation in the message handler.
- Race conditions in the Blackboard pattern. Redis WATCH/MULTI helps.
- Model hiccups. Haiku occasionally returns Markdown instead of JSON. Retry with a regex filter; if that fails, escalate to a human fallback.
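The schema-drift mitigation can be as simple as validating every payload at the boundary before handing it to the next agent. Below is a hand-rolled structural check so the example stays dependency-free; a real pipeline would more likely use a JSON Schema validator such as Ajv. The plan shape and field names are illustrative, not a documented OpenClaw format:

```javascript
// Minimal structural check on the opt/plan payload before the
// Implementer touches it. Field names are illustrative.
function validatePlan(payload) {
  if (typeof payload !== "object" || payload === null) {
    return ["payload is not an object"];
  }
  const errors = [];
  if (!Array.isArray(payload.modules)) {
    errors.push("modules must be an array");
  } else {
    payload.modules.forEach((m, i) => {
      if (typeof m.name !== "string") errors.push(`modules[${i}].name missing`);
    });
  }
  return errors; // empty array means the payload is usable
}

// A drifted payload (Optimizer renamed `modules` to `moduleList`)
// is rejected up front instead of crashing the Implementer mid-run.
const drifted = validatePlan({ moduleList: [{ name: "auth" }] });
const valid = validatePlan({ modules: [{ name: "auth" }, { name: "billing" }] });
```

Run the check inside the onMessage handler and route failures back to the Optimizer with the error list; that turns a crash into a retryable event.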
Enable metrics: true in daemon.config.mjs and point Prometheus at /metrics. Grafana dashboard JSON is in the examples repo.
Limitations and open problems
Two areas remain rough:
- Global reasoning. If an architecture change crosses microservice boundaries, the Optimizer needs a holistic view. There's active work on a "scene graph" memory plug-in.
- Conflict resolution. Multiple implementers may touch the same file. The community is prototyping an LLM-powered git merge --LLM.
Both issues are tracked on GitHub (#2401, #2410). Contributions welcome.
What to try next
Spin up this three-agent pipeline on a throwaway repo; the examples repo (git clone https://github.com/openclaw/examples.git) has one ready. Tweak the system prompts, measure latency, and decide whether you need the blackboard or supervisor upgrade. Post results in the #multi-agent Slack channel; the maintainer reads every message.