I thought multi agent orchestration meant agents should talk more — Reddit convinced me the opposite is usually better

Sarah Mitchell · May 21, 2026 · 8 min read

I used to think the “advanced” version of multi-agent orchestration was obvious: more agents, more channels, more chatter. If one GPT-5 agent is useful, then surely two GPT-5 agents arguing in Discord are better. Add Claude for review, maybe Qwen for cleanup, and suddenly you’ve got a tiny AI company living inside your laptop.

That idea is incredibly seductive when you first start building agent workflows. It also falls apart fast. The moment agents start talking freely in shared channels, the output gets harder to supervise, the conversation fills with fluff, and you spend more time figuring out what happened than benefiting from the work.

The thing that changed my mind wasn’t a polished blog post or some vendor architecture diagram. It was a small Reddit thread from someone trying to get two OpenClaw agents to collaborate in a Telegram group. The post itself wasn’t huge, but it was exactly the kind of question people ask when they’ve moved past toy demos and are trying to make agents useful.

And the best answer was not “make them talk more.” It was the opposite.

One commenter said it perfectly: have Agent 1 write a structured note with its proposal, then trigger Agent 2 to review it fresh, without inheriting the whole conversation history. That hit me immediately, because it explains why so many agent-to-agent chat setups feel smart while producing mediocre results.

When two agents share too much context, they stop critiquing each other and start harmonizing. They converge too early. They repeat assumptions. They reinforce the same mistake with extra confidence. What looks like collaboration is often just agreement arriving faster.

That’s why this advice stuck with me. It wasn’t anti-multi-agent. It was anti-group-chat.
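Here is a minimal sketch of what that handoff could look like in Python. The `ProposalNote` fields and the `build_reviewer_input` helper are names I made up for illustration; they are not part of OpenClaw or any other framework. The point is only that the reviewer's input is built from the note and nothing else.

```python
import json
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class ProposalNote:
    """Structured note Agent 1 leaves for the reviewer (field names are illustrative)."""
    task: str
    proposal: str
    assumptions: List[str]


def build_reviewer_input(note: ProposalNote) -> str:
    """Build the reviewer's entire input from the note alone.

    The transcript that produced the note is deliberately not a parameter,
    so it cannot leak into the review.
    """
    return (
        "Review the proposal below. You have not seen the discussion that "
        "produced it, so challenge any assumption that looks unsupported.\n\n"
        + json.dumps(asdict(note), indent=2)
    )


note = ProposalNote(
    task="Pick a storage format for nightly reports",
    proposal="Write one JSONL file per day",
    assumptions=["Reports stay under 10 MB", "Only one writer per day"],
)
reviewer_input = build_reviewer_input(note)
```

Whatever sends `reviewer_input` to Agent 2 is up to you. The design choice that matters is that the function signature leaves no way to pass the chat history along by accident.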

While researching OpenClaw, I realized the framework itself already nudges you in this direction. Its session model is not built around one immortal shared conversation where every agent marinates forever in the same context. Direct messages can be shared, but group chats, rooms, channels, webhooks, and cron jobs are isolated, and cron jobs start a fresh session on every run.

That detail matters more than it sounds. If your orchestration strategy depends on every agent staying inside one giant rolling transcript, OpenClaw is quietly telling you not to do that.

Even the configuration makes the design philosophy obvious. OpenClaw documents session isolation patterns like { "session": { "dmScope": "per-channel-peer" } }, and it even defaults to a daily session reset at 4:00 AM local time on the gateway host. That is not the behavior of a framework that believes endless context accumulation is healthy.

It’s the behavior of a framework that assumes boundaries are useful.
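To make the daily reset concrete, here is a rough sketch of how I picture a 4:00 AM boundary. This is my own reading of what a daily reset implies, not OpenClaw's implementation; `last_reset_boundary` and `is_stale` are hypothetical helpers, and the sketch uses naive local datetimes for simplicity.

```python
from datetime import datetime, time, timedelta

RESET_AT = time(4, 0)  # 4:00 AM local time, the default mentioned above


def last_reset_boundary(now: datetime) -> datetime:
    """Return the most recent 4:00 AM at or before `now`."""
    boundary = datetime.combine(now.date(), RESET_AT)
    if now < boundary:
        # Before 4:00 AM today, the latest boundary was yesterday's.
        boundary -= timedelta(days=1)
    return boundary


def is_stale(last_activity: datetime, now: datetime) -> bool:
    """A session counts as stale if it was last touched before the latest boundary."""
    return last_activity < last_reset_boundary(now)


morning = datetime(2026, 5, 21, 9, 0)
stale_overnight = is_stale(datetime(2026, 5, 20, 23, 0), morning)
fresh_today = is_stale(datetime(2026, 5, 21, 5, 0), morning)
```

Under this reading, a session last touched at 11 PM is treated as stale by 9 AM the next day, while one touched at 5 AM is still current.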

The deeper I looked, the more I realized the real enemy in multi-agent systems isn’t lack of sociability. It’s drift. Another OpenClaw discussion on Reddit made that painfully clear: people were describing long-running workflows that became harder to supervise over time, with multiple terminals, half-finished tasks, research threads that made sense yesterday, and outputs that felt cryptic today.

If you’ve run serious agent workflows, that probably sounds familiar. One agent is coding, another is summarizing research, another is running an automation, one task failed quietly, another is technically still active but no longer useful, and now you’re trying to reconstruct the state of the world from a messy transcript.

That is not a communication problem. It’s a state management problem.

One commenter in that thread said drift happens quickly and that markdown files alone aren’t enough: you need recursive loops that check against known-good states. I think that’s exactly right. Once you frame the problem that way, the architecture gets much simpler.

You don’t need agents socializing better. You need checkpoints, verification, bounded retries, and clear artifacts.
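A checkpoint with bounded retries does not have to be elaborate. The sketch below assumes a step's output and the known-good state can both be expressed as flat dictionaries, which is a simplification; `run_until_verified` and `CheckResult` are names I invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CheckResult:
    ok: bool
    attempts: int
    drifted_keys: List[str]


def run_until_verified(
    step: Callable[[int], Dict[str, str]],
    known_good: Dict[str, str],
    max_attempts: int = 3,
) -> CheckResult:
    """Re-run `step` until its output matches the known-good state, or give up."""
    drifted: List[str] = []
    for attempt in range(1, max_attempts + 1):
        output = step(attempt)
        # Only keys pinned in the known-good state are checked; extra keys are allowed.
        drifted = sorted(k for k, v in known_good.items() if output.get(k) != v)
        if not drifted:
            return CheckResult(True, attempt, [])
    # Retries are bounded: report what drifted instead of looping forever.
    return CheckResult(False, max_attempts, drifted)


known_good = {"schema_version": "2", "currency": "USD"}


def flaky_step(attempt: int) -> Dict[str, str]:
    # Stand-in for an agent call that gets the currency wrong on its first try.
    return {
        "schema_version": "2",
        "currency": "EUR" if attempt == 1 else "USD",
        "total": "41.50",
    }


recovered = run_until_verified(flaky_step, known_good)
stuck = run_until_verified(
    lambda attempt: {"schema_version": "1", "currency": "USD"}, known_good
)
```

Here `recovered` succeeds on the second attempt, while `stuck` stops after three tries and names `schema_version` as the key that never matched, which is exactly the kind of evidence you want when reconstructing what happened.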

That’s why I’ve become much more bullish on the supervisor agent pattern than on free-form bot banter. One agent produces an artifact. Another agent reviews it from a cleaner starting point. A third agent, if needed, checks policy, tests edge cases, or validates against a known-good state. Each step is legible. Each handoff is inspectable.

That is orchestration.

Not a room full of bots roleplaying a startup.

To be fair, real-time agent chat can work. In the same OpenClaw thread, some people mentioned Discord channels with multiple agents collaborating, and someone else described a custom web group chat where agents alternate replies or even run in a kind of chaos mode. I believe them. I can also believe it was annoying.

A different commenter said Telegram doesn’t really let bots speak directly together in a natural way, and that Discord worked but “wrecked my head” trying to set it up, so they gave up and used the structured handoff approach instead. That sentence is more useful than most AI orchestration think pieces.

Because “possible” is not the same as “good default.” A lot of these external chat setups add friction, and they also add waste. Another Reddit commenter warned that Telegram bots should only consume messages they’re explicitly tagged in, otherwise you’re just burning tokens. That’s true operationally and financially.
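The "only consume what you're tagged in" rule is easy to express as a filter. This sketch works on plain message text and does not use the Telegram Bot API; `ChatMessage` and `addressed_to` are hypothetical names, and matching on `@username` in the text is my simplifying assumption.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class ChatMessage:
    sender: str
    text: str


def addressed_to(bot_username: str, messages: List[ChatMessage]) -> List[ChatMessage]:
    """Keep only the messages that explicitly tag `bot_username`."""
    # The lookarounds stop "@reviewer_bot" from matching "@reviewer_bot2"
    # or an email-like "someone@reviewer_bot".
    tag = re.compile(
        r"(?<!\w)@" + re.escape(bot_username) + r"(?!\w)", re.IGNORECASE
    )
    return [m for m in messages if tag.search(m.text)]


messages = [
    ChatMessage("alice", "@reviewer_bot, please check the note"),
    ChatMessage("worker_bot", "thinking out loud about caching"),
    ChatMessage("bob", "ping @reviewer_bot2"),
]
kept = addressed_to("reviewer_bot", messages)
```

Only the first message survives the filter, so the reviewer's prompt never pays for the other two.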

And this is where the cost model starts to matter. If you’re running agents inside n8n, Make, Zapier, OpenClaw, or a custom workflow, shared chatter doesn’t just create supervision overhead. It can also turn into pure billing noise when you’re paying per token. Every irrelevant message becomes something you’re charged to process.

That’s one reason the artifact-first approach feels so much better in practice. It doesn’t just improve quality. It also cuts down on pointless context ingestion. And if you’re using something like Standard Compute, where the API is a drop-in OpenAI-compatible replacement with unlimited AI compute at a flat monthly price, you at least remove the constant anxiety of wondering whether your orchestration design is quietly running up a bill.

That doesn’t make bad architecture good. But it does make experimentation with reviewer loops, retries, and multi-step agents a lot more realistic. When your cost is predictable, you can design for reliability instead of designing around token fear.

Here’s how I think about the tradeoffs now.

Real-time agent chat in Discord or Telegram

Shared live conversation history
Higher supervision overhead
Faster convergence and premature agreement
More irrelevant context getting pulled into prompts
More likely to waste tokens if every message is consumed

Structured reviewer handoff

Agent 1 writes an explicit note or proposal
Agent 2 reviews it with fresh context
Better critique because the reviewer isn’t trapped in the same transcript
Easier checkpoints, auditability, and bounded revisions
Much easier to reason about in production workflows

OpenClaw internal coordination with session_send() or files

Less platform friction than Discord or Telegram
Fits deterministic routing and workspace-based workflows
Easier to inspect than free-form chat
Closer to how real teams hand off work: brief, artifact, questions, next step

That last option is the one I’d reach for first. OpenClaw’s routing and storage model already supports it well. Sessions are keyed by channel or thread context, transcripts are stored as JSON and JSONL, and session data lives in paths like ~/.openclaw/agents/<agentId>/sessions/sessions.json. Once you stop trying to force the whole system into a giant chat room, it becomes much easier to build workflows that are deterministic and debuggable.
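Because JSONL is just one JSON object per line, inspecting a transcript takes very little code. The reader below is generic: I am not assuming anything about OpenClaw's actual record schema, and the `role` and `text` fields in the sample file are invented for the demo.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, Iterator


def iter_jsonl(path: Path) -> Iterator[Dict[str, Any]]:
    """Yield each JSON object in a JSONL file, skipping blank lines."""
    with path.open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: not valid JSON") from exc
            if isinstance(record, dict):
                yield record


# Demo against a throwaway file with made-up records.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.jsonl"
    sample.write_text(
        '{"role": "worker", "text": "draft filed"}\n'
        "\n"
        '{"role": "reviewer", "text": "approved"}\n',
        encoding="utf-8",
    )
    records = list(iter_jsonl(sample))
```

Point `iter_jsonl` at a real transcript path and you can grep, diff, or summarize a session without scrolling through a chat window.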

If I were setting up a production multi-agent workflow today, I’d keep it boring on purpose. A worker agent does the first pass and writes a structured note. That note includes the goal, assumptions, proposed output, open questions, and failure risks.

Then a reviewer agent gets only the artifact and the handoff note, not the full transcript. The reviewer approves, rejects, or requests a bounded revision. If the task matters, a supervisor layer checks the result against tests, policy, or a known-good state before the workflow moves on.

The key is that every step leaves evidence. Not vibes. Not chatter. Evidence.
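The worker, reviewer, and supervisor flow described above can be sketched in a few dozen lines. Everything here is illustrative: `HandoffNote`, `Verdict`, and `run_handoff` are names I chose, the agents are plain Python callables standing in for real model calls, and the default of two revisions is an arbitrary budget.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Optional


class Verdict(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    REVISE = "revise"


@dataclass
class HandoffNote:
    goal: str
    assumptions: List[str]
    proposed_output: str
    open_questions: List[str]
    failure_risks: List[str]


@dataclass
class Review:
    verdict: Verdict
    comments: str = ""


@dataclass
class Outcome:
    accepted: bool
    note: HandoffNote
    trail: List[str] = field(default_factory=list)


def run_handoff(
    goal: str,
    worker: Callable[[str, Optional[str]], HandoffNote],
    reviewer: Callable[[HandoffNote], Review],
    supervisor_check: Callable[[HandoffNote], bool],
    max_revisions: int = 2,
) -> Outcome:
    """Worker drafts, reviewer sees only the note, supervisor gates approval."""
    trail: List[str] = []
    note = worker(goal, None)
    for round_number in range(max_revisions + 1):
        trail.append(f"draft {round_number}: {note.proposed_output}")
        review = reviewer(note)  # no transcript is passed, only the note
        trail.append(
            f"review {round_number}: {review.verdict.value} {review.comments}".strip()
        )
        if review.verdict is Verdict.APPROVE:
            passed = supervisor_check(note)
            trail.append(f"supervisor: {'pass' if passed else 'fail'}")
            return Outcome(passed, note, trail)
        if review.verdict is Verdict.REJECT:
            return Outcome(False, note, trail)
        if round_number < max_revisions:
            note = worker(goal, review.comments)
    trail.append("revision budget exhausted")
    return Outcome(False, note, trail)


def demo_worker(goal: str, feedback: Optional[str]) -> HandoffNote:
    version = "v1" if feedback is None else "v2"
    return HandoffNote(goal, ["sales data is final"], f"report {version}", [], [])


def demo_reviewer(note: HandoffNote) -> Review:
    if note.proposed_output.endswith("v2"):
        return Review(Verdict.APPROVE)
    return Review(Verdict.REVISE, "add units")


outcome = run_handoff(
    "Summarize weekly sales",
    demo_worker,
    demo_reviewer,
    supervisor_check=lambda note: "v2" in note.proposed_output,
)
```

The `trail` list is the evidence: after this run it holds the first draft, the revision request, the second draft, the approval, and the supervisor result, in order.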

There are still a few cases where agent-to-agent chat makes sense. Brainstorming is one. If you want GPT-5, Claude, and Llama to generate lots of divergent ideas quickly, a temporary shared channel can be useful because you’re optimizing for breadth, not auditability.

Cross-server or human-adjacent coordination is another. If agents live on different machines or need to interact with people in the same channel, Telegram or Discord can be practical despite the setup pain. And of course there’s demo value: a room full of agents talking looks futuristic in a way that file-based handoffs never will.

But demos are not production. A screen recording of five bots chatting is not the same thing as a workflow you can trust at 2 AM when a cron job has failed, a retry loop has fired twice, and you need to know which artifact is real.

That’s the part I think people miss when they romanticize multi-agent systems. The best multi-agent teams don’t actually act like a group chat. They act more like a newsroom.

A reporter files a draft. An editor reviews it fresh. Fact-checking happens against explicit claims. Revisions are bounded. The final version gets approved and archived. It’s less magical than autonomous bot chatter, but it’s dramatically better if your goal is reliable output.

That’s the design lesson I’d steal from these OpenClaw threads. Don’t optimize for agents that feel sociable. Optimize for legible handoffs, fresh review, and checkpoints that stop drift before it turns into lore.

Because the failure mode in multi-agent work is rarely silence. It’s two agents confidently talking each other into the same mistake.
