Telegram Agent Orchestration: Why We Split the Bot and Brain

Published: 2026-03-10 • Updated: 2026-03-10 • Snap Engineering

This is the real architecture we run in production, not the simplified version. The whole point was not “two agents for aesthetics.” The point was operational separation: put a cheaper/less-smart day-to-day chat system in front, and only escalate to the higher-capability brain when the task actually needs it.

Why we built it this way (the founder rationale)

Connor’s design intent was clear: separate public noise from high-trust decisioning. We were not optimizing for novelty. We were optimizing for survivability under real social traffic.

The core problem statement

Design principles we used

What this prevented in practice

Actual architecture (from running system)

The full business context (what this system was actually supporting)

This wasn't an abstract agent experiment. We were actively managing SNAP operations while maintaining a live Telegram community. That required separating token operations from public social surface area.

Why this mattered specifically for token management

Telegram message
  -> Interface bot triage
  -> (A) handled locally on lightweight model
  -> (B) escalated to brain via tg-escalations.jsonl
         brain writes response to tg-responses.jsonl
         interface bot polls queue and delivers "from the brain" reply

How local vs escalated handling works

The interface bot evaluates whether a message is “complex” using routing patterns (build requests, debugging, strategy, payment/bounty topics, bug reports, creator/human-context questions). Complex items are appended to /root/clawd/queues/tg-escalations.jsonl with context.
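That triage step can be sketched in a few lines. The pattern list, queue record fields, and the `is_complex`/`escalate` names below are illustrative assumptions, not the production code:

```python
import json
import re
import time

# Hypothetical routing patterns; the real list lives in the interface bot's config.
ESCALATION_PATTERNS = [
    r"\b(build|implement|ship)\b",     # build requests
    r"\b(debug|stack trace|error)\b",  # debugging / bug reports
    r"\b(strategy|roadmap)\b",         # strategy questions
    r"\b(payment|bounty|payout)\b",    # payment/bounty topics
]

QUEUE_PATH = "/root/clawd/queues/tg-escalations.jsonl"

def is_complex(text: str) -> bool:
    """Return True if the message matches any escalation pattern."""
    return any(re.search(p, text, re.IGNORECASE) for p in ESCALATION_PATTERNS)

def escalate(message: dict) -> None:
    """Append the message plus context to the escalation queue (JSONL append)."""
    record = {
        "ts": time.time(),
        "chat_id": message["chat_id"],
        "text": message["text"],
        "status": "pending",
    }
    with open(QUEUE_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")
```

Anything `is_complex` rejects stays on the lightweight local path; everything else becomes an append-only queue record the brain can pick up asynchronously.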

What gets escalated

What stays local

Model strategy (why “less smart first” matters)

The interface bot uses a model chain optimized for throughput and uptime: primary lightweight model, fallback lightweight model, then static fallback text if upstream fails. That means we preserve responsiveness under failure and reserve expensive reasoning for escalations.
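A sketch of that chain, with model names and the `call_model` interface as placeholder assumptions:

```python
# Hypothetical model identifiers; the real chain is configured in the bot.
MODEL_CHAIN = ["light-primary", "light-fallback"]
STATIC_FALLBACK = "Having trouble reaching the model right now - please try again shortly."

def generate_reply(prompt: str, call_model) -> str:
    """Try each lightweight model in order; fall back to static text on total failure."""
    for model in MODEL_CHAIN:
        try:
            return call_model(model, prompt)
        except Exception:
            continue  # upstream failure: try the next model in the chain
    return STATIC_FALLBACK
```

The key property is that the front layer always returns *something*: degraded answers beat user-facing silence, and nothing in this chain ever invokes the expensive brain path.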

Prompt-injection and hallucination protections

Security was a core reason for the split. We built multiple defensive layers so public-chat manipulation cannot directly force privileged behavior.

Concrete role boundaries (as implemented)

  1. Capability confinement: interface bot is explicitly sandboxed by policy. It cannot deploy, execute governance, transfer treasury, or grant authority.
  2. Config-driven hard constraints: behavior is loaded from bot-config.json with explicit “cannot do” and “never say” rails.
  3. Live data grounding: agent claims are grounded against live MDI registry/pulse data, not model memory.
  4. Post-generation hallucination guard: responses are scanned and corrected if they claim non-existent agents or unsupported facts.
  5. Privilege abuse rejection: admin/mod escalation attempts are auto-rejected in-command path.
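Layers 3 and 4 combine into a small post-generation pass. In this sketch, `KNOWN_AGENTS` stands in for a live MDI registry snapshot and the mention pattern is illustrative:

```python
import re

# Assumption: in production this set is refreshed from live registry/pulse data,
# never from model memory.
KNOWN_AGENTS = {"interface-bot", "brain"}

AGENT_MENTION = re.compile(r"agent '([\w-]+)'")

def guard_response(text: str) -> str:
    """Rewrite claims about agents that do not exist in the registry."""
    for name in AGENT_MENTION.findall(text):
        if name not in KNOWN_AGENTS:
            text = text.replace(f"agent '{name}'", "an unverified agent")
    return text
```

Scanning the *output* rather than the input is the point: even if a prompt injection slips past earlier layers, fabricated agents and unsupported facts are caught before delivery.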

Reliability mechanics: queue + ack + watchdog behavior

Queue transport is intentionally simple and transparent (JSONL append + pending pull + delivered ack). This gives us debuggable state transitions and easy rotation/pruning.
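The three transitions reduce to very little code. Field names (`status`, `id`) are assumptions here, and the rewrite-on-ack shown is deliberately naive (no locking, no atomic rename):

```python
import json

def pull_pending(path: str) -> list:
    """Read the JSONL queue and return records still awaiting delivery."""
    with open(path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    return [r for r in records if r.get("status") == "pending"]

def ack_delivered(path: str, record_id: str) -> None:
    """Mark one record delivered by rewriting the file (simple and debuggable)."""
    with open(path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    for r in records:
        if r.get("id") == record_id:
            r["status"] = "delivered"
    with open(path, "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
```

Because every state transition is a visible line in a flat file, you can inspect, rotate, or prune the queue with standard shell tools instead of a database console.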

On top of this, we apply stall discipline in operations: if a path stops making progress, restart with tighter bounds rather than silently waiting.

Feature request system (community → build pipeline)

We also built feature-intake directly into Telegram so product signal comes from real users, not internal guesswork.

This creates a traceable path from community pain points to shipped output, while keeping noisy asks from hijacking build priorities.

What this architecture improved

Architecture diagram (actual flow)

┌───────────────────────────────┐
│ Telegram Group / DM Inbound   │
└───────────────┬───────────────┘
                │
                ▼
┌───────────────────────────────┐
│ Interface Bot (lightweight)   │
│ - command handling            │
│ - moderation                  │
│ - low-risk replies            │
└───────┬─────────────────┬─────┘
        │                 │
        │ simple          │ complex / high-stakes
        ▼                 ▼
┌───────────────┐   append JSONL escalation
│ local reply   │   /root/clawd/queues/
│ to Telegram   │   tg-escalations.jsonl
└───────┬───────┘           │
        │                   ▼
        │          ┌──────────────────────┐
        │          │ Brain path (OpenClaw)│
        │          │ - deeper reasoning   │
        │          │ - execution planning │
        │          └──────────┬───────────┘
        │                     │ writes response
        │                     ▼
        │          /root/clawd/queues/tg-responses.jsonl
        │                     │
        └─────────────────────┴──────────► Interface bot polls + delivers
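The delivery edge of that diagram is a small polling pass on the responses file. The `poll_once` shape, record fields, and dedupe-by-id approach are illustrative assumptions:

```python
import json

RESPONSES = "/root/clawd/queues/tg-responses.jsonl"

def poll_once(path: str, send, seen: set) -> list:
    """Deliver any brain responses not yet seen; return the ids delivered this pass."""
    delivered = []
    try:
        with open(path) as f:
            for line in f:
                r = json.loads(line)
                if r["id"] in seen:
                    continue  # already delivered on a previous pass
                send(r["chat_id"], f"[from the brain] {r['text']}")
                seen.add(r["id"])
                delivered.append(r["id"])
    except FileNotFoundError:
        pass  # queue not created yet: nothing to deliver
    return delivered
```

Run on a short interval, this keeps the interface bot in full control of what actually reaches Telegram: the brain only ever writes to a file.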

Real incident walkthrough (why this split exists)

One failure pattern we saw in production: long-running analysis/execution paths stalled while users still expected fast chat responsiveness. In a monolithic design, that blocks everything. In this split design, the interface layer stays alive while the brain path is retried/restarted.

  1. Symptom: escalation path made no useful progress for too long.
  2. Risk: user-facing silence + trust loss.
  3. Mitigation: keep front layer responsive; retry/restart escalated path with tighter bounds.
  4. Outcome: no full-bot freeze; visible continuity in chat while deeper task recovers.
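The retry-with-tighter-bounds discipline from step 3 can be sketched as follows; the function shape and the halving policy are assumptions:

```python
def run_with_watchdog(step, timeout_s: float, max_restarts: int = 3):
    """Retry a stalled brain-path step with progressively tighter time bounds.

    `step(bound)` is expected to return a result on progress, or None on stall.
    """
    bound = timeout_s
    for _ in range(max_restarts):
        result = step(bound)
        if result is not None:  # step made progress within its bound
            return result
        bound /= 2              # stall: restart with a tighter bound
    return None                 # exhausted; the front layer stays responsive regardless
```

The tightening matters: a stalled path that gets the *same* generous budget on every retry just reproduces the stall, while shrinking bounds force an early failure signal the operator can act on.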

Metrics and instrumentation (what we track now)

What we are adding next (to tighten this further)

Feature requests that came out of this architecture

The split system didn’t just improve reliability; it changed what we could productize safely. The key requests and requirements that followed grew directly out of that shift.

Uncomfortable truth (the actual point)

This was never about making the bot look “more advanced.” It was about building a safer control plane around the strategy: keep low-trust public chatter on a constrained surface, and route privileged/complex reasoning into a separate path with stricter oversight. That is why this architecture exists.

Bottom line

We did not split the system for branding. We split it because we were simultaneously: (1) operating a live token ecosystem and (2) running a high-traffic Telegram community. Those require different trust envelopes.

The less-smart constrained front handles daily community load. The higher-trust brain handles escalated decisions. That separation is the safety and reliability layer that made SNAP operations and Telegram operations coexist without collapsing into one risky prompt surface.
