Telegram Agent Orchestration: Why We Split the Bot and Brain

Published: 2026-03-10 • Updated: 2026-03-10 • Snap Engineering

This is the real architecture we run in production, not the simplified version. The whole point was not “two agents for aesthetics.” The point was operational separation: put a cheaper/less-smart day-to-day chat system in front, and only escalate to the higher-capability brain when the task actually needs it.

Why we built it this way (the founder rationale)

Connor’s design intent was clear: separate public noise from high-trust decisioning. We were not optimizing for novelty. We were optimizing for survivability under real social traffic.

The core problem statement

Design principles we used

What this prevented in practice

Actual architecture (from running system)

The full business context (what this system was actually supporting)

This wasn't an abstract agent experiment. We were actively managing SNAP operations while maintaining a live Telegram community. That required separating token operations from public social surface area.

Why this mattered specifically for token management

Telegram message
  -> Interface bot triage
  -> (A) handled locally on lightweight model
  -> (B) escalated to brain via tg-escalations.jsonl
         brain writes response to tg-responses.jsonl
         interface bot polls queue and delivers "from the brain" reply

How local vs escalated handling works

The interface bot evaluates whether a message is “complex” using routing patterns (build requests, debugging, strategy, payment/bounty topics, bug reports, creator/human-context questions). Complex items are appended to /root/clawd/queues/tg-escalations.jsonl with context.
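That triage step can be sketched in a few lines. The pattern list, queue record fields, and the `is_complex`/`escalate` names below are illustrative assumptions, not the production code:

```python
import json
import re
import time

# Hypothetical routing patterns; the real list lives in the interface bot's config.
ESCALATION_PATTERNS = [
    r"\b(build|implement|ship)\b",     # build requests
    r"\b(debug|stack trace|error)\b",  # debugging / bug reports
    r"\b(strategy|roadmap)\b",         # strategy questions
    r"\b(payment|bounty|payout)\b",    # payment/bounty topics
]

QUEUE_PATH = "/root/clawd/queues/tg-escalations.jsonl"

def is_complex(text: str) -> bool:
    """Return True if the message matches any escalation pattern."""
    return any(re.search(p, text, re.IGNORECASE) for p in ESCALATION_PATTERNS)

def escalate(message: dict) -> None:
    """Append the message plus context to the escalation queue (JSONL append)."""
    record = {
        "ts": time.time(),
        "chat_id": message["chat_id"],
        "text": message["text"],
        "status": "pending",
    }
    with open(QUEUE_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")
```

Anything `is_complex` rejects stays on the lightweight local path; everything else becomes an append-only queue record the brain can pick up asynchronously.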

What gets escalated

What stays local

Model strategy (why “less smart first” matters)

The interface bot uses a model chain optimized for throughput and uptime: primary lightweight model, fallback lightweight model, then static fallback text if upstream fails. That means we preserve responsiveness under failure and reserve expensive reasoning for escalations.
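A sketch of that chain, with model names and the `call_model` interface as placeholder assumptions:

```python
# Hypothetical model identifiers; the real chain is configured in the bot.
MODEL_CHAIN = ["light-primary", "light-fallback"]
STATIC_FALLBACK = "Having trouble reaching the model right now - please try again shortly."

def generate_reply(prompt: str, call_model) -> str:
    """Try each lightweight model in order; fall back to static text on total failure."""
    for model in MODEL_CHAIN:
        try:
            return call_model(model, prompt)
        except Exception:
            continue  # upstream failure: try the next model in the chain
    return STATIC_FALLBACK
```

The key property is that the front layer always returns *something*: degraded answers beat user-facing silence, and nothing in this chain ever invokes the expensive brain path.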

Prompt-injection and hallucination protections

Security was a core reason for the split. We built multiple defensive layers so public-chat manipulation cannot directly force privileged behavior.

Concrete role boundaries (as implemented)

  1. Capability confinement: interface bot is explicitly sandboxed by policy. It cannot deploy, execute governance, transfer treasury, or grant authority.
  2. Config-driven hard constraints: behavior is loaded from bot-config.json with explicit “cannot do” and “never say” rails.
  3. Live data grounding: agent claims are grounded against live MDI registry/pulse data, not model memory.
  4. Post-generation hallucination guard: responses are scanned and corrected if they claim non-existent agents or unsupported facts.
  5. Privilege abuse rejection: admin/mod escalation attempts are auto-rejected in-command path.
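Layers 3 and 4 combine into a small post-generation pass. In this sketch, `KNOWN_AGENTS` stands in for a live MDI registry snapshot and the mention pattern is illustrative:

```python
import re

# Assumption: in production this set is refreshed from live registry/pulse data,
# never from model memory.
KNOWN_AGENTS = {"interface-bot", "brain"}

AGENT_MENTION = re.compile(r"agent '([\w-]+)'")

def guard_response(text: str) -> str:
    """Rewrite claims about agents that do not exist in the registry."""
    for name in AGENT_MENTION.findall(text):
        if name not in KNOWN_AGENTS:
            text = text.replace(f"agent '{name}'", "an unverified agent")
    return text
```

Scanning the *output* rather than the input is the point: even if a prompt injection slips past earlier layers, fabricated agents and unsupported facts are caught before delivery.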

Reliability mechanics: queue + ack + watchdog behavior

Queue transport is intentionally simple and transparent (JSONL append + pending pull + delivered ack). This gives us debuggable state transitions and easy rotation/pruning.
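The three transitions reduce to very little code. Field names (`status`, `id`) are assumptions here, and the rewrite-on-ack shown is deliberately naive (no locking, no atomic rename):

```python
import json

def pull_pending(path: str) -> list:
    """Read the JSONL queue and return records still awaiting delivery."""
    with open(path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    return [r for r in records if r.get("status") == "pending"]

def ack_delivered(path: str, record_id: str) -> None:
    """Mark one record delivered by rewriting the file (simple and debuggable)."""
    with open(path) as f:
        records = [json.loads(line) for line in f if line.strip()]
    for r in records:
        if r.get("id") == record_id:
            r["status"] = "delivered"
    with open(path, "w") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
```

Because every state transition is a visible line in a flat file, you can inspect, rotate, or prune the queue with standard shell tools instead of a database console.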

On top of this, we apply stall discipline in operations: if a path stops making progress, restart with tighter bounds rather than silently waiting.

Feature request system (community → build pipeline)

We also built feature-intake directly into Telegram so product signal comes from real users, not internal guesswork.

This creates a traceable path from community pain points to shipped output, while keeping noisy asks from hijacking build priorities.

What this architecture improved

Architecture diagram (actual flow)

┌───────────────────────────────┐
│ Telegram Group / DM Inbound   │
└───────────────┬───────────────┘
                │
                ▼
┌───────────────────────────────┐
│ Interface Bot (lightweight)   │
│ - command handling            │
│ - moderation                  │
│ - low-risk replies            │
└───────┬─────────────────┬─────┘
        │                 │
        │ simple          │ complex / high-stakes
        ▼                 ▼
┌───────────────┐   append JSONL escalation
│ local reply   │   /root/clawd/queues/
│ to Telegram   │   tg-escalations.jsonl
└───────┬───────┘           │
        │                   ▼
        │          ┌──────────────────────┐
        │          │ Brain path (OpenClaw)│
        │          │ - deeper reasoning   │
        │          │ - execution planning │
        │          └──────────┬───────────┘
        │                     │ writes response
        │                     ▼
        │          /root/clawd/queues/tg-responses.jsonl
        │                     │
        └─────────────────────┴──────────► Interface bot polls + delivers
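The delivery edge of that diagram is a small polling pass on the responses file. The `poll_once` shape, record fields, and dedupe-by-id approach are illustrative assumptions:

```python
import json

RESPONSES = "/root/clawd/queues/tg-responses.jsonl"

def poll_once(path: str, send, seen: set) -> list:
    """Deliver any brain responses not yet seen; return the ids delivered this pass."""
    delivered = []
    try:
        with open(path) as f:
            for line in f:
                r = json.loads(line)
                if r["id"] in seen:
                    continue  # already delivered on a previous pass
                send(r["chat_id"], f"[from the brain] {r['text']}")
                seen.add(r["id"])
                delivered.append(r["id"])
    except FileNotFoundError:
        pass  # queue not created yet: nothing to deliver
    return delivered
```

Run on a short interval, this keeps the interface bot in full control of what actually reaches Telegram: the brain only ever writes to a file.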

Real incident walkthrough (why this split exists)

One failure pattern we saw in production: long-running analysis/execution paths stalled while users still expected fast chat responsiveness. In a monolithic design, that blocks everything. In this split design, the interface layer stays alive while the brain path is retried/restarted.

  1. Symptom: escalation path made no useful progress for too long.
  2. Risk: user-facing silence + trust loss.
  3. Mitigation: keep front layer responsive; retry/restart escalated path with tighter bounds.
  4. Outcome: no full-bot freeze; visible continuity in chat while deeper task recovers.
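The retry-with-tighter-bounds discipline from step 3 can be sketched as follows; the function shape and the halving policy are assumptions:

```python
def run_with_watchdog(step, timeout_s: float, max_restarts: int = 3):
    """Retry a stalled brain-path step with progressively tighter time bounds.

    `step(bound)` is expected to return a result on progress, or None on stall.
    """
    bound = timeout_s
    for _ in range(max_restarts):
        result = step(bound)
        if result is not None:  # step made progress within its bound
            return result
        bound /= 2              # stall: restart with a tighter bound
    return None                 # exhausted; the front layer stays responsive regardless
```

The tightening matters: a stalled path that gets the *same* generous budget on every retry just reproduces the stall, while shrinking bounds force an early failure signal the operator can act on.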

Metrics and instrumentation (what we track now)

What we are adding next (to tighten this further)

Feature requests that came out of this architecture

The split system didn’t just improve reliability; it changed what we could productize safely. The key requests and requirements that followed grew directly out of that shift.

Uncomfortable truth (the actual point)

This was never about making the bot look “more advanced.” It was about building a safer control plane around the strategy: keep low-trust public chatter on a constrained surface, and route privileged/complex reasoning into a separate path with stricter oversight. That is why this architecture exists.

Bottom line

We did not split the system for branding. We split it because we were simultaneously: (1) operating a live token ecosystem and (2) running a high-traffic Telegram community. Those require different trust envelopes.

The less-smart constrained front handles daily community load. The higher-trust brain handles escalated decisions. That separation is the safety and reliability layer that made SNAP operations and Telegram operations coexist without collapsing into one risky prompt surface.
