Architecture

ScalyClaw is built around three independent processes — Node, Worker, and Dashboard — connected entirely through Redis. There are no direct inter-process HTTP calls; all coordination happens through Redis data structures, BullMQ queues, and pub/sub channels. This design means each component can be scaled, restarted, or replaced independently without affecting the others.

Overview

Each process has a focused responsibility. The Node is the stateful brain: it owns channel connections, the orchestrator, and LLM calls. The Worker is the stateless execution engine: it picks up jobs from BullMQ queues and executes code, agents, and tools without holding any in-memory state. The Dashboard is a React 19 single-page application that reads and writes through a WebSocket-capable API, reflecting system state in real time.

| Component | Technology | Responsibility |
|---|---|---|
| Node (stateful) | Bun, TypeScript | Channel adapters, session state machine, guards, orchestrator, system prompt assembly, LLM routing, queue producers |
| Worker (stateless) | Bun, TypeScript, BullMQ | BullMQ consumer for the scalyclaw-tools queue only — sandboxed code execution, skill invocations, and shell commands |
| Dashboard (UI) | React 19, Vite, WebSocket | 16-page admin SPA — config management, channel setup, mind editor, memory browser, skill/agent management, real-time logs |

All three components connect to the same Redis instance. Redis plays six distinct roles in the system:

  • Message bus — BullMQ queues for all async work
  • Session store — per-channel state machine data
  • Pending queue — structured message buffer per channel
  • Config store — live system configuration at scalyclaw:config
  • Secret vault — encrypted secrets at scalyclaw:secret:*
  • Pub/sub — hot-reload signals for skills, agents, and config

Design principle

ScalyClaw intentionally avoids config files on disk for runtime configuration. Everything the running system needs — API keys, model settings, channel tokens, feature flags — lives in Redis and can be changed live from the dashboard without a restart.

Component Diagram

```
Telegram · Discord · Slack · Web · ...
                │
                ▼
NODE       (channel adapters → orchestrator → LLM)
                ↕  Redis (BullMQ + pub/sub)
WORKER     (message · agents · tools · scheduler)
                ↕  Redis (config + WebSocket events)
DASHBOARD  (React 19 SPA · real-time WebSocket)
```

Message Flow

Every inbound message follows a deterministic pipeline. The pipeline is designed so that each stage can short-circuit cleanly — a guard rejection or session conflict stops processing early without leaking state.

1. Channel receives message
2. Session state machine (Redis Lua — atomic check-and-set)
   ↓ accepted → PROCESSING state
3. Pending queue (scalyclaw:pending:{channelId})
   ↓ dequeue next message
4. Echo guard (reject messages from self)
   ↓ passed
5. Content guard (LLM-based policy check)
   ↓ approved
6. Orchestrator builds system prompt
   ↓ disk files + code sections + dynamic data
7. LLM call (streamed, with tool-use support)
8. Tool execution? (local / tools queue / agents queue)
   ↓ loop until done
9. Response sent back to channel
10. Session state → IDLE (Lua atomic)

Session State Machine

Each channel gets its own isolated session state stored in Redis at scalyclaw:session:{channelId}. The state machine has six states:

| State | Meaning |
|---|---|
| IDLE | No active request; ready to accept a new message |
| PROCESSING | Orchestrator pipeline is running for this channel |
| TOOL_EXEC | Waiting for a BullMQ tool or agent job to complete |
| RESPONDING | Streaming or sending the LLM response back to the channel |
| DRAINING | Finishing the current turn; will pick up the next pending message |
| CANCELLING | Abort requested; cleaning up and returning to IDLE |

All state transitions are performed by a Lua script executed atomically inside Redis. This guarantees that no two concurrent requests — even with multiple worker instances — can race into the PROCESSING state for the same channel simultaneously. The Lua atomicity also ensures the pending queue and session state stay consistent with each other at all times.

```lua
-- Atomic transition: IDLE → PROCESSING
-- Returns 1 if acquired, 0 if the channel is already busy
local key = KEYS[1]           -- scalyclaw:session:{channelId}
local state = redis.call("GET", key)
if state and state ~= "IDLE" then
  return 0                    -- busy in any non-IDLE state; enqueue, do not process now
end
redis.call("SET", key, "PROCESSING", "EX", 300)  -- 5-minute safety TTL
return 1                      -- caller may proceed
```
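
On the Node side, the result of that acquisition decides whether a message is processed immediately or parked in the pending queue. The semantics can be modeled with a small in-memory sketch (the real version executes the Lua script atomically inside Redis; the function names here are illustrative):

```typescript
// In-memory model of the IDLE → PROCESSING acquisition semantics.
// The real implementation runs the Lua script via EVAL so the
// check-and-set is atomic even across multiple processes.
const sessions = new Map<string, string>();       // channelId → state

function tryAcquire(channelId: string): boolean {
  const state = sessions.get(channelId);
  if (state !== undefined && state !== "IDLE") {
    return false;                                  // busy: caller should enqueue
  }
  sessions.set(channelId, "PROCESSING");
  return true;                                     // caller may proceed
}

function release(channelId: string): void {
  sessions.set(channelId, "IDLE");                 // turn finished; accept next message
}
```
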

System Prompt Assembly

The orchestrator builds the system prompt fresh for every LLM call. It combines three sources:

  • Disk files — mind/IDENTITY.md, mind/SOUL.md, mind/USER.md (user-editable personality)
  • Code sections — six hardcoded sections in scalyclaw/src/prompt/: orchestrator, home, memory, vault, agents, skills
  • Dynamic data — current time, active channel, recent memories retrieved via semantic search, resolved secrets from vault, skill manifests, agent definitions
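
A simplified sketch of combining the three sources into one prompt string (the file names match the docs above; the assembly function and its shape are illustrative, not the orchestrator's actual code):

```typescript
// Illustrative prompt assembly: disk files + code sections + dynamic data.
// ScalyClaw rebuilds this fresh for every LLM call.
function assemblePrompt(
  mindFiles: Record<string, string>,      // e.g. IDENTITY.md, SOUL.md, USER.md
  codeSections: string[],                 // orchestrator, home, memory, vault, agents, skills
  dynamic: { now: Date; channel: string; memories: string[] },
): string {
  return [
    ...Object.values(mindFiles),
    ...codeSections,
    `Current time: ${dynamic.now.toISOString()}`,
    `Active channel: ${dynamic.channel}`,
    ...dynamic.memories.map((m) => `Memory: ${m}`),
  ].join("\n\n");
}
```
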

Tool Execution Routing

When the LLM emits a tool call, the unified tool router in tool-impl.ts decides where to execute it. Local tools run inline in the Node process. Heavy or sandboxed tools are dispatched to BullMQ queues and the Node awaits the result:

| Tool | Execution target | Queue |
|---|---|---|
| execute_code | Worker sandbox | scalyclaw-tools |
| execute_skill | Worker skill runner | scalyclaw-tools |
| execute_command | Worker shell executor | scalyclaw-tools |
| delegate_agent | Node agent executor | scalyclaw-agents |
| Everything else | Node — inline (direct execution) | — (no queue) |
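
The routing table above boils down to a small lookup. This is a sketch of the idea behind tool-impl.ts, not its actual code:

```typescript
// Sketch of the tool router: map a tool name to its execution target and queue.
type Route = { target: "node" | "worker"; queue?: string };

function routeTool(name: string): Route {
  switch (name) {
    case "execute_code":
    case "execute_skill":
    case "execute_command":
      return { target: "worker", queue: "scalyclaw-tools" };  // sandboxed, via BullMQ
    case "delegate_agent":
      return { target: "node", queue: "scalyclaw-agents" };   // agent executor in Node
    default:
      return { target: "node" };                              // inline, no queue
  }
}
```
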

Queue System

ScalyClaw uses six BullMQ queues. The separate Worker process consumes only the scalyclaw-tools queue. The Node process runs its own BullMQ workers for the remaining five queues. Jobs are persistent — Redis stores them until acknowledged — so a restart never loses work in progress.

BullMQ queue naming

BullMQ forbids colons (:) in queue names because it uses colons internally as key separators in Redis. All ScalyClaw queue names use hyphens — e.g. scalyclaw-messages, not scalyclaw:messages.
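
A guard like the following (illustrative, not part of ScalyClaw's codebase) catches an invalid name before a queue is ever created:

```typescript
// BullMQ uses ":" internally as a Redis key separator,
// so queue names must never contain a colon.
function assertQueueName(name: string): string {
  if (name.includes(":")) {
    throw new Error(`Invalid BullMQ queue name "${name}": colons are reserved`);
  }
  return name;
}
```
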

| Queue name | Consumed by | Concurrency | Job types |
|---|---|---|---|
| scalyclaw-messages | Node (message-processor.ts) | 3 | message-processing, command |
| scalyclaw-agents | Node (agent-processor.ts) | 3 | agent-task |
| scalyclaw-tools | Worker (tool-processor.ts) | | tool-execution, skill-execution |
| scalyclaw-proactive | Node (proactive-processor.ts) | 1 | proactive-check |
| scalyclaw-scheduler | Node (scheduler-processor.ts) | 2 | reminder, recurrent-reminder, task, recurrent-task |
| scalyclaw-system | Node (system-processor.ts) | 2 | memory-extraction, scheduled-fire, proactive-fire |

Worker Scaling

Because the Worker is stateless and only consumes the scalyclaw-tools queue, you can run as many Worker processes as you need to scale tool throughput. Each Worker connects to the same Redis and competes with peers for jobs. BullMQ handles distributed locking internally — a job is processed by exactly one worker even when many are running. The five Node-internal queues (scalyclaw-messages, scalyclaw-agents, scalyclaw-proactive, scalyclaw-scheduler, scalyclaw-system) are always consumed by the Node process itself.

```bash
# Run two workers for higher tool throughput
scalyclaw worker start &
scalyclaw worker start &
```

Configuration

ScalyClaw stores all runtime configuration in Redis at the key scalyclaw:config as a JSON object. There are no config files on disk — the install is self-contained and portable. When you change a setting in the dashboard, it writes directly to Redis; all processes pick it up without a restart.
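
Reading the live config is a single GET plus a JSON parse. A hedged sketch of the loading step, assuming a shallow merge over built-in defaults (the helper name and the defaults shown are illustrative):

```typescript
// Illustrative config loader: parse the JSON stored at scalyclaw:config,
// falling back to defaults when the key has not been written yet.
// `raw` is the string returned by `redis.get("scalyclaw:config")`, or null.
interface Config {
  logs: { level: string };
  [key: string]: unknown;
}

const defaults: Config = { logs: { level: "info" } };

function parseConfig(raw: string | null): Config {
  if (raw === null) return defaults;           // key not yet written
  return { ...defaults, ...JSON.parse(raw) };  // shallow merge over defaults
}
```
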

Config Structure

```json
{
  "orchestrator": {
    "id": "default",
    "maxIterations": 50,
    "models": [{ "model": "claude-sonnet-4-20250514", "weight": 100, "priority": 1 }],
    "skills": [],
    "agents": []
  },
  "gateway": {
    "host": "127.0.0.1",
    "port": 3000,
    "bind": "127.0.0.1",
    "authType": "none",
    "authValue": null,
    "tls": { "cert": "", "key": "" },
    "cors": ["*"]
  },
  "logs": { "level": "info", "format": "json", "type": "console" },
  "memory": {
    "topK": 10,
    "scoreThreshold": 0.5,
    "embeddingModel": "auto"
  },
  "queue": {
    "lockDuration": 120000,
    "stalledInterval": 30000,
    "limiter": { "max": 10, "duration": 1000 },
    "removeOnComplete": { "age": 86400, "count": 1000 },
    "removeOnFail": { "age": 604800 }
  },
  "models": {
    "providers": {
      "anthropic": { "apiKey": "sk-ant-..." },
      "openai": { "apiKey": "sk-..." }
    },
    "models": [{
      "id": "claude-sonnet",
      "name": "claude-sonnet-4-20250514",
      "provider": "anthropic",
      "enabled": true,
      "priority": 1,
      "weight": 100,
      "temperature": 0.7,
      "maxTokens": 8192,
      "contextWindow": 200000,
      "toolEnabled": true,
      "imageEnabled": true,
      "audioEnabled": false,
      "videoEnabled": false,
      "documentEnabled": true,
      "reasoningEnabled": false,
      "inputPricePerMillion": 3,
      "outputPricePerMillion": 15
    }],
    "embeddingModels": [{
      "id": "text-embedding-3-small",
      "name": "text-embedding-3-small",
      "provider": "openai",
      "enabled": true,
      "priority": 1,
      "weight": 100,
      "dimensions": 1536,
      "inputPricePerMillion": 0.02,
      "outputPricePerMillion": 0
    }]
  },
  "guards": {
    "message": {
      "enabled": false,
      "model": "",
      "echoGuard": { "enabled": true, "similarityThreshold": 0.9 },
      "contentGuard": { "enabled": true }
    },
    "skill": { "enabled": false, "model": "" },
    "agent": { "enabled": false, "model": "" }
  },
  "budget": {
    "monthlyLimit": 0,
    "dailyLimit": 0,
    "hardLimit": false,
    "alertThresholds": [50, 80, 90]
  },
  "proactive": {
    "enabled": true,
    "model": "",
    "cronPattern": "*/15 * * * *",
    "idleThresholdMinutes": 120,
    "cooldownSeconds": 14400,
    "maxPerDay": 3,
    "quietHours": {
      "enabled": true,
      "start": 22,
      "end": 8,
      "timezone": "UTC"
    },
    "triggers": {
      "undeliveredResults": true,
      "firedScheduledItems": true,
      "unansweredMessages": true
    }
  },
  "channels": {},
  "skills": [],
  "mcpServers": {}
}
```

Hot Reload via Pub/Sub

When skills or agents change — whether edited in the dashboard or auto-created by the LLM — Redis pub/sub signals all running processes to reload their in-memory manifests without a restart. The Node subscribes to these channels at startup and refreshes its internal registry immediately on receipt.

| Pub/sub channel | Triggered by | Effect |
|---|---|---|
| scalyclaw:skills:reload | Dashboard skill editor, LLM skill creation tool | Node reloads all skill manifests from Redis; Worker flushes its skill module cache |
| scalyclaw:agents:reload | Dashboard agent editor, LLM agent creation tool | Node reloads all agent definitions; next delegate_agent call uses updated config |
| scalyclaw:config:reload | Dashboard config editor | All processes re-read scalyclaw:config from Redis and apply the new settings without a restart |

```typescript
// Node subscribes at startup
const subscriber = redis.duplicate();
await subscriber.subscribe(
  "scalyclaw:skills:reload",
  "scalyclaw:agents:reload",
  "scalyclaw:config:reload",
);

subscriber.on("message", async (channel) => {
  if (channel === "scalyclaw:skills:reload") {
    await reloadSkills();
  } else if (channel === "scalyclaw:agents:reload") {
    await reloadAgents();
  } else if (channel === "scalyclaw:config:reload") {
    await reloadConfig();
  }
});
```

Secrets

API keys and tokens are stored in the config object directly (e.g., providers.anthropic.apiKey). For additional secrets (channel tokens, MCP headers, etc.), use the vault at scalyclaw:secret:{name} in Redis. Secrets are managed via the dashboard Vault page and never written to disk.

Tip

You can inspect the live config at any time with redis-cli GET scalyclaw:config | jq. Changes written by the dashboard are immediately visible there.