Architecture

ScalyClaw is built around three independent processes — Node, Worker, and Dashboard — connected entirely through Redis. There are no direct inter-process HTTP calls; all coordination happens through Redis data structures, BullMQ queues, and pub/sub channels. This design means each component can be scaled, restarted, or replaced independently without affecting the others.

Overview

Each process has a focused responsibility. The Node is the stateful brain: it owns channel connections, the orchestrator, and LLM calls. The Worker is the stateless execution engine: it picks up jobs from BullMQ queues and executes code, agents, and tools without holding any in-memory state. The Dashboard is a React 19 single-page application that reads and writes through a WebSocket-capable API, reflecting system state in real time.

| Component | Technology | Responsibility |
|---|---|---|
| Node (stateful) | Bun, TypeScript | Channel adapters, session state machine, guards, orchestrator, system prompt assembly, LLM routing, queue producers |
| Worker (stateless) | Bun, TypeScript, BullMQ | BullMQ consumer for the scalyclaw-tools queue only — sandboxed code execution, skill invocations, and shell commands |
| Dashboard (UI) | React 19, Vite, WebSocket | 16-page admin SPA — config management, channel setup, mind editor, memory browser, skill/agent management, real-time logs |

All three components connect to the same Redis instance. Redis plays six distinct roles in the system:

  • Message bus — BullMQ queues for all async work
  • Session store — per-channel state machine data
  • Pending queue — structured message buffer per channel
  • Config store — live system configuration at scalyclaw:config
  • Secret vault — encrypted secrets at scalyclaw:secret:*
  • Pub/sub — hot-reload signals for skills, agents, and config

Design principle

ScalyClaw intentionally avoids config files on disk for runtime configuration. Everything the running system needs — API keys, model settings, channel tokens, feature flags — lives in Redis and can be changed live from the dashboard without a restart.

Component Diagram

```
Telegram · Discord · Slack · Web · ...
                │
                ▼
NODE       (channel adapters → orchestrator → LLM)
                ↕  Redis (BullMQ + pub/sub)
WORKER     (message · agents · tools · scheduler)
                ↕  Redis (config + WebSocket events)
DASHBOARD  (React 19 SPA · real-time WebSocket)
```

Message Flow

Every inbound message follows a deterministic pipeline. The pipeline is designed so that each stage can short-circuit cleanly — a guard rejection or session conflict stops processing early without leaking state.

1. Channel receives message
2. Session state machine (Redis Lua — atomic check-and-set)
   ↓ accepted → PROCESSING state
3. Pending queue (scalyclaw:pending:{channelId})
   ↓ dequeue next message
4. Echo guard (reject messages from self)
   ↓ passed
5. Content guard (LLM-based policy check)
   ↓ approved
6. Orchestrator builds system prompt
   ↓ disk files + code sections + dynamic data
7. LLM call (streamed, with tool-use support)
8. Tool execution? (local / tools queue / agents queue)
   ↓ loop until done
9. Response sent back to channel
10. Session state → IDLE (Lua atomic)

Session State Machine

Each channel gets its own isolated session state stored in Redis at scalyclaw:session:{channelId}. The state machine has six states:

| State | Meaning |
|---|---|
| IDLE | No active request; ready to accept a new message |
| PROCESSING | Orchestrator pipeline is running for this channel |
| TOOL_EXEC | Waiting for a BullMQ tool or agent job to complete |
| RESPONDING | Streaming or sending the LLM response back to the channel |
| DRAINING | Finishing the current turn; will pick up the next pending message |
| CANCELLING | Abort requested; cleaning up and returning to IDLE |

All state transitions are performed by a Lua script executed atomically inside Redis. This guarantees that no two concurrent requests — even with multiple worker instances — can race into the PROCESSING state for the same channel simultaneously. The Lua atomicity also ensures the pending queue and session state stay consistent with each other at all times.

```lua
-- Atomic transition: IDLE → PROCESSING
-- Returns 1 if acquired, 0 if the channel is already busy
local key = KEYS[1]           -- scalyclaw:session:{channelId}
local state = redis.call("GET", key)
if state and state ~= "IDLE" then
  return 0                    -- busy in any non-IDLE state; enqueue, do not process now
end
redis.call("SET", key, "PROCESSING", "EX", 300)  -- 5-minute safety TTL
return 1                      -- caller may proceed
```
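
On the Node side, the result of that acquisition decides whether a message is processed immediately or parked in the pending queue. The semantics can be modeled with a small in-memory sketch (the real version executes the Lua script atomically inside Redis; the function names here are illustrative):

```typescript
// In-memory model of the IDLE → PROCESSING acquisition semantics.
// The real implementation runs the Lua script via EVAL so the
// check-and-set is atomic even across multiple processes.
const sessions = new Map<string, string>();       // channelId → state

function tryAcquire(channelId: string): boolean {
  const state = sessions.get(channelId);
  if (state !== undefined && state !== "IDLE") {
    return false;                                  // busy: caller should enqueue
  }
  sessions.set(channelId, "PROCESSING");
  return true;                                     // caller may proceed
}

function release(channelId: string): void {
  sessions.set(channelId, "IDLE");                 // turn finished; accept next message
}
```
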

System Prompt Assembly

The orchestrator builds the system prompt fresh for every LLM call. It combines three sources:

  • Disk files — mind/IDENTITY.md, mind/SOUL.md, mind/USER.md (user-editable personality)
  • Code sections — six hardcoded sections in scalyclaw/src/prompt/: orchestrator, home, memory, vault, agents, skills
  • Dynamic data — current time, active channel, recent memories retrieved via semantic search, resolved secrets from vault, skill manifests, agent definitions
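
A simplified sketch of combining the three sources into one prompt string (the file names match the docs above; the assembly function and its shape are illustrative, not the orchestrator's actual code):

```typescript
// Illustrative prompt assembly: disk files + code sections + dynamic data.
// ScalyClaw rebuilds this fresh for every LLM call.
function assemblePrompt(
  mindFiles: Record<string, string>,      // e.g. IDENTITY.md, SOUL.md, USER.md
  codeSections: string[],                 // orchestrator, home, memory, vault, agents, skills
  dynamic: { now: Date; channel: string; memories: string[] },
): string {
  return [
    ...Object.values(mindFiles),
    ...codeSections,
    `Current time: ${dynamic.now.toISOString()}`,
    `Active channel: ${dynamic.channel}`,
    ...dynamic.memories.map((m) => `Memory: ${m}`),
  ].join("\n\n");
}
```
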

Tool Execution Routing

When the LLM emits a tool call, the unified tool router in tool-impl.ts decides where to execute it. Local tools run inline in the Node process. Heavy or sandboxed tools are dispatched to BullMQ queues and the Node awaits the result:

| Tool | Execution target | Queue |
|---|---|---|
| execute_code | Worker sandbox | scalyclaw-tools |
| execute_skill | Worker skill runner | scalyclaw-tools |
| execute_command | Worker shell executor | scalyclaw-tools |
| delegate_agent | Node agent executor | scalyclaw-agents |
| Everything else | Node — inline (direct execution) | — (no queue) |
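
The routing table above boils down to a small lookup. This is a sketch of the idea behind tool-impl.ts, not its actual code:

```typescript
// Sketch of the tool router: map a tool name to its execution target and queue.
type Route = { target: "node" | "worker"; queue?: string };

function routeTool(name: string): Route {
  switch (name) {
    case "execute_code":
    case "execute_skill":
    case "execute_command":
      return { target: "worker", queue: "scalyclaw-tools" };  // sandboxed, via BullMQ
    case "delegate_agent":
      return { target: "node", queue: "scalyclaw-agents" };   // agent executor in Node
    default:
      return { target: "node" };                              // inline, no queue
  }
}
```
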

Queue System

ScalyClaw uses six BullMQ queues. The separate Worker process consumes only the scalyclaw-tools queue. The Node process runs its own BullMQ workers for the remaining five queues. Jobs are persistent — Redis stores them until acknowledged — so a restart never loses work in progress.

BullMQ queue naming

BullMQ forbids colons (:) in queue names because it uses colons internally as key separators in Redis. All ScalyClaw queue names use hyphens — e.g. scalyclaw-messages, not scalyclaw:messages.
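
A guard like the following (illustrative, not part of ScalyClaw's codebase) catches an invalid name before a queue is ever created:

```typescript
// BullMQ uses ":" internally as a Redis key separator,
// so queue names must never contain a colon.
function assertQueueName(name: string): string {
  if (name.includes(":")) {
    throw new Error(`Invalid BullMQ queue name "${name}": colons are reserved`);
  }
  return name;
}
```
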

| Queue name | Consumed by | Concurrency | Job types |
|---|---|---|---|
| scalyclaw-messages | Node (message-processor.ts) | 3 | message-processing, command |
| scalyclaw-agents | Node (agent-processor.ts) | 3 | agent-task |
| scalyclaw-tools | Worker (tool-processor.ts) | | tool-execution, skill-execution |
| scalyclaw-proactive | Node (proactive-processor.ts) | 1 | proactive-check |
| scalyclaw-scheduler | Node (scheduler-processor.ts) | 2 | reminder, recurrent-reminder, task, recurrent-task |
| scalyclaw-system | Node (system-processor.ts) | 2 | memory-extraction, scheduled-fire, proactive-fire |

Worker Scaling

Because the Worker is stateless and only consumes the scalyclaw-tools queue, you can run as many Worker processes as you need to scale tool throughput. Each Worker connects to the same Redis and competes with peers for jobs. BullMQ handles distributed locking internally — a job is processed by exactly one worker even when many are running. The five Node-internal queues (scalyclaw-messages, scalyclaw-agents, scalyclaw-proactive, scalyclaw-scheduler, scalyclaw-system) are always consumed by the Node process itself.

```bash
# Run two workers for higher tool throughput
scalyclaw worker start &
scalyclaw worker start &
```

Configuration

ScalyClaw stores all runtime configuration in Redis at the key scalyclaw:config as a JSON object. There are no config files on disk — the install is self-contained and portable. When you change a setting in the dashboard, it writes directly to Redis; all processes pick it up without a restart.
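
Reading the live config is a single GET plus a JSON parse. A hedged sketch of the loading step, assuming a shallow merge over built-in defaults (the helper name and the defaults shown are illustrative):

```typescript
// Illustrative config loader: parse the JSON stored at scalyclaw:config,
// falling back to defaults when the key has not been written yet.
// `raw` is the string returned by `redis.get("scalyclaw:config")`, or null.
interface Config {
  logs: { level: string };
  [key: string]: unknown;
}

const defaults: Config = { logs: { level: "info" } };

function parseConfig(raw: string | null): Config {
  if (raw === null) return defaults;           // key not yet written
  return { ...defaults, ...JSON.parse(raw) };  // shallow merge over defaults
}
```
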

Config Structure

```json
{
  "orchestrator": {
    "id": "default",
    "maxIterations": 50,
    "models": [{ "model": "claude-sonnet-4-20250514", "weight": 100, "priority": 1 }],
    "skills": [],
    "agents": []
  },
  "gateway": {
    "host": "127.0.0.1",
    "port": 3000,
    "bind": "127.0.0.1",
    "authType": "none",
    "authValue": null,
    "tls": { "cert": "", "key": "" },
    "cors": ["*"]
  },
  "logs": { "level": "info", "format": "json", "type": "console" },
  "memory": {
    "topK": 10,
    "scoreThreshold": 0.5,
    "embeddingModel": "auto"
  },
  "queue": {
    "lockDuration": 120000,
    "stalledInterval": 30000,
    "limiter": { "max": 10, "duration": 1000 },
    "removeOnComplete": { "age": 86400, "count": 1000 },
    "removeOnFail": { "age": 604800 }
  },
  "models": {
    "providers": {
      "anthropic": { "apiKey": "sk-ant-..." },
      "openai": { "apiKey": "sk-..." }
    },
    "models": [{
      "id": "claude-sonnet",
      "name": "claude-sonnet-4-20250514",
      "provider": "anthropic",
      "enabled": true,
      "priority": 1,
      "weight": 100,
      "temperature": 0.7,
      "maxTokens": 8192,
      "contextWindow": 200000,
      "toolEnabled": true,
      "imageEnabled": true,
      "audioEnabled": false,
      "videoEnabled": false,
      "documentEnabled": true,
      "reasoningEnabled": false,
      "inputPricePerMillion": 3,
      "outputPricePerMillion": 15
    }],
    "embeddingModels": [{
      "id": "text-embedding-3-small",
      "name": "text-embedding-3-small",
      "provider": "openai",
      "enabled": true,
      "priority": 1,
      "weight": 100,
      "dimensions": 1536,
      "inputPricePerMillion": 0.02,
      "outputPricePerMillion": 0
    }]
  },
  "guards": {
    "message": {
      "enabled": false,
      "model": "",
      "echoGuard": { "enabled": true, "similarityThreshold": 0.9 },
      "contentGuard": { "enabled": true }
    },
    "skill": { "enabled": false, "model": "" },
    "agent": { "enabled": false, "model": "" }
  },
  "budget": {
    "monthlyLimit": 0,
    "dailyLimit": 0,
    "hardLimit": false,
    "alertThresholds": [50, 80, 90]
  },
  "proactive": {
    "enabled": true,
    "model": "",
    "cronPattern": "*/15 * * * *",
    "idleThresholdMinutes": 120,
    "cooldownSeconds": 14400,
    "maxPerDay": 3,
    "quietHours": {
      "enabled": true,
      "start": 22,
      "end": 8,
      "timezone": "UTC"
    },
    "triggers": {
      "undeliveredResults": true,
      "firedScheduledItems": true,
      "unansweredMessages": true
    }
  },
  "channels": {},
  "skills": [],
  "mcpServers": {}
}
```

Hot Reload via Pub/Sub

When skills or agents change — whether edited in the dashboard or auto-created by the LLM — Redis pub/sub signals all running processes to reload their in-memory manifests without a restart. The Node subscribes to these channels at startup and refreshes its internal registry immediately on receipt.

| Pub/sub channel | Triggered by | Effect |
|---|---|---|
| scalyclaw:skills:reload | Dashboard skill editor, LLM skill creation tool | Node reloads all skill manifests from Redis; Worker flushes its skill module cache |
| scalyclaw:agents:reload | Dashboard agent editor, LLM agent creation tool | Node reloads all agent definitions; next delegate_agent call uses updated config |
| scalyclaw:config:reload | Dashboard config editor | All processes re-read scalyclaw:config from Redis and apply the new settings without a restart |

```typescript
// Node subscribes at startup
const subscriber = redis.duplicate();
await subscriber.subscribe(
  "scalyclaw:skills:reload",
  "scalyclaw:agents:reload",
  "scalyclaw:config:reload",
);

subscriber.on("message", async (channel) => {
  if (channel === "scalyclaw:skills:reload") {
    await reloadSkills();
  } else if (channel === "scalyclaw:agents:reload") {
    await reloadAgents();
  } else if (channel === "scalyclaw:config:reload") {
    await reloadConfig();
  }
});
```

Secrets

API keys and tokens are stored in the config object directly (e.g., providers.anthropic.apiKey). For additional secrets (channel tokens, MCP headers, etc.), use the vault at scalyclaw:secret:{name} in Redis. Secrets are managed via the dashboard Vault page and never written to disk.

Tip

You can inspect the live config at any time with redis-cli GET scalyclaw:config | jq. Changes written by the dashboard are immediately visible there.