Built Plan · v1 · May 2026

Brigade v1 — what ships in 5.5 hours.

A concrete build spec. Pi SDK as the engine, OpenClaw-grade internals, NanoClaw-style smallness, Boop-pattern memory + sub-agents. Every code snippet on this page is verified against live source in the workspace on May 1, 2026.

~1,200 LOC trunk · 8 tools · 6 primitives @ grade-A · 1 channel (TUI) in v1 · 5.5 h pair-program target · Opus 4.7 default model
§1 · At a glance

What Brigade IS

A personal headless agent service with a single core process, multiple streaming channel clients (TUI → Web → Mobile), persistent file-based memory, file-based skills, sub-agent support, and a deliberately small surface — built on Pi SDK so the loop is rock-solid from day one.

Identity

OpenClaw's brain, NanoClaw's smallness, in a fresh codebase you fully understand and control.

Foundation

Pi SDK (@mariozechner/pi-coding-agent@0.70.6) — the same stack OpenClaw runs on.

Wire format

WebSocket on ws://localhost:7777 — TUI, web, mobile all speak one protocol.

VIZ · 01 Brigade — system architecture

Single headless core. Many channel clients. One wire format. Every box maps to a real file in §4.

```mermaid
flowchart TB
  subgraph clients["CHANNEL CLIENTS · WS"]
    direction LR
    TUI["🖥️ TUI<br/>v1 · Pi-TUI"]
    WEB["🌐 Web<br/>v1.5 · Next.js 16"]
    MOB["📱 Mobile<br/>v2 · Expo / RN"]
  end
  subgraph core["BRIGADE CORE · single Node process"]
    direction TB
    WS["WebSocket Gateway<br/>ws://localhost:7777"]
    BA["BrigadeAgent wrapper<br/>Pi event → WS broadcast"]
    PI["Pi Agent loop<br/>@mariozechner/pi-agent-core"]
    SP["Layered system prompt<br/>cache-stable · .md files"]
    SM["Pi SessionManager<br/>JSONL tree"]
    MEM["Memory Store<br/>JSONL per segment"]
    SK["Skills loader<br/>Pi DefaultResourceLoader"]
    TOOLS["Tools · 8 in v1<br/>read · bash · edit · write · grep<br/>+ write_memory · recall_memory · spawn_agent"]
    SUB["spawn_agent → fresh Pi Agent<br/>scoped tools · own context"]
  end
  subgraph providers["LLM PROVIDERS · pluggable via Pi-AI"]
    direction LR
    AN["Anthropic<br/>claude-opus-4-7"]
    OAI["OpenAI"]
    GR["Groq · Ollama · OpenRouter"]
  end
  TUI --> WS
  WEB --> WS
  MOB --> WS
  WS --> BA
  BA --> PI
  PI --> SP
  PI --> SM
  PI --> TOOLS
  PI --> SK
  TOOLS --> MEM
  TOOLS --> SUB
  PI --> AN
  PI --> OAI
  PI --> GR
  classDef channel fill:#dbeafe,stroke:#1d4ed8,color:#1e3a8a,stroke-width:2px
  classDef coreBox fill:#fde7c4,stroke:#d97706,color:#92400e,stroke-width:2px
  classDef provider fill:#d1fae5,stroke:#047857,color:#064e3b,stroke-width:2px
  class TUI,WEB,MOB channel
  class WS,BA,PI,SP,SM,MEM,SK,TOOLS,SUB coreBox
  class AN,OAI,GR provider
```
§ Anatomy of a single turn

End-to-end — keystroke to streamed reply

Every box in the architecture diagram is a step in this sequence. Trace your finger from keystroke to render — that's the whole life of one turn.

FLOW · 01 User → TUI → Core → Pi → Anthropic → Tool → Anthropic → Core → TUI → User

A typical turn that calls one tool. ~25 numbered steps, ~2-4 seconds wall clock for a short reply.

```mermaid
sequenceDiagram
  autonumber
  participant U as 👤 User
  participant TUI as 🖥️ TUI<br/>(Pi-TUI Editor)
  participant WSC as ws-client.ts<br/>(in TUI)
  participant WSS as ws-server.ts<br/>(in Core)
  participant BA as BrigadeAgent<br/>wrapper
  participant PI as Pi Agent loop<br/>(pi-agent-core)
  participant AI as Anthropic API<br/>(via pi-ai streamFn)
  participant T as write_memory<br/>tool
  participant FS as 💾 .brigade/memory/<br/>identity.jsonl
  U->>TUI: types "remember I prefer Python" + hits Enter
  TUI->>WSC: editor.onSubmit(text)
  WSC->>WSS: { type:"prompt", text } via WebSocket
  WSS->>BA: brigadeAgent.prompt(text)
  BA->>PI: agent.prompt(text) + emits "agent_start"
  WSS-->>TUI: broadcast "agent_start" → spinner shows
  PI->>PI: assembleSystemPrompt() reads .brigade/prompts/*.md
  PI->>PI: auto-recall hook injects memory citations
  PI->>AI: streamFn(model, context, tools)<br/>HTTP POST + cache headers
  AI-->>PI: stream: text_delta + tool_use(write_memory,{...})
  PI-->>BA: emits "message_delta" per chunk
  BA-->>WSS: broadcast each delta
  WSS-->>TUI: deltas → Markdown component updates
  PI->>PI: AJV validates tool args (TypeBox schema)
  PI->>T: execute(toolCallId, params, signal, onUpdate)
  T->>FS: fs.appendFile("identity.jsonl", JSON.stringify(entry))
  FS-->>T: ok
  T-->>PI: { content:[{text:"Stored mem_..."}], details:{id} }
  PI->>AI: continue streamFn with toolResult appended
  AI-->>PI: stream: final text "Got it — I'll remember that."
  PI-->>BA: emits "message_end" + "agent_end"
  BA-->>WSS: broadcast "agent_end"
  WSS-->>TUI: agent_end → spinner clears
  TUI->>U: rendered reply visible
```
Steps 1–4 Input → wire

Editor submit fires; TUI's WS client serializes a single JSON message and sends it. Core's WS server parses and dispatches to brigadeAgent.prompt().

Steps 5–10 Context assembly

BrigadeAgent wraps Pi. assembleSystemPrompt() reads layered .md files and inserts the cache boundary. Auto-recall hook keyword-greps memory and prepends top matches.

Steps 11–15 Stream + tool dispatch

pi-ai issues HTTP to Anthropic with prompt-cache headers. Streaming chunks fan out to every WS client as message_delta. When a tool_use block arrives, AJV validates args against the TypeBox schema before execute() ever runs.

Steps 16–25 Tool result → final reply

Tool returns {content, details}. Pi appends a toolResult message and re-enters the loop. Anthropic returns the final text. Pi emits agent_end. WS broadcasts. TUI clears the spinner and finalizes the markdown.
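The tool-result re-entry in steps 16–25 can be sketched as a stubbed loop. Illustrative only: Pi implements the real loop inside pi-agent-core, with streaming, hooks, and abort on top. The types and function names here are invented for the sketch.

```typescript
// Illustrative skeleton of the re-entry mechanic. Not Pi's code.
type ToolCall = { name: string; args: unknown };
type ModelReply = { text: string; toolCall?: ToolCall };
type Message = { role: "assistant" | "toolResult"; content: string };

export async function runTurn(
  callModel: (history: Message[]) => Promise<ModelReply>,
  tools: Record<string, (args: unknown) => Promise<string>>,
): Promise<string> {
  const history: Message[] = [];
  for (;;) {
    const reply = await callModel(history);
    history.push({ role: "assistant", content: reply.text });
    if (!reply.toolCall) return reply.text; // no tool call → final answer
    const result = await tools[reply.toolCall.name](reply.toolCall.args);
    history.push({ role: "toolResult", content: result }); // re-enter the loop
  }
}
```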

FLOW · 02 Latency budget — typical single-tool turn

Where the wall-clock seconds go. Anthropic streaming dominates; everything else is sub-100ms.

| Stage | Typical cost |
|---|---|
| WS round-trip (TUI ↔ Core, both directions) | ~5-15 ms |
| Prompt assembly + auto-recall | ~20-50 ms (file I/O) |
| Anthropic time-to-first-token | ~400-900 ms (warm cache: 150-300 ms) |
| Streaming + tool execute + 2nd model call | ~1-3 s for a short reply |
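The auto-recall hook from the context-assembly step never gets its own snippet in §5. Below is a hypothetical sketch of the heuristic, assuming a plain keyword-overlap score weighted by stored importance; Pi's real extension API and Brigade's inject.ts will differ in shape.

```typescript
// Hypothetical auto-recall heuristic. Field names mirror the memory
// entries written by the JSONL store; everything else is assumed.
interface MemoryRecord {
  id: string;
  content: string;
  segment: string;
  importance: number; // 0..1
}

// Score records by keyword overlap with the user message, weighted by
// importance, and return the top N formatted as citations.
export function autoRecall(
  userText: string,
  records: MemoryRecord[],
  topN = 3,
): string[] {
  const words = new Set(
    userText.toLowerCase().split(/\W+/).filter((w) => w.length > 2),
  );
  return records
    .map((rec) => {
      const hits = rec.content
        .toLowerCase()
        .split(/\W+/)
        .filter((w) => words.has(w)).length;
      return { rec, score: hits * (0.5 + rec.importance) };
    })
    .filter((s) => s.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
    .map((s) => `[${s.rec.segment}:${s.rec.id}] ${s.rec.content}`);
}
```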
§2 · The 6 primitives at grade-A

What "OpenClaw quality" actually means

These six primitives ARE the agent. Get every one to grade-A in v1 and Brigade is peer-grade with OpenClaw / Boop / Hermes at the runtime layer. Skip even one and Brigade is a toy.

| Primitive | Toy version | Brigade v1 bar (grade-A) |
|---|---|---|
| Agent loop | Calls the model in a while-loop | Pi Agent with abort, steer, followUp, hooks, transformContext, AbortSignal threaded everywhere |
| System prompt | One big string concatenated each turn | Layered .md files (soul / identity / instructions / tools), explicit cache boundary, dynamic memory section last → 10× cost win from turn 2 |
| Tools | Map of name → function | TypeBox schemas + AJV validation + onUpdate streaming + {content, details} split + before/after hooks |
| Memory | Array in RAM, lost on restart | File-based JSONL with frontmatter + auto-recall extension hook + post-turn extraction (Boop pattern) |
| Skills | All loaded at boot, full bodies in prompt | Pi auto-discovery + eligibility filtering + lazy body load. Drop a folder and it works. |
| Sub-agents | Recursive function call | Boop dispatcher/executor pattern — isolated Pi session, scoped tools, abort propagation, result-as-tool-result |
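The skills row leans on Pi's auto-discovery. The stub-versus-body split can be illustrated with a hypothetical frontmatter parser; Pi's DefaultResourceLoader is the real implementation and its API will differ.

```typescript
// Conceptual sketch only. A SKILL.md looks like:
//   ---
//   name: deploy
//   description: Ship it
//   ---
//   <body text>
// Only name + description reach the system prompt; the body loads lazily.
interface SkillStub {
  name: string;
  description: string;
  loadBody: () => string;
}

export function parseSkill(raw: string): SkillStub {
  const match = raw.match(/^---\n([\s\S]*?)\n---\n?([\s\S]*)$/);
  if (!match) throw new Error("missing frontmatter");
  const meta: Record<string, string> = {};
  for (const line of match[1].split("\n")) {
    const i = line.indexOf(":");
    if (i > 0) meta[line.slice(0, i).trim()] = line.slice(i + 1).trim();
  }
  const body = match[2];
  return {
    name: meta.name ?? "unnamed",
    description: meta.description ?? "",
    loadBody: () => body, // lazy: kept out of the prompt until invoked
  };
}
```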
VIZ · 02 Coverage of the 6 primitives — Brigade target vs neighbours

Each axis = one primitive scored 0–10. Brigade v1 targets grade-A (8/10) on all six, matching OpenClaw's runtime quality at a fraction of OpenClaw's surface area.

§3 · Stack decisions

Pi as engine, lifts as accelerators

Every layer named, every choice justified, every source linked.

| Layer | Decision | Source / lift |
|---|---|---|
| Loop engine | Pi Agent | @mariozechner/pi-agent-core |
| Sessions | Pi JSONL tree (id/parentId) | pi-coding-agent SessionManager |
| Built-in tools | read · bash · edit · write · grep | Pi createCodingTools() |
| WS gateway | 17-line broadcast pattern | Lift boop/server/broadcast.ts |
| Sub-agents | spawn_agent tool → fresh Pi Agent | Boop execution-agent.ts:92-248 |
| Memory tools | JSONL file per segment | Boop memory/tools.ts, de-Convex'd |
| System prompt | Layered .md + cache boundary | OpenClaw system-prompt.ts pattern |
| Skills loader | Pi DefaultResourceLoader | Pi coding-agent built-in |
| Auto-recall | Pi context extension hook | Pi extension API |
| TUI | Pi-TUI components (no flicker) | @mariozechner/pi-tui |
| Wire format | WebSocket JSON, ~15 message types | Brigade-original (mirrors Pi events 1:1) |
| Default model | claude-opus-4-7 | Anthropic via Pi-AI |
| Code style | TS strict + Biome | Same as pi-mono |
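The sessions row relies on Pi's JSONL tree keyed by id/parentId. A sketch of how a flat append-only log yields branches; field names are assumed from the table above, and Pi's actual entry schema may differ.

```typescript
// Sketch: reconstructing branches from an id/parentId flat log.
interface Entry {
  id: string;
  parentId: string | null;
}

// Group entry ids by parent, so siblings = alternate branches.
export function childrenOf(entries: Entry[]): Map<string | null, string[]> {
  const map = new Map<string | null, string[]>();
  for (const e of entries) {
    const list = map.get(e.parentId) ?? [];
    list.push(e.id);
    map.set(e.parentId, list);
  }
  return map;
}

// Walk from a leaf back to the root: the linear history of one branch.
export function branchTo(entries: Entry[], leafId: string): string[] {
  const byId = new Map(entries.map((e) => [e.id, e]));
  const path: string[] = [];
  for (
    let cur = byId.get(leafId);
    cur;
    cur = cur.parentId ? byId.get(cur.parentId) : undefined
  ) {
    path.unshift(cur.id);
  }
  return path;
}
```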
§ Tech stack · Bill of materials

Every dependency, every version, every role

Brigade's full stack — eight layers, ~14 direct dependencies, zero magic. Versions pinned to what the local pi-mono repo and boop-agent use today.

14 direct deps · 4 Pi packages · 8+ LLM providers · 0 magic / vendor lock-in

Layer 1 · Loop engine

Pi SDK

The agent loop itself — model call → tool dispatch → result back → loop. Multi-provider streaming, abort, hooks.

"@mariozechner/pi-agent-core": "0.70.6"
"@mariozechner/pi-ai":         "0.70.6"

Layer 2 · Sessions + tools + extensions

Pi SDK

JSONL session tree, ResourceLoader (auto-discovers skills/extensions), AuthStorage, ModelRegistry, built-in coding tools.

"@mariozechner/pi-coding-agent": "0.70.6"

Layer 3 · Terminal UI

Pi SDK

Differential rendering (no flicker), Markdown component, Editor with autocomplete, Loader spinner.

"@mariozechner/pi-tui": "0.70.6"
"chalk":                "^5"

Layer 4 · LLM provider

External

Default Anthropic. Pi-AI swaps providers with one line. 8+ supported including OpenAI, Google, Groq, Ollama, OpenRouter.

"@anthropic-ai/sdk": "^0.81"
// model: "claude-opus-4-7"

Layer 5 · WebSocket gateway

Lift Boop

17-LOC broadcaster lifted from boop-agent. One core, many channel clients.

"ws":        "^8"
"@types/ws": "^8"

Layer 6 · Schema + validation

Inherited via Pi

TypeBox describes tool params; AJV validates at runtime before execute(). Single source of TS types + JSON schemas.

"@sinclair/typebox": "^0.34"
// ajv inherited via pi-agent-core

Layer 7 · Build & dev tooling

Standard

pnpm workspace, TypeScript 5 strict, Biome (lint + format), tsx for dev. Same setup as pi-mono.

"typescript": "^5"
"tsx":        "^4"
"@biomejs/biome": "^1.9"
# pnpm 8+ as package mgr

Layer 8 · Runtime

Standard

Node.js 20+ for the core. Bun optional for the TUI (faster startup). Zero browser dependencies in v1.

# engines.node >= 20.0.0
# platform: macOS / Linux / Windows

PLANNED · Future channel stacks (v1.5 → v2)

Same WebSocket protocol — different renderers. Each new channel is just a UI on top of an unchanged core.

v1.5 · Web ~1 weekend · drop-in to nodebase

Aligned to D:\nodebase exactly so Brigade can mount inside nodebase as a route or sub-app later with zero stack rewiring.

"next":                  "^16.1.0"     // Turbopack
"react":                 "19.1.0"
"react-dom":             "19.1.0"
"tailwindcss":           "^4"
"ai":                    "^5.0.60"     // Vercel AI SDK 5
"@ai-sdk/anthropic":     "^2.0.23"
"@radix-ui/react-*":     "latest"      // shadcn primitives
"lucide-react":          "latest"
"convex":                "^1.18"      // optional sync
// shadcn style: "new-york" / base: "neutral"
// also lifts Boop's debug/ panels for memory + agent timeline
v2 · Mobile ~1 weekend · NativeWind for shadcn-parity

Same component vocabulary as v1.5 (Radix → React Native via NativeWind + RN-Reusables), so design tokens carry across web/mobile.

"expo":                  "~51"
"expo-router":           "~3.5"
"react-native":          "0.74"
"react":                 "19.1.0"
"nativewind":            "^4"          // Tailwind for RN
"react-native-reusables": "latest"     // shadcn-port
"expo-notifications":    "~0.28"
"ai":                    "^5.0.60"
// reuse v1.5 WS client logic + types verbatim
§ Packages · Real package.json files

Every package.json Brigade v1 ships with

Five package.json files total. Workspace root + four leaf packages. Copy-paste these to bootstrap.

brigade/package.json workspace root · pnpm

Root — workspace manifest

Just a workspace shell. No runtime deps at the root; only dev tooling shared across all apps and packages.

{
  "name": "brigade",
  "version": "1.0.0",
  "private": true,
  "packageManager": "pnpm@9.12.0",
  "engines": { "node": ">=20.0.0" },
  "scripts": {
    "dev:core":  "pnpm --filter @brigade/core dev",
    "dev:tui":   "pnpm --filter @brigade/tui  dev",
    "dev":       "pnpm -r --parallel dev",
    "build":     "pnpm -r build",
    "check":     "biome check .",
    "format":    "biome format --write ."
  },
  "devDependencies": {
    "@biomejs/biome": "^1.9.4",
    "@types/node":    "^22.7.0",
    "tsx":            "^4.19.0",
    "typescript":     "^5.6.0"
  }
}
apps/core/package.json headless service · the engine

Core — the headless agent service

Wraps Pi's Agent, runs WS server on port 7777, persists sessions and memory. Three Pi packages + WebSocket + Anthropic SDK = the whole engine.

{
  "name": "@brigade/core",
  "version": "1.0.0",
  "private": true,
  "type": "module",
  "scripts": {
    "dev":   "tsx watch src/index.ts",
    "build": "tsc -p tsconfig.json",
    "start": "node dist/index.js"
  },
  "dependencies": {
    "@anthropic-ai/sdk":             "^0.81.0",
    "@brigade/protocol":            "workspace:*",
    "@mariozechner/pi-agent-core":   "0.70.6",
    "@mariozechner/pi-ai":           "0.70.6",
    "@mariozechner/pi-coding-agent": "0.70.6",
    "@sinclair/typebox":            "^0.34.0",
    "ws":                           "^8.18.0"
  },
  "devDependencies": {
    "@types/ws": "^8.5.12"
  }
}
apps/tui/package.json v1 channel · terminal UI

TUI — the v1 channel client

Pi-TUI for the renderer (no flicker, markdown, editor with autocomplete). ws for the connection. Tiny.

{
  "name": "@brigade/tui",
  "version": "1.0.0",
  "private": true,
  "type": "module",
  "bin": { "brigade": "./dist/index.js" },
  "scripts": {
    "dev":   "tsx src/index.ts",
    "build": "tsc -p tsconfig.json"
  },
  "dependencies": {
    "@brigade/protocol":       "workspace:*",
    "@mariozechner/pi-tui":   "0.70.6",
    "chalk":                  "^5.3.0",
    "ws":                    "^8.18.0"
  },
  "devDependencies": {
    "@types/ws": "^8.5.12"
  }
}
packages/protocol/package.json shared types · zero runtime deps

Protocol — WebSocket message types

Pure TS types. Zero runtime dependencies. Imported by core and every channel client so message shapes stay in sync.

{
  "name": "@brigade/protocol",
  "version": "1.0.0",
  "private": true,
  "type": "module",
  "main": "./dist/index.js",
  "types": "./dist/index.d.ts",
  "exports": { ".": "./src/index.ts" },
  "scripts": {
    "build": "tsc -p tsconfig.json"
  }
}
pnpm-workspace.yaml workspace topology

Workspace — the package graph

Two glob patterns describe the whole repo. apps/* and packages/* — that's it.

packages:
  - "apps/*"
  - "packages/*"

RATIONALE · Why these specific versions

Pinned to what the local reference repos actually use today.

| Package | Version | Why pinned here |
|---|---|---|
| @mariozechner/pi-* | 0.70.6 | Latest stable in pi-mono/packages/*/package.json on May 1, 2026. OpenClaw is pinned to the older 0.66.1; Brigade ships fresh and tracks current. |
| @anthropic-ai/sdk | ^0.81.0 | Same major as OpenClaw's pinned override; Pi-AI is tested against this line |
| @sinclair/typebox | ^0.34.0 | Pi-AI re-exports Type from this; an explicit dep avoids hoisting surprises |
| ws | ^8.18.0 | Same as Boop's broadcast.ts target; native Node WS, no transitive bloat |
| typescript | ^5.6.0 | Matches pi-mono's target (verified in pi-mono/package.json) |
| tsx | ^4.19.0 | Fast TS dev runner; supports ESM + watch mode out of the box |
| @biomejs/biome | ^1.9.4 | Same as pi-mono. One binary for lint + format, zero config beyond biome.json |
| pnpm | 9.12.0 | Good workspace ergonomics; deterministic installs; same as pi-mono |
§4 · File layout

Where every line lives

~1,200 LOC of trunk code, organized as a pnpm workspace. Apps consume packages. Everything new lives in brigade/.

brigade/
├── apps/
│   ├── core/                       # headless service (v1)
│   │   └── src/
│   │       ├── index.ts                [NEW]     boot: load tools, start WS
│   │       ├── ws-server.ts            [lift Boop]  WebSocket broadcast (~50 LOC)
│   │       ├── brigade-agent.ts        [NEW]     wraps Pi Agent + WS broadcast
│   │       ├── prompt-assembler.ts     [NEW]     layered .md + cache boundary
│   │       ├── session-manager.ts      [Pi]      thin wrapper over Pi SessionManager
│   │       ├── memory/
│   │       │   ├── store.ts            [NEW]     file-based JSONL (~80 LOC)
│   │       │   └── inject.ts           [NEW]     auto-recall context hook
│   │       └── tools/
│   │           ├── memory.ts           [adapt Boop] write_memory, recall_memory
│   │           └── spawn.ts            [adapt Boop] spawn_agent (sub-agent)
│   ├── tui/                        # v1 channel
│   │   └── src/
│   │       ├── index.ts                [NEW]     entry, ws://localhost:7777
│   │       ├── ws-client.ts            [NEW]     receive + dispatch
│   │       ├── render.ts               [Pi-TUI]   Markdown + Loader components
│   │       └── input.ts                [Pi-TUI]   Editor + autocomplete
│   ├── web/                        # v1.5 placeholder (lift Boop debug/)
│   └── mobile/                     # v2 placeholder (Expo)
├── packages/
│   └── protocol/
│       └── src/index.ts                [NEW]     WS message types (TS union)
├── .brigade/
│   ├── memory/                         # auto-created, .jsonl per segment
│   ├── skills/                         # user drops SKILL.md here, Pi auto-discovers
│   ├── prompts/                        # soul.md, identity.md, instructions.md, tools.md
│   └── sessions/                       # Pi JSONL transcripts
├── BRIGADE.md                          [NEW]     Brigade's own AGENTS.md
├── package.json
├── pnpm-workspace.yaml
├── tsconfig.base.json
└── biome.json

Legend: [NEW] Brigade-original · [lift / adapt] taken from another repo · [Pi / Pi-TUI] consumed from Pi SDK as-is
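The tree references tsconfig.base.json without showing it. A plausible strict baseline matching the TS-strict / ESM / Node 20 choices above; every setting below is an assumption, not verified against pi-mono.

```jsonc
// brigade/tsconfig.base.json — assumed baseline, adjust per package
{
  "compilerOptions": {
    "target": "ES2022",
    "module": "NodeNext",
    "moduleResolution": "NodeNext",
    "strict": true,
    "declaration": true,
    "skipLibCheck": true,
    "types": ["node"]
  }
}
```

Leaf packages would extend this and set their own `outDir`.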

VIZ · 03 LOC budget — where the ~1,200 lines go

Most of Brigade's code is the TUI client and the core orchestration. Memory + tools + protocol total under 350 LOC. Pi handles everything else.

§5 · The 8 verified primitives in code

Every snippet, every source — verified against live repos

A subagent re-read the source on May 1, 2026. Every signature, every import, every line below traces to a file in this workspace.

① Agent Loop from pi-mono/packages/agent/src/agent.ts:190

Pi Agent — the loop, free

Brigade doesn't write the loop. It instantiates Pi's Agent and adds a thin wrapper that broadcasts every event over WebSocket. Multi-provider streaming, abort, steer, followUp, hooks — all inherited.

import { Agent } from "@mariozechner/pi-agent-core";
import { streamSimple, getModel } from "@mariozechner/pi-ai";

const agent = new Agent({
  initialState: {
    systemPrompt: await assembleSystemPrompt("./.brigade"),
    model: getModel("anthropic", "claude-opus-4-7"),
    tools: brigadeTools,
    thinkingLevel: "off",
  },
  streamFn: streamSimple,
});

agent.subscribe(broadcastToWS);    // every Pi event → all WS clients
await agent.prompt(userMessage); // runs to completion, streams over WS
② Custom tool using AgentTool from pi-mono/packages/agent/src/types.ts:308

The write_memory tool — TypeBox schema, dual content/details

Every Brigade tool follows this exact shape: TypeBox schema → typed params, returns content (goes to LLM) + details (goes to UI). AJV validates before execute() ever runs.

import { Type } from "@mariozechner/pi-ai";
import type { AgentTool } from "@mariozechner/pi-agent-core";

const writeMemoryParams = Type.Object({
  content:    Type.String({ description: "Fact to remember" }),
  segment:    Type.Union([Type.Literal("identity"), Type.Literal("preference"),
                          Type.Literal("project"),  Type.Literal("knowledge")]),
  importance: Type.Number({ minimum: 0, maximum: 1 }),
});

export const writeMemoryTool: AgentTool<typeof writeMemoryParams> = {
  name: "write_memory",
  label: "Write Memory",
  description: "Persist a durable fact for future turns",
  parameters: writeMemoryParams,
  async execute(_id, params) {
    const id = await memoryStore.write(params);
    return {
      content: [{ type: "text", text: `Stored ${id}` }], // → LLM sees this
      details: { id, ...params },                              // → UI only
    };
  },
};
③ Memory store distilled from boop/server/memory/tools.ts:18 (Convex stripped)

File-based memory — JSONL per segment, no DB needed

Boop's memory model is gold; its Convex backend is overkill for v1. Brigade keeps the segment / importance / lifecycle vocabulary, swaps storage to JSONL files. Trivial to upgrade to SQLite-FTS5 later if keyword search struggles.

import * as fs from "node:fs/promises";
import * as path from "node:path";

const MEMORY_DIR = "./.brigade/memory";
const SEGMENTS = ["identity", "preference", "project", "knowledge"] as const;

export async function write(record: { content: string; segment: string; importance: number }) {
  await fs.mkdir(MEMORY_DIR, { recursive: true });
  const id = `mem_${Date.now()}_${Math.random().toString(36).slice(2, 8)}`;
  const entry = { id, ...record, created: Date.now(), accessed: Date.now() };
  await fs.appendFile(path.join(MEMORY_DIR, `${record.segment}.jsonl`),
                       JSON.stringify(entry) + "\n");
  return id;
}

export async function recall(query: string, limit = 10) {
  const hits: any[] = [];
  for (const seg of SEGMENTS) {
    try {
      const lines = (await fs.readFile(path.join(MEMORY_DIR, `${seg}.jsonl`), "utf8"))
        .split("\n").filter(Boolean);
      for (const line of lines) {
        const rec = JSON.parse(line);
        if (rec.content.toLowerCase().includes(query.toLowerCase())) hits.push(rec);
      }
    } catch {}
  }
  return hits.slice(0, limit);
}
④ Sub-agent / spawn adapted from boop/server/execution-agent.ts:92

Dispatcher → Executor — Boop's pattern, Pi-flavored

The most powerful agent pattern in production. Cheap dispatcher answers chit-chat directly; only spawns an expensive executor when real work is needed. Each spawn = isolated Pi session, scoped toolset, separate context window. Result returns as a tool result.

import { createAgentSession, SessionManager } from "@mariozechner/pi-coding-agent";
import type { AgentTool } from "@mariozechner/pi-agent-core";

const spawnParams = Type.Object({
  task:  Type.String({ description: "What the sub-agent should do" }),
  tools: Type.Optional(Type.Array(Type.String())),
});

export const spawnAgentTool: AgentTool<typeof spawnParams> = {
  name: "spawn_agent",
  label: "Spawn Agent",
  description: "Delegate a focused task to a sub-agent with its own context window",
  parameters: spawnParams,
  async execute(toolCallId, { task, tools }, signal) {
    const { session } = await createAgentSession({
      model: getModel("anthropic", "claude-opus-4-7"),
      sessionManager: SessionManager.inMemory(),
      customTools: filterTools(tools ?? defaultExecutorTools),
    });
    session.agent.setSystemPrompt(`You are an executor. Your single task: ${task}`);
    await session.agent.prompt(task);
    const finalText = extractFinalText(session.messages);
    return {
      content: [{ type: "text", text: finalText }],
      details: { sessionId: session.id, toolsUsed: session.toolUsageStats() },
    };
  },
};
VIZ · 04 Sub-agent dispatch — what happens on spawn_agent

Parent agent stays cheap. Executor gets its own context window and tool subset. Result returns as a normal tool result.

```mermaid
sequenceDiagram
  autonumber
  participant U as User<br/>(via TUI)
  participant P as Parent Brigade Agent<br/>(dispatcher)
  participant T as spawn_agent tool
  participant E as Executor (fresh Pi Agent)
  participant LLM as Anthropic API
  U->>P: "Refactor the auth module"
  P->>LLM: prompt + tools (incl. spawn_agent)
  LLM-->>P: tool_use spawn_agent({ task, tools:["read","edit","grep"] })
  P->>T: execute(toolCallId, params, signal)
  T->>E: createAgentSession({ scoped tools })
  E->>LLM: own loop, own context window
  LLM-->>E: streaming response + tool calls
  E-->>E: read/edit/grep iteratively
  E-->>T: final assistant text
  T-->>P: { content:[{text}], details:{sessionId, toolsUsed} }
  P->>LLM: continue with tool result
  LLM-->>P: final reply to user
  P-->>U: streamed via WebSocket
```
⑤ Layered system prompt pattern from openclaw/src/agents/system-prompt.ts:39

Cache-stable prompt assembly — 10× cost win from turn 2

OpenClaw orders its 8 prompt files by priority (10–70) and explicitly marks where the Anthropic prompt-cache boundary falls. Stable identity above; volatile time/memory below. Cache hit rate determines whether Brigade costs $0.10 or $1.00 per turn.

import * as fs from "node:fs/promises";

const CACHE_BOUNDARY = "<!-- BRIGADE_CACHE_BOUNDARY -->";
const ORDER = [                              // stable first, volatile last
  { file: "soul.md",         pri: 20 }, // personality / tone
  { file: "identity.md",     pri: 30 }, // "You are Brigade…"
  { file: "instructions.md", pri: 40 }, // hard rules / prohibitions
  { file: "tools.md",        pri: 50 }, // how to use tools
];

export async function assembleSystemPrompt(brigadeDir: string): Promise<string> {
  const stable: string[] = [];
  for (const { file } of ORDER) {
    try { stable.push(await fs.readFile(`${brigadeDir}/prompts/${file}`, "utf8")); }
    catch {}
  }
  const volatile = [
    `Current time: ${new Date().toISOString()}`,
    await renderRecentMemoryCitations(),         // changes every turn
  ].join("\n\n");
  return stable.join("\n\n---\n\n") + "\n\n" + CACHE_BOUNDARY + "\n\n" + volatile;
}
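One subtlety the assembler leaves open: the marker string by itself does nothing. Whatever builds the Anthropic request must translate it into cache_control blocks, and Pi-AI may already do an equivalent translation internally. The sketch below shows the mechanic under that assumption, using the Anthropic Messages API's system-block array form.

```typescript
// Sketch: split the assembled prompt at the marker and emit Anthropic
// system blocks, caching everything above the boundary.
const CACHE_BOUNDARY = "<!-- BRIGADE_CACHE_BOUNDARY -->";

type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

export function toSystemBlocks(prompt: string): SystemBlock[] {
  const idx = prompt.indexOf(CACHE_BOUNDARY);
  if (idx === -1) return [{ type: "text", text: prompt }];
  const stable = prompt.slice(0, idx).trimEnd();
  const volatile = prompt.slice(idx + CACHE_BOUNDARY.length).trimStart();
  return [
    // Everything above the boundary is cached across turns...
    { type: "text", text: stable, cache_control: { type: "ephemeral" } },
    // ...everything below (time, memory citations) is re-sent each turn.
    { type: "text", text: volatile },
  ];
}
```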
⑥ WebSocket broadcast verbatim lift from boop/server/broadcast.ts — 17 LOC, MIT

The smallest WS gateway that works

No room to mess this up. Boop's broadcast is 17 lines and has been running in production for months. Brigade copies it verbatim with attribution.

// adapted from boop-agent (MIT) — server/broadcast.ts
import type { WebSocket } from "ws";

const clients = new Set<WebSocket>();

export function addClient(ws: WebSocket): void {
  clients.add(ws);
  ws.on("close", () => clients.delete(ws));
}

export function broadcast(event: string, data: unknown): void {
  const payload = JSON.stringify({ event, data, at: Date.now() });
  for (const ws of clients) {
    if (ws.readyState === 1) ws.send(payload);
  }
}
⑦ TUI client scaffold from pi-mono/packages/tui + the Pi gist

Pi-TUI markdown + editor — flicker-free, autocompleted

The same TUI components OpenClaw renders. Differential rendering means no flicker. Autocomplete provider for slash commands. WebSocket events drive the renderer.

import { TUI, ProcessTerminal, Editor, Markdown } from "@mariozechner/pi-tui";
import { connect } from "./ws-client.js";

const tui = new TUI(new ProcessTerminal());
const editor = new Editor(tui, theme);
tui.addChild(editor);
tui.setFocus(editor);

const ws = connect("ws://localhost:7777");
let streamingMd: Markdown | null = null;
let buffer = "";

ws.on("message_delta", ({ delta }) => {
  buffer += delta;
  if (!streamingMd) { streamingMd = new Markdown(buffer); tui.insertBeforeFocus(streamingMd); }
  else streamingMd.setText(buffer);
  tui.requestRender();
});
ws.on("agent_end", () => { streamingMd = null; buffer = ""; });

editor.onSubmit = (text) => ws.send({ type: "prompt", text });
tui.start();
⑧ WS protocol Brigade-original · packages/protocol/src/index.ts

The Brigade wire — discriminated union, 1:1 with Pi events

Channels (TUI, web, mobile) speak this. ~15 message types. Server messages mirror Pi event shapes so the client renderer is dead simple.

export type ClientMessage =
  | { type: "prompt";        sessionId: string; text: string; images?: ImageContent[] }
  | { type: "abort";         sessionId: string }
  | { type: "steer";         sessionId: string; text: string }
  | { type: "follow_up";     sessionId: string; text: string }
  | { type: "set_model";     sessionId: string; provider: string; modelId: string }
  | { type: "list_sessions" }
  | { type: "open_session";  sessionId: string }
  | { type: "new_session" };

export type ServerMessage =
  | { type: "agent_start";            sessionId: string }
  | { type: "turn_start";             sessionId: string }
  | { type: "message_start";          sessionId: string; role: string }
  | { type: "message_delta";          sessionId: string; delta: string }
  | { type: "message_end";            sessionId: string; message: AgentMessage }
  | { type: "tool_execution_start";   sessionId: string; toolCallId: string; toolName: string; args: any }
  | { type: "tool_execution_update";  sessionId: string; toolCallId: string; partial: any }
  | { type: "tool_execution_end";     sessionId: string; toolCallId: string; result: any; isError: boolean }
  | { type: "turn_end";               sessionId: string; usage: TokenUsage }
  | { type: "agent_end";              sessionId: string }
  | { type: "session_list";           sessions: SessionMeta[] }
  | { type: "error";                  message: string };
§6 · Build order

5.5 hours, end-to-end

Pair-programmed with Claude Opus 4.7. Each step is sequential — finish before the next. Times are realistic, not optimistic.

VIZ · 05 Build timeline — every step's duration

Cumulative ~5.5 hours. The TUI client is the longest single step (1 hr). Lifts from Boop / OpenClaw shrink the rest.

Step 0 · Scaffold pnpm workspace

Create brigade/, root package.json + pnpm-workspace.yaml, base tsconfig.json, biome.json, .gitignore, empty .brigade/ dirs.

15 min
Step 1 · Protocol package

Write packages/protocol/src/index.ts — the discriminated-union message types. ~50 LOC. Shared by core and every channel.

15 min
Step 2 · WS gateway (lift Boop verbatim)

Copy broadcast.ts from boop-agent. Add ws-server.ts that accepts connections on port 7777 and parses incoming JSON.

15 min
Step 3 · Brigade Agent wrapper

Write brigade-agent.ts — wraps Pi Agent, subscribes to all events, broadcasts each over WS. ~80 LOC.

30 min
Step 4 · Layered system prompt assembler

Port OpenClaw's pattern. Reads .brigade/prompts/*.md in priority order, inserts cache boundary, appends volatile section. ~40 LOC.

30 min
Step 5 · Memory store + tools

File-based JSONL implementation + write_memory + recall_memory tools with TypeBox schemas. ~120 LOC total.

30 min
Step 6 · spawn_agent tool

Adapt Boop's executor pattern. Tool that creates a fresh Pi session with scoped tools, runs to completion, returns final text. ~60 LOC.

20 min
Step 7 · Auto-recall extension

Pi context hook that runs recall_memory on the last user message and injects top-N matches into the system prompt. ~40 LOC.

20 min
Step 8 · Core entry point

apps/core/src/index.ts — wires Pi createAgentSession() with all custom tools, starts WS server, handles client messages. ~100 LOC.

30 min
Step 9 · TUI channel client

apps/tui/src/* — Pi-TUI components, WS client, event-driven render, editor for input with slash-command autocomplete. ~250 LOC.

1 hour
Step 10 · Smoke test

Boot core, boot TUI, run a real conversation: file ops + memory write/recall + one sub-agent spawn. Fix what breaks.

30 min
§7 · Out of scope for v1

What we deliberately don't build

Each of these is a real feature in OpenClaw or Hermes. Each is also a multi-week side quest. Defer them all to v1.5+.

ACP / IDE integration · Sandboxing (Docker, micro-VMs) · Multi-user / auth · Cron / scheduler · Composio integrations · Vector embeddings · Memory decay curves · Two-DB session isolation · Multi-channel routing · MCP runtime · Cloud deploy · Approval flows · Cost dashboards · Eval harness · Plugin SDK · Self-improving learning loop
§8 · Future grades · The complete roadmap

Every version from v1 → v5+ in one place

Permanent reference. Each grade is a named bundle of capability with a fixed time estimate, exact tech additions, and a "definition of done" so you know when to ship.

MATRIX Grade × capability

A single-glance index. Find your current grade, look right to see what's already in, look down to see what's coming.

| Grade | Theme | Time | Status |
|---|---|---|---|
| v1 | The 6 primitives at grade-A · TUI · sub-agents | ~5.5 h | 🎯 next build |
| v1.5 | Web dashboard (Next.js 16 · drop-in to nodebase) | ~1 weekend | queued |
| v2 | Mobile (Expo + NativeWind) | ~1 weekend | queued |
| v2.5 | Memory upgrade — SQLite-FTS5 + decay + embeddings | ~3-5 days | queued |
| v3 | Production hardening — approvals · budgets · audit · evals | ~2-3 weeks | queued |
| v3.5 | Sandbox via Docker (per-session container) | ~1 week | queued |
| v4 | Multi-channel — Telegram · Discord · Slack · scheduler | ~2-4 weeks | queued |
| v5 | ACP server — Brigade in Zed · Cursor · JetBrains | ~1 week | queued |
| v5+ | Self-improving learning loop (Hermes-inspired) | research | exploratory |
v1 · ~5.5 hours · 🎯 next build

Brigade Core — the engine

Everything in this brief from §1 through §6. Ships a real working terminal agent.

Adds
  • Pi Agent loop + 6 primitives at grade-A
  • 8 tools (read/bash/edit/write/grep + memory ×2 + spawn_agent)
  • WS gateway on :7777
  • Pi-TUI channel with markdown + editor
  • JSONL sessions + JSONL memory
  • Layered system prompt with cache boundary
Definition of done
  • TUI starts, connects, streams a turn end-to-end
  • spawn_agent runs an executor and returns text
  • write_memory + recall_memory persist across restarts
v1.5 · ~1 weekend

Web dashboard — drop-in to nodebase

Next.js 16 web app aligned to D:\nodebase's exact stack so it can mount inside nodebase as a route or sub-app later.

Adds
  • Next.js 16 + React 19 + Tailwind 4 + Turbopack
  • shadcn/ui (new-york style, neutral base)
  • Vercel AI SDK 5 for any direct LLM calls in the UI
  • Sessions browser · agents timeline · memory graph (lift Boop debug)
  • WS client mirrors Pi event stream → live updates
  • Optional Convex sync (@convex-dev/agent) for shared sessions across devices
Definition of done
  • Web client renders the same conversation as TUI in real time
  • Brigade dashboard mounts at /brigade inside nodebase without conflict
  • Auth: piggyback nodebase's Better Auth + Polar when integrated
v2 · ~1 weekend

Mobile — Expo + NativeWind

Same WS protocol. NativeWind so design tokens and shadcn primitives carry from web to phone.

Adds
  • Expo SDK 51 + expo-router
  • NativeWind 4 (Tailwind for RN) + react-native-reusables (shadcn-port)
  • expo-notifications for push on agent_end
  • Background mode keeps WS alive for long-running spawns
  • Reuses v1.5 WS client + protocol types verbatim
Definition of done
  • iOS + Android dev builds chat with the same Brigade core
  • Push notification fires when an agent finishes while app is backgrounded
v2.5 · ~3-5 days

Memory upgrade — SQLite-FTS5 + decay + embeddings

When v1 keyword grep starts to feel coarse, upgrade the memory backend. Hermes / Boop hybrid.

Adds
  • SQLite (better-sqlite3 or libsql) backend with FTS5 lexical search (Hermes pattern)
  • Optional embeddings via @ai-sdk/openai or Voyage
  • Adaptive exponential decay (Boop's memory/clean.ts pattern)
  • Post-turn extraction → auto-write durable facts
  • supersedes pointers for memory corrections
Definition of done
  • Recall returns top-N by hybrid score (FTS5 + embedding cosine)
  • Decay job runs every 6h, archives stale facts, prunes pointless ones
v3 · ~2-3 weeks

Production hardening — approvals · budgets · audit · evals

The "trust this with real work" milestone. Multi-user-ready.

Adds
  • Approval flow for dangerous tools (write/bash/spawn_agent) — UI prompt + auto-approve allowlist
  • Per-session token + dollar budgets with hard stop
  • Audit log of every tool call and decision (Paperclip pattern)
  • Eval harness — regression suite that runs on every commit
  • Better Auth (matching nodebase) for multi-user; Polar for billing if external
  • Structured logs via pino, OTEL traces optional
Definition of done
  • You can hand a stranger a Brigade login without losing sleep
  • Eval suite has 20+ scenarios, all passing
v3.5 · ~1 week

Sandbox — per-session Docker container

Bash and write tools route through Docker via Pi's operations hook. OpenClaw's pattern, lighter weight.

Adds
  • createBashTool({ operations: { exec: dockerExec } }) — Pi gives us this hook
  • Per-session container; workspace mounted RO + outbox RW
  • Credential proxy (NanoClaw-style) — agent never sees raw API keys
  • Timeout + memory caps per container
Definition of done
  • Brigade can run an untrusted prompt without risk to host
  • Container boot < 2s; tool call latency < 100ms over IPC
v4 · ~2-4 weeks

Multi-channel + scheduler

Brigade leaves your terminal. NanoClaw-style adapters + Paperclip-style routines.

Adds
  • Channel adapter interface + barrel-import registry (NanoClaw pattern)
  • Telegram (telegraf), Discord (discord.js), Slack (@slack/web-api)
  • Two-DB-per-session SQLite (NanoClaw inbound/outbound) for true isolation
  • Cron / heartbeat scheduler (croner) — recurring jobs that wake the agent
  • Slash commands per channel ("/scan inbox", "/plan tomorrow")
Definition of done
  • "DM me the sales summary every weekday at 9am" works end-to-end on Slack/Telegram
  • Adding a 4th channel takes < 4 hours of work
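One possible shape for that adapter interface plus barrel-import registry — the names are guesses at the NanoClaw pattern, not its actual API:

```typescript
interface InboundMessage { channel: string; userId: string; text: string }

// Every channel (Telegram, Discord, Slack, …) implements the same contract.
interface ChannelAdapter {
  name: string;
  start: (onMessage: (m: InboundMessage) => void) => void;
  send: (userId: string, text: string) => Promise<void>;
}

// Barrel-import registry: each adapter module registers itself on import,
// so adding a channel is "write the adapter, add one import line".
const registry = new Map<string, ChannelAdapter>();

function registerAdapter(adapter: ChannelAdapter): void {
  if (registry.has(adapter.name)) {
    throw new Error(`duplicate adapter: ${adapter.name}`);
  }
  registry.set(adapter.name, adapter);
}
```

This is what makes the "4th channel in < 4 hours" claim plausible: the core never changes, only the registry grows.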
v5 · ~1 week

ACP server — IDE integration

Brigade speaks Agent Client Protocol → shows up in any ACP-compatible IDE. Same wire OpenClaw and Hermes use.

Adds
  • @agentclientprotocol/sdk wrapper
  • Translator: Pi events ↔ ACP events (lift OpenClaw src/acp/translator.ts)
  • ToolKind mapping: read · edit · execute · search · fetch · think · other
  • Stdio agent server — IDE spawns Brigade as a subprocess
  • Registry entry for Zed / JetBrains / Cursor / Cline
Definition of done
  • Brigade appears in Zed's agent picker and answers prompts
  • Tool calls render as ACP tool tiles in the IDE
v5+ · research / ongoing

Self-improving learning loop (Hermes-inspired)

Long-game endgame. The agent that grows with you. Speculative until v3 evals exist.

Adds
  • Skill self-authoring — agent proposes new SKILL.md candidates from successful turns
  • Honcho-style dialectic user model (stores beliefs about you, evolves)
  • Optional trajectory export for RL fine-tuning (Atropos-flavored)
  • "Refine" pass on existing skills based on usage outcomes
  • Personal model fine-tune cadence (monthly?)
Definition of done
  • Brigade after 3 months feels measurably more "yours" than Brigade on day 1
  • Eval scores trend upward without manual prompt tuning
DISCIPLINE The order matters — don't skip grades

Each grade builds on the last. Skipping ahead breaks compounding.

Don't start v3 before v1 is solid

Approvals on a flaky core = double-broken. Fix v1 bugs before adding v3 features.

Don't ship v4 before v3

A multi-channel agent without approvals is one prompt-injection away from disaster.

v5+ is exploratory, not blocking

The learning loop only makes sense once v3 evals exist to measure improvement. Don't let it gate practical milestones.

PROJECTION Cumulative LOC by grade

Forecast from current builds. Shows how each grade compounds. By v5 you've built ~10× v1's surface — but in 9 months of ~weekly increments, not a death march.

§9 · Lessons inherited

What each repo taught us

Repo | Single best lesson | What Brigade takes
Pi-mono | The agent loop is ~80 lines if you let an SDK handle providers. | Engine itself
OpenClaw | Layered prompt files with explicit cache boundary = 10× cost win. | Prompt assembly + skill eligibility
Boop | Dispatcher / executor split is the cleanest sub-agent pattern in production. | Sub-agent + memory shape + WS broadcast
NanoClaw | Two-DB-per-session + barrel imports = 14k LOC for a multi-channel platform. | Channel adapter pattern (two-DB deferred to v4)
Hermes | Self-improving learning loop is the long-game endgame. | Memory tool design leaves room for future skill self-edit
Paperclip | Heartbeat-driven scheduling > push-driven for autonomous work. | (Deferred — v1 is interactive, no autonomy)
§ Why Brigade is actually doable

The feasibility audit — receipts for every claim

"5.5 hours · ~1,200 LOC · OpenClaw-grade engine" sounds like a pitch deck. It isn't. Each claim below has direct evidence in the local source.

REASON · 01

Pi SDK does the hard 80%.

The agent loop is ~80 LOC inside pi-mono/packages/agent/src/agent-loop.ts at runLoop(). We don't write the loop, the abort plumbing, the multi-provider streaming, or the JSONL session format. Brigade is a thin bezel around an SDK that's already in production behind OpenClaw.

REASON · 02

Every pattern is lifted from production code.

Boop's WS broadcast = 17 LOC running for months (boop-agent/server/broadcast.ts). OpenClaw's prompt-cache boundary = battle-tested at scale. Boop's dispatcher/executor pattern = the recommended Anthropic agent topology. We're not inventing protocols.
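For scale, here is roughly what a ~17-line broadcast looks like: fan one Pi event out to every connected client as JSON. The `readyState` check mirrors the ws library's convention; the types are stand-ins, not Boop's actual code.

```typescript
const OPEN = 1; // ws.WebSocket.OPEN

interface ClientSocket { readyState: number; send: (data: string) => void }

const clients = new Set<ClientSocket>();

// Serialize once, deliver to every open socket, skip dead ones.
function broadcast(event: { type: string; [k: string]: unknown }): number {
  const payload = JSON.stringify(event);
  let delivered = 0;
  for (const c of clients) {
    if (c.readyState === OPEN) {
      c.send(payload);
      delivered++;
    }
  }
  return delivered;
}
```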

REASON · 03

Each primitive is < 250 LOC.

Memory store: ~80. WS gateway: ~50. spawn_agent: ~60. Auto-recall hook: ~40. Prompt assembler: ~80. The TUI is the largest single piece (~250) and most of that is event-handler glue around Pi-TUI components. Nothing is "big" by any normal definition.
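The prompt assembler is small because the core idea fits in a few lines: stable .md layers go before the cache boundary so the provider's prompt cache hits, volatile context goes after. A sketch (layer names and function shape are illustrative, not the actual Brigade code):

```typescript
interface PromptLayer { name: string; content: string; cacheable: boolean }

// Split layers around the cache boundary, preserving declared order.
function assemblePrompt(layers: PromptLayer[]): { stable: string; volatile: string } {
  const stable = layers.filter(l => l.cacheable).map(l => l.content).join("\n\n");
  const volatile = layers.filter(l => !l.cacheable).map(l => l.content).join("\n\n");
  return { stable, volatile };
}
```

As long as the `stable` half is byte-identical between turns, the provider bills it at the cached rate — that is the "10× cost win" attributed to OpenClaw's layering.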

REASON · 04

Tool surface fits in your head.

8 tools. All identical shape: TypeBox schema → typed params → returns {content, details}. AJV validates before execute() runs. You can describe the entire tool model on a napkin and write a new tool in 15 minutes.
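That shape, sketched with plain types. The real stack uses TypeBox schemas compiled by AJV; here a hand-rolled type guard stands in for the compiled validator, and the `grep` example is illustrative:

```typescript
interface ToolResult { content: string; details?: Record<string, unknown> }

interface BrigadeTool<P> {
  name: string;
  description: string;
  validate: (raw: unknown) => raw is P; // AJV's compiled validator in real code
  execute: (params: P) => Promise<ToolResult>;
}

interface GrepParams { pattern: string; path: string }

const grepTool: BrigadeTool<GrepParams> = {
  name: "grep",
  description: "Search files for a pattern",
  validate: (raw): raw is GrepParams =>
    typeof raw === "object" && raw !== null &&
    typeof (raw as GrepParams).pattern === "string" &&
    typeof (raw as GrepParams).path === "string",
  async execute({ pattern, path }) {
    return { content: `searched ${path} for /${pattern}/`, details: { matches: 0 } };
  },
};

// Validation always runs before execute(), so tools never see bad params.
async function callTool<P>(tool: BrigadeTool<P>, raw: unknown): Promise<ToolResult> {
  if (!tool.validate(raw)) throw new Error(`invalid params for ${tool.name}`);
  return tool.execute(raw);
}
```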

REASON · 05

Zero infrastructure to set up.

No database. No cloud. No auth provider. No Docker. pnpm install && pnpm dev and you have a server. Sessions are JSONL on disk. Memory is JSONL on disk. Skills are markdown files. The whole thing runs from your home directory.

REASON · 06

All references are in the same workspace.

Pi-mono, OpenClaw, Boop, NanoClaw, Hermes — every reference repo is on disk. Copy + adapt is 3-5× faster than write from scratch. When stuck, grep the neighbours. There's no Stack Overflow tax on this build.

REASON · 07

Pair-programming with Claude collapses the time.

Verified empirically across this whole conversation: research that took hours solo took minutes with subagents. Boilerplate (TS types, package.json, glue code) compresses 5-10×. ACP translator equivalents take a day instead of a week.

REASON · 08

The failure mode is small.

Worst case: a v1 feature is buggy or the TUI feels off. v0 (just the loop + Anthropic) gives you a working agent in 25 minutes. Even if v1's auto-recall is broken, you've already shipped a usable terminal agent. The downside has a floor.

HONEST RISKS The 3 things that could actually slow you down

Forecasting failure honestly. None of these block v1. Each has a mitigation.

Risk · WS reconnection edge cases

If TUI disconnects mid-stream, what happens to in-flight tool calls? Mitigation: v1 ignores reconnects (single-client assumption). Add reconnect in v1.5 when it matters.

Risk · Auto-recall noise

Keyword recall might inject too many false positives, polluting context. Mitigation: Cap to top-3 hits with min-score gate; add a /forget slash command to debug.

Risk · Cost runaway from sub-agents

Naive spawn_agent could chain spawns into infinite cost. Mitigation: Hard depth-cap of 1 in v1 (sub-agents can't spawn). Lift to 3 with budget tracking in v1.5.

Verdict — go.

Brigade v1 is a 5.5-hour pair-programming sprint that produces a real working agent. The risks are bounded, the references are on disk, and the engine you'd otherwise have to build is already in node_modules/@mariozechner/. The only legitimate reason not to start today is "I want to keep researching" — but past a certain point, more research is procrastination wearing a lab coat.

§10 · Go

Ready to build

Path is locked. Spec is verified. Every snippet on this page is traceable to a live file in the workspace.

Brigade v1 — what to do next

Open Claude Code in this workspace. Say "go" — the scaffold begins in the next message. Estimated time to working v1: ~5.5 hours of pair programming.

$ cd c:\Users\SmartSystems\Downloads\
$ mkdir brigade && cd brigade
$ "go"   # in the Claude Code conversation


Strategy Brief · April 2026 · v3

Four agent platforms.
One protocol you actually own.

A plain-English tour of OpenClaw, Boop, Paperclip, and how to build their best ideas as a native feature inside Nodebase — with a custom wire protocol designed for it. No CS degree required.

Read time · 18 min
Audience · founders & engineers shipping agents
Goal · steal the patterns, design the protocol, ship the SaaS feature
OpenClaw (platform) · Boop (personal agent) · Paperclip (agent company) · Nodebase (the host)
The 30-second version

If both projects were restaurants…

Both are kitchens that cook the same dish — an "AI assistant that talks to people on messaging apps and uses tools on their behalf." But one is a giant commercial kitchen with twenty stations, and the other is a well-equipped home kitchen built for one chef.

The commercial kitchen

OpenClaw

A full platform you install on your own machines.

  • Talks on 25+ messaging apps — WhatsApp, Telegram, Discord, Slack, Signal, iMessage, Matrix, Teams, …
  • Plugs into 30+ AI brains — OpenAI, Anthropic, Google, Groq, local models via Ollama, etc.
  • Has its own plugin system so anyone can extend it.
  • Ships native iPhone, Android and Mac apps.
  • Stores everything locally on your computer. No cloud database.
  • ~1,000+ source files. Production-grade.
vs
The home kitchen

Boop

A focused starter template you fork and customise.

  • Talks on iMessage only (using a service called Sendblue).
  • Uses Claude exclusively (Anthropic's AI).
  • No plugin system — you just edit the code.
  • No mobile app — there's a small debug dashboard for the developer.
  • Stores everything in Convex, a managed cloud database.
  • ~7,300 lines of code. Lean, opinionated, weekend-buildable.
25+ · Channels in OpenClaw
30+ · AI providers in OpenClaw
1 · Channel in Boop (iMessage)
1,000+ · SaaS tools Boop borrows via Composio
Vocabulary first

The 10 ideas you need to follow the rest of this page.

Both projects use the same handful of building blocks. Once these click, everything else is just "which version did each project pick?"

🧠

LLM (the AI brain)

A "Large Language Model" — the actual neural network like Claude, GPT-4 or Gemini that reads text and writes text.

Analogy: the chef. Doesn't know your kitchen, only knows how to cook.
🤖

Agent

An LLM plus a loop: read the message → think → call a tool → read the result → think again → reply. The "loop" is what makes it an agent instead of a chatbot.

Analogy: the chef wearing an apron, with the recipe and the timer. They can do things, not just talk.
🛠️

Tool

A small function the agent is allowed to call, like send_email, search_web, or save_memory. Each tool has a typed input and output.

Analogy: kitchen equipment — knife, oven, blender. Each does exactly one thing well.
🔌

MCP (Model Context Protocol)

A standard, invented by Anthropic, for plugging tools into any AI. Both projects use it. Think of it as USB for AI — any tool that speaks MCP works with any agent that speaks MCP.

Analogy: the USB-C port on the back of the kitchen — any gadget plugs in.
📡

Channel

A messaging platform the agent receives texts on: WhatsApp, iMessage, Slack, Discord, … Each channel needs an adapter that translates between the platform's API and the agent's internal format.

Analogy: the front door. WhatsApp is one door, Slack is another. Customer walks in through one of them.
🧩

Plugin / Extension

A self-contained bundle of code that adds a feature without touching the core. OpenClaw is built around this idea (it has 100+ of them). Boop doesn't have plugins — you just modify the source.

Analogy: attachments for a KitchenAid mixer. The base machine doesn't change; you snap on what you need.
🧠

Memory

Notes the agent saves about you between conversations: "user prefers concise answers", "their landlord is named X". The agent can recall these next time. Without memory, every chat starts from zero.

Analogy: the chef's little black book of regulars and their dietary quirks.
📅

Automation / Cron

"Do this every morning at 8am." A scheduler that wakes up and asks the agent to perform a task on a recurring basis.

Analogy: the kitchen alarm that says "rise and shine, time to bake bread."

Approval / Drafts

Before the agent sends a real email or moves real money, it shows you a draft and waits for "yes". Stops embarrassing or expensive mistakes. Boop has a beautiful version of this; OpenClaw uses approval hooks.

Analogy: the waiter shows you the receipt before charging your card.
🛰️

Gateway

A central server that all the channels, agents, and devices connect to. It routes messages, handles auth, and is what your phone app talks to. OpenClaw has one; Boop doesn't need one because it's only one channel and one user.

Analogy: the maître d' at the front of the restaurant directing every order to the right station.
Follow one message

What actually happens when you text "What's on my calendar tomorrow?"

This is the same journey in both projects. Watch how it changes shape depending on which kitchen you walk into.

In OpenClaw

Self-hosted, multi-channel, plugin-based.

flowchart TD A["📱 You text on WhatsApp"] --> B["WhatsApp plugin"] B --> C["🛰️ Gateway server"] C --> D["🤖 Agent loop"] D --> E{"Need a tool?"} E -- yes --> F["Calendar tool plugin"] F --> G["Google Calendar API"] G --> F F --> D E -- no --> H["Compose reply"] D --> H H --> C C --> B B --> A D <-.-> M[("💾 Local memory
~/.openclaw/")]

In Boop

Cloud-DB-backed, iMessage-only, dispatcher pattern.

flowchart TD A["📱 You text on iMessage"] --> B["Sendblue webhook"] B --> C["Express server"] C --> D["🎩 Dispatcher agent
Claude Sonnet"] D --> E{"Real work needed?"} E -- no --> R["Reply directly"] E -- yes --> S["Spawn execution agent
scoped to: gmail, calendar"] S --> T["Composio MCP tools"] T --> U["Google Calendar API"] U --> T T --> S S --> V["💾 Save draft"] V --> D D --> R R --> B B --> A D <-.-> M[("☁️ Convex cloud DB
memoryRecords")]

The single biggest difference, in one sentence

OpenClaw routes through a gateway you own and reaches tools through plugins you install. Boop routes through Sendblue + Convex and reaches tools through Composio's catalogue of 1,000+ pre-built integrations. One is "build it yourself, control it yourself." The other is "rent the hard parts, focus on personality."

Under the hood

The actual technology stack, visualised.

Every modern app is a Lego tower. Here's the bricks each project chose.

Where the code lives

Approximate file count per project area.

Capability coverage

How many built-in pieces each project ships.

OpenClaw's stack

  • Language: TypeScript (strict). Runs on Node.js 22+ or Bun.
  • Frameworks: Express (HTTP), ws (WebSocket), Grammy (Telegram), Slack Bolt, matrix-js-sdk, Baileys (WhatsApp Web).
  • AI: OpenAI, Anthropic, Google, Groq, Bedrock SDKs; plus 25+ more via plugins.
  • Storage: the file system (~/.openclaw/), SQLite, and LanceDB for vector memory. No cloud DB.
  • UI: SwiftUI for iOS/macOS, native Android, plus a separate Next.js "Mission Control" web console.
  • Validation: Zod for schemas. Vitest for tests with 70% coverage gates.
  • Distribution: npm package, signed Mac app, mobile app stores, Docker.

Boop's stack

  • Language: TypeScript. Runs on Node.js 20+.
  • Agent SDK: @anthropic-ai/claude-agent-sdk — Anthropic's official agent loop.
  • Integrations: Composio (1,000+ SaaS toolkits, OAuth handled for you).
  • Backend: Convex — a managed real-time database with typed queries.
  • Server: Express + WebSocket for the dashboard live feed. Croner for scheduling.
  • UI: React 19 + Vite + Tailwind 4 — but only as a developer-facing debug dashboard, not a user app.
  • Distribution: npm run dev on your laptop, paired with a Sendblue account and ngrok tunnel.
The standout idea in each

One thing each project does really well.

OpenClaw's signature move

A "manifest-first" plugin system

Instead of wiring every feature into the core code, OpenClaw says: describe what you do in a JSON file, drop your folder in extensions/, and the system will find you. A WhatsApp plugin, an OpenAI plugin, a Brave-Search plugin — all the same shape.

Why that matters in plain words: it means the project can grow forever without the core team having to approve every new feature. Anyone can write a plugin. The boundary between "core" and "extension" is enforced by the type system.

Analogy: like the App Store on your phone — Apple doesn't write Spotify. They wrote a clean way for Spotify to plug in.
Boop's signature move

3-phase adversarial memory consolidation

Most AI memory systems just append — they pile up notes forever. Boop does something genuinely clever every 24 hours:

  1. Proposer (a smart Claude model) reads all memories and suggests merges, supersessions, and prunes.
  2. Adversary (a cheap Claude model) plays devil's advocate against every proposal.
  3. Judge (a smart Claude model) weighs both sides and approves or rejects each change.

The result: a memory that self-edits like Wikipedia, with a full audit trail of why each note survived.

Analogy: a courtroom for your notebook. Prosecutor, defence, judge. Every note has to earn its spot.
Side by side

The full comparison.

Twenty rows. The same row, two answers. This is the cheat sheet to keep open while you decide.

Dimension | OpenClaw | Boop
Project type | Platform / framework | Starter template
Messaging channels | 25+ built-in (WhatsApp, Telegram, Discord, Slack, Signal, iMessage, Matrix, Teams, …) | 1 (iMessage via Sendblue)
AI providers | 30+ (OpenAI, Anthropic, Google, Groq, Bedrock, Ollama, LMStudio, …) | 1 family (Claude Sonnet / Opus / Haiku)
How tools are added | Plugins in extensions/ following the public Plugin SDK | Composio toolkit catalogue + custom MCP servers in server/
Memory | Vector embeddings (LanceDB) + markdown wiki, local files | Convex table with tier (short / long / permanent), segment, decay scoring
Memory cleanup | Context-engine compaction (implicit) | Explicit 3-phase Proposer / Adversary / Judge pipeline
Approval before action | Two-tier approval hooks (user + operator) | Drafts table — every external write stages first
Scheduling | Cron registered inside agent harness | automations table + 30s polling loop with Croner
Voice (TTS / STT) | Yes — ElevenLabs, Deepgram, edge-tts, even voice calls | No
Sub-agents | Yes — nested agent contexts | Yes — explicit dispatcher-spawns-executor pattern
Sandbox | Yes — Playwright browser + code sandbox | No
Native mobile apps | Yes — iOS, Android, macOS | No
Web UI | Mission Control (Next.js) | Debug dashboard (React + Vite) — for the developer, not end users
Where data lives | Your computer (filesystem + SQLite + LanceDB) | Convex cloud (managed DB)
Identity / multi-user | Public-key device pairing, multi-device | Single-tenant (one user)
Distribution | npm global, Mac app, mobile stores, Docker | npm run dev on your laptop
Code size | ~1,000+ TypeScript files | ~7,300 lines of code
Setup time (rough) | Hours to days, depending on which channels you turn on | An afternoon
Best for | Building a serious product or running a multi-channel personal assistant | Building a personal Claude-powered iMessage agent fast
License | MIT | MIT
The layer cake

What every AI-agent app is made of.

If you stack the pieces from front-door to deep storage, every agent in 2026 looks like this — including both of these projects. The names of the layers are universal, even if the bricks differ.

1

Channel layer — "the front door"

Receives the user's message from a chat app and converts it to plain text + metadata. OpenClaw: 25+ built-in adapters. Boop: Sendblue webhook for iMessage.

2

Routing / gateway — "the maître d'"

Decides which conversation, which user, which agent should handle this message. OpenClaw: a typed Gateway over WebSocket. Boop: a simple Express handler that looks up the conversation in Convex.

3

Agent loop — "the chef"

Sends the message + memory + tool list to the LLM, gets back either a reply or a tool call, executes it, repeats. OpenClaw: custom harness. Boop: Anthropic's Claude Agent SDK.

4

Tool layer — "the equipment"

The actual functions the agent can call. Both projects use MCP, the open USB-for-AI standard. OpenClaw: tools come from plugins. Boop: tools come from Composio + a few custom MCP servers (drafts, automations, memory).

5

Memory layer — "the notebook"

Persistent notes about the user. OpenClaw: vector embeddings in LanceDB + markdown files. Boop: a Convex table with importance scores and decay rates, with extraction running after every turn.

6

Storage layer — "the pantry"

Everything else: conversations, drafts, automations, usage costs. OpenClaw: filesystem and SQLite (your machine). Boop: Convex cloud.

7

Observability — "the inspector window"

So you can see what the agent is doing live. OpenClaw: Mission Control web console + native apps. Boop: a React debug dashboard with WebSocket live feed.

Cost & complexity, plotted

Where each project sits on the effort-vs-power curve.

The X-axis is "how much can it do out of the box." The Y-axis is "how much work to set up and maintain." You generally don't get one without the other.

If you want a personal AI on iMessage this weekend

Pick Boop. You'll be done by Sunday night. The cost is: locked into Claude, locked into iMessage, locked into Sendblue + Convex.

If you want to build something real for many channels

Pick OpenClaw. The setup is more involved but you get every channel, every model, plus mobile apps for free.

If you want the best ideas of both

Take OpenClaw as the base. Port Boop's three killer ideas — the consolidation pipeline, the drafts table, the Composio integration — over as three new plugins. (See verdict below.)

Under the hood — round 2

The long version, for the curious.

If you read this far you probably want the actual mechanics — not metaphors. The next five sections are deep but still in plain English. They cover the four technologies these projects rest on (MCP, Composio, Convex, the Claude Agent SDK) and then the actual algorithms and contracts inside each repo: Boop's memory math and OpenClaw's plugin contract.

Concept #1 — MCP

"USB-C for AI assistants."

MCP — the Model Context Protocol — is an open standard from Anthropic (Nov 2024). Before MCP, every AI app had to write a custom connector for every data source. With MCP, anyone can publish a server that exposes tools and data, and any AI app that's an MCP client can plug it in. Think USB-C, but for capabilities.

The cast of characters

  • Host — the AI app you talk to (Claude Desktop, VS Code, OpenClaw, Boop).
  • Client — a tiny adapter inside the host. One client per server.
  • Server — the program that exposes tools/data (filesystem, GitHub, Sentry, your CRM…).
  • Transport — either stdio (local pipe) or Streamable HTTP (remote, OAuth-friendly).

What a server can offer

  • Tools — actions the LLM can call (create_issue, send_email).
  • Resources — read-only context (file contents, DB rows).
  • Prompts — reusable templates (think slash-commands).

Plus: sampling (server asks host's LLM to think), elicitation (server asks user a question), notifications (server pushes "tool list changed").

The handshake — in 4 messages, JSON-RPC 2.0

// 1. Client → Server: initialize
{ "jsonrpc":"2.0","id":1,"method":"initialize",
  "params":{ "protocolVersion":"2025-06-18","capabilities":{"elicitation":{}},
             "clientInfo":{"name":"openclaw","version":"2026.4.15"} } }

// 2. Server → Client: capability declaration
{ "jsonrpc":"2.0","id":1,"result":{
    "capabilities":{ "tools":{"listChanged":true}, "resources":{} },
    "serverInfo":{"name":"github-mcp","version":"1.0"} } }

// 3. Client: "I'm ready"
{ "jsonrpc":"2.0","method":"notifications/initialized" }

// 4. Client: "what tools do you have?"
{ "jsonrpc":"2.0","id":2,"method":"tools/list" }
// → server returns array of {name, description, inputSchema}

After that, calling a tool is just {"method":"tools/call","params":{"name":"…","arguments":{…}}}. The model never speaks JSON-RPC — the host wraps the response and feeds it back into the conversation.

Why it matters here: both OpenClaw and Boop speak MCP natively. OpenClaw can plug into any MCP server you point it at; Boop's own features (memory, spawn, drafts, automations) are implemented as in-process MCP servers so the Claude Agent SDK can talk to them without hard-coded glue. Same protocol, different scale.

Concept #2 — Composio

1,000+ apps, none of them YOUR problem.

Composio is a tool platform for agents. Instead of you writing OAuth, refresh, rate-limit, and request-shape code for every SaaS your agent needs, Composio runs a managed catalog of 1,000+ pre-wired toolkits — Gmail, Slack, GitHub, Linear, Notion, Stripe, HubSpot, Sentry, etc. Your agent gets a single session object; Composio handles the rest.

1. Managed Auth

Composio runs the OAuth dance, stores refresh tokens, scopes them to the user, and rotates them. Your code never sees a credential — only an authenticated session handle.

2. Smart tool search

Don't dump 1,000 tool schemas into the prompt — that costs tokens and confuses the model. Composio resolves tools by intent: only the relevant toolkits are returned at runtime.

3. Sandboxed execution

Long-running tasks run in remote sandboxes with a navigable filesystem; large API responses live there as files instead of blowing your context window.

How Boop uses it — five lines that do the work of a thousand

// inside the execution agent
const session   = composio.createSession({ userId });
const tools     = session.tools(["gmail", "gdrive"]);  // scoped to this task
const mcpServer = composio.toMcpServer(tools);          // expose as MCP

// Claude Agent SDK now sees Gmail + Drive tools, named like:
//   mcp__gmail__GMAIL_LIST_THREADS
//   mcp__gdrive__GDRIVE_FIND_FILES
// Boop's executor agent picks them, Composio handles auth & execution.

Composio is also a provider for the Claude Agent SDK, the OpenAI Agents SDK, Vercel AI SDK, LangGraph, CrewAI — so the same toolkit works regardless of which model framework you're on.

OpenClaw's equivalent is each channel/provider plugin owning its own auth (run openclaw onboard, paste a token once, gateway stores it locally). Different philosophy: OpenClaw is local-first and per-user, Composio is a managed cloud service with a free tier and paid scaling.

Concept #3 — Convex

"Firebase, but the queries are TypeScript."

Convex is the cloud database Boop uses. It's a document-relational, reactive, ACID-serializable backend where you write your queries and mutations as TypeScript functions that run inside the database. There's no SQL, no ORM, no migration tool, no caching layer, no WebSocket setup. The frontend's useQuery hook re-renders automatically whenever the underlying data changes.

Three function flavors

  • Query — pure read, deterministic, automatically subscribed-to.
  • Mutation — atomic transactional write (auto-retried on conflict, full serializable isolation).
  • Action — non-deterministic side-effects (LLM calls, fetch). Calls queries/mutations to touch data.

Why agent devs love it

  • Schema is TypeScript — LLMs already know how to write it.
  • Mutations can't corrupt state. Period. ACID.
  • Built-in cron jobs and scheduled functions (Boop's 6h cleanup + 24h consolidation).
  • Vector indexes (Boop's by_embedding, 1024 dims, lifecycle="active" filter).
  • Reactivity = the React debug UI updates the instant a memory is written, no manual subscription code.

Boop's memoryRecords table — straight from convex/schema.ts

memoryRecords: defineTable({
  memoryId: v.string(),                                      // mem_lvi9k_x7g
  content: v.string(),                                       // "User prefers SI units"
  tier: v.union(v.literal("short"),v.literal("long"),v.literal("permanent")),
  segment: v.union(/* identity | preference | correction | relationship
                      | project | knowledge | context */),
  importance: v.number(),     // 0..1
  decayRate: v.number(),      // per-segment multiplier
  accessCount: v.number(),    // reinforcement signal
  lastAccessedAt: v.number(),
  lifecycle: v.union(v.literal("active"),v.literal("archived"),v.literal("pruned")),
  supersedes: v.optional(v.array(v.string())),
  embedding: v.optional(v.array(v.float64())),               // 1024-d vector
  metadata: v.optional(v.string()),                          // JSON, e.g. {corrects:"..."}
  createdAt: v.number(),
})
.index("by_lifecycle", ["lifecycle"])
.vectorIndex("by_embedding", { vectorField: "embedding",
                               dimensions: 1024,
                               filterFields: ["lifecycle"] })
Concept #4 — Boop's memory math

The actual algorithm. With actual numbers.

Marketing copy says "memory that decays". Here's what's literally happening in server/memory/clean.ts every six hours.

The decay formula

For each active, non-permanent memory:

BASE_HALF_LIFE_DAYS = 11.25
DECAY_BETA          = 0.8
ARCHIVE_THRESHOLD   = 0.15
PRUNE_THRESHOLD     = 0.05

adaptiveHalfLife = BASE_HALF_LIFE_DAYS × (1 + importance)
λ_base           = ln(2) / adaptiveHalfLife × DECAY_BETA
λ_effective      = λ_base × (1 + decayRate)               // segment multiplier

decayed          = importance × exp(−λ_effective × daysSinceAccess)
reinforcement    = 1 + ln(1 + accessCount) × 0.1          // recall makes it stickier
score            = clamp(decayed × reinforcement, 0, 1)

if (score < PRUNE_THRESHOLD)                           → lifecycle = "pruned"
else if (score < ARCHIVE_THRESHOLD && tier !== "long") → lifecycle = "archived"
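
As straight TypeScript, the same math looks like this (a sketch built only from the constants and formula above; the function and field names are mine, not Boop's actual exports):

```typescript
const BASE_HALF_LIFE_DAYS = 11.25;
const DECAY_BETA = 0.8;

interface MemoryLike {
  importance: number;      // 0..1, set per segment at write time
  decayRate: number;       // per-segment multiplier (see the table below)
  accessCount: number;     // how often the memory has been recalled
  daysSinceAccess: number;
}

// Hypothetical helper mirroring the formula above; not Boop's real code.
function memoryScore(m: MemoryLike): number {
  // Important memories get a longer half-life before decay even starts.
  const adaptiveHalfLife = BASE_HALF_LIFE_DAYS * (1 + m.importance);
  const lambdaBase = (Math.log(2) / adaptiveHalfLife) * DECAY_BETA;
  const lambdaEffective = lambdaBase * (1 + m.decayRate);
  const decayed = m.importance * Math.exp(-lambdaEffective * m.daysSinceAccess);
  // Each recall makes the memory stickier, logarithmically.
  const reinforcement = 1 + Math.log(1 + m.accessCount) * 0.1;
  return Math.min(1, Math.max(0, decayed * reinforcement));
}

// A "context" memory untouched for 30 days falls below the 0.15 archive line:
const ctx = memoryScore({ importance: 0.40, decayRate: 0.08, accessCount: 0, daysSinceAccess: 30 });
// A "correction" recalled five times over the same 30 days stays well above it:
const corr = memoryScore({ importance: 0.80, decayRate: 0.015, accessCount: 5, daysSinceAccess: 30 });
```

Running the two sample inputs is the fastest way to feel the curve: the context memory scores roughly 0.13 after a month of silence, while the reinforced correction holds around 0.41.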

The 7 segments — each is a different curve

Segment        Tier        Importance   Decay rate
identity       permanent   0.85         0.01
correction     long        0.80         0.015
relationship   long        0.75         0.02
preference     long        0.70         0.02
project        long        0.65         0.025
knowledge      long        0.60         0.03
context        short       0.40         0.08

"identity" memories (your name, role) are tier-permanent, so the six-hour cleaner skips them entirely; even if it didn't, their 0.01 decayRate would barely nudge λ. "context" memories (what we were just talking about) start at importance 0.40 with the fastest multiplier (0.08), so they slide under the 0.15 archive threshold within a few weeks unless a recall reinforces them.

Score over time, by segment

The 3-phase consolidation pipeline (every 24h)

Decay alone doesn't dedupe. So once a day Boop runs a three-LLM debate over up to 150 active memories. Each phase has its own model and its own JSON schema:

PHASE 1

Proposer

Model: claude-haiku-4-5
Outputs: {type:"merge"|"supersede"|"prune", …}. Hard rules: corrections always win over non-corrections; never merge if it loses information.

PHASE 2

Adversary

Model: claude-haiku-4-5
Must produce one challenge per proposal: {objection, severity:"low"|"medium"|"high"}. Job is to find the reason each merge is wrong.

PHASE 3

Judge

Model: claude-sonnet-4-6
Reads proposals + objections, returns {approve:bool, rationale}. High-severity objections usually reject; low-severity usually approve.

The full transcript — every proposal, every challenge, every ruling — is saved as JSON on consolidationRuns.details, so you can audit any historical run. Cost is tracked per phase in usageRecords (source: "consolidation-proposer" / "-adversary" / "-judge").
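
The transcript on consolidationRuns.details is easy to type out. Here is a sketch of one audited entry (the type and field names are mine, not Boop's schema; the judge is replaced by a deterministic stand-in for the documented default lean):

```typescript
type Severity = "low" | "medium" | "high";

interface Proposal  { type: "merge" | "supersede" | "prune"; memoryIds: string[]; reason: string }
interface Challenge { objection: string; severity: Severity }
interface Ruling    { approve: boolean; rationale: string }

// One audited entry per proposal: what was proposed, how it was attacked, how it was ruled.
interface ConsolidationEntry { proposal: Proposal; challenge: Challenge; ruling: Ruling }

// Stand-in for the judge's default lean: high severity usually rejects, low usually approves.
function defaultLean(challenge: Challenge): boolean {
  return challenge.severity !== "high";
}

const challenge: Challenge = { objection: "mem_b carries a date mem_a lacks", severity: "high" };
const entry: ConsolidationEntry = {
  proposal: { type: "merge", memoryIds: ["mem_a", "mem_b"], reason: "near-duplicate facts" },
  challenge,
  ruling: { approve: defaultLean(challenge), rationale: "merge would lose the date" },
};
```

Because every entry carries all three phases, "why did these two memories merge last Tuesday?" is a query, not an archaeology project.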

Concept #5 — OpenClaw's plugin contract

What you'd actually write to ship a plugin.

OpenClaw's plugin SDK lives in src/plugin-sdk/ and exposes ~150 typed exports. The architecture is manifest-first: a tiny JSON file declares what your plugin can do, and a TypeScript entry registers what it does. The host can answer "is Telegram supported?" without ever loading your code.

1. The manifest — openclaw.plugin.json

{
  "id": "telegram",
  "channels": ["telegram"],
  "channelEnvVars": {
    "telegram": ["TELEGRAM_BOT_TOKEN"]
  },
  "configSchema": { "type": "object", "properties": {} },
  "activation": {
    "onChannels": ["telegram"],
    "onCapabilities": ["channel"]
  },
  "contracts": {
    "providers": [], "speechProviders": []
  }
}

The host reads this without booting your code. That's how onboarding can show "Telegram available — set TELEGRAM_BOT_TOKEN" instantly.

2. The entry — definePluginEntry

import { definePluginEntry } from "openclaw/plugin-sdk";

export default definePluginEntry({
  id: "my-cool-plugin",
  name: "My Cool Plugin",
  description: "Adds a CoolDB provider",
  kind: "provider",
  configSchema: () => ({ /* zod or JSON schema */ }),

  register(api) {
    api.registerProvider({ /* ProviderPlugin */ });
    api.registerTool(makeCoolTool);
    api.registerHook("before-agent-reply", async ctx => {
      // mutate or veto the reply
    });
    api.registerCommand({ name: "cool", run: …  });
  },
});

The api object is the entire seam. There are 10 register* methods covering providers (text, speech, image, video, music, vision, realtime), CLI backends, tools, commands, services, and 30+ lifecycle hooks.

The lifecycle hooks (selected) — fire-points an LLM agent gives you

before-agent-start
Mutate system prompt, inject memory, swap model.
before-tool-call
Veto, rewrite arguments, force approval, log.
before-agent-reply
Final pass over outgoing text. Filter, format, redact.
approval.required
Pause for user confirmation on dangerous actions.
channel.message.received
First-touch on inbound messages from any channel.
model-override
Per-turn model selection (cost / capability routing).
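
In miniature, the veto/rewrite contract of before-tool-call looks like this (a self-contained sketch; the real SDK's hook signatures are richer and async, so treat names here as illustrative):

```typescript
interface ToolCall { tool: string; args: Record<string, unknown> }

type HookVerdict =
  | { action: "allow"; call: ToolCall }   // possibly with rewritten args
  | { action: "veto"; reason: string };

type BeforeToolCallHook = (call: ToolCall) => HookVerdict;

// Chain hooks in order; the first veto wins, later hooks see earlier rewrites.
function runHooks(hooks: BeforeToolCallHook[], call: ToolCall): HookVerdict {
  let current = call;
  for (const hook of hooks) {
    const verdict = hook(current);
    if (verdict.action === "veto") return verdict;
    current = verdict.call;
  }
  return { action: "allow", call: current };
}

// Example hook: block rm -rf outright, wrap everything else in a timeout.
const guardBash: BeforeToolCallHook = (call) => {
  if (call.tool !== "bash") return { action: "allow", call };
  const cmd = String(call.args.command ?? "");
  if (/rm\s+-rf/.test(cmd)) return { action: "veto", reason: "destructive command" };
  return { action: "allow", call: { ...call, args: { ...call.args, command: `timeout 60 ${cmd}` } } };
};

const vetoed  = runHooks([guardBash], { tool: "bash", args: { command: "rm -rf /" } });
const allowed = runHooks([guardBash], { tool: "bash", args: { command: "ls" } });
```

The design point is that the hook returns a verdict rather than mutating shared state: the host stays in control of what actually executes.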

The strict architectural rules — printed on the wall

  • Core must stay extension-agnostic. Adding a plugin should never require core edits.
  • Plugins talk to core only through openclaw/plugin-sdk/* — never relative imports into src/**.
  • No hardcoded plugin-id lists in core. Every special case must be expressed as a manifest field, capability, or contract.
  • Backwards-compatible by contract. Third-party plugins exist in the wild — every SDK addition is additive and versioned.

This is why the codebase has 1,800+ contributors and still ships a stable plugin API: the boundaries are enforced by lint rules and tests, not just convention.

Where they meet — Claude Agent SDK

The shared engine.

Both projects sit on top of Anthropic's Claude Agent SDK (formerly Claude Code SDK, ~1.4k stars, weekly releases). It's the runtime that turns a chat completion into an agent loop: it handles streaming tool calls, sub-agent spawning, hooks, MCP plumbing, session persistence, and the standard built-in tools (WebSearch, WebFetch, Bash, Read/Write/Edit, Skill).

How Boop uses it

  • Two separate query() calls per user message: dispatcher (Sonnet) → optional spawn → executor (Sonnet, more tools).
  • Allowed-tool whitelists per agent: dispatcher gets memory + spawn + drafts + ack; executor gets WebSearch + WebFetch + Composio MCP servers.
  • Disallowed list explicitly blocks Bash, Read, Write, Edit, Glob, Grep on the dispatcher so it can't accidentally do real work.

How OpenClaw uses it

  • Multi-provider: Claude is one option among 30+. Same agent loop wraps OpenAI, Gemini, Ollama, etc.
  • Sandboxed sessions: non-main sessions run inside Docker (default), SSH, or OpenShell — tools execute in the sandbox, not your host.
  • Skills system: ~/.openclaw/workspace/skills/<skill>/SKILL.md is loaded as a recipe; agent picks skills via Skill tool the same way Boop's executor does.
My honest read

If you're building something for yourself: do this.

You already have OpenClaw + a Mission Control console for it on your machine. That's an enormous head start. Don't throw it away by switching to a single-channel template. Instead, steal Boop's three best ideas as plugins:

  1. A Composio plugin — gives OpenClaw instant access to 1,000+ pre-wired SaaS tools (Gmail, Slack, GitHub, Linear, Notion, etc.) without writing each integration yourself.
  2. A memory-consolidation plugin — port the Proposer / Adversary / Judge pipeline so OpenClaw's memory self-edits like Wikipedia.
  3. A drafts plugin — explicit "stage external writes, approve before send" surface, displayed in Mission Control as its own panel.

All three are self-contained, all three respect OpenClaw's plugin boundaries, and all three become Mission Control panels. You end up with: every channel, every model, every Composio tool, the smartest memory model on the market, and the safety of staged writes — all in one local-first system you own.

Bottom line: use OpenClaw as the engine, harvest Boop for the three good ideas, ship via Mission Control. Don't merge codebases. Don't rebuild from scratch.

Build your own — the playbook

So you want to build one of these.

Here is exactly how I'd do it if I were starting tomorrow — which folders to read, which protocols to learn, which weekend builds which feature. No filler.

The 6-week build path

WEEK 1
Pick your spine. Scaffold a Convex + Next.js app (npm create convex@latest) and wire @anthropic-ai/claude-agent-sdk. End the week with a working text loop in a debug HTML page: user types → Claude replies → message in Convex. Nothing else. Don't add channels yet.
WEEK 2
Add memory + the dispatcher pattern. Copy Boop's three-pass split: interaction-agent (cheap, decides) → execution-agent (expensive, acts). Add a memoryRecords table with vector index. Implement recall() + write_memory() as in-process MCP servers. Skip consolidation for now.
WEEK 3
First channel — pick ONE. Telegram is easiest (free token from @BotFather, polling works without HTTPS). Then Sendblue (iMessage, paid) or Discord (free, OAuth). Build it as a folder under server/channels/<name> with inbound.ts + outbound.ts + config.ts. Mirror OpenClaw's extensions/telegram/ shape exactly.
WEEK 4
Plug in the world via Composio. Sign up for Composio, set COMPOSIO_API_KEY. Wire one toolkit at a time — Gmail first, then Calendar, then Slack. Boop's composio-loader.ts is the canonical pattern: build MCP servers per-integration, scope them to user accounts, never expose all tools at once.
WEEK 5
Drafts + automations + consolidation. Add the drafts table and the staging MCP. Add the croner-based automation loop (30-second poll). Add the Proposer/Adversary/Judge consolidation cron (24h). This is when the agent goes from "chatbot" to "knows me."
WEEK 6
Approval, sandboxing, and the second channel. Now generalize: any "do" tool requires an approval round-trip back through the channel. Bash/code execution goes through E2B or Docker. Add channel #2 — picking a different shape (DMs vs guilds vs threads) forces you to abstract properly. End with a 1-button install.

Folders to read first.

Don't read everything. Read these, in this order, and you've absorbed 80% of both repos.

From the OpenClaw side
  1. src/plugin-sdk/ — the public plugin contract. This is the masterclass on plugin boundaries. Read channel-contract.ts, provider-entry.ts, AGENTS.md.
  2. src/gateway/ — the WebSocket control plane. Look at server.ts, protocol/schema.ts, server-methods/. Shows how to do typed JSON-RPC over WS.
  3. extensions/telegram/ — the cleanest channel plugin. Read bot.ts, bot-handlers.runtime.ts, send.ts, approval-native.ts.
  4. extensions/openai/ — the cleanest provider plugin. Read openai-provider.ts and stream-wrappers.ts for the streaming-adapter pattern.
  5. src/agents/tools/ — every built-in tool (bash, read, write, browser, web search, cron). bash-tools.exec.ts + bash-tools.exec-approval-request.ts show approval-gated execution.
  6. src/cron/ — the scheduler. service.ts + isolated-agent.ts show how to run agents in a non-interactive context.
  7. src/commands/onboard*.ts — the env-var-detection + interactive setup wizard. Steal this pattern wholesale; first-run UX is what kills hobby projects.
  8. apps/macos/, apps/ios/, apps/android/ — three native clients all speaking the same WS protocol. Look at OpenClawProtocol/ for the Swift codec.
From the Boop side
  1. convex/schema.ts — the entire data model on one page. Start here. 9 tables, vector index on memory, status enums on drafts. This is what your DB should look like.
  2. server/memory/ — adaptive-decay memory. Read types.ts (segments, decay rates), extract.ts (post-turn fact extraction), clean.ts (the 6-hour decay loop).
  3. server/consolidation.ts — the Proposer/Adversary/Judge prompts in full. Copy verbatim to start; tune later. Cheapest "smart memory" you'll ever ship.
  4. server/interaction-agent.ts — the dispatcher system prompt. The "send_ack → spawn → reply" pattern is the single biggest UX win Boop has.
  5. server/execution-agent.ts — the worker prompt. Read the "Sources:" rule and the no-fabrication clause; both are battle-tested.
  6. server/draft-tools.ts + convex/drafts.ts — staged writes. The interaction-agent-only send_draft tool is the safety pattern you want.
  7. server/automations.ts — croner-based scheduler with notify-back. 30-second poll, simple, works.
  8. debug/ — Vite + React UI subscribed to Convex. Build something like this on day one. Watching tool calls stream in real time changes how you debug agents.

Protocols to actually study.

You don't need to read RFCs cover-to-cover. You do need to know which spec each piece of glue speaks.

MCP (Anthropic, 2024)
  What it solves: How an LLM calls tools — the universal "USB-C for AI". JSON-RPC 2.0 over stdio or Streamable HTTP.
  Where you'll meet it: Every tool you ship. Both repos use it. spec/draft on modelcontextprotocol.io.

JSON-RPC 2.0
  What it solves: The wire format under MCP, A2A, and OpenClaw's gateway. Request/response with method, params, id.
  Where you'll meet it: Every WebSocket message in OpenClaw. jsonrpc.org/specification.

A2A (Linux Foundation, v1.0)
  What it solves: Agent-to-Agent. How two opaque agents discover each other and collaborate without sharing internals.
  Where you'll meet it: Future-proofing your agent for multi-agent swarms. a2a-protocol.org, 23k stars on GitHub.

WebSocket + SSE
  What it solves: Long-lived bidirectional and one-way streams for agent output, channel updates, and live UI.
  Where you'll meet it: OpenClaw gateway = WS. Convex = WS. Claude streaming = SSE. Discord gateway = WSS with heartbeat opcodes.

OAuth 2.1 + Device Code
  What it solves: How users authorize without pasting tokens. Device-code flow is what makes mobile pairing pleasant.
  Where you'll meet it: Composio handles OAuth for 1,000+ apps. Slack, Discord, Google all want OAuth. RFC 8628 for device code.

Telegram Bot API
  What it solves: Webhook OR long-polling JSON over HTTPS. Inline keyboards, threads, business connections, payments, mini-apps.
  Where you'll meet it: Easiest first channel. core.telegram.org/bots/api.

Discord Gateway
  What it solves: Persistent WSS connection with opcode-based heartbeat. REST for sending, WS for receiving.
  Where you'll meet it: Most "real-time" channel. Use discord.js unless you really like writing heartbeats.

WhatsApp Cloud API
  What it solves: Graph API REST + webhook. Templated messages, 24h customer-service window, phone-number rate limits.
  Where you'll meet it: If you want WhatsApp without an iPhone in a box. developers.facebook.com/docs/whatsapp.

Sendblue
  What it solves: iMessage as a webhook. The reason Boop feels native on iPhone — green/blue bubbles handled by them.
  Where you'll meet it: Paid (~$0.01/msg). Boop's sendblueDedup table is the idempotency pattern you need.

Sparkle appcast
  What it solves: macOS auto-update via signed RSS feed. EdDSA signatures, delta updates.
  Where you'll meet it: Only if you ship a Mac app. OpenClaw's appcast.xml is a working example.

cron expression
  What it solves: 5-field schedule: min hour dom month dow. Use croner on Node — it handles timezones.
  Where you'll meet it: Every "remind me at 8am" feature. Boop and OpenClaw both use croner.

The recommended stack.

Opinionated. This is what I would use, in 2026, after staring at both repos. You can swap any piece later.

DATA LAYER · Convex
Reactive DB, vector index, cron, file storage, all in one. WebSocket subs make the debug UI free. Boop's call.

FRONT DOOR · Next.js 15 (App Router)
Same-origin API routes for channel webhooks. RSC for the dashboard. Vercel-cheap.

BRAIN · Claude Agent SDK + GPT-5.4 fallback
Sonnet for dispatcher + judge, Haiku for proposer + adversary + extract. Wire OpenAI as a fallback for when Anthropic 529s.

TOOLS · MCP + Composio
In-process MCP servers for your own tools (memory, drafts, automations). Composio for everyone else's APIs.

CHANNELS · Telegram → Sendblue → Discord
In that order. Telegram for dev (free, fast). Sendblue for daily-driver (iMessage). Discord for power users.

SANDBOX · E2B (managed) or Docker
For agent-run code. E2B if you don't want to run Docker yourself; OpenClaw's docker setup if you do.

EMBEDDINGS · Voyage AI (1024-dim)
Boop's choice. Cheap, multilingual, decent recall. Swap for OpenAI text-embedding-3-large if you prefer.

OBSERVABILITY · Convex logs + Langfuse
Convex for app state, Langfuse (or Helicone) for LLM traces and cost. Both have free tiers.

What it looks like, end to end.

The reference architecture, drawn from both repos.

flowchart TB
  classDef channel fill:#fde7c4,stroke:#d97706,color:#1a1410,font-weight:600
  classDef agent fill:#e0e7ff,stroke:#6366f1,color:#1a1410,font-weight:600
  classDef store fill:#dcfce7,stroke:#16a34a,color:#1a1410,font-weight:600
  classDef tool fill:#fef3c7,stroke:#a16207,color:#1a1410
  classDef external fill:#f1f5f9,stroke:#64748b,color:#1a1410

  U([👤 You])
  TG[Telegram]:::channel
  SB[Sendblue / iMessage]:::channel
  DC[Discord]:::channel
  WEB[Web Console]:::channel

  GW{{Webhook Router}}
  IA(Interaction Agent · dispatcher · cheap):::agent
  EA(Execution Agent · worker · per-task):::agent
  CON(Consolidator Proposer→Adversary→Judge):::agent
  CV[(Convex DB messages · memory · drafts · automations · agentLogs)]:::store
  VEC[/Voyage embeddings 1024-dim vector index/]:::store
  MEM[memory MCP recall · write]:::tool
  DRA[drafts MCP save · send · reject]:::tool
  AUT[automations MCP cron schedule]:::tool
  COM[Composio MCP 1000+ toolkits]:::tool
  WEB_S[WebSearch / WebFetch]:::tool
  GMAIL[Gmail]:::external
  CAL[Calendar]:::external
  SLK[Slack]:::external
  ETC[...etc]:::external

  U <--> TG & SB & DC & WEB
  TG & SB & DC & WEB --> GW
  GW <--> IA
  IA <--> CV
  IA <--> MEM
  IA <--> DRA
  IA <--> AUT
  IA -- spawn --> EA
  EA <--> CV
  EA <--> WEB_S
  EA <--> COM
  COM <--> GMAIL & CAL & SLK & ETC
  MEM <--> VEC
  CV --- VEC
  CON -. 24h cron .-> CV
  AUT -. 30s tick .-> EA
✓ Do
  • Build the debug UI on day one. Watching tool calls stream is non-negotiable.
  • Two agents from the start: cheap dispatcher, expensive worker. Don't merge them.
  • Stage every external write through a drafts table.
  • Use vector search for memory recall, not LIKE queries.
  • Make memory decay adaptive — important things last longer than chitchat.
  • Steal prompts. Don't write your own dispatcher prompt from scratch — Boop's is battle-tested.
  • Manifest-first: every plugin/channel/tool declares itself in JSON before any code runs.
✗ Don't
  • Don't write your own integrations for Gmail / Slack / Notion. That's what Composio is for.
  • Don't put bash execution in the dispatcher. Ever.
  • Don't ship without an approval round-trip on destructive tools.
  • Don't store API keys in your DB plain-text. Encrypt with a device-keyed AES-256-GCM, like the Mission Control approver pattern.
  • Don't conflate channel auth with user identity. They're different concepts.
  • Don't try to support every model on day one. Pick Claude, ship, swap later.
  • Don't skip the cron table. "Remind me tomorrow" is the killer feature.

Both repos in this comparison are MIT-licensed. You can copy any pattern, any prompt, any schema. The only thing you can't copy is the taste it took to put them together — and the only way to build that is to ship a v1 and live with it for a month.

Now zoom out — all four together

Same DNA. Different hosts.

Stare at OpenClaw, Boop, and Paperclip long enough and you notice something almost embarrassing: they're all the same thing. They differ in skin, ambition, and where they cache state — but underneath, every serious agent platform converges on the same twelve patterns. Here they are, lined up against your existing Nodebase substrate.

Concept · OpenClaw · Boop · Paperclip · Nodebase (today)
The agent abstraction · plugin-sdk contract · Interaction + Execution · Adapter (claude-local …) · Workflow + Standalone Agent
Unit of work · Session / turn · Conversation turn · Issue + heartbeat run · Workflow execution
Tool dispatch · MCP + tool-catalog · In-process MCP servers · Plugin tool dispatcher · Inngest channels (40+)
External I/O · Channels (Telegram …) · Sendblue + iMessage · Web UI + agent webhooks · Workflow triggers + chatbots
Memory / state · Session JSONL · memoryRecords + decay · runtimeState + taskSessions · Convex tables (per feature)
Recurring work · src/cron/ service · croner 30s loop · Routines + triggers · Inngest cron-scheduler
Approval gate · Bash exec approvals · Drafts (staged writes) · Two-stage approvals table · Manual review flows
Audit trail · JSONL session logs · agentLogs table · activity_log (immutable) · Execution.executedNodes
Cost / budget · Per-provider cost ingest · usageRecords table · cost_events + hard-stop · tokenStatus per entity
Plugin contract · openclaw.plugin.json · MCP server registration · paperclip-plugin-* manifest · Node.type enum (closed)
Realtime to UI · WebSocket gateway · Convex subscriptions · heartbeat_run_events stream · Convex subscriptions
Multi-tenant scope · Single-user (local-first) · Single-user (personal) · company_id everywhere · entityId everywhere

The twelve invariants of a serious agent platform.

If your design is missing one, you'll add it later in a panic. Plan it in now.

01 · IDENTITY
Every actor is named. Agent, user, system, plugin — each gets a stable id, and every action is signed by one. No anonymous mutations.

02 · TENANCY
One scope key on every row. entityId / companyId — pick one and put it on every table, every API call, every log line.

03 · ADAPTER
Pluggable execution. A single execute(ctx) → result contract. Local CLI, remote API, child process — same shape, different transport.

04 · MANIFEST
Declare before you run. Every plugin/adapter/tool ships JSON metadata first. Capabilities, schemas, auth needs — discovery without import.

05 · ATOMIC CHECKOUT
One worker, one task. Status transitions are conditional — "move to running iff currently queued". No two agents fight over the same job.

06 · STAGED WRITES
Drafts before sends. External actions land in a drafts / approvals table first. Only the dispatcher promotes them.

07 · LIVE STREAM
Events on every turn. Token-by-token, tool call by tool call. WS, SSE, or Convex subs — pick one and never block on full results.

08 · IMMUTABLE LOG
Mutations are events. An activity_log append-only table. Indexed by (scope, time). Never delete a row, ever.

09 · BUDGET STOP
Cost has a brake. Token use streams into a counter. Cross the line, agent flips to paused with reason. Recovery is manual.

10 · SCHEDULE
Cron is first-class. Triggers (cron, webhook, event) live next to tasks, not bolted on. nextRunAt is computed.

11 · WORKSPACE
Agents run somewhere. Ephemeral or shared, sandboxed or trusted — make the substrate explicit (git worktree / docker / e2b / inline).

12 · MEMORY
State that survives restart. Per-session, per-task, per-tenant. Decay or roll-forward — but never re-derive on every call.
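
Invariant 05 deserves one concrete line of code. In production the compare-and-set is a single conditional UPDATE (or a serializable Convex mutation); this in-memory sketch shows only the shape of the contract, with names of my own choosing:

```typescript
type TaskStatus = "queued" | "running" | "done";

interface Task { id: string; status: TaskStatus; assignee?: string }

// Compare-and-set: move to "running" only if the task is still "queued".
// Returns true iff this worker won the checkout.
function checkout(task: Task, workerId: string): boolean {
  if (task.status !== "queued") return false;
  task.status = "running";
  task.assignee = workerId;
  return true;
}

const task: Task = { id: "tsk_91ab", status: "queued" };
const firstWins  = checkout(task, "agent_a"); // true: queued → running
const secondLoses = checkout(task, "agent_b"); // false: already running
```

The key property is that the status check and the transition are one atomic step at the storage layer; if you read the status first and write it in a second round-trip, two agents will eventually grab the same job.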
The protocol

AOP/1 — your own wire spec.

Agent Orchestration Protocol, version 1. A small, opinionated spec that gives you OpenClaw-grade plumbing in roughly 600 lines of TypeScript. JSON-RPC 2.0 for control, SSE for streams, JSON Schema for everything else. No new wheels.

Built on standards your team already knows. Designed so a third-party plugin written today still works in three years.

GOAL 1 · Boring on the wire
JSON-RPC 2.0 + SSE. Curl-debuggable. No custom binary.
GOAL 2 · Manifest-first
Every adapter/tool/channel registers JSON before code loads.
GOAL 3 · Tenant-shaped
entityId is required on every method. No exceptions.
GOAL 4 · Streamable by default
Every long call has a paired .stream SSE endpoint.

1. Transport

Two endpoints. Nothing else.

# Control plane — synchronous JSON-RPC 2.0
POST /api/aop/v1/rpc
Content-Type: application/json
X-Entity-Id: ent_8h2k...

# Streaming — Server-Sent Events
GET  /api/aop/v1/stream?runId=run_abc&cursor=0
Accept: text/event-stream

2. Method namespaces

Twelve namespaces map one-to-one onto the twelve invariants above. If you find yourself wanting a thirteenth, you're either doing app code or duplicating one of these.

agent.*
Define, list, configure, pause, retire agents.
agent.create · agent.update · agent.pause · agent.list
task.*
Single-assignee work units. Atomic checkout.
task.create · task.checkout · task.complete · task.list
run.*
Heartbeat invocation; the actual work happening.
run.start · run.cancel · run.get · run.events (SSE)
tool.*
List + invoke registered tools (your existing connectors).
tool.list · tool.describe · tool.call
draft.*
Staged writes. Worker drafts, dispatcher sends.
draft.save · draft.list · draft.send · draft.reject
approval.*
Two-stage gate for sensitive ops.
approval.request · approval.resolve · approval.list
memory.*
Recall + write durable facts with decay.
memory.recall · memory.write · memory.consolidate
routine.*
Recurring work. Cron + webhook + event triggers.
routine.create · routine.fire · routine.list
workspace.*
Realize/teardown sandbox env per run.
workspace.realize · workspace.release
budget.*
Policy + observed spend + hard-stop status.
budget.set · budget.status · budget.events (SSE)
activity.*
Append-only audit. Read-only externally.
activity.list · activity.tail (SSE)
plugin.*
Manifests, install, enable, configure.
plugin.register · plugin.enable · plugin.list

3. Wire shapes — by example

→ Start a run

{
  "jsonrpc": "2.0",
  "id": "req_01",
  "method": "run.start",
  "params": {
    "entityId": "ent_8h2k",
    "agentId": "agt_writer_42",
    "taskId": "tsk_91ab",           // optional — null = ad-hoc
    "input": {
      "prompt": "Draft a reply to alice@example.com",
      "toolNamespaces": ["gmail", "calendar"]
    },
    "sessionId": "sess_prev_xyz",    // optional — for stateful adapters
    "workspaceMode": "isolated",
    "invocationSource": "on_demand"      // or routine_trigger | wakeup | task_assignment
  }
}

← Response (immediate; run is async)

{
  "jsonrpc": "2.0",
  "id": "req_01",
  "result": {
    "runId": "run_a1b2c3",
    "status": "queued",
    "streamUrl": "/api/aop/v1/stream?runId=run_a1b2c3",
    "createdAt": "2026-04-29T14:32:11Z"
  }
}

← SSE stream (one event per line, monotonic seq)

event: state
data: {"runId":"run_a1b2c3","status":"running","seq":1}

event: token
data: {"seq":2,"text":"Looking up alice's last email…"}

event: tool_call
data: {"seq":3,"tool":"gmail.search","args":{"q":"from:alice"}}

event: tool_result
data: {"seq":4,"tool":"gmail.search","ok":true,"sizeBytes":1432}

event: draft
data: {"seq":5,"draftId":"drf_77","kind":"gmail.reply","summary":"Reply to alice"}

event: state
data: {"seq":6,"status":"succeeded","usage":{"inTok":2104,"outTok":387,"costCents":14}}
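
A consumer of that stream should enforce the monotonic seq itself. Here is a sketch of a parser for the text/event-stream framing above (hand-rolled for illustration; a real browser client would lean on EventSource instead):

```typescript
interface AopEvent { event: string; data: { seq?: number; [k: string]: unknown } }

// Parse "event:"/"data:" pairs separated by blank lines, rejecting out-of-order seq.
function parseStream(raw: string): AopEvent[] {
  const events: AopEvent[] = [];
  let lastSeq = 0;
  for (const block of raw.split("\n\n")) {
    const ev = /event: (.+)/.exec(block);
    const da = /data: (.+)/.exec(block);
    if (!ev || !da) continue; // ignore comments / keep-alives
    const data = JSON.parse(da[1]);
    if (typeof data.seq === "number") {
      if (data.seq <= lastSeq) throw new Error(`out-of-order seq ${data.seq}`);
      lastSeq = data.seq;
    }
    events.push({ event: ev[1], data });
  }
  return events;
}

const sample = [
  'event: state\ndata: {"runId":"run_a1b2c3","status":"running","seq":1}',
  'event: token\ndata: {"seq":2,"text":"Looking up alice\'s last email…"}',
  'event: state\ndata: {"seq":3,"status":"succeeded"}',
].join("\n\n");

const parsed = parseStream(sample);
```

Tracking lastSeq also gives you resumption for free: on reconnect, pass it as the cursor query param and the server replays only what you missed.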

4. The plugin manifest

A single aop.plugin.json at the root. The runtime never imports a plugin until it's read this file.

{
  "id": "linear",
  "name": "Linear connector",
  "version": "1.4.0",
  "aopVersion": "1",
  "capabilities": ["tools", "routines"],
  "adapters": [],                       // would list custom agent runners
  "tools": [
    {
      "id": "linear.search-issues",
      "description": "Search Linear issues by query.",
      "parameters": { /* JSON Schema */ },
      "sideEffects": "read",             // "read" | "write" | "destructive"
      "requiresApproval": false
    }
  ],
  "routines": [
    { "id": "linear.daily-digest", "defaultCron": "0 9 * * 1-5" }
  ]
}
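
The runtime's admission gate against that manifest fits in a few lines. A sketch (field names follow the example above; the specific policy of forcing approval on destructive tools is an illustrative choice, not part of the spec):

```typescript
interface AopToolDecl {
  id: string;
  sideEffects: "read" | "write" | "destructive";
  requiresApproval: boolean;
}

interface AopManifest {
  id: string;
  aopVersion: string;
  capabilities: string[];
  tools: AopToolDecl[];
}

const SUPPORTED_MAJOR = "1";

// Refuse plugins targeting a future major; force approval on destructive tools.
function admit(manifest: AopManifest): { ok: boolean; tools: AopToolDecl[] } {
  if (manifest.aopVersion !== SUPPORTED_MAJOR) return { ok: false, tools: [] };
  const tools = manifest.tools.map((t) =>
    t.sideEffects === "destructive" ? { ...t, requiresApproval: true } : t
  );
  return { ok: true, tools };
}

const linear: AopManifest = {
  id: "linear",
  aopVersion: "1",
  capabilities: ["tools", "routines"],
  tools: [{ id: "linear.search-issues", sideEffects: "read", requiresApproval: false }],
};

const admitted = admit(linear);
```

All of this runs before a single line of plugin code is imported, which is the entire point of manifest-first.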

5. Lifecycle — one task, end to end

sequenceDiagram
  autonumber
  actor U as User
  participant CH as Channel
  participant DP as Dispatcher Agent
  participant CP as AOP Control Plane
  participant EX as Execution Agent
  participant TL as Tool (Linear)
  participant DB as Activity Log

  U->>CH: "Find blocked issues, ping owners"
  CH->>CP: rpc task.create
  CP->>DB: activity.append (task.created)
  CH->>CP: rpc run.start (agent=dispatcher)
  CP-->>CH: { runId, streamUrl }
  CH->>CP: SSE /stream?runId=...

  CP->>DP: invoke(adapter)
  DP->>CP: rpc run.start (agent=execution)
  CP->>EX: invoke(adapter, tools=[linear,slack])
  EX->>TL: tool.call linear.search-issues
  TL-->>EX: result
  EX->>CP: rpc draft.save (kind=slack.message)
  EX-->>CP: state=succeeded
  DP->>CP: rpc draft.send drf_77
  CP->>DB: activity.append (draft.sent)
  CP-->>CH: SSE token / draft / state events
  CH-->>U: "Pinged 3 owners — see thread."
      

6. Error model

JSON-RPC error codes plus a closed string union for the categories your code branches on. Never branch on free-form strings.

{
  "code": -32001,
  "message": "Budget exceeded",
  "data": {
    "reason": "budget.hard_stop",        // closed union
    "scope": { "type": "agent", "id": "agt_writer_42" },
    "limit": 100000,
    "observed": 100412
  }
}

Closed union for data.reason: budget.hard_stop · approval.required · tool.unavailable · workspace.unavailable · tenant.scope_violation · adapter.timeout · checkout.conflict.
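
On the client, that closed union becomes a compiler-checked switch. A sketch (the reason strings are from the list above; the retry policy itself is an illustrative choice):

```typescript
type AopErrorReason =
  | "budget.hard_stop"
  | "approval.required"
  | "tool.unavailable"
  | "workspace.unavailable"
  | "tenant.scope_violation"
  | "adapter.timeout"
  | "checkout.conflict";

// One exhaustive switch; free-form error strings never reach branching code.
function retryPolicy(reason: AopErrorReason): "retry" | "wait-for-human" | "fail" {
  switch (reason) {
    case "adapter.timeout":
    case "checkout.conflict":
      return "retry";            // transient: safe to try again
    case "approval.required":
    case "budget.hard_stop":
      return "wait-for-human";   // needs an explicit unblock
    case "tool.unavailable":
    case "workspace.unavailable":
    case "tenant.scope_violation":
      return "fail";             // configuration or security problem
    default: {
      const exhaustive: never = reason; // compile error if a reason is added but unhandled
      return exhaustive;
    }
  }
}
```

The `never` default is the enforcement mechanism: add an eighth reason to the union and every switch that forgot to handle it stops compiling.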

7. Versioning

Path-versioned (/v1). Additive within a major. New methods → fine. New required param on existing method → bumps to /v2. Manifests carry aopVersion so the runtime can refuse plugins targeting a future major.

Source-code archaeology

How tools, memory, hierarchy, and connectors actually work in OpenClaw, Boop, and Paperclip.

Read-only audit of the three reference codebases. Every claim below is backed by a real file path. Where the three diverge, Brigade picks a side and keeps everything in one Postgres — no JSONL transcripts on disk, no “sessions folder” on a laptop, no per-machine memory. If it isn’t in the database, it doesn’t exist.

01 · Tools / capabilities
A profile-gated catalog, lazily filled by MCP.

OpenClaw: static catalog of typed tools (AgentToolWithMeta) in src/agents/tool-catalog.ts, scoped by ToolProfileId (minimal / coding / messaging / full). MCP servers are dynamically materialized at session start in src/agents/pi-bundle-mcp*.ts and injected into the same catalog. Sandbox + approvals layered separately (src/agents/tool-policy.ts, src/infra/exec-approvals.ts).

Boop: Composio is the entire tool plane. server/composio.ts + buildMcpServersForIntegrations() mints a per-run MCP server containing only the toolkits an agent is authorized for (Gmail, Slack, Linear, etc.). No tokens ever live in Boop — Composio holds them.

Paperclip: tools are LLM adapters, not SaaS connectors. packages/adapters/ exports ./server, ./ui, ./cli per agent runtime (Claude, Codex, Cursor). The adapter is the tool surface; integrations sit one layer below.

Brigade does: tools as DB rows (tool_definitions), MCP servers materialized lazily but their schemas cached in Postgres so the gateway can authorize calls without a round-trip. Composio for SaaS, Vercel AI SDK for LLMs.
02 · Memory
Tiered, segmented, embedded, and consolidated.

OpenClaw: markdown files on disk in ~/.openclaw/memory/ + LanceDB vectors via src/plugin-sdk/memory-lancedb.ts. Hybrid lexical (FTS) + semantic search via ResolvedMemorySearchConfig. Sessions and memory are searchable as separate sources. Per-laptop, not per-tenant.

Boop: the most ambitious of the three. convex/memoryRecords.ts defines tier (short / long / permanent), segment (identity / preference / correction / relationship / project / knowledge / context), importance, decayRate, accessCount, supersedes[], and a 1024-dim embedding. convex/consolidation.ts runs an adversarial three-model consensus (proposer → adversary → judge) to merge near-duplicate memories. Every memory mutation is logged to memoryEvents for audit.

Paperclip: no real memory subsystem. agentLogs is per-run text. Memory is an open seam, not a feature.

Brigade does: port Boop’s tiered+segmented model to Postgres + pgvector, scope every row by company_id, run consolidation as an Inngest cron, and write every mutation to a tamper-chained memory_events table.
03 · Hierarchy
A tree of agents who report to agents.

OpenClaw: single-parent subagent runs only. src/agents/subagent-registry.ts + SubagentRunRecord track parent → child with a hard depth limit. Every spawn is in-memory + persisted into the JSONL session file. No org chart. No roles.

Boop: flat. One human, one assistant, scheduled automations. No notion of manager / report.

Paperclip: the right answer. packages/db/src/schema/agents.ts has reportsTo as a nullable self-FK, plus role, title, budgetMonthlyCents, spentMonthlyCents. CEO has reportsTo = null. Hires require approval. Routes enforce company_id on every read.

Brigade does: Paperclip’s reportsTo tree, but make every node a real database row that can spawn an OpenClaw-style subagent run, with a depth limit and a parent-FK so the activity log can replay the whole company.
04 · State / persistence
Disk vs document DB vs Postgres.

OpenClaw: pure filesystem. ~/.openclaw/openclaw.json for config, ~/.openclaw/sessions/agent_<id>/<session>.jsonl for transcripts, ~/.openclaw/credentials/ for hashed keys, ~/.openclaw/device-identity.json for the RSA pairing keypair. Append-only JSONL is the audit log. No database at all.

Boop: Convex documents end-to-end. Every message, draft, automation, memory, usage record, agent log lives in a Convex table. Reactivity is automatic — the UI subscribes via useQuery and re-renders.

Paperclip: Postgres via Drizzle (packages/db/src/schema/*.ts), with embedded PGlite as a dev fallback. Every domain entity has company_id. Schema migrations are first-class.

Brigade does: Postgres only. Sessions, transcripts, memories, approvals, audit, secrets — all rows. Disk is a cache for cold artifacts (uploaded files), nothing else. If your laptop dies, you lose nothing.
05 · Connectors
MCP for shape, Composio for OAuth, secrets in encrypted rows.

OpenClaw: MCP everywhere. src/mcp/channel-shared.ts handles stdio + HTTP transports; src/mcp/plugin-tools-serve.ts turns each plugin into an MCP server. OAuth state machine lives in src/agents/chutes-oauth.ts + src/agents/auth-profiles*.ts with credentials encrypted via the device key.

Boop: Composio is the connector layer. Toolkits authorized via OAuth on platform.composio.dev, surfaced through MCP. The integrations: ["gmail"] array on a run is enough to mint the right MCP server.

Paperclip: connectors are agent adapters, not SaaS. Credentials live as env vars or in company_secrets; there’s no first-class OAuth flow.

Brigade does: MCP as the universal tool transport, Composio as a managed OAuth multiplexer for the long tail of SaaS, and a company_secrets table for tokens that customers want to hold themselves (BYOK). Each row is encrypted with a per-company KMS key + scoped to a single agent.
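A sketch of the per-row encryption, assuming AES-256-GCM. In production the data key would itself be wrapped by the per-company KMS key (`kms_key_arn` on the companies row); here a local buffer stands in.

```typescript
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

// Sketch of the company_secrets row format: iv + auth tag + ciphertext,
// base64-encoded into one opaque encrypted_value column.
function encryptSecret(plaintext: string, companyKey: Buffer): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", companyKey, iv);
  const ct = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag();
  return Buffer.concat([iv, tag, ct]).toString("base64");
}

function decryptSecret(encoded: string, companyKey: Buffer): string {
  const buf = Buffer.from(encoded, "base64");
  const iv = buf.subarray(0, 12);
  const tag = buf.subarray(12, 28);
  const ct = buf.subarray(28);
  const decipher = createDecipheriv("aes-256-gcm", companyKey, iv);
  decipher.setAuthTag(tag); // wrong key or tampered ciphertext fails here
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString("utf8");
}
```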
06 · Gateway / transport
A thin router with a typed protocol.

OpenClaw: WebSocket + JSON-RPC 2.0, schema in src/gateway/protocol/schema/*. Multiplexes by channel id + session key. Pairing via RSA device identity stored in device-identity.json; scope upgrades require re-pair. Stateless — the gateway itself owns no business state.

Boop: no gateway. Convex client is the transport; queries / mutations are RPC. Real-time falls out of the document model for free.

Paperclip: plain Express REST + Vite UI. Real-time is optional SSE; mostly request/response.

Brigade does: JSON-RPC 2.0 over WSS for control + SSE for streams, schema generated from a Markdown-first spec (AOP/1). Authorization is per-call, scoped to (company_id, agent_id, capability). Postgres LISTEN/NOTIFY fans events out to long-lived SSE consumers without a separate broker.
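A minimal sketch of that per-call authorization: a JSON-RPC 2.0 dispatcher with a capability gate. Method names and capability strings are illustrative, not the final AOP/1 spec.

```typescript
interface RpcRequest { jsonrpc: "2.0"; id: number; method: string; params?: unknown; }
interface RpcResponse { jsonrpc: "2.0"; id: number; result?: unknown; error?: { code: number; message: string }; }
interface CallContext { companyId: string; agentId: string; capabilities: Set<string>; }

type Handler = (params: unknown, ctx: CallContext) => unknown;

const methods = new Map<string, { capability: string; handler: Handler }>();

// Illustrative method registration; the real surface is one file per method.
methods.set("run.start", {
  capability: "runs:write",
  handler: (_params, ctx) => ({ runId: `run_${ctx.agentId}_1`, status: "queued" }),
});

function dispatch(req: RpcRequest, ctx: CallContext): RpcResponse {
  const entry = methods.get(req.method);
  if (!entry)
    return { jsonrpc: "2.0", id: req.id, error: { code: -32601, message: "Method not found" } };
  if (!ctx.capabilities.has(entry.capability)) // scoped to (company_id, agent_id, capability)
    return { jsonrpc: "2.0", id: req.id, error: { code: -32001, message: `missing capability ${entry.capability}` } };
  return { jsonrpc: "2.0", id: req.id, result: entry.handler(req.params, ctx) };
}
```

`-32601` is the standard JSON-RPC "method not found" code; `-32001` is in the server-defined range, used here for authorization failures.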
07 · Drafts & approvals
Three strategies. Brigade combines them.

OpenClaw: in-memory approval queue with structured payloads (ExecApprovalRequestPayload, SystemRunApprovalBinding). Routed to a designated approver session. Audit lives inline in the JSONL transcript.

Boop: drafts are the approval mechanism. convex/drafts.ts stores draftId, kind, summary, payload, status. External actions (send email, post to Slack) flow through draft → human review → send. Simple and bulletproof.

Paperclip: first-class approvals table with type, requested_by_agent_id, payload, status, decided_by_user_id, decided_at. Hires and CEO proposals require approval before they take effect.

Brigade does: all three. Drafts (Boop) for outbound side-effects, structured approval payloads (OpenClaw) for tool execs, an approvals table (Paperclip) for org changes — with n-of-m, role-gated, time-windowed policies as runtime objects.
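Those n-of-m, role-gated, time-windowed policies reduce to one pure function over decision rows. A sketch with illustrative field names:

```typescript
// Approval policy as a runtime object: n approvals from allowed roles,
// inside the decision window, with any single rejection acting as a veto.
interface ApprovalPolicy { n: number; allowedRoles: string[]; windowMs: number; }
interface Decision { userId: string; role: string; decision: "approve" | "reject"; at: number; }

function evaluate(
  policy: ApprovalPolicy,
  requestedAt: number,
  decisions: Decision[],
): "approved" | "rejected" | "pending" {
  const inWindow = decisions.filter(
    (d) => d.at - requestedAt <= policy.windowMs && policy.allowedRoles.includes(d.role),
  );
  if (inWindow.some((d) => d.decision === "reject")) return "rejected"; // any veto wins
  const approvers = new Set(inWindow.filter((d) => d.decision === "approve").map((d) => d.userId));
  return approvers.size >= policy.n ? "approved" : "pending"; // distinct users, not raw clicks
}
```

Stored as `policy jsonb` on the approvals row, this evaluates identically on every read, which is what makes the decision replayable from the audit log.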
08 · Activity / audit
An append-only ledger that chains by hash.

OpenClaw: JSONL session transcripts are the audit log. Append-only, atomic-write, but per-laptop and not queryable as a fleet.

Boop: usageRecords is the cost ledger; memoryEvents is memory’s ledger; agentLogs is the runtime log. Three different append-only tables for three different concerns.

Paperclip: one activity_log table with actor_type, actor_id, action, entity_type, entity_id, details JSONB, scoped by company_id. Every mutating route calls logActivity(). Replay = SELECT by company + time range.

Brigade does: Paperclip’s shape, plus a previous_log_hash column so the ledger is tamper-evident; SIEM export is COPY (SELECT ... ORDER BY id) TO STDOUT. Postgres triggers reject UPDATE/DELETE on the table outright.
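The chain itself is small. A sketch of the append and verify logic that the `previous_log_hash` and `this_log_hash` columns encode: each row hashes its payload together with the previous row's hash, so editing any historical row breaks every hash after it.

```typescript
import { createHash } from "node:crypto";

interface LogRow { action: string; details: string; previousHash: string; thisHash: string; }

function hashRow(action: string, details: string, previousHash: string): string {
  return createHash("sha256").update(`${previousHash}|${action}|${details}`).digest("hex");
}

function append(log: LogRow[], action: string, details: string): void {
  const previousHash = log.length ? log[log.length - 1].thisHash : "genesis";
  log.push({ action, details, previousHash, thisHash: hashRow(action, details, previousHash) });
}

// A SIEM (or auditor) re-derives every hash from "genesis"; any mismatch
// pinpoints the first tampered row.
function verifyChain(log: LogRow[]): boolean {
  let prev = "genesis";
  for (const row of log) {
    if (row.previousHash !== prev || row.thisHash !== hashRow(row.action, row.details, prev)) return false;
    prev = row.thisHash;
  }
  return true;
}
```

In Brigade the hashing runs in a Postgres trigger on insert; this sketch is the verification side a SIEM export consumer would run.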

Brigade’s day-one schema.

Combine Paperclip’s company tree, Boop’s memory tiers, and OpenClaw’s approval payloads. Every row is scoped by company_id and protected by Postgres row-level security. Nothing important lives outside this schema.

-- 01 · Company & identity
companies         (id, name, status, kms_key_arn, budget_monthly_cents, created_at)
users             (id, email, name, last_login_at) -- board operators
company_members   (company_id, user_id, role, scopes[]) -- M:N w/ RBAC

-- 02 · The org chart
agents            (id, company_id, name, reports_to, role, title,
                   adapter_type, adapter_config jsonb, permissions jsonb,
                   budget_monthly_cents, spent_monthly_cents,
                   last_heartbeat_at, status, created_at)
agent_api_keys    (id, company_id, agent_id, key_hash, last_used_at)

-- 03 · Work
issues            (id, company_id, parent_id, assignee_agent_id, title,
                   description, status, priority, checkout_run_id,
                   execution_run_id, billing_code, created_at)
issue_comments    (id, company_id, issue_id, author_agent_id, content)
heartbeat_runs    (id, company_id, agent_id, status, started_at, finished_at,
                   usage_json jsonb, result_json jsonb)

-- 04 · Memory (tiered, segmented, embedded)
memory_records    (id, company_id, agent_id, content, tier, segment,
                   importance, decay_rate, access_count, last_accessed_at,
                   embedding vector(1024), lifecycle, supersedes uuid[])
memory_events     (id, company_id, memory_id, kind, details jsonb, created_at)
memory_consolidations (id, company_id, run_id, status, decisions_json jsonb)

-- 05 · Tools, connectors, secrets
tool_definitions  (id, company_id, mcp_server_id, name, schema_json jsonb)
mcp_servers       (id, company_id, kind, transport, config jsonb)
integrations      (id, company_id, provider, composio_account_id, scopes[])
company_secrets   (id, company_id, name, encrypted_value, rotated_at)

-- 06 · Approvals & drafts
approvals         (id, company_id, type, requested_by_agent_id, policy jsonb,
                   payload jsonb, status, decided_by_user_id, decided_at)
approval_decisions(id, approval_id, user_id, decision, reason, created_at)
drafts            (id, company_id, agent_id, kind, summary, payload jsonb,
                   status, created_at, decided_at)

-- 07 · Money
cost_events       (id, company_id, agent_id, issue_id, run_id, source,
                   cost_cents, tokens_in, tokens_out, created_at)
budget_incidents  (id, company_id, agent_id, reason, paused_at, resumed_at)

-- 08 · Audit (tamper-chained, append-only)
activity_log      (id BIGSERIAL, company_id, actor_type, actor_id, action,
                   entity_type, entity_id, details jsonb,
                   previous_log_hash bytea, this_log_hash bytea, created_at)
-- BEFORE UPDATE OR DELETE TRIGGER → RAISE EXCEPTION

-- 09 · Real-time
-- LISTEN/NOTIFY channels: company_<id>_runs / _approvals / _drafts
-- SSE workers fan out to UI; Inngest cron drives consolidation + budgets
The single biggest delta vs OpenClaw: nothing is local.

OpenClaw stores transcripts as JSONL on the laptop running the CLI. Lose the laptop, lose the company memory. Brigade stores every message, tool call, memory event, draft, approval, and cost record as a Postgres row scoped by company_id. The CLI is just a renderer; the database is the company.

That single change unlocks: real multi-tenancy (RLS), tamper-evident audit (hash chain trigger), fleet-wide observability (one SELECT over all agents), zero-laptop disaster recovery (any new device pairs in & sees the full history), and BYOK encryption at rest (per-company KMS keys on memory_records, company_secrets, drafts.payload).

The launch

A new flagship, not a feature.

The orchestration layer is — honestly — not the hard part. The hard part is making it credibly enterprise. Buyers don’t want another “agentic platform.” They want one their CISO will sign on, their CFO can cap, and their auditors can replay. That product doesn’t exist yet. Below is the brief for shipping it.

Working name · brigade.run

Brigade. The agent control plane
for regulated enterprises.

Durable—like Temporal. Observable—like LangSmith. Governed—like Salesforce or ServiceNow. Vendor-neutral—like AgentCore. First product to combine all four. Cloud, VPC, and on-prem parity from line one. MCP, A2A, and OpenTelemetry GenAI native. Drafts before sends. Hard‑stop budgets. Approvals as runtime objects, not callbacks.

$50k — Team floor / yr
$150k — Enterprise floor / yr
F500 IT/DevOps — Beachhead ICP
BYO-LLM — OpenAI · Anthropic · Bedrock · Vertex · Azure · vLLM
vs · LangSmith Deployment (née LangGraph Platform)
Best place to write an agent ≠ best place to run one.

LangChain’s authoring stack is unrivalled, and LangSmith Deployment (the platform formerly known as LangGraph Platform) is great for hosted runs. Brigade is unrivalled for running in regulated production: Temporal-grade durability, true VPC + air-gap parity, BYO-LLM with no SaaS-only escape hatch, HIL approvals as a first-class runtime object — not a callback you wire up yourself. We import LangGraph agents natively.

vs · CrewAI Enterprise (AMP)
Crews are great for prototypes. Brigade is the substrate for prod.

CrewAI nails the “crew of roles” metaphor for prototyping. Brigade is what those crews need when they have to survive a redeploy, retry across hours, satisfy a CISO, and run in your own VPC against your own LLM provider. Keep the framework. Gain the runtime.

vs · OpenAI AgentKit
If you can bet on OpenAI — AgentKit. If you can’t — Brigade.

AgentKit is the right answer if you’ve standardized on OpenAI. Brigade is the right answer if you can’t make that bet — because of BAAs, EU residency, model-supply risk, or a CISO who needs every action loggable and replayable on infrastructure you control.

The wedge.

One vertical first. Fastest cycle, lowest clinical/regulatory risk, easiest reference logos. F500 IT & DevOps copilots — incident triage, change‑management, on‑call augmentation, runbook automation. ServiceNow owns inside ServiceNow data; nobody owns the cross‑tool, audited, durable, hard‑stop‑budgeted runtime that calls Datadog + GitHub + AWS + Jira + ServiceNow + PagerDuty in one trace. That’s where Brigade lands first.

ICP · Primary

Series C–D AI‑native co’s

200–2000 employees, building an internal AI platform team. Outgrew LangChain OSS, fear AgentKit lock‑in, must ship to their own customers under their own SOC 2.

ICP · Beachhead

F500 mid‑market, $1B–$10B

FinServ, insurance, healthcare, public sector, pharma. Currently evaluating Salesforce Agentforce / ServiceNow / IBM watsonx. Need a vendor‑neutral runtime to span those silos — on private infra.

ICP · Channel

SI & consultancies

Slalom, Avanade, BCG X, Deloitte, EY. They need a deployment‑grade platform under their delivery work. White‑label tier from day one.

Name candidates.

Domain + USPTO/EUIPO clearance still required before commit. Prefer single‑word stems with no “AI” / “agent” suffix.

Brigade (primary) · Marshalt · Praxis · Atrium · Cohort · Rollcall · Ledger
Built for enterprise from line one

The 18 things procurement actually checks.

In 2026 the bar isn’t “we passed SOC 2.” It’s that an agent platform must look indistinguishable from a mature data‑plane vendor (Snowflake, Datadog, Okta) on every governance axis — plus a layer of agent‑specific controls (egress, sandboxing, prompt‑injection containment, evals) that didn’t exist on the checklist three years ago. Brigade ships flagship‑grade on six of these and table‑stakes on the rest.

Controls, compared across five platforms: Brigade · LangSmith Deployment · OpenAI AgentKit · AWS AgentCore · CrewAI AMP
1 · SAML 2.0 SSO + SCIM provisioning
2 · RBAC + ABAC with custom roles
3 · Tamper‑evident immutable audit log + SIEM export
4 · BYOK / customer‑managed encryption keys (CMK)
5 · Data residency — EU + US + UK + AU + JP day‑one
6 · PII · DLP redaction at the model gateway
7 · On‑prem / VPC / air‑gap parity with cloud
8 · SOC 2 Type II + ISO 27001 + HIPAA + FedRAMP Mod
9 · OpenTelemetry GenAI semconv native export
10 · Model gateway with BYO‑LLM — OpenAI / Anthropic / Bedrock / Vertex / Azure / vLLM
11 · Prompt‑injection defense (CaMeL‑style capabilities, post‑tool‑call validator)
12 · Per‑agent egress allow‑listing + microVM tool sandbox
13 · Hard‑stop cost guardrails — org / tenant / agent / user
14 · Approval workflows as first‑class runtime objects (n‑of‑m, role‑gated, time‑windowed)
15 · Eval suite — offline regression + online grading + CI gate
16 · Retention + GDPR Art. 17 right‑to‑delete with crypto‑shred
17 · Network controls — PrivateLink / mTLS / IP allow‑list
18 · 24∕7 enterprise support with named CSM + SA + SLA credits
Legend: Flagship (how Brigade wins the RFP) · Supported at parity · Roadmap or partial · × Not available
SOC 2 Type II · ISO 27001:2022 · ISO 42001:2023 (AI mgmt) · HIPAA BAA · GDPR Art. 17 · EU AI Act ready · FedRAMP Moderate · OTel GenAI 1.41 · MCP v2025-11-25 · A2A native · AGNTCY member (40+)
The product, drawn

Linear’s craft. Stripe’s data discipline. Anthropic’s restraint.

Light theme. Single warm amber accent against a serious navy ink. Inter + JetBrains Mono. Hairline borders, no glassmorphism. Side‑panel drawers, not page nav. Command palette‑first. Monospace IDs everywhere. Tabular figures. Inline diffs with approval. Boring on purpose — until something demands attention.

Live agent run

A 200‑step run that an executive can read. Two‑pane: timeline left, current step right. Diffs for state changes are first‑class — approve in‑line, never via modal.

[UI mock · live agent run — two-pane view. Left: step timeline (Plan → Read CRM → Search KB → Compose → tool: gmail → Approve) with a live footer (cost $0.412 · tokens 12.4k · latency 14.2s). Right: step detail for llm.call · claude‑opus‑4.7 (4,213 in / 612 out · 1.8s · $0.073), input/output panes, a guardrail note (email‑redaction · 1 redaction applied), and a proposed diff (ticket.status: new → pending‑customer) with Approve / Reject / Always allow. Footer: trace_id tr_9f2c1e · agent v1.4.2 · region eu‑west‑1 · idp Okta.]

Fleet overview

KPIs · live ticker of agents currently working · dense table sorted by recency · incidents pinned. The same primitive repeats at every level — Fleet → Agent → Run → Step.

Agents: 12 (9 healthy · 3 active) · Runs (24h): 1,284 (▲ 14% vs prev) · Success: 99.94% (p95 4.2s) · Spend (24h): $312.47 (▲ 6% vs prev)
[UI mock · fleet overview — live ticker of 3 active runs (support‑triage reading inbox · fleet‑watcher pinging service · lead‑qual awaiting approval); agents table with name, status, last run, p95, 24h spend, and owner for support‑triage, lead‑qual, fleet‑watcher, contract‑reviewer; pinned incidents (anthropic rate‑limit in eu‑west‑1, acknowledged · lead‑qual at 90% budget, open).]

Three bold UI moves no competitor ships.

Move 01

Run‑as‑narrative.

Instead of a tree of tool calls, the run reads like a newsroom timeline: large monospace timestamp, an icon‑led headline (“Read your inbox”, “Drafted reply to Bob”), a 1‑line summary, fold‑down for the payload. A 200‑step run becomes legible to a non‑engineer executive — the hidden buyer.

Move 02

Live‑thinking ambient strip.

A 24px persistent strip at the top of every page showing what your agents are currently doing, ticker‑style: agent.support‑triage · reading email · 2.3s. Always‑on awareness without dashboard fatigue. Bloomberg has done this for finance for 30 years; nobody has done it for agents.

Move 03

Semantic state diff.

When an agent modifies a CRM record, a Linear issue, a calendar event — show the change as a git‑style colorized diff inside the timeline, not as a tool‑call payload. Inline accept · reject · always‑allow. Turns review into something humans actually do rather than skip.

From wedge to platform

Land in DevOps. Expand to the F500.

Six milestones. Each independently shippable behind a feature flag. Built on the existing SaaS substrate — auth, billing, multi‑tenancy, and 40+ connectors come from sister‑product plumbing — so the team focuses entirely on the agent layer.

Substrate

Brigade is a separate brand and a separate dashboard, but it shares the underlying SaaS plumbing of the parent platform — Postgres + Prisma, Convex realtime, Inngest jobs, Better Auth + SAML, multi‑tenant entityId isolation, and 40+ connector channels. That gives the launch team an 18‑month head start. No payments rebuild, no auth rebuild, no connector rebuild. The new code is the agent runtime, the protocol surface, the governance controls, and the UI — the parts customers actually pay for.

M1
Wire the rails.

Prisma schema (Agent, AgentRun, AgentDraft, AgentApproval, AgentActivity), Convex mirrors, the /api/aop/v1/rpc + /stream routes, manifest loader. Outcome: run.start an agent that echoes its prompt back via SSE, end‑to‑end audited.

M2
First adapter — Anthropic on Bedrock.

Wrap @ai-sdk/anthropic on AWS Bedrock for HIPAA‑BAA day one. Stream tokens to SSE, persist usage, OTel GenAI semconv. Outcome: a real Claude agent visible in the dashboard with live transcript — signed by your auditor.

M3
The DevOps wedge tools.

Datadog, GitHub, AWS, Jira, ServiceNow, PagerDuty, Slack — wrapped as MCP‑native tools with per‑agent egress allow‑listing. Outcome: the “incident‑triage copilot” demo that lands a paid pilot at three F500 IT shops.
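The allow-list check is the cheap half of that control. A minimal sketch (hostnames are illustrative; the microVM sandbox enforces the same list at the network layer):

```typescript
// Before a tool call leaves the sandbox, the target host must match the
// agent's egress allow-list. Suffix matching covers per-service subdomains
// (api.github.com under github.com) without allowing lookalikes.
function hostAllowed(url: string, allowList: string[]): boolean {
  const host = new URL(url).hostname;
  return allowList.some((allowed) => host === allowed || host.endsWith(`.${allowed}`));
}
```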

M4
Governance flagship — drafts, approvals, hard‑stops, capabilities.

Drafts before sends. n‑of‑m approvals as runtime objects. Hard‑stop budgets with break‑glass. CaMeL‑style capability tokens on tool outputs. Outcome: the demo CISOs say yes to.

M5
Memory + routines + evals + observability.

Vector memory with adaptive decay (Boop’s formula). Routines on the existing cron substrate. Eval suite with CI gate (LangSmith‑parity grading + golden datasets). OTel GenAI traces visible in customer Datadog/Grafana. Outcome: “swap the model” becomes a CI check, not a vibe.

M6
Self‑host + plugin marketplace + multi‑agent.

Helm + Terraform + Replicated installer. Air‑gap bundle. Open the manifest loader to third‑party packages. Agent‑spawns‑agent (Boop’s interaction/execution split). Outcome: Brigade is a platform, not a feature — ready for SI channel + F500 expansion.

The whole picture.

Brigade orchestrates. The substrate provides plumbing. Customers see one product.

flowchart LR
  classDef brigade fill:#fde7c4,stroke:#d97706,color:#1a1410,font-weight:700
  classDef substrate fill:#dbeafe,stroke:#1d4ed8,color:#1a1410
  classDef store fill:#dcfce7,stroke:#16a34a,color:#1a1410
  classDef external fill:#f1f5f9,stroke:#64748b,color:#1a1410

  subgraph CL[Brigade UI]
    UI[Fleet · Runs · Approvals<br/>Live transcript · Diffs]:::brigade
  end
  subgraph PROTO[AOP/1 surface]
    RPC[/api/aop/v1/rpc<br/>JSON‑RPC 2.0/]:::brigade
    SSE[/api/aop/v1/stream<br/>SSE/]:::brigade
  end
  subgraph CORE[Brigade runtime]
    DISP(Method dispatcher):::brigade
    GW[Model gateway<br/>BYO‑LLM · DLP · OTel]:::brigade
    GOV[Governance<br/>Approvals · Budgets · Audit]:::brigade
    PLUG[Plugin manifests<br/>MCP · A2A]:::brigade
  end
  subgraph SUB[Shared SaaS substrate]
    AUTH[Better Auth<br/>SAML · SCIM]:::substrate
    JOBS[Inngest<br/>40+ connectors]:::substrate
    BIL[Billing<br/>tokenStatus]:::substrate
  end
  subgraph DATA[Data]
    PG[(Postgres<br/>Agents · Runs · Audit)]:::store
    CV[(Convex<br/>realtime · vector mem)]:::store
  end
  subgraph EXT[External]
    LLM[OpenAI · Anthropic · Bedrock<br/>Vertex · Azure · vLLM]:::external
    SAAS[Datadog · GitHub · AWS<br/>Jira · ServiceNow · PagerDuty]:::external
  end

  UI --> RPC
  UI -. SSE .- SSE
  RPC --> DISP
  DISP --> GOV
  DISP --> GW
  DISP --> PLUG
  DISP --> PG
  DISP --> CV
  GW --> LLM
  PLUG -- tool.call --> JOBS
  JOBS --> SAAS
  AUTH --> RPC
  GOV --> BIL
  CV -. realtime .-> SSE
For the CISO & CFO
  • Tamper‑evident audit log streamable to your SIEM — "who did what" answered in <5 min.
  • Hard‑stop budgets per org / tenant / agent / user — no $40k Opus surprises.
  • BYOK + region pinning + on‑prem option — data never crosses your perimeter.
  • Approval workflows as runtime objects — auditable n‑of‑m before any state‑changing action.
  • Prompt‑injection containment by design (CaMeL‑style capabilities), not a classifier.
For the Head of AI Platform
  • One protocol surface to teach customers, partners, and AI tools — AOP/1, JSON‑RPC + SSE.
  • 40+ connectors become 40+ tools the day you ship — no integration scrum.
  • BYO‑LLM model gateway — ditch a provider in an afternoon, not a quarter.
  • OTel GenAI traces flow to whatever observability stack you already pay for.
  • Versioned protocol — third‑party plugins don’t break on every release.

The orchestration isn’t the moat. Governance, durability, and vendor‑neutrality at the same time — that’s the moat. Nobody has shipped it yet. Brigade does.

The honest question

Can I build my own — and what’s the best way?

Yes. Clean-room, on your own stack, on a protocol you control. Don’t fork — build. Here is how I’d do it if I started Monday morning. Twelve weeks to a real demo. Six months to a platform.

The golden rule: study the shape, write your own expression.

OpenClaw is MIT-licensed, so you could fork — but if your goal is “create one of my own,” forking traps you in their architecture forever. The clean-room move is simple: read their code to learn concepts, then write yours from a blank file.

  • Concepts are free. Channel abstraction, plugin SDK, draft + approval flow, memory consolidation, gateway pairing — these are uncopyrightable patterns. Document them in your own words.
  • Expression is theirs. Don’t paste code, comments, or docstrings. Don’t copy file structure verbatim. Don’t lift error messages. Read, close the tab, write.
  • Brand on day one. Pick your name before you write line one. “OpenClaw” is their trademark; never use it on your product, marketing, or even your repo name.
  • Acknowledge influences. An “Inspired by OpenClaw, Boop, and Paperclip” line in your README is good engineering hygiene and good karma. It’s not legally required.

Twelve weeks. Six milestones. One demo.

Each milestone is independently shippable behind a feature flag. If week 6 slips, you still have something running.

W1–2
Week 1–2 — Lock the protocol. Write the AOP/1 spec as a single Markdown file before any code. Methods, error codes, SSE event shapes, plugin manifest schema. Steal the shape from OpenClaw + MCP + A2A; pick names you like. Outcome: a spec your engineers (or you) can implement without reading anyone else’s code.
W3–4
Week 3–4 — The rails. Postgres schema (Agent, Run, Draft, Approval, Activity, Memory). One Next.js (or Hono) route serving JSON-RPC at /rpc; one route serving SSE at /stream. A method dispatcher. A plugin manifest loader. Outcome: run.start works end-to-end with a hardcoded echo agent.
W5–6
Week 5–6 — First real model + first real channel. Wrap one provider (Anthropic via @ai-sdk/anthropic) behind your provider plugin contract. Stream tokens to SSE. Wire one channel (Slack or Discord) behind your channel contract. Outcome: a real Claude agent answering on a real channel, traces visible in your dashboard.
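The SSE leg of this milestone is plain spec framing: a named event line, a JSON data line, and a blank line to close the frame. A sketch with an assumed `run.token` event name (not the final AOP/1 event vocabulary):

```typescript
// Per the SSE spec, each frame is "event: <name>\n" + "data: <payload>\n" + "\n".
function sseFrame(event: string, data: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Stream provider tokens to /stream consumers as sequenced frames, so a
// reconnecting client can resume from the last seq it saw.
function streamTokens(runId: string, tokens: string[]): string {
  return tokens.map((t, i) => sseFrame("run.token", { runId, seq: i, token: t })).join("");
}
```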
W7–8
Week 7–8 — Drafts, approvals, atomic checkout. The safety floor. Worker agents call draft.save; a dispatcher calls draft.send. Approval tier for destructive tools. task.checkout with conditional UPDATE so two agents can’t double-execute. Outcome: nothing destructive ever runs blind — the feature CISOs actually buy.
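The checkout semantics, simulated in memory: the status guard plays the role of the `WHERE status = 'open'` clause in the conditional UPDATE, so only one of two racing agents can win the row.

```typescript
interface Task { id: string; status: "open" | "checked_out" | "done"; checkoutRunId: string | null; }

// In Postgres this is one statement:
//   UPDATE tasks SET status = 'checked_out', checkout_run_id = $1
//   WHERE id = $2 AND status = 'open';
// rowCount === 0 means another run won the race; this sketch mirrors that.
function checkout(task: Task, runId: string): boolean {
  if (task.status !== "open") return false;
  task.status = "checked_out";
  task.checkoutRunId = runId;
  return true;
}
```

The caller treats a `false` return as "someone else has it", never as an error, which is what prevents double-execution without any locking infrastructure.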
W9–10
Week 9–10 — Memory, routines, budget hard-stop. Vector index for memory.recall with adaptive decay (Boop’s pattern, your code). Cron-scheduled routine.fire events. Per-entity spend window with hard cutoff — mutation rolls back when budget hits zero. Outcome: agents that remember, schedule themselves, and can’t bankrupt you.
W11–12
Week 11–12 — Enterprise wedge + first paid demo. Tamper-evident audit log streamable to SIEM. OTel GenAI 1.41 emission. SSO scaffold (WorkOS or Clerk Enterprise). One ICP demo agent (DevOps incident-triage is the easiest pilot). Outcome: a 30-minute pitch with a real running agent that an F500 director can sign for.

What to study, what to invent.

Study (read & adapt)
  • OpenClaw’s channel boundary — the cleanest plugin contract in the open-source agent space. Read src/channels/AGENTS.md.
  • OpenClaw’s plugin SDK split — how they keep core extension-agnostic. Read src/plugin-sdk/AGENTS.md.
  • Boop’s memory consolidation — Proposer / Adversary / Judge pipeline + adaptive decay.
  • Paperclip’s atomic task checkout — conditional UPDATE with status guards. Prevents double-assignment.
  • MCP + A2A specs — you don’t need to invent tool calling or agent-to-agent. Implement, don’t reinvent.
Invent (your moat)
  • AOP/1 the protocol surface — one wire spec across all four primitives (control, stream, tool, agent-to-agent). Nobody else owns this combination.
  • Tamper-evident audit log — hash-chained, SIEM-streamable, query-able by org / agent / user / time. Buyers ask for this in week one.
  • Hard-stop budget per entity — the one feature LangGraph and AgentKit don’t have.
  • Capability-based prompt-injection containment — CaMeL-style, not a classifier. Defense-in-depth at the runtime layer.
  • Your enterprise UI — live-run narrative, fleet overview, semantic state diff. None of the existing platforms have shipped this.

The stack I’d pick.

| Layer | Pick | Why |
|---|---|---|
| Runtime | Node 22+ · TypeScript · Hono or Next.js route handlers | Best AI SDK ecosystem. Streaming-first. Edge-compatible if needed. |
| Persistence | Postgres (Neon / Supabase) · Drizzle ORM | Boring, durable, every enterprise has it. Drizzle > Prisma for control. |
| Realtime | SSE for streams · Postgres LISTEN/NOTIFY for fan-out | No new infra. Works behind any reverse proxy. Convex if you want managed. |
| Job runner | Inngest (managed) or BullMQ (self-host) | Durable execution. Step-level retries. Crash-safe. |
| Auth | Better Auth (own it) or WorkOS (sell it to enterprise) | Better Auth for indie speed; WorkOS once you have a SOC 2 audit pending. |
| LLM SDK | Vercel AI SDK (@ai-sdk/*) | Provider-agnostic from day one. Swap models in an afternoon. |
| Observability | OpenTelemetry → whatever the customer pays for | GenAI semconv 1.41 native. No lock-in to your own SaaS observability. |
| UI | React + Tailwind + Radix · Zustand for client state | Boring. Fast. Every enterprise designer can read it. |

The folder you start with.

Run pnpm create, paste the layout, fill in the files in milestone order.

brigade/
├── spec/
│   └── aop-1.md                   # the protocol — written first
├── apps/web/                  # Next.js dashboard
│   ├── app/(dashboard)/agents/
│   ├── app/(dashboard)/runs/[id]/
│   └── app/api/aop/v1/
│       ├── rpc/route.ts            # JSON-RPC 2.0 dispatcher
│       └── stream/route.ts         # SSE bridge
├── packages/core/             # the runtime
│   ├── methods/                    # one file per AOP method
│   │   ├── agent.create.ts
│   │   ├── run.start.ts
│   │   ├── tool.call.ts
│   │   ├── draft.send.ts
│   │   └── ... (one per namespace)
│   ├── lib/
│   │   ├── checkout.ts             # atomic task transition
│   │   ├── budget.ts               # spend window + hard-stop
│   │   ├── activity.ts             # tamper-evident log
│   │   └── otel.ts                 # GenAI semconv 1.41
│   └── sdk/
│       ├── provider.ts             # LLM provider contract
│       ├── channel.ts              # channel contract
│       └── tool.ts                 # tool contract (MCP-shaped)
├── packages/db/
│   ├── schema/                     # Drizzle schemas
│   └── migrations/
├── plugins/                     # bundled plugins
│   ├── provider-anthropic/
│   ├── provider-openai/
│   ├── channel-slack/
│   └── channel-discord/
└── jobs/
    ├── agent-execute.ts            # Inngest function
    └── routine-tick.ts             # cron driver
Three things not to do.
  • Don’t fork and rename. Trademark risk, “why not just use OpenClaw?” review fatigue, upstream-merge tax forever, no clean IP story for investors. Build clean.
  • Don’t paste their code. Even “just for reference”, even “I’ll rewrite it later.” Read with the file open, then close the file before writing yours. This is the only way to keep your codebase legally clean.
  • Don’t skip the protocol. If you write the runtime first and the spec second, the spec becomes documentation of bugs. Spec first. Always.
Validation pass · April 2026
Six gaps the build plan needs to plug.

After a deep technical review of the 12-week plan above, here’s what was missing or under-specified. Each one is cheap to add early and expensive to retrofit later.

  1. Postgres row-level security from day one. Multi-tenancy isolation can’t be a W11 feature. Add RLS policies on every tenant-scoped table when you write the schema in W3, not after the first cross-tenant data leak.
  2. Cost-aware model routing. Vercel AI SDK gives you provider abstraction; it does not give you cost-per-token routing. Add a tiny lookup table at tool.call dispatch: long-context → Claude, code → GPT-5.4 mini, cheap drafts → Gemini Flash. Saves 30–60% on the demo.
  3. Eval / regression framework. “QA” in W11–12 isn’t enough. Wire Galileo or Arize from W6 alongside the first real model so every prompt change has a measurable delta. AGNTCY ecosystem makes this nearly free.
  4. SSE backpressure strategy. SSE has no built-in flow control — if a client stalls, your run buffers indefinitely. Document max concurrent streams, queue discipline, and reconnect strategy before 1,000-agent demos. Consider gRPC streaming for the high-concurrency tier.
  5. Long-running run state (24h+). Vector memory handles recall, not transcript size. Add time-window compression (summarize old turns) at W10, not when month-long agent threads OOM Postgres in production.
  6. MCP version pinning. Plugin manifests need a mcp.version field and a documented migration path. MCP v2025-11-25 is current; the next breaking change will ship and your third-party plugins will break unless you version-gate them.
Reality check: the plan fits in 12 weeks only because it amounts to ~9–10 weeks of pure execution — which leaves zero slack for schema rework or unforeseen API changes. Plan for 14 weeks unless you have one engineer doing nothing but this.
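The lookup table from gap 2 really is tiny. A sketch with placeholder model names and prices (not live rates), plus a cost-ceiling fallback:

```typescript
// Cost-aware routing at tool.call dispatch: a static table keyed by task
// kind. Models and per-token prices are illustrative placeholders.
type TaskKind = "long-context" | "code" | "draft";

const ROUTES: Record<TaskKind, { model: string; usdPer1MOut: number }> = {
  "long-context": { model: "claude-opus",  usdPer1MOut: 30 },
  "code":         { model: "gpt-mini",     usdPer1MOut: 5 },
  "draft":        { model: "gemini-flash", usdPer1MOut: 1 },
};

function routeModel(kind: TaskKind, maxUsdPer1MOut: number): string {
  const route = ROUTES[kind];
  // Fall back to the cheapest route when the preferred one busts the ceiling.
  return route.usdPer1MOut <= maxUsdPer1MOut ? route.model : ROUTES.draft.model;
}
```

Because it sits at dispatch rather than inside any one agent, the table applies fleet-wide and can be tightened per company from the budget policy.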

What you’re pricing against (April 2026).

Live pricing snapshot from vendor pages. Brigade’s $50k Team / $150k Enterprise floors land cleanly in the gap below.

| Platform | Headline price | Governance tier? | Reviewer’s top complaint |
|---|---|---|---|
| LangSmith Deployment (renamed from LangGraph Platform, Oct 2025) | $39 / seat / mo (Plus) | Enterprise — SSO/RBAC, self-host | Observability-first, not a full orchestration platform |
| OpenAI AgentKit | Token-only ($5 / $30 per 1M for GPT-5.5) | Opaque — sales-gated | No transparent orchestration tier; just an API |
| AWS Bedrock AgentCore | Per-service (Runtime + Gateway + Policy + Memory + Identity + Eval + Obs) | Strong — Cedar policies, session iso | Multi-service pricing requires a calculator; cost is unpredictable |
| CrewAI AMP | Opaque (sales-gated, public docs blocked) | Unknown | Public market position unclear; OSS version dominates mindshare |
| Lyzr Enterprise | $0.08 / run cloud · $0.03 / run VPC | SSO, RBAC, audit, on-prem, HIPAA | Series A — thin enterprise reference list vs. AWS / Azure |
| Langfuse Cloud | $29 Core · $199 Pro · $2,499 Enterprise | Pro+ adds GDPR/SOC2/ISO/BAA; Teams add-on +$300/mo for SSO/RBAC | Tracing & prompt-mgmt only; not an orchestration runtime |
| Salesforce Agentforce | $0.10 / action · $550 / user (Service) · $1,100 / user (Plus) | Strong (Salesforce trust layer) | Locked to Salesforce data graph; non-Salesforce shops can’t adopt |
| Google Gemini Enterprise | $21–$30 / seat / mo (sales-confirmed) | Sales-gated; Workspace-bundled | Pricing/governance behind sales walls; opaque vs. competitors |
Bottom line

Don’t clone OpenClaw. Replace it.

Read OpenClaw, Boop, and Paperclip for one week. Write the AOP/1 spec for two. Build the runtime for nine more. Twelve weeks in, you have a real demo on a stack you fully own, with the enterprise wedge baked in from line one — not bolted on after.

At month four, ship the optional aop-bridge-openclaw shim — a small adapter that lets OpenClaw plugins run inside Brigade. Now you absorb their ecosystem without inheriting their codebase. Their plugin authors port to your platform; you don’t port to theirs.

That’s how you build “one of your own” — cleanly, defensibly, and with a moat (the protocol + the wedge) that nobody else has yet.

Brigade strategy brief · v3 · April 2026 · light theme, single accent, JetBrains Mono for IDs · this file is local and works offline (Mermaid + Chart.js via CDN).