PAW: Personal AI Workforce

Building the most advanced personal AI agent system.

7 versions shipped
6 specialized agents
19 active workflows
24/7 autonomous operation

One dev.
An entire team of agents.

What started as a structured task framework evolved into a full multi-agent orchestration platform: seven versions, each forced by a hard ceiling hit in production.

Built by Rafe because it's genuinely useful and interesting to build. The goal was never just an AI assistant. It was a workforce: specialized agents that run autonomously, pick up tasks, and deliver results around the clock.

Seven versions. Each one forced by a hard ceiling hit in production. Every limitation became the next version's motivation.

v1

Custom Task Framework, Jan 2026

Single agent in a single OpenClaw session with a custom task framework for managing work. Gave structure to AI work beyond raw chat prompts.

⚠️ Why it wasn't enough

One agent couldn't handle complex multi-step work. No parallelism. Hit the ceiling fast: the moment you needed more than one thing done at once, you were stuck.

v2

Multi-Agent System, Early Feb 2026

Multiple specialized agents: Programmer, Writer, Researcher, Reviewer, Architect. Agent spawning and task routing. Parallel work became possible.

⚠️ Why it broke down

No memory between sessions. Agents kept making the same mistakes. Every session started cold: lessons learned disappeared the moment the session ended.

v3

Reflections & Learning, Mid Feb 2026

Agent reflection system: workers reflect on completed tasks. Shared memory that persists lessons learned across sessions. Agents got smarter over time.

⚠️ Why it wasn't enough

Still running as scripts and manual orchestration. No proper server or API. Couldn't scale coordination or run reliably in the background without babysitting.

v4

Web Server Era, Late Feb 2026

lobs-server: FastAPI backend with REST API, task queue, worker management, health monitoring. Mission Control (SwiftUI macOS) and Lobs Mobile (iOS) joined the ecosystem.

⚠️ Why it broke down

38K lines of Python. Task execution was linear if/else chains, with no branching, rollback, or complex workflows. The codebase became genuinely hard to maintain.

v5

DAG Workflow Engine, Late Feb 2026

Node-based DAG workflow system. State machines, conditional branching, rollback, event-driven triggers, cron scheduling. 19 workflow definitions shipped.

⚠️ Why it needed replacing

The Python codebase was massive and fragile. Running a separate server alongside OpenClaw added complexity: a glue layer heavier than the logic it connected.

v6

PAW Plugin, Mar 2026

Full rewrite as an OpenClaw TypeScript plugin. Runs inside OpenClaw, with no separate server process. Everything from v1–v5 preserved but cleaner.

✅ The breakthrough

Eliminated the glue layer. The entire system became a single process. Circuit breaker, model chooser with 5 tiers, budget guard: all first-class. 8K lines of TypeScript replaced 38K lines of Python.

38,000 lines Python → 8,000 lines TS, one process

v7

Nexus Dashboard & Custom Domain, Mar 2026 [CURRENT]

Nexus: React + Vite web dashboard replacing the SwiftUI macOS app. Self-hosted at lobslab.com with Caddy auto-TLS + Cloudflare Tunnel. The system now has a real home on the web.

✅ The breakthrough

Cross-platform access from any device. Public site at lobslab.com, private dashboard at nexus.lobslab.com over Tailscale. Workers autonomously fix UI bugs via PAW tasks. The system builds its own interface.

The Stats

Real numbers from a system that actually runs in production, every day.

7
versions shipped
v1 task framework → v7 Nexus. Seven hard ceilings hit and broken through, in under 3 months.
6
agent types
Lobs, Programmer, Writer, Researcher, Reviewer, Architect: each purpose-built with its own model tier and tool config.
19
active workflow definitions
Multi-step DAGs covering code, research, review, and reporting flows. State machines with branching and rollback.
5
model tiers
From free local Qwen to Claude Opus, auto-selected by task complexity. Cheapest model that handles the task always wins.
8K
lines of TypeScript
The entire PAW plugin. Clean, typed, maintainable, and running inside OpenClaw with no separate process.
24/7
autonomous operation
Workers spawn, execute, and complete without supervision. Zero manual intervention after task creation.

How It All Fits Together

A layered architecture where Lobs acts as the chat interface and coordinator, PAW routes work, and specialized workers execute in parallel, with automatic model selection and fault tolerance.

Rafe
Human
The human. Sends messages, creates tasks, reviews results via Discord, iOS, or direct DB.

Lobs
Chat & Coordinator
Primary agent. Runs 24/7. Triages requests, creates tasks, routes to PAW orchestrator. The face of the system.

PAW ORCHESTRATOR (OpenClaw Plugin)

Workflow Engine
19 workflow definitions. DAG execution with state, branching, and rollback. Event-driven triggers and cron scheduling.

Model Chooser
5 tiers from local Qwen to Claude Opus. Cost-aware, tier-based selection with auto-fallback chains on failure. Cheapest model that handles the task wins.

Circuit Breaker
Failure detection, worker quarantine, escalation, and recovery. Tracks worker failure rates, quarantines bad actors, and prevents cascading failures.

Task DB
SQLite-backed task queue with full task history, status, agent assignment, worker run records, and workflow state.

Programmer
Code & tests
Writes code, fixes bugs, runs tests, refactors. Default: standard tier.

Writer
Docs & content
Creates docs, write-ups, summaries, content. Default: small tier.

Researcher
Analysis & synthesis
Researches topics, compares options, synthesizes findings. Default: medium tier.

Reviewer
QA & feedback
Code review, quality checks, feedback. Runs after Programmer.

Architect
Design & strategy
Technical strategy, design docs, planning. Default: strong tier (Opus).

Model Tier Selection: auto-matched by task complexity

micro (Qwen local, free) → small (Sonnet) → medium (Sonnet+) → standard (Codex/Sonnet) → strong (Claude Opus)

Falls back up the chain on failure. Cheapest model that can handle the task always wins.
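
The fallback behavior can be sketched in TypeScript as follows. This is a minimal illustration, not PAW's actual model chooser: the tier names come from this page, while `runWithFallback`, `RunOnTier`, and `TierResult` are hypothetical names, and the sketch is synchronous for brevity.

```typescript
// Tier names are taken from the page; every function and type name is hypothetical.
const TIERS = ["micro", "small", "medium", "standard", "strong"] as const;
type Tier = (typeof TIERS)[number];

// Hypothetical callback: runs the task on one tier and throws if that tier fails.
type RunOnTier<T> = (tier: Tier) => T;

interface TierResult<T> {
  tier: Tier;       // tier that finally produced a result
  attempts: Tier[]; // every tier tried, in order
  value: T;
}

// Start at the cheapest tier judged sufficient, then fall back up the chain.
function runWithFallback<T>(startTier: Tier, run: RunOnTier<T>): TierResult<T> {
  const attempts: Tier[] = [];
  let lastError: unknown = undefined;
  for (const tier of TIERS.slice(TIERS.indexOf(startTier))) {
    attempts.push(tier);
    try {
      return { tier, attempts, value: run(tier) };
    } catch (err) {
      lastError = err; // remember the failure and escalate to the next tier
    }
  }
  throw new Error(`all tiers failed from ${startTier}: ${String(lastError)}`);
}
```

Starting at the lowest plausible tier and only escalating on failure is what keeps the cheapest capable model in front; a real implementation would presumably also consult the budget guard before each escalation.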

How It Actually Works

From a message to a completed deliverable: six steps, zero manual intervention after the first.

01

Rafe sends a message or creates a task

Via Discord DM, iOS app, or directly in the task database. Natural language or structured input: both work.

02

Lobs triages and routes

Reads the request, determines urgency and agent type, breaks it into subtasks if needed, and writes it to the PAW task queue with the right model tier.

03

PAW orchestrator matches to a workflow

The control loop (runs every 10s) picks up active tasks, matches them to workflow definitions if applicable, and sequences the work across multiple agent types.

04

Model chooser selects the right tier

Based on task complexity, agent type, and cost budget. Circuit breaker kicks in if a model fails, with automatic fallback up the tier chain.

05

Worker agent spawns and executes

A sandboxed OpenClaw session fires up with the appropriate agent config, workspace, and tool access. Max 2 concurrent workers. 15-minute timeout enforced.

06

Results captured, task completed, learning stored

Worker output lands in the task DB. Lessons learned get fed into shared memory. Rafe gets a quiet notification, and only if it matters.
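
Steps 03 and 05 above can be sketched as a single pass of the control loop. This is an illustrative TypeScript sketch under stated assumptions: the limits (2 concurrent workers, 15-minute timeout) are the ones quoted in the steps, but the `Task` shape, the `failed` status, and the `tick` function are hypothetical and not taken from the PAW source.

```typescript
// Limits quoted in the steps above; the Task shape and function name are hypothetical.
const MAX_CONCURRENT_WORKERS = 2;
const WORKER_TIMEOUT_MS = 15 * 60 * 1000;

type Status = "active" | "running" | "completed" | "failed";

interface Task {
  id: string;
  status: Status;
  startedAt?: number; // epoch ms, set when a worker is spawned
}

// One pass of the control loop: expire stuck workers, then fill free slots.
function tick(tasks: Task[], now: number): Task[] {
  // 1. Enforce the timeout on anything that has been running too long.
  const swept = tasks.map((t): Task =>
    t.status === "running" &&
    t.startedAt !== undefined &&
    now - t.startedAt > WORKER_TIMEOUT_MS
      ? { ...t, status: "failed" }
      : t
  );
  // 2. Count running workers and start queued tasks while slots remain.
  let running = swept.filter((t) => t.status === "running").length;
  return swept.map((t): Task => {
    if (t.status === "active" && running < MAX_CONCURRENT_WORKERS) {
      running += 1;
      return { ...t, status: "running", startedAt: now };
    }
    return t;
  });
}
```

In a long-running process, a function like this would be invoked on a timer matching the 10-second interval mentioned in step 03, for example with `setInterval`.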

bash: create a task directly
# Insert a task into the PAW DB; the orchestrator picks it up within 10s
sqlite3 ~/.openclaw/plugins/paw/paw.db \
  "INSERT INTO tasks
     (id, title, status, agent, model_tier, notes, created_at, updated_at)
   VALUES (
     lower(hex(randomblob(16))),
     'Refactor auth module to use JWT',
     'active', 'programmer', 'standard',
     'Follow existing patterns in /src/auth. Add tests.',
     datetime('now'), datetime('now')
   );"

# Worker spawns, codes, commits, reports back. Zero babysitting.

The Building Blocks

Five interlocking projects that make up the Lobs AI ecosystem.

🧩
openclaw-plugin-paw

PAW Plugin

+

The orchestrator brain. TypeScript plugin that runs inside OpenClaw β€” no separate server process. Workflow engine, model chooser, circuit breaker, task DB, and agent spawning all in one.

TypeScriptSQLiteOpenClawCurrent
βš™οΈ
Workflow Engine

19 workflow definitions. DAG execution with state machines, conditional branching, and rollback. Event-driven triggers and cron scheduling built-in.

🎯
Model Chooser

5-tier cost-aware model selection. Per-agent fallback chains. Cheapest model that can handle the task wins, always.

⚑
Circuit Breaker

Tracks worker failure rates. Quarantines bad actors. Escalation policies prevent cascading failures from propagating.

πŸ—„οΈ
Task Database

SQLite backend with full task history, worker run records, workflow state, inbox items, and research memos.
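
The circuit breaker idea described in this card (track failure rates, quarantine bad actors) can be sketched as below. This is a hedged illustration only: the class, its method names, and the thresholds are assumptions made for the example, not PAW's real implementation or configuration.

```typescript
// Thresholds and names below are illustrative assumptions, not PAW's real config.
interface BreakerOptions {
  windowSize: number;     // how many recent runs to remember per worker
  minSamples: number;     // do not judge a worker on fewer runs than this
  maxFailureRate: number; // quarantine at or above this fraction of failures
}

class CircuitBreaker {
  private outcomes = new Map<string, boolean[]>(); // true = success
  private quarantined = new Set<string>();
  private readonly opts: BreakerOptions;

  constructor(opts: BreakerOptions) {
    this.opts = opts;
  }

  // Record one run and quarantine the worker if its recent failure rate is too high.
  record(worker: string, success: boolean): void {
    const recent = [...(this.outcomes.get(worker) ?? []), success].slice(
      -this.opts.windowSize
    );
    this.outcomes.set(worker, recent);
    const failures = recent.filter((ok) => !ok).length;
    if (
      recent.length >= this.opts.minSamples &&
      failures / recent.length >= this.opts.maxFailureRate
    ) {
      this.quarantined.add(worker);
    }
  }

  isQuarantined(worker: string): boolean {
    return this.quarantined.has(worker);
  }

  // Recovery: clear history so the worker starts again with a clean window.
  release(worker: string): void {
    this.quarantined.delete(worker);
    this.outcomes.delete(worker);
  }
}
```

The minimum-sample check is a design choice in this sketch: without it, a single early failure would quarantine a worker that has run only once.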

lobs-mission-control

Mission Control

SwiftUI macOS dashboard, now replaced by Nexus (React + Vite): a cross-platform web dashboard at nexus.lobslab.com with real-time monitoring of agents, tasks, workflow runs, and system health.

SwiftUI · macOS · WebSocket · REST API

Agent Status Monitoring

Real-time view of all 6 agent types: idle, running, or error state. Worker spawn history and success rates at a glance.

Workflow Visualization

Visual DAG renderer showing active workflow steps, completed nodes, and pending branches in real time.

Real-time WebSocket Updates

Push-based updates, no polling. Task state changes, worker completions, and alerts stream live to the dashboard.

Task Inspector

Drill into any task: notes, agent assignment, model tier, worker run log, output, and timing, all in one view.

lobs-mobile

Lobs Mobile

iOS companion app for staying connected to your agent workforce on the go. Check task status, receive completions, and interact with Lobs from your phone.

Swift · iOS · Push notifications

iOS Push Notifications

Get notified when blockers arise, urgent tasks complete, or the system needs attention, without checking Discord.

Task Creation On The Go

Create tasks from anywhere, even away from the desk. Orchestrator picks them up within 10 seconds.

Task Queue View

Browse active, completed, and blocked tasks. Filter by agent type or model tier. See what's running right now.

Chat Interface

Message Lobs directly from the app. Same coordinator, same routing, just on mobile instead of Discord.

🧠
lobs-shared-memory

Shared Memory

+

A Git-backed cross-project knowledge base. ADRs, research memos, runbooks, and agent instructions that persist across sessions and sync across the entire system.

GitMarkdownADRsRunbooks
πŸ“
Architecture Decision Records

Every major technical decision recorded as an ADR β€” what was decided, why, what alternatives were rejected. Agents read these before acting.

πŸ”„
Git-Backed Sync

All memory files live in a git repo. Changes are committed and synced β€” workers can read, agents can write, everything is auditable.

πŸ—‚οΈ
Research Memos

Researcher agents write structured memos that persist across sessions. Next time a similar question comes up, the answer is already there.

πŸ“–
Agent Runbooks

Operational playbooks for each agent type β€” how to handle edge cases, what tools to use, and what not to do. Living documents that evolve.

🐍
lobs-server β€” Legacy / Retired

lobs-server

+

The FastAPI backend that powered v4 and v5 β€” REST API, task queue, worker management, and health monitoring. Replaced by the PAW plugin. 38K lines of Python that taught us everything.

PythonFastAPISQLiteRetired
πŸ“‘
REST API

Full task management API β€” create, read, update, delete tasks and projects. Backed Mission Control and Lobs Mobile before the plugin era.

βš™οΈ
Python Orchestrator

Multi-agent spawning with aiohttp. Taught us every lesson about race conditions, session leaks, and the limits of if/else workflow logic.

πŸ“Š
Health Monitoring

Worker status tracking, failure rates, and uptime metrics. The patterns here directly influenced the PAW plugin's circuit breaker design.

🧠
Why It Was Retired

38K lines of Python for what 8K lines of TypeScript now does better. The glue was heavier than the logic it connected. PAW eliminated the separate process entirely.

Build Timeline

Seven versions. Three months. A system that actually runs in production, every day.

Jan 2026

v1: Custom Task Framework

Lobs comes online. A custom task framework gives structure to AI work beyond raw chat prompts. Single agent, single session, but a real foundation.

v1 · task framework
Early Feb

v2: Multi-Agent System

Multiple specialized agents: Programmer, Writer, Researcher, Reviewer, Architect. Agent spawning and task routing. Parallel work becomes possible for the first time.

v2 · multi-agent
Mid Feb

v3: Reflections & Learning

Workers reflect on completed tasks. Lessons get written to shared memory. The system starts getting smarter over time, not just faster, and without manual updates.

v3 · learning
Late Feb

v4: Web Server Era

lobs-server FastAPI backend deployed. Mission Control (SwiftUI macOS) and Lobs Mobile (iOS) join the ecosystem. Real infrastructure: APIs, monitoring, mobile access.

v4 · server + apps
Late Feb

v5: DAG Workflow Engine

Multi-step tasks now run as structured workflows with state, branching, and rollback. 19 workflow definitions covering code, research, review, reporting. Complex work becomes a pipeline, not a prompt.

v5 · DAG workflows
Mar 2026

v6: PAW Plugin Rewrite (TypeScript)

The big one. Full rewrite as an OpenClaw TypeScript plugin. Runs inside OpenClaw, with no separate server. Everything from v1–v5 preserved but cleaner. The glue layer disappears. 38K lines → 8K lines.

v6 · PAW plugin
Mar 2026

v7: Nexus Dashboard & Custom Domain

React + Vite web dashboard replaces SwiftUI macOS app. Self-hosted at lobslab.com with Caddy auto-TLS and Cloudflare Tunnel. Private Nexus dashboard at nexus.lobslab.com over Tailscale. PAW workers autonomously implement UI fixes.

v7 · current · you are here

Wins & Hard-Won Lessons

Real systems break in real ways. Every incident below happened in production. None were fun. All were instructive.

🏆 Wins & Milestones

Multi-Agent Parallelism

Multiple specialized agents working simultaneously on different tasks (Programmer coding, Researcher investigating, Writer documenting), all at once. The ceiling on what one person can accomplish expanded dramatically.

Agent Reflection & Learning

Workers reflect on completed tasks, capture lessons, and feed back into shared memory. The system gets meaningfully better week-over-week without manual updates. Mistakes stop repeating.

Workflow Engine as Control Plane

Multi-step tasks now run as structured DAG workflows with state, branching, and rollback, not one-shot prompts. Enabled work that previously required constant babysitting to run fully unattended.

Model Tier System

Tasks automatically route to the cheapest model that can handle them. First week of operation: 60% of tasks handled by the free local model. Cost stays manageable at scale.

Plugin Architecture Win

Eliminating the separate server process removed an entire class of problems: IPC overhead, session handshake failures, sync issues, deployment complexity. One process, zero glue.

💀 Losses & War Stories

The Restart Loop Incident [HIGH]

Workers edited plugin source code, then called gateway restart. The restart spawned fresh workers. Who picked up active tasks. Who edited source and called restart. Infinite loop. The system was restarting itself every 30 seconds for 20 minutes before it was caught.

💡 What we learned

Gateway restart is now hard-denied in all worker agent tool configs. Workers can read source but cannot modify it. Defense in depth over clever permission systems.

Heartbeat Spam [MEDIUM]

Background exec commands in workers triggered event-driven heartbeats on completion, flooding the main session with "Exec completed" noise. At peak, 40+ heartbeat messages in 10 minutes. Completely drowned out actual notifications.

💡 What we learned

Banned all background exec patterns (&, nohup, sleep &&) in worker configs. Direct, synchronous commands only. The AGENTS.md now has a whole section on this.
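
A check for the banned patterns could look roughly like the TypeScript sketch below. It is a heuristic illustration only: the three patterns are the ones named above, but the regular expressions and the `findBannedPattern` helper are assumptions for this example, they do not parse shell quoting, and the page does not say how the ban is actually enforced.

```typescript
// Heuristic patterns based on the three examples named above (&, nohup, sleep &&).
// The regexes and names are illustrative assumptions; shell quoting is not parsed.
const BANNED: ReadonlyArray<{ name: string; pattern: RegExp }> = [
  // A lone "&" that is not part of "&&", "2>&1", or "&>".
  { name: "background &", pattern: /(?<![&>])&(?![&>\d])/ },
  { name: "nohup", pattern: /\bnohup\b/ },
  // "sleep <n> && ..." used to delay a follow-up command.
  { name: "sleep &&", pattern: /\bsleep\s+\S+\s*&&/ },
];

// Returns the name of the first banned pattern found, or null if the command is clean.
function findBannedPattern(command: string): string | null {
  const hit = BANNED.find(({ pattern }) => pattern.test(command));
  return hit ? hit.name : null;
}
```

Ordinary chaining such as `npm run build && npm test` passes this check, since only a single trailing or separating `&` counts as backgrounding.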

JSON Double-Conversion Bug [MEDIUM]

Swift's .convertFromSnakeCase was set on the decoder. Then manual CodingKeys were added for the same mapping. The decoder applied both. Fields silently dropped. No errors, just missing data. Took two sessions to diagnose.

💡 What we learned

Never add CodingKeys for simple snake→camel conversions when .convertFromSnakeCase is active. Now documented as permanent institutional knowledge in TOOLS.md.

Orphaned Session Errors [LOW]

"No tool call found" errors from workers that completed their task without making any tool calls. Session validation expected at least one tool call, so workers that answered purely from context would silently fail and get retried, wasting tokens.

💡 What we learned

Stricter session validation distinguishing "no tools needed" from "something went wrong." Simple tasks that don't need tools are now explicitly tagged at creation time.
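
The distinction between "no tools needed" and "something went wrong" can be sketched in TypeScript as follows. The field names, the `Verdict` labels, and the `validateSession` function are hypothetical, chosen only to illustrate the rule; they are not taken from the PAW source.

```typescript
// Field names and the verdict labels are hypothetical, chosen for illustration.
interface TaskMeta {
  requiresTools: boolean; // tagged at task creation time
}

interface SessionResult {
  toolCalls: number; // how many tool calls the worker made
  output: string;    // final worker output
}

type Verdict = "ok" | "retry";

function validateSession(task: TaskMeta, result: SessionResult): Verdict {
  // An empty answer is a failure regardless of how the task was tagged.
  if (result.output.trim() === "") return "retry";
  // Zero tool calls is only suspicious when the task was expected to use tools.
  if (task.requiresTools && result.toolCalls === 0) return "retry";
  return "ok";
}
```

Under this rule, a worker that answers a simple question from context alone is accepted instead of being retried, which is the token waste described in the incident.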

Reflection Spam [LOW]

Agent reflection was too eager: workers would reflect after every minor action, such as reading a file, running a test, or making a small edit. Shared memory grew by hundreds of low-quality entries in a single day. Signal buried in noise.

💡 What we learned

Reflection now gates behind strict completion criteria and rate limits. Only significant task completions trigger reflection. Quality over volume, every time.
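
A gate of this kind can be sketched in TypeScript as below. The event names, the `ReflectionGate` class, and the limits passed to it are assumptions for illustration; the page only states that reflection sits behind completion criteria and rate limits.

```typescript
// Event names and limits are hypothetical; the page only says reflection is
// gated behind completion criteria and rate limits.
type WorkerEvent = "file_read" | "test_run" | "small_edit" | "task_completed";

class ReflectionGate {
  private timestamps: number[] = []; // times (ms) of accepted reflections
  private readonly maxPerWindow: number;
  private readonly windowMs: number;

  constructor(maxPerWindow: number, windowMs: number) {
    this.maxPerWindow = maxPerWindow;
    this.windowMs = windowMs;
  }

  // Returns true only for completed tasks, and only while under the rate limit.
  shouldReflect(event: WorkerEvent, now: number): boolean {
    if (event !== "task_completed") return false;
    // Drop reflections that have aged out of the sliding window.
    this.timestamps = this.timestamps.filter((t) => now - t < this.windowMs);
    if (this.timestamps.length >= this.maxPerWindow) return false;
    this.timestamps.push(now);
    return true;
  }
}
```

For example, `new ReflectionGate(2, 60_000)` would allow at most two reflections per minute and ignore file reads, test runs, and small edits entirely.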

The Tech Stack

No magic. Just well-chosen tools assembled with intention. Every component earns its place.

🦾
OpenClaw
Agent runtime & plugin host
⚑
TypeScript
PAW plugin & orchestrator
πŸ—„οΈ
SQLite
Task & workflow database
🍎
SwiftUI
macOS + iOS frontends
🐍
Python
Legacy lobs-server (retired)
πŸ”§
GitHub Actions
CI/CD & automation
🧠
Claude
Anthropic Β· strong/standard
πŸ€–
Codex / GPT
OpenAI Β· standard tier
πŸ’»
Qwen (local)
LM Studio Β· micro tier Β· free
πŸ’¬
Discord
Primary chat interface
🌐
Cloudflare
DNS + Tunnel
πŸ”€
Git
Shared memory sync

Contact

Interested in PAW, have questions about the architecture, or want to collaborate? Reach out.

Email: rsymonds@umich.edu
GitHub: github.com/lobs-ai
Built by: Rafe Symonds, University of Michigan CSE