M3SHDUP
Multi-Agent Collaboration Mesh -- Technical Report V1
Author: Fred Wojo | AI Architect: Archon (Claude Opus 4.6)
Date: April 3, 2026
System: mesh.demobygrit.com | Staging VPS ([REDACTED])
Table of Contents
- Executive Summary
- Architecture Overview
- Hub Server
- Worker System
- Mesh Daemon
- iMessage Bridge
- Command Interface
- Task Dispatch Engine
- Safety and Control Layer
- Web UI
- Database Schema
- API Reference
- Security Posture
- Container and Deployment
- Test Coverage
- Node Inventory
- Technology Stack
- Architecture Diagram
- Build History
- Future Roadmap
- Changelog
1. Executive Summary
M3SHDUP is a multi-agent collaboration mesh that lets a single human operator command a distributed team of AI agents from anywhere -- iMessage, a PWA dashboard, or a desktop browser. The system was conceived, designed, and built to production in a single session on April 3, 2026: 34 commits, 58 passing tests, 9,156 lines of code, and two independent security audits completed across five implementation phases in roughly three hours. Three AI worker nodes spanning an Apple Silicon Mac Mini, an Intel Mac Mini, and a 27-inch iMac are now registered and executing tasks dispatched from a hub running on a staging VPS in Germany.
The name is the user's: "M3SHDUP" -- mesh network meets messed up. The system solves a real problem. Claude Code sessions on a single machine are powerful but serialized. M3SHDUP breaks that constraint by turning every machine on the local network into a worker that self-registers with the hub, advertises its capabilities and concurrency limits, and picks up tasks routed by a capacity-aware dispatcher with circuit-breaker protection. The user texts "do research prompt engineering best practices" from his iPhone, the iMessage bridge relays it to the hub, the dispatcher assigns it to Rex (the Intel Mac Mini research node), Rex invokes Claude CLI, and the result flows back through the hub to the user's iMessage thread -- all without opening a laptop.
What makes M3SHDUP technically distinctive is the layering of safety controls around autonomous AI execution. Agents must request approval before performing dangerous operations. An escalation chain (Rex → Archon → User) fires automatically after three consecutive failures. Daily token budgets cap runaway cost at 500,000 tokens per agent. Autonomous mode lets the user issue a high-level goal ("research competitor pricing for all three subscription apps"), which Archon decomposes into subtasks, dispatches them to Rex in parallel, monitors progress, and summarizes results. The entire system runs as a single Python daemon on the M2 Mac Mini with four self-healing threads, each wrapped in a crash-recovery loop with a ten-restart ceiling before graceful degradation.
2. Architecture Overview
M3SHDUP follows a hub-and-spoke topology. The hub is a stateless relay and persistence layer that does not run any AI inference. All intelligence lives at the edges: worker processes on individual machines invoke Claude CLI as a subprocess and report results back to the hub via REST API. This design means the hub can run on a modest VPS (256MB RAM, 0.5 CPU) while workers leverage the full compute of their host machines.
The hub exposes roughly two dozen API endpoints over HTTPS (cataloged in the API Reference), authenticated by a rotated bearer token with constant-time comparison. Real-time event delivery uses Server-Sent Events (SSE), which avoids the complexity of WebSocket management while providing push notifications for message arrivals, task state changes, agent registrations, and approval decisions. The SSE bus implements backpressure via per-subscriber queue caps of 500 events.
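The per-subscriber cap can be sketched as a bounded fan-out. This is illustrative only: the `publish` function and queue setup are assumptions, not the hub's actual code, but the drop-when-full behavior matches the description above.

```python
import asyncio

MAX_QUEUE = 500  # per-subscriber cap described above

def publish(subscribers: list[asyncio.Queue], event: dict) -> int:
    """Fan an event out to every subscriber queue, dropping it for any
    subscriber whose queue is full (backpressure). Returns the number
    of queues that accepted the event."""
    delivered = 0
    for q in subscribers:
        try:
            q.put_nowait(event)
            delivered += 1
        except asyncio.QueueFull:
            # Slow consumer: drop rather than block the broadcaster.
            pass
    return delivered

# Demo: a full queue silently drops, a fresh one accepts.
fast = asyncio.Queue(maxsize=MAX_QUEUE)
slow = asyncio.Queue(maxsize=1)
slow.put_nowait({"type": "system"})  # already at capacity
accepted = publish([fast, slow], {"type": "message"})
```

Dropping events for a slow consumer keeps one stalled browser tab from backing up the broadcaster for every other subscriber.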
Workers connect to the hub over Tailscale, a WireGuard-based mesh VPN that provides encrypted point-to-point tunnels without exposing services to the public internet. The only internet-facing surface is the hub itself, fronted by Caddy with automatic TLS on the staging VPS.
┌──────────────────────────────┐
│ M3SHDUP Hub │
│ mesh.demobygrit.com │
│ │
│ FastAPI + SSE + SQLite WAL │
│ Task Queue + Agent Registry │
│ Approval Queue + Cost Log │
│ Escalation Chain + Context │
└──────────────┬───────────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
┌──────┴──────┐ ┌──────┴──────┐ ┌────────┴────────┐
│ Mesh Daemon │ │ Rex Worker │ │ Crucible Worker │
│ (M2 Mac) │ │ (Intel Mac) │ │ (iMac 27") │
│ │ │ │ │ │
│ Archon AI │ │ Rex AI │ │ Worker AI │
│ iMsg Bridge │ │ Research │ │ Overflow │
│ Full Tools │ │ File Ops │ │ Research │
│ 5 slots │ │ 2 slots │ │ 3 slots │
└──────┬──────┘ └─────────────┘ └─────────────────┘
│
┌──────┴──────┐
│ User │
│ iPhone │
│ PWA │
│ Desktop │
└─────────────┘
The mesh daemon on the M2 Mac Mini is the critical integration point. It runs four persistent threads: T1 reads the user's iMessage chat.db and relays texts to the hub, T2 polls the hub for agent messages and delivers them back to the user via osascript, T3 is the Archon AI watcher that responds to the user's messages using Claude, and T4 is a Rex agent that handles task execution and @rex mentions. Each thread is wrapped in a resilient loop that automatically restarts after a crash, with exponential backoff and a hard ceiling of ten restarts per thread before the supervisor gives up.
3. Hub Server
The hub is implemented as a FastAPI application in app/main.py (467 lines). It serves four concerns: authentication, API routing, SSE broadcasting, and the web UI.
Authentication follows a dual-path model. API consumers (workers, the mesh daemon) authenticate via a Bearer token in the Authorization header. Browser sessions authenticate via a POST to /api/login, which sets an httponly, secure, samesite=lax cookie containing the SHA-256 hash of the token. Both paths use hmac.compare_digest for constant-time comparison, preventing timing-based token extraction.
The application requires the M3SHDUP_TOKEN environment variable at startup. If the variable is missing, the process prints a FATAL message and exits with code 1 -- there is no fallback token. This fail-loud behavior was added during Phase 4 after the security audit identified a default token as a critical vulnerability.
A background task runs a stale-agent sweep every 30 seconds. Any agent that has not sent a heartbeat within the configurable timeout (currently 30 seconds) is marked offline. This ensures the task dispatcher never routes work to a dead worker.
The lifespan context manager initializes the SQLite database on startup and publishes a system event confirming the hub is online. Static files are served from app/static/ (currently containing only manifest.json for PWA support). The main UI template at app/templates/index.html (1,745 lines) is a self-contained single-page application with no external JavaScript framework dependencies.
4. Worker System
The generic worker (mesh-worker.py, 360 lines) is the mechanism by which any machine joins the mesh with a single command. It is a standalone Python script with no FastAPI dependency -- it uses only urllib.request for HTTP and threading for concurrent task execution.
The worker lifecycle follows a clear sequence. On startup, it registers with the hub by POSTing to /api/agents/{name}/register with its capabilities list (e.g., ["research", "file_ops", "git"]), excluded operations list (e.g., ["docker", "deploy"]), maximum concurrency setting, and machine identifier. If registration fails, the worker exits immediately.
Once registered, the worker enters a poll loop on a 2-second interval. Every 5 seconds it sends a heartbeat. On each poll cycle, it checks two conditions before fetching tasks: the active task count must be below the configured maximum, and the system load average must be below twice the CPU core count. This load check prevents a machine from accepting work when it is already saturated.
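The two pre-fetch gates reduce to a one-line predicate. `can_accept` is a hypothetical name; on a real worker the inputs would come from `os.getloadavg()` and `os.cpu_count()` as shown in the demo lines.

```python
import os

def can_accept(active: int, max_concurrent: int,
               load_1m: float, cpu_count: int) -> bool:
    """Mirror of the worker's two gates: a spare task slot, and a
    1-minute load average under twice the core count."""
    return active < max_concurrent and load_1m < 2 * cpu_count

# On a real worker the inputs come from the runtime:
load_1m = os.getloadavg()[0] if hasattr(os, "getloadavg") else 0.0
cores = os.cpu_count() or 1
ok_now = can_accept(0, 2, load_1m, cores)
```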
Tasks are fetched by polling GET /api/tasks?assigned_to={name}&status=assigned. Each fetched task spawns a daemon thread that invokes Claude CLI:
    result = subprocess.run(
        ["claude", "--print", "--dangerously-skip-permissions"],
        input=prompt,
        capture_output=True,
        text=True,
        timeout=self.claude_timeout,
    )
The --print flag outputs Claude's response to stdout without interactive prompting. The --dangerously-skip-permissions flag disables permission checks, allowing Claude to read, write, and execute commands without confirmation -- the system's entire security perimeter is the bearer token and network boundary.
After execution, the worker estimates token usage (input length divided by 4 for input tokens, output length divided by 4 for output tokens) and logs the cost to the hub. If Claude returns no output or times out, the task is marked as failed. After three consecutive failures, the worker triggers an escalation: Rex escalates to Archon, Archon escalates to the user.
Graceful shutdown is handled via SIGTERM and SIGINT. The worker sets a shutdown event, stops accepting new tasks, and waits up to 60 seconds for active tasks to drain before exiting.
The worker is fully configurable via command-line arguments or environment variables:
| Parameter | Env Var | Default | Description |
|---|---|---|---|
| `--hub` | `M3SHDUP_HUB` | `http://localhost:8333` | Hub base URL |
| `--token` | `M3SHDUP_TOKEN` | (required) | Bearer auth token |
| `--name` | `M3SHDUP_AGENT_NAME` | `worker` | Agent identity |
| `--machine` | `M3SHDUP_MACHINE` | hostname | Machine identifier |
| `--max-concurrent` | `M3SHDUP_MAX_CONCURRENT` | 1 | Max parallel tasks |
| `--capabilities` | `M3SHDUP_CAPABILITIES` | `research` | Comma-separated |
| `--excluded` | `M3SHDUP_EXCLUDED` | (none) | Excluded operations |
| `--claude-timeout` | `M3SHDUP_CLAUDE_TIMEOUT` | 120 | Seconds per invocation |
5. Mesh Daemon
The mesh daemon (mesh-daemon.py, 1,181 lines) is the most complex component in the system. It is a single-process, multi-threaded Python application that runs on the M2 Mac Mini and integrates four subsystems: the iMessage-to-hub relay, the hub-to-iMessage relay, the Archon AI watcher, and the Rex agent. All four threads share a global shutdown_event and a heartbeat registry monitored by a health supervisor loop in the main thread.
The resilient loop wrapper (resilient_loop) catches any unhandled exception from a thread function, logs the crash, increments a per-thread restart counter, waits 5 seconds, and re-enters the function. After 10 restarts (configurable via MAX_THREAD_RESTARTS), the wrapper gives up and the thread dies. The health supervisor in the main thread checks every 30 seconds whether all threads are alive. If every thread has exhausted its restart budget, the supervisor sets the shutdown event and exits the process, which triggers the outer shell loop in start-mesh.sh to relaunch the entire daemon after a 5-second delay.
This supervisor-of-threads-inside-a-supervisor-script design provides two levels of fault tolerance: individual thread crashes recover in-process within 5 seconds, and total process failures recover via the shell wrapper in 10 seconds. The stale heartbeat threshold is set to 180 seconds (120-second Claude timeout plus 60-second buffer) to avoid false-positive kills during long-running Claude invocations.
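A minimal sketch of such a wrapper follows. The names match the report's `resilient_loop` and `MAX_THREAD_RESTARTS`, but the body is illustrative: it uses a fixed delay for brevity where the real daemon applies backoff.

```python
import time

MAX_THREAD_RESTARTS = 10

def resilient_loop(fn, *, max_restarts=MAX_THREAD_RESTARTS, delay=0):
    """Re-enter fn after any unhandled exception, up to max_restarts
    times, then give up (the thread dies and the supervisor notices)."""
    restarts = 0
    while True:
        try:
            fn()
            return  # fn exited cleanly (e.g. shutdown event set)
        except Exception as exc:
            restarts += 1
            print(f"thread crashed ({exc!r}), restart {restarts}")
            if restarts >= max_restarts:
                return  # budget exhausted; supervisor takes over
            time.sleep(delay)

# Demo: a function that crashes twice, then exits cleanly.
calls = []
def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("boom")

resilient_loop(flaky, delay=0)
```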
The daemon auto-starts via .zprofile when Ghostty opens on the Mac Mini. It cannot run as a LaunchAgent because macOS TCC (Transparency, Consent, and Control) does not grant Full Disk Access to processes spawned by launchd -- the iMessage bridge needs read access to ~/Library/Messages/chat.db, which requires FDA on the parent terminal. A lockfile at ~/m3shdup/.mesh-daemon.lock prevents duplicate instances.
6. iMessage Bridge
The iMessage bridge is the system's most unconventional component. It turns Apple's iMessage platform into a command interface for the mesh by reading the SQLite database that Messages.app writes to disk and sending replies via AppleScript's osascript utility.
Thread 1 (iMessage to Hub) opens chat.db in read-only mode (SQLite ?mode=ro URI parameter) and polls for new messages from the user's phone number every 3 seconds. It reads the message table joined through chat_message_join to filter by the specific chat identified by the user's number (+1XXXXXXXXXX). Only messages where is_from_me = 0 (incoming from the user) are processed. Messages that match the relay prefix regex (^\[[A-Za-z][A-Za-z0-9 _-]*\] ) are skipped to prevent bridge echo loops -- these are messages that the bridge itself sent to the user and that show up as incoming in chat.db. Messages ending with "Sent by Claude" are also filtered out to avoid processing Claude Channels plugin echoes.
The bridge handles three message types: text-only messages, image attachments, and combined text-plus-image messages. For attachments, it reads the attachment table joined through message_attachment_join, resolves file paths via os.path.realpath, validates that the resolved path falls within ~/Library/Messages/Attachments (preventing symlink traversal), checks file size against a 20MB cap, and verifies image magic bytes before trusting the file extension. Supported formats are JPEG (magic \xff\xd8\xff), PNG (\x89PNG), GIF (GIF8), WebP (RIFF), and HEIC/HEIF (bytes 4-8 = ftyp). Image references are posted to the hub as [image:/path/to/file], which the Archon watcher intercepts and processes through Claude's vision capability.
If a hub POST fails, the message is appended to a retry queue (deque(maxlen=50)). On each poll cycle, the thread attempts to drain the retry queue before processing new messages.
Thread 2 (Hub to iMessage) polls the hub for new agent messages every 5 seconds. It filters for messages with sender_type == "agent" and delivers them to the user via osascript. The message is prefixed with the agent name in brackets (e.g., [Archon] Here's what I found...). Messages longer than 3,000 characters are truncated at the nearest word boundary with a [...truncated -- full result on hub] suffix.
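The truncation rule can be sketched as follows; the function name is hypothetical, with the limit and suffix taken from the description above.

```python
LIMIT = 3000
SUFFIX = " [...truncated -- full result on hub]"

def truncate_for_imessage(text: str, limit: int = LIMIT) -> str:
    """Cut long agent replies at the nearest word boundary under the
    limit, appending the bridge's truncation marker."""
    if len(text) <= limit:
        return text
    cut = text.rfind(" ", 0, limit)
    if cut == -1:  # no space found: fall back to a hard cut
        cut = limit
    return text[:cut] + SUFFIX
```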
The osascript execution uses an argv-based approach to prevent injection. The message content is never interpolated into the AppleScript string. Instead, it is passed as a command-line argument to osascript:
on run argv
    tell application "Messages"
        send (item 1 of argv) to participant "+1XXXXXXXXXX" of (1st account whose service type = iMessage)
    end tell
end run
The script attempts participant first, then falls back to buddy if the first form fails. Both calls have a 15-second subprocess timeout.
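Building the argv makes the injection resistance concrete. `build_send_command` is a hypothetical helper name; the point is that hostile content rides in a plain argument and never touches the script text.

```python
import shlex

APPLESCRIPT = '''
on run argv
    tell application "Messages"
        send (item 1 of argv) to participant "+1XXXXXXXXXX" of (1st account whose service type = iMessage)
    end tell
end run
'''

def build_send_command(message: str) -> list[str]:
    """Build the osascript argv. The message travels as item 1 of
    argv, so quotes, backslashes, and AppleScript syntax in the
    content can never change the script's meaning."""
    return ["osascript", "-e", APPLESCRIPT, message]

# Hostile content stays inert: it is data, not script.
cmd = build_send_command('say "hi" & (do shell script "rm -rf /")')
```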
7. Command Interface
The user communicates with the mesh by texting commands from his iPhone. The command parser in the mesh daemon recognizes 14 command patterns:
| Command | Mode | Description |
|---|---|---|
| `status` | status | Show agent online/offline states and active task counts |
| `tasks` | tasks | List the 10 most recent tasks with status icons |
| `@context` | context_get | Show all shared context key-value pairs |
| `@context key=val` | context_set | Set a shared context entry |
| `@rex <text>` | rex_message | Dispatch a task directly to Rex |
| `approve` | approval | Approve the most recent pending approval |
| `reject` | approval | Reject the most recent pending approval |
| `pending` | pending_approvals | List all pending approval requests |
| `costs` / `@costs` | costs | Show today's token usage and cost by agent |
| `escalations` | escalations | List open escalations |
| `do <text>` | task | Execute a task via Archon with full context |
| `@archon do <text>` | task | Same as `do` (explicit Archon targeting) |
| `auto <goal>` | autonomous | Decompose goal into subtasks, dispatch to Rex |
| `summary` | summary | Summarize recent task results |
| (anything else) | chat | Free-form conversation with Archon |
The chat mode builds a context-aware prompt that includes the current shared context (active project, priorities, whatever the user has set), the last 15 messages from the conversation, and the user's latest message. The task mode (do) uses a similar but more detailed prompt that instructs Claude to be thorough and provide actionable output.
Autonomous mode is the most sophisticated command. When the user texts auto research competitor pricing for subscription apps, Archon first asks Claude to decompose the goal into 2-5 concrete subtasks, each with a title and detailed prompt. Claude returns a JSON array (with markdown code fence stripping for robustness), and Archon dispatches each subtask to Rex via the hub's task API. The user receives a summary of what was dispatched and can track progress with tasks or summary.
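The fence-stripping step can be sketched as follows. `parse_subtasks` is a hypothetical helper (the daemon's actual parsing may differ), and the fence marker is built indirectly only to keep this example readable.

```python
import json

FENCE = "`" * 3  # a literal markdown code fence (```)

def parse_subtasks(raw: str) -> list[dict]:
    """Strip an optional markdown code fence from Claude's reply
    and parse the JSON array of subtasks."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()
        # Drop the opening fence line (``` or ```json), and the
        # closing fence if present.
        lines = lines[1:-1] if lines[-1].strip() == FENCE else lines[1:]
        text = "\n".join(lines)
    return json.loads(text)

reply = FENCE + 'json\n[{"title": "Pricing scan", "prompt": "List tiers."}]\n' + FENCE
subtasks = parse_subtasks(reply)
```

Models often wrap JSON in a fence despite instructions not to, so tolerating both forms makes the decomposition step far more robust than a bare `json.loads`.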
8. Task Dispatch Engine
The dispatcher (app/dispatch.py, 100 lines) implements capacity-aware routing with circuit-breaker protection. It is a singleton that maintains two in-memory dictionaries: _failures (a per-agent list of failure timestamps) and _circuit_open_at (the monotonic time at which an agent's circuit breaker tripped).
When a task arrives, the dispatcher follows this algorithm:
1. If `assign_to` is specified (e.g., the user texted `@rex`), verify the target worker has capacity and a closed circuit. If not, fall through to automatic assignment.
2. Query `db.get_available_workers(capability)` for all online or busy agents with spare capacity (active tasks below `max_concurrent`) that list the required capability and do not exclude it.
3. Filter out agents with open circuit breakers.
4. Pick the first match (already sorted by most spare capacity descending).
5. Create the task record, set status to "assigned", and increment the agent's `active_tasks` counter.
6. If no worker is available, create the task with status "queued" and no assignee.
The circuit breaker trips after 3 consecutive failures (CIRCUIT_BREAKER_THRESHOLD). Once tripped, the agent is excluded from dispatch for 60 seconds (CIRCUIT_BREAKER_COOLDOWN). After the cooldown elapses, the circuit enters a half-open state: the failure history is cleared and the next task is allowed through. If that task succeeds (recorded via record_success), the circuit stays closed. If it fails, the breaker trips again immediately.
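The breaker's state machine can be sketched as a small class. This is illustrative: it tracks failure counts where the actual dispatcher keeps timestamp lists, and the injectable clock is only there to make the sketch testable.

```python
import time

CIRCUIT_BREAKER_THRESHOLD = 3
CIRCUIT_BREAKER_COOLDOWN = 60.0  # seconds

class CircuitBreaker:
    """Per-agent breaker: trip after N consecutive failures, exclude
    the agent for the cooldown, then allow one probe task (half-open)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._failures: dict[str, int] = {}
        self._open_at: dict[str, float] = {}

    def record_failure(self, agent: str) -> None:
        self._failures[agent] = self._failures.get(agent, 0) + 1
        if self._failures[agent] >= CIRCUIT_BREAKER_THRESHOLD:
            self._open_at[agent] = self._clock()

    def record_success(self, agent: str) -> None:
        # A success fully closes the circuit and clears history.
        self._failures.pop(agent, None)
        self._open_at.pop(agent, None)

    def is_open(self, agent: str) -> bool:
        opened = self._open_at.get(agent)
        if opened is None:
            return False
        if self._clock() - opened >= CIRCUIT_BREAKER_COOLDOWN:
            # Cooldown elapsed: clear history, let the next task probe.
            self.record_success(agent)
            return False
        return True
```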
The get_available_workers database query uses a LIKE pattern on the JSON-serialized capabilities array ('%"research"%'). This is a pragmatic choice for a small agent pool. The query also checks the excluded list to ensure a worker is not asked to perform operations it has explicitly opted out of (e.g., Rex excludes docker, deploy, xcode, and destructive_ops).
The active_tasks counter is decremented when a task reaches any terminal state (done, failed, cancelled, escalated). This decrement happens in db.update_task by comparing the old and new status, ensuring the counter does not drift even if an update is applied multiple times.
9. Safety and Control Layer
M3SHDUP provides three interlocking safety mechanisms: the approval queue, the escalation chain, and cost tracking.
Approvals. Any agent can create an approval request by POSTing to /api/approvals. The request includes the agent name, the action being requested, a human-readable description, optional context, and a timeout (default 1 hour). Pending approvals are surfaced to the user via iMessage when he texts pending. The user responds with approve or reject, and the decision is applied to the most recent pending approval. Expired approvals are automatically cleaned up.
Escalations. When an agent fails a task three times, the worker automatically creates an escalation. The escalation chain routes Rex's failures to Archon, and Archon's failures to the user. Each escalation record includes the originating agent, the target agent, the task ID, and the reason (last error message, truncated to 200 characters). Open escalations are surfaced via the escalations command.
Cost Tracking. Every Claude invocation logs estimated token usage to the cost_log table. Input tokens are estimated as len(prompt) // 4 and output tokens as len(response) // 4. Cost is calculated using Claude Sonnet pricing: $3 per million input tokens and $15 per million output tokens. The hub exposes a GET /api/costs endpoint that returns today's usage grouped by agent, with a daily limit of 500,000 tokens per agent. The user can check spending at any time by texting costs.
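The estimate reduces to a few lines; `estimate_cost` is a hypothetical helper name, with the 4-chars-per-token heuristic and Sonnet pricing taken from the description above.

```python
INPUT_PRICE_PER_MTOK = 3.00    # USD per million input tokens (Sonnet)
OUTPUT_PRICE_PER_MTOK = 15.00  # USD per million output tokens

def estimate_cost(prompt: str, response: str) -> dict:
    """Rough 4-chars-per-token estimate plus the dollar cost
    at Claude Sonnet pricing."""
    input_tokens = len(prompt) // 4
    output_tokens = len(response) // 4
    cost_usd = (input_tokens * INPUT_PRICE_PER_MTOK
                + output_tokens * OUTPUT_PRICE_PER_MTOK) / 1_000_000
    return {"input_tokens": input_tokens,
            "output_tokens": output_tokens,
            "cost_usd": round(cost_usd, 6)}

entry = estimate_cost("a" * 4000, "b" * 8000)
```

The estimate is deliberately crude (real tokenizers vary by content), but it is consistent across agents, which is all a daily budget cap needs.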
10. Web UI
The web UI (app/templates/index.html, 1,745 lines) is a dark-themed, mobile-first PWA built without any JavaScript framework. It uses CSS custom properties for theming, JetBrains Mono as the primary typeface, and a gradient brand identity (amber to purple to emerald) consistent with the M3SHDUP logo. The design targets iOS Safari as the primary mobile browser.
The UI is organized into four tabs accessible via a bottom navigation bar:
Dashboard. Displays agent cards showing each worker's name, machine, online/offline status (green, amber, or red dot), and task utilization (e.g., "2/5 tasks"). A status summary bar shows total online agents, active tasks, and pending approvals at a glance.
Chat. A full-screen chat interface with message bubbles color-coded by sender: amber for the user, purple for Archon, emerald for Rex. Messages are loaded from the hub on tab activation and updated in real-time via SSE. The input field is fixed at the bottom with a 16px font size (the threshold that prevents iOS Safari from auto-zooming on focus).
Tasks. A kanban-style task list showing queued, assigned, running, done, and failed tasks. Each task card shows the title, assignee, status badge, and creation timestamp.
Logs. A live log stream fed by SSE events. Each event is timestamped and color-coded by type (system, message, task, error).
iOS Safari compatibility required several targeted fixes discovered during testing: 100% width instead of 100vw (which includes the scrollbar and causes horizontal overflow), pre-line instead of pre-wrap for message content (which can expand elements before max-width clips), maximum-scale=1 in the viewport meta tag to prevent zoom-on-input-focus, and inset box-shadows instead of outset to prevent overflow from child elements.
The PWA manifest at /static/manifest.json enables "Add to Home Screen" on iOS, providing a native-app-like experience with black-translucent status bar styling and standalone display mode.
11. Database Schema
The hub uses a SQLite database in WAL (Write-Ahead Logging) mode, stored at data/m3shdup.db. The schema consists of 7 tables and 8 indexes, all created with CREATE TABLE IF NOT EXISTS to survive container restarts without data loss.
| Table | Purpose | Key Columns |
|---|---|---|
| messages | Chat history | id, ts, sender, sender_type, content, channel, reply_to |
| agents | Worker registry | id, name, machine, status, capabilities (JSON), max_concurrent, active_tasks, excluded (JSON) |
| tasks | Task queue | id, title, prompt, assigned_to, status, priority, timeout, output, error, attempts |
| approvals | Permission requests | id, agent, action, description, status, decided_by, timeout |
| context | Shared KV store | key (PK), value, set_by |
| cost_log | Token usage | agent, task_id, input_tokens, output_tokens, model, cost_usd |
| escalations | Failure chain | from_agent, to_agent, task_id, reason, status |
Status enums are enforced via CHECK constraints. The tasks table allows: queued, assigned, running, done, failed, cancelled, escalated. The agents table allows: online, offline, busy. The approvals table allows: pending, approved, rejected, expired.
Foreign keys are enabled via PRAGMA foreign_keys=ON. The messages.reply_to column references messages.id, enabling threaded conversations. The tasks.assigned_to column references agents.id.
Indexes cover the highest-frequency query patterns: messages by timestamp and channel, tasks by status and assignee, cost log by agent and timestamp, approvals and escalations by status.
Agent registration uses INSERT OR IGNORE for seed data and INSERT ... ON CONFLICT DO UPDATE for worker self-registration, ensuring re-registration updates capabilities and status without losing the original registered_at timestamp.
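The upsert pattern can be sketched against an in-memory database. The column set is trimmed to the essentials and the names are assumptions, not the hub's actual schema code, but the `ON CONFLICT DO UPDATE` behavior matches the description above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE IF NOT EXISTS agents (
    name TEXT PRIMARY KEY,
    capabilities TEXT,
    status TEXT,
    registered_at TEXT DEFAULT (datetime('now'))
)""")

def register(name: str, capabilities: str) -> None:
    """First registration inserts; re-registration updates capabilities
    and status but leaves registered_at untouched."""
    db.execute("""
        INSERT INTO agents (name, capabilities, status)
        VALUES (?, ?, 'online')
        ON CONFLICT(name) DO UPDATE SET
            capabilities = excluded.capabilities,
            status = 'online'
    """, (name, capabilities))

register("rex", '["research"]')
register("rex", '["research", "file_ops"]')  # upsert, not duplicate
caps, = db.execute(
    "SELECT capabilities FROM agents WHERE name='rex'").fetchone()
```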
12. API Reference
All endpoints except /api/health (unauthenticated, returns only {"status": "ok"}), /login, and /api/login require a valid Authorization: Bearer <token> header or an m3sh_session cookie.
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | / | Cookie | Web UI (redirects to /login if unauthenticated) |
| GET | /login | None | Login page |
| POST | /api/login | None | Authenticate and set session cookie |
| GET | /api/stream | Bearer/Cookie | SSE event stream (all event types) |
| POST | /api/messages | Bearer/Cookie | Send a message (50KB max) |
| GET | /api/messages | Bearer/Cookie | Get message history (limit capped at 500) |
| POST | /api/tasks | Bearer/Cookie | Create and dispatch a task (8KB prompt max) |
| PUT | /api/tasks/{id} | Bearer/Cookie | Update task status/output |
| GET | /api/tasks | Bearer/Cookie | List tasks (filter by status, assignee) |
| POST | /api/tasks/{id}/stream | Bearer/Cookie | Append to task stream log |
| POST | /api/agents/{id}/register | Bearer | Self-register a worker |
| POST | /api/agents/{id}/heartbeat | Bearer | Keep-alive ping |
| GET | /api/agents | Bearer/Cookie | List all agents |
| GET | /api/context | Bearer/Cookie | Get all shared context pairs |
| PUT | /api/context | Bearer/Cookie | Set a context key-value pair |
| DELETE | /api/context/{key} | Bearer/Cookie | Delete a context key |
| POST | /api/approvals | Bearer | Create an approval request |
| GET | /api/approvals | Bearer/Cookie | List approvals (filter by status) |
| PUT | /api/approvals/{id} | Bearer/Cookie | Approve or reject |
| POST | /api/escalations | Bearer | Create an escalation |
| GET | /api/escalations | Bearer/Cookie | List escalations (filter by status) |
| PUT | /api/escalations/{id} | Bearer/Cookie | Resolve or dismiss |
| POST | /api/costs | Bearer | Log a cost entry |
| GET | /api/costs | Bearer/Cookie | Get today's usage by agent |
| GET | /api/health | None/Bearer | Health check (full detail if authenticated) |
SSE events broadcast on /api/stream include: system, message, task_assigned, task_queued, task_updated, task_stream, agent_registered, agent_status, context_updated, context_deleted, approval_created, approval_decided, escalation_created, escalation_updated.
13. Security Posture
M3SHDUP underwent two independent security audits on the day it was built: a full threat model by Bastion (security specialist) and a code-level review by Sentinel (code review agent). All critical and high-severity findings were remediated. The git history was scrubbed of any sensitive values before the first public commit.
Remediated (Critical):
- All configuration via environment variables — no secrets in source code
- Bearer token authentication with constant-time comparison on all paths
- Input length limits enforced: messages at 50KB, prompts at 8KB, osascript at 3KB
- osascript injection fix: content passed via argv, never interpolated into AppleScript strings
- Image attachment validation: path restricted to Attachments directory, symlinks resolved, magic bytes verified, 20MB size cap
Remediated (High):
- Constant-time auth comparison via `hmac.compare_digest` on all paths
- Health endpoint stripped to bare `{"status": "ok"}` for unauthenticated callers; full agent status requires auth
- Container runs as non-root user (`appuser`) via Dockerfile `USER` directive
- Docker image pinned to `python:3.12.9-slim` (specific version, not floating tag)
- SSE queue capped at 500 events per subscriber
- Query limits capped at 500 across all listing endpoints
- `active_tasks` counter properly decrements on task completion
Acknowledged (architectural):
- `--dangerously-skip-permissions` on Claude CLI is an inherent risk. The flag is required for non-interactive execution. The bearer token and Tailscale network boundary are the security perimeter. Per-agent tokens or capability-scoped invocation would be the proper mitigation.
- No rate limiting on API endpoints. Caddy rate-limiting at the reverse proxy layer would be the correct placement.
14. Container and Deployment
The hub runs as a single Docker container on the staging VPS at [REDACTED].
Dockerfile:
FROM python:3.12.9-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN adduser --disabled-password --no-create-home appuser
COPY . .
RUN chown -R appuser:appuser /app
USER appuser
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Container Configuration:
- Port mapping: 8333 (host) to 8000 (container)
- Volume: `./data:/app/data` (SQLite database persistence)
- Memory limit: 256MB, CPU limit: 0.5 cores
- Memory reservation: 64MB
- Restart policy: `unless-stopped`
- Log rotation: 10MB max per file, 3 files retained
- Healthcheck: Python urllib hitting `http://localhost:8000/api/health` every 30 seconds
Deployment workflow:
rsync -avz --exclude=data --exclude=.env ~/m3shdup/ staging:~/m3shdup/
ssh staging "cd m3shdup && docker compose up -d --build m3shdup"
Environment variables are managed via a .env file on the VPS, gitignored from the repository. The M3SHDUP_TOKEN variable is the only required secret.
Dependencies (requirements.txt):
- `fastapi>=0.115.0` -- ASGI web framework
- `uvicorn[standard]>=0.34.0` -- ASGI server with uvloop and httptools
- `python-multipart>=0.0.18` -- Form data parsing
- `sse-starlette>=2.2.1` -- Server-Sent Events support
- `websockets>=14.2` -- WebSocket protocol (uvicorn dependency)
- `aiosqlite>=0.21.0` -- Async SQLite driver
- `httpx>=0.28.0` -- Async HTTP client
- `pytest>=8.0` -- Test framework (dev)
- `pytest-asyncio>=0.24.0` -- Async test support (dev)
15. Test Coverage
The test suite comprises 58 tests across 4 files totaling 1,231 lines:
| File | Tests | Lines | Coverage Area |
|---|---|---|---|
| test_db.py | ~25 | 519 | Schema init, CRUD for all 7 tables, capacity queries, stale agent sweep |
| test_api.py | ~20 | 450 | All API endpoints, auth validation, input limits, error responses |
| test_dispatch.py | ~8 | 169 | Circuit breaker, capacity routing, assign_to targeting, no-worker fallback |
| test_worker.py | ~5 | 93 | Worker registration, Claude subprocess mock, task lifecycle |
Tests use pytest-asyncio for async test support and test the database layer directly (no HTTP mocking for db tests) against an in-memory SQLite instance. API tests use FastAPI's TestClient. Worker tests mock the Claude subprocess to avoid requiring the CLI during CI.
16. Node Inventory
| Node | Machine | CPU | RAM | Tailscale IP | Role | Max Concurrent | Status |
|---|---|---|---|---|---|---|---|
| Archon | Mac Mini M2 | Apple M2 8-core | 24 GB | 100.x.x.x | Primary AI, full tools, iMessage bridge | 5 | Online |
| Rex | Mac Mini Intel | Intel i5 | 16 GB | 100.x.x.x | Research, code, file ops | 2 | Online |
| Crucible | iMac 27" | Intel i5 | 16 GB | 100.x.x.x | Overflow compute, research | 3 | Online |
Archon runs the mesh daemon (bridge + watcher + Rex agent) and has full tool access including Docker, deploy, and destructive operations. It is the only node with iMessage bridge capability.
Rex is the dedicated research node. It runs the generic mesh-worker.py and explicitly excludes Docker, deploy, Xcode, and destructive operations. Stress testing confirmed 2 concurrent tasks is comfortable, 3 is heavy, and 4+ should not be attempted on this hardware.
Crucible is the newest node, an iMac 27" running macOS Monterey 12.7.6. It required special setup: Ghostty is incompatible with macOS versions below Ventura, so Terminal.app is used instead. Node.js v22.16.0 was installed from a tarball (Homebrew formula is broken on Monterey, which is Tier 3 unsupported). Claude Code was installed via npm with sudo and symlinked to /usr/local/bin. SSL certificate issues on older macOS were resolved by installing certifi and setting the SSL_CERT_FILE environment variable.
All nodes connect to each other and to the hub via Tailscale. SSH key authentication is configured for all nodes via ~/.ssh/config.
17. Technology Stack
| Layer | Technology | Version | Purpose |
|---|---|---|---|
| Hub Framework | FastAPI | 0.115+ | ASGI web framework, API routing |
| ASGI Server | Uvicorn | 0.34+ | Production server with uvloop |
| Database | SQLite | 3.x (WAL) | Persistence (messages, tasks, agents) |
| Async DB Driver | aiosqlite | 0.21+ | Non-blocking SQLite access |
| SSE | sse-starlette | 2.2+ | Server-Sent Events broadcasting |
| AI Runtime | Claude CLI | Latest | Subprocess-invoked LLM inference |
| Reverse Proxy | Caddy | 2.x | TLS termination, HTTP/2 |
| Container Runtime | Docker | Latest | Hub deployment on VPS |
| Mesh VPN | Tailscale | Latest | Encrypted node-to-node tunnels |
| UI Font | JetBrains Mono | Web | Monospace typeface |
| Bridge (read) | sqlite3 | stdlib | chat.db read-only access |
| Bridge (write) | osascript | macOS | iMessage send via AppleScript |
| Testing | pytest + pytest-asyncio | 8.0+ | Async test runner |
| Language | Python | 3.12 | All components |
18. Architecture Diagram
┌───────────────────────────────────────────────────────────────────────────────┐
│ INTERNET │
│ │
│ ┌─────────────┐ │
│ │ the user's │ │
│ │ iPhone │ ◄──── iMessage ────► Messages.app ───► chat.db │
│ │ / Browser │ (M2 Mac) │
│ └──────┬──────┘ │
│ │ HTTPS │
│ ▼ │
│ ┌──────────────────────────────────────────┐ │
│ │ mesh.demobygrit.com │ │
│ │ Caddy (TLS + H2) │ │
│ │ ▼ │ │
│ │ ┌──────────────────────────────┐ │ │
│ │ │ M3SHDUP Hub │ │ │
│ │ │ FastAPI + SSE + SQLite │ │ │
│ │ │ │ │ │
│ │ │ /api/messages POST/GET │ │ │
│ │ │ /api/tasks POST/GET │ │ │
│ │ │ /api/agents POST/GET │ │ │
│ │ │ /api/approvals POST/GET │ │ │
│ │ │ /api/context GET/PUT │ │ │
│ │ │ /api/costs POST/GET │ │ │
│ │ │ /api/escalations POST/GET │ │ │
│ │ │ /api/stream SSE │ │ │
│ │ │ /api/health GET │ │ │
│ │ └──────────────────────────────┘ │ │
│ │ Staging VPS (Hetzner) │ │
│ └──────────────────────────────────────────┘ │
│ │ │
└──────────────────────┼───────────────────────────────────────────────────────┘
│
─────── Tailscale WireGuard VPN ──────
│
┌──────────────┼──────────────┐
│ │ │
┌───────┴───────┐ ┌────┴────┐ ┌──────┴──────┐
│ M2 Mac Mini │ │ Intel │ │ iMac 27" │
│ (Archon) │ │ (Rex) │ │ (Crucible) │
│ │ │ │ │ │
│ mesh-daemon │ │ mesh- │ │ mesh- │
│ ├ T1: iMsg→H │ │ worker │ │ worker │
│ ├ T2: H→iMsg │ │ │ │ │
│ ├ T3: Archon │ │ claude │ │ claude │
│ └ T4: Rex │ │ --print │ │ --print │
│ │ │ │ │ │
│ 5 concurrent │ │ 2 conc │ │ 3 conc │
│ chat.db (RO) │ │ │ │ │
│ osascript │ │ │ │ │
└───────────────┘ └─────────┘ └─────────────┘
Data Flow: The user texts a question from iPhone
1. The user types "what's the status of the archifi build" on iPhone
2. iMessage delivers to Messages.app on M2 Mac Mini
3. Messages.app writes to ~/Library/Messages/chat.db
4. T1 (iMsg→Hub) polls chat.db, detects new row
5. T1 POSTs to hub: {"sender":"User", "sender_type":"human", "content":"..."}
6. Hub stores in messages table, broadcasts SSE event
7. T3 (Archon watcher) polls hub, sees the user's human message
8. T3 parses command → mode="chat"
9. T3 builds context-aware prompt (last 15 messages + shared context)
10. T3 calls claude --print with prompt
11. Claude returns response
12. T3 POSTs response to hub as Archon agent message
13. Hub stores, broadcasts SSE event
14. T2 (Hub→iMsg) polls hub, sees new agent message
15. T2 sends via osascript: [Archon] <response>
16. iMessage delivers to the user's iPhone
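Step 4's poll reduces to a single SQL query against chat.db. This is a simplified, hypothetical sketch: the column names (ROWID, text, is_from_me) follow Apple's Messages schema, but the real bridge also resolves the sender via the handle table, converts Apple-epoch timestamps, and opens the database read-only.

```python
import sqlite3

def poll_new_messages(conn: sqlite3.Connection, last_rowid: int):
    """Return inbound messages newer than last_rowid, plus the new cursor.

    Simplified sketch of the T1 (iMessage -> Hub) poll loop.
    """
    rows = conn.execute(
        "SELECT ROWID, text FROM message "
        "WHERE ROWID > ? AND is_from_me = 0 AND text IS NOT NULL "
        "ORDER BY ROWID",
        (last_rowid,),
    ).fetchall()
    # Advance the cursor only when new rows were seen
    new_cursor = rows[-1][0] if rows else last_rowid
    return rows, new_cursor
```

Each returned row would then become the step-5 POST body: `{"sender": ..., "sender_type": "human", "content": text}`.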
Data Flow: The user dispatches a task to Rex
1. The user texts "@rex research storekit2 subscription lifecycle"
2. T1 → Hub (same as above, steps 1-6)
3. T3 parses command → mode="rex_message"
4. T3 POSTs to hub: /api/tasks with assign_to="rex"
5. Dispatcher checks Rex capacity and circuit breaker
6. Task created with status="assigned"
7. Rex worker polls /api/tasks?assigned_to=rex&status=assigned
8. Rex picks up task, sets status="running"
9. Rex invokes claude --print with task prompt
10. Claude returns research results
11. Rex POSTs cost log, updates task status="done" with output
12. Rex POSTs chat message with result preview
13. T2 relays to the user via iMessage
19. Build History
M3SHDUP was built in 5 phases over a single session on April 3, 2026:
Phase 1 -- Bulletproof Foundation (commits 1-12)
Seven-table schema, capacity-aware dispatch with circuit breaker, self-registering workers, generic mesh-worker.py, hardened iMessage bridge with retry queues.
Phase 2 -- Intelligence (commits 13-18)
Command router with 14 commands, context-aware rich prompts (chat vs task mode), Rex deployed as mesh worker on Intel Mac Mini.
Phase 3 -- Safety and Control (commits 19-22)
Approval queue, cost tracking with daily limits, escalation chain (Rex → Archon → User, auto-escalate after 3 failures).
Phase 4 -- Web UI (commits 23-29)
Four-tab PWA: Dashboard, Chat, Tasks, Logs. Mobile-first dark theme, JetBrains Mono, iOS Safari compatibility fixes.
Phase 5 -- Jaw-Droppers (commits 30-34)
Photo/screenshot dispatch via iMessage with magic byte verification, autonomous mode (decompose + dispatch + monitor), summary command for task results.
Timeline:
- First commit: April 3, 2026 at 10:38 AM CST
- Last commit: April 3, 2026 at 1:47 PM CST
- Duration: ~3 hours
- Total commits: 34
- Total tests: 58
20. Future Roadmap
Phase 6 -- Open Source Prep. README with setup instructions, MIT license, .env.example, demo mode (read-only public view), extraction of user-specific configuration into environment variables.
Full QA Pass. End-to-end testing of every command from iMessage, web UI on physical iPhone, Rex dispatch round-trip, photo dispatch, autonomous mode, approval flow, and escalation chain.
Rate Limiting. Sliding-window rate limiter at the Caddy layer for login brute-force protection and global request caps.
Per-Agent Tokens. Issue unique tokens per worker so the hub can distinguish which agent is making a request, preventing one compromised worker from impersonating another.
Audit Log. An audit_log table recording every mutating operation with timestamp, action, actor, and payload hash for forensic analysis.
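A sketch of what one such row could look like. The field names follow the roadmap text; SHA-256 over canonical JSON is an assumed design choice, not a committed implementation.

```python
import hashlib
import json
import time

def audit_record(action: str, actor: str, payload: dict) -> dict:
    """Build one hypothetical audit_log row. The payload itself is not
    stored; only a SHA-256 digest of its canonical JSON form is kept,
    so any later tampering with the original payload is detectable."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return {
        "timestamp": time.time(),
        "action": action,
        "actor": actor,
        "payload_sha256": hashlib.sha256(canonical.encode()).hexdigest(),
    }
```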
Additional Workers. Any machine with Python 3.12+ and the Claude CLI can join the mesh by running mesh-worker.py with the hub URL and token. The system is designed for horizontal scaling.
Screenshots: Chat View, Dashboard, Mobile View (images not reproduced here).
21. Changelog
| Version | Date | Changes |
|---|---|---|
| V1 | 2026-04-03 | Initial release. 5 phases, 34 commits, 58 tests, 2 security audits. |
Generated by Archon (Claude Opus 4.6) for the author
System: M3SHDUP | mesh.demobygrit.com