tegwick 166aedfa8d feat: add workplan aliases and legacy meter
Adds preferred workplan REST/event surfaces, legacy-meter telemetry and weekly review summaries, documentation/dashboard terminology updates, dashboard API loading fixes, and close-out sync for STATE-WP-0052 and STATE-WP-0054.
2026-06-04 08:25:31 +02:00

# State Hub
State Hub is the live coordination service for the Custodian ecosystem:
PostgreSQL persistence, FastAPI API, FastMCP server, Observable dashboard,
consistency tooling, and repo/workplan synchronization.
This repository is the standalone home for the service. It was extracted from
the former embedded implementation at:
```text
/home/worsch/the-custodian/state-hub
```
## Extraction State
The extraction workplan `CUST-WP-0043 - State Hub Repo Extraction` is complete.
Current state:
- The implementation has been imported here with subtree history.
- `CUST-WP-0042` has been re-homed into this repository.
- The old embedded tree in `the-custodian` remains only as a pointer.
- This repository is authoritative for State Hub code, docs, tests, dashboard,
migrations, scripts, policies, and State Hub-local workplans.
## Workplans
New State Hub-local workplans should use the prefix:
```text
STATE-WP-0001
```
Legacy Custodian-hosted State Hub plans, such as `CUST-WP-0042`, may retain
their existing IDs when that preserves State Hub workstream/task continuity.
Do not create duplicate workstreams manually; write the workplan file first,
then run consistency sync.
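As a rough illustration of the ID convention above, the following sketch parses a workplan ID and reports whether it uses the State Hub-local prefix. The four-digit shape is inferred from the examples in this README, and `parse_workplan_id` and `WorkplanId` are hypothetical names, not part of the State Hub codebase.

```python
import re
from typing import NamedTuple

# ASSUMPTION: IDs look like <PREFIX>-WP-<four digits>, as in the README
# examples STATE-WP-0001 and CUST-WP-0042.
WORKPLAN_ID = re.compile(r"^(?P<prefix>[A-Z]+)-WP-(?P<number>\d{4})$")


class WorkplanId(NamedTuple):
    prefix: str
    number: int

    @property
    def is_state_hub_local(self) -> bool:
        # New State Hub-local workplans use the STATE prefix.
        return self.prefix == "STATE"


def parse_workplan_id(raw: str) -> WorkplanId:
    """Split a workplan ID into prefix and number, or raise ValueError."""
    match = WORKPLAN_ID.match(raw.strip())
    if match is None:
        raise ValueError(f"not a workplan ID: {raw!r}")
    return WorkplanId(match["prefix"], int(match["number"]))
```

A legacy ID such as `CUST-WP-0042` parses successfully but reports `is_state_hub_local` as `False`, which matches the policy that legacy plans may keep their existing IDs.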
---
## Stack
| Layer | Technology | Port |
|-------|-----------|------|
| Database | PostgreSQL 16-alpine (Docker) | `127.0.0.1:5432` |
| API | FastAPI + SQLAlchemy 2.0 async + asyncpg | `127.0.0.1:8000` |
| MCP server | FastMCP SSE | `127.0.0.1:8001` |
| Dashboard | Observable Framework | `127.0.0.1:3000` |
| CLI | `custodian` (Python, uv entry point) | — |
All services bind to `127.0.0.1` only; nothing is exposed to the network.
---
## Setup
### Prerequisites
- Docker Engine
- Python 3.12+ with `uv` (`pip install uv`)
- Node.js 18+ (dashboard only)
### First-time
```bash
cd /home/worsch/state-hub
cp .env.example .env # edit POSTGRES_PASSWORD
make install # uv sync
make db # docker compose up postgres
make migrate # alembic upgrade head
make seed # insert 6 canonical topics
make api # db + migrate + uvicorn :8000 (restarts if running)
```
### Dashboard
```bash
make dashboard # installs dashboard deps if needed, then Observable dev server on :3000
make dashboard-check # installs deps if needed, then runs Observable build
```
### Start Everything
To start the full infrastructure, run each of the following in a separate console:
```bash
make db # docker compose up postgres
make mcp-http # start state-hub mcp service
make dashboard # Observable dev server on :3000
make bridges # set up SSH bridges for cross-machine access
```
### CLI
```bash
make install-cli # symlink .venv/bin/custodian → ~/.local/bin
custodian status # API health + summary totals
custodian register-project # register cwd as a Custodian project
```
---
## Makefile Targets
| Target | What it does |
|--------|-------------|
| `make install` | `uv sync` — install Python deps + entry points |
| `make install-cli` | Symlink `custodian` to `~/.local/bin` |
| `make db` | Start postgres container |
| `make db-tools` | Start postgres + pgadmin (http://127.0.0.1:5050) |
| `make migrate` | `alembic upgrade head` |
| `make seed` | Insert 6 canonical topics |
| `make api` | `db` + wait + `migrate` + `uvicorn` (restarts if running) |
| `make dashboard-install` | Install dashboard npm deps from `dashboard/package-lock.json` |
| `make dashboard-check` | Build the Observable dashboard as a smoke/regression check |
| `make dashboard` | Install deps if needed, then start Observable dev server (restarts if running) |
| `make check` | `curl /state/health` |
| `make test` | Python test suite plus `make dashboard-check` |
| `make register-project DOMAIN=x PROJECT_PATH=y` | Register a project |
| `make clean` | `docker compose down -v` (destroys DB volume) |
---
## Database Schema
Five tables in dependency order:
```
topics
  └── workstreams
        └── tasks  (self-FK: parent_task_id)
              └── progress_events
decisions  (FK: topic_id, workstream_id; at least one required)
  └── progress_events
```
### Enums
| Enum | Values |
|------|--------|
| `topic_status` | `active` · `paused` · `archived` |
| `workstream_status` | `proposed` · `ready` · `active` · `blocked` · `backlog` · `finished` · `archived` |
| `task_status` | `wait` · `todo` · `progress` · `done` · `cancel` |
| `task_priority` | `low` · `medium` · `high` · `critical` |
| `decision_type` | `made` · `pending` |
| `decision_status` | `open` · `resolved` · `escalated` · `superseded` |
| `domain` | `custodian` · `railiance` · `markitect` · `coulomb_social` · `personhood` · `foerster_capabilities` |
### Governance constraints encoded in schema
- No hard DELETE endpoints — only soft: `archived`, `cancel`, `superseded`
- `progress_events` has no `updated_at` and no DELETE endpoint (append-only per constitution §5)
- `decisions` with financial/legal keywords + `pending` type → auto-set `escalation_note` (§4)
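The auto-escalation rule in the last bullet can be sketched roughly as follows. This is only an illustration of the idea: the keyword list, the note wording, and the function name `escalation_note_for` are all assumptions, and the real rule in the State Hub code may differ.

```python
from typing import Optional

# HYPOTHETICAL keyword list; the actual financial/legal keywords are defined
# in the State Hub codebase and are not given in this README.
ESCALATION_KEYWORDS = ("contract", "invoice", "legal", "liability", "payment")


def escalation_note_for(decision_type: str, text: str) -> Optional[str]:
    """Return an escalation note for pending decisions that mention a keyword.

    Decisions of type `made`, and pending decisions without a keyword hit,
    get no note (None).
    """
    if decision_type != "pending":
        return None
    lowered = text.lower()
    hits = [kw for kw in ESCALATION_KEYWORDS if kw in lowered]
    if not hits:
        return None
    return "Auto-escalated: mentions " + ", ".join(hits)
```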
---
## API
Interactive docs at http://127.0.0.1:8000/docs once the API is running.
### Key endpoint: `/state/summary`
Returns a full snapshot in one call — used by both the MCP server and dashboard:
```json
{
  "generated_at": "...",
  "totals": {
    "topics": { "active": 6, "paused": 0, "archived": 0, "total": 6 },
    "workstreams": { "ready": 1, "active": 1, "blocked": 0, "finished": 1, "total": 3 },
    "tasks": { "wait": 0, "todo": 9, "progress": 0, "done": 11, "cancel": 0, "total": 20 },
    "decisions": { "open": 1, "resolved": 0, "escalated": 0, "total": 1 }
  },
  "topics": [...],              // topics with nested workstream stubs
  "blocking_decisions": [...],  // pending decisions only
  "waiting_tasks": [...],
  "recent_progress": [...],     // last 20 events
  "open_workstreams": [...]
}
```
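A minimal sketch of consuming this snapshot from Python, using only the standard library. It relies on the `totals` shape shown above; `fetch_summary` and `format_totals` are hypothetical helper names, and `fetch_summary` needs the API to be running on `:8000`.

```python
import json
from urllib.request import urlopen

API_BASE = "http://127.0.0.1:8000"


def fetch_summary(base_url: str = API_BASE) -> dict:
    """GET /state/summary and decode the JSON body (the API must be running)."""
    with urlopen(f"{base_url}/state/summary", timeout=5) as response:
        return json.load(response)


def format_totals(summary: dict) -> list[str]:
    """Render one line per entity, listing only the non-zero status counts."""
    lines = []
    for entity, counts in summary["totals"].items():
        nonzero = [f"{k}={v}" for k, v in counts.items() if k != "total" and v]
        detail = f" ({', '.join(nonzero)})" if nonzero else ""
        lines.append(f"{entity}: {counts['total']} total{detail}")
    return lines
```

With the API up, `print("\n".join(format_totals(fetch_summary())))` would give a compact text view comparable in spirit to `custodian status`.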
### Router summary
| Prefix | Operations |
|--------|-----------|
| `/topics` | CRUD (soft-delete: `archived`) |
| `/workplans` | Preferred CRUD surface for repo-backed workplans (soft-delete: `archived`) |
| `/workstreams` | Legacy compatibility CRUD surface; usage is recorded by legacy-meter |
| `/tasks` | CRUD (soft-delete: `cancel`); `PATCH` updates status |
| `/decisions` | CRUD (soft-delete: `superseded`); auto-escalation |
| `/progress` | `GET` list + `POST` append — no DELETE |
| `/legacy-meter` | Register, meter, and review legacy interface usage |
| `/state/summary` | Full snapshot |
| `/state/health` | DB connectivity check |
See `docs/workplan-terminology-transition.md` for the workstream-to-workplan
compatibility policy and retirement criteria.
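As an illustration of the soft-delete convention in the router table, the sketch below builds a `PATCH` request that moves a task to `cancel` instead of deleting it. The `/tasks/{task_id}` path and the `{"status": ...}` body are assumptions made for this example; the interactive docs at http://127.0.0.1:8000/docs are the authority on the real schema.

```python
import json
from enum import Enum
from urllib.request import Request

API_BASE = "http://127.0.0.1:8000"


class TaskStatus(str, Enum):
    """Mirror of the `task_status` enum listed in this README."""
    WAIT = "wait"
    TODO = "todo"
    PROGRESS = "progress"
    DONE = "done"
    CANCEL = "cancel"


def build_task_status_patch(task_id: str, status: str) -> Request:
    """Build (but do not send) a PATCH request that sets a task's status.

    ASSUMPTION: the URL path and JSON body shape are illustrative only.
    """
    validated = TaskStatus(status)  # raises ValueError on unknown values
    body = json.dumps({"status": validated.value}).encode("utf-8")
    return Request(
        url=f"{API_BASE}/tasks/{task_id}",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )
```

The request can be sent with `urllib.request.urlopen(build_task_status_patch(task_id, "cancel"))` once the API is running. Validating against the enum on the client side rejects typos such as `"cancelled"` before any network call is made.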
---
## MCP Server
Runs as a persistent SSE service on `:8001`, independent of the Claude Code session.
Restart it anytime without restarting Claude Code.
```bash
make mcp-http # start (or restart) the MCP SSE server on :8001
```
Registered at user scope in `~/.claude.json`:
```json
{ "type": "sse", "url": "http://127.0.0.1:8001/sse" }
```
To re-register from scratch:
```bash
claude mcp remove state-hub -s user 2>/dev/null || true
claude mcp add-json -s user state-hub '{"type":"sse","url":"http://127.0.0.1:8001/sse"}'
```
See `mcp_server/TOOLS.md` for the full tool reference card (30 lines, faster than reading `server.py`).
### Tools at a glance
**Query** (read-only): `get_state_summary` · `get_topic` · `list_blocked_tasks` · `list_pending_decisions` · `get_recent_progress`
**Mutate** (each auto-emits a progress event): `create_task` · `update_task_status` · `record_decision` · `resolve_decision` · `add_progress_event` · `update_workstream_status`
**Resources**: `state://summary` · `state://topics` · `state://workstreams/{topic_slug}` · `state://decisions/blocking` · `state://tasks/blocked`
---
## `custodian` CLI
Installed into `.venv/bin/custodian` by `uv sync`; symlinked to `~/.local/bin` by `make install-cli`.
```
custodian register-project [--domain DOMAIN] [--path PATH]
```
- `--path` defaults to current working directory
- `--domain` is auto-detected from `project_charter_v*.md` frontmatter if omitted
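The domain auto-detection could look roughly like the sketch below. It assumes the charter's frontmatter is a leading block delimited by `---` lines that contains a plain `domain: <value>` entry; that layout, and the function names, are assumptions rather than a description of the actual CLI code.

```python
from pathlib import Path
from typing import Optional


def domain_from_frontmatter(text: str) -> Optional[str]:
    """Extract `domain` from a leading `---` delimited frontmatter block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter block at all
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter: stop scanning
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "domain":
            return value.strip().strip("'\"") or None
    return None


def detect_domain(project_path: Path) -> Optional[str]:
    """Check project_charter_v*.md files, lexicographically last name first."""
    for charter in sorted(project_path.glob("project_charter_v*.md"), reverse=True):
        domain = domain_from_frontmatter(charter.read_text(encoding="utf-8"))
        if domain:
            return domain
    return None
```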
```
custodian status
```
Prints API health, totals, and any blocking decisions.
### What `register-project` does
1. Verifies the API is reachable (fails fast with a hint to run `make api`)
2. Looks up the topic ID for the domain via `/topics/?status=active`
3. Checks that `state-hub` is in `~/.claude.json`
4. Writes `$PROJECT_PATH/CLAUDE.md` from `scripts/project_claude_md.template`
5. Posts a `milestone` progress event recording the registration
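Step 4 can be sketched as a plain placeholder substitution. The three placeholder names come from the template description in this README; the function name `render_claude_md` is hypothetical, and the real script may render the template differently.

```python
def render_claude_md(template: str, project_name: str, domain: str, topic_id: str) -> str:
    """Fill the {PROJECT_NAME}, {DOMAIN} and {TOPIC_ID} placeholders.

    Uses str.replace rather than str.format so that any other braces in the
    template (for example JSON snippets) are left untouched.
    """
    replacements = {
        "{PROJECT_NAME}": project_name,
        "{DOMAIN}": domain,
        "{TOPIC_ID}": topic_id,
    }
    rendered = template
    for placeholder, value in replacements.items():
        rendered = rendered.replace(placeholder, value)
    return rendered
```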
---
## Project Registration Scripts
| Script | Purpose |
|--------|---------|
| `scripts/register_project.sh` | Shell version of `custodian register-project` |
| `scripts/patch_mcp_cwd.py` | Legacy: patched `cwd` for the old stdio registration (no longer needed) |
| `scripts/project_claude_md.template` | CLAUDE.md template with `{PROJECT_NAME}`, `{DOMAIN}`, `{TOPIC_ID}` |
| `scripts/seed.py` | Insert the 6 canonical topics into a fresh database |
| `scripts/pull_image.py` | WSL2 workaround: pull Docker images via Python urllib with Range-request chunking |
---
## Dashboard
The dashboard has four pages, available at http://127.0.0.1:3000 in development or as a static build via `npm run build`:
| Page | Content |
|------|---------|
| **Overview** | Status cards, task-by-status chart, recent activity feed, decisions due within 7 days |
| **Workstreams** | Filterable table by domain/status/owner; selected workstream task list; progress timeline |
| **Decisions** | Pending tab (with escalation highlights) and Made tab; resolution velocity chart |
| **Progress** | Append-only event feed with author badges; 30-day event volume chart |
Data loaders (`src/data/*.json.py`) are Python scripts that call the local API. They run at dev-server start and on `npm run build`. Clear the cache if data appears stale:
```bash
rm -rf dashboard/src/.observablehq/cache/
```
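For orientation, a data loader of the kind described above could look like the following sketch. Observable Framework data loaders write their output to stdout, so the loader only has to fetch from the local API and dump JSON. The helper names and the injectable `fetch` parameter are assumptions added here to keep the example testable without a running API.

```python
import json
import sys
from typing import Callable, TextIO
from urllib.request import urlopen

API_BASE = "http://127.0.0.1:8000"


def fetch_json(path: str) -> object:
    """GET a path on the local State Hub API and decode the JSON body."""
    with urlopen(f"{API_BASE}{path}", timeout=10) as response:
        return json.load(response)


def emit(path: str, out: TextIO = sys.stdout,
         fetch: Callable[[str], object] = fetch_json) -> None:
    """Write the API response for `path` to `out` as JSON."""
    json.dump(fetch(path), out)


# A hypothetical loader file such as src/data/summary.json.py would end with:
#     emit("/state/summary")
```

Because the loader output is cached by the dev server, a change in the API data only shows up after the cache directory above is cleared or the loader file itself changes.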
---
## Known Issues / WSL2 Notes
- **TLS bad record MAC on large downloads**: WSL2 can corrupt packets on large TCP transfers. Use `scripts/pull_image.py` instead of `docker pull` when pulling images.
- **MCP server is now SSE, not stdio**: Re-registration is `claude mcp add-json -s user state-hub '{"type":"sse","url":"http://127.0.0.1:8001/sse"}'`. The `patch_mcp_cwd.py` script and `.mcp.json` config are legacy artifacts from the old stdio setup.
- **AsyncSession concurrency**: A SQLAlchemy 2.0 `AsyncSession` does not support concurrent operations, so all queries in `/state/summary` run sequentially on a single session.
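
The AsyncSession note can be illustrated without a database. In the sketch below, `FakeSession` is a stand-in written for this example (it is not SQLAlchemy and only mimics the "one operation at a time" restriction); it shows why sequential awaits work while `asyncio.gather` on a shared session fails.

```python
import asyncio


class FakeSession:
    """Stand-in for a session that rejects overlapping execute() calls."""

    def __init__(self) -> None:
        self._busy = False

    async def execute(self, query: str) -> str:
        if self._busy:
            raise RuntimeError("session is already executing a query")
        self._busy = True
        try:
            await asyncio.sleep(0)  # yield control, as real I/O would
            return f"result of {query}"
        finally:
            self._busy = False


async def run_sequentially(session: FakeSession, queries: list[str]) -> list[str]:
    """Await each query in turn; only one is ever in flight."""
    results = []
    for query in queries:
        results.append(await session.execute(query))
    return results


async def run_concurrently(session: FakeSession, queries: list[str]) -> list[str]:
    """Start all queries at once on the same session; this is the unsafe pattern."""
    return list(await asyncio.gather(*(session.execute(q) for q in queries)))
```

Running `asyncio.run(run_concurrently(FakeSession(), ["a", "b"]))` raises `RuntimeError`, while the sequential variant returns both results. Concurrency across queries would need one session per task.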