docs: add State Hub reference page and restructure reference index

New page (docs/state-hub.md) covers:
- Why: the invisible state problem across repos and agents
- What: Derived Data Store, Read Model, Agent Orchestration Layer,
  Cross-Repo Observatory — and what it is NOT
- Derived Data Store principle (ADR-003): fingerprint cache, rebuild
  guarantee, force-refresh
- Repository Orchestrator: session protocol, cross-domain coordination
  via messages + capability routing, Kaizen agents
- Architecture diagram (ASCII), technology choices, data model overview
- Running the hub, design principles, related docs

reference.md: add Architecture & Design section grouping state-hub,
TPSC, GDPR maturity, SCOPE.md, capabilities, and goals docs.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 02:01:58 +01:00
parent 1ee0343f75
commit 101c953e69
3 changed files with 283 additions and 0 deletions


@@ -0,0 +1,269 @@
---
title: State Hub — Reference
---
# The Custodian State Hub
The State Hub is the central nervous system of the Custodian ecosystem — a
local-first service that collects, organises, and surfaces the state of all
registered repositories so that humans and agents can orient themselves in
seconds rather than minutes.
---
## Why it exists
Software projects accumulate invisible state. Workstreams stall, decisions go
unresolved, dependency licences drift, integration gaps widen — and none of
this is visible without opening every file in every repository. For a single
repo with a single engineer this is manageable. Across six domains, fifteen
repositories, and a growing fleet of autonomous agents it becomes untenable.
The State Hub exists to answer one question instantly:
> **"What is the actual state of the system right now, and what should happen next?"**
It does this without requiring any project to change how it works — repositories
remain the authority for their own data and the hub is a silent observer that
indexes and reflects their state.
---
## What it is — and what it is not
### What it is
| Role | Description |
|---|---|
| **Derived Data Store** | All hub data is computed from repo files and records. The hub holds no original information — it can be wiped and rebuilt from scratch at any time without data loss. |
| **Read Model** | Provides fast, pre-computed answers to common orientation queries: active workstreams, blocking decisions, DoI (Definition of Integrated) compliance tiers, SBOM licence risk, GDPR warnings. |
| **Agent Orchestration Layer** | Exposes an MCP server (Model Context Protocol) so that Claude Code sessions in any registered repository can orient themselves, record progress, resolve decisions, and coordinate with each other — all through a uniform tool interface. |
| **Cross-Repo Observatory** | The only place where data from all repositories is visible together. Detects dependencies between workstreams, licence risks that span repos, and integration gaps that no single repo can see about itself. |
### What it is not
| Not | Why |
|---|---|
| **Source of truth** | Repos are the authority. The hub reflects; it does not own. |
| **A project management tool** | Workplans live in repos as Markdown files. The hub indexes them, not the reverse. |
| **A deployment or CI/CD system** | The hub tracks *intent and state*, not execution pipelines. |
| **A multi-user SaaS** | Local-first, single-operator. Designed for sovereignty, not scale. |
---
## The Derived Data Store principle
The hub is a concrete implementation of what Martin Kleppmann calls a
**Derived Data Store** (*Designing Data-Intensive Applications*, Ch. 3 & 11):
a system whose entire dataset is fully derivable from upstream sources and
which therefore makes no authority claims of its own.
This is formalised in **ADR-003** (Materialized Derived State with Fingerprint
Invalidation). The practical consequence:
1. **Repositories are the primary artefacts.** SCOPE.md, workplan files,
`uv.lock`, `tpsc.yaml`, `CLAUDE.md` — these are canonical. The hub reads
them; it never writes them.
2. **The hub is a cache.** Every table that holds repo-derived data carries a
`fingerprint` column. When the fingerprint of the current source state
matches the stored fingerprint, the cached value is returned instantly.
When it differs, the value is recomputed and the cache is updated.
3. **The rebuild guarantee.** Any hub database can be dropped and recreated
from scratch by re-running migrations and re-ingesting all registered repos.
No information is lost because no information originates here.
4. **Force-refresh on demand.** Every derived endpoint accepts
`?force_refresh=true` to bypass the cache — useful for debugging, post-ingest
verification, and validating fingerprint coverage.
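
Points 2 and 4 can be sketched in a few lines of Python. This is a minimal illustration, not the hub's implementation: `fingerprint`, `DerivedCache`, and the in-memory dictionary standing in for a table with a `fingerprint` column are hypothetical names, and SHA-256 over file contents is an assumed choice of fingerprint.

```python
import hashlib
from typing import Callable, TypeVar

T = TypeVar("T")
Sources = dict[str, bytes]  # relative path -> file contents (assumed shape)


def fingerprint(sources: Sources) -> str:
    """Hash paths and contents in a stable order so equal inputs hash equally."""
    digest = hashlib.sha256()
    for path in sorted(sources):
        content = sources[path]
        digest.update(path.encode("utf-8") + b"\0")
        # Length prefix keeps adjacent files from blending into each other.
        digest.update(len(content).to_bytes(8, "big"))
        digest.update(content)
    return digest.hexdigest()


class DerivedCache:
    """In-memory stand-in for a hub table that carries a fingerprint column."""

    def __init__(self) -> None:
        self._rows: dict[str, tuple[str, object]] = {}

    def get(
        self,
        key: str,
        sources: Sources,
        compute: Callable[[Sources], T],
        force_refresh: bool = False,
    ) -> T:
        current = fingerprint(sources)
        row = self._rows.get(key)
        if row is not None and row[0] == current and not force_refresh:
            return row[1]  # fingerprints match: serve the cached value
        value = compute(sources)  # source changed or refresh forced: recompute
        self._rows[key] = (current, value)
        return value
```

Because every cached row is keyed only by what the sources contain, discarding `_rows` loses nothing: the next `get` recomputes the value, which is the rebuild guarantee in miniature.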
### What gets derived and how
| Source | Derived data | Mechanism |
|---|---|---|
| `uv.lock`, `package-lock.json`, etc. | SBOM entries + licence risk | `make ingest-sbom REPO=` |
| `tpsc.yaml` | Third-party service declarations + GDPR warnings | `make ingest-tpsc REPO=` |
| `SCOPE.md` capability blocks | Capability catalog | `make ingest-capabilities REPO=` |
| `workplans/*.md` | Workstream + task status | `make fix-consistency REPO=` |
| Repo files + DB records | DoI compliance tier | Fingerprint cache, auto-refreshed on read |
---
## The Repository Orchestrator
Beyond indexing, the hub acts as an **orchestrator** for the repository
ecosystem — the coordinator that no individual repo can be.
### Agent session protocol
Every Claude Code session in a registered repository follows the same ritual:
1. **Orient** — call `get_domain_summary("<slug>")` to load active workstreams,
pending tasks, blocking decisions, and suggested next steps for this domain.
2. **Check inbox** — call `get_messages(to_agent="<name>", unread_only=True)`
to receive coordination messages from other agents or from prior sessions.
3. **Work** — use MCP tools to record progress, resolve decisions, update task
status, and send messages to other agents.
4. **Close** — call `add_progress_event()` before ending the session so the
next session can see what happened.
This protocol means that any agent — human-directed or autonomous — arrives
in a fresh session with full context about the current state of its domain
in under a second.
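
The four steps can be sketched as a small driver function. The tool names `get_domain_summary`, `get_messages`, and `add_progress_event` come from the protocol above; everything else is an assumption made for illustration: `InMemoryHub` is a hypothetical stand-in for the MCP server, the arguments to `add_progress_event` are guessed, and marking messages as read on fetch is assumed behaviour.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class InMemoryHub:
    """Hypothetical stand-in for the hub's MCP tools; not the real server."""

    summaries: dict[str, dict] = field(default_factory=dict)
    messages: list[dict] = field(default_factory=list)
    events: list[dict] = field(default_factory=list)

    def get_domain_summary(self, slug: str) -> dict:
        return self.summaries.get(slug, {"workstreams": [], "next_steps": []})

    def get_messages(self, to_agent: str, unread_only: bool = True) -> list[dict]:
        inbox = [m for m in self.messages if m["to_agent"] == to_agent]
        if unread_only:
            inbox = [m for m in inbox if not m["read"]]
        for message in inbox:
            message["read"] = True  # assumed: fetching marks messages as read
        return inbox

    def add_progress_event(self, domain: str, summary: str) -> None:
        # Assumed signature; the real tool may take different arguments.
        self.events.append({"domain": domain, "summary": summary})


def run_session(
    hub: InMemoryHub,
    slug: str,
    agent: str,
    work: Callable[[dict, list[dict]], str],
) -> str:
    context = hub.get_domain_summary(slug)                      # 1. Orient
    inbox = hub.get_messages(to_agent=agent, unread_only=True)  # 2. Check inbox
    outcome = work(context, inbox)                              # 3. Work
    hub.add_progress_event(domain=slug, summary=outcome)        # 4. Close
    return outcome
```

The closing event is what the next session's orient step builds on, so the sketch records it unconditionally once the work step returns.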
### Cross-domain coordination
Because all agents share the same hub, they can coordinate without direct
communication:
- An agent working on `railiance` can send a message to `custodian` via
`send_message()`. The custodian agent reads it in its next session inbox.
- A capability request (`request_capability()`) routes to the domain that
advertises the relevant capability in its `SCOPE.md`. The fulfilling agent
accepts it, does the work, and marks it complete — unblocking the requesting
workstream automatically.
- Workstream dependencies (`create_dependency()`) let the hub surface "what
is blocking what" across repos that have never directly communicated.
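
Capability routing, as described in the second bullet, can be sketched as a lookup from capability name to advertising domain. This is an illustrative model only: `CapabilityRouter`, `CapabilityRequest`, the status strings, and the example capability name `sbom-scan` are hypothetical, and the catalog is assumed to have been ingested from each domain's `SCOPE.md` already.

```python
from dataclasses import dataclass


@dataclass
class CapabilityRequest:
    capability: str
    requested_by: str   # the workstream waiting on the capability
    fulfilled_by: str   # the domain chosen by routing
    status: str = "open"


class CapabilityRouter:
    """Routes requests to whichever domain advertises the capability."""

    def __init__(self, catalog: dict[str, str]) -> None:
        self.catalog = catalog  # capability name -> advertising domain
        self.requests: list[CapabilityRequest] = []

    def request_capability(self, capability: str, requested_by: str) -> CapabilityRequest:
        domain = self.catalog.get(capability)
        if domain is None:
            raise LookupError(f"no domain advertises {capability!r}")
        request = CapabilityRequest(capability, requested_by, domain)
        self.requests.append(request)
        return request

    def complete(self, request: CapabilityRequest) -> None:
        request.status = "completed"

    def blocked_workstreams(self) -> set[str]:
        """Workstreams that still wait on at least one open request."""
        return {r.requested_by for r in self.requests if r.status != "completed"}
```

In this model, unblocking is not a separate action: a workstream stops being blocked as soon as no open request names it.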
### Kaizen agents
The hub hosts a library of specialised agent personas (`agents/agent-*.md`)
that any session can load via `get_kaizen_agent("<name>")`. These provide
expert instruction sets for specific tasks — test-driven development,
refactoring, infrastructure review — without baking domain knowledge into
every repo's CLAUDE.md.
---
## Architecture
```
┌─────────────────────────────────────────────────────────────┐
│                        Repositories                         │
│  workplans/*.md  SCOPE.md  CLAUDE.md  uv.lock  tpsc.yaml    │
└─────────┬───────────────────────────────────────────────────┘
          │ ingest scripts (make ingest-sbom / fix-consistency / …)
┌─────────────────────────────────────────────────────────────┐
│                     PostgreSQL Database                     │
│  domains  managed_repos  workstreams  tasks  decisions      │
│  sbom_entries  tpsc_entries  doi_cache  capability_catalog  │
│  progress_events  agent_messages  repo_goals  …             │
└─────────┬───────────────────────────────────────────────────┘
     ┌────┴──────────────┐
     │                   │
     ▼                   ▼
┌─────────────┐     ┌─────────────────────────────────────────┐
│  FastAPI    │     │  MCP Server (SSE :8001)                 │
│  REST API   │     │  40+ tools: orient, record, coordinate  │
│  (:8000)    │     │  Registered at user scope in ~/.claude  │
└─────────────┘     └─────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│           Observable Framework Dashboard (:3000)            │
│  Overview  Repositories  Workstreams  Decisions  SBOM       │
│  TPSC  Contributions  Goals  Capabilities  …                │
└─────────────────────────────────────────────────────────────┘
```
### Technology choices
| Component | Technology | Reason |
|---|---|---|
| Database | PostgreSQL (Docker) | Relational integrity, JSON columns, full async support |
| API | FastAPI + SQLAlchemy (async) | Type-safe, auto-documented, fast |
| MCP server | FastMCP over SSE | Persistent connection; no Claude Code restart on tool update |
| Dashboard | Observable Framework | Reactive, static-deployable, no JS build step for data loaders |
| Agent interface | Model Context Protocol | Standard Claude Code integration; tools available in all sessions |
---
## Data model overview
The hub's schema is organised in concentric layers:
**Governance (slow-changing):**
`domains` · `managed_repos` · `repo_goals` · `domain_goals`
**Work tracking (active):**
`workstreams` · `tasks` · `workstream_dependencies`
**Decision log (append-only):**
`decisions` · `progress_events` · `agent_messages`
**Derived snapshots (cached, rebuildable):**
`sbom_snapshots` · `sbom_entries`
`tpsc_snapshots` · `tpsc_entries`
`capability_catalog` · `capability_requests`
`doi_cache`
**Governance policies (editable):**
`policies/` (flat Markdown files, served via `/policy/<name>`)
---
## Running the hub
```bash
cd ~/the-custodian/state-hub
make db # Start PostgreSQL (Docker)
make migrate # Alembic upgrade head
make seed # Insert 6 canonical domain topics
make api # FastAPI on :8000
make mcp-http # MCP SSE server on :8001
make dashboard # Observable preview on :3000
```
Verify the MCP server is registered:
```bash
python3 -c "import json,os; d=json.load(open(os.path.expanduser('~/.claude.json'))); print(list(d.get('mcpServers',{}).keys()))"
# → ['state-hub']
```
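
The same check reads more easily as a short script. It is a sketch under the same assumption as the one-liner, namely that registrations live under the `mcpServers` key of `~/.claude.json`; the helper names `load_config` and `registered_servers` are hypothetical.

```python
import json
from pathlib import Path


def load_config(path: Path) -> dict:
    """Return the parsed JSON config, or an empty dict if the file is absent."""
    if not path.exists():
        return {}
    return json.loads(path.read_text(encoding="utf-8"))


def registered_servers(config: dict) -> list[str]:
    """Names of the MCP servers registered in the given config."""
    return sorted(config.get("mcpServers", {}))


if __name__ == "__main__":
    names = registered_servers(load_config(Path.home() / ".claude.json"))
    status = "registered" if "state-hub" in names else "missing"
    print(f"state-hub {status}; servers found: {names}")
```

Unlike the one-liner, this version reports a missing config file as "missing" instead of raising an exception.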
Re-register if needed:
```bash
claude mcp add-json -s user state-hub '{"type":"sse","url":"http://127.0.0.1:8001/sse"}'
```
---
## Design principles
These principles are not aspirational — they are constraints that every hub
feature must satisfy before being accepted.
**Local-first, no vendor lock-in.**
The hub runs on a developer laptop. It requires only Docker (for Postgres)
and Python. It works offline. No cloud dependency, no SaaS subscription.
**Sovereignty by default.**
No data leaves the machine unless explicitly exported. The hub never calls
external APIs on its own.
**The rebuild guarantee (ADR-001, ADR-003).**
Dropping the database and re-running `make migrate && make seed` followed by
re-ingesting all repos must restore the full operational state. Any feature
that breaks this guarantee is rejected.
**Agents are co-creators, not authorities.**
The hub coordinates agent work but does not permit agents to make irreversible
decisions unilaterally. Financial, legal, and external-publication actions are
hard-blocked (see `canon/constitution/`).
**Append-only episodic memory.**
Progress events are never deleted or edited. Decisions are resolved, not
overwritten. This ensures that any session can reconstruct what happened and
why, even years later.
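
The append-only rule can be illustrated with a log whose entries are immutable and whose current state is derived by reading, never by editing. This is a sketch with hypothetical names (`Event`, `EpisodicLog`, and the `decision_opened` / `decision_resolved` kinds); the hub's actual tables and columns are not shown in this page.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """One immutable log entry; frozen so it cannot be edited after the fact."""

    at: datetime
    kind: str      # hypothetical kinds: "decision_opened", "decision_resolved"
    subject: str   # identifier of the decision or workstream concerned
    detail: str


class EpisodicLog:
    """Exposes append and read operations only; there is no update or delete."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, kind: str, subject: str, detail: str) -> Event:
        event = Event(datetime.now(timezone.utc), kind, subject, detail)
        self._events.append(event)
        return event

    def history(self, subject: str) -> tuple[Event, ...]:
        """Everything that ever happened to a subject, in order."""
        return tuple(e for e in self._events if e.subject == subject)

    def open_decisions(self) -> list[str]:
        """Decisions with an 'opened' entry and no later 'resolved' entry."""
        opened = {e.subject for e in self._events if e.kind == "decision_opened"}
        resolved = {e.subject for e in self._events if e.kind == "decision_resolved"}
        return sorted(opened - resolved)
```

Resolving a decision adds a second entry next to the first, so both the question and its answer remain readable later.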
---
## Related
- [ADR-001](/docs/repo-integration) — Workplans as Repository Artefacts
- [ADR-003](https://github.com) — Materialized Derived State (see `canon/architecture/adr-003-materialized-derived-state.md`)
- [Connecting to the Hub](/docs/connecting)
- [Repo Integration](/docs/repo-integration)
- [Repository DoI](/policy/repo-doi) — Definition of Integrated
- [TPSC](/docs/tpsc) — Third-Party Services Catalog
- [SBOM](/docs/sbom)