the-custodian/state-hub/dashboard/src/docs/state-hub.md
tegwick 1e0ae37c89 docs: add State Hub reference page and restructure reference index
New page (docs/state-hub.md) covers:
- Why: the invisible state problem across repos and agents
- What: Derived Data Store, Read Model, Agent Orchestration Layer,
  Cross-Repo Observatory — and what it is NOT
- Derived Data Store principle (ADR-003): fingerprint cache, rebuild
  guarantee, force-refresh
- Repository Orchestrator: session protocol, cross-domain coordination
  via messages + capability routing, Kaizen agents
- Architecture diagram (ASCII), technology choices, data model overview
- Running the hub, design principles, related docs

reference.md: add Architecture & Design section grouping state-hub,
TPSC, GDPR maturity, SCOPE.md, capabilities, and goals docs.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 02:01:58 +01:00


State Hub — Reference

The Custodian State Hub

The State Hub is the central nervous system of the Custodian ecosystem — a local-first service that collects, organises, and surfaces the state of all registered repositories so that humans and agents can orient themselves in seconds rather than minutes.


Why it exists

Software projects accumulate invisible state. Workstreams stall, decisions go unresolved, dependency licences drift, integration gaps widen — and none of this is visible without opening every file in every repository. For a single repo with a single engineer this is manageable. Across six domains, fifteen repositories, and a growing fleet of autonomous agents it becomes untenable.

The State Hub exists to answer one question instantly:

"What is the actual state of the system right now, and what should happen next?"

It does this without requiring any project to change how it works — repositories remain the authority for their own data and the hub is a silent observer that indexes and reflects their state.


What it is — and what it is not

What it is

| Role | Description |
| --- | --- |
| Derived Data Store | All hub data is computed from repo files and records. The hub holds no original information, so it can be wiped and rebuilt from scratch at any time without data loss. |
| Read Model | Provides fast, pre-computed answers to common orientation queries: active workstreams, blocking decisions, DoI compliance tiers, SBOM licence risk, GDPR warnings. |
| Agent Orchestration Layer | Exposes an MCP server (Model Context Protocol) so that Claude Code sessions in any registered repository can orient themselves, record progress, resolve decisions, and coordinate with each other, all through a uniform tool interface. |
| Cross-Repo Observatory | The only place where data from all repositories is visible together. Detects dependencies between workstreams, licence risks that span repos, and integration gaps that no single repo can see about itself. |

What it is not

| Not | Why |
| --- | --- |
| Source of truth | Repos are the authority. The hub reflects; it does not own. |
| A project management tool | Workplans live in repos as Markdown files. The hub indexes them, not the reverse. |
| A deployment or CI/CD system | The hub tracks intent and state, not execution pipelines. |
| A multi-user SaaS | Local-first, single-operator. Designed for sovereignty, not scale. |

The Derived Data Store principle

The hub is a concrete implementation of what Martin Kleppmann calls a Derived Data Store (Designing Data-Intensive Applications, Ch. 3 & 11): a system whose entire dataset is fully derivable from upstream sources and which therefore makes no authority claims of its own.

This is formalised in ADR-003 (Materialized Derived State with Fingerprint Invalidation). The practical consequence:

  1. Repositories are the primary artefacts. SCOPE.md, workplan files, uv.lock, tpsc.yaml, CLAUDE.md — these are canonical. The hub reads them; it never writes them.
  2. The hub is a cache. Every table that holds repo-derived data carries a fingerprint column. When the fingerprint of the current source state matches the stored fingerprint, the cached value is returned instantly. When it differs, the value is recomputed and the cache is updated.
  3. The rebuild guarantee. Any hub database can be dropped and recreated from scratch by re-running migrations and re-ingesting all registered repos. No information is lost because no information originates here.
  4. Force-refresh on demand. Every derived endpoint accepts ?force_refresh=true to bypass the cache — useful for debugging, post-ingest verification, and validating fingerprint coverage.
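
The four points above can be sketched in a few lines of Python. This is a minimal illustration, not the hub's implementation: the `fingerprint` scheme (SHA-256 over file names and contents), the `DerivedCache` class, and its `recomputations` counter are all assumptions made for the example, and an in-memory dict stands in for a database table with a fingerprint column.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Sequence, Tuple


def fingerprint(sources: Sequence[Path]) -> str:
    """Hash the names and contents of the source files (hypothetical scheme)."""
    digest = hashlib.sha256()
    for path in sorted(sources):
        digest.update(path.name.encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


class DerivedCache:
    """In-memory stand-in for a derived table that carries a fingerprint column."""

    def __init__(self) -> None:
        # key -> (stored fingerprint, cached derived value)
        self._rows: Dict[str, Tuple[str, object]] = {}
        self.recomputations = 0  # only here so the example can be observed

    def get(
        self,
        key: str,
        sources: Sequence[Path],
        compute: Callable[[Sequence[Path]], object],
        force_refresh: bool = False,
    ) -> object:
        current = fingerprint(sources)
        row = self._rows.get(key)
        # Cache hit: the stored fingerprint matches the current source state.
        if row is not None and row[0] == current and not force_refresh:
            return row[1]
        # Cache miss or forced refresh: recompute from the sources and store.
        value = compute(sources)
        self.recomputations += 1
        self._rows[key] = (current, value)
        return value
```

Because the cache never holds anything that `compute` cannot reproduce from the source files, discarding `_rows` loses nothing, which is the rebuild guarantee in miniature.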

What gets derived and how

| Source | Derived data | Mechanism |
| --- | --- | --- |
| uv.lock, package-lock.json, etc. | SBOM entries + licence risk | make ingest-sbom REPO= |
| tpsc.yaml | Third-party service declarations + GDPR warnings | make ingest-tpsc REPO= |
| SCOPE.md capability blocks | Capability catalog | make ingest-capabilities REPO= |
| workplans/*.md | Workstream + task status | make fix-consistency REPO= |
| Repo files + DB records | DoI compliance tier | Fingerprint cache, auto-refreshed on read |

The Repository Orchestrator

Beyond indexing, the hub acts as an orchestrator for the repository ecosystem — the coordinator that no individual repo can be.

Agent session protocol

Every Claude Code session in a registered repository follows the same ritual:

  1. Orient — call get_domain_summary("<slug>") to load active workstreams, pending tasks, blocking decisions, and suggested next steps for this domain.
  2. Check inbox — call get_messages(to_agent="<name>", unread_only=True) to receive coordination messages from other agents or from prior sessions.
  3. Work — use MCP tools to record progress, resolve decisions, update task status, and send messages to other agents.
  4. Close — call add_progress_event() before ending the session so the next session can see what happened.

This protocol means that any agent — human-directed or autonomous — arrives in a fresh session with full context about the current state of its domain in under a second.
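
The orient, inbox, work, close sequence can be sketched as follows. The tool names `get_domain_summary`, `get_messages`, and `add_progress_event` come from the protocol above; everything else is an assumption for illustration: `StubHub` is an in-memory stand-in rather than the real MCP server, the arguments to `add_progress_event` are hypothetical, and marking messages as read on retrieval is a guess at the behaviour.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StubHub:
    """In-memory stand-in for the hub's MCP tools (not the real server)."""

    summaries: Dict[str, dict] = field(default_factory=dict)
    inbox: List[dict] = field(default_factory=list)
    events: List[dict] = field(default_factory=list)

    def get_domain_summary(self, slug: str) -> dict:
        return self.summaries.get(slug, {"workstreams": [], "next_steps": []})

    def get_messages(self, to_agent: str, unread_only: bool = True) -> List[dict]:
        matches = [
            m for m in self.inbox
            if m["to_agent"] == to_agent and not (unread_only and m["read"])
        ]
        for message in matches:
            message["read"] = True  # assumption: reading marks a message as read
        return matches

    def add_progress_event(self, domain: str, summary: str) -> None:
        # Hypothetical parameters; the source only names the tool.
        self.events.append({"domain": domain, "summary": summary})


def run_session(
    hub: StubHub,
    slug: str,
    agent: str,
    work: Callable[[dict, List[dict]], str],
) -> str:
    summary = hub.get_domain_summary(slug)                          # 1. Orient
    messages = hub.get_messages(to_agent=agent, unread_only=True)   # 2. Check inbox
    outcome = "session ended without a result"
    try:
        outcome = work(summary, messages)                           # 3. Work
        return outcome
    finally:
        # 4. Close: always leave a trace for the next session, even on failure.
        hub.add_progress_event(domain=slug, summary=outcome)
```

Putting the close step in a `finally` block is a design choice of this sketch: a session that crashes still leaves a progress event for the next session to find.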

Cross-domain coordination

Because all agents share the same hub, they can coordinate without direct communication:

  • An agent working on railiance can send a message to custodian via send_message(). The custodian agent reads it in its next session inbox.
  • A capability request (request_capability()) routes to the domain that advertises the relevant capability in its SCOPE.md. The fulfilling agent accepts it, does the work, and marks it complete — unblocking the requesting workstream automatically.
  • Workstream dependencies (create_dependency()) let the hub surface "what is blocking what" across repos that have never directly communicated.
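
Capability routing, as described in the bullets above, amounts to a lookup from a capability name to the domain that advertises it. The sketch below assumes a catalog shaped as a mapping from domain slug to a set of capability names; the `CapabilityRouter` class, the status values, and the example capability and workstream names are hypothetical and are not taken from the hub's schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class CapabilityRequest:
    capability: str
    requested_by: str            # slug of the requesting domain
    blocked_workstream: str      # workstream waiting on this request
    assigned_to: Optional[str] = None
    status: str = "open"         # hypothetical lifecycle: open -> accepted -> complete


class CapabilityRouter:
    def __init__(self, catalog: Dict[str, Set[str]]) -> None:
        # catalog: domain slug -> capabilities advertised in that domain's SCOPE.md
        self.catalog = catalog
        self.requests: List[CapabilityRequest] = []

    def request_capability(
        self, capability: str, requested_by: str, blocked_workstream: str
    ) -> CapabilityRequest:
        # Route to the first (alphabetically) other domain advertising the capability.
        providers = sorted(
            domain for domain, caps in self.catalog.items()
            if capability in caps and domain != requested_by
        )
        request = CapabilityRequest(
            capability, requested_by, blocked_workstream,
            assigned_to=providers[0] if providers else None,
        )
        self.requests.append(request)
        return request

    def accept(self, request: CapabilityRequest) -> None:
        request.status = "accepted"

    def complete(self, request: CapabilityRequest) -> None:
        request.status = "complete"

    def blocked_workstreams(self) -> List[str]:
        # A workstream stays blocked until its request is marked complete.
        return [r.blocked_workstream for r in self.requests if r.status != "complete"]
```

For example, a request from railiance for a capability that only custodian advertises is assigned to custodian, and the requesting workstream disappears from `blocked_workstreams()` once `complete()` is called.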

Kaizen agents

The hub hosts a library of specialised agent personas (agents/agent-*.md) that any session can load via get_kaizen_agent("<name>"). These provide expert instruction sets for specific tasks — test-driven development, refactoring, infrastructure review — without baking domain knowledge into every repo's CLAUDE.md.
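
A persona lookup of this kind can be approximated with a small file loader. The sketch assumes only the `agents/agent-*.md` naming pattern stated above; the function `load_agent_persona`, its `agents_dir` parameter, and the name validation are hypothetical and do not describe how `get_kaizen_agent` is actually implemented.

```python
from pathlib import Path


def load_agent_persona(name: str, agents_dir: Path) -> str:
    """Return the Markdown instructions stored in agents_dir/agent-<name>.md."""
    # Reject anything that could escape the agents directory (e.g. "../secrets").
    if not name or not all(c.isalnum() or c in "-_" for c in name):
        raise ValueError(f"invalid agent name: {name!r}")
    path = agents_dir / f"agent-{name}.md"
    if not path.is_file():
        available = sorted(
            p.stem.removeprefix("agent-") for p in agents_dir.glob("agent-*.md")
        )
        raise FileNotFoundError(f"no agent {name!r}; available: {available}")
    return path.read_text(encoding="utf-8")
```

Keeping personas as flat files means they stay editable and reviewable like any other repository content, and listing the available names in the error helps a session recover from a typo.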


Architecture

┌─────────────────────────────────────────────────────────────┐
│                        Repositories                          │
│  workplans/*.md  SCOPE.md  CLAUDE.md  uv.lock  tpsc.yaml   │
└─────────┬───────────────────────────────────────────────────┘
          │  ingest scripts (make ingest-sbom / fix-consistency / …)
          ▼
┌─────────────────────────────────────────────────────────────┐
│                      PostgreSQL Database                      │
│  domains  managed_repos  workstreams  tasks  decisions       │
│  sbom_entries  tpsc_entries  doi_cache  capability_catalog   │
│  progress_events  agent_messages  repo_goals  …              │
└─────────┬───────────────────────────────────────────────────┘
          │
     ┌────┴──────────────┐
     │                   │
     ▼                   ▼
┌─────────────┐   ┌─────────────────────────────────────────┐
│  FastAPI    │   │           MCP Server (SSE :8001)          │
│  REST API   │   │  40+ tools — orient, record, coordinate   │
│  (:8000)    │   │  Registered at user scope in ~/.claude   │
└─────────────┘   └─────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────────────────┐
│              Observable Framework Dashboard (:3000)          │
│  Overview  Repositories  Workstreams  Decisions  SBOM        │
│  TPSC  Contributions  Goals  Capabilities  …                 │
└─────────────────────────────────────────────────────────────┘

Technology choices

| Component | Technology | Reason |
| --- | --- | --- |
| Database | PostgreSQL (Docker) | Relational integrity, JSON columns, full async support |
| API | FastAPI + SQLAlchemy (async) | Type-safe, auto-documented, fast |
| MCP server | FastMCP over SSE | Persistent connection; no Claude Code restart on tool update |
| Dashboard | Observable Framework | Reactive, static-deployable, no JS build step for data loaders |
| Agent interface | Model Context Protocol | Standard Claude Code integration; tools available in all sessions |

Data model overview

The hub's schema is organised in concentric layers:

Governance (slow-changing): domains · managed_repos · repo_goals · domain_goals

Work tracking (active): workstreams · tasks · workstream_dependencies

Decision log (append-only): decisions · progress_events · agent_messages

Derived snapshots (cached, rebuildable): sbom_snapshots · sbom_entries · tpsc_snapshots · tpsc_entries · capability_catalog · capability_requests · doi_cache

Governance policies (editable): policies/ (flat Markdown files, served via /policy/<name>)


Running the hub

cd ~/the-custodian/state-hub

make db          # Start PostgreSQL (Docker)
make migrate     # Alembic upgrade head
make seed        # Insert 6 canonical domain topics
make api         # FastAPI on :8000
make mcp-http    # MCP SSE server on :8001
make dashboard   # Observable preview on :3000

Verify the MCP server is registered:

python3 -c "import json,os; d=json.load(open(os.path.expanduser('~/.claude.json'))); print(list(d.get('mcpServers',{}).keys()))"
# → ['state-hub']

Re-register if needed:

claude mcp add-json -s user state-hub '{"type":"sse","url":"http://127.0.0.1:8001/sse"}'

Design principles

These principles are not aspirational — they are constraints that every hub feature must satisfy before being accepted.

Local-first, no vendor lock-in. The hub runs on a developer laptop. It requires only Docker (for Postgres) and Python. It works offline. No cloud dependency, no SaaS subscription.

Sovereignty by default. No data leaves the machine unless explicitly exported. The hub never calls external APIs on its own.

The rebuild guarantee (ADR-001, ADR-003). After the database is dropped, running make migrate && make seed and then re-ingesting all repos must restore the full operational state. Any feature that breaks this guarantee is rejected.

Agents are co-creators, not authorities. The hub coordinates agent work but does not permit agents to make irreversible decisions unilaterally. Financial, legal, and external-publication actions are hard-blocked (see canon/constitution/).

Append-only episodic memory. Progress events are never deleted or edited. Decisions are resolved, not overwritten. This ensures that any session can reconstruct what happened and why, even years later.
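
The append-only principle can be illustrated with a small event log in which a decision is resolved by appending a new event, never by editing the original one. This is a sketch under assumptions: the `Event` fields, the event kinds `decision_opened` and `decision_resolved`, and the `EpisodicLog` class are hypothetical and do not mirror the hub's `progress_events` or `decisions` tables.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a recorded event cannot be edited afterwards
class Event:
    seq: int
    kind: str       # hypothetical kinds: "decision_opened", "decision_resolved"
    subject: str    # identifier of the decision or workstream concerned
    detail: str
    at: datetime


class EpisodicLog:
    """Append-only log: it offers append and read operations, nothing else."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, kind: str, subject: str, detail: str) -> Event:
        event = Event(
            seq=len(self._events) + 1,
            kind=kind,
            subject=subject,
            detail=detail,
            at=datetime.now(timezone.utc),
        )
        self._events.append(event)
        return event

    def history(self, subject: str) -> Tuple[Event, ...]:
        # Full trail for one subject, oldest first.
        return tuple(e for e in self._events if e.subject == subject)

    def resolution(self, subject: str) -> Optional[str]:
        # The current answer is the most recent resolution; earlier events remain.
        for event in reversed(self._events):
            if event.subject == subject and event.kind == "decision_resolved":
                return event.detail
        return None
```

Reading the current state means replaying or scanning the log, which costs a little more than an in-place update but keeps the full trail of what happened and why.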