the-custodian/workplans/CUST-WP-0025-fos-hub-bootstrap.md

id: CUST-WP-0025
type: workplan
title: FOS Hub Bootstrap — Identity, Hub Extraction, Ops Hub, Fin Hub
domain: custodian
repo: the-custodian
status: active
owner: custodian
topic_slug: custodian
created: 2026-03-20
updated: 2026-06-06
state_hub_workstream_id: 293a74fe-a85a-4ad6-8933-23d52a72fe8b

FOS Hub Bootstrap — Identity, Hub Extraction, Ops Hub, Fin Hub

Goal

Progress the Custodian from FOS maturity Level 1 (Single-Hub Emergence) toward Level 3 (Core Federation) by:

  1. Finalizing shared identity infrastructure (NetKingdom SSO)
  2. Extracting a generic reusable hub-core package from state-hub
  3. Renaming state-hub to dev-hub and transitioning all repos
  4. Creating the ops-hub for runtime operations coordination
  5. Building the fin-hub with railiance-as-a-service as first monetization path

Context

The state-hub has matured through 24 completed workplans (62 workstreams, 573 tasks) but remains a monolithic single hub mixing dev-coordination, governance, and generic infrastructure. Per FOS §13.1, this risks becoming the "Mega-Hub" anti-pattern.

Two standards govern the architecture:

  • FOS (Federated Organisation Standard): organizational recursion via domain hubs
  • OAS (Orthogonal Architecture Standard): compute substrate via 6 dimensions

Together they form a complete cybernetic stack: FOS gives the viable organization, OAS gives the viable infrastructure.

Key Decisions

Decision | Choice | Rationale
Hub-core packaging | Separate pip-installable package | Clean separation and independent versioning; each hub depends on it via uv
Phase sequencing | Parallel start (Phase 1 + 2) | Identity and extraction run concurrently; auth is added to the hubs later
Ops Hub location | New standalone repo | FOS separation principle: each hub is independently deployable
First monetization | Railiance-as-a-service | Package the OAS infra stack as a managed or consultancy offering for EU SMEs

Phase 1 — Identity Infrastructure

Goal: Finalize the user identity infrastructure so that all future hubs share one SSO plane.
Repos: net-kingdom, railiance-cluster, railiance-platform
Runs in parallel with Phase 2.

T01 — Complete NK-WP-0001: Keycloak + privacyIDEA on k3s

id: CUST-WP-0025-T01
status: todo
priority: high
state_hub_task_id: "f55078b6-7fa3-49ab-be30-37db622d64c9"

Complete the SSO/MFA platform deployment. Keycloak as OIDC provider with privacyIDEA for MFA, running on the k3s cluster. This is the identity foundation for all hubs and services.

Cross-reference: net-kingdom NK-WP-0001.

T02 — Complete NK-WP-0002: Local identity bootstrap

id: CUST-WP-0025-T02
status: todo
priority: high
state_hub_task_id: "0d7792f7-5695-4e1a-9726-b9661d5e7108"

Implement a lightweight file-based OIDC server for dev, sandbox, and bootstrap scenarios where the full Keycloak cluster is unavailable. This enables local development of hub services without a cluster dependency.

Cross-reference: net-kingdom NK-WP-0002.

T03 — IAM Profile integration test

id: CUST-WP-0025-T03
status: todo
priority: medium
state_hub_task_id: "e9894ac9-add3-45a6-9893-ea67c6e5e260"

Prove that a FastAPI service can authenticate via NetKingdom OIDC end-to-end. Write a minimal test service and an integration test that:

  • Obtains a token via OIDC/PKCE flow
  • Calls a protected endpoint
  • Validates token claims (sub, roles, expiry)

This test becomes the template for hub-core auth middleware.
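
The checks in the list above could be sketched roughly as follows, using only the standard library. `make_pkce_pair` follows the S256 method from RFC 7636. `validate_claims`, the `roles` claim layout, and the `hub-user` role name are assumptions for illustration, since the IAM Profile defines the actual claim names. Signature verification is deliberately left out and would be handled by a JWT library in the real test service.

```python
import base64
import hashlib
import secrets
import time
from typing import List, Optional, Tuple


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    verifier = secrets.token_urlsafe(64)  # 86 characters, within the 43-128 limit
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # The challenge is base64url(SHA-256(verifier)) without '=' padding.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def validate_claims(claims: dict, required_role: str,
                    now: Optional[float] = None) -> List[str]:
    """Return the problems found in already-decoded token claims."""
    now = time.time() if now is None else now
    problems = []
    if not claims.get("sub"):
        problems.append("missing sub")
    # Assumption: roles arrive as a flat list under a "roles" claim.
    if required_role not in claims.get("roles", []):
        problems.append(f"missing role {required_role}")
    exp = claims.get("exp")
    if exp is None or exp <= now:
        problems.append("token expired or exp missing")
    return problems
```

An empty list from `validate_claims` means the token passed all three checks; the integration test can assert on that directly.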

T04 — Canon standard: IAM Profile specification

id: CUST-WP-0025-T04
status: done
priority: medium
state_hub_task_id: "69acc880-394b-478a-94f0-476c9cbc1bc6"

Document the OIDC contract as canon/standards/iam-profile_v0.1.md:

  • Discovery endpoint structure
  • Required claims and scopes
  • Token lifecycle (access + refresh)
  • Hub-to-hub service account pattern
  • Human override / emergency access
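
As an illustration of the discovery endpoint structure, the sketch below checks a discovery document against the metadata fields that OpenID Connect Discovery 1.0 marks as required. The function names and the example issuer URL are hypothetical; the IAM Profile itself remains the authoritative contract and may require more fields.

```python
from typing import List

# Metadata fields that OpenID Connect Discovery 1.0 lists as REQUIRED.
REQUIRED_METADATA = (
    "issuer",
    "authorization_endpoint",
    "token_endpoint",
    "jwks_uri",
    "response_types_supported",
    "subject_types_supported",
    "id_token_signing_alg_values_supported",
)


def discovery_url(issuer: str) -> str:
    """Build the well-known discovery URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def check_discovery(doc: dict, expected_issuer: str) -> List[str]:
    """Return the problems found in a fetched discovery document."""
    problems = [f"missing {key}" for key in REQUIRED_METADATA if key not in doc]
    # The issuer in the document must match the one the client was configured with.
    if doc.get("issuer") != expected_issuer:
        problems.append("issuer mismatch")
    return problems
```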

Phase 2 — Hub Extraction & Dev Hub Rename

Goal: Extract a generic hub-core package; rename state-hub to dev-hub.
Repos: the-custodian (governance and workplan), /home/worsch/state-hub (authoritative source), /home/worsch/hub-core (new target repo)
Runs in parallel with Phase 1.

Current repo reality (2026-06-06): CUST-WP-0043 completed the State Hub repo extraction, so this repository now keeps only the state-hub/README.md pointer. Phase 2 implementation work must read and refactor the standalone /home/worsch/state-hub checkout, while this workplan keeps the coordination record in the-custodian.

Extraction Boundary

Generic hub-core (~17 MCP tools, ~6 models, ~6 routers):

  • Models: Domain, AgentMessage, CapabilityCatalog, CapabilityRequest, ManagedRepo, TPSC*, ProgressEvent (generic event_types)
  • Routers: domains, repos, messages, capability_requests, tpsc, policy
  • MCP tools: orientation, messaging, capability routing, repo management, TPSC/GDPR, DoI

Dev-hub-specific (~51 MCP tools, ~12 models):

  • Topics, workstreams, tasks, decisions, dependencies, EP/TD, contributions, SBOM, goals, DoI cache, kaizen agents, consistency checker

T05 — Create hub-core package

id: CUST-WP-0025-T05
status: in_progress
priority: high
state_hub_task_id: "04bf480c-8847-4a89-a4f2-e7c5fc51088d"

Create /home/worsch/hub-core as a standalone repo with pyproject.toml (uv-managed). Extract from /home/worsch/state-hub:

  • Generic SQLAlchemy models (Domain, AgentMessage, CapabilityCatalog, CapabilityRequest, ManagedRepo, TPSC*, ProgressEvent)
  • Generic Pydantic schemas
  • Generic FastAPI routers (domains, repos, messages, capability_requests, tpsc, policy)
  • Alembic migration templates for core schema
  • Shared utilities (slug resolution, pagination, trailing-slash normalization)

Implementation start (2026-06-06): reviewed the standalone State Hub source and captured the first extraction boundary in docs/hub-core-extraction-boundary.md. Several candidate "generic" models still reference dev-hub tables (topics, workstreams, tasks, decisions). The initial package slice should therefore start with base DB primitives, domains, repos, messages, and TPSC catalog/snapshots, and add adapter seams for progress and capability requests, rather than copying files blindly.

Implementation slice 2 (2026-06-06): expanded /home/worsch/hub-core with router factory functions for domains, repos, messages, TPSC, and policy lookup. The factories receive the host hub's get_session dependency instead of binding to State Hub globals, preserving the package boundary for future hubs. Verification currently covers package compilation, SQLAlchemy metadata registration, and router factory construction.

Implementation slice 3 (2026-06-06): added hub-core shared utilities for slug normalization, pagination, repo path resolution, and trailing-slash path normalization. Added Alembic migration templates plus an initial core schema migration covering domains, managed repos, agent messages, capability catalog, and TPSC tables. Hub-core verification now covers package compilation, router construction, registered metadata, and utility behavior.
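
The shared utilities named in slice 3 could look roughly like the sketch below. The function names, the slug rules, and the pagination limits are assumptions for illustration; the actual hub-core helpers may differ.

```python
import re
from dataclasses import dataclass


def normalize_slug(value: str) -> str:
    """Lowercase, collapse runs of non-alphanumerics into '-', trim dashes."""
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def normalize_path(path: str) -> str:
    """Drop trailing slashes so '/domains/' and '/domains' match; keep root."""
    stripped = path.rstrip("/")
    return stripped or "/"


@dataclass(frozen=True)
class Page:
    items: list
    total: int
    limit: int
    offset: int


def paginate(items: list, limit: int = 50, offset: int = 0) -> Page:
    """Slice a list into one page, clamping limit and offset to sane bounds."""
    limit = max(1, min(limit, 200))  # assumed upper bound of 200 per page
    offset = max(0, offset)
    return Page(items[offset:offset + limit], len(items), limit, offset)
```

For example, `paginate(list(range(10)), limit=3, offset=8)` yields a page whose `items` are the last two elements and whose `total` is 10.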

T06 — Hub-core FastMCP base server

id: CUST-WP-0025-T06
status: todo
priority: high
state_hub_task_id: "6b49d94a-b1ea-4507-a8a3-e27c1a918491"

Add a base MCP server class to hub-core that provides the ~17 generic tools:

  • Orientation: get_state_summary, get_domain_summary, list_domains
  • Messaging: send_message, get_messages, mark_message_read, reply_to_message
  • Capability routing: register_capability, list_capabilities, request_capability, accept_capability_request, update_capability_request_status, list_capability_requests, get_capability_request
  • Repo management: register_repo, update_repo_path, list_domain_repos
  • TPSC/GDPR: register_service, list_services, ingest_tpsc_tool, get_gdpr_report
  • DoI: check_repo_doi, get_doi_summary

Domain-specific hubs inherit and add their own tools.

T07 — FOS §10 risk and alert tools

id: CUST-WP-0025-T07
status: todo
priority: medium
state_hub_task_id: "5a54af24-f7cb-451f-874f-66bd6979ab07"

Add get_risks() and get_alerts() to hub-core, formalizing existing ProgressEvent patterns. Define canonical event_type values:

  • risk_surfaced, risk_mitigated, risk_escalated
  • alert_raised, alert_acknowledged, alert_resolved

This completes the FOS §10 cross-hub contract.
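
One way to read the canonical event types is as a small state vocabulary per subject: a risk stays open until a `risk_mitigated` event arrives, and an alert stays open until `alert_resolved`. The sketch below works under that assumption. The `Event` dataclass is a stand-in for the real ProgressEvent model, and passing the event list as an argument is a simplification of whatever query the hub-core tools would run.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Set


class EventType(str, Enum):
    RISK_SURFACED = "risk_surfaced"
    RISK_MITIGATED = "risk_mitigated"
    RISK_ESCALATED = "risk_escalated"
    ALERT_RAISED = "alert_raised"
    ALERT_ACKNOWLEDGED = "alert_acknowledged"
    ALERT_RESOLVED = "alert_resolved"


RISK_TYPES = {e.value for e in EventType if e.value.startswith("risk_")}
ALERT_TYPES = {e.value for e in EventType if e.value.startswith("alert_")}


@dataclass
class Event:  # stand-in for ProgressEvent
    event_type: str
    subject: str


def _latest_open(events: Iterable[Event], types: Set[str], closing: str) -> List[Event]:
    """Keep the latest event per subject and drop subjects that were closed."""
    latest = {}
    for ev in events:  # events are assumed to be ordered oldest first
        if ev.event_type in types:
            latest[ev.subject] = ev
    return [ev for ev in latest.values() if ev.event_type != closing]


def get_risks(events: Iterable[Event]) -> List[Event]:
    return _latest_open(events, RISK_TYPES, EventType.RISK_MITIGATED.value)


def get_alerts(events: Iterable[Event]) -> List[Event]:
    return _latest_open(events, ALERT_TYPES, EventType.ALERT_RESOLVED.value)
```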

T08 — Refactor state-hub to import from hub-core

id: CUST-WP-0025-T08
status: todo
priority: high
state_hub_task_id: "daf1d8ac-b55a-4692-b359-2671ddf6fc8a"

Refactor the standalone /home/worsch/state-hub codebase:

  • Replace generic models/routers/schemas with imports from hub-core
  • Keep dev-specific code (topics, workstreams, tasks, decisions, etc.) in state-hub
  • Ensure all existing tests pass with the new import structure
  • Update pyproject.toml to depend on hub-core

T09 — Rename MCP server state-hub to dev-hub

id: CUST-WP-0025-T09
status: todo
priority: high
state_hub_task_id: "2148a804-7d6a-4e26-b1a8-08da24929c88"

Rename across all integration points:

  • /home/worsch/state-hub/mcp_server/server.py: name="state-hub" → "dev-hub"
  • ~/.claude/CLAUDE.md: 3 locations (registration commands, references)
  • /home/worsch/state-hub/scripts/register_project.sh: validation checks
  • /home/worsch/state-hub/scripts/patch_mcp_cwd.py: config checks
  • /home/worsch/state-hub/custodian_cli.py: config checks
  • /home/worsch/state-hub/scripts/project_rules/session-protocol.template: template text
  • /home/worsch/state-hub/api/main.py: service metadata response

T10 — MCP config migration script

id: CUST-WP-0025-T10
status: todo
priority: medium
state_hub_task_id: "5953f129-089d-4d90-bbe5-f86da4eac1bf"

Create /home/worsch/state-hub/scripts/migrate_mcp_config.py that:

  • Reads ~/.claude.json
  • Renames mcpServers["state-hub"] to mcpServers["dev-hub"]
  • Preserves all other settings
  • Backs up original file before writing
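
A minimal sketch of the four steps above, assuming the entry lives in a top-level `mcpServers` mapping. The `.bak` suffix and the function signature are illustrative choices, not part of the workplan, and any per-project server entries elsewhere in the file would need separate handling.

```python
import json
import shutil
from pathlib import Path


def migrate_mcp_config(path: Path, old: str = "state-hub", new: str = "dev-hub") -> bool:
    """Rename one mcpServers entry in place; return True if the file changed."""
    config = json.loads(path.read_text(encoding="utf-8"))
    servers = config.get("mcpServers", {})
    if old not in servers or new in servers:
        return False  # nothing to rename, or the target name is already taken
    # Back up the original file before writing anything.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    # Rebuild the mapping so the renamed entry keeps its original position.
    config["mcpServers"] = {(new if key == old else key): value
                            for key, value in servers.items()}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

Calling `migrate_mcp_config(Path.home() / ".claude.json")` a second time returns `False`, so the script is safe to rerun.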

T11 — Regenerate domain repo rule files

id: CUST-WP-0025-T11
status: todo
priority: medium
state_hub_task_id: "7b41766b-f97f-4e9f-9f3c-c0937edb355f"

After the session-protocol template is updated, regenerate .claude/rules/session-protocol.md for all registered domain repos:

  • railiance-infra, railiance-cluster, railiance-platform
  • railiance-enablement, railiance-apps
  • net-kingdom, markitect, coulomb.social
  • personhood, foerster-capabilities

T12 — Full test suite and consistency check

id: CUST-WP-0025-T12
status: todo
priority: high
state_hub_task_id: "e55ae544-3cea-485e-80d5-a9696ef97b96"

Gate: all of the following must pass before Phase 2 is considered complete:

  • cd /home/worsch/state-hub && make test — full test suite
  • make fix-consistency REPO=the-custodian — workplan ↔ DB sync
  • make check-consistency-all — all registered repos
  • Manual smoke test: start dev-hub MCP server, run get_domain_summary from a domain repo

Phase 3 — Ops Hub

Goal: Runtime operations coordination per FOS §7.3.
Depends on: Phase 2 (hub-core available), Phase 1 (identity for service auth).
Repo: ops-hub (new standalone repo, registered under custodian domain)

Inventory-first implementation slice (2026-06-05): CUST-WP-0047 carves out the minimum useful part of T14/T16/T18 before the full standalone ops-hub scaffold exists: a repo-owned service inventory contract, an initial service/location/evidence seed, and the handoff path for Inter-Hub widgets and activity-core probes. The T13-T19 tasks below remain the long-term ops-hub implementation; the inventory slice produces input artifacts that the eventual ops-hub repo can ingest rather than replace.

T13 — Create ops-hub repo from hub-core scaffold

id: CUST-WP-0025-T13
status: todo
priority: medium
state_hub_task_id: "2c6d1429-a67a-4f66-84d1-cb32ffdb890f"

Create ops-hub repo with:

  • pyproject.toml depending on hub-core
  • FastAPI app factory inheriting hub-core base
  • MCP server extending hub-core base server
  • Alembic setup with hub-core core migrations + ops-specific
  • Register as managed repo under custodian domain

T14 — Ops-specific models

id: CUST-WP-0025-T14
status: todo
priority: medium
state_hub_task_id: "0e811e9b-23a5-49f9-979e-cd1c5dcd937f"

Define SQLAlchemy models for:

  • Service: name, namespace, health_status, last_seen, endpoints
  • Incident: severity, status (open/investigating/mitigated/resolved), timeline
  • Runbook: service_id, trigger_conditions, steps, last_executed
  • AccessPath: type (ssh/k8s/http), target, auth_method, status
  • OperationalDebt: category, severity, location, owner
  • ChangeRecord: what changed, when, by whom, rollback_path
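
The Incident status values above imply a lifecycle. The sketch below shows one possible transition table, written with a plain Enum rather than SQLAlchemy so that it stays self-contained. The allowed transitions are an assumption, in particular the step from mitigated back to investigating.

```python
from enum import Enum


class IncidentStatus(str, Enum):
    OPEN = "open"
    INVESTIGATING = "investigating"
    MITIGATED = "mitigated"
    RESOLVED = "resolved"


# Assumed lifecycle: forward-only, except that a mitigated incident
# may return to investigating if the mitigation does not hold.
ALLOWED_TRANSITIONS = {
    IncidentStatus.OPEN: {IncidentStatus.INVESTIGATING, IncidentStatus.RESOLVED},
    IncidentStatus.INVESTIGATING: {IncidentStatus.MITIGATED, IncidentStatus.RESOLVED},
    IncidentStatus.MITIGATED: {IncidentStatus.INVESTIGATING, IncidentStatus.RESOLVED},
    IncidentStatus.RESOLVED: set(),
}


def can_transition(current: IncidentStatus, target: IncidentStatus) -> bool:
    """Return True if an incident may move from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

In a SQLAlchemy model the same Enum could back the status column, with `can_transition` called before each update.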

T15 — Ops-specific MCP tools

id: CUST-WP-0025-T15
status: todo
priority: medium
state_hub_task_id: "3fdd1f61-4c8e-4614-898b-df7a9aa4a514"

Implement ops-domain MCP tools:

  • Service registry: register_service, list_services, get_service_health
  • Health probes: probe_service, get_cluster_health, get_storage_health
  • Incident lifecycle: create_incident, update_incident, resolve_incident
  • Runbook: get_runbook, execute_runbook_step
  • Access: list_access_paths, check_access_path

T16 — Railiance infrastructure integration

id: CUST-WP-0025-T16
status: todo
priority: medium
state_hub_task_id: "702849c5-b253-4ede-afa7-0ab4f81e49a5"

Connect ops-hub to railiance infrastructure observability:

  • k3s cluster health via kubectl/API
  • Longhorn storage status and replication state
  • Certificate expiry tracking (cert-manager)
  • Backup status (S2 integrated backup)
  • SSH tunnel health (ops-bridge)

T17 — Cross-hub protocol: ops-hub to dev-hub

id: CUST-WP-0025-T17
status: todo
priority: medium
state_hub_task_id: "b99a3ed8-440b-4e28-88f5-495de7276f66"

Implement FOS §9.2.5 event coupling:

  • Deployment events in dev-hub → change signals in ops-hub
  • Incident events in ops-hub → blocker signals in dev-hub
  • Shared event vocabulary (canonical event_types)
  • HTTP-based event forwarding (keep it simple; upgrade to NATS later if needed)
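
The coupling in the list above can be sketched as a routing table plus a plain HTTP POST. Everything named here is hypothetical: the event type strings, the `/events` path, and the `HubEvent` fields are placeholders until the shared vocabulary is defined. The function only builds the request, so it can be inspected or tested without a running hub.

```python
import json
import urllib.request
from dataclasses import asdict, dataclass
from typing import Dict, Optional

# Assumed routing: (source hub, event type) -> (target hub, signal type).
SIGNAL_FOR_EVENT = {
    ("dev-hub", "deployment"): ("ops-hub", "change_signal"),
    ("ops-hub", "incident"): ("dev-hub", "blocker_signal"),
}


@dataclass
class HubEvent:
    source_hub: str
    event_type: str
    subject: str
    detail: str


def build_forward_request(event: HubEvent,
                          base_urls: Dict[str, str]) -> Optional[urllib.request.Request]:
    """Translate a coupled event into a POST for the target hub, else None."""
    route = SIGNAL_FOR_EVENT.get((event.source_hub, event.event_type))
    if route is None:
        return None  # this event type is not coupled across hubs
    target_hub, signal_type = route
    payload = {**asdict(event), "event_type": signal_type}
    return urllib.request.Request(
        url=f"{base_urls[target_hub]}/events",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending is then a single `urllib.request.urlopen(request, timeout=5)` call. Keeping the routing table as data means a later move to NATS would only replace the transport, not the mapping.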

T18 — Ops Hub "now view" dashboard

id: CUST-WP-0025-T18
status: todo
priority: low
state_hub_task_id: "5b6cea8b-3982-49be-bacf-7269a3d2104e"

Observable Framework dashboard for ops-hub:

  • Service status grid (green/amber/red)
  • Active incidents timeline
  • Access path map
  • Storage and certificate health
  • Recent change log

T19 — Register ops-hub as MCP server

id: CUST-WP-0025-T19
status: todo
priority: medium
state_hub_task_id: "f033c80e-4ebb-49cf-8987-20c9b2ff4c13"

Register ops-hub MCP server:

  • Port 8002 for ops-hub (dev-hub stays on 8001)
  • Update global ~/.claude/CLAUDE.md with ops-hub registration
  • Update session protocol: domain repos that touch infrastructure should call both get_domain_summary() (dev-hub) and ops-hub orientation

Phase 4 — Business Model & Fin Hub

Goal: First monetization via railiance-as-a-service + resource viability hub.
Depends on: Phase 3 (multi-hub pattern proven).

T20 — Business model canvas: railiance-as-a-service

id: CUST-WP-0025-T20
status: todo
priority: medium
state_hub_task_id: "55db0560-2733-481d-adba-b72c3839ba45"

Define the offering:

  • Target: EU SMEs needing sovereign, GDPR-compliant DevOps infrastructure
  • Core: managed k3s cluster + observability + GitOps + backup
  • Differentiator: VSM-based organizational architecture, not just infra
  • Pricing tiers: self-hosted (open-source), managed, fully operated
  • Document as canon/projects/railiance/business-model-canvas_v0.1.md

T21 — Canon: Bootstrap Protocol document

id: CUST-WP-0025-T21
status: todo
priority: medium
state_hub_task_id: "ce54d3fc-140e-49be-a181-779abc434d4e"

Address FOS blindspot #2 (bootstrapping & initial capital):

  • Seed funding strategy and minimum viable budget
  • MVP scope definition (what must exist before first customer)
  • First 3 mandated roles: Constitutional Steward, Technical Operator, Financial Allocator
  • Revenue threshold for role formalization
  • Document as canon/constitution/bootstrap-protocol_v0.1.md

T22 — Create fin-hub repo from hub-core scaffold

id: CUST-WP-0025-T22
status: todo
priority: low
state_hub_task_id: "670757d8-305d-4736-9056-e79a150114b1"

Create fin-hub repo with same scaffold pattern as ops-hub. Register under custodian domain.

T23 — Fin-specific models

id: CUST-WP-0025-T23
status: todo
priority: low
state_hub_task_id: "8ebffb3f-0dbb-4672-b4e9-928992c41cf4"

Define SQLAlchemy models for:

  • Budget: domain, period, allocated, committed, spent
  • Commitment: type (subscription/contract/salary), amount, cadence, start/end
  • BurnRate: domain, period, actual_spend, projected_spend
  • RunwayProjection: current_balance, monthly_burn, months_remaining, alert_threshold
  • TokenSpend: provider (anthropic/openai), model, tokens_in, tokens_out, cost, session_id

T24 — Fin-hub implementation: cost tracking + runway

id: CUST-WP-0025-T24
status: todo
priority: low
state_hub_task_id: "405f81d3-dec5-4154-a1b8-a3af344a0cc4"

Implement:

  • Cloud cost ingestion (manual CSV import initially, OpenCost integration later)
  • Anthropic API token spend tracking (parse billing exports)
  • HostEurope server cost tracking
  • Runway calculator with burn-rate projection
  • Budget alerts when projected runway drops below threshold
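
The runway calculation reduces to dividing the current balance by a burn rate. The sketch below uses the RunwayProjection field names from T23 and assumes the burn rate is a simple average of recent monthly spend; the six-month default threshold is a placeholder, not a decided value.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RunwayProjection:
    current_balance: float
    monthly_burn: float
    months_remaining: Optional[float]  # None when there is no net burn
    alert: bool


def project_runway(current_balance: float,
                   monthly_spend: Sequence[float],
                   alert_threshold: float = 6.0) -> RunwayProjection:
    """Average recent monthly spend as the burn rate and project the runway."""
    if not monthly_spend:
        raise ValueError("at least one month of spend is required")
    burn = sum(monthly_spend) / len(monthly_spend)
    if burn <= 0:
        # No net spend means the runway is unbounded, so no alert.
        return RunwayProjection(current_balance, burn, None, False)
    months = current_balance / burn
    # Alert only when the runway drops strictly below the threshold (in months).
    return RunwayProjection(current_balance, burn, months, months < alert_threshold)
```

With a balance of 9000 and a single month of 3000 spend, the projection is three months of runway and the alert flag is set.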

T25 — Cross-hub coupling: fin-hub connections

id: CUST-WP-0025-T25
status: todo
priority: low
state_hub_task_id: "90a41790-7290-4145-b89f-88bf491d7652"

Implement FOS §9 cross-hub coupling:

  • fin→dev: resource pressure signals (budget alerts surface in dev-hub)
  • fin→ops: infrastructure cost attribution (per-service cost view)
  • fin→canon: viability alerts (runway below threshold escalates to System 5)

T26 — Pricing and packaging: railiance-as-a-service MVP

id: CUST-WP-0025-T26
status: todo
priority: low
state_hub_task_id: "e17ef269-e349-44cc-ab14-6c57b43199b1"

Concrete pricing and packaging deliverables:

  • Define 3 tiers with feature matrix
  • Create landing page content
  • Define onboarding workflow (customer → provisioned k3s + monitoring)
  • Legal: GmbH implications, liability, SLA framework
  • First customer acquisition strategy