diff --git a/docs/core-hub-replacement-evidence.md b/docs/core-hub-replacement-evidence.md
index 6d599e3..00d6bdf 100644
--- a/docs/core-hub-replacement-evidence.md
+++ b/docs/core-hub-replacement-evidence.md
@@ -72,8 +72,10 @@ Inter-Hub bootstrap or rollback validation.
 - T19: cancel or defer ops-hub MCP server registration until post-cutover demand proves it is needed.

-This is enough to rewrite `CUST-WP-0025` safely, but not enough to declare Core
-Hub production cutover complete.
+2026-06-27 follow-up: `CUST-WP-0025-T13` through `T19` have now been
+rewritten around this recommendation. The rewrite is enough to stop the obsolete
+standalone ops-hub scaffold sequence, but not enough to declare Core Hub
+production cutover complete.

 ## Remaining Gates

diff --git a/docs/fos-hub-bootstrap-sequence-status.md b/docs/fos-hub-bootstrap-sequence-status.md
index b9fc5af..8b659cc 100644
--- a/docs/fos-hub-bootstrap-sequence-status.md
+++ b/docs/fos-hub-bootstrap-sequence-status.md
@@ -21,14 +21,14 @@ Do not restart FOS bootstrap at the old `NK-WP-0001` Keycloak path. That workpla
 | --- | --- | --- |
 | Identity | Old `CUST-WP-0025-T01` pointed at archived `NK-WP-0001`; local identity and IAM Profile v0.2 are done. | Keep T01 cancelled, T02 done, and make T03 the remaining identity gate: a protected FastAPI fixture using IAM Profile v0.2 against local-identity or KeyCape. |
 | Hub extraction/dev-hub | `CUST-WP-0025-T05` through `T12` are done: hub-core exists, State Hub imports hub-core, and MCP naming moved to dev-hub. | Treat Phase 2 as complete. Do not spend pickup energy here unless consistency drift appears. |
-| Ops hub | The old `ops-hub`/Inter-Hub extension path has useful seed evidence, but Core Hub now has the credible replacement platform: local `/api/v2` compatibility, ops-hub bootstrap smoke, protected persistence-backed resources, and `/console` visual checks. | Create or update the Core Hub API-first continuation workplan.
Treat Haskell Inter-Hub as legacy compatibility or rollback evidence. |
-| Old ops-hub scaffold tasks | `CUST-WP-0025-T13`-`T19` still describe a standalone hub-core/FastAPI/MCP scaffold. Current implementation direction is Core Hub replacement-first, not Inter-Hub extension-first. | Reconcile these tasks after Core Hub has a deployed compatibility/evidence smoke: rewrite them to Core Hub-owned API/CLI/UI tasks or explicitly defer/cancel the old standalone scaffold. |
+| Ops hub | Core Hub is now the replacement platform: `CORE-WP-0008` finished the API smoke harness, activity-core sink, staging profile, CLI wrappers, UI rebuild backlog, and Custodian handoff. Live deployed smokes and cutover evidence are still open. | Continue through Core Hub deployed evidence, migration import, activity-core smoke, and cutover gates. Treat Haskell Inter-Hub as legacy compatibility or rollback evidence. |
+| Old ops-hub scaffold tasks | `CUST-WP-0025-T13`-`T19` have been rewritten around Core Hub API evidence, CLI parity, deployed smoke/cutover gates, whynot-aligned UI, and cancellation of immediate standalone ops-hub MCP registration. | Execute the remaining wait/todo gates in the rewritten Phase 3. Do not resume the obsolete standalone ops-hub scaffold sequence. |
 | Fin hub/business | `CUST-WP-0025-T20`-`T26` are all todo and depend on a proven multi-hub pattern. | Defer until ops-hub has a working first signal and the identity integration gate is proven. |

 ## Stable Pickup Order

 1. Close the identity drift: T01 cancelled, T02 done, T03 remains as the one real identity integration test.
-2. Use `CUST-WP-0052` to open or update the Core Hub API-first continuation lane.
-3. Keep `CUST-WP-0047`/`CUST-WP-0049` as legacy evidence/fallback until Core Hub smoke evidence or an explicit supersede decision closes them.
-4. Rewrite `CUST-WP-0025-T13`-`T19` after Core Hub proves the replacement path.
+2. 
Use the finished `CORE-WP-0008` evidence lane and `CUST-WP-0052` reset notes as the Core Hub replacement baseline.
+3. Keep `CUST-WP-0047`/`CUST-WP-0049` as legacy evidence/fallback until Core Hub deployed smoke evidence or an explicit supersede decision closes them.
+4. Execute rewritten `CUST-WP-0025-T14`, `T16`, `T17`, and `T18` in API/CLI/UI order.
 5. Start fin-hub/business work only after ops-hub proves the Core Hub pattern end-to-end.
diff --git a/workplans/CUST-WP-0025-fos-hub-bootstrap.md b/workplans/CUST-WP-0025-fos-hub-bootstrap.md
index f258f03..f7d36c2 100644
--- a/workplans/CUST-WP-0025-fos-hub-bootstrap.md
+++ b/workplans/CUST-WP-0025-fos-hub-bootstrap.md
@@ -350,15 +350,15 @@ few repos (unrelated to dev-hub rename); no new automation errors introduced.
 **Goal**: Runtime operations coordination per FOS §7.3.
 **Depends on**: Phase 2 (hub_core available), Phase 1 (identity for service auth).
-**Repo**: ops-hub (new standalone repo, registered under custodian domain)
+**Repo**: core-hub for replacement runtime; the-custodian for coordination; standalone ops-hub is deferred until post-cutover need is proven.

 **Inventory-first implementation slice (2026-06-05):** `CUST-WP-0047`
-carves out the minimum useful part of T14/T16/T18 before the full standalone
-`ops-hub` scaffold exists: a repo-owned service inventory contract, an initial
+carves out the minimum useful part of T14/T16/T18 before the replacement runtime
+is fully proven: a repo-owned service inventory contract, an initial
 service/location/evidence seed, and the handoff path for Inter-Hub widgets and
-activity-core probes. The T13-T19 tasks below remain the long-term ops-hub
-implementation; the inventory slice produces input artifacts that the eventual
-ops-hub repo can ingest rather than replace.
+activity-core probes. After the Core Hub reset, these artifacts feed Core Hub
+ops evidence first; a separate ops-hub repo should ingest them only if a
+post-cutover service boundary is proven useful.

 **Inter-Hub bootstrap access lane (2026-06-17):** `CUST-WP-0049` extracts the repeatable authenticated bootstrap routine needed to finish ops-hub production
@@ -367,29 +367,37 @@ ops-warden owns the short-lived SSH certificate envelope, and operator secret
 custody remains outside Git.

 **Core Hub reset (2026-06-27):** `CUST-WP-0052` supersedes the Inter-Hub-first
-implementation direction for future work. The old T13-T19 standalone ops-hub
-scaffold should not be executed literally until it is rewritten around Core Hub:
-API-first replacement contracts, CLI helpers second, and a rebuilt whynot-aligned
-operator UI third. Keep this phase active as a coordination record, not as a
-mandate to expand Haskell Inter-Hub.
+implementation direction for future work. T13-T19 below have been rewritten
+around Core Hub: API-first replacement contracts, CLI helpers second, deployed
+evidence and cutover gates, and a rebuilt whynot-aligned operator UI third. Keep
+Haskell Inter-Hub as legacy compatibility or rollback evidence, not the preferred
+implementation target.

-### T13 — Create ops-hub repo from hub-core scaffold
+### T13 — Open Core Hub replacement lane

 ```task
 id: CUST-WP-0025-T13
-status: todo
-priority: medium
+status: done
+priority: high
 state_hub_task_id: "2c6d1429-a67a-4f66-84d1-cb32ffdb890f"
 ```

-Create `ops-hub` repo with:
-- pyproject.toml depending on hub-core
-- FastAPI app factory inheriting hub-core base
-- MCP server extending hub-core base server
-- Alembic setup with hub-core core migrations + ops-specific
-- Register as managed repo under custodian domain
+Replace the old immediate standalone `ops-hub` repo scaffold with a Core
+Hub-owned replacement lane.
-### T14 — Ops-specific models
+The replacement lane must keep the FOS intent of runtime operations
+coordination while using the current implementation order:
+
+- Core Hub API resources and compatibility/evidence smokes first;
+- thin operator CLI wrappers second;
+- web UI rebuild third, after API/CLI parity is stable.
+
+Completed 2026-06-27: Core Hub workplan `CORE-WP-0008` finished as the
+API-first execution counterpart, and Custodian recorded the replacement evidence
+handoff in `docs/core-hub-replacement-evidence.md`. This task is complete as a
+reframe/open-lane task; it does not claim production cutover is complete.
+
+### T14 — Define Core Hub ops evidence contract and read-model gaps

 ```task
 id: CUST-WP-0025-T14
@@ -398,91 +406,143 @@ priority: medium
 state_hub_task_id: "0e811e9b-23a5-49f9-979e-cd1c5dcd937f"
 ```

-Define SQLAlchemy models for:
-- **Service**: name, namespace, health_status, last_seen, endpoints
-- **Incident**: severity, status (open/investigating/mitigated/resolved), timeline
-- **Runbook**: service_id, trigger_conditions, steps, last_executed
-- **AccessPath**: type (ssh/k8s/http), target, auth_method, status
-- **OperationalDebt**: category, severity, location, owner
-- **ChangeRecord**: what changed, when, by whom, rollback_path
+Define the Core Hub-owned operations evidence contract that replaces the old
+standalone ops-specific model list.

-### T15 — Ops-specific MCP tools
+The contract should reconcile:
+
+- `CUST-WP-0047` service inventory and current evidence vocabulary;
+- Core Hub hubs, manifests, widgets, API consumers, and interaction events;
+- activity-core probe metadata and `core-hub-interaction-event` sink output;
+- migration runs, deployment records, outcome signals, and cutover evidence;
+- non-secret custody rules for key prefixes, hashes, routes, and evidence ids.
+
+Known Core Hub API/read-model gaps to resolve before UI expansion:
+
+- a protected migration-run read route such as `/api/v2/migration-runs`;
+- non-deferred deployment/outcome evidence routes where needed;
+- a mapping from service inventory ids to Core Hub widgets/events.
+
+Done when Core Hub has a workplan or spec that names the API resources, record
+shape, evidence event vocabulary, and migration path from the existing
+Custodian inventory artifacts.
+
+### T15 — Core Hub operator CLI parity

 ```task
 id: CUST-WP-0025-T15
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "3fdd1f61-4c8e-4614-898b-df7a9aa4a514"
 ```

-Implement ops-domain MCP tools:
-- Service registry: register_service, list_services, get_service_health
-- Health probes: probe_service, get_cluster_health, get_storage_health
-- Incident lifecycle: create_incident, update_incident, resolve_incident
-- Runbook: get_runbook, execute_runbook_step
-- Access: list_access_paths, check_access_path
+Replace the old MCP-first ops tool plan with API and CLI parity first.

-### T16 — Railiance infrastructure integration
+Required CLI surface:
+
+- deployed Core Hub smoke evidence;
+- ops-hub bootstrap/status checks;
+- migration bundle validate/import;
+- cutover readiness summary from non-secret evidence reports.
+
+Completed 2026-06-27: `CORE-WP-0008-T05` added `make operator-cli` and
+`scripts/core_hub_cli.py` with wrappers around the same Core Hub API behavior
+used by tests and smokes. Any MCP surface should consume these proven APIs later
+rather than becoming the first implementation path.
+
+### T16 — Deployed ops evidence and activity-core smokes

 ```task
 id: CUST-WP-0025-T16
-status: todo
-priority: medium
+status: wait
+priority: high
 state_hub_task_id: "702849c5-b253-4ede-afa7-0ab4f81e49a5"
 ```

-Connect ops-hub to railiance infrastructure observability:
-- k3s cluster health via kubectl/API
-- Longhorn storage status and replication state
-- Certificate expiry tracking (cert-manager)
-- Backup status (S2 integrated backup)
-- SSH tunnel health (ops-bridge)
+Run the production-like Core Hub evidence smokes that replace the old direct
+Railiance infrastructure integration task.

-### T17 — Cross-hub protocol: ops-hub to dev-hub
+Minimum evidence:
+
+- `make deployed-smoke` or `make operator-cli CLI_ARGS="deployed-smoke ..."`
+  against a real Core Hub staging URL;
+- deployed activity-core Core Hub sink smoke with approved runtime token and
+  widget mapping;
+- non-secret report fields only: run id, hub/manifest/API-consumer ids,
+  key prefixes, widget/event ids, counts, statuses, and containment booleans;
+- State Hub progress note linking the evidence and naming any remaining gates.
+
+Blocked until an approved `CORE_HUB_BASE_URL`, operator/runtime token custody
+path, and activity-core widget mapping are available. This task can close or
+supersede `CUST-WP-0047-T05` and `CUST-WP-0049-T06` only after deployed Core
+Hub evidence exists or an explicit supersede decision is recorded.
+
+### T17 — Core Hub, dev-hub, and cutover decision coupling

 ```task
 id: CUST-WP-0025-T17
-status: todo
+status: wait
 priority: medium
 state_hub_task_id: "b99a3ed8-440b-4e28-88f5-495de7276f66"
 ```

-Implement FOS §9.2.5 event coupling:
-- Deployment events in dev-hub → change signals in ops-hub
-- Incident events in ops-hub → blocker signals in dev-hub
-- Shared event vocabulary (canonical event_types)
-- HTTP-based event forwarding (keep it simple; upgrade to NATS later if needed)
+Replace the old ops-hub-to-dev-hub protocol task with Core Hub replacement
+coupling and cutover decision records.

-### T18 — Ops Hub "now view" dashboard
+Minimum scope:
+
+- Core Hub readiness summary from deployed smoke, migration import,
+  activity-core sink, and optional legacy Inter-Hub reference evidence;
+- State Hub progress/decision records that state whether legacy Inter-Hub
+  fallback remains required;
+- compatibility notes for consumers that still expect Inter-Hub `/api/v2`;
+- rollback and Haskell retirement gates kept explicit.
+
+Blocked until `CORE-WP-0005` staging import, dual-run smokes, and cutover
+readiness evidence exist. Do not unblock `CORE-WP-0007` Haskell retirement from
+local-only evidence.
+
+### T18 — Core Hub operator UI first screens

 ```task
 id: CUST-WP-0025-T18
 status: todo
-priority: low
+priority: medium
 state_hub_task_id: "5b6cea8b-3982-49be-bacf-7269a3d2104e"
 ```

-Observable Framework dashboard for ops-hub:
-- Service status grid (green/amber/red)
-- Active incidents timeline
-- Access path map
-- Storage and certificate health
-- Recent change log
+Replace the old Observable Framework dashboard task with the Core Hub operator
+UI rebuild backlog.

-### T19 — Register ops-hub as MCP server
+Initial UI work should implement only the first operator-critical screens:
+
+- readiness overview;
+- registry explorer;
+- evidence stream;
+- migration/cutover state;
+- action-required gates;
+- access metadata as a support panel, not a broad expansion area.
+
+Use whynot-design tokens/components wherever practical and preserve
+`make visual-check` style desktop/mobile, no-overlap, text-overflow, protected
+route, and non-secret assertions. Start implementation from Core Hub
+`docs/specs/operator-ui-rebuild-backlog.md`, not from old Inter-Hub screens.
+
+### T19 — Ops-hub MCP server registration decision

 ```task
 id: CUST-WP-0025-T19
-status: todo
+status: cancel
 priority: medium
 state_hub_task_id: "f033c80e-4ebb-49cf-8987-20c9b2ff4c13"
 ```

-Register ops-hub MCP server:
-- Port 8002 (dev-hub on 8001, ops-hub on 8002)
-- Update global `~/.claude/CLAUDE.md` with ops-hub registration
-- Update session protocol: domain repos that touch infrastructure should
-  call both `get_domain_summary()` (dev-hub) and ops-hub orientation
+Cancel the old immediate registration of a standalone `ops-hub` MCP server.
+
+The preferred replacement path is Core Hub API first and operator CLI second.
+Register a separate ops-hub MCP server only if post-cutover usage proves that a
+separate service boundary is still useful. Until then, State Hub progress and
+Core Hub API/CLI evidence are the coordination surfaces.

 ## Phase 4 — Business Model & Fin Hub

diff --git a/workplans/CUST-WP-0051-infrastructure-stabilization-metaplan.md b/workplans/CUST-WP-0051-infrastructure-stabilization-metaplan.md
index b92f0e4..f744641 100644
--- a/workplans/CUST-WP-0051-infrastructure-stabilization-metaplan.md
+++ b/workplans/CUST-WP-0051-infrastructure-stabilization-metaplan.md
@@ -456,21 +456,22 @@ Progress 2026-06-27:
   `NK-WP-0001` Keycloak task is cancelled as superseded, `NK-WP-0002` local identity is done, and the remaining identity gate is the IAM Profile v0.2 FastAPI integration test.
-- Current ops-hub reality is extension-first: `ops-hub` exists,
-  `OPS-WP-0001` is finished, and `OPS-WP-0002` waits on authenticated
-  Inter-Hub bootstrap/runtime-key evidence. Reconcile `CUST-WP-0025-T13`-`T19`
-  after the first governed ops event lands.
+- Current ops-hub reality is Core Hub replacement-first: `CORE-WP-0008`
+  finished the API smoke harness, activity-core sink, staging profile, CLI
+  wrappers, UI rebuild backlog, and Custodian handoff. `CUST-WP-0025-T13`-`T19`
+  have been rewritten away from the obsolete standalone scaffold.
 - Fin-hub/business tasks remain deliberately deferred until identity integration and ops-hub extension evidence are proven.

 Progress 2026-06-27 Core Hub reset:

-- `CUST-WP-0052` now owns the reset criteria. `CUST-WP-0025-T13` through
-  `T19` should not be executed literally as the old standalone ops-hub scaffold
-  until Core Hub replacement evidence is good enough and the tasks are rewritten.
-- Core Hub is promising enough to stop expanding the Inter-Hub-first path:
-  local ops-hub bootstrap compatibility and `/console` visual checks exist, but
-  staging import, deployed dual-run smokes, and cutover evidence are still open.
+- `CUST-WP-0052` completed the Phase 3 reset. `CUST-WP-0025-T13` through
+  `T19` now point at Core Hub-owned API evidence, CLI parity, deployed
+  smoke/cutover gates, whynot-aligned UI, and cancellation of immediate
+  standalone ops-hub MCP registration.
+- Core Hub is now the preferred replacement lane, but staging import, deployed
+  dual-run smokes, cutover evidence, and Haskell retirement approval remain
+  open.

 ## Task: Create The Stable Pickup Checkpoint

diff --git a/workplans/CUST-WP-0052-core-hub-fos-reset.md b/workplans/CUST-WP-0052-core-hub-fos-reset.md
index 23fb387..d40f4de 100644
--- a/workplans/CUST-WP-0052-core-hub-fos-reset.md
+++ b/workplans/CUST-WP-0052-core-hub-fos-reset.md
@@ -158,7 +158,7 @@ and gated UI rebuild criteria.

 ```task
 id: CUST-WP-0052-T04
-status: todo
+status: done
 priority: high
 state_hub_task_id: "04c9c807-68d0-4750-bd72-a484730cd55d"
 ```
@@ -181,9 +181,14 @@ points future agents at the obsolete mega-hub/Inter-Hub scaffold sequence.
 summarizes the Core Hub replacement proof from `CORE-WP-0008-T02` through `T06`, records why `CUST-WP-0047-T05` and `CUST-WP-0049-T06` should remain legacy/fallback wait tasks for now, and gives rewrite guidance for
-`CUST-WP-0025-T13` through `T19`. The actual `CUST-WP-0025` rewrite is still
-open because no live deployed Core Hub smoke ids/counts or cutover proof exist
-yet.
+`CUST-WP-0025-T13` through `T19`.
+
+Completed 2026-06-27: rewrote `CUST-WP-0025-T13` through `T19` around Core
+Hub-owned API evidence, operator CLI parity, deployed smoke/cutover gates, and
+the whynot-aligned Core Hub UI backlog. The rewrite marks the old immediate
+standalone ops-hub MCP registration as cancelled, keeps deployed evidence and
+cutover tasks waiting on real staging/runtime proof, and does not claim Haskell
+retirement is unblocked.

 ## Task: Align Helixforge Build And Environment Practices

@@ -265,8 +270,8 @@ workplan notes, not buried in chat.

 - CUST-WP-0051, CUST-WP-0047, and CUST-WP-0049 point toward Core Hub replacement instead of further Inter-Hub expansion.
-- CUST-WP-0025 has a clear reset gate and no one resumes the old standalone
-  ops-hub scaffold until it is rewritten.
+- CUST-WP-0025 Phase 3 has been rewritten so no one resumes the old
+  standalone ops-hub scaffold sequence.
 - The next implementation lane is API first, CLI second, web UI third.
 - UI rebuild expectations name whynot-design and operator-priority views.
 - External ops-warden needs are routed through State Hub requirements, not