diff --git a/wiki/OpsHubInventory.md b/wiki/OpsHubInventory.md
new file mode 100644
index 0000000..e8953a3
--- /dev/null
+++ b/wiki/OpsHubInventory.md
@@ -0,0 +1,132 @@
+# Ops Hub Initial Inventory
+
+Date: 2026-05-16
+
+## Purpose
+
+This document is the first structured inventory for `ops-hub`, the VSM
+Operations / System 1 hub. It turns the current operations situation into a
+catalogable model before `ops-hub` has its own repository, collectors, or UI.
+
+Source background:
+
+- `wiki/CurrentOperationsSituation.md`
+- `workplans/HF-WP-0001-establish-ops-hub-first-extension.md`
+
+## VSM Placement
+
+| Field | Value |
+|---|---|
+| Hub | `ops-hub` |
+| Hub family | `vsm` |
+| VSM function | `OPS` |
+| VSM system | `S1` |
+| Primary concern | Operational truth and evidence |
+
+`ops-hub` owns the description of what is currently running, where it runs, how
+it is reached, what state it is in, and what operational evidence exists. It
+does not replace State Hub workstreams or Inter-Hub governance.
+
+## Environments
+
+| Environment | Role | Current state | Notes |
+|---|---|---|---|
+| `local` | Workstation development and local services | Active, important, not production | Hosts State Hub and local build/runtime pieces. |
+| `coulombcore` | Live transitional production | Active, production-like, historically hand-built | Public IP `92.205.130.254`; runs current Gitea and experimental operational services. |
+| `railiance01` | Future production foundation | Provisioning target | Public IP `92.205.62.239`; first server of intended ThreePhoenix shape. |
+| `threephoenix-prod` | Target production topology | Planned | Future governed multi-node production environment. |
+
+## Hosts
+
+| Host | Environment | Address | Role | Known gaps |
+|---|---|---|---|---|
+| `coulombcore` | `coulombcore` | `92.205.130.254` | Current live production-like server | Needs service catalog, drift tracking, backup/restore evidence, and migration disposition. |
+| `railiance01` | `railiance01` | `92.205.62.239` | First ThreePhoenix production foundation node | Needs full inventory, readiness gates, and cluster/platform bootstrap evidence. |
+| local workstation | `local` | local/private | State Hub and development runtime host | Needs explicit service ownership and backup expectations. |
+
+Ops Bridge may provide reachability evidence for connected servers, but it is
+not the service catalog. `ops-hub` should turn bridge reachability into
+inventory signals rather than treating the bridge itself as the inventory.
+
+## Clusters
+
+| Cluster | Environment | Role | Current notes |
+|---|---|---|---|
+| CoulombCore Kubernetes | `coulombcore` | Current operational Kubernetes runtime | Hosts current Gitea deployment and related services. |
+| ThreePhoenix Kubernetes | `threephoenix-prod` | Target production runtime | Future governed production cluster assembled through Railiance repos. |
+
+## Services
+
+| Service | Current environment | Owner repo | Current evidence | Gaps |
+|---|---|---|---|---|
+| Gitea | `coulombcore` | `railiance-apps` | Helm release `gitea`, namespace `default`, app version `1.25.4`, NodePort `32166`, public registry path returns auth challenge. | SOPS Helm values update, package token, `docker login`, push, pull, backup coverage, restore evidence. |
+| Gitea database | `coulombcore` | `railiance-platform` | Database `gitea-db` in namespace `databases`. | Backup and restore evidence not recorded here yet. |
+| Gitea shared storage | `coulombcore` | `railiance-platform` / `railiance-apps` | PVC `default/gitea-shared-storage`. | Package blob backup and restore evidence not confirmed. |
+| State Hub | `local` | `the-custodian/state-hub` | Local API and dashboard are operational enough for repo registration and workplan sync. | Future cluster deployment/readiness still needs gates and evidence. |
+| Inter-Hub | live public endpoint | `inter-hub` | `https://hub.coulomb.social/api/v2/openapi.json` and docs are reachable. | Hub bootstrap still depends on authenticated UI or migration. |
+| Ops Bridge | local/remote bridge | `ops-bridge` | Useful for connected-server visibility. | Not a service catalog; should emit reachability evidence into `ops-hub`. |
+
+## Endpoints
+
+| Endpoint | Service | Environment | Current status | Evidence |
+|---|---|---|---|---|
+| `https://gitea.coulomb.social/v2/` | Gitea OCI registry | `coulombcore` | Route fixed; returns registry auth challenge | Expected `401` with OCI registry challenge. |
+| `https://hub.coulomb.social/api/v2/openapi.json` | Inter-Hub API | live Inter-Hub | Reachable | OpenAPI document fetched on 2026-05-16. |
+| `https://hub.coulomb.social/Hubs` | Inter-Hub UI | live Inter-Hub | Requires login | Redirects to `/NewSession`. |
+| `http://127.0.0.1:8000/state/health` | State Hub API | `local` | Reachable locally | Used for StateHub registration/sync. |
+
+## Service Catalog Gap
+
+There is no central place that answers these questions:
+
+- What runs where?
+- Which repo owns its desired state?
+- Which endpoint exposes it?
+- Which data stores back it?
+- Which backups and restore tests cover it?
+- Which migration wave will replace or move it?
+- Which current evidence proves it is healthy?
+
+`ops-hub` should be the first place where these answers are explicit and
+machine-addressable.
+
+## First Ops Widgets
+
+Seed these in Inter-Hub once `ops-hub` exists:
+
+- `ops-env-local`
+- `ops-env-coulombcore`
+- `ops-env-railiance01`
+- `ops-env-threephoenix-prod`
+- `ops-host-coulombcore`
+- `ops-host-railiance01`
+- `ops-service-catalog`
+- `ops-service-gitea`
+- `ops-service-state-hub`
+- `ops-service-inter-hub`
+- `ops-endpoint-gitea-registry`
+- `ops-readiness-gitea-registry`
+- `ops-readiness-state-hub-cluster-deploy`
+- `ops-migration-coulombcore-to-threephoenix`
+
+## First Evidence Events
+
+The first event should be the Gitea registry endpoint verification:
+
+```json
+{
+  "widgetId": "",
+  "eventType": "ops-endpoint-verified",
+  "viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001",
+  "metadata": {
+    "vsmFunction": "OPS",
+    "vsmSystem": "S1",
+    "endpoint": "https://gitea.coulomb.social/v2/",
+    "expectedStatus": 401,
+    "observedHeader": "Docker-Distribution-Api-Version: registry/2.0"
+  }
+}
+```
+
+This event is blocked until the ops event type is registered by an active
+manifest and the target widget exists.
diff --git a/wiki/OpsHubReadinessGates.md b/wiki/OpsHubReadinessGates.md
new file mode 100644
index 0000000..e9d8568
--- /dev/null
+++ b/wiki/OpsHubReadinessGates.md
@@ -0,0 +1,63 @@
+# Ops Hub Readiness Gates
+
+Date: 2026-05-16
+
+## Purpose
+
+These gates define what must be true before operational responsibility can move
+from the current CoulombCore setup to the future ThreePhoenix production setup.
+They are intended as the first `ops-hub` readiness model.
+
+Statuses:
+
+- `unknown` means no reliable evidence has been cataloged yet.
+- `partial` means some evidence exists, but the gate is not complete.
+- `blocked` means a required precondition is missing.
+- `ready` means the evidence requirement is satisfied.
+
+## Gates
+
+| ID | Gate | Owner repo | Evidence requirement | Current status |
+|---|---|---|---|---|
+| OPS-G01 | Environment inventory exists | `helix-forge` | `local`, `coulombcore`, `railiance01`, and `threephoenix-prod` are represented with role, lifecycle state, and owner notes. | `partial` |
+| OPS-G02 | Service catalog exists | `helix-forge` then future `ops-hub` | Each live and target service has environment, owner repo, endpoint, backing stores, lifecycle state, and evidence links. | `partial` |
+| OPS-G03 | DNS and TLS are codified | `railiance-cluster` / `railiance-apps` | Public hostnames, ingress routes, certificate sources, and renewal paths are declared in repo files. | `unknown` |
+| OPS-G04 | Git hosting is reproducible | `railiance-apps` / `railiance-platform` | Gitea or successor deployment can be recreated from repo state, including database and storage dependencies. | `partial` |
+| OPS-G05 | Container registry publishing is proven | `railiance-apps` | `docker login`, push, and pull succeed against `https://gitea.coulomb.social/v2/` using governed secrets. | `partial` |
+| OPS-G06 | Persistent data is backed up | `railiance-platform` | Each persistent data store has backup location, schedule, retention, ownership, and latest successful backup evidence. | `unknown` |
+| OPS-G07 | Restore path is proven | `railiance-platform` / `railiance-apps` | Restore test evidence exists for Gitea database, package blobs, and State Hub data. | `unknown` |
+| OPS-G08 | Secrets path is governed | `railiance-infra` / `railiance-apps` | SOPS/age keys and operator secret paths are documented; no required secret depends on shell memory. | `partial` |
+| OPS-G09 | Cluster runtime is reproducible | `railiance-cluster` | Kubernetes runtime, ingress, CNI, operators, and routing primitives are recreated through repo-owned automation. | `unknown` |
+| OPS-G10 | Platform services are reproducible | `railiance-platform` | PostgreSQL/CNPG, object storage, secret management, and identity dependencies have repo-owned deployment evidence. | `unknown` |
+| OPS-G11 | Application deployment is reproducible | `railiance-apps` | Gitea, Inter-Hub, State Hub, and other application releases are declared with Helm values and deployment runbooks. | `partial` |
+| OPS-G12 | Rollback path is documented | owning service repos | Each migration wave has rollback conditions, steps, and data safety notes. | `unknown` |
+| OPS-G13 | Operator runbooks exist | owning service repos | Deploy, restore, rotate, incident response, and migration runbooks exist for each critical service. | `unknown` |
+| OPS-G14 | Observability and health checks are explicit | `railiance-cluster` / `railiance-platform` / service repos | Health checks, logs, metrics, and endpoint probes are documented and tied to service catalog entries. | `unknown` |
+| OPS-G15 | Inter-Hub ops bootstrap is available | `inter-hub` / `helix-forge` | `ops-hub` can be created through UI or migration, manifest activated, API consumer/key created, widgets seeded, and events accepted. | `partial` |
+
+## Initial Migration Waves
+
+| Wave | Goal | Required gates |
+|---|---|---|
+| `wave-0-catalog` | Establish the operational truth surface without moving services. | OPS-G01, OPS-G02, OPS-G15 |
+| `wave-1-registry-proof` | Prove current Gitea registry publishing and evidence capture. | OPS-G03, OPS-G05, OPS-G08, OPS-G14 |
+| `wave-2-backup-restore` | Confirm backups and restore paths for critical persistent state. | OPS-G06, OPS-G07, OPS-G13 |
+| `wave-3-threephoenix-foundation` | Recreate cluster and platform foundations on railiance01/ThreePhoenix. | OPS-G09, OPS-G10 |
+| `wave-4-service-migration` | Move or replace production responsibilities from CoulombCore to ThreePhoenix. | OPS-G04, OPS-G11, OPS-G12 plus service-specific gates |
+
+## Evidence Shape
+
+Each readiness gate should eventually be represented in `ops-hub` as a widget
+or widget family with events like:
+
+- `ops-readiness-gate-updated`
+- `ops-endpoint-verified`
+- `ops-backup-verified`
+- `ops-restore-tested`
+- `ops-risk-raised`
+- `ops-migration-gate-passed`
+- `ops-migration-gate-failed`
+
+Until Inter-Hub can create all required records through API calls, the evidence
+can be maintained in this repository and mirrored into Inter-Hub through the UI
+or migrations.
diff --git a/workplans/HF-WP-0001-establish-ops-hub-first-extension.md b/workplans/HF-WP-0001-establish-ops-hub-first-extension.md
index 29ffe31..72fdb58 100644
--- a/workplans/HF-WP-0001-establish-ops-hub-first-extension.md
+++ b/workplans/HF-WP-0001-establish-ops-hub-first-extension.md
@@ -142,6 +142,48 @@ Assessment:
 the quickstart as aspirational for bootstrap automation until
 Inter-Hub is hardened.
 
+## Confirmed Bootstrap Path
+
+Checked against the live API and local Inter-Hub source on 2026-05-16.
+
+Decision:
+
+- Bootstrap `ops-hub` through the authenticated Inter-Hub admin UI where
+  possible, with a migration-backed fallback when a repeatable bootstrap is
+  needed before the public API is hardened.
+- Treat Pattern A, API Consumer Hub, as the first implementation pattern.
+- Store VSM classification in the manifest capability description for now,
+  because the current `hubs` table only has `slug`, `name`, `domain`, and
+  `hub_kind`; there are no first-class `hub_family`, `vsm_function`, or
+  `vsm_system` columns yet.
+- Use `hub_kind = domain` for `ops-hub`.
+- Record missing first-class VSM metadata fields and create endpoints as
+  Inter-Hub hardening work under T10.
+
+Confirmed current support:
+
+- Live `/Hubs` redirects to `/NewSession`, so hub creation is an authenticated
+  UI flow.
+- Local `HubsController` supports creating and editing hub rows through the UI.
+- Local `HubCapabilityManifestsController` supports draft creation, JSON-array
+  vocabulary editing, activation, and registry upsert.
+- Local `ApiConsumersController` and `ApiKeysController` support authenticated
+  UI creation of consumers and static keys.
+- Local `/api/v2/interaction-events` supports `POST` and validates event types.
+
+Confirmed gaps:
+
+- Live OpenAPI has no `POST /api/v2/hubs`.
+- Live OpenAPI has no `POST /api/v2/widgets`; widgets are read-only in v2.
+- There are no v2 endpoints for manifest draft creation, manifest activation,
+  API consumer creation, or API key creation.
+- There is no `/api/v2/policy-scopes` endpoint.
+- Live type registries currently return empty arrays, so the ops vocabulary
+  still needs manifest activation before events can be accepted.
+- Event metadata is exposed in the response schema, but the v2 interaction
+  event create controller currently does not persist submitted metadata.
+- Webhook dispatch still uses the hard-coded `"clicked"` event name.
+
 ## Architectural Decision
 
 Start with **Pattern A: API Consumer Hub** for `ops-hub`, plus a manual or
@@ -291,7 +333,7 @@ The first explicit service-catalog gap:
 
 ```task
 id: HF-WP-0001-T01
-status: todo
+status: done
 priority: high
 state_hub_task_id: "2587a3b8-3b9b-4948-acaf-1547644e4563"
 ```
@@ -315,6 +357,8 @@ Done when: there is a concrete, repeatable path to create the `ops-hub` row,
 manifest, API consumer, and API key, with enough metadata to classify it as the
 Operations / System 1 hub.
 
+Output: `Confirmed Bootstrap Path` section in this workplan.
+
 ---
 
 ### T02 — Register ops-hub in Inter-Hub as the Operations hub
@@ -440,7 +484,7 @@ and annotations.
 
 ```task
 id: HF-WP-0001-T06
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "2a0b2f69-5a3d-433c-9cbd-85fd868b63d8"
 ```
@@ -465,6 +509,8 @@ Done when: a human can see the CoulombCore, local, railiance01, and ThreePhoenix
 relationship, including the current Gitea registry state, without reading
 multiple repo workplans or relying on shell history.
 
+Output: `wiki/OpsHubInventory.md`.
+
 ---
 
 ### T07 — Instrument the current Gitea registry work as the first ops-hub signal
@@ -504,7 +550,7 @@ traceable back to the Railiance workplan.
 
 ```task
 id: HF-WP-0001-T08
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "72a58622-c3ac-4765-8026-5c2489af2058"
 ```
@@ -523,6 +569,8 @@ responsibility from CoulombCore to ThreePhoenix:
 
 Done when: each gate has an owner repo, evidence requirement, and status.
 
+Output: `wiki/OpsHubReadinessGates.md`.
+
 ---
 
 ### T09 — Decide whether to create a separate ops-hub repository
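
Reviewer sketch (not part of the patch): the gate and wave tables added in `wiki/OpsHubReadinessGates.md` could be evaluated mechanically once they are machine-addressable. The following is a minimal illustration only. The gate IDs, statuses, and wave-to-gate mapping are copied from the tables in this diff; the names `GateStatus`, `GATE_STATUS`, `WAVE_GATES`, `open_gates`, and `wave_is_ready` are hypothetical, and the rule that a wave is ready only when every required gate is `ready` is an assumption, since the patch lists required gates without stating a pass rule.

```python
from enum import Enum


class GateStatus(Enum):
    """Status vocabulary from the readiness-gates page."""
    UNKNOWN = "unknown"
    PARTIAL = "partial"
    BLOCKED = "blocked"
    READY = "ready"


# Gate statuses as recorded in the diff (dated 2026-05-16).
GATE_STATUS = {
    "OPS-G01": GateStatus.PARTIAL,
    "OPS-G02": GateStatus.PARTIAL,
    "OPS-G03": GateStatus.UNKNOWN,
    "OPS-G04": GateStatus.PARTIAL,
    "OPS-G05": GateStatus.PARTIAL,
    "OPS-G06": GateStatus.UNKNOWN,
    "OPS-G07": GateStatus.UNKNOWN,
    "OPS-G08": GateStatus.PARTIAL,
    "OPS-G09": GateStatus.UNKNOWN,
    "OPS-G10": GateStatus.UNKNOWN,
    "OPS-G11": GateStatus.PARTIAL,
    "OPS-G12": GateStatus.UNKNOWN,
    "OPS-G13": GateStatus.UNKNOWN,
    "OPS-G14": GateStatus.UNKNOWN,
    "OPS-G15": GateStatus.PARTIAL,
}

# Required gates per wave, from the "Initial Migration Waves" table.
# wave-4 also needs service-specific gates, which are not modelled here.
WAVE_GATES = {
    "wave-0-catalog": ["OPS-G01", "OPS-G02", "OPS-G15"],
    "wave-1-registry-proof": ["OPS-G03", "OPS-G05", "OPS-G08", "OPS-G14"],
    "wave-2-backup-restore": ["OPS-G06", "OPS-G07", "OPS-G13"],
    "wave-3-threephoenix-foundation": ["OPS-G09", "OPS-G10"],
    "wave-4-service-migration": ["OPS-G04", "OPS-G11", "OPS-G12"],
}


def open_gates(wave, statuses):
    """Return the required gates of `wave` that are not yet `ready`."""
    return [g for g in WAVE_GATES[wave] if statuses[g] is not GateStatus.READY]


def wave_is_ready(wave, statuses):
    """Assumed rule: a wave is ready only when all required gates are `ready`."""
    return not open_gates(wave, statuses)


if __name__ == "__main__":
    for wave in WAVE_GATES:
        blockers = open_gates(wave, GATE_STATUS)
        state = "ready" if not blockers else "open: " + ", ".join(blockers)
        print(f"{wave}: {state}")
```

Passing the status map as an argument keeps the check usable against a later snapshot of the gate table, for example one rebuilt from `ops-readiness-gate-updated` events once Inter-Hub accepts them.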