# Ops Hub Readiness Gates
Date: 2026-06-14
## Purpose
These gates define what must be true before operational responsibility can move
from the current CoulombCore setup to the future ThreePhoenix production setup.
They are intended as the first `ops-hub` readiness model and should be ported
into the dedicated `ops-hub` implementation repo as that repo grows.

Statuses:
- `unknown` means no reliable evidence has been cataloged yet.
- `partial` means some evidence exists, but the gate is not complete.
- `blocked` means a required precondition is missing.
- `ready` means the evidence requirement is satisfied.
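
As a minimal sketch, the four statuses above could be modeled as an enumeration so that a typo in a status value fails loudly instead of being stored. The names `GateStatus` and `parse_status` are illustrative assumptions, not part of any existing `ops-hub` code.

```python
from enum import Enum


class GateStatus(Enum):
    """The four gate statuses defined in this document."""

    UNKNOWN = "unknown"   # no reliable evidence cataloged yet
    PARTIAL = "partial"   # some evidence exists, gate not complete
    BLOCKED = "blocked"   # a required precondition is missing
    READY = "ready"       # evidence requirement is satisfied


def parse_status(text: str) -> GateStatus:
    """Parse a status cell such as '`partial`' into a GateStatus.

    Hypothetical helper: strips whitespace and Markdown backticks,
    then raises ValueError for anything outside the four known values.
    """
    return GateStatus(text.strip().strip("`").lower())


if __name__ == "__main__":
    print(parse_status("`partial`"))
```

Rejecting unrecognized values keeps the status vocabulary closed, which matters if the values are later mirrored into another system.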
## Gates
| ID | Gate | Owner repo | Evidence requirement | Current status |
|---|---|---|---|---|
| OPS-G01 | Environment inventory exists | `helix-forge` handoff to `ops-hub` | `local`, `coulombcore`, `railiance01`, and `threephoenix-prod` are represented with role, lifecycle state, and owner notes. | `partial` |
| OPS-G02 | Service catalog exists | `ops-hub` | Each live and target service has environment, owner repo, endpoint, backing stores, lifecycle state, and evidence links. | `partial` |
| OPS-G03 | DNS and TLS are codified | `railiance-cluster` / `railiance-apps` | Public hostnames, ingress routes, certificate sources, and renewal paths are declared in repo files. | `unknown` |
| OPS-G04 | Git hosting is reproducible | `railiance-apps` / `railiance-platform` | Gitea or successor deployment can be recreated from repo state, including database and storage dependencies. | `partial` |
| OPS-G05 | Container registry publishing is proven | `railiance-apps` | `docker login`, push, and pull succeed against `https://gitea.coulomb.social/v2/` using governed secrets. | `partial` |
| OPS-G06 | Persistent data is backed up | `railiance-platform` | Each persistent data store has backup location, schedule, retention, ownership, and latest successful backup evidence. | `unknown` |
| OPS-G07 | Restore path is proven | `railiance-platform` / `railiance-apps` | Restore test evidence exists for Gitea database, package blobs, and State Hub data. | `unknown` |
| OPS-G08 | Secrets path is governed | `railiance-infra` / `railiance-apps` | SOPS/age keys and operator secret paths are documented; no required secret depends on shell memory. | `partial` |
| OPS-G09 | Cluster runtime is reproducible | `railiance-cluster` | Kubernetes runtime, ingress, CNI, operators, and routing primitives are recreated through repo-owned automation. | `unknown` |
| OPS-G10 | Platform services are reproducible | `railiance-platform` | PostgreSQL/CNPG, object storage, secret management, and identity dependencies have repo-owned deployment evidence. | `unknown` |
| OPS-G11 | Application deployment is reproducible | `railiance-apps` | Gitea, Inter-Hub, State Hub, and other application releases are declared with Helm values and deployment runbooks. | `partial` |
| OPS-G12 | Rollback path is documented | owning service repos | Each migration wave has rollback conditions, steps, and data safety notes. | `unknown` |
| OPS-G13 | Operator runbooks exist | owning service repos | Deploy, restore, rotate, incident response, and migration runbooks exist for each critical service. | `unknown` |
| OPS-G14 | Observability and health checks are explicit | `railiance-cluster` / `railiance-platform` / service repos | Health checks, logs, metrics, and endpoint probes are documented and tied to service catalog entries. | `unknown` |
| OPS-G15 | Inter-Hub ops bootstrap is available | `inter-hub` / `ops-hub` / `helix-forge` | `ops-hub` can be created through the UI, a supported API, or an explicit migration fallback; the manifest is activated, an API consumer and key are created, widgets are seeded, and events are accepted. | `partial` |
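
Until the gates live in `ops-hub`, the table itself is the source of truth. The following is a rough sketch of reading one table row into a record. `GateRow` and `parse_gate_row` are hypothetical names, and the parser assumes that no cell contains an escaped `|`, which holds for the rows above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateRow:
    """One row of the gates table; field names mirror the column headers."""

    gate_id: str
    gate: str
    owner_repo: str
    evidence_requirement: str
    status: str


def parse_gate_row(line: str) -> GateRow:
    """Split a Markdown table row into its five cells.

    Assumption: cells never contain a literal '|' character.
    """
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    if len(cells) != 5:
        raise ValueError(f"expected 5 cells, got {len(cells)}")
    gate_id, gate, owner_repo, evidence, status = cells
    # Only the status cell is unwrapped; owner cells can mix code and prose.
    return GateRow(gate_id, gate, owner_repo, evidence, status.strip("`"))


if __name__ == "__main__":
    row = parse_gate_row(
        "| OPS-G06 | Persistent data is backed up | `railiance-platform` | "
        "Each persistent data store has backup location, schedule, retention, "
        "ownership, and latest successful backup evidence. | `unknown` |"
    )
    print(row.gate_id, row.status)
```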
## Current Bootstrap Gate Evidence
2026-06-14: `ops-hub/scripts/interhub-gate-probe.py` reports that the preferred
production API bootstrap gate is still closed. Live `/api/v2/hubs` returns `404`,
and the OpenAPI document does not yet list `/hubs`, `/hub-capability-manifests`,
`/api-consumers`, or `/policy-scopes`.
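
The decision behind this evidence can be sketched as a pure function over the two observations. This is not the code of `interhub-gate-probe.py`. The function name, the rule that the gate opens only on a `200` response with all four paths listed, and the assumption that the OpenAPI `paths` object uses these exact keys are all illustrative.

```python
# Paths named in the evidence note above.
REQUIRED_PATHS = (
    "/hubs",
    "/hub-capability-manifests",
    "/api-consumers",
    "/policy-scopes",
)


def bootstrap_gate_open(hubs_status: int, openapi_doc: dict) -> tuple[bool, list[str]]:
    """Return (is_open, missing_paths) for the API bootstrap gate.

    Assumed rule: the gate is open only when the live hubs endpoint
    answers 200 and every required path appears under the OpenAPI
    document's top-level "paths" object.
    """
    listed = openapi_doc.get("paths", {})
    missing = [path for path in REQUIRED_PATHS if path not in listed]
    return hubs_status == 200 and not missing, missing


if __name__ == "__main__":
    # Mirrors the 2026-06-14 observation: 404 and none of the paths listed.
    print(bootstrap_gate_open(404, {"paths": {}}))
```

Keeping the HTTP calls outside this function makes the rule testable without network access.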
## Initial Migration Waves
| Wave | Goal | Required gates |
|---|---|---|
| `wave-0-catalog` | Establish the operational truth surface without moving services. | OPS-G01, OPS-G02, OPS-G15 |
| `wave-1-registry-proof` | Prove current Gitea registry publishing and evidence capture. | OPS-G03, OPS-G05, OPS-G08, OPS-G14 |
| `wave-2-backup-restore` | Confirm backups and restore paths for critical persistent state. | OPS-G06, OPS-G07, OPS-G13 |
| `wave-3-threephoenix-foundation` | Recreate cluster and platform foundations on railiance01/ThreePhoenix. | OPS-G09, OPS-G10 |
| `wave-4-service-migration` | Move or replace production responsibilities from CoulombCore to ThreePhoenix. | OPS-G04, OPS-G11, OPS-G12 plus service-specific gates |
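
A short sketch of how the wave table could be evaluated against the gate statuses recorded above. It assumes that a wave may start only when every required gate is `ready`, which this document implies but does not state. The name `blocking_gates` is hypothetical, and the service-specific gates of `wave-4-service-migration` are not modeled.

```python
# Required gates per wave, copied from the table above.
WAVE_GATES = {
    "wave-0-catalog": ["OPS-G01", "OPS-G02", "OPS-G15"],
    "wave-1-registry-proof": ["OPS-G03", "OPS-G05", "OPS-G08", "OPS-G14"],
    "wave-2-backup-restore": ["OPS-G06", "OPS-G07", "OPS-G13"],
    "wave-3-threephoenix-foundation": ["OPS-G09", "OPS-G10"],
    # Service-specific gates for wave 4 are not listed in this document.
    "wave-4-service-migration": ["OPS-G04", "OPS-G11", "OPS-G12"],
}

# Gate statuses as recorded in the gates table on 2026-06-14.
CURRENT_STATUS = {
    "OPS-G01": "partial", "OPS-G02": "partial", "OPS-G03": "unknown",
    "OPS-G04": "partial", "OPS-G05": "partial", "OPS-G06": "unknown",
    "OPS-G07": "unknown", "OPS-G08": "partial", "OPS-G09": "unknown",
    "OPS-G10": "unknown", "OPS-G11": "partial", "OPS-G12": "unknown",
    "OPS-G13": "unknown", "OPS-G14": "unknown", "OPS-G15": "partial",
}


def blocking_gates(wave: str, statuses: dict[str, str]) -> list[str]:
    """List the required gates of a wave that are not yet 'ready'.

    Gates absent from `statuses` are treated as 'unknown'.
    """
    return [
        gate_id
        for gate_id in WAVE_GATES[wave]
        if statuses.get(gate_id, "unknown") != "ready"
    ]


if __name__ == "__main__":
    for wave in WAVE_GATES:
        print(wave, blocking_gates(wave, CURRENT_STATUS))
```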
## Evidence Shape
Each readiness gate should eventually be represented in `ops-hub` as a widget
or widget family with events like:
- `ops-readiness-gate-updated`
- `ops-endpoint-verified`
- `ops-backup-verified`
- `ops-restore-tested`
- `ops-risk-raised`
- `ops-migration-gate-passed`
- `ops-migration-gate-failed`

Until Inter-Hub can create all required records through API calls, the evidence
can be maintained as HelixForge handoff material or in the `ops-hub`
implementation repo and mirrored into Inter-Hub through the UI or explicit
migrations.
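
To make the evidence shape more concrete, here is one possible event record. This document names only the event types, so every field in the payload (`type`, `gate_id`, `status`, `observed_at`, `evidence`) and the function `build_gate_event` are assumptions, not an Inter-Hub schema.

```python
import json
from datetime import datetime, timezone

# Event names listed in this section.
EVENT_TYPES = frozenset({
    "ops-readiness-gate-updated",
    "ops-endpoint-verified",
    "ops-backup-verified",
    "ops-restore-tested",
    "ops-risk-raised",
    "ops-migration-gate-passed",
    "ops-migration-gate-failed",
})


def build_gate_event(event_type, gate_id, status, evidence=None, observed_at=None):
    """Build a JSON-serializable event dict (hypothetical payload shape)."""
    if event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    # Default to the current UTC time when no observation time is given.
    observed_at = observed_at or datetime.now(timezone.utc)
    return {
        "type": event_type,
        "gate_id": gate_id,
        "status": status,
        "observed_at": observed_at.isoformat(),
        "evidence": evidence,  # e.g. a repo path or link; None if not captured
    }


if __name__ == "__main__":
    event = build_gate_event("ops-readiness-gate-updated", "OPS-G05", "partial")
    print(json.dumps(event, indent=2))
```

A flat, serializable record like this can be kept in a repo file while API creation is unavailable and replayed later.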