docs: establish ops-hub bootstrap artifacts

This commit is contained in:
2026-05-16 02:47:06 +02:00
parent 479abeb781
commit dfb6c1fc47
3 changed files with 246 additions and 3 deletions

132
wiki/OpsHubInventory.md Normal file
View File

@@ -0,0 +1,132 @@
# Ops Hub Initial Inventory
Date: 2026-05-16
## Purpose
This document is the first structured inventory for `ops-hub`, the VSM
Operations / System 1 hub. It turns the current operations situation into a
catalogable model before `ops-hub` has its own repository, collectors, or UI.
Source background:
- `wiki/CurrentOperationsSituation.md`
- `workplans/HF-WP-0001-establish-ops-hub-first-extension.md`
## VSM Placement
| Field | Value |
|---|---|
| Hub | `ops-hub` |
| Hub family | `vsm` |
| VSM function | `OPS` |
| VSM system | `S1` |
| Primary concern | Operational truth and evidence |
`ops-hub` owns the description of what is currently running, where it runs, how
it is reached, what state it is in, and what operational evidence exists. It
does not replace State Hub workstreams or Inter-Hub governance.
## Environments
| Environment | Role | Current state | Notes |
|---|---|---|---|
| `local` | Workstation development and local services | Active, important, not production | Hosts State Hub and local build/runtime pieces. |
| `coulombcore` | Live transitional production | Active, production-like, historically hand-built | Public IP `92.205.130.254`; runs current Gitea and experimental operational services. |
| `railiance01` | Future production foundation | Provisioning target | Public IP `92.205.62.239`; first server of intended ThreePhoenix shape. |
| `threephoenix-prod` | Target production topology | Planned | Future governed multi-node production environment. |
## Hosts
| Host | Environment | Address | Role | Known gaps |
|---|---|---|---|---|
| `coulombcore` | `coulombcore` | `92.205.130.254` | Current live production-like server | Needs service catalog, drift tracking, backup/restore evidence, and migration disposition. |
| `railiance01` | `railiance01` | `92.205.62.239` | First ThreePhoenix production foundation node | Needs full inventory, readiness gates, and cluster/platform bootstrap evidence. |
| local workstation | `local` | local/private | State Hub and development runtime host | Needs explicit service ownership and backup expectations. |
Ops Bridge may provide reachability evidence for connected servers, but it is
not the service catalog. `ops-hub` should turn bridge reachability into
inventory signals rather than treating the bridge itself as the inventory.
## Clusters
| Cluster | Environment | Role | Current notes |
|---|---|---|---|
| CoulombCore Kubernetes | `coulombcore` | Current operational Kubernetes runtime | Hosts current Gitea deployment and related services. |
| ThreePhoenix Kubernetes | `threephoenix-prod` | Target production runtime | Future governed production cluster assembled through Railiance repos. |
## Services
| Service | Current environment | Owner repo | Current evidence | Gaps |
|---|---|---|---|---|
| Gitea | `coulombcore` | `railiance-apps` | Helm release `gitea`, namespace `default`, app version `1.25.4`, NodePort `32166`, public registry path returns auth challenge. | SOPS Helm values update, package token, `docker login`, push, pull, backup coverage, restore evidence. |
| Gitea database | `coulombcore` | `railiance-platform` | Database `gitea-db` in namespace `databases`. | Backup and restore evidence not recorded here yet. |
| Gitea shared storage | `coulombcore` | `railiance-platform` / `railiance-apps` | PVC `default/gitea-shared-storage`. | Package blob backup and restore evidence not confirmed. |
| State Hub | `local` | `the-custodian/state-hub` | Local API and dashboard are operational enough for repo registration and workplan sync. | Future cluster deployment/readiness still needs gates and evidence. |
| Inter-Hub | live public endpoint | `inter-hub` | `https://hub.coulomb.social/api/v2/openapi.json` and docs are reachable. | Hub bootstrap still depends on authenticated UI or migration. |
| Ops Bridge | local/remote bridge | `ops-bridge` | Useful for connected-server visibility. | Not a service catalog; should emit reachability evidence into `ops-hub`. |
## Endpoints
| Endpoint | Service | Environment | Current status | Evidence |
|---|---|---|---|---|
| `https://gitea.coulomb.social/v2/` | Gitea OCI registry | `coulombcore` | Route fixed; returns registry auth challenge | Expected `401` with OCI registry challenge. |
| `https://hub.coulomb.social/api/v2/openapi.json` | Inter-Hub API | live Inter-Hub | Reachable | OpenAPI document fetched on 2026-05-16. |
| `https://hub.coulomb.social/Hubs` | Inter-Hub UI | live Inter-Hub | Requires login | Redirects to `/NewSession`. |
| `http://127.0.0.1:8000/state/health` | State Hub API | `local` | Reachable locally | Used for StateHub registration/sync. |
## Service Catalog Gap
There is no central place that answers these questions:
- What runs where?
- Which repo owns its desired state?
- Which endpoint exposes it?
- Which data stores back it?
- Which backups and restore tests cover it?
- Which migration wave will replace or move it?
- Which current evidence proves it is healthy?
`ops-hub` should be the first place where these answers are explicit and
machine-addressable.
## First Ops Widgets
Seed these in Inter-Hub once `ops-hub` exists:
- `ops-env-local`
- `ops-env-coulombcore`
- `ops-env-railiance01`
- `ops-env-threephoenix-prod`
- `ops-host-coulombcore`
- `ops-host-railiance01`
- `ops-service-catalog`
- `ops-service-gitea`
- `ops-service-state-hub`
- `ops-service-inter-hub`
- `ops-endpoint-gitea-registry`
- `ops-readiness-gitea-registry`
- `ops-readiness-state-hub-cluster-deploy`
- `ops-migration-coulombcore-to-threephoenix`
## First Evidence Events
The first event should be the Gitea registry endpoint verification:
```json
{
"widgetId": "<ops-endpoint-gitea-registry-widget-id>",
"eventType": "ops-endpoint-verified",
"viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001",
"metadata": {
"vsmFunction": "OPS",
"vsmSystem": "S1",
"endpoint": "https://gitea.coulomb.social/v2/",
"expectedStatus": 401,
"observedHeader": "Docker-Distribution-Api-Version: registry/2.0"
}
}
```
This event is blocked until the ops event type is registered by an active
manifest and the target widget exists.

View File

@@ -0,0 +1,63 @@
# Ops Hub Readiness Gates
Date: 2026-05-16
## Purpose
These gates define what must be true before operational responsibility can move
from the current CoulombCore setup to the future ThreePhoenix production setup.
They are intended as the first `ops-hub` readiness model.
Statuses:
- `unknown` means no reliable evidence has been cataloged yet.
- `partial` means some evidence exists, but the gate is not complete.
- `blocked` means a required precondition is missing.
- `ready` means the evidence requirement is satisfied.
## Gates
| ID | Gate | Owner repo | Evidence requirement | Current status |
|---|---|---|---|---|
| OPS-G01 | Environment inventory exists | `helix-forge` | `local`, `coulombcore`, `railiance01`, and `threephoenix-prod` are represented with role, lifecycle state, and owner notes. | `partial` |
| OPS-G02 | Service catalog exists | `helix-forge` then future `ops-hub` | Each live and target service has environment, owner repo, endpoint, backing stores, lifecycle state, and evidence links. | `partial` |
| OPS-G03 | DNS and TLS are codified | `railiance-cluster` / `railiance-apps` | Public hostnames, ingress routes, certificate sources, and renewal paths are declared in repo files. | `unknown` |
| OPS-G04 | Git hosting is reproducible | `railiance-apps` / `railiance-platform` | Gitea or successor deployment can be recreated from repo state, including database and storage dependencies. | `partial` |
| OPS-G05 | Container registry publishing is proven | `railiance-apps` | `docker login`, push, and pull succeed against `https://gitea.coulomb.social/v2/` using governed secrets. | `partial` |
| OPS-G06 | Persistent data is backed up | `railiance-platform` | Each persistent data store has backup location, schedule, retention, ownership, and latest successful backup evidence. | `unknown` |
| OPS-G07 | Restore path is proven | `railiance-platform` / `railiance-apps` | Restore test evidence exists for Gitea database, package blobs, and State Hub data. | `unknown` |
| OPS-G08 | Secrets path is governed | `railiance-infra` / `railiance-apps` | SOPS/age keys and operator secret paths are documented; no required secret depends on shell memory. | `partial` |
| OPS-G09 | Cluster runtime is reproducible | `railiance-cluster` | Kubernetes runtime, ingress, CNI, operators, and routing primitives are recreated through repo-owned automation. | `unknown` |
| OPS-G10 | Platform services are reproducible | `railiance-platform` | PostgreSQL/CNPG, object storage, secret management, and identity dependencies have repo-owned deployment evidence. | `unknown` |
| OPS-G11 | Application deployment is reproducible | `railiance-apps` | Gitea, Inter-Hub, State Hub, and other application releases are declared with Helm values and deployment runbooks. | `partial` |
| OPS-G12 | Rollback path is documented | owning service repos | Each migration wave has rollback conditions, steps, and data safety notes. | `unknown` |
| OPS-G13 | Operator runbooks exist | owning service repos | Deploy, restore, rotate, incident response, and migration runbooks exist for each critical service. | `unknown` |
| OPS-G14 | Observability and health checks are explicit | `railiance-cluster` / `railiance-platform` / service repos | Health checks, logs, metrics, and endpoint probes are documented and tied to service catalog entries. | `unknown` |
| OPS-G15 | Inter-Hub ops bootstrap is available | `inter-hub` / `helix-forge` | `ops-hub` can be created through UI or migration, manifest activated, API consumer/key created, widgets seeded, and events accepted. | `partial` |
## Initial Migration Waves
| Wave | Goal | Required gates |
|---|---|---|
| `wave-0-catalog` | Establish the operational truth surface without moving services. | OPS-G01, OPS-G02, OPS-G15 |
| `wave-1-registry-proof` | Prove current Gitea registry publishing and evidence capture. | OPS-G03, OPS-G05, OPS-G08, OPS-G14 |
| `wave-2-backup-restore` | Confirm backups and restore paths for critical persistent state. | OPS-G06, OPS-G07, OPS-G13 |
| `wave-3-threephoenix-foundation` | Recreate cluster and platform foundations on railiance01/ThreePhoenix. | OPS-G09, OPS-G10 |
| `wave-4-service-migration` | Move or replace production responsibilities from CoulombCore to ThreePhoenix. | OPS-G04, OPS-G11, OPS-G12 plus service-specific gates |
## Evidence Shape
Each readiness gate should eventually be represented in `ops-hub` as a widget
or widget family with events like:
- `ops-readiness-gate-updated`
- `ops-endpoint-verified`
- `ops-backup-verified`
- `ops-restore-tested`
- `ops-risk-raised`
- `ops-migration-gate-passed`
- `ops-migration-gate-failed`
Until Inter-Hub can create all required records through API calls, the evidence
can be maintained in this repository and mirrored into Inter-Hub through the UI
or migrations.