Normalize agent instructions and workplan frontmatter (STATE-WP-0067)

- Align agent files with on-disk workplan prefixes (infer from workplan ids)
- Set workplan domain to registered domain_slug; add topic_slug where applicable
- Repair frontmatter delimiter formatting; migrate legacy task status literals
- Regenerate AGENTS.md, CLAUDE.md, and .claude/rules from State Hub templates
This commit is contained in:
2026-06-22 23:16:27 +02:00
parent cec94ac00f
commit 21bfd5fa49
19 changed files with 819 additions and 25 deletions

15
docs/README.md Normal file
View File

@@ -0,0 +1,15 @@
# ops-hub Docs
This directory contains the first repo-local version of the HelixForge
`HF-WP-0001` handoff.
- `initial-inventory.md` defines the first environment, host, cluster, service,
and endpoint catalog.
- `readiness-gates.md` defines the CoulombCore-to-ThreePhoenix readiness model.
- `bootstrap-runbook.md` defines the operator-ready Inter-Hub bootstrap path.
- `../seeds/ops-hub-manifest.draft.json` contains the initial capability
manifest draft.
- `../seeds/ops-hub-widgets.seed.json` contains the initial widget seed.
- `../seeds/ops-hub-bootstrap.sql` is an operator-approved fallback only; do
not use direct DB seeding while the supported Inter-Hub API path is viable or
pending.

146
docs/initial-inventory.md Normal file
View File

@@ -0,0 +1,146 @@
# Ops Hub Initial Inventory
Date: 2026-06-06
## Purpose
This document is the first structured inventory for `ops-hub`, the VSM
Operations / System 1 hub. It turns the current operations situation into a
catalogable model for this implementation repo.
Source background:
- `/home/worsch/helix-forge/wiki/CurrentOperationsSituation.md`
- `/home/worsch/helix-forge/workplans/HF-WP-0001-establish-ops-hub-first-extension.md`
## Repository Boundary
As of 2026-06-06, `ops-hub` implementation belongs in `/home/worsch/ops-hub`
with remote `gitea-remote:coulomb/ops-hub.git`.
- `ops-hub` owns collectors, adapters, scheduled probes, runtime
packaging, UI/extensions, tests, and Inter-Hub bootstrap/smoke clients.
- `inter-hub` remains the generic hub framework, manifest/registry substrate,
authentication surface, widget/event API, and bootstrap API owner.
- `helix-forge` keeps architecture context and the original coordinating
workplan.
- Railiance repos own deployable infrastructure/service state and the
operational evidence that `ops-hub` should surface.
## VSM Placement
| Field | Value |
|---|---|
| Hub | `ops-hub` |
| Hub family | `vsm` |
| VSM function | `OPS` |
| VSM system | `S1` |
| Primary concern | Operational truth and evidence |
`ops-hub` owns the description of what is currently running, where it runs, how
it is reached, what state it is in, and what operational evidence exists. It
does not replace State Hub workstreams or Inter-Hub governance.
## Environments
| Environment | Role | Current state | Notes |
|---|---|---|---|
| `local` | Workstation development and local services | Active, important, not production | Hosts State Hub and local build/runtime pieces. |
| `coulombcore` | Live transitional production | Active, production-like, historically hand-built | Public IP `92.205.130.254`; runs current Gitea and experimental operational services. |
| `railiance01` | Future production foundation | Provisioning target | Public IP `92.205.62.239`; first server of intended ThreePhoenix shape. |
| `threephoenix-prod` | Target production topology | Planned | Future governed multi-node production environment. |
## Hosts
| Host | Environment | Address | Role | Known gaps |
|---|---|---|---|---|
| `coulombcore` | `coulombcore` | `92.205.130.254` | Current live production-like server | Needs service catalog, drift tracking, backup/restore evidence, and migration disposition. |
| `railiance01` | `railiance01` | `92.205.62.239` | First ThreePhoenix production foundation node | Needs full inventory, readiness gates, and cluster/platform bootstrap evidence. |
| local workstation | `local` | local/private | State Hub and development runtime host | Needs explicit service ownership and backup expectations. |
Ops Bridge may provide reachability evidence for connected servers, but it is
not the service catalog. `ops-hub` should turn bridge reachability into
inventory signals rather than treating the bridge itself as the inventory.
## Clusters
| Cluster | Environment | Role | Current notes |
|---|---|---|---|
| CoulombCore Kubernetes | `coulombcore` | Current operational Kubernetes runtime | Hosts current Gitea deployment and related services. |
| ThreePhoenix Kubernetes | `threephoenix-prod` | Target production runtime | Future governed production cluster assembled through Railiance repos. |
## Services
| Service | Current environment | Owner repo | Current evidence | Gaps |
|---|---|---|---|---|
| Gitea | `coulombcore` | `railiance-apps` | Helm release `gitea`, namespace `default`, app version `1.25.4`, NodePort `32166`, public registry path returns auth challenge. | SOPS Helm values update, package token, `docker login`, push, pull, backup coverage, restore evidence. |
| Gitea database | `coulombcore` | `railiance-platform` | Database `gitea-db` in namespace `databases`. | Backup and restore evidence not recorded here yet. |
| Gitea shared storage | `coulombcore` | `railiance-platform` / `railiance-apps` | PVC `default/gitea-shared-storage`. | Package blob backup and restore evidence not confirmed. |
| State Hub | `local` | `the-custodian/state-hub` | Local API and dashboard are operational enough for repo registration and workplan sync. | Future cluster deployment/readiness still needs gates and evidence. |
| Inter-Hub | live public endpoint | `inter-hub` | `https://hub.coulomb.social/api/v2/openapi.json` and docs are reachable. | Hub bootstrap still depends on authenticated UI or migration. |
| Ops Bridge | local/remote bridge | `ops-bridge` | Useful for connected-server visibility. | Not a service catalog; should emit reachability evidence into `ops-hub`. |
## Endpoints
| Endpoint | Service | Environment | Current status | Evidence |
|---|---|---|---|---|
| `https://gitea.coulomb.social/v2/` | Gitea OCI registry | `coulombcore` | Route fixed; returns registry auth challenge | Expected `401` with OCI registry challenge. |
| `https://hub.coulomb.social/api/v2/openapi.json` | Inter-Hub API | live Inter-Hub | Reachable | OpenAPI document fetched on 2026-05-16. |
| `https://hub.coulomb.social/Hubs` | Inter-Hub UI | live Inter-Hub | Requires login | Redirects to `/NewSession`. |
| `http://127.0.0.1:8000/state/health` | State Hub API | `local` | Reachable locally | Used for StateHub registration/sync. |
## Service Catalog Gap
There is no central place that answers these questions:
- What runs where?
- Which repo owns its desired state?
- Which endpoint exposes it?
- Which data stores back it?
- Which backups and restore tests cover it?
- Which migration wave will replace or move it?
- Which current evidence proves it is healthy?
`ops-hub` should be the first place where these answers are explicit and
machine-addressable.
## First Ops Widgets
Seed these in Inter-Hub once `ops-hub` exists:
- `ops-env-local`
- `ops-env-coulombcore`
- `ops-env-railiance01`
- `ops-env-threephoenix-prod`
- `ops-host-coulombcore`
- `ops-host-railiance01`
- `ops-service-catalog`
- `ops-service-gitea`
- `ops-service-state-hub`
- `ops-service-inter-hub`
- `ops-endpoint-gitea-registry`
- `ops-readiness-gitea-registry`
- `ops-readiness-state-hub-cluster-deploy`
- `ops-migration-coulombcore-to-threephoenix`
## First Evidence Events
The first event should be the Gitea registry endpoint verification:
```json
{
"widgetId": "<ops-endpoint-gitea-registry-widget-id>",
"eventType": "ops-endpoint-verified",
"viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001",
"metadata": {
"vsmFunction": "OPS",
"vsmSystem": "S1",
"endpoint": "https://gitea.coulomb.social/v2/",
"expectedStatus": 401,
"observedHeader": "Docker-Distribution-Api-Version: registry/2.0"
}
}
```
This event is blocked until the ops event type is registered by an active
manifest and the target widget exists.

63
docs/readiness-gates.md Normal file
View File

@@ -0,0 +1,63 @@
# Ops Hub Readiness Gates
Date: 2026-06-06
## Purpose
These gates define what must be true before operational responsibility can move
from the current CoulombCore setup to the future ThreePhoenix production setup.
They are the first repo-local `ops-hub` readiness model.
Statuses:
- `unknown` means no reliable evidence has been cataloged yet.
- `partial` means some evidence exists, but the gate is not complete.
- `blocked` means a required precondition is missing.
- `ready` means the evidence requirement is satisfied.
## Gates
| ID | Gate | Owner repo | Evidence requirement | Current status |
|---|---|---|---|---|
| OPS-G01 | Environment inventory exists | `ops-hub` | `local`, `coulombcore`, `railiance01`, and `threephoenix-prod` are represented with role, lifecycle state, and owner notes. | `partial` |
| OPS-G02 | Service catalog exists | `ops-hub` | Each live and target service has environment, owner repo, endpoint, backing stores, lifecycle state, and evidence links. | `partial` |
| OPS-G03 | DNS and TLS are codified | `railiance-cluster` / `railiance-apps` | Public hostnames, ingress routes, certificate sources, and renewal paths are declared in repo files. | `unknown` |
| OPS-G04 | Git hosting is reproducible | `railiance-apps` / `railiance-platform` | Gitea or successor deployment can be recreated from repo state, including database and storage dependencies. | `partial` |
| OPS-G05 | Container registry publishing is proven | `railiance-apps` | `docker login`, push, and pull succeed against `https://gitea.coulomb.social/v2/` using governed secrets. | `partial` |
| OPS-G06 | Persistent data is backed up | `railiance-platform` | Each persistent data store has backup location, schedule, retention, ownership, and latest successful backup evidence. | `unknown` |
| OPS-G07 | Restore path is proven | `railiance-platform` / `railiance-apps` | Restore test evidence exists for Gitea database, package blobs, and State Hub data. | `unknown` |
| OPS-G08 | Secrets path is governed | `railiance-infra` / `railiance-apps` | SOPS/age keys and operator secret paths are documented; no required secret depends on shell memory. | `partial` |
| OPS-G09 | Cluster runtime is reproducible | `railiance-cluster` | Kubernetes runtime, ingress, CNI, operators, and routing primitives are recreated through repo-owned automation. | `unknown` |
| OPS-G10 | Platform services are reproducible | `railiance-platform` | PostgreSQL/CNPG, object storage, secret management, and identity dependencies have repo-owned deployment evidence. | `unknown` |
| OPS-G11 | Application deployment is reproducible | `railiance-apps` | Gitea, Inter-Hub, State Hub, and other application releases are declared with Helm values and deployment runbooks. | `partial` |
| OPS-G12 | Rollback path is documented | owning service repos | Each migration wave has rollback conditions, steps, and data safety notes. | `unknown` |
| OPS-G13 | Operator runbooks exist | owning service repos | Deploy, restore, rotate, incident response, and migration runbooks exist for each critical service. | `unknown` |
| OPS-G14 | Observability and health checks are explicit | `railiance-cluster` / `railiance-platform` / service repos | Health checks, logs, metrics, and endpoint probes are documented and tied to service catalog entries. | `unknown` |
| OPS-G15 | Inter-Hub ops bootstrap is available | `inter-hub` / `ops-hub` / `helix-forge` | `ops-hub` can be created through UI, supported API, or explicit migration fallback, manifest activated, API consumer/key created, widgets seeded, and events accepted. | `partial` |
## Initial Migration Waves
| Wave | Goal | Required gates |
|---|---|---|
| `wave-0-catalog` | Establish the operational truth surface without moving services. | OPS-G01, OPS-G02, OPS-G15 |
| `wave-1-registry-proof` | Prove current Gitea registry publishing and evidence capture. | OPS-G03, OPS-G05, OPS-G08, OPS-G14 |
| `wave-2-backup-restore` | Confirm backups and restore paths for critical persistent state. | OPS-G06, OPS-G07, OPS-G13 |
| `wave-3-threephoenix-foundation` | Recreate cluster and platform foundations on railiance01/ThreePhoenix. | OPS-G09, OPS-G10 |
| `wave-4-service-migration` | Move or replace production responsibilities from CoulombCore to ThreePhoenix. | OPS-G04, OPS-G11, OPS-G12 plus service-specific gates |
## Evidence Shape
Each readiness gate should eventually be represented in `ops-hub` as a widget
or widget family with events like:
- `ops-readiness-gate-updated`
- `ops-endpoint-verified`
- `ops-backup-verified`
- `ops-restore-tested`
- `ops-risk-raised`
- `ops-migration-gate-passed`
- `ops-migration-gate-failed`
Until Inter-Hub can create all required records through API calls, the evidence
can be maintained in this repo and mirrored into Inter-Hub through the UI or
explicit operator-approved migrations.