---
id: HF-WP-0001
type: workplan
title: "Establish ops-hub as the First VSM Inter-Hub Extension"
domain: helix_forge
repo: helix-forge
status: active
owner: worsch
created: "2026-05-16"
updated: "2026-05-16"
planning_priority: high
planning_order: 1
related_repos:
- inter-hub
- railiance-infra
- railiance-cluster
- railiance-platform
- railiance-apps
state_hub_workstream_id: "48d91935-197e-4ad4-be07-7bbcd535847c"
---
# Establish ops-hub as the First VSM Inter-Hub Extension
## Goal
Use Inter-Hub as the generic hub framework and establish `ops-hub` as the
first VSM-oriented domain hub extension: the Operations / System 1 hub.
`ops-hub` should professionalize Railiance operations while the current
CoulombCore environment transitions toward the future ThreePhoenix production
setup. Just as importantly, it should prove the repeatable extension pattern
for the later VSM hubs:
- `ops-hub` — Operations and Activities / System 1
- `syn-hub` — Synchronization and Coordination / System 2
- `ctl-hub` — Internal Control and Regulation / System 3
- `aud-hub` — Audit and Monitoring / System 3*
- `int-hub` — Intelligence and Adaptation / System 4
- `pol-hub` — Policy and Identity / System 5
- `env-hub` — Boundary and Environment
The first increment should not replace State Hub or require a separate
`ops-hub` repository immediately. It should establish the operational model,
the VSM hub vocabulary, and the smallest governed integration with Inter-Hub.
A separate implementation repository can be created once the shape of the hub
is stable and the Inter-Hub extension bootstrap API is less manual.
## VSM Hub Extension Strategy
Inter-Hub is the framework. HelixForge should extend it with specific hubs that
map to the viable-system functions already named in `INTENT.md`.
| Hub | VSM function | First responsibility |
|---|---|---|
| `ops-hub` | Operations and Activities / System 1 | Operational truth surface for environments, hosts, clusters, services, endpoints, releases, backups, incidents, risks, runbooks, and migration waves. |
| `syn-hub` | Synchronization and Coordination / System 2 | Coordination between operational units, repos, workstreams, and service handoffs so local actions do not conflict. |
| `ctl-hub` | Internal Control and Regulation / System 3 | Current-state control, resource constraints, readiness gates, priorities, and operational governance. |
| `aud-hub` | Audit and Monitoring / System 3* | Independent evidence, checks, observations, drift detection, and verification trails. |
| `int-hub` | Intelligence and Adaptation / System 4 | Future sensing, migration analysis, forecasting, recommendations, and adaptation planning. |
| `pol-hub` | Policy and Identity / System 5 | Identity, values, ultimate constraints, policy decisions, and acceptable operating posture. |
| `env-hub` | Boundary and Environment | External actors, surfaces, endpoints, users, markets, partner systems, and environmental signals. |
This workplan starts only with `ops-hub`, but every bootstrap choice should be
judged by whether it can become the template for the next hub.
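As a sketch of how the table above could be carried in bootstrap tooling, the lookup below encodes each hub with its function code and system. The function codes (`OPS`, `SYN`, and so on) and `S1` appear elsewhere in this workplan; the short system labels `S2` through `S5` and `S3*`, and the names `VsmHub`, `VSM_HUBS`, and `hub_by_slug`, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VsmHub:
    slug: str
    function_code: str          # e.g. "OPS"
    vsm_function: str           # human-readable VSM function
    vsm_system: Optional[str]   # None where the table names no numbered system


# One entry per row of the hub table; system labels other than S1 are assumed.
VSM_HUBS = (
    VsmHub("ops-hub", "OPS", "Operations and Activities", "S1"),
    VsmHub("syn-hub", "SYN", "Synchronization and Coordination", "S2"),
    VsmHub("ctl-hub", "CTL", "Internal Control and Regulation", "S3"),
    VsmHub("aud-hub", "AUD", "Audit and Monitoring", "S3*"),
    VsmHub("int-hub", "INT", "Intelligence and Adaptation", "S4"),
    VsmHub("pol-hub", "POL", "Policy and Identity", "S5"),
    VsmHub("env-hub", "ENV", "Boundary and Environment", None),
)


def hub_by_slug(slug: str) -> VsmHub:
    """Return the hub definition for a slug, or raise KeyError if unknown."""
    for hub in VSM_HUBS:
        if hub.slug == slug:
            return hub
    raise KeyError(slug)
```

Keeping the classification in one table like this is one way to let later hubs reuse the same bootstrap inputs with only the row changed.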
## Context
`wiki/CurrentOperationsSituation.md` captures the immediate operational
background as of 2026-05-15. The short version: the operational platform is
real, useful, and already carrying production-like responsibilities, but its
state is spread across live systems, repo workplans, shell knowledge, and
operator memory. There is no central service catalog or operational registry
yet.
Current operational reality:
- `coulombcore` / `92.205.130.254` is the live production-like server. It runs
the current Gitea deployment and other hands-on experimental services.
- The local workstation still hosts important services such as State Hub and
local build/runtime pieces.
- `railiance01` / `92.205.62.239` is the first server of the intended future
ThreePhoenix production environment.
- The Railiance repo stack already separates operational responsibility:
`railiance-infra` (S1), `railiance-cluster` (S2), `railiance-platform` (S3),
`railiance-enablement` (S4), and `railiance-apps` (S5).
- Gitea is live on the CoulombCore Kubernetes cluster as Helm release `gitea`
in namespace `default`, exposed through NodePort `32166`, with its database
in namespace `databases` and shared data on PVC `default/gitea-shared-storage`.
- The Gitea OCI registry route at `https://gitea.coulomb.social/v2/` now
returns the expected registry auth challenge, but publishing still needs to
be proven with encrypted Helm values, a package token, `docker login`, push,
and pull.
- Ops Bridge can help reveal which servers are connected and reachable, but it
is not itself a full operational service catalog.
`ops-hub` should become the operational truth surface across those realities:
environments, hosts, clusters, services, releases, endpoints, backups,
readiness gates, incidents, risks, service discovery, and migration waves.
## Inter-Hub API Findings
Checked live and local Inter-Hub evidence on 2026-05-16.
Live API:
- `https://hub.coulomb.social/api/v2/openapi.json` is available.
- `https://hub.coulomb.social/api/v2/docs` is available.
- Public UI route `https://hub.coulomb.social/Hubs` redirects to
`/NewSession`, so hub creation is currently an authenticated UI flow.
Live OpenAPI paths:
- `/widgets`, `/widgets/{id}`
- `/interaction-events`
- `/annotations`
- `/requirement-candidates`, `/requirement-candidates/{id}`
- `/decision-records`, `/decision-records/{id}`
- `/deployment-records`, `/deployment-records/{id}`
- `/outcome-signals`, `/outcome-signals/{id}`
- `/widget-types`, `/event-types`, `/annotation-categories`
- `/hub-registry`, `/hub-registry/{hubId}`
- `/widget-patterns`, `/widget-patterns/{id}`, `/widget-patterns/{id}/adopt`
- `/token`
Useful local Inter-Hub docs:
- `inter-hub/docs/domain-hub-extension-guide.md`
- `inter-hub/docs/new-hub-quickstart.md`
- `inter-hub/contracts/extensions/hub-capability-manifest-v1.md`
- `inter-hub/contracts/functional/interaction-reporting-v1.md`
Assessment:
- Inter-Hub provides enough guidance to start `ops-hub` using the API consumer
pattern or as a manually registered domain hub.
- Inter-Hub does not yet provide enough API surface to fully automate first hub
bootstrap. Hub creation, capability manifest creation/activation, API
consumer creation, API key issuance, and widget creation are primarily UI or
internal-controller workflows.
- The quickstart mentions `POST /api/v2/hubs` and `POST /api/v2/widgets`, but
the live OpenAPI and local routes do not expose those create endpoints. Treat
the quickstart as aspirational for bootstrap automation until Inter-Hub is
hardened.
## Confirmed Bootstrap Path
Checked against the live API and local Inter-Hub source on 2026-05-16.
Decision:
- Bootstrap `ops-hub` through the authenticated Inter-Hub admin UI where
possible, with a migration-backed fallback when a repeatable bootstrap is
needed before the public API is hardened.
- Treat Pattern A, API Consumer Hub, as the first implementation pattern.
- Store VSM classification in the manifest capability description for now,
because the current `hubs` table only has `slug`, `name`, `domain`, and
`hub_kind`; there are no first-class `hub_family`, `vsm_function`, or
`vsm_system` columns yet.
- Use `hub_kind = domain` for `ops-hub`.
- Record missing first-class VSM metadata fields and create endpoints as
Inter-Hub hardening work under T10.
Confirmed current support:
- Live `/Hubs` redirects to `/NewSession`, so hub creation is an authenticated
UI flow.
- Local `HubsController` supports creating and editing hub rows through the UI.
- Local `HubCapabilityManifestsController` supports draft creation, JSON-array
vocabulary editing, activation, and registry upsert.
- Local `ApiConsumersController` and `ApiKeysController` support authenticated
UI creation of consumers and static keys.
- Local `/api/v2/interaction-events` supports `POST` and validates event types.
Confirmed gaps:
- Live OpenAPI has no `POST /api/v2/hubs`.
- Live OpenAPI has no `POST /api/v2/widgets`; widgets are read-only in v2.
- There are no v2 endpoints for manifest draft creation, manifest activation,
API consumer creation, or API key creation.
- There is no `/api/v2/policy-scopes` endpoint.
- Live type registries currently return empty arrays, so the ops vocabulary
still needs manifest activation before events can be accepted.
- Event metadata is exposed in the response schema, but the v2 interaction
event create controller currently does not persist submitted metadata.
- Webhook dispatch still uses the hard-coded `"clicked"` event name.
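The endpoint gaps listed above can be re-checked mechanically against a downloaded copy of the OpenAPI document. The following is a minimal sketch, assuming the document uses the standard `paths` object and that paths are written relative to `/api/v2` (as in the path list earlier in this workplan). The names `EXPECTED_OPERATIONS` and `missing_operations` are hypothetical.

```python
from typing import Iterable, List, Tuple

# Operations this workplan expects or would like to see (method, path).
EXPECTED_OPERATIONS = [
    ("post", "/hubs"),
    ("post", "/widgets"),
    ("get", "/policy-scopes"),
    ("post", "/interaction-events"),
]


def missing_operations(
    spec: dict,
    expected: Iterable[Tuple[str, str]] = EXPECTED_OPERATIONS,
) -> List[Tuple[str, str]]:
    """Return the (method, path) pairs that the OpenAPI document does not declare."""
    paths = spec.get("paths", {})
    missing = []
    for method, path in expected:
        declared_methods = {name.lower() for name in paths.get(path, {})}
        if method.lower() not in declared_methods:
            missing.append((method, path))
    return missing


# Reduced sample standing in for the parsed openapi.json; not the real document.
sample_spec = {
    "paths": {
        "/widgets": {"get": {}},
        "/interaction-events": {"post": {}},
        "/hub-registry": {"get": {}},
    }
}
print(missing_operations(sample_spec))
```

Running the same function against the parsed live `openapi.json` after each Inter-Hub release would show when a gap has closed, without relying on the quickstart text.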
## Architectural Decision
Start with **Pattern A: API Consumer Hub** for `ops-hub`, plus a manual or
migration-backed Inter-Hub registration. Treat `ops-hub` as the first VSM hub
instance rather than a one-off operational dashboard:
1. Register `ops-hub` as a domain hub in Inter-Hub.
2. Classify it as the Operations / System 1 hub in hub metadata or manifest
metadata, depending on what Inter-Hub currently supports.
3. Activate a `HubCapabilityManifest` for its operational vocabulary.
4. Create an `ApiConsumer` and API key for `ops-hub`.
5. Seed a small set of governed widgets representing operational surfaces.
6. Emit interaction events and annotations from lightweight scripts or a
prototype UI.
The first reusable contract to prove is:
```text
Hub identity + VSM function + manifest vocabulary + API consumer + seed widgets + evidence events
```
The next hubs should be able to follow the same shape with their own
vocabularies:
```text
syn-hub / ctl-hub / aud-hub / int-hub / pol-hub / env-hub
```
Do not create a separate `ops-hub` repository until the first inventory,
readiness, service catalog, and migration workflows have proven their data
model.
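The reusable contract above can be made checkable by writing it down as a data structure. This is a sketch only: the class `HubBootstrapSpec`, its field names, and the `missing_parts` helper are hypothetical and do not mirror any Inter-Hub model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HubBootstrapSpec:
    # Hub identity + VSM function
    slug: str
    name: str
    hub_kind: str
    vsm_function: str
    # Manifest vocabulary
    widget_types: List[str] = field(default_factory=list)
    event_types: List[str] = field(default_factory=list)
    # API consumer, seed widgets, evidence events
    api_consumer: str = ""
    seed_widgets: List[str] = field(default_factory=list)
    evidence_events: List[str] = field(default_factory=list)

    def missing_parts(self) -> List[str]:
        """Name the contract parts that are still empty."""
        checks = {
            "manifest vocabulary": self.widget_types and self.event_types,
            "API consumer": self.api_consumer,
            "seed widgets": self.seed_widgets,
            "evidence events": self.evidence_events,
        }
        return [part for part, value in checks.items() if not value]


spec = HubBootstrapSpec(
    slug="ops-hub",
    name="Ops Hub",
    hub_kind="domain",
    vsm_function="OPS",
    widget_types=["ops-host"],
    event_types=["ops-health-checked"],
)
print(spec.missing_parts())
```

A later hub would fill the same structure with its own slug and vocabulary, which makes it easy to see at a glance which part of the contract has not been proven yet.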
## Initial ops-hub Vocabulary
This vocabulary is deliberately scoped to Operations / System 1. Coordination,
control, audit, intelligence, policy, and environment concerns should be
represented only where they touch operational evidence; their own hubs will
own the broader semantics later.
Suggested manifest values:
### Widget Types
```json
[
  "ops-environment",
  "ops-host",
  "ops-cluster",
  "ops-service",
  "ops-service-catalog",
  "ops-endpoint",
  "ops-release",
  "ops-backup-set",
  "ops-secret-set",
  "ops-runbook",
  "ops-incident",
  "ops-readiness-gate",
  "ops-migration-wave",
  "ops-risk"
]
```
### Event Types
```json
[
  "ops-inventory-registered",
  "ops-inventory-updated",
  "ops-service-discovered",
  "ops-health-checked",
  "ops-release-observed",
  "ops-endpoint-verified",
  "ops-backup-verified",
  "ops-restore-tested",
  "ops-runbook-executed",
  "ops-drift-detected",
  "ops-risk-raised",
  "ops-risk-accepted",
  "ops-readiness-gate-updated",
  "ops-migration-gate-passed",
  "ops-migration-gate-failed"
]
```
### Annotation Categories
```json
[
  "ops-drift",
  "ops-service-catalog-gap",
  "ops-backup-gap",
  "ops-security-gap",
  "ops-routing-gap",
  "ops-secret-gap",
  "ops-readiness-blocker",
  "ops-migration-risk",
  "ops-observability-gap",
  "ops-recovery-gap"
]
```
### Policy Scopes
```json
[
  "ops-local",
  "ops-transitional-prod",
  "ops-production",
  "ops-threephoenix",
  "ops-registry",
  "ops-secrets",
  "ops-backup-retention"
]
```
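All four lists above use the `ops-` prefix and contain no repeated values. A small pre-activation check can keep it that way as the vocabulary grows. The sketch below assumes the prefix is meant as a convention (the workplan does not state it as a rule); `vocabulary_problems` and the section keys are hypothetical names.

```python
from typing import Dict, List


def vocabulary_problems(
    vocabulary: Dict[str, List[str]],
    prefix: str = "ops-",
) -> List[str]:
    """Report values that lack the hub prefix or appear twice in one section."""
    problems = []
    for section, values in vocabulary.items():
        seen = set()
        for value in values:
            if not value.startswith(prefix):
                problems.append(f"{section}: {value!r} lacks prefix {prefix!r}")
            if value in seen:
                problems.append(f"{section}: {value!r} is duplicated")
            seen.add(value)
    return problems


# Excerpt of the policy scopes above; the other sections follow the same shape.
draft = {"policy_scopes": ["ops-local", "ops-production", "ops-secrets"]}
print(vocabulary_problems(draft))
```

Passing a different `prefix` (for example `"syn-"`) would let the same check serve the next hub, under the same assumption about naming.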
## Initial Operational Inventory
The first ops-hub inventory should cover:
| Environment | Role | Current notes |
|---|---|---|
| `local` | Workstation services and development runtime | State Hub and local build/runtime pieces currently live here. |
| `coulombcore` | Live transitional production | Public IP `92.205.130.254`; hosts current Gitea and hand-built experimental production services. |
| `railiance01` | Future production foundation | Public IP `92.205.62.239`; first server of the intended ThreePhoenix setup. |
| `threephoenix-prod` | Target production topology | Future three-node Railiance production environment. |
The first services to model:
- Gitea / container registry
- State Hub and underlying services
- Inter-Hub itself
- PostgreSQL/CNPG services used by Gitea and State Hub
- Ingress/DNS/TLS endpoints for the above
- Backup and restore coverage for each persistent data store
- Ops Bridge connectivity as reachability evidence, not as the catalog itself
The first explicit service-catalog gap:
- There is no central place that answers "what runs where, why, who owns it,
how it is reached, and what evidence proves it is healthy." `ops-hub` should
make that question answerable before the ThreePhoenix migration becomes more
complicated.
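One way to make that question concrete is to give every service a record with one field per part of the question and to list whichever fields are still empty. This is an illustrative sketch: `ServiceEntry`, its field names, and `catalog_gaps` are hypothetical and not an agreed inventory schema.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ServiceEntry:
    name: str
    environment: str = ""       # what runs where
    purpose: str = ""           # why
    owner: str = ""             # who owns it
    endpoint: str = ""          # how it is reached
    health_evidence: str = ""   # what evidence proves it is healthy


def catalog_gaps(entry: ServiceEntry) -> List[str]:
    """Return the names of the fields that are still unanswered."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name)]


# Only facts already stated in this workplan are filled in here.
gitea = ServiceEntry(
    name="gitea",
    environment="coulombcore",
    endpoint="https://gitea.coulomb.social/v2/",
)
print(catalog_gaps(gitea))
```

Each unanswered field is a candidate for an `ops-service-catalog-gap` annotation once the vocabulary is active.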
## Tasks
### T01 — Confirm the VSM hub extension bootstrap path
```task
id: HF-WP-0001-T01
status: done
priority: high
state_hub_task_id: "2587a3b8-3b9b-4948-acaf-1547644e4563"
```
Confirm whether `ops-hub` should be registered through the Inter-Hub UI,
through a migration, or through new API endpoints. Capture the result as the
first repeatable VSM hub bootstrap path, not just as a local workaround for
Operations.
Checks:
- Confirm the active Inter-Hub deployment URL and authentication path.
- Confirm whether `/Hubs/new`, `/HubCapabilityManifests`, `/ApiConsumers`, and
`/ApiKeys` are accessible to the operator.
- Confirm whether direct DB migration is acceptable for initial bootstrap.
- Confirm where hub metadata can carry the VSM function (`OPS`, `SYN`, `CTL`,
`AUD`, `INT`, `POL`, `ENV`) and VSM system mapping.
- Record the chosen bootstrap path in this workplan so `syn-hub` can reuse it.
Done when: there is a concrete, repeatable path to create the `ops-hub` row,
manifest, API consumer, and API key, with enough metadata to classify it as the
Operations / System 1 hub.
Output: `Confirmed Bootstrap Path` section in this workplan.
---
### T02 — Register ops-hub in Inter-Hub as the Operations hub
```task
id: HF-WP-0001-T02
status: blocked
priority: high
state_hub_task_id: "8e9bd9b2-54fc-49a4-8bb8-11c8577be48d"
```
Create the Hub row:
- `name`: `Ops Hub`
- `slug`: `ops-hub`
- `domain`: `ops.coulomb.social` or another explicit domain chosen by the
operator
- `hub_kind`: `domain`
- VSM function metadata: `OPS`
- VSM system metadata: `S1`
- Hub family metadata: `vsm`
If Inter-Hub does not yet have explicit fields for VSM function, system, or hub
family, store them in manifest metadata and record the missing first-class
fields as an Inter-Hub API/model gap.
Done when: `ops-hub` appears in `/Hubs` and `/api/v2/hub-registry` after
authentication, and a human can tell that it is the VSM Operations hub.
Blocked until: an authenticated Inter-Hub admin session or deployment-side
migration is available.
Prepared artifacts:
- `wiki/OpsHubBootstrapRunbook.md`
- `wiki/ops-hub-manifest.draft.json`
- `wiki/ops-hub-bootstrap.sql`
---
### T03 — Activate the ops-hub capability manifest
```task
id: HF-WP-0001-T03
status: blocked
priority: high
state_hub_task_id: "55f5aeed-21c3-4a83-bc78-f90f92c7d597"
```
Create and activate a `HubCapabilityManifest` for `ops-hub` using the
vocabulary in this workplan. The manifest should make the VSM classification
explicit:
- `hub_family`: `vsm`
- `vsm_function`: `OPS`
- `vsm_system`: `S1`
- `scope`: operational truth and evidence, not coordination/control/audit
ownership
Validation:
- Declared widget types appear in `/api/v2/widget-types`.
- Declared event types appear in `/api/v2/event-types`.
- Declared annotation categories appear in `/api/v2/annotation-categories`.
- Policy scopes are visible in the Inter-Hub registry UI or DB, even though the
public v2 API currently lacks `/policy-scopes`.
- Future VSM hub values can be added by changing manifest vocabulary, not by
inventing a different bootstrap mechanism.
Done when: the manifest status is `active` and no type conflicts remain.
Blocked until: the `ops-hub` row exists in Inter-Hub and an authenticated
operator or migration can create and activate the manifest.
Prepared artifact: `wiki/ops-hub-manifest.draft.json`.
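The first three validation checks amount to comparing the declared vocabulary with what each registry endpoint lists. The sketch below assumes each registry response has already been fetched and reduced to a plain list of names; the real response shape is not documented in this workplan, and the function names are hypothetical.

```python
from typing import Dict, Iterable, List


def not_yet_listed(declared: Iterable[str], registry: Iterable[str]) -> List[str]:
    """Return declared values that the live registry does not list."""
    live = set(registry)
    return [value for value in declared if value not in live]


def validation_report(
    declared: Dict[str, List[str]],
    live: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Map each registry path to the declared values it is still missing."""
    report = {}
    for path, values in declared.items():
        missing = not_yet_listed(values, live.get(path, []))
        if missing:
            report[path] = missing
    return report


# Excerpt of the declared vocabulary, keyed by the registry path that should list it.
declared = {
    "/api/v2/widget-types": ["ops-host", "ops-service"],
    "/api/v2/event-types": ["ops-health-checked"],
}
# Empty registries, as observed before activation: everything is reported missing.
print(validation_report(declared, {}))
```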
---
### T04 — Create ops-hub API consumer and key
```task
id: HF-WP-0001-T04
status: blocked
priority: high
state_hub_task_id: "ad08e729-8562-4a02-8bf6-dcdfebe430c8"
```
Create an `ApiConsumer` associated with the active `ops-hub` manifest, then
create a static API key with at least:
- `framework:read`
- `hub:ops-hub:read`
- `hub:ops-hub:write`
Store the key only in the operator secret store or a local env file, never in Git.
Done when: `POST /api/v2/token` can exchange the static key for a short-lived
access token and `GET /api/v2/hub-registry` works with that token.
Blocked until: an authenticated operator creates the API key and stores the
full static key outside Git. The SQL fallback intentionally creates only the
consumer row, not the one-time visible secret.
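A lightweight script can keep the key out of Git by reading it from the environment and only then building the token request. This sketch is heavily assumption-laden: the environment variable name `OPS_HUB_API_KEY` and the request body field `apiKey` are placeholders, since the `/token` request schema is not recorded in this workplan. Check the live OpenAPI document before use. The code only constructs the request and does not send it.

```python
import json
import os
import urllib.request

TOKEN_URL = "https://hub.coulomb.social/api/v2/token"


def load_static_key(env_var: str = "OPS_HUB_API_KEY") -> str:
    """Read the static key from the environment (variable name is an assumption)."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"{env_var} is not set; load it from the operator secret store")
    return key


def build_token_request(static_key: str, url: str = TOKEN_URL) -> urllib.request.Request:
    """Build, but do not send, the token exchange request."""
    # ASSUMPTION: the body field name "apiKey" is a placeholder, not the confirmed schema.
    body = json.dumps({"apiKey": static_key}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )
```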
---
### T05 — Seed first governed ops widgets
```task
id: HF-WP-0001-T05
status: blocked
priority: high
state_hub_task_id: "d303884d-d1f6-4fd0-a4ec-97afe6162164"
```
Create initial widgets for the operational surfaces:
- `ops-env-local`
- `ops-env-coulombcore`
- `ops-env-railiance01`
- `ops-env-threephoenix-prod`
- `ops-host-coulombcore`
- `ops-host-railiance01`
- `ops-service-catalog`
- `ops-service-gitea`
- `ops-service-state-hub`
- `ops-service-inter-hub`
- `ops-endpoint-gitea-registry`
- `ops-readiness-gitea-registry`
- `ops-readiness-state-hub-cluster-deploy`
- `ops-migration-coulombcore-to-threephoenix`
If Inter-Hub still lacks a widget creation API, seed these through the UI or a
migration and record that as an API gap.
Done when: the widgets appear under `ops-hub` and can accept interaction events
and annotations.
Blocked until: `ops-hub` and its active manifest exist in Inter-Hub.
Prepared artifacts:
- `wiki/ops-hub-widgets.seed.json`
- `wiki/ops-hub-bootstrap.sql`
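Each seed widget needs one of the declared widget types. The mapping below is inferred from the slug names in this task and is an assumption, as is the record shape, which is not necessarily that of `wiki/ops-hub-widgets.seed.json`. It sketches how a seed list could be checked before it goes through the UI or a migration.

```python
from typing import Dict, List, Optional

# Slug prefix -> widget type. The catalog entry comes first so it wins over "ops-service-".
SLUG_PREFIX_TO_TYPE = [
    ("ops-service-catalog", "ops-service-catalog"),
    ("ops-env-", "ops-environment"),
    ("ops-host-", "ops-host"),
    ("ops-service-", "ops-service"),
    ("ops-endpoint-", "ops-endpoint"),
    ("ops-readiness-", "ops-readiness-gate"),
    ("ops-migration-", "ops-migration-wave"),
]


def widget_type_for(slug: str) -> Optional[str]:
    """Return the widget type inferred from the slug prefix, or None if unknown."""
    for prefix, widget_type in SLUG_PREFIX_TO_TYPE:
        if slug.startswith(prefix):
            return widget_type
    return None


def seed_records(slugs: List[str], hub_slug: str = "ops-hub") -> List[Dict[str, str]]:
    """Build hypothetical seed records, failing loudly on an unmapped slug."""
    records = []
    for slug in slugs:
        widget_type = widget_type_for(slug)
        if widget_type is None:
            raise ValueError(f"no widget type known for {slug!r}")
        records.append({"hub": hub_slug, "slug": slug, "widgetType": widget_type})
    return records
```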
---
### T06 — Build the first ops inventory artifact
```task
id: HF-WP-0001-T06
status: done
priority: medium
state_hub_task_id: "2a0b2f69-5a3d-433c-9cbd-85fd868b63d8"
```
Create an ops inventory document in `helix-forge` that expresses the current
state of:
- environments
- hosts
- clusters
- services
- endpoints
- service discovery and service-catalog gaps
- storage and backup coverage
- migration readiness gates
Use `wiki/CurrentOperationsSituation.md` as the seed background, then turn it
into a more structured inventory artifact. Use this as the working model before
creating a separate `ops-hub` repository.
Done when: a human can see the CoulombCore, local, railiance01, and
ThreePhoenix relationship, including the current Gitea registry state, without
reading multiple repo workplans or relying on shell history.
Output: `wiki/OpsHubInventory.md`.
---
### T07 — Instrument the current Gitea registry work as the first ops-hub signal
```task
id: HF-WP-0001-T07
status: blocked
priority: medium
state_hub_task_id: "ed3e0396-b16d-40c2-9519-e755ad6241eb"
```
Use the recently fixed Gitea `/v2` route as the first real operational signal.
Suggested event:
```json
{
  "widgetId": "<ops-readiness-gitea-registry-widget-id>",
  "eventType": "ops-endpoint-verified",
  "viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001",
  "metadata": {
    "vsmFunction": "OPS",
    "vsmSystem": "S1",
    "endpoint": "https://gitea.coulomb.social/v2/",
    "expectedStatus": 401,
    "observedHeader": "Docker-Distribution-Api-Version: registry/2.0"
  }
}
```
Done when: the Gitea registry readiness event is visible in Inter-Hub and
traceable back to the Railiance workplan.
Blocked until: the `ops-readiness-gitea-registry` widget exists, the
`ops-endpoint-verified` event type is active, and an ops-hub API key is
available to the operator.
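Once the blockers clear, the suggested event can be assembled from an actual probe result rather than typed by hand. The sketch below takes the observed status and headers as inputs, so it runs without network access. The function names are hypothetical, and `metadata` is included even though, as noted under the confirmed gaps, the v2 create controller does not currently persist it.

```python
from typing import Dict, Mapping


def registry_challenge_ok(status: int, headers: Mapping[str, str]) -> bool:
    """True when the response looks like the expected registry auth challenge."""
    normalized = {name.lower(): value for name, value in headers.items()}
    return status == 401 and normalized.get("docker-distribution-api-version") == "registry/2.0"


def build_verified_event(
    widget_id: str,
    status: int,
    headers: Mapping[str, str],
) -> Dict[str, object]:
    """Build the event payload, refusing to do so if the probe did not match."""
    if not registry_challenge_ok(status, headers):
        raise ValueError(f"unexpected registry response: status {status}")
    return {
        "widgetId": widget_id,
        "eventType": "ops-endpoint-verified",
        "viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001",
        "metadata": {
            "vsmFunction": "OPS",
            "vsmSystem": "S1",
            "endpoint": "https://gitea.coulomb.social/v2/",
            "expectedStatus": 401,
            "observedHeader": "Docker-Distribution-Api-Version: registry/2.0",
        },
    }
```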
---
### T08 — Define the ops-hub readiness gate model for ThreePhoenix migration
```task
id: HF-WP-0001-T08
status: done
priority: medium
state_hub_task_id: "72a58622-c3ac-4765-8026-5c2489af2058"
```
Define readiness gates that must be green before moving production
responsibility from CoulombCore to ThreePhoenix:
- DNS and TLS are codified.
- Service catalog entries exist for the live and target production services.
- Git hosting and container registry are reproducible.
- Persistent data stores have backup and restore evidence.
- Secrets and SOPS/age keys are available through governed operator paths.
- Cluster runtime and platform services are recreated through Railiance repos.
- Rollback path is documented.
- Operator runbooks exist for deploy, restore, rotate, and incident response.
Done when: each gate has an owner repo, evidence requirement, and status.
Output: `wiki/OpsHubReadinessGates.md`.
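The gate model described here (owner repo, evidence requirement, status, and "all green before moving") is small enough to sketch directly. The class and function names are hypothetical, the status values other than `green` are assumed, and the placeholder owner and evidence strings are illustrative rather than actual assignments.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ReadinessGate:
    name: str
    owner_repo: str
    evidence_requirement: str
    status: str = "red"  # assumed values: "red", "amber", "green"


def blocking_gates(gates: List[ReadinessGate]) -> List[str]:
    """Return the names of gates that are not green yet."""
    return [gate.name for gate in gates if gate.status != "green"]


def migration_ready(gates: List[ReadinessGate]) -> bool:
    """Ready only when at least one gate is defined and every gate is green."""
    return bool(gates) and not blocking_gates(gates)


# Illustrative values only; owners and evidence are placeholders.
gates = [
    ReadinessGate("DNS and TLS are codified", "<owner-repo>", "<evidence>", "green"),
    ReadinessGate("Rollback path is documented", "<owner-repo>", "<evidence>"),
]
print(blocking_gates(gates), migration_ready(gates))
```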
---
### T09 — Decide whether to create a separate ops-hub repository
```task
id: HF-WP-0001-T09
status: todo
priority: medium
state_hub_task_id: "0e5842fd-1d33-4e2a-9701-07f623a2b901"
```
Make the repository decision after T05-T08.
Create a separate repo when at least one of these is true:
- `ops-hub` needs its own UI beyond Inter-Hub's generic hub dashboards.
- `ops-hub` needs collectors, adapters, or scheduled probes.
- `ops-hub` needs its own release lifecycle.
- The ops vocabulary stabilizes enough to deserve reusable code.
- The VSM hub extension template needs shared scaffolding that should not live
inside `inter-hub` itself.
Until then, keep the model in `helix-forge` and register state in Inter-Hub.
Done when: the decision is recorded with rationale and a repo boundary if
needed.
---
### T10 — Inter-Hub API hardening for VSM hub bootstrap
```task
id: HF-WP-0001-T10
status: in_progress
priority: high
target_repo: inter-hub
state_hub_task_id: "7fa54508-7add-4885-8913-12edaadc4d92"
```
Create or link an `inter-hub` workplan to make VSM domain hub bootstrapping
machine-repeatable.
Recommended Inter-Hub improvements:
1. Add `POST /api/v2/hubs` and include it in OpenAPI.
2. Add `POST /api/v2/widgets` and include it in OpenAPI.
3. Add API endpoints for `HubCapabilityManifest` draft creation, update, and
activation.
4. Add a documented place for hub-family metadata such as `hub_family`,
`vsm_function`, and `vsm_system`.
5. Add API endpoints for `ApiConsumer` and API key creation, or a clearly
documented admin-only bootstrap command if API key creation remains UI-only.
6. Add `/api/v2/policy-scopes` to match the policy scope registry already used
by manifests.
7. Add distinct OpenAPI request schemas for create requests instead of reusing
response schemas.
8. Align `docs/new-hub-quickstart.md` with the actual live API until the create
endpoints exist.
9. Fix `Web.Controller.Api.V2.InteractionEvents` so manifest-declared event
types are actually decoded and enforced.
10. Fix webhook dispatch so it uses the submitted event type instead of the
hard-coded `"clicked"` event name.
11. Decide whether event `metadata` is part of the v2 create contract; if yes,
persist it in the controller and test it.
12. Document the bootstrap recipe as a template for `syn-hub`, `ctl-hub`,
`aud-hub`, `int-hub`, `pol-hub`, and `env-hub`.
Done when: the next VSM hub can be created from a script using documented API
calls and without direct DB access.
Linked Inter-Hub workplan:
`inter-hub/workplans/IHUB-WP-0019-vsm-hub-bootstrap-api.md`.
## Initial Acceptance Criteria
This workplan is complete when:
1. `ops-hub` is registered in Inter-Hub as the VSM Operations / System 1 hub.
2. Its capability manifest is active.
3. It has an API consumer and key.
4. Initial ops widgets exist for environments, services, readiness gates, and
migration waves.
5. At least one real operational event has been submitted.
6. The CoulombCore-to-ThreePhoenix readiness model is documented.
7. A decision has been made on whether to create a separate `ops-hub` repository.
8. Inter-Hub bootstrap API gaps are either fixed or tracked in an Inter-Hub
workplan.
9. The bootstrap path is reusable enough that `syn-hub` can be created next
without rediscovering the whole process.
## Notes
`ops-hub` should complement State Hub during the transition:
- State Hub continues to track workstreams, decisions, and progress events.
- `ops-hub` tracks operational reality and readiness evidence.
- `syn-hub`, `ctl-hub`, and `aud-hub` can later absorb coordination, control,
and evidence responsibilities once the broader hub constellation is
established.