Situation and workplan established

2026-05-16 02:37:50 +02:00
parent 9c565cd68d
commit 479abeb781
2 changed files with 273 additions and 40 deletions


@@ -0,0 +1,111 @@
Here is an assessment of the operations situation as of 2026-05-15. The clean mental model is:
`92.205.130.254` / CoulombCore is the current real production-ish server, but it is also historically hand-built and experimental. The Railiance tooling is now trying to pull that reality back into governed repos and workplans without pretending it was born clean.
`92.205.62.239` / `railiance01` is the first server of the intended future production shape: ThreePhoenix. That future stack belongs mostly to the lower repos first: `railiance-infra`, `railiance-cluster`, `railiance-platform`, then finally `railiance-apps`.
Current Gitea state:
- Gitea is live on the CoulombCore Kubernetes cluster at `92.205.130.254`.
- It is deployed by Helm as release `gitea`, but currently in namespace `default`, not `gitea`.
- The live chart label says `gitea-12.5.0`, app version `1.25.4`.
- The running image is `docker.gitea.com/gitea:1.25.4-rootless`.
- The main service is `default/gitea`, exposed as NodePort `32166`.
- The database is `gitea-db` in namespace `databases`, owned by the platform layer.
- Gitea data lives on PVC `default/gitea-shared-storage`.
For the registry specifically:
- Before my change, Gitea itself already responded correctly on `/v2/` through the pod and NodePort.
- But `https://gitea.coulomb.social/v2/` returned a generic `404`, meaning public hostname routing was missing.
- I added and applied a narrow Traefik ingress for only `/v2` on `gitea.coulomb.social`.
- Now `https://gitea.coulomb.social/v2/` returns the expected OCI registry `401` auth challenge.
So the current registry situation is: routing is fixed, but publishing is not fully proven yet. We still need the encrypted Helm values updated via SOPS and a Gitea package token to run `docker login`, `push`, and `pull`.
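The difference between the earlier generic `404` and the current `401` can be expressed as a small check. This is a minimal sketch, assuming the probe result is already available as a status code plus a header mapping; the function name `classify_registry_probe` and the returned labels are hypothetical and not part of any existing Railiance tooling.

```python
from typing import Mapping


def classify_registry_probe(status: int, headers: Mapping[str, str]) -> str:
    """Classify a response to GET https://<host>/v2/ (hypothetical helper)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    is_registry = "docker-distribution-api-version" in lowered

    if status == 401 and is_registry:
        # The registry is reachable and is asking for credentials.
        return "auth-challenge"
    if status == 200 and is_registry:
        # The registry answers without authentication.
        return "open-registry"
    if status == 404:
        # Nothing registry-like answered: hostname routing is probably missing.
        return "routing-missing"
    return "unknown"


# The state before and after the ingress change described above.
before = classify_registry_probe(404, {})
after = classify_registry_probe(
    401, {"Docker-Distribution-Api-Version": "registry/2.0"}
)
```

A result of `auth-challenge` only shows that routing works; it says nothing about whether `docker login`, `push`, and `pull` succeed.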
How the tooling handles the split:
- `railiance-apps` owns the Gitea app deployment shape: Helm values, application ingress, app-level docs.
- `railiance-platform` owns the database/cache/storage services Gitea depends on.
- `railiance-cluster` owns Kubernetes runtime and routing primitives.
- `railiance-infra` owns server provisioning and the future Forgejo/ThreePhoenix direction.
- State Hub tracks the workstreams and decisions, but the source of truth is still repo files like workplans and manifests.
In short: CoulombCore is the current operational platform. `railiance01` is the future production foundation. The work we just did makes the current Gitea usable as a container registry for near-term State Hub image publishing, while the longer-term Forgejo/ThreePhoenix migration remains the strategic production path.
There is a lot of detail about what is running locally, and it might be helpful to look at the ops-bridge state to see which servers are connected.
As one can see, there is no central service catalog or registry, so service discovery remains to be solved.
There is another aspect. I plan to transition from State Hub to the Inter-Hub framework with specific hubs for
- Operations (ops-hub),
- Synchronization (syn-hub),
- Internal Control (ctl-hub),
- Audit (aud-hub),
- Intelligence aka Adaptation (int-hub),
- Policy aka Identity (pol-hub),
- Environment (env-hub).
The framework is available at hub.coulomb.social/Hubs, but I have not set up any specific hub extension yet.
Maybe it is a good time to establish ops-hub and provide a better framework to professionalize the
operations infrastructure while working our way to the ThreePhoenix setup?
Yes, I think this is exactly the right moment to establish `ops-hub`, but with one caveat: make it a thin operational layer first, not a big-bang replacement for State Hub.
I could not fetch `hub.coulomb.social/Hubs` from here, so I'm reasoning from the framework names and your current setup.
The current problem is no longer just “deploy Gitea” or “move to railiance01.” You now have an operations transition:
- CoulombCore `92.205.130.254`: live, useful, but hand-built and somewhat experimental.
- Local machine: still hosts important services like State Hub and local build/runtime pieces.
- railiance01 `92.205.62.239`: first node of the intended ThreePhoenix production architecture.
- Future goal: a governed, repeatable, multi-node production system.
That is very much an `ops-hub` concern.
I'd position `ops-hub` as the operational truth layer for:
- environments: `local`, `coulombcore`, `railiance01`, future `threephoenix-prod`
- hosts/nodes: IPs, roles, lifecycle state, ownership
- clusters: current CoulombCore K8s vs future ThreePhoenix K8s
- services: Gitea, State Hub, databases, ingress, registry, identity
- endpoints: DNS names, TLS status, public/private exposure
- releases: what is deployed where, by which repo/workplan
- backups: what data exists, where, and whether it is covered
- operational risks: hand-built drift, missing secrets, manual steps, unverified restore
- runbooks: deploy, rollback, restore, rotate, migrate
State Hub can remain the workstream/task/decision tracker for now. `ops-hub` should not immediately replace it. Instead, `ops-hub` should professionalize the operational model around the work State Hub is already tracking.
For Gitea specifically, `ops-hub` would make the current state explicit:
```text
Service: Gitea
Current environment: coulombcore
Host/IP: 92.205.130.254
Cluster namespace: default
Helm release: gitea
Chart/app: gitea-12.5.0 / 1.25.4
Public host: gitea.coulomb.social
Registry endpoint: /v2 live, auth challenge working
Known gaps:
- SOPS Helm values not yet updated in this session
- Docker push/pull not proven with PAT
- backup coverage for package blobs not confirmed
Future disposition:
- migrate/replace under Forgejo or governed Gitea on ThreePhoenix
```
That is much better than having this knowledge scattered across workplans, shell history, and memory.
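The record above could also be held as structured data so that gaps become queryable. The following is a rough sketch only: the `ServiceRecord` class, its field names, and the `is_migration_ready` rule are illustrative assumptions, not an existing `ops-hub` schema.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """One operations inventory entry; all field names are assumptions."""

    name: str
    environment: str
    host_ip: str
    namespace: str
    helm_release: str
    public_host: str
    known_gaps: list[str] = field(default_factory=list)

    def is_migration_ready(self) -> bool:
        # Assumed rule: a service with any open gap is not ready to move.
        return not self.known_gaps


# Values copied from the Gitea record above.
gitea = ServiceRecord(
    name="Gitea",
    environment="coulombcore",
    host_ip="92.205.130.254",
    namespace="default",
    helm_release="gitea",
    public_host="gitea.coulomb.social",
    known_gaps=[
        "SOPS Helm values not yet updated in this session",
        "Docker push/pull not proven with PAT",
        "backup coverage for package blobs not confirmed",
    ],
)
```

With this shape, a question such as "which services still have open gaps" becomes a filter over records instead of a search through workplans.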
My recommendation: create `ops-hub` now with a very small first milestone:
1. Define the environment model: `local`, `coulombcore`, `railiance01`, `threephoenix-prod`.
2. Seed an operations inventory for Gitea and State Hub.
3. Mark CoulombCore as `live-but-transitional`.
4. Mark railiance01 as `production-foundation`.
5. Track readiness gates for migration: DNS, TLS, registry, backups, secrets, restore test, deployment reproducibility.
6. Keep State Hub as the coordination source until `coo-hub` exists.
In short: yes, establish `ops-hub` now. It gives you the missing professional operations spine while you work toward ThreePhoenix, without forcing a premature migration away from State Hub.
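The first milestone can be sketched as plain data plus one helper. This assumes the lifecycle labels and gate names from the list above; the `UNCLASSIFIED` state for `local` and `threephoenix-prod`, the gate identifiers, and the `blocking_gates` helper are hypothetical placeholders.

```python
from enum import Enum


class Lifecycle(Enum):
    LIVE_BUT_TRANSITIONAL = "live-but-transitional"
    PRODUCTION_FOUNDATION = "production-foundation"
    # Assumption: the text assigns no label to the other environments yet.
    UNCLASSIFIED = "unclassified"


ENVIRONMENTS = {
    "local": Lifecycle.UNCLASSIFIED,
    "coulombcore": Lifecycle.LIVE_BUT_TRANSITIONAL,
    "railiance01": Lifecycle.PRODUCTION_FOUNDATION,
    "threephoenix-prod": Lifecycle.UNCLASSIFIED,
}

# Gate identifiers derived from step 5; the exact spelling is an assumption.
READINESS_GATES = (
    "dns",
    "tls",
    "registry",
    "backups",
    "secrets",
    "restore-test",
    "deployment-reproducibility",
)


def blocking_gates(passed: set[str]) -> list[str]:
    """Return the gates that still block migration, in declaration order."""
    return [gate for gate in READINESS_GATES if gate not in passed]


still_open = blocking_gates({"dns", "tls"})
```

Keeping the gates in a fixed tuple makes the remaining work deterministic to report, regardless of the order in which gates were passed.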


@@ -1,7 +1,7 @@
---
id: HF-WP-0001
type: workplan
title: "Establish ops-hub as the First Inter-Hub Extension"
title: "Establish ops-hub as the First VSM Inter-Hub Extension"
domain: helix_forge
repo: helix-forge
status: active
@@ -19,23 +19,59 @@ related_repos:
state_hub_workstream_id: "48d91935-197e-4ad4-be07-7bbcd535847c"
---
# Establish ops-hub as the First Inter-Hub Extension
# Establish ops-hub as the First VSM Inter-Hub Extension
## Goal
Create `ops-hub` as the first practical domain extension of the Interaction Hub
Framework, focused on professionalizing Railiance operations while the current
Use Inter-Hub as the generic hub framework and establish `ops-hub` as the
first VSM-oriented domain hub extension: the Operations / System 1 hub.
`ops-hub` should professionalize Railiance operations while the current
CoulombCore environment transitions toward the future ThreePhoenix production
setup.
setup. Just as importantly, it should prove the repeatable extension pattern
for the later VSM hubs:
- `ops-hub` — Operations and Activities / System 1
- `syn-hub` — Synchronization and Coordination / System 2
- `ctl-hub` — Internal Control and Regulation / System 3
- `aud-hub` — Audit and Monitoring / System 3*
- `int-hub` — Intelligence and Adaptation / System 4
- `pol-hub` — Policy and Identity / System 5
- `env-hub` — Boundary and Environment
The first increment should not replace State Hub or require a separate
`ops-hub` repository immediately. It should establish the operational model,
the hub vocabulary, and the smallest governed integration with Inter-Hub. A
separate implementation repository can be created once the shape of the hub is
stable and the Inter-Hub extension bootstrap API is less manual.
the VSM hub vocabulary, and the smallest governed integration with Inter-Hub.
A separate implementation repository can be created once the shape of the hub
is stable and the Inter-Hub extension bootstrap API is less manual.
## VSM Hub Extension Strategy
Inter-Hub is the framework. HelixForge should extend it with specific hubs that
map to the viable-system functions already named in `INTENT.md`.
| Hub | VSM function | First responsibility |
|---|---|---|
| `ops-hub` | Operations and Activities / System 1 | Operational truth surface for environments, hosts, clusters, services, endpoints, releases, backups, incidents, risks, runbooks, and migration waves. |
| `syn-hub` | Synchronization and Coordination / System 2 | Coordination between operational units, repos, workstreams, and service handoffs so local actions do not conflict. |
| `ctl-hub` | Internal Control and Regulation / System 3 | Current-state control, resource constraints, readiness gates, priorities, and operational governance. |
| `aud-hub` | Audit and Monitoring / System 3* | Independent evidence, checks, observations, drift detection, and verification trails. |
| `int-hub` | Intelligence and Adaptation / System 4 | Future sensing, migration analysis, forecasting, recommendations, and adaptation planning. |
| `pol-hub` | Policy and Identity / System 5 | Identity, values, ultimate constraints, policy decisions, and acceptable operating posture. |
| `env-hub` | Boundary and Environment | External actors, surfaces, endpoints, users, markets, partner systems, and environmental signals. |
This workplan starts only with `ops-hub`, but every bootstrap choice should be
judged by whether it can become the template for the next hub.
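As an illustration of how the table could serve as a template, the mapping can be written down once and reused for every hub. This is a sketch under assumptions: the short function codes follow the `OPS`, `SYN`, `CTL`, `AUD`, `INT`, `POL`, `ENV` convention used in task T01, while the `VsmHub` type, the `"S3*"` string, and the `manifest_metadata` helper are hypothetical.

```python
from typing import NamedTuple, Optional


class VsmHub(NamedTuple):
    function: str
    system: Optional[str]  # None where the table names no VSM system number


VSM_HUBS = {
    "ops-hub": VsmHub("OPS", "S1"),
    "syn-hub": VsmHub("SYN", "S2"),
    "ctl-hub": VsmHub("CTL", "S3"),
    "aud-hub": VsmHub("AUD", "S3*"),
    "int-hub": VsmHub("INT", "S4"),
    "pol-hub": VsmHub("POL", "S5"),
    "env-hub": VsmHub("ENV", None),
}


def manifest_metadata(hub_name: str) -> dict:
    """Build the VSM classification block for a hub (hypothetical shape)."""
    hub = VSM_HUBS[hub_name]
    return {
        "hub_family": "vsm",
        "vsm_function": hub.function,
        "vsm_system": hub.system,
    }


ops_metadata = manifest_metadata("ops-hub")
```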
## Context
`wiki/CurrentOperationsSituation.md` captures the immediate operational
background as of 2026-05-15. The short version: the operational platform is
real, useful, and already carrying production-like responsibilities, but its
state is spread across live systems, repo workplans, shell knowledge, and
operator memory. There is no central service catalog or operational registry
yet.
Current operational reality:
- `coulombcore` / `92.205.130.254` is the live production-like server. It runs
@@ -47,10 +83,19 @@ Current operational reality:
- The Railiance repo stack already separates operational responsibility:
`railiance-infra` (S1), `railiance-cluster` (S2), `railiance-platform` (S3),
`railiance-enablement` (S4), and `railiance-apps` (S5).
- Gitea is live on the CoulombCore Kubernetes cluster as Helm release `gitea`
in namespace `default`, exposed through NodePort `32166`, with its database
in namespace `databases` and shared data on PVC `default/gitea-shared-storage`.
- The Gitea OCI registry route at `https://gitea.coulomb.social/v2/` now
returns the expected registry auth challenge, but publishing still needs to
be proven with encrypted Helm values, a package token, `docker login`, push,
and pull.
- Ops Bridge can help reveal which servers are connected and reachable, but it
is not itself a full operational service catalog.
`ops-hub` should become the operational truth surface across those realities:
environments, hosts, clusters, services, releases, endpoints, backups,
readiness gates, incidents, risks, and migration waves.
readiness gates, incidents, risks, service discovery, and migration waves.
## Inter-Hub API Findings
@@ -99,21 +144,43 @@ Assessment:
## Architectural Decision
Start with **Pattern A: API Consumer Hub**, plus a manual or migration-backed
Inter-Hub registration:
Start with **Pattern A: API Consumer Hub** for `ops-hub`, plus a manual or
migration-backed Inter-Hub registration. Treat `ops-hub` as the first VSM hub
instance rather than a one-off operational dashboard:
1. Register `ops-hub` as a domain hub in Inter-Hub.
2. Activate a `HubCapabilityManifest` for its operational vocabulary.
3. Create an `ApiConsumer` and API key for `ops-hub`.
4. Seed a small set of governed widgets representing operational surfaces.
5. Emit interaction events and annotations from lightweight scripts or a
2. Classify it as the Operations / System 1 hub in hub metadata or manifest
metadata, depending on what Inter-Hub currently supports.
3. Activate a `HubCapabilityManifest` for its operational vocabulary.
4. Create an `ApiConsumer` and API key for `ops-hub`.
5. Seed a small set of governed widgets representing operational surfaces.
6. Emit interaction events and annotations from lightweight scripts or a
prototype UI.
The first reusable contract to prove is:
```text
Hub identity + VSM function + manifest vocabulary + API consumer + seed widgets + evidence events
```
The next hubs should be able to follow the same shape with their own
vocabularies:
```text
syn-hub / ctl-hub / aud-hub / int-hub / pol-hub / env-hub
```
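One way to make that contract checkable is to track each part explicitly and report what is still missing. The sketch below is an assumption about shape only: `HubBootstrap`, its fields, and `missing_parts` are hypothetical names and do not correspond to Inter-Hub models.

```python
from dataclasses import dataclass, field


@dataclass
class HubBootstrap:
    """Tracks the parts of the reusable bootstrap contract for one hub."""

    hub_name: str
    vsm_function: str = ""
    manifest_active: bool = False
    api_consumer_created: bool = False
    seed_widgets: list[str] = field(default_factory=list)
    evidence_events: list[str] = field(default_factory=list)

    def missing_parts(self) -> list[str]:
        # Dict insertion order keeps the report in contract order.
        checks = {
            "vsm function": bool(self.vsm_function),
            "manifest vocabulary": self.manifest_active,
            "api consumer": self.api_consumer_created,
            "seed widgets": bool(self.seed_widgets),
            "evidence events": bool(self.evidence_events),
        }
        return [part for part, ok in checks.items() if not ok]


ops = HubBootstrap("ops-hub", vsm_function="OPS", manifest_active=True)
remaining = ops.missing_parts()
```

The same structure would apply unchanged to a later hub such as `syn-hub`; only the vocabulary inside the manifest differs.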
Do not create a separate `ops-hub` repository until the first inventory,
readiness, and migration workflows have proven their data model.
readiness, service catalog, and migration workflows have proven their data
model.
## Initial ops-hub Vocabulary
This vocabulary is deliberately scoped to Operations / System 1. Coordination,
control, audit, intelligence, policy, and environment concerns should be
represented only where they touch operational evidence; their own hubs will
own the broader semantics later.
Suggested manifest values:
### Widget Types
@@ -124,9 +191,11 @@ Suggested manifest values:
"ops-host",
"ops-cluster",
"ops-service",
"ops-service-catalog",
"ops-endpoint",
"ops-release",
"ops-backup-set",
"ops-secret-set",
"ops-runbook",
"ops-incident",
"ops-readiness-gate",
@@ -140,14 +209,18 @@ Suggested manifest values:
```json
[
"ops-inventory-registered",
"ops-inventory-updated",
"ops-service-discovered",
"ops-health-checked",
"ops-release-observed",
"ops-endpoint-verified",
"ops-backup-verified",
"ops-restore-tested",
"ops-runbook-executed",
"ops-drift-detected",
"ops-risk-raised",
"ops-risk-accepted",
"ops-readiness-gate-updated",
"ops-migration-gate-passed",
"ops-migration-gate-failed"
]
@@ -158,6 +231,7 @@ Suggested manifest values:
```json
[
"ops-drift",
"ops-service-catalog-gap",
"ops-backup-gap",
"ops-security-gap",
"ops-routing-gap",
@@ -202,10 +276,18 @@ The first services to model:
- PostgreSQL/CNPG services used by Gitea and State Hub
- Ingress/DNS/TLS endpoints for the above
- Backup and restore coverage for each persistent data store
- Ops Bridge connectivity as reachability evidence, not as the catalog itself
The first explicit service-catalog gap:
- There is no central place that answers "what runs where, why, who owns it,
how it is reached, and what evidence proves it is healthy." `ops-hub` should
make that question answerable before the ThreePhoenix migration becomes more
complicated.
## Tasks
### T01 — Confirm Inter-Hub extension bootstrap path
### T01 — Confirm the VSM hub extension bootstrap path
```task
id: HF-WP-0001-T01
@@ -215,7 +297,9 @@ state_hub_task_id: "2587a3b8-3b9b-4948-acaf-1547644e4563"
```
Confirm whether `ops-hub` should be registered through the Inter-Hub UI,
through a migration, or through new API endpoints.
through a migration, or through new API endpoints. Capture the result as the
first repeatable VSM hub bootstrap path, not just as a local workaround for
Operations.
Checks:
@@ -223,14 +307,17 @@ Checks:
- Confirm whether `/Hubs/new`, `/HubCapabilityManifests`, `/ApiConsumers`, and
`/ApiKeys` are accessible to the operator.
- Confirm whether direct DB migration is acceptable for initial bootstrap.
- Record the chosen bootstrap path in this workplan.
- Confirm where hub metadata can carry the VSM function (`OPS`, `SYN`, `CTL`,
`AUD`, `INT`, `POL`, `ENV`) and VSM system mapping.
- Record the chosen bootstrap path in this workplan so `syn-hub` can reuse it.
Done when: there is a concrete, repeatable path to create the `ops-hub` row,
manifest, API consumer, and API key.
manifest, API consumer, and API key, with enough metadata to classify it as the
Operations / System 1 hub.
---
### T02 — Register ops-hub in Inter-Hub
### T02 — Register ops-hub in Inter-Hub as the Operations hub
```task
id: HF-WP-0001-T02
@@ -246,9 +333,16 @@ Create the Hub row:
- `domain`: `ops.coulomb.social` or another explicit domain chosen by the
operator
- `hub_kind`: `domain`
- VSM function metadata: `OPS`
- VSM system metadata: `S1`
- Hub family metadata: `vsm`
If Inter-Hub does not yet have explicit fields for VSM function, system, or hub
family, store them in manifest metadata and record the missing first-class
fields as an Inter-Hub API/model gap.
Done when: `ops-hub` appears in `/Hubs` and `/api/v2/hub-registry` after
authentication.
authentication, and a human can tell that it is the VSM Operations hub.
---
@@ -262,7 +356,14 @@ state_hub_task_id: "55f5aeed-21c3-4a83-bc78-f90f92c7d597"
```
Create and activate a `HubCapabilityManifest` for `ops-hub` using the
vocabulary in this workplan.
vocabulary in this workplan. The manifest should make the VSM classification
explicit:
- `hub_family`: `vsm`
- `vsm_function`: `OPS`
- `vsm_system`: `S1`
- `scope`: operational truth and evidence, not coordination/control/audit
ownership
Validation:
@@ -271,6 +372,8 @@ Validation:
- Declared annotation categories appear in `/api/v2/annotation-categories`.
- Policy scopes are visible in the Inter-Hub registry UI or DB, even though the
public v2 API currently lacks `/policy-scopes`.
- Future VSM hub values can be added by changing manifest vocabulary, not by
inventing a different bootstrap mechanism.
Done when: the manifest status is `active` and no type conflicts remain.
@@ -314,9 +417,13 @@ Create initial widgets for the operational surfaces:
- `ops-env-coulombcore`
- `ops-env-railiance01`
- `ops-env-threephoenix-prod`
- `ops-host-coulombcore`
- `ops-host-railiance01`
- `ops-service-catalog`
- `ops-service-gitea`
- `ops-service-state-hub`
- `ops-service-inter-hub`
- `ops-endpoint-gitea-registry`
- `ops-readiness-gitea-registry`
- `ops-readiness-state-hub-cluster-deploy`
- `ops-migration-coulombcore-to-threephoenix`
@@ -346,13 +453,17 @@ state of:
- clusters
- services
- endpoints
- service discovery and service-catalog gaps
- storage and backup coverage
- migration readiness gates
Use this as the working model before creating a separate `ops-hub` repository.
Use `wiki/CurrentOperationsSituation.md` as the seed background, then turn it
into a more structured inventory artifact. Use this as the working model before
creating a separate `ops-hub` repository.
Done when: a human can see the CoulombCore, local, railiance01, and
ThreePhoenix relationship without reading multiple repo workplans.
ThreePhoenix relationship, including the current Gitea registry state, without
reading multiple repo workplans or relying on shell history.
---
@@ -375,6 +486,8 @@ Suggested event:
"eventType": "ops-endpoint-verified",
"viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001",
"metadata": {
"vsmFunction": "OPS",
"vsmSystem": "S1",
"endpoint": "https://gitea.coulomb.social/v2/",
"expectedStatus": 401,
"observedHeader": "Docker-Distribution-Api-Version: registry/2.0"
@@ -400,6 +513,7 @@ Define readiness gates that must be green before moving production
responsibility from CoulombCore to ThreePhoenix:
- DNS and TLS are codified.
- Service catalog entries exist for the live and target production services.
- Git hosting and container registry are reproducible.
- Persistent data stores have backup and restore evidence.
- Secrets and SOPS/age keys are available through governed operator paths.
@@ -428,6 +542,8 @@ Create a separate repo when at least one of these is true:
- `ops-hub` needs collectors, adapters, or scheduled probes.
- `ops-hub` needs its own release lifecycle.
- The ops vocabulary stabilizes enough to deserve reusable code.
- The VSM hub extension template needs shared scaffolding that should not live
inside `inter-hub` itself.
Until then, keep the model in `helix-forge` and register state in Inter-Hub.
@@ -436,7 +552,7 @@ needed.
---
### T10 — Inter-Hub API hardening for extension bootstrap
### T10 — Inter-Hub API hardening for VSM hub bootstrap
```task
id: HF-WP-0001-T10
@@ -446,7 +562,7 @@ target_repo: inter-hub
state_hub_task_id: "7fa54508-7add-4885-8913-12edaadc4d92"
```
Create or link an `inter-hub` workplan to make domain hub bootstrapping
Create or link an `inter-hub` workplan to make VSM domain hub bootstrapping
machine-repeatable.
Recommended Inter-Hub improvements:
@@ -455,29 +571,33 @@ Recommended Inter-Hub improvements:
2. Add `POST /api/v2/widgets` and include it in OpenAPI.
3. Add API endpoints for `HubCapabilityManifest` draft creation, update, and
activation.
4. Add API endpoints for `ApiConsumer` and API key creation, or a clearly
4. Add a documented place for hub-family metadata such as `hub_family`,
`vsm_function`, and `vsm_system`.
5. Add API endpoints for `ApiConsumer` and API key creation, or a clearly
documented admin-only bootstrap command if API key creation remains UI-only.
5. Add `/api/v2/policy-scopes` to match the policy scope registry already used
6. Add `/api/v2/policy-scopes` to match the policy scope registry already used
by manifests.
6. Add distinct OpenAPI request schemas for create requests instead of reusing
7. Add distinct OpenAPI request schemas for create requests instead of reusing
response schemas.
7. Align `docs/new-hub-quickstart.md` with the actual live API until the create
8. Align `docs/new-hub-quickstart.md` with the actual live API until the create
endpoints exist.
8. Fix `Web.Controller.Api.V2.InteractionEvents` so manifest-declared event
9. Fix `Web.Controller.Api.V2.InteractionEvents` so manifest-declared event
types are actually decoded and enforced.
9. Fix webhook dispatch so it uses the submitted event type instead of the
10. Fix webhook dispatch so it uses the submitted event type instead of the
hard-coded `"clicked"` event name.
10. Decide whether event `metadata` is part of the v2 create contract; if yes,
11. Decide whether event `metadata` is part of the v2 create contract; if yes,
persist it in the controller and test it.
12. Document the bootstrap recipe as a template for `syn-hub`, `ctl-hub`,
`aud-hub`, `int-hub`, `pol-hub`, and `env-hub`.
Done when: the next domain hub can be created from a script using documented
API calls and without direct DB access.
Done when: the next VSM hub can be created from a script using documented API
calls and without direct DB access.
## Initial Acceptance Criteria
This workplan is complete when:
1. `ops-hub` is registered in Inter-Hub.
1. `ops-hub` is registered in Inter-Hub as the VSM Operations / System 1 hub.
2. Its capability manifest is active.
3. It has an API consumer and key.
4. Initial ops widgets exist for environments, services, readiness gates, and
@@ -487,6 +607,8 @@ This workplan is complete when:
7. A decision has been made whether to create a separate `ops-hub` repository.
8. Inter-Hub bootstrap API gaps are either fixed or tracked in an Inter-Hub
workplan.
9. The bootstrap path is reusable enough that `syn-hub` can be created next
without rediscovering the whole process.
## Notes
@@ -494,6 +616,6 @@ This workplan is complete when:
- State Hub continues to track workstreams, decisions, and progress events.
- `ops-hub` tracks operational reality and readiness evidence.
- `coo-hub` can later become the coordination/workstream successor once the
broader hub constellation is established.
- `syn-hub`, `ctl-hub`, and `aud-hub` can later absorb coordination, control,
and evidence responsibilities once the broader hub constellation is
established.