Normalize agent instructions and workplan frontmatter (STATE-WP-0067)

- Align agent files with on-disk workplan prefixes (infer from workplan ids)
- Set workplan domain to registered domain_slug; add topic_slug where applicable
- Repair frontmatter delimiter formatting; migrate legacy task status literals
- Regenerate AGENTS.md, CLAUDE.md, and .claude/rules from State Hub templates
2026-06-22 23:16:27 +02:00
parent cec94ac00f
commit 21bfd5fa49
19 changed files with 819 additions and 25 deletions

20
.claude/rules/agents.md Normal file

@@ -0,0 +1,20 @@
## Kaizen Agents
Specialized agent personas available on demand via the state-hub MCP.
**Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
**Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them
Common agents:
| Agent | Category | When to use |
|-------|----------|-------------|
| `tdd-workflow` | testing | Step-by-step TDD8 workflow for any feature |
| `code-refactoring` | quality | Code quality analysis and safe refactoring |
| `test-maintenance` | testing | Diagnose and fix failing tests |
| `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
| `keepaTodofile` | process | Maintain TODO.md during work |
| `project-management` | process | Track status, determine next steps |
| `datamodel-optimization` | quality | Optimize dataclasses and data structures |
All 17 agents: call `list_kaizen_agents()` for the full list.


@@ -0,0 +1,8 @@
## Architecture
<!-- TODO: Describe the key design decisions and component structure.
Key modules, data flows, external integrations, state machines, etc. -->
## Quick Reference
`~/state-hub/mcp_server/TOOLS.md` — MCP tool reference


@@ -0,0 +1,50 @@
# Credential and access routing
**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.
ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
### Lookup (do this first)
```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```
Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=ops-hub` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
### Quick routing table
| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
### Anti-patterns (do not do these)
- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat
### Other capabilities (reuse-surface)
Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.
**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`


@@ -0,0 +1,38 @@
## First Session Protocol
Triggered when `get_domain_summary("infotech")` shows **no workstreams**.
The project is registered but work has not yet been structured.
**Step 1 — Read, don't write**
- `~/the-custodian/canon/projects/infotech/project_charter_v0.1.md` — purpose, scope
- `~/the-custodian/canon/projects/infotech/roadmap_v0.1.md` — planned phases
- Scan repo root: README, directory structure, existing code or docs
**Step 2 — Survey in-progress work**
Look for TODOs, open branches, and half-finished files. Note what is done versus what is started but incomplete.
**Step 3 — Propose workstreams to Bernd**
Propose 1–3 workstreams — each a coherent strand, weeks to months, anchored to a
roadmap phase. **Wait for approval before creating.**
**Step 4 — Create workplan file first, then DB record (ADR-001)**
```
workplans/OPS-WP-NNNN-<slug>.md ← write this first
```
Then register in the hub:
```
create_workstream(topic_id="1f2e4d10-c967-4803-ae6c-7f4b4e806409", title="...", owner="...", description="...")
create_task(workstream_id="<id>", title="...", priority="high|medium|low")
```
**Step 5 — Record the setup**
```
add_progress_event(
summary="First session: structured infotech into N workstreams, M tasks",
event_type="milestone",
topic_id="1f2e4d10-c967-4803-ae6c-7f4b4e806409",
detail={"workstreams": [...], "tasks_created": M}
)
```
<!-- Delete or archive this file once past first session -->


@@ -0,0 +1,8 @@
## Repo boundary
This repo owns **ops-hub** only. It does not own:
<!-- TODO: List what belongs in adjacent repos, e.g.:
- SSH key management → railiance-infra/
- State hub code → state-hub/
-->


@@ -0,0 +1,5 @@
**Purpose:** Operations / System 1 extension for Inter-Hub, focused on operational truth, readiness evidence, service catalog records, and migration gates.
**Domain:** infotech
**Repo slug:** ops-hub
**Topic ID:** 1f2e4d10-c967-4803-ae6c-7f4b4e806409


@@ -0,0 +1,85 @@
## Session Protocol
Dev Hub (State Hub API): http://127.0.0.1:8000
MCP server name in `~/.claude.json`: `dev-hub`
**Step 1 — Orient**
Read the offline-safe brief first — it works without a live hub connection:
```bash
cat .custodian-brief.md
```
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
```
get_domain_summary("infotech")
```
If MCP tools are unavailable in the current agent session, use the REST API:
```bash
curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
```
If the hub is offline: `cd ~/state-hub && make api`
**Step 2 — Check inbox**
With MCP tools:
```
get_messages(to_agent="ops-hub", unread_only=True)
```
Mark read with `mark_message_read(message_id)`. Reply or act on coordination
requests before proceeding.
Without MCP tools:
```bash
curl -s "http://127.0.0.1:8000/messages/?to_agent=ops-hub&unread_only=true" \
| python3 -m json.tool
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
-H "Content-Type: application/json" -d '{}'
```
**Step 3 — Scan workplans**
```bash
ls workplans/
```
For each file with `status: ready`, `active`, or `blocked`, note pending
`wait`/`todo`/`progress` tasks.
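The scan in Step 3 can be automated. This is a rough sketch, not a State Hub tool: it assumes each workplan starts with `---`-delimited frontmatter containing a `status:` line and that tasks use the `task` fenced blocks described in the workplan convention. The function names are hypothetical.

```python
import re
from pathlib import Path

OPEN_PLAN_STATUSES = {"ready", "active", "blocked"}
PENDING_TASK_STATUSES = {"wait", "todo", "progress"}
FENCE = "`" * 3  # built at runtime so this example can sit inside a fenced block

TASK_BLOCK = re.compile(FENCE + r"task\n(.*?)\n" + FENCE, re.DOTALL)


def parse_fields(block: str) -> dict:
    """Parse simple `key: value` lines; quotes and trailing comments are dropped."""
    fields = {}
    for line in block.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.split("#")[0].strip().strip('"')
    return fields


def pending_tasks(text: str) -> list:
    """Return ids of pending tasks if the workplan status is ready/active/blocked."""
    parts = text.split("---", 2)  # assumed layout: '---' frontmatter '---' body
    if len(parts) < 3:
        return []
    if parse_fields(parts[1]).get("status") not in OPEN_PLAN_STATUSES:
        return []
    tasks = (parse_fields(m) for m in TASK_BLOCK.findall(parts[2]))
    return [t.get("id", "?") for t in tasks if t.get("status") in PENDING_TASK_STATUSES]


if __name__ == "__main__":
    for path in sorted(Path("workplans").glob("*.md")):
        print(path.name, pending_tasks(path.read_text(encoding="utf-8")))
```

The parser is deliberately naive (no YAML library), so it only suits the flat key/value shape shown in this repo's workplans.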
**Step 4 — Present brief**
1. **Active workstreams** for `infotech` — title, task counts, blocking decisions
2. **Pending tasks** from `workplans/` + any `[repo:ops-hub]` hub tasks
3. **Goal guidance** — if `goal_guidance` in summary:
- `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
- `alignment_warnings`: flag if active work is not aligned with current goal
4. **Suggested next action** — highest-priority open item
5. **SBOM status** — flag if `last_sbom_at` is unset for this repo
If no workstreams: follow First Session Protocol (`first-session.md`).
**During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`
> State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
> are First Session Protocol only. Work structure belongs in repo files (ADR-001).
**Session close:**
With MCP tools:
```
add_progress_event(summary="...", topic_id="1f2e4d10-c967-4803-ae6c-7f4b4e806409", workstream_id="<uuid>")
```
Without MCP tools:
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
-H "Content-Type: application/json" \
-d '{"topic_id":"1f2e4d10-c967-4803-ae6c-7f4b4e806409","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
```
If workplan files were modified, ensure the local copy is up to date first:
```bash
git -C <repo_path> pull --ff-only
cd ~/state-hub && make fix-consistency REPO=ops-hub
```
For repos where implementation runs on a remote machine (e.g. CoulombCore),
use the combined target which pulls before fixing:
```bash
cd ~/state-hub && make fix-consistency-remote REPO=ops-hub
```
**C-15** (DB task ahead of file) is normal in multi-machine workflows — writeback
will sync the file to match DB. **C-16** (repo behind remote) blocks all writes
until you pull — intentional to prevent clobbering remote progress.


@@ -0,0 +1,19 @@
## Stack
<!-- TODO: Fill in language, frameworks, and key dependencies -->
- **Language:**
- **Key deps:**
## Dev Commands
```bash
# TODO: Fill in the standard commands for this repo
# Install dependencies
# Run tests
# Lint / type check
# Build / package (if applicable)
```


@@ -0,0 +1,40 @@
## Workplan Convention (ADR-001)
File location: `workplans/OPS-WP-NNNN-<slug>.md`
ID prefix: `OPS-WP-`
Work items originate as files in this repo **before** being registered in the hub.
Canonical workplan/workstream frontmatter statuses are:
`proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
Use `proposed` for a newly drafted plan, `ready` after review against current
repo state, and `finished` when implementation is complete. `stalled` and
`needs_review` are derived health labels, not stored statuses.
Closed workplans may be moved to `workplans/archived/` with a completion-date
prefix: `YYMMDD-OPS-WP-NNNN-<slug>.md`. The frontmatter id remains
unchanged; the prefix is only for quick visual reference.
Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
`workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
`ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
directly. Promote anything requiring analysis, design, approval, dependencies, or
multiple planned phases into a normal workplan.
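The archive and Ad Hoc naming rules above reduce to simple string formatting. A small sketch, with hypothetical helper names that are not part of any State Hub tooling:

```python
from datetime import date


def archived_name(workplan_file: str, completed: date) -> str:
    """Prefix a workplan filename with its completion date as YYMMDD."""
    return f"{completed:%y%m%d}-{workplan_file}"


def adhoc_names(day: date, task_number: int) -> dict:
    """Derive the Ad Hoc file, workstream slug, and task id for one day."""
    stamp = day.isoformat()  # YYYY-MM-DD
    return {
        "file": f"workplans/ADHOC-{stamp}.md",
        "slug": f"adhoc-{stamp}",
        "task_id": f"ADHOC-{stamp}-T{task_number:02d}",
    }


if __name__ == "__main__":
    print(archived_name("OPS-WP-0002-interhub-extension-bootstrap.md", date(2026, 6, 6)))
    print(adhoc_names(date(2026, 6, 6), 1))
```

Only the filename gains the date prefix; the frontmatter `id` inside the file stays as it was.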
Ecosystem todos from other agents arrive as `[repo:ops-hub]` hub tasks —
visible at session start. Pick one up by creating the workplan file, then registering
the workstream.
Task blocks use this shape:
```task
id: OPS-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
```
Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
blocked work and `cancel` for stopped work.
<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->


@@ -2,9 +2,9 @@
## Repo Identity
**Purpose:** Inter-hub extension for the operations & resiliance subdimension of the orthogonal architecture standard perspective.
**Purpose:** Operations / System 1 extension for Inter-Hub, focused on operational truth, readiness evidence, service catalog records, and migration gates.
**Domain:** inter_hub
**Domain:** infotech
**Repo slug:** ops-hub
**Topic ID:** `1f2e4d10-c967-4803-ae6c-7f4b4e806409`
**Workplan prefix:** `OPS-WP-`
@@ -101,6 +101,63 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
---
## Credential and access routing
**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
for inference. Run this check **before** requesting secrets, API keys, SSH access,
login tokens, or database passwords — in any repo, not only `ops-warden`.
ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
other credential need belongs to another subsystem. **Do not** message
`ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
### Lookup (do this first)
```bash
warden route find "<describe your need>" --json
warden route show <catalog-id> --json
```
Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=ops-hub` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
### Quick routing table
| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
| Login / OIDC / MFA | key-cape / Keycloak | No — route only |
| Authorization decision | flex-auth | No — route only |
| activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
### Anti-patterns (do not do these)
- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat
### Other capabilities (reuse-surface)
Non-credential capabilities are usually discovered through **reuse-surface** federation
(`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
every repo's agent instructions because it is high-frequency, high-risk, and easy to
get wrong.
**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
<!-- REPO-AGENTS-EXTENSIONS -->
<!-- Append repo-specific agent instructions below this marker.
The state-hub template sync preserves content after this line. -->
---
## Workplan Convention (ADR-001)
Work items originate as files in this repo — not in the hub. The hub is a
@@ -124,7 +181,7 @@ anything needing analysis, design, approval, dependencies, or multiple phases.
id: OPS-WP-NNNN
type: workplan
title: "..."
domain: inter_hub
domain: infotech
repo: ops-hub
status: proposed | ready | active | blocked | backlog | finished | archived
owner: codex

12
CLAUDE.md Normal file

@@ -0,0 +1,12 @@
# ops-hub — Claude Code Instructions
@SCOPE.md
@.claude/rules/repo-identity.md
@.claude/rules/session-protocol.md
@.claude/rules/first-session.md
@.claude/rules/workplan-convention.md
@.claude/rules/stack-and-commands.md
@.claude/rules/architecture.md
@.claude/rules/repo-boundary.md
@.claude/rules/credential-routing.md
@.claude/rules/agents.md


@@ -7,9 +7,16 @@ updated: "2026-06-06"
## Why it exists
Inter-hub extension for the operations & resiliance subdimension of the orthogonal architecture standard perspective.
`ops-hub` is the Operations / System 1 extension for Inter-Hub. It turns
operational reality into governed, queryable, and evidence-backed hub records:
environments, hosts, clusters, services, endpoints, releases, backups,
incidents, risks, runbooks, readiness gates, and migration waves.
Inter-hub extension for the operations & resiliance subdimension of the orthogonal architecture standard perspective.
It exists because Railiance and HelixForge operations need a durable
operational truth surface while the current CoulombCore environment transitions
toward the ThreePhoenix production shape. State Hub continues to own
workstreams and decisions; Inter-Hub continues to own the generic hub substrate.
`ops-hub` owns the operations extension behavior built on top of that substrate.
## Governing principle
@@ -17,8 +24,16 @@ This repository should stay focused on the purpose above. Work that changes its
authority, ownership boundaries, or operational promises should be captured in a
workplan before implementation.
The first implementation rule is: domain-specific runtime code belongs here,
while generic hub framework behavior belongs in `inter-hub`.
## What it enables
- A coding agent can understand why the repository exists before changing it.
- State Hub can register and coordinate work for this repository.
- Future workplans can stay connected to the repository's intended role.
- Operators can see what runs where, how it is reached, and what evidence proves
it is healthy.
- Collectors, adapters, and scheduled probes can report operational facts into
Inter-Hub using the ops vocabulary.
- Readiness and migration gates can be represented as explicit, auditable
operational records.
- Future VSM hubs can reuse the extension pattern without turning Inter-Hub
itself into a domain-specific operations product.


@@ -1 +1,6 @@
Inter-hub extension for the operations & resiliance subdimension of the orthogonal architecture standard perspective.
Operations / System 1 extension for Inter-Hub.
`ops-hub` is the operational truth surface for environments, hosts, clusters,
services, endpoints, releases, backups, incidents, risks, runbooks, readiness
gates, and migration waves. Generic hub framework work stays in `inter-hub`;
operations-specific extension code belongs here.


@@ -1,32 +1,52 @@
# SCOPE
> This file was generated by `statehub register`. Refine it as the repository
> boundaries become clearer.
## One-liner
Inter-hub extension for the operations & resiliance subdimension of the orthogonal architecture standard perspective.
Operations / System 1 extension for Inter-Hub, focused on operational truth,
readiness evidence, and migration gates.
## Core Idea
ops-hub exists to provide the capability described in INTENT.md.
`ops-hub` is a domain-specific Inter-Hub extension. It should professionalize
operations by making environments, hosts, clusters, services, endpoints,
releases, backups, incidents, risks, runbooks, readiness gates, and migration
waves explicit and evidence-backed.
The repo is intentionally separate from `inter-hub`: generic framework and API
substrate work remains in `inter-hub`; operations-specific collectors,
adapters, probes, bootstrap clients, UI/extensions, tests, and packaging belong
here.
## In Scope
- Maintain the repository's primary implementation.
- Keep docs, tests, and operational metadata current.
- Operations hub implementation code and tests.
- Ops vocabulary clients, collectors, adapters, and scheduled probes.
- Inter-Hub bootstrap/smoke tooling for the `ops-hub` extension.
- Operations service catalog, readiness, migration, endpoint, backup, restore,
incident, and runbook models.
- Repo-local workplans for growing the Operations / System 1 extension.
## Out of Scope
- Own unrelated adjacent systems.
- Make irreversible operational decisions without human approval.
- Generic Inter-Hub framework behavior, API substrate, authentication, or
registry semantics.
- State Hub workstream, task, decision, or progress implementation.
- Railiance infrastructure, cluster, platform, enablement, or app desired state.
- Manual production DB seeding unless the operator explicitly chooses that
fallback.
- Irreversible operational decisions without human approval.
## Current State
- Status: active; implementation and stability should be verified by the repo agent.
- Status: active bootstrap.
- Implementation: no executable source tree yet; first real workplan is seeded
in `workplans/OPS-WP-0002-interhub-extension-bootstrap.md`.
- Live Inter-Hub production gate: `/api/v2/hubs` still returned `404` on
2026-06-06, so supported API bootstrap is not yet available in production.
## Getting Oriented
- Start with: INTENT.md
- Agent instructions: AGENTS.md
- Workplans: workplans/
- HelixForge handoff: `/home/worsch/helix-forge/workplans/HF-WP-0001-establish-ops-hub-first-extension.md`

15
docs/README.md Normal file

@@ -0,0 +1,15 @@
# ops-hub Docs
This directory contains the first repo-local version of the HelixForge
`HF-WP-0001` handoff.
- `initial-inventory.md` defines the first environment, host, cluster, service,
and endpoint catalog.
- `readiness-gates.md` defines the CoulombCore-to-ThreePhoenix readiness model.
- `bootstrap-runbook.md` defines the operator-ready Inter-Hub bootstrap path.
- `../seeds/ops-hub-manifest.draft.json` contains the initial capability
manifest draft.
- `../seeds/ops-hub-widgets.seed.json` contains the initial widget seed.
- `../seeds/ops-hub-bootstrap.sql` is an operator-approved fallback only; do
not use direct DB seeding while the supported Inter-Hub API path is viable or
pending.

146
docs/initial-inventory.md Normal file

@@ -0,0 +1,146 @@
# Ops Hub Initial Inventory
Date: 2026-06-06
## Purpose
This document is the first structured inventory for `ops-hub`, the VSM
Operations / System 1 hub. It turns the current operations situation into a
catalogable model for this implementation repo.
Source background:
- `/home/worsch/helix-forge/wiki/CurrentOperationsSituation.md`
- `/home/worsch/helix-forge/workplans/HF-WP-0001-establish-ops-hub-first-extension.md`
## Repository Boundary
As of 2026-06-06, `ops-hub` implementation belongs in `/home/worsch/ops-hub`
with remote `gitea-remote:coulomb/ops-hub.git`.
- `ops-hub` owns collectors, adapters, scheduled probes, runtime
packaging, UI/extensions, tests, and Inter-Hub bootstrap/smoke clients.
- `inter-hub` remains the generic hub framework, manifest/registry substrate,
authentication surface, widget/event API, and bootstrap API owner.
- `helix-forge` keeps architecture context and the original coordinating
workplan.
- Railiance repos own deployable infrastructure/service state and the
operational evidence that `ops-hub` should surface.
## VSM Placement
| Field | Value |
|---|---|
| Hub | `ops-hub` |
| Hub family | `vsm` |
| VSM function | `OPS` |
| VSM system | `S1` |
| Primary concern | Operational truth and evidence |
`ops-hub` owns the description of what is currently running, where it runs, how
it is reached, what state it is in, and what operational evidence exists. It
does not replace State Hub workstreams or Inter-Hub governance.
## Environments
| Environment | Role | Current state | Notes |
|---|---|---|---|
| `local` | Workstation development and local services | Active, important, not production | Hosts State Hub and local build/runtime pieces. |
| `coulombcore` | Live transitional production | Active, production-like, historically hand-built | Public IP `92.205.130.254`; runs current Gitea and experimental operational services. |
| `railiance01` | Future production foundation | Provisioning target | Public IP `92.205.62.239`; first server of intended ThreePhoenix shape. |
| `threephoenix-prod` | Target production topology | Planned | Future governed multi-node production environment. |
## Hosts
| Host | Environment | Address | Role | Known gaps |
|---|---|---|---|---|
| `coulombcore` | `coulombcore` | `92.205.130.254` | Current live production-like server | Needs service catalog, drift tracking, backup/restore evidence, and migration disposition. |
| `railiance01` | `railiance01` | `92.205.62.239` | First ThreePhoenix production foundation node | Needs full inventory, readiness gates, and cluster/platform bootstrap evidence. |
| local workstation | `local` | local/private | State Hub and development runtime host | Needs explicit service ownership and backup expectations. |
Ops Bridge may provide reachability evidence for connected servers, but it is
not the service catalog. `ops-hub` should turn bridge reachability into
inventory signals rather than treating the bridge itself as the inventory.
## Clusters
| Cluster | Environment | Role | Current notes |
|---|---|---|---|
| CoulombCore Kubernetes | `coulombcore` | Current operational Kubernetes runtime | Hosts current Gitea deployment and related services. |
| ThreePhoenix Kubernetes | `threephoenix-prod` | Target production runtime | Future governed production cluster assembled through Railiance repos. |
## Services
| Service | Current environment | Owner repo | Current evidence | Gaps |
|---|---|---|---|---|
| Gitea | `coulombcore` | `railiance-apps` | Helm release `gitea`, namespace `default`, app version `1.25.4`, NodePort `32166`, public registry path returns auth challenge. | SOPS Helm values update, package token, `docker login`, push, pull, backup coverage, restore evidence. |
| Gitea database | `coulombcore` | `railiance-platform` | Database `gitea-db` in namespace `databases`. | Backup and restore evidence not recorded here yet. |
| Gitea shared storage | `coulombcore` | `railiance-platform` / `railiance-apps` | PVC `default/gitea-shared-storage`. | Package blob backup and restore evidence not confirmed. |
| State Hub | `local` | `the-custodian/state-hub` | Local API and dashboard are operational enough for repo registration and workplan sync. | Future cluster deployment/readiness still needs gates and evidence. |
| Inter-Hub | live public endpoint | `inter-hub` | `https://hub.coulomb.social/api/v2/openapi.json` and docs are reachable. | Hub bootstrap still depends on authenticated UI or migration. |
| Ops Bridge | local/remote bridge | `ops-bridge` | Useful for connected-server visibility. | Not a service catalog; should emit reachability evidence into `ops-hub`. |
## Endpoints
| Endpoint | Service | Environment | Current status | Evidence |
|---|---|---|---|---|
| `https://gitea.coulomb.social/v2/` | Gitea OCI registry | `coulombcore` | Route fixed; returns registry auth challenge | Expected `401` with OCI registry challenge. |
| `https://hub.coulomb.social/api/v2/openapi.json` | Inter-Hub API | live Inter-Hub | Reachable | OpenAPI document fetched on 2026-05-16. |
| `https://hub.coulomb.social/Hubs` | Inter-Hub UI | live Inter-Hub | Requires login | Redirects to `/NewSession`. |
| `http://127.0.0.1:8000/state/health` | State Hub API | `local` | Reachable locally | Used for StateHub registration/sync. |
## Service Catalog Gap
There is no central place that answers these questions:
- What runs where?
- Which repo owns its desired state?
- Which endpoint exposes it?
- Which data stores back it?
- Which backups and restore tests cover it?
- Which migration wave will replace or move it?
- Which current evidence proves it is healthy?
`ops-hub` should be the first place where these answers are explicit and
machine-addressable.
## First Ops Widgets
Seed these in Inter-Hub once `ops-hub` exists:
- `ops-env-local`
- `ops-env-coulombcore`
- `ops-env-railiance01`
- `ops-env-threephoenix-prod`
- `ops-host-coulombcore`
- `ops-host-railiance01`
- `ops-service-catalog`
- `ops-service-gitea`
- `ops-service-state-hub`
- `ops-service-inter-hub`
- `ops-endpoint-gitea-registry`
- `ops-readiness-gitea-registry`
- `ops-readiness-state-hub-cluster-deploy`
- `ops-migration-coulombcore-to-threephoenix`
## First Evidence Events
The first event should be the Gitea registry endpoint verification:
```json
{
"widgetId": "<ops-endpoint-gitea-registry-widget-id>",
"eventType": "ops-endpoint-verified",
"viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001",
"metadata": {
"vsmFunction": "OPS",
"vsmSystem": "S1",
"endpoint": "https://gitea.coulomb.social/v2/",
"expectedStatus": 401,
"observedHeader": "Docker-Distribution-Api-Version: registry/2.0"
}
}
```
This event is blocked until the ops event type is registered by an active
manifest and the target widget exists.
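A collector could assemble this payload only when the observation matches the expected registry challenge. The sketch below is an illustration under assumptions: the function name is hypothetical, headers are assumed to arrive as `Name: value` strings with exact casing, and submitting the event to Inter-Hub is left out because that API path is not yet available.

```python
import json

EXPECTED_STATUS = 401  # the registry should answer with an auth challenge
EXPECTED_HEADER = "Docker-Distribution-Api-Version: registry/2.0"


def endpoint_verified_event(widget_id, observed_status, observed_headers):
    """Return the event payload, or None when the observation does not match."""
    if observed_status != EXPECTED_STATUS or EXPECTED_HEADER not in observed_headers:
        return None
    return {
        "widgetId": widget_id,
        "eventType": "ops-endpoint-verified",
        "viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001",
        "metadata": {
            "vsmFunction": "OPS",
            "vsmSystem": "S1",
            "endpoint": "https://gitea.coulomb.social/v2/",
            "expectedStatus": EXPECTED_STATUS,
            "observedHeader": EXPECTED_HEADER,
        },
    }


if __name__ == "__main__":
    event = endpoint_verified_event("<widget-id>", 401, [EXPECTED_HEADER])
    print(json.dumps(event, indent=2))
```

Returning `None` on a mismatch keeps a failed probe from being recorded as verification evidence.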

63
docs/readiness-gates.md Normal file

@@ -0,0 +1,63 @@
# Ops Hub Readiness Gates
Date: 2026-06-06
## Purpose
These gates define what must be true before operational responsibility can move
from the current CoulombCore setup to the future ThreePhoenix production setup.
They are the first repo-local `ops-hub` readiness model.
Statuses:
- `unknown` means no reliable evidence has been cataloged yet.
- `partial` means some evidence exists, but the gate is not complete.
- `blocked` means a required precondition is missing.
- `ready` means the evidence requirement is satisfied.
## Gates
| ID | Gate | Owner repo | Evidence requirement | Current status |
|---|---|---|---|---|
| OPS-G01 | Environment inventory exists | `ops-hub` | `local`, `coulombcore`, `railiance01`, and `threephoenix-prod` are represented with role, lifecycle state, and owner notes. | `partial` |
| OPS-G02 | Service catalog exists | `ops-hub` | Each live and target service has environment, owner repo, endpoint, backing stores, lifecycle state, and evidence links. | `partial` |
| OPS-G03 | DNS and TLS are codified | `railiance-cluster` / `railiance-apps` | Public hostnames, ingress routes, certificate sources, and renewal paths are declared in repo files. | `unknown` |
| OPS-G04 | Git hosting is reproducible | `railiance-apps` / `railiance-platform` | Gitea or successor deployment can be recreated from repo state, including database and storage dependencies. | `partial` |
| OPS-G05 | Container registry publishing is proven | `railiance-apps` | `docker login`, push, and pull succeed against `https://gitea.coulomb.social/v2/` using governed secrets. | `partial` |
| OPS-G06 | Persistent data is backed up | `railiance-platform` | Each persistent data store has backup location, schedule, retention, ownership, and latest successful backup evidence. | `unknown` |
| OPS-G07 | Restore path is proven | `railiance-platform` / `railiance-apps` | Restore test evidence exists for Gitea database, package blobs, and State Hub data. | `unknown` |
| OPS-G08 | Secrets path is governed | `railiance-infra` / `railiance-apps` | SOPS/age keys and operator secret paths are documented; no required secret depends on shell memory. | `partial` |
| OPS-G09 | Cluster runtime is reproducible | `railiance-cluster` | Kubernetes runtime, ingress, CNI, operators, and routing primitives are recreated through repo-owned automation. | `unknown` |
| OPS-G10 | Platform services are reproducible | `railiance-platform` | PostgreSQL/CNPG, object storage, secret management, and identity dependencies have repo-owned deployment evidence. | `unknown` |
| OPS-G11 | Application deployment is reproducible | `railiance-apps` | Gitea, Inter-Hub, State Hub, and other application releases are declared with Helm values and deployment runbooks. | `partial` |
| OPS-G12 | Rollback path is documented | owning service repos | Each migration wave has rollback conditions, steps, and data safety notes. | `unknown` |
| OPS-G13 | Operator runbooks exist | owning service repos | Deploy, restore, rotate, incident response, and migration runbooks exist for each critical service. | `unknown` |
| OPS-G14 | Observability and health checks are explicit | `railiance-cluster` / `railiance-platform` / service repos | Health checks, logs, metrics, and endpoint probes are documented and tied to service catalog entries. | `unknown` |
| OPS-G15 | Inter-Hub ops bootstrap is available | `inter-hub` / `ops-hub` / `helix-forge` | `ops-hub` can be created through UI, supported API, or explicit migration fallback, manifest activated, API consumer/key created, widgets seeded, and events accepted. | `partial` |
## Initial Migration Waves
| Wave | Goal | Required gates |
|---|---|---|
| `wave-0-catalog` | Establish the operational truth surface without moving services. | OPS-G01, OPS-G02, OPS-G15 |
| `wave-1-registry-proof` | Prove current Gitea registry publishing and evidence capture. | OPS-G03, OPS-G05, OPS-G08, OPS-G14 |
| `wave-2-backup-restore` | Confirm backups and restore paths for critical persistent state. | OPS-G06, OPS-G07, OPS-G13 |
| `wave-3-threephoenix-foundation` | Recreate cluster and platform foundations on railiance01/ThreePhoenix. | OPS-G09, OPS-G10 |
| `wave-4-service-migration` | Move or replace production responsibilities from CoulombCore to ThreePhoenix. | OPS-G04, OPS-G11, OPS-G12 plus service-specific gates |
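The wave-to-gate mapping in the table can be held as plain data, so a script can report which gates still block a wave. The following is a minimal sketch, not part of the repo: the `blocking_gates` helper and the passing status literal `"pass"` are assumptions for illustration (the gates shown here only use `partial` and `unknown`).

```python
# Wave -> required gates, copied from the table above.
# wave-4 also needs service-specific gates, which are not modelled here.
WAVE_GATES = {
    "wave-0-catalog": ["OPS-G01", "OPS-G02", "OPS-G15"],
    "wave-1-registry-proof": ["OPS-G03", "OPS-G05", "OPS-G08", "OPS-G14"],
    "wave-2-backup-restore": ["OPS-G06", "OPS-G07", "OPS-G13"],
    "wave-3-threephoenix-foundation": ["OPS-G09", "OPS-G10"],
    "wave-4-service-migration": ["OPS-G04", "OPS-G11", "OPS-G12"],
}

# Hypothetical literal for a satisfied gate; the real status vocabulary may differ.
PASSING_STATUS = "pass"


def blocking_gates(wave: str, gate_status: dict[str, str]) -> list[str]:
    """Return the required gates of `wave` that are not yet passing.

    Gates missing from `gate_status` are treated as "unknown" and therefore block.
    """
    return [
        gate
        for gate in WAVE_GATES[wave]
        if gate_status.get(gate, "unknown") != PASSING_STATUS
    ]


if __name__ == "__main__":
    # Statuses as listed for these gates in the readiness table.
    current = {"OPS-G09": "unknown", "OPS-G10": "unknown"}
    print(blocking_gates("wave-3-threephoenix-foundation", current))
```

Treating an unlisted gate as blocking keeps the check conservative: a wave is only reported as clear when every required gate has explicit passing evidence.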
## Evidence Shape
Each readiness gate should eventually be represented in `ops-hub` as a widget
or widget family with events like:
- `ops-readiness-gate-updated`
- `ops-endpoint-verified`
- `ops-backup-verified`
- `ops-restore-tested`
- `ops-risk-raised`
- `ops-migration-gate-passed`
- `ops-migration-gate-failed`
Until Inter-Hub can create all required records through API calls, the evidence
can be maintained in this repo and mirrored into Inter-Hub through the UI or
explicit operator-approved migrations.

@@ -2,9 +2,9 @@
id: OPS-WP-0001
type: workplan
title: "Bootstrap State Hub integration"
domain: inter_hub
domain: infotech
repo: ops-hub
status: ready
status: finished
owner: codex
topic_slug: inter_hub
created: "2026-06-06"
@@ -13,24 +13,28 @@ updated: "2026-06-06"
# Bootstrap State Hub integration
Inter-hub extension for the operations & resilience subdimension of the orthogonal architecture standard perspective.
Bootstrap this repo's State Hub integration and replace generated placeholders
with the first concrete `ops-hub` operating frame.
## Review Generated Integration Files
```task
id: OPS-WP-0001-T01
status: todo
status: done
priority: high
```
Review `INTENT.md`, `SCOPE.md`, `AGENTS.md`, and `.custodian-brief.md`.
Replace generated placeholders with repo-specific facts where needed.
Completed 2026-06-06: `INTENT.md`, `SCOPE.md`, `AGENTS.md`, and `README.md`
now describe `ops-hub` as the Operations / System 1 Inter-Hub extension.
## Verify Local Developer Workflow
```task
id: OPS-WP-0001-T02
status: todo
status: done
priority: high
```
@@ -38,11 +42,16 @@ Identify the repo's install, test, lint, build, and run commands. Add or refine
those commands in the agent instructions so future coding sessions can verify
changes confidently.
Completed 2026-06-06: the repo currently has no executable source tree,
dependency manifest, test suite, or build command. `AGENTS.md` records
documentation/workplan verification commands and requires future source work to
add repo-native lint, test, build, and run commands.
## Seed First Real Workplan
```task
id: OPS-WP-0001-T03
status: todo
status: done
priority: medium
```
@@ -52,3 +61,6 @@ next change. After workplan file updates, run from `~/state-hub`:
```bash
make fix-consistency REPO=ops-hub
```
Completed 2026-06-06: seeded
`workplans/OPS-WP-0002-interhub-extension-bootstrap.md`.

View File

@@ -0,0 +1,176 @@
---
id: OPS-WP-0002
type: workplan
title: "Bootstrap ops-hub as an Inter-Hub Operations extension"
domain: infotech
repo: ops-hub
status: active
owner: codex
topic_slug: inter_hub
created: "2026-06-06"
updated: "2026-06-06"
---
# Bootstrap ops-hub as an Inter-Hub Operations extension
## Goal
Turn the HelixForge `HF-WP-0001` handoff into the first concrete `ops-hub`
implementation track.
`ops-hub` should become the Operations / System 1 Inter-Hub extension for
operational truth: environments, hosts, clusters, services, endpoints,
releases, backups, incidents, risks, runbooks, readiness gates, and migration
waves.
This repo owns domain-specific implementation assets. `inter-hub` remains the
generic framework, registry, authentication, manifest, widget, event, and
bootstrap API substrate.
## Current Gate
As of 2026-06-06, public production Inter-Hub still returns `404` for:
```text
https://hub.coulomb.social/api/v2/hubs
```
Do not run manual database seeding unless the operator explicitly chooses that
fallback. The preferred bootstrap path is the supported Inter-Hub API once
production exposes the current bootstrap surface.
Gate criteria:
- Unauthenticated `GET /api/v2/hubs` returns `401`, not `404`.
- OpenAPI lists `/hubs`, `/hub-capability-manifests`, `/api-consumers`, and
`/policy-scopes`.
- The bootstrap/smoke client can create or reuse the `ops-hub` hub, activate
its manifest, create the runtime API consumer/key, seed initial widgets, and
persist the first governed ops event.
## Handoff Sources
- `/home/worsch/helix-forge/workplans/HF-WP-0001-establish-ops-hub-first-extension.md`
- `/home/worsch/helix-forge/wiki/OpsHubInventory.md`
- `/home/worsch/helix-forge/wiki/OpsHubReadinessGates.md`
- `/home/worsch/helix-forge/wiki/OpsHubBootstrapRunbook.md`
- `/home/worsch/helix-forge/wiki/ops-hub-manifest.draft.json`
- `/home/worsch/helix-forge/wiki/ops-hub-widgets.seed.json`
## Port HelixForge Handoff Artifacts
```task
id: OPS-WP-0002-T01
status: done
priority: high
```
Create repo-local docs and seed data for the ops vocabulary, initial inventory,
readiness gates, bootstrap runbook, manifest draft, and widget seed.
Done when the `ops-hub` repo can be understood without opening HelixForge for
routine implementation details. Keep links back to HelixForge for architectural
context.
Completed 2026-06-06:
- Ported initial inventory to `docs/initial-inventory.md`.
- Ported readiness gates to `docs/readiness-gates.md`.
- Ported bootstrap runbook to `docs/bootstrap-runbook.md`.
- Ported manifest and widget seeds to `seeds/`.
- Added `docs/README.md` as the handoff index.
## Define Repository Source Layout
```task
id: OPS-WP-0002-T02
status: done
priority: high
```
Choose and create the first source layout for bootstrap/smoke tooling,
collectors, adapters, and tests. Add the repo-native lint, test, build, and run
commands to `AGENTS.md`.
Done when future code changes have an obvious home and a verification command.
Completed 2026-06-06:
- Added `pyproject.toml`.
- Added Python package layout under `src/ops_hub/`.
- Added operator scripts under `scripts/`.
- Added tests under `tests/`.
- Documented current verification commands in `AGENTS.md`.
## Implement Inter-Hub Production Gate Probe
```task
id: OPS-WP-0002-T03
status: done
priority: high
```
Build a small probe that checks the public Inter-Hub bootstrap API gate:
- `/api/v2/hubs` response is `401` unauthenticated.
- OpenAPI lists the required bootstrap paths.
- The result is machine-readable and suitable for a scheduled ops signal later.
Done when the probe can run locally without secrets and reports the current
gate as pass/fail with clear reasons.
Completed 2026-06-06: `scripts/interhub-gate-probe.py` checks unauthenticated
`/api/v2/hubs` status and required OpenAPI bootstrap paths, emits JSON, and
exits nonzero while the gate is closed.
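The gate decision described above can be sketched as a pure function, separate from any HTTP fetching. This is an illustrative sketch only, not the contents of `scripts/interhub-gate-probe.py`: the function names, the `gate` label, and the assumption that OpenAPI path keys match the listed paths exactly (without an `/api/v2` prefix) are all hypothetical.

```python
import json

# Bootstrap paths the OpenAPI document must list, per the gate criteria.
REQUIRED_PATHS = (
    "/hubs",
    "/hub-capability-manifests",
    "/api-consumers",
    "/policy-scopes",
)


def evaluate_gate(hubs_status: int, openapi_paths: set[str]) -> dict:
    """Decide pass/fail from an already-fetched status code and path set.

    Assumes the caller has fetched the unauthenticated /api/v2/hubs status
    and extracted the path keys from the OpenAPI document.
    """
    reasons = []
    if hubs_status != 401:
        reasons.append(f"GET /api/v2/hubs returned {hubs_status}, expected 401")
    missing = [path for path in REQUIRED_PATHS if path not in openapi_paths]
    if missing:
        reasons.append("OpenAPI is missing paths: " + ", ".join(missing))
    return {
        "gate": "interhub-bootstrap-api",  # hypothetical label
        "passed": not reasons,
        "reasons": reasons,
    }


def exit_code(result: dict) -> int:
    """Nonzero while the gate is closed, so schedulers can react to it."""
    return 0 if result["passed"] else 1


if __name__ == "__main__":
    # The state recorded on 2026-06-06: 404 and no bootstrap paths.
    result = evaluate_gate(404, set())
    print(json.dumps(result, indent=2))
    raise SystemExit(exit_code(result))
```

Keeping the decision free of network calls means it can be unit-tested without secrets or connectivity, which fits the requirement that the probe runs locally and reports clear reasons.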
## Implement Bootstrap Smoke Client
```task
id: OPS-WP-0002-T04
status: wait
priority: high
```
Implement the authenticated bootstrap/smoke client once Inter-Hub production
exposes the supported bootstrap API.
The client should use `IHUB_BASE` and `IHUB_OPERATOR_KEY` and should create or
reuse:
- `ops-hub` hub row
- active capability manifest
- runtime API consumer/key
- initial governed ops widgets
- first `ops-endpoint-verified` event
Done when a dry-run and an attended real run both produce repeatable evidence
without direct DB access.
Waiting on: Inter-Hub production API gate from T03.
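Until the API gate opens, the configuration and dry-run parts of the client can still be sketched. The example below is a hedged outline under stated assumptions: `BootstrapConfig`, `load_config`, and `dry_run_plan` are hypothetical names, and no Inter-Hub request shapes are shown because the supported bootstrap API is not yet available to verify against.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class BootstrapConfig:
    base_url: str
    operator_key: str


def load_config(env=None) -> BootstrapConfig:
    """Read IHUB_BASE and IHUB_OPERATOR_KEY, failing loudly if either is unset."""
    env = os.environ if env is None else env
    missing = [name for name in ("IHUB_BASE", "IHUB_OPERATOR_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return BootstrapConfig(env["IHUB_BASE"].rstrip("/"), env["IHUB_OPERATOR_KEY"])


# Records the client should create or reuse, in the order listed above.
BOOTSTRAP_STEPS = (
    "ops-hub hub row",
    "active capability manifest",
    "runtime API consumer/key",
    "initial governed ops widgets",
    "first ops-endpoint-verified event",
)


def dry_run_plan(config: BootstrapConfig) -> list[str]:
    """Describe each step without sending requests or printing the operator key."""
    return [
        f"[dry-run] would create or reuse {step} via {config.base_url}"
        for step in BOOTSTRAP_STEPS
    ]


if __name__ == "__main__":
    for line in dry_run_plan(load_config()):
        print(line)
```

The dry-run output deliberately omits the operator key so that it can be captured as evidence without leaking a secret.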
## Seed First Operational Signal
```task
id: OPS-WP-0002-T05
status: wait
priority: medium
```
Submit the first governed ops signal for the Gitea registry endpoint once the
manifest, widget, event type, and API key exist.
Initial signal:
```json
{
"eventType": "ops-endpoint-verified",
"endpoint": "https://gitea.coulomb.social/v2/",
"expectedStatus": 401,
"viewContext": "railiance-apps/workplans/RAIL-AP-WP-0001"
}
```
Done when the event is visible in Inter-Hub and traceable to the owning
Railiance workplan.
Waiting on: T04 and an available `ops-hub` runtime API key.
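As a rough sketch of how the initial signal could be assembled before submission, the helper below builds the payload shown above and checks the event type against the names listed under Evidence Shape. The function name and the validation rules are assumptions for illustration; the actual Inter-Hub event schema may require other fields.

```python
import json

# Event names listed under "Evidence Shape" in the readiness gates doc.
KNOWN_EVENT_TYPES = frozenset({
    "ops-readiness-gate-updated",
    "ops-endpoint-verified",
    "ops-backup-verified",
    "ops-restore-tested",
    "ops-risk-raised",
    "ops-migration-gate-passed",
    "ops-migration-gate-failed",
})


def build_endpoint_event(endpoint: str, expected_status: int, view_context: str,
                         event_type: str = "ops-endpoint-verified") -> dict:
    """Build an endpoint signal payload; the checks here are illustrative only."""
    if event_type not in KNOWN_EVENT_TYPES:
        raise ValueError(f"unknown event type: {event_type}")
    if not endpoint.startswith("https://"):
        raise ValueError("endpoint must be an https URL")
    if not 100 <= expected_status <= 599:
        raise ValueError("expectedStatus must be a valid HTTP status code")
    return {
        "eventType": event_type,
        "endpoint": endpoint,
        "expectedStatus": expected_status,
        "viewContext": view_context,
    }


if __name__ == "__main__":
    event = build_endpoint_event(
        "https://gitea.coulomb.social/v2/",
        401,
        "railiance-apps/workplans/RAIL-AP-WP-0001",
    )
    print(json.dumps(event, indent=2))
```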