# railiance-cluster: Agent Instructions

## Repo Identity

**Purpose:** OAS S2 Cluster Runtime (k3s, Helm, ingress, CNI, operators)
**Domain:** financials
**Repo slug:** railiance-cluster
**Topic ID:** `ca369340-a64e-442e-98f1-a4fa7dc74a38`
**Workplan prefix:** `RAIL-BS-WP-`

---

## State Hub Integration

The Custodian State Hub tracks work across all domains. Interact via HTTP REST; there is no MCP server for Codex agents.

| Context | URL |
|---------|-----|
| Local workstation | `http://127.0.0.1:8000` |
| Remote via tunnel | `http://127.0.0.1:18000` |
| Optional local edge relay | `http://127.0.0.1:18080` |

When an operator has enabled the edge relay, set `API_BASE` to the relay URL. Queueable writes return an explicit queued receipt if the central hub is unreachable. Treat that as pending local evidence, then ask the operator to run statehub outbox status/replay after connectivity returns.

### Orient at session start

```bash
# Offline brief; works without hub connection
cat .custodian-brief.md

# Active workplans for this domain
curl -s "http://127.0.0.1:8000/workplans/?topic_id=ca369340-a64e-442e-98f1-a4fa7dc74a38&status=active" \
  | python3 -m json.tool

# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=railiance-cluster&unread_only=true" \
  | python3 -m json.tool
```

Mark a message read:

```bash
curl -s -X PATCH "http://127.0.0.1:8000/messages//read" \
  -H "Content-Type: application/json" -d '{}'
```

### Log progress (required at session close)

```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "what was done",
    "event_type": "note",
    "author": "codex",
    "workplan_id": "",
    "task_id": ""
  }'
```

Omit `workplan_id` / `task_id` when not applicable.

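
The progress call can be wrapped so that empty optional IDs are dropped and the base URL follows `API_BASE`. This is a minimal sketch under stated assumptions, not part of the State Hub tooling: the function names `hub_url`, `json_escape`, and `progress_payload` are hypothetical, `API_BASE` is assumed to be an environment variable, and the escaping only covers backslashes and double quotes.

```bash
# Hypothetical helpers for illustration; not part of the statehub CLI.

# Base URL: defaults to the local workstation address; override for tunnel or relay.
API_BASE="${API_BASE:-http://127.0.0.1:8000}"

# Join API_BASE and a path without doubling the slash.
hub_url() {
  printf '%s/%s\n' "${API_BASE%/}" "${1#/}"
}

# Escape backslashes and double quotes for a JSON string.
# Assumption: the value contains no newlines or control characters.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the POST /progress/ body; empty workplan_id / task_id are left out.
progress_payload() {
  local body
  body="\"summary\": \"$(json_escape "$1")\", \"event_type\": \"note\", \"author\": \"codex\""
  if [ -n "${2:-}" ]; then
    body="$body, \"workplan_id\": \"$(json_escape "$2")\""
  fi
  if [ -n "${3:-}" ]; then
    body="$body, \"task_id\": \"$(json_escape "$3")\""
  fi
  printf '{%s}\n' "$body"
}

# Usage (needs a reachable hub):
#   curl -s -X POST "$(hub_url /progress/)" \
#     -H "Content-Type: application/json" \
#     -d "$(progress_payload "what was done")"
```

Building the body in one place keeps the "omit when not applicable" rule out of each call site; a JSON-aware tool would be a safer choice for summaries with unusual characters.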

### Update task status

```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/" \
  -H "Content-Type: application/json" \
  -d '{"status": "progress"}'
# values: wait | todo | progress | done | cancel
```

### Flag a task for human review

```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/" \
  -H "Content-Type: application/json" \
  -d '{"needs_human": true, "intervention_note": "reason"}'
```

---

## Session Protocol

**Start:**
1. `cat .custodian-brief.md`: domain goal and open workplans (offline-safe)
2. Check inbox: `GET /messages/?to_agent=railiance-cluster&unread_only=true`; mark read
3. Scan workplans: `ls workplans/`; note `status: ready`, `active`, or `blocked` files and open tasks
4. Check human-needed tasks: `GET /tasks/?needs_human=true`

**During work:**
- Update task statuses in workplan files as tasks progress
- Record significant decisions via `POST /decisions/`

**Close:**
1. Update workplan file task statuses to reflect progress
2. Log: `POST /progress/` with a summary of what changed
3. After workplan file changes, run:
   ```bash
   statehub fix-consistency
   ```
   Coding agents should run this directly; ask the operator only if the CLI or State Hub API is unavailable. This syncs task status from files into the hub DB.

---

## Credential and access routing

**Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect** for inference.

Run this check **before** requesting secrets, API keys, SSH access, login tokens, or database passwords (in any repo, not only `ops-warden`).

ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every other credential need belongs to another subsystem. **Do not** message `ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.

### Lookup (do this first)

```bash
warden route find "" --json
warden route show --json
```

Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).

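
Because the `warden` CLI may be installed as a tool or only runnable through `uv`, a session can resolve the invocation once before the lookup. This is a minimal sketch: the wrapper name `warden_run` is hypothetical, and it assumes that `uv run warden` has to be executed from inside `~/ops-warden`.

```bash
# Hypothetical wrapper: run warden from PATH if installed, else via uv in ~/ops-warden.
warden_run() {
  if command -v warden >/dev/null 2>&1; then
    warden "$@"
  elif command -v uv >/dev/null 2>&1 && [ -d "$HOME/ops-warden" ]; then
    # Assumption: the project checkout lives at ~/ops-warden.
    (cd "$HOME/ops-warden" && uv run warden "$@")
  else
    echo "warden CLI unavailable; install it from ~/ops-warden" >&2
    return 127
  fi
}

# Usage: warden_run route show activity-core-issue-sink --json
```

If the wrapper returns 127, that is the point to ask the operator rather than look for the credential elsewhere.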

| Agent runtime | How to orient |
| --- | --- |
| **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=railiance-cluster` is for coordination, not secret vending |
| **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workplans; **still** use `warden route` for credential ownership |
| **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |

### Quick routing table

| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes**: `warden sign` |
| API key, DB password, provider token | OpenBao (`railiance-platform`) | No, route only |
| Login / OIDC / MFA | key-cape / Keycloak | No, route only |
| Authorization decision | flex-auth | No, route only |
| activity-core → issue-core emission | activity-core + issue-core | No: `warden route show activity-core-issue-sink` |
| SSH tunnel | ops-bridge (+ `cert_command` from warden) | No, route only |

### Anti-patterns (do not do these)

- `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
- Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel`; they do not exist
- Pasting secrets into Git, State Hub, workplans, logs, or chat

### Other capabilities (reuse-surface)

Non-credential capabilities are usually discovered through **reuse-surface** federation (`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in every repo's agent instructions because it is high-frequency, high-risk, and easy to get wrong.

**Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`

---

## Workplan Convention (ADR-001)

Work items originate as files in this repo, not in the hub. The hub is a read/cache/index layer that rebuilds from files.


**File location:** `workplans/RAILIANCE-WP-NNNN-.md`

**Archived location:** finished workplans may move to `workplans/archived/YYMMDD-RAILIANCE-WP-NNNN-.md`. The `YYMMDD` prefix is the completion/archive date; the frontmatter `id` does not change.

**Ad Hoc Tasks:** small opportunistic fixes discovered during a session use `workplans/ADHOC-YYYY-MM-DD.md` with task ids `ADHOC-YYYY-MM-DD-T01`, etc. Use this only for low-risk work completed directly; create a normal workplan for anything needing analysis, design, approval, dependencies, or multiple phases.

**Frontmatter:**

```yaml
---
id: RAILIANCE-WP-NNNN
type: workplan
title: "..."
domain: financials
repo: railiance-cluster
status: proposed | ready | active | blocked | backlog | finished | archived
owner: codex
topic_slug: ...
created: "YYYY-MM-DD"
updated: "YYYY-MM-DD"
state_hub_workstream_id: ""   # written by fix-consistency; do not edit
---
```

Use `proposed` for a new draft, `ready` after review against current repo state, and `finished` after implementation. `stalled` and `needs_review` are derived health labels, not frontmatter statuses.

**Task block format** (one per `##` section):

```
## Task Title

` ` `task
id: RAILIANCE-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: ""   # written by fix-consistency; do not edit
` ` `

Task description text.
```

Status progression: `todo` → `progress` → `done`; use `wait` for waiting/blocked work and `cancel` for stopped work.

To create a new workplan:
1. Write the file following the format above
2. Notify the custodian operator to run `make fix-consistency REPO=railiance-cluster` (or send a message to the hub agent via `POST /messages/`)
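
Session-start step 3 (scan workplans and note `ready`, `active`, or `blocked` files) can be scripted against the frontmatter format above. This is a minimal sketch, assuming each workplan file opens with a `---`-delimited frontmatter block that holds a single `status:` value; the function names `workplan_status` and `scan_workplans` are hypothetical and not part of the statehub CLI.

```bash
# Hypothetical helper: print the frontmatter status of one workplan file.
# Only lines between the first and second '---' are considered, so
# 'status:' lines inside task blocks are ignored.
workplan_status() {
  awk '
    /^---[ \t]*$/ { fences++; if (fences == 2) exit; next }
    fences == 1 && /^status:/ {
      sub(/^status:[ \t]*/, "")
      sub(/[ \t]+$/, "")
      print
      exit
    }
  ' "$1"
}

# List workplans whose status is ready, active, or blocked as "status<TAB>path".
scan_workplans() {
  local dir f s
  dir="${1:-workplans}"
  for f in "$dir"/*.md; do
    [ -e "$f" ] || continue
    s="$(workplan_status "$f")"
    case "$s" in
      ready|active|blocked) printf '%s\t%s\n' "$s" "$f" ;;
    esac
  done
}

# Usage: scan_workplans workplans
```

The scan reads files only, which matches the convention that workplan files, not the hub, are the source of truth.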