railiance-cluster/AGENTS.md
tegwick 84c005254d
Regenerate agent instructions: workstream -> workplan terminology
Registration guidance now prescribes file-first + fix-consistency (C-06)
instead of manual create_workplan/create_workstream calls; progress-event
examples use workplan_id; legacy field names annotated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 01:47:45 +02:00


railiance-cluster — Agent Instructions

Repo Identity

Purpose: OAS S2 Cluster Runtime — k3s, Helm, ingress, CNI, operators

Domain: financials
Repo slug: railiance-cluster
Topic ID: ca369340-a64e-442e-98f1-a4fa7dc74a38
Workplan prefix: RAIL-BS-WP-


State Hub Integration

The Custodian State Hub tracks work across all domains. Interact via HTTP REST — there is no MCP server for Codex agents.

| Context | URL |
| --- | --- |
| Local workstation | http://127.0.0.1:8000 |
| Remote via tunnel | http://127.0.0.1:18000 |
| Optional local edge relay | http://127.0.0.1:18080 |

When an operator has enabled the edge relay, set API_BASE to the relay URL. Queueable writes return an explicit queued receipt if the central hub is unreachable. Treat that as pending local evidence, then ask the operator to run statehub outbox status/replay after connectivity returns.
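
The base-URL switch described above can be sketched as follows. This is a minimal sketch, assuming your own scripts read API_BASE from the environment and fall back to the local workstation URL; the helper name hub_url is hypothetical and not part of any State Hub tooling.

```shell
# Default to the local workstation hub unless the operator exported API_BASE
# (for example the edge relay at http://127.0.0.1:18080).
API_BASE="${API_BASE:-http://127.0.0.1:8000}"

# hub_url (hypothetical helper): join API_BASE and an endpoint path
# without producing a double slash.
hub_url() {
  printf '%s/%s\n' "${API_BASE%/}" "${1#/}"
}
```

With this in place, a call such as curl -s "$(hub_url /workplans/)" goes to whichever hub API_BASE points at, so the same commands work with or without the relay.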

Orient at session start

# Offline brief — works without hub connection
cat .custodian-brief.md

# Active workplans for this domain
curl -s "http://127.0.0.1:8000/workplans/?topic_id=ca369340-a64e-442e-98f1-a4fa7dc74a38&status=active" \
  | python3 -m json.tool

# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=railiance-cluster&unread_only=true" \
  | python3 -m json.tool

Mark a message read:

curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'

Log progress (required at session close)

curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "what was done",
    "event_type": "note",
    "author": "codex",
    "workplan_id": "<uuid>",
    "task_id": "<uuid>"
  }'

Omit workplan_id / task_id when not applicable.
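
One way to leave out the optional fields is to build the request body in a small function. The sketch below is illustrative only: json_escape and progress_payload are hypothetical helper names, and the escaping assumes the summary contains no newlines or control characters.

```shell
# Escape backslashes and double quotes so free text is safe inside a JSON string.
# Assumption: the text contains no newlines or control characters.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Build the /progress/ body.
# $1 = summary, $2 = workplan_id (optional), $3 = task_id (optional).
progress_payload() {
  payload=$(printf '{"summary": "%s", "event_type": "note", "author": "codex"' "$(json_escape "$1")")
  if [ -n "${2:-}" ]; then payload="$payload, \"workplan_id\": \"$2\""; fi
  if [ -n "${3:-}" ]; then payload="$payload, \"task_id\": \"$3\""; fi
  printf '%s}\n' "$payload"
}
```

The result can be passed to the curl call shown above with -d "$(progress_payload "what was done")"; the id fields appear in the body only when they are supplied.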

Update task status

curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
  -H "Content-Type: application/json" \
  -d '{"status": "progress"}'
# values: wait | todo | progress | done | cancel
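
Because the hub accepts only the five values listed above, it can help to reject anything else before sending the PATCH. This is a sketch under that assumption; valid_task_status and set_task_status are hypothetical helper names.

```shell
# Succeed only for the status values listed in this document.
valid_task_status() {
  case "$1" in
    wait|todo|progress|done|cancel) return 0 ;;
    *) return 1 ;;
  esac
}

# $1 = task id, $2 = new status. Refuses to call the hub with an unknown status.
set_task_status() {
  if ! valid_task_status "$2"; then
    echo "invalid status: $2 (use wait|todo|progress|done|cancel)" >&2
    return 2
  fi
  curl -s -X PATCH "${API_BASE:-http://127.0.0.1:8000}/tasks/$1" \
    -H "Content-Type: application/json" \
    -d "{\"status\": \"$2\"}"
}
```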

Flag a task for human review

curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
  -H "Content-Type: application/json" \
  -d '{"needs_human": true, "intervention_note": "reason"}'

Session Protocol

Start:

  1. cat .custodian-brief.md — domain goal and open workplans (offline-safe)
  2. Check inbox: GET /messages/?to_agent=railiance-cluster&unread_only=true; mark read
  3. Scan workplans: ls workplans/ and note which files have status ready, active, or blocked, plus their open tasks
  4. Check human-needed tasks: GET /tasks/?needs_human=true
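
The workplan scan in step 3 can be sketched as a shell function that reads only the frontmatter of each file. It assumes the frontmatter is the first block delimited by --- lines and that status is written as a bare word; scan_workplans is a hypothetical name.

```shell
# Print "<status><TAB><file>" for every workplan whose frontmatter status
# is ready, active, or blocked. $1 = workplans directory (default: workplans).
scan_workplans() {
  dir="${1:-workplans}"
  for f in "$dir"/*.md; do
    [ -f "$f" ] || continue
    # Look for "status:" only between the first and second --- line.
    status=$(awk '
      /^---[ \t]*$/ { n++; next }
      n == 1 && /^status:/ { sub(/^status:[ \t]*/, ""); print; exit }
    ' "$f")
    case "$status" in
      ready|active|blocked) printf '%s\t%s\n' "$status" "$f" ;;
    esac
  done
}
```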

During work:

  • Update task statuses in workplan files as tasks progress
  • Record significant decisions via POST /decisions/

Close:

  1. Update workplan file task statuses to reflect progress
  2. Log: POST /progress/ with a summary of what changed
  3. After workplan file changes, run:
    statehub fix-consistency
    
    Coding agents should run this directly; ask the operator only if the CLI or State Hub API is unavailable. This syncs task status from files into the hub DB.

Credential and access routing

Audience: Codex, Claude Code, Grok, and custodian agents that call llm-connect for inference. Run this check before requesting secrets, API keys, SSH access, login tokens, or database passwords — in any repo, not only ops-warden.

ops-warden issues SSH certificates only (warden sign, cert_command). Every other credential need belongs to another subsystem. Do not message ops-warden on State Hub expecting a secret value; the reply is a pointer, not a key.

Lookup (do this first)

warden route find "<describe your need>" --json
warden route show <catalog-id> --json

Requires the warden CLI from ~/ops-warden (uv tool install . or uv run warden).

| Agent runtime | How to orient |
| --- | --- |
| Codex / Grok (shell, HTTP State Hub) | warden route commands above; inbox to_agent=railiance-cluster is for coordination, not secret vending |
| Claude Code (MCP when available) | get_domain_summary("custodian") for workplans; still use warden route for credential ownership |
| llm-connect (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by warden route |

Quick routing table

| I need… | Owner | ops-warden executes? |
| --- | --- | --- |
| SSH cert (adm/agt/atm) | ops-warden | Yes: warden sign |
| API key, DB password, provider token | OpenBao (railiance-platform) | No: route only |
| Login / OIDC / MFA | key-cape / Keycloak | No: route only |
| Authorization decision | flex-auth | No: route only |
| activity-core → issue-core emission | activity-core + issue-core | No: warden route show activity-core-issue-sink |
| SSH tunnel | ops-bridge (+ cert_command from warden) | No: route only |

Anti-patterns (do not do these)

  • POST /messages/ to ops-warden asking for ISSUE_CORE_API_KEY, OPENROUTER_API_KEY, etc.
  • Inventing warden secret, warden login, warden bao, warden tunnel — they do not exist
  • Pasting secrets into Git, State Hub, workplans, logs, or chat
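
A wrapper script can catch the second anti-pattern before it reaches the CLI. The sketch below blocks only the four subcommands this document says do not exist; warden_guard is a hypothetical helper, not part of ops-warden, and it makes no claim about which other subcommands are valid.

```shell
# Return non-zero for subcommands that this document lists as nonexistent.
# $1 = the warden subcommand an agent is about to run.
warden_guard() {
  case "$1" in
    secret|login|bao|tunnel)
      echo "warden $1 does not exist; look up the owner with: warden route find \"<need>\" --json" >&2
      return 1
      ;;
    *)
      return 0
      ;;
  esac
}
```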

Other capabilities (reuse-surface)

Non-credential capabilities are usually discovered through reuse-surface federation (reuse-surface registry / capability.* indexes). Credential routing is inlined in every repo's agent instructions because it is high-frequency, high-risk, and easy to get wrong.

Canon: ~/ops-warden/wiki/CredentialRouting.md · catalog ~/ops-warden/registry/routing/catalog.yaml


Workplan Convention (ADR-001)

Work items originate as files in this repo — not in the hub. The hub is a read/cache/index layer that rebuilds from files.

File location: workplans/RAILIANCE-WP-NNNN-<slug>.md

Archived location: finished workplans may move to workplans/archived/YYMMDD-RAILIANCE-WP-NNNN-<slug>.md. The YYMMDD prefix is the completion/archive date; the frontmatter id does not change.
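
The archived name can be derived from the original path. This is a small sketch, assuming the archive directory sits directly under the workplan's own directory as described above; archived_path is a hypothetical helper name.

```shell
# $1 = current workplan path, $2 = archive date as YYMMDD (default: today).
archived_path() {
  stamp="${2:-$(date +%y%m%d)}"
  printf '%s/archived/%s-%s\n' "$(dirname "$1")" "$stamp" "$(basename "$1")"
}
```

For example, git mv "$f" "$(archived_path "$f")" would move a finished workplan while keeping its id in the file name.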

Ad Hoc Tasks: small opportunistic fixes discovered during a session use workplans/ADHOC-YYYY-MM-DD.md with task ids ADHOC-YYYY-MM-DD-T01, etc. Use this only for low-risk work completed directly; create a normal workplan for anything needing analysis, design, approval, dependencies, or multiple phases.
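
The ad hoc naming scheme can be generated rather than typed by hand. The helper names below are hypothetical, and the task number should be passed as a plain decimal integer without a leading zero.

```shell
# $1 = date as YYYY-MM-DD (default: today).
adhoc_file() {
  printf 'workplans/ADHOC-%s.md\n' "${1:-$(date +%Y-%m-%d)}"
}

# $1 = task number (plain integer), $2 = date as YYYY-MM-DD (default: today).
adhoc_task_id() {
  printf 'ADHOC-%s-T%02d\n' "${2:-$(date +%Y-%m-%d)}" "$1"
}
```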

Frontmatter:

---
id: RAILIANCE-WP-NNNN
type: workplan
title: "..."
domain: financials
repo: railiance-cluster
status: proposed | ready | active | blocked | backlog | finished | archived
owner: codex
topic_slug: ...
created: "YYYY-MM-DD"
updated: "YYYY-MM-DD"
state_hub_workstream_id: "<uuid>"   # written by fix-consistency — do not edit
---

Use proposed for a new draft, ready after review against current repo state, and finished after implementation. stalled and needs_review are derived health labels, not frontmatter statuses.

Task block format (one per ## section):

## Task Title

```task
id: RAILIANCE-WP-NNNN-T01
status: wait | todo | progress | done | cancel
priority: high | medium | low
state_hub_task_id: "<uuid>"         # written by fix-consistency — do not edit
```

Task description text.

Status progression: todo → progress → done; use wait for waiting/blocked work and cancel for stopped work.
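
To review task statuses before closing a session, the task blocks can be listed with awk. This is a sketch only: it assumes each block opens with a line of three backticks followed by task and closes with a line of three backticks, and list_tasks is a hypothetical name, not the parser that fix-consistency uses.

```shell
# Print "<task id><TAB><status>" for each task block in a workplan file ($1).
list_tasks() {
  fence=$(printf '\140\140\140')   # three backticks, spelled in octal
  awk -v fence="$fence" '
    $0 == fence "task"      { in_task = 1; id = ""; st = ""; next }
    in_task && $0 == fence  { print id "\t" st; in_task = 0; next }
    in_task && /^id:/       { sub(/^id:[ \t]*/, ""); id = $0 }
    in_task && /^status:/   { sub(/^status:[ \t]*/, ""); st = $0 }
  ' "$1"
}
```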

To create a new workplan:

  1. Write the file following the format above
  2. Notify the custodian operator to run make fix-consistency REPO=railiance-cluster (or send a message to the hub agent via POST /messages/)
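
Step 1 can start from a generated skeleton. The sketch below prints frontmatter for a new draft with status proposed and leaves out state_hub_workstream_id, since that field is written by fix-consistency; new_workplan_frontmatter is a hypothetical helper and the argument values in the usage note are placeholders.

```shell
# $1 = workplan id, $2 = title, $3 = topic_slug, $4 = date as YYYY-MM-DD (default: today).
new_workplan_frontmatter() {
  today="${4:-$(date +%Y-%m-%d)}"
  cat <<EOF
---
id: $1
type: workplan
title: "$2"
domain: financials
repo: railiance-cluster
status: proposed
owner: codex
topic_slug: $3
created: "$today"
updated: "$today"
---
EOF
}
```

Redirect the output into the new file under workplans/, for example new_workplan_frontmatter RAILIANCE-WP-NNNN "Title" some-slug > workplans/RAILIANCE-WP-NNNN-<slug>.md, then add the task sections below it.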