Refresh agent instruction files
Some checks failed
railiance-tests / smoke (push) Has been cancelled

2026-05-18 16:55:49 +02:00
parent 8aeef483b7
commit 0c38343fc9
10 changed files with 381 additions and 175 deletions

20
.claude/rules/agents.md Normal file

@@ -0,0 +1,20 @@
## Kaizen Agents
Specialized agent personas available on demand via the state-hub MCP.
**Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
**Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them
Common agents:
| Agent | Category | When to use |
|-------|----------|-------------|
| `tdd-workflow` | testing | Step-by-step TDD8 workflow for any feature |
| `code-refactoring` | quality | Code quality analysis and safe refactoring |
| `test-maintenance` | testing | Diagnose and fix failing tests |
| `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
| `keepaTodofile` | process | Maintain TODO.md during work |
| `project-management` | process | Track status, determine next steps |
| `datamodel-optimization` | quality | Optimize dataclasses and data structures |
All 17 agents: call `list_kaizen_agents()` for the full list.

@@ -0,0 +1,8 @@
## Architecture
<!-- TODO: Describe the key design decisions and component structure.
Key modules, data flows, external integrations, state machines, etc. -->
## Quick Reference
`~/state-hub/mcp_server/TOOLS.md` — MCP tool reference

@@ -0,0 +1,38 @@
## First Session Protocol
Triggered when `get_domain_summary("railiance")` shows **no workstreams**.
The project is registered but work has not yet been structured.
**Step 1 — Read, don't write**
- `~/the-custodian/canon/projects/railiance/project_charter_v0.1.md` — purpose, scope
- `~/the-custodian/canon/projects/railiance/roadmap_v0.1.md` — planned phases
- Scan repo root: README, directory structure, existing code or docs
**Step 2 — Survey in-progress work**
Look for TODOs, open branches, half-finished files. Note done vs. started but incomplete.
**Step 3 — Propose workstreams to Bernd**
Propose 1-3 workstreams, each a coherent strand of weeks to months, anchored to a
roadmap phase. **Wait for approval before creating.**
**Step 4 — Create workplan file first, then DB record (ADR-001)**
```
workplans/railiance-cluster-WP-NNNN-<slug>.md ← write this first
```
Then register in the hub:
```
create_workstream(topic_id="ca369340-a64e-442e-98f1-a4fa7dc74a38", title="...", owner="...", description="...")
create_task(workstream_id="<id>", title="...", priority="high|medium|low")
```
**Step 5 — Record the setup**
```
add_progress_event(
summary="First session: structured railiance into N workstreams, M tasks",
event_type="milestone",
topic_id="ca369340-a64e-442e-98f1-a4fa7dc74a38",
detail={"workstreams": [...], "tasks_created": M}
)
```
<!-- Delete or archive this file once past first session -->

@@ -0,0 +1,8 @@
## Repo boundary
This repo owns **railiance-cluster** only. It does not own:
<!-- TODO: List what belongs in adjacent repos, e.g.:
- SSH key management → railiance-infra/
- State hub code → state-hub/
-->

@@ -0,0 +1,5 @@
**Purpose:** OAS S2 Cluster Runtime — k3s, Helm, ingress, CNI, operators
**Domain:** railiance
**Repo slug:** railiance-cluster
**Topic ID:** ca369340-a64e-442e-98f1-a4fa7dc74a38

@@ -0,0 +1,84 @@
## Session Protocol
State Hub: http://127.0.0.1:8000
**Step 1 — Orient**
Read the offline-safe brief first — it works without a live hub connection:
```bash
cat .custodian-brief.md
```
Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
```
get_domain_summary("railiance")
```
If MCP tools are unavailable in the current agent session, use the REST API:
```bash
curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
```
If the hub is offline: `cd ~/state-hub && make api`
**Step 2 — Check inbox**
With MCP tools:
```
get_messages(to_agent="railiance-cluster", unread_only=True)
```
Mark read with `mark_message_read(message_id)`. Reply or act on coordination
requests before proceeding.
Without MCP tools:
```bash
curl -s "http://127.0.0.1:8000/messages/?to_agent=railiance-cluster&unread_only=true" \
| python3 -m json.tool
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
-H "Content-Type: application/json" -d '{}'
```
**Step 3 — Scan workplans**
```bash
ls workplans/
```
For each file with `status: ready`, `active`, or `blocked`, note pending
`todo`/`in_progress` tasks.
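The scan in Step 3 could be automated along these lines. This is a minimal sketch, assuming each workplan starts with a `---` frontmatter block containing a single `status:` value and keeps task statuses inside fenced `task` blocks; the names `pending_tasks` and `scan_workplans` are hypothetical and not part of the hub tooling.

```python
import re
from pathlib import Path

OPEN_WORKPLAN_STATUSES = {"ready", "active", "blocked"}
PENDING_TASK_STATUSES = {"todo", "in_progress"}

FENCE = "`" * 3  # built indirectly so this example nests safely in markdown

FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---", re.DOTALL)
TASK_BLOCK_RE = re.compile(
    re.escape(FENCE) + r"task\n(.*?)" + re.escape(FENCE), re.DOTALL
)
STATUS_RE = re.compile(r"^status:\s*(\S+)", re.MULTILINE)


def pending_tasks(text):
    """Return (workplan_status, pending_task_count) for one workplan's text."""
    status = None
    frontmatter = FRONTMATTER_RE.search(text)
    if frontmatter:
        match = STATUS_RE.search(frontmatter.group(1))
        if match:
            status = match.group(1).strip("\"'")
    count = 0
    for block in TASK_BLOCK_RE.findall(text):
        match = STATUS_RE.search(block)
        if match and match.group(1) in PENDING_TASK_STATUSES:
            count += 1
    return status, count


def scan_workplans(root="workplans"):
    """Map each ready/active/blocked workplan file name to its pending task count."""
    result = {}
    for path in sorted(Path(root).glob("*.md")):
        status, count = pending_tasks(path.read_text(encoding="utf-8"))
        if status in OPEN_WORKPLAN_STATUSES:
            result[path.name] = count
    return result
```

Calling `scan_workplans()` from the repo root would then list only the workplans that belong in the brief; files under `workplans/archived/` are skipped because the glob is not recursive.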
**Step 4 — Present brief**
1. **Active workstreams** for `railiance` — title, task counts, blocking decisions
2. **Pending tasks** from `workplans/` + any `[repo:railiance-cluster]` hub tasks
3. **Goal guidance** — if `goal_guidance` in summary:
- `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
- `alignment_warnings`: flag if active work is not aligned with current goal
4. **Suggested next action** — highest-priority open item
5. **SBOM status** — flag if `last_sbom_at` is unset for this repo
If no workstreams: follow First Session Protocol (`first-session.md`).
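The goal-guidance and SBOM items of the brief could be derived roughly as follows. This is only a sketch: the keys `goal_guidance`, `needs_workplan`, and `alignment_warnings` come from the protocol above, but the shape of each entry (a dict with a `title` field, warnings as plain values) and the function name `brief_actions` are assumptions, as is passing `last_sbom_at` in separately.

```python
def brief_actions(summary, last_sbom_at=None):
    """Return brief lines with goal guidance first (assumed summary shape)."""
    lines = []
    guidance = summary.get("goal_guidance") or {}
    # Goals without a workplan are surfaced as the top action.
    for goal in guidance.get("needs_workplan", []):
        lines.append(f"Repo goal '{goal['title']}' has no workplan yet")
    # Alignment warnings are flagged but do not block work.
    for warning in guidance.get("alignment_warnings", []):
        lines.append(f"Alignment warning: {warning}")
    # An unset SBOM timestamp is flagged last.
    if last_sbom_at is None:
        lines.append("SBOM status: last_sbom_at is unset for this repo")
    return lines
```

Keeping the ordering in one function makes it easy to check that `needs_workplan` entries always appear before any other suggestion.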
**During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`
> State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
> are First Session Protocol only. Work structure belongs in repo files (ADR-001).
**Session close:**
With MCP tools:
```
add_progress_event(summary="...", topic_id="ca369340-a64e-442e-98f1-a4fa7dc74a38", workstream_id="<uuid>")
```
Without MCP tools:
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
-H "Content-Type: application/json" \
-d '{"topic_id":"ca369340-a64e-442e-98f1-a4fa7dc74a38","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
```
If workplan files were modified, ensure the local copy is up to date first:
```bash
git -C <repo_path> pull --ff-only
cd ~/state-hub && make fix-consistency REPO=railiance-cluster
```
For repos where implementation runs on a remote machine (e.g. CoulombCore),
use the combined target which pulls before fixing:
```bash
cd ~/state-hub && make fix-consistency-remote REPO=railiance-cluster
```
**C-15** (DB task ahead of file) is normal in multi-machine workflows — writeback
will sync the file to match DB. **C-16** (repo behind remote) blocks all writes
until you pull — intentional to prevent clobbering remote progress.

@@ -0,0 +1,19 @@
## Stack
<!-- TODO: Fill in language, frameworks, and key dependencies -->
- **Language:**
- **Key deps:**
## Dev Commands
```bash
# TODO: Fill in the standard commands for this repo
# Install dependencies
# Run tests
# Lint / type check
# Build / package (if applicable)
```

@@ -0,0 +1,28 @@
## Workplan Convention (ADR-001)
File location: `workplans/railiance-cluster-WP-NNNN-<slug>.md`
ID prefix: `RAILIANCE-WP`
Work items originate as files in this repo **before** being registered in the hub.
Canonical workplan/workstream frontmatter statuses are:
`proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
Use `proposed` for a newly drafted plan, `ready` after review against current
repo state, and `finished` when implementation is complete. `stalled` and
`needs_review` are derived health labels, not stored statuses.
Closed workplans may be moved to `workplans/archived/` with a completion-date
prefix: `YYMMDD-railiance-cluster-WP-NNNN-<slug>.md`. The frontmatter id remains
unchanged; the prefix is only for quick visual reference.
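As an illustration of the archive naming rule, the target path can be computed from the original file name and the completion date. This is a sketch under the convention stated above; the helper name `archived_path` is hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath


def archived_path(workplan_path, completed):
    """Return the archive location for a closed workplan file."""
    name = PurePosixPath(workplan_path).name
    prefix = completed.strftime("%y%m%d")  # YYMMDD completion-date prefix
    return str(PurePosixPath("workplans") / "archived" / f"{prefix}-{name}")
```

Only the file name changes; the frontmatter `id` inside the file is left untouched.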
Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
`workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
`ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
directly. Promote anything requiring analysis, design, approval, dependencies, or
multiple planned phases into a normal workplan.
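The Ad Hoc naming scheme can be generated mechanically from a date. The following is a small sketch of that scheme; the function name `adhoc_names` and the returned dictionary layout are illustrative assumptions.

```python
from datetime import date


def adhoc_names(day, task_count):
    """Build the file name, workstream slug, and task ids for one ad hoc day."""
    stamp = day.isoformat()  # YYYY-MM-DD
    return {
        "file": f"workplans/ADHOC-{stamp}.md",
        "workstream_slug": f"adhoc-{stamp}",
        "task_ids": [f"ADHOC-{stamp}-T{n:02d}" for n in range(1, task_count + 1)],
    }
```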
Ecosystem todos from other agents arrive as `[repo:railiance-cluster]` hub tasks —
visible at session start. Pick one up by creating the workplan file, then registering
the workstream.
<!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->

162
AGENTS.md Normal file

@@ -0,0 +1,162 @@
# railiance-cluster — Agent Instructions
## Repo Identity
**Purpose:** OAS S2 Cluster Runtime — k3s, Helm, ingress, CNI, operators
**Domain:** railiance
**Repo slug:** railiance-cluster
**Topic ID:** `ca369340-a64e-442e-98f1-a4fa7dc74a38`
**Workplan prefix:** `RAILIANCE-WP-`
---
## State Hub Integration
The Custodian State Hub tracks work across all domains. Interact via HTTP REST —
there is no MCP server for Codex agents.
| Context | URL |
|---------|-----|
| Local workstation | `http://127.0.0.1:8000` |
| Remote via tunnel | `http://127.0.0.1:18000` |
### Orient at session start
```bash
# Offline brief — works without hub connection
cat .custodian-brief.md
# Active workstreams for this domain
curl -s "http://127.0.0.1:8000/workstreams/?topic_id=ca369340-a64e-442e-98f1-a4fa7dc74a38&status=active" \
| python3 -m json.tool
# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=railiance-cluster&unread_only=true" \
| python3 -m json.tool
```
Mark a message read:
```bash
curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
-H "Content-Type: application/json" -d '{}'
```
### Log progress (required at session close)
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
-H "Content-Type: application/json" \
-d '{
"summary": "what was done",
"event_type": "note",
"author": "codex",
"workstream_id": "<uuid>",
"task_id": "<uuid>"
}'
```
Omit `workstream_id` / `task_id` when not applicable.
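For agents that prefer Python over `curl`, the same request can be built with the standard library. This is a minimal sketch, assuming the `/progress/` endpoint accepts exactly the JSON fields shown in the `curl` example; the helper name `progress_request` is hypothetical, and optional ids are dropped from the payload when they are `None`.

```python
import json
import urllib.request

HUB_URL = "http://127.0.0.1:8000"  # local workstation URL from the table above


def progress_request(
    summary,
    workstream_id=None,
    task_id=None,
    author="codex",
    event_type="note",
    base_url=HUB_URL,
):
    """Build (but do not send) a POST request for a progress event."""
    payload = {
        "summary": summary,
        "event_type": event_type,
        "author": author,
        "workstream_id": workstream_id,
        "task_id": task_id,
    }
    # Omit ids that do not apply instead of sending nulls.
    payload = {key: value for key, value in payload.items() if value is not None}
    return urllib.request.Request(
        f"{base_url}/progress/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen(...)`. Separating construction from sending keeps the payload easy to inspect when the hub is offline.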
### Update task status
```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
-H "Content-Type: application/json" \
-d '{"status": "in_progress"}'
# values: todo | in_progress | done | blocked
```
### Flag a task for human review
```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
-H "Content-Type: application/json" \
-d '{"needs_human": true, "intervention_note": "reason"}'
```
---
## Session Protocol
**Start:**
1. `cat .custodian-brief.md` — domain goal and open workstreams (offline-safe)
2. Check inbox: `GET /messages/?to_agent=railiance-cluster&unread_only=true`; mark read
3. Scan workplans: `ls workplans/` — note `status: ready`, `active`, or `blocked` files and open tasks
4. Check blocked tasks: `GET /tasks/?needs_human=true`
**During work:**
- Update task statuses in workplan files as tasks progress
- Record significant decisions via `POST /decisions/`
**Close:**
1. Update workplan file task statuses to reflect progress
2. Log: `POST /progress/` with a summary of what changed
3. Note for the custodian operator: after workplan file changes, run from
`~/state-hub`:
```bash
make fix-consistency REPO=railiance-cluster
```
This syncs task status from files into the hub DB.
---
## Workplan Convention (ADR-001)
Work items originate as files in this repo — not in the hub. The hub is a
read/cache/index layer that rebuilds from files.
**File location:** `workplans/RAILIANCE-WP-NNNN-<slug>.md`
**Archived location:** finished workplans may move to
`workplans/archived/YYMMDD-RAILIANCE-WP-NNNN-<slug>.md`. The `YYMMDD` prefix is
the completion/archive date; the frontmatter `id` does not change.
**Ad Hoc Tasks:** small opportunistic fixes discovered during a session use
`workplans/ADHOC-YYYY-MM-DD.md` with task ids `ADHOC-YYYY-MM-DD-T01`, etc. Use
this only for low-risk work completed directly; create a normal workplan for
anything needing analysis, design, approval, dependencies, or multiple phases.
**Frontmatter:**
```yaml
---
id: RAILIANCE-WP-NNNN
type: workplan
title: "..."
domain: railiance
repo: railiance-cluster
status: proposed | ready | active | blocked | backlog | finished | archived
owner: codex
topic_slug: ...
created: "YYYY-MM-DD"
updated: "YYYY-MM-DD"
state_hub_workstream_id: "<uuid>" # written by fix-consistency — do not edit
---
```
Use `proposed` for a new draft, `ready` after review against current repo
state, and `finished` after implementation. `stalled` and `needs_review` are
derived health labels, not frontmatter statuses.
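A quick check of a workplan's frontmatter status might look like the sketch below. It assumes simple `key: value` lines between the `---` markers (no nested YAML), so it uses a hand-rolled reader rather than a YAML library; `read_frontmatter` and `check_status` are hypothetical helper names.

```python
CANONICAL_STATUSES = (
    "proposed", "ready", "active", "blocked", "backlog", "finished", "archived",
)
DERIVED_LABELS = ("stalled", "needs_review")


def read_frontmatter(text):
    """Parse flat key/value frontmatter into a dict (assumes no nesting)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            # Drop trailing comments and surrounding quotes.
            fields[key.strip()] = value.split(" #", 1)[0].strip().strip('"')
    return fields


def check_status(value):
    """Classify a status value as 'ok', 'derived', or 'unknown'."""
    if value in CANONICAL_STATUSES:
        return "ok"
    if value in DERIVED_LABELS:
        return "derived"  # health label, must not be stored in frontmatter
    return "unknown"
```

A result of `derived` signals that a health label was written into the file by mistake and should be replaced by one of the canonical statuses.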
**Task block format** (one per `##` section):
```
## Task Title
` ` `task
id: RAILIANCE-WP-NNNN-T01
status: todo | in_progress | done | blocked
priority: high | medium | low
state_hub_task_id: "<uuid>" # written by fix-consistency — do not edit
` ` `
Task description text.
```
Status progression: `todo` → `in_progress` → `done` (or `blocked`)
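Task blocks in this format can be read back with a short parser. The sketch below assumes that real workplan files use an unspaced triple-backtick `task` fence directly under each `##` heading, and that block contents are flat `key: value` lines; `parse_tasks` is a hypothetical helper, not part of `fix-consistency`.

```python
import re

FENCE = "`" * 3  # built indirectly so this example nests safely in markdown

SECTION_RE = re.compile(
    r"^## (?P<title>[^\n]+)\n\s*"
    + re.escape(FENCE)
    + r"task\n(?P<body>.*?)^"
    + re.escape(FENCE),
    re.MULTILINE | re.DOTALL,
)


def parse_tasks(text):
    """Return one dict per task section: title plus the block's key/value fields."""
    tasks = []
    for match in SECTION_RE.finditer(text):
        fields = {"title": match.group("title").strip()}
        for line in match.group("body").splitlines():
            key, sep, value = line.partition(":")
            if sep:
                # Drop trailing comments and surrounding quotes.
                fields[key.strip()] = value.split(" #", 1)[0].strip().strip('"')
        tasks.append(fields)
    return tasks
```

The returned dictionaries make it straightforward to list tasks that are still `todo` or `in_progress` before closing a session.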
To create a new workplan:
1. Write the file following the format above
2. Notify the custodian operator to run `make fix-consistency REPO=railiance-cluster`
(or send a message to the hub agent via `POST /messages/`)

184
CLAUDE.md

@@ -1,177 +1,11 @@
New content (11 lines):
# railiance-cluster — Claude Code Instructions
@SCOPE.md
@.claude/rules/repo-identity.md
@.claude/rules/session-protocol.md
@.claude/rules/first-session.md
@.claude/rules/workplan-convention.md
@.claude/rules/stack-and-commands.md
@.claude/rules/architecture.md
@.claude/rules/repo-boundary.md
@.claude/rules/agents.md

Removed content (177 lines):
# railiance-cluster — Claude Code Instructions
## Custodian State Hub Integration
This project is tracked as the **railiance** domain in the Custodian State Hub.
Hub topic ID: `ca369340-a64e-442e-98f1-a4fa7dc74a38`
The State Hub runs locally at http://127.0.0.1:8000. The MCP server (`state-hub`)
exposes tools for reading and writing state without touching the API directly.
---
### Session Protocol
**On receiving your first message — before writing any response text — execute
this orientation sequence. Do not greet, do not ask what to do first.**
**Step 1 — Call the State Hub**
```
get_state_summary() # orientation: workstreams, decisions, recent progress
get_next_steps() # contextual suggestions from resolved decisions
```
If the call fails, the API is offline: `cd ~/the-custodian/state-hub && make api`
**Step 2 — Scan local workplans**
Read every `.md` file under `workplans/`. Use `Glob(pattern="**/*.md", path="workplans/")`
or Bash `ls workplans/` to discover them. For each file with `status: active`,
extract and note:
- The workplan title and ID
- All tasks whose `status` is `todo` or `in_progress`
**Step 3 — Present orientation to the user**
Output a concise brief covering:
1. **Active workstreams** for the `railiance` domain — title, task counts,
any blocking decisions
2. **Pending tasks for this repo** — from local `workplans/` files (Step 2)
plus any state hub tasks with `[repo:railiance-cluster]` in their title
3. **Goal guidance** — if the summary contains a `goal_guidance` key, act on it:
- **`needs_workplan`** entries: for each active repo goal with no linked workstream,
surface it as the top suggested action — *"Repo goal '{title}' has no workplan yet.
Suggest: create workplans/RAIL-BS-WP-NNNN-<slug>.md and register a workstream
with repo_goal_id='{goal_id}'"*. Treat this as higher priority than continuing
existing work unless Bernd says otherwise.
- **`alignment_warnings`** entries: if active workstreams exist but are not linked
to the current repo goal, name the most recently active one and note:
*"Current work on '{recent_workstream_title}' may not be aligned with the active
goal '{active_goal_title}'. Continue unless you hear otherwise — but flag it."*
4. **Suggested next action** — the highest-priority open item across all sources,
with goal alignment taken into account
5. **SBOM status**: `last_sbom_at` for `railiance-cluster` is currently null
(gap: no lockfile yet — see `workplans/RAIL-BS-WP-0001-dependency-management.md`)
**During work:**
- Use `record_decision()` for any decision that affects direction or dependencies.
- Use `add_progress_event()` for notable events (milestones, blockers, insights).
- Use `resolve_decision()` to close a decision once the choice is made.
> **Design boundary:** The State Hub is a *read model*. Two write operations are
> permanently sanctioned: **Resolving Decisions** and **Suggesting Next Steps**.
> Bootstrap tools are only for First Session Protocol. Work structure belongs
> in the domain repo as files (ADR-001).
**At the end of every session:**
- Call `add_progress_event()` with a summary of what was accomplished.
Include `topic_id: ca369340-a64e-442e-98f1-a4fa7dc74a38` and `workstream_id`.
---
### Known Pending Tasks (as of 2026-03-01)
**RAIL-BS-WP-0001 — Dependency Management** (`workplans/RAIL-BS-WP-0001-dependency-management.md`)
The SBOM scanner finds nothing to ingest because Ansible and control-node pip
dependencies are not declared in any lockfile. This is the top-priority open
task for this repo.
| Task | Priority | Status |
|------|----------|--------|
| T01: Audit control-node pip deps | medium | todo |
| T02: Create pyproject.toml + uv.lock | medium | todo |
| T03: Ingest SBOM into State Hub | medium | todo |
| T04: Create ansible/requirements.yml | low | todo |
State Hub task ID: `5f8cade5-119c-42e8-ba93-e9d0478650e4`
---
### First Session Protocol
Triggered when `get_state_summary()` shows **no workstreams** for `railiance`.
**Step 1** — Read `~/the-custodian/canon/projects/railiance/project_charter_v0.1.md`
and `roadmap_v0.1.md`, then scan this repo root.
**Step 2** — Survey in-progress work (TODOs, open branches, half-finished files).
**Step 3** — Propose 1-3 workstreams. Wait for approval before creating anything.
**Step 4** — Create workplan file first (`workplans/RAIL-WP-NNNN-<slug>.md`),
then register in hub:
```
create_workstream(topic_id="ca369340-a64e-442e-98f1-a4fa7dc74a38", ...)
create_task(workstream_id="<id>", ...)
```
**Step 5** — Record setup with `add_progress_event()`.
---
### Workplan Convention (ADR-001)
Work items originate as files in `workplans/` before being registered in the hub.
When the custodian creates a task for this repo, it places a workplan file here
AND creates a state hub task with `[repo:railiance-cluster]` in the title.
Both appear at session start via the orientation above.
---
### Contribution Tracking
```
contrib/
bug-reports/ # br-YYYY-MM-DD--org--repo--slug.md
feature-requests/ # fr-YYYY-MM-DD--org--repo--slug.md
extension-points/ # EP-RAIL-NNN--org--repo--slug.md
upstream-prs/ # upr-YYYY-MM-DD--org--repo--slug.md
```
Templates: `~/the-custodian/canon/standards/contrib-templates/`
---
### SBOM
After creating and committing the lockfile (see RAIL-BS-WP-0001), ingest:
```bash
cd ~/the-custodian/state-hub
make ingest-sbom REPO=railiance-cluster SCAN=1 REPO_PATH=/home/worsch/railiance-cluster
```
---
### Remote Execution & State Hub Tunnel
This repo is designed to be worked on **from the HostEurope server**. The
State Hub runs on Bernd's local workstation at `127.0.0.1:8000` and is not
publicly reachable — a reverse SSH tunnel must be open before starting a
remote session.
**On your local machine, before SSHing to the server:**
```bash
cd ~/the-custodian/state-hub
make tunnel HOST=tegwick@92.205.130.254
```
Keep that terminal open. Then SSH in normally from a second terminal.
**Verify the tunnel is live from the remote:**
```bash
curl http://127.0.0.1:8000/state/health
# expected: {"status":"ok"}
```
**If the tunnel is not up (degraded mode):**
- The `get_state_summary()` call in Step 1 will fail
- Skip Step 1 — proceed from local workplans only (Step 2)
- Log any progress events manually from the local machine after the session
---
### Quick Reference
`~/the-custodian/state-hub/mcp_server/TOOLS.md` — compact MCP tool reference