Repo hygiene + new workplans (RAIL-BS-WP-0008/0009)

- Add RAIL-BS-WP-0008 (activity-core WP-0016 deploy) and RAIL-BS-WP-0009
  (admin-sync smoke) from inbox asks 87952ff1 / aa8b7986
- Archive finished workplans to workplans/archived/ per ADR-001 convention;
  normalize frontmatter statuses (completed/done -> finished)
- Fill stack-and-commands.md, complete repo-boundary.md, refresh SCOPE
  Current State, add docs/operator-runbook.md for production-touching targets

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 00:02:36 +02:00
parent eefa6c1b2a
commit b3b0c3e3ff
15 changed files with 206 additions and 24 deletions

@@ -0,0 +1,89 @@
---
id: RAIL-BS-WP-0008
type: workplan
title: "activity-core WP-0016 triage-output robustness deploy"
domain: financials
repo: railiance-cluster
status: ready
owner: railiance-cluster
topic_slug: railiance
created: "2026-07-01"
updated: "2026-07-01"
---
# activity-core WP-0016 triage-output robustness deploy
## Context
Inbox message `87952ff1` (activity-core, 2026-06-26): the scheduled daily WSJF
triage run on 2026-06-26 failed schema validation and the whole run was
discarded, resetting the WP-0006-T03 three-clean-run streak. ACTIVITY-WP-0016
hardened the instruction-executor output contract in-repo (commits
`5eb33bd..bf877b7` on activity-core main, 220 tests passed). The remaining
work is operator/cluster-owned on railiance01.
**Deploy coupling constraint:** `schemas/daily-triage-report.json` is now
strict per-item and is consumed by both the llm-connect hint and the
whole-doc validator. It MUST ship together with the new `executor.py`
(T03 per-item quarantine parser). Never deploy the schema ahead of the code.
## Deploy activity-core with coupled schema and executor
```task
id: RAIL-BS-WP-0008-T01
status: todo
priority: high
```
Rebuild/import the activity-core image from main (`bf877b7` or later) into
the railiance01 k3s runtime and reconcile the activity-core deployment so the
new executor and the strict per-item schema ship together.
## Update daily-statehub-wsjf-triage runtime-bundle Instruction
```task
id: RAIL-BS-WP-0008-T02
status: todo
priority: high
```
In the runtime projection (not the activity-core repo), update the
`daily-statehub-wsjf-triage` Instruction:
- raise `max_tokens` (currently ~1200; give clear headroom above the
~1300–1500-token 16-workstream list);
- prompt: bounded top-N (≤7) ranked recommendations, "if uncertain emit fewer
well-formed items rather than more";
- prompt: per-item NDJSON framing (leading summary object, then one
recommendation JSON object per line) so the T03 parser recovers items
independently.
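
A minimal sketch of how per-item NDJSON framing lets a parser recover items independently. This is not the activity-core `executor.py` implementation; the function name `parse_ndjson_report` and the returned dict shape are hypothetical, and the sketch assumes the first parseable line is the summary object.

```python
import json


def parse_ndjson_report(raw: str) -> dict:
    """Parse a summary line followed by one recommendation object per line.

    Hypothetical helper: lines that are not valid JSON objects are
    quarantined (counted) instead of failing the whole report.
    """
    summary = None
    items = []
    quarantined = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between objects
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            quarantined.append(line)  # e.g. a line truncated by max_tokens
            continue
        if not isinstance(obj, dict):
            quarantined.append(line)
            continue
        if summary is None:
            summary = obj  # assumption: first valid object is the summary
        else:
            items.append(obj)
    return {
        "summary": summary,
        "items": items,
        "quarantined_count": len(quarantined),
        "partial": bool(quarantined),
    }
```

With this framing, a response cut off mid-item loses only the broken line; the well-formed recommendations before and after it survive. One limitation of the sketch: if the summary line itself is malformed, the first valid recommendation would be mistaken for the summary, so a real parser would likely need an explicit marker on the summary object.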
## Pull raw llm-connect response for the 2026-06-26 run
```task
id: RAIL-BS-WP-0008-T03
status: todo
priority: medium
```
From the llm-connect pod logs / response store on railiance01, capture the
full raw response and `finish_reason` for the 2026-06-26 05:20:57Z run
(activity-core retained only a 4000-char preview; the JSON break is at char
5268). Send to activity-core to close ACTIVITY-WP-0016-T01. Logs only, no
secrets.
## Acceptance smoke
```task
id: RAIL-BS-WP-0008-T04
status: todo
priority: high
```
Trigger one daily-triage run against the reconciled runtime and confirm it
either (i) returns a clean schema-valid report, or (ii) degrades gracefully
(valid recommendations with `output_validated=true`, `partial=true`,
`quarantined_count>0`) instead of discarding the run. Confirm the State Hub
shows a matching `daily_triage` progress event. Closes ACTIVITY-WP-0016-T05
and unblocks the three-clean-run streak for ACTIVITY-WP-0010-T04 /
WP-0006-T03.
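
The two acceptable outcomes above can be sketched as a small check. The field names `output_validated`, `partial`, and `quarantined_count` come from this workplan; the function name `classify_run`, the flat dict shape, and the assumption that a clean run also reports `output_validated=true` are hypothetical.

```python
def classify_run(result: dict) -> str:
    """Classify a daily-triage run result for the acceptance smoke.

    Hypothetical helper; assumes the run result is a flat dict carrying
    the three flags named in this workplan.
    """
    if not result.get("output_validated"):
        return "discarded"  # whole run rejected: the failure mode to avoid
    partial = bool(result.get("partial"))
    quarantined = int(result.get("quarantined_count", 0))
    if not partial and quarantined == 0:
        return "clean"  # outcome (i): schema-valid report, nothing dropped
    if partial and quarantined > 0:
        return "degraded"  # outcome (ii): valid items kept, bad ones quarantined
    return "inconsistent"  # flags disagree; worth investigating


# Usage: either "clean" or "degraded" would satisfy the smoke.
print(classify_run({"output_validated": True, "partial": True, "quarantined_count": 2}))
```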

@@ -0,0 +1,46 @@
---
id: RAIL-BS-WP-0009
type: workplan
title: "activity-core no-restart admin-sync smoke (ACTIVITY-WP-0012-T05)"
domain: financials
repo: railiance-cluster
status: ready
owner: railiance-cluster
topic_slug: railiance
created: "2026-07-01"
updated: "2026-07-01"
---
# activity-core no-restart admin-sync smoke (ACTIVITY-WP-0012-T05)
## Context
Inbox message `aa8b7986` (activity-core, 2026-06-18): activity-core commit
`3e93567` implements ACTIVITY-WP-0012 T01–T04 (shared sync_service,
`POST /admin/sync`, explicit schedule upsert/pause/orphan-delete counts,
worker startup reuse, runbook docs; 192 tests passed). T05 is the
cluster-owned smoke: prove admin sync works **without** worker
SIGTERM/pod restart.
The deploy precondition is covered by RAIL-BS-WP-0008-T01 (main at
`bf877b7` includes `3e93567`), so run this after that reconcile.
## Run the no-restart admin-sync smoke
```task
id: RAIL-BS-WP-0009-T01
status: wait
priority: medium
```
After RAIL-BS-WP-0008-T01 is deployed, without restarting the worker:
1. Change or use a customer ActivityDefinition enabled-flip/rename fixture.
2. Call `POST /admin/sync?definitions=true&schedules=true` from the operator
path.
3. Confirm the new Temporal schedule is active and the retired/disabled
schedule is paused or deleted per sync semantics.
4. Confirm event-triggered definitions still fire normally.
5. Record non-secret evidence in the State Hub. Response JSON should include
`definitions.synced`, `schedules.upserted`, `schedules.paused`,
`schedules.deleted_orphans`, and `errors[]`.
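
A sketch of how the evidence check in step 5 might be automated. It assumes the dotted names map to nested JSON objects (for example `{"schedules": {"upserted": 1}}`), which this workplan does not confirm; the helper name `missing_sync_fields` is hypothetical.

```python
def missing_sync_fields(response: dict) -> list[str]:
    """Return the expected admin-sync fields absent from a response.

    Hypothetical helper; treats dotted names as paths into nested dicts.
    """
    required = [
        "definitions.synced",
        "schedules.upserted",
        "schedules.paused",
        "schedules.deleted_orphans",
        "errors",
    ]
    missing = []
    for path in required:
        node = response
        for key in path.split("."):
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                missing.append(path)  # path not fully present
                break
    return missing
```

An empty list would mean every expected field is present; the values themselves (counts, contents of `errors`) still need an operator's eye before recording the evidence.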

@@ -4,7 +4,7 @@ type: workplan
title: "Dependency Management — Add lockfile for Ansible control-node deps"
domain: financials
repo: railiance-cluster
-status: completed
+status: finished
owner: railiance
topic_slug: railiance
state_hub_workstream_id: 59155efb-b461-4caa-ad7b-b3fce348db84

@@ -4,7 +4,7 @@ type: workplan
title: "k3s and Kubernetes Platform Baseline"
domain: financials
repo: railiance-cluster
-status: completed
+status: finished
owner: railiance
topic_slug: railiance
repo_goal_id: "70ab2379-fb9d-4fec-a09d-b2a717e4ace8"

@@ -4,7 +4,7 @@ type: bug-report
title: "pgpool CrashLoopBackOff on PostgreSQL HA failover — missing secret key"
domain: financials
repo: railiance-cluster
-status: completed
+status: finished
owner: tegwick
created: "2026-03-10"
updated: "2026-03-10"

@@ -4,7 +4,7 @@ type: workplan
title: "Integrated Backup — S2 Kubernetes Runtime Layer"
domain: financials
repo: railiance-cluster
-status: done
+status: finished
owner: tegwick
topic_slug: railiance
state_hub_workstream_id: "7e8b0c20-51eb-40c9-9e3b-85dd380d7625"

@@ -4,7 +4,7 @@ type: workplan
title: "Kubeconfig delivery for netkingdom SSO/MFA stack apply"
domain: financials
repo: railiance-cluster
-status: done
+status: finished
owner: railiance-worker
topic_slug: railiance
capability_request_id: "34b97d89-e80a-42ae-a623-a9185e5b17f5"