Repo hygiene + new workplans (RAIL-BS-WP-0008/0009)
Some checks failed
railiance-tests / smoke (push) Has been cancelled
- Add RAIL-BS-WP-0008 (activity-core WP-0016 deploy) and RAIL-BS-WP-0009
  (admin-sync smoke) from inbox asks 87952ff1 / aa8b7986
- Archive finished workplans to workplans/archived/ per ADR-001 convention;
  normalize frontmatter statuses (completed/done -> finished)
- Fill stack-and-commands.md, complete repo-boundary.md, refresh SCOPE
  Current State, add docs/operator-runbook.md for production-touching targets

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@@ -0,0 +1,89 @@
---
id: RAIL-BS-WP-0008
type: workplan
title: "activity-core WP-0016 triage-output robustness deploy"
domain: financials
repo: railiance-cluster
status: ready
owner: railiance-cluster
topic_slug: railiance
created: "2026-07-01"
updated: "2026-07-01"
---

# activity-core WP-0016 triage-output robustness deploy

## Context

Inbox message `87952ff1` (activity-core, 2026-06-26): the scheduled daily WSJF
triage run on 2026-06-26 failed schema validation and the whole run was
discarded, resetting the WP-0006-T03 three-clean-run streak. ACTIVITY-WP-0016
hardened the instruction-executor output contract in-repo (commits
`5eb33bd..bf877b7` on activity-core main, 220 tests passed). The remaining
work is operator/cluster-owned on railiance01.

**Deploy coupling constraint:** `schemas/daily-triage-report.json` is now
strict per-item and is consumed by both the llm-connect hint and the
whole-doc validator. It MUST ship together with the new `executor.py`
(T03 per-item quarantine parser). Never deploy the schema ahead of the code.

## Deploy activity-core with coupled schema and executor

```task
id: RAIL-BS-WP-0008-T01
status: todo
priority: high
```

Rebuild/import the activity-core image from main (`bf877b7` or later) into
the railiance01 k3s runtime and reconcile the activity-core deployment so the
new executor and the strict per-item schema ship together.

## Update daily-statehub-wsjf-triage runtime-bundle Instruction

```task
id: RAIL-BS-WP-0008-T02
status: todo
priority: high
```

In the runtime projection (not the activity-core repo), update the
`daily-statehub-wsjf-triage` Instruction:

- raise `max_tokens` (currently ~1200; give clear headroom above the
  ~1300–1500-token 16-workstream list);
- prompt: bounded top-N (≤7) ranked recommendations, "if uncertain emit fewer
  well-formed items rather than more";
- prompt: per-item NDJSON framing (leading summary object, then one
  recommendation JSON object per line) so the T03 parser recovers items
  independently.
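
The per-item framing above can be sketched as follows. This is a minimal illustration only, not the T03 parser in `executor.py`: the function name `parse_ndjson_report` and the sample field names (`summary`, `rank`, `id`) are hypothetical, and the sketch assumes the first line that parses as a JSON object is the summary.

```python
import json


def parse_ndjson_report(raw: str):
    """Split an NDJSON response into (summary, items, quarantined).

    Assumption: the first line that parses as a JSON object is the summary;
    every later parsable object is one recommendation.
    """
    summary = None
    items = []
    quarantined = []  # (line_number, raw_line) pairs that could not be used
    for line_number, line in enumerate(raw.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # blank lines carry no item
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            # A broken line is set aside; it does not discard the run.
            quarantined.append((line_number, line))
            continue
        if not isinstance(obj, dict):
            quarantined.append((line_number, line))
            continue
        if summary is None:
            summary = obj
        else:
            items.append(obj)
    return summary, items, quarantined


# Hypothetical response: the third line is truncated (missing closing brace).
raw = (
    '{"summary": "3 items"}\n'
    '{"rank": 1, "id": "ws-a"}\n'
    '{"rank": 2, "id": "ws-b"\n'
    '{"rank": 3, "id": "ws-c"}\n'
)
summary, items, quarantined = parse_ndjson_report(raw)
```

In this sketch a truncated or malformed line costs only that one item; the recommendations on the other lines are still returned.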

## Pull raw llm-connect response for the 2026-06-26 run

```task
id: RAIL-BS-WP-0008-T03
status: todo
priority: medium
```

From the llm-connect pod logs / response store on railiance01, capture the
full raw response and `finish_reason` for the 2026-06-26 05:20:57Z run
(activity-core retained only a 4000-char preview; the JSON break is at char
5268). Send to activity-core to close ACTIVITY-WP-0016-T01. Logs only, no
secrets.
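
Once the full raw response is captured, the break position can be re-checked locally. The sketch below is an assumption-laden helper, not part of activity-core: `json_break_offset` and `break_visible_in_preview` are hypothetical names, and the offset is whatever Python's `json` parser reports, which may differ from the offset reported by the original validator.

```python
import json


def json_break_offset(raw: str):
    """Return the offset where Python's JSON parser fails, or None if valid."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as exc:
        return exc.pos  # index into raw where parsing failed
    return None


def break_visible_in_preview(raw: str, preview_chars: int = 4000) -> bool:
    """True if the parse failure falls inside the retained preview window."""
    offset = json_break_offset(raw)
    return offset is not None and offset < preview_chars


# Hypothetical payload whose defect sits beyond a 4000-char preview.
sample = '{"pad": "' + "x" * 5000 + '", "bad": }'
offset = json_break_offset(sample)
```

A break past the preview length, as in the 2026-06-26 run, cannot be diagnosed from the preview alone, which is why the full response is requested.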

## Acceptance smoke

```task
id: RAIL-BS-WP-0008-T04
status: todo
priority: high
```

Trigger one daily-triage run against the reconciled runtime and confirm it
either (i) returns a clean schema-valid report, or (ii) degrades gracefully
(valid recommendations with `output_validated=true`, `partial=true`,
`quarantined_count>0`) instead of discarding the run. Confirm the State Hub
shows a matching `daily_triage` progress event. Closes ACTIVITY-WP-0016-T05
and unblocks the three-clean-run streak for ACTIVITY-WP-0010-T04 /
WP-0006-T03.
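
The two acceptable outcomes can be expressed as a small check. This is a sketch under stated assumptions: `classify_run` is a hypothetical helper, the run result is assumed to be a flat dict carrying the three flags named above, and a clean run is assumed to report `partial=false` with no quarantined items.

```python
def classify_run(result: dict) -> str:
    """Classify a triage run result as 'clean', 'degraded', or 'failed'."""
    validated = result.get("output_validated") is True
    partial = result.get("partial") is True
    quarantined = result.get("quarantined_count") or 0
    if validated and not partial and quarantined == 0:
        return "clean"  # outcome (i): schema-valid report
    if validated and partial and quarantined > 0:
        return "degraded"  # outcome (ii): graceful degradation
    return "failed"  # anything else does not pass the smoke


clean = classify_run(
    {"output_validated": True, "partial": False, "quarantined_count": 0}
)
degraded = classify_run(
    {"output_validated": True, "partial": True, "quarantined_count": 2}
)
failed = classify_run({"output_validated": False})
```

Either `clean` or `degraded` would satisfy this task; only `clean` would count toward the three-clean-run streak.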

46
workplans/RAIL-BS-WP-0009-activity-core-admin-sync-smoke.md
Normal file
@@ -0,0 +1,46 @@
---
id: RAIL-BS-WP-0009
type: workplan
title: "activity-core no-restart admin-sync smoke (ACTIVITY-WP-0012-T05)"
domain: financials
repo: railiance-cluster
status: ready
owner: railiance-cluster
topic_slug: railiance
created: "2026-07-01"
updated: "2026-07-01"
---

# activity-core no-restart admin-sync smoke (ACTIVITY-WP-0012-T05)

## Context

Inbox message `aa8b7986` (activity-core, 2026-06-18): activity-core commit
`3e93567` implements ACTIVITY-WP-0012 T01–T04 (shared sync_service,
`POST /admin/sync`, explicit schedule upsert/pause/orphan-delete counts,
worker startup reuse, runbook docs; 192 tests passed). T05 is the
cluster-owned smoke: prove admin sync works **without** worker
SIGTERM/pod restart.

The deploy precondition is covered by RAIL-BS-WP-0008-T01 (main at
`bf877b7` ≥ `3e93567`), so run this after that reconcile.

## Run the no-restart admin-sync smoke

```task
id: RAIL-BS-WP-0009-T01
status: wait
priority: medium
```

After RAIL-BS-WP-0008-T01 is deployed, without restarting the worker:

1. Change or use a customer ActivityDefinition enabled-flip/rename fixture.
2. Call `POST /admin/sync?definitions=true&schedules=true` from the operator
   path.
3. Confirm the new Temporal schedule is active and the retired/disabled
   schedule is paused or deleted per sync semantics.
4. Confirm event-triggered definitions still fire normally.
5. Record non-secret evidence in the State Hub. Response JSON should include
   `definitions.synced`, `schedules.upserted`, `schedules.paused`,
   `schedules.deleted_orphans`, and `errors[]`.
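
Before recording evidence for step 5, the response shape can be checked mechanically. A minimal sketch, assuming the dotted names denote nested JSON objects and `errors[]` is a top-level list; `missing_fields` and the sample values are hypothetical and not taken from the activity-core API.

```python
REQUIRED_PATHS = (
    "definitions.synced",
    "schedules.upserted",
    "schedules.paused",
    "schedules.deleted_orphans",
    "errors",
)


def missing_fields(response: dict) -> list:
    """Return the required dotted paths that are absent from the response."""
    missing = []
    for path in REQUIRED_PATHS:
        node = response
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]  # descend one level along the dotted path
    return missing


# Illustrative response only; the counts are made up for the example.
sample_response = {
    "definitions": {"synced": 3},
    "schedules": {"upserted": 1, "paused": 1, "deleted_orphans": 0},
    "errors": [],
}
absent = missing_fields(sample_response)
```

An empty result means every expected field is present; it says nothing about whether the counts themselves are correct, which steps 3 and 4 still have to confirm.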

@@ -4,7 +4,7 @@ type: workplan
 title: "Dependency Management — Add lockfile for Ansible control-node deps"
 domain: financials
 repo: railiance-cluster
-status: completed
+status: finished
 owner: railiance
 topic_slug: railiance
 state_hub_workstream_id: 59155efb-b461-4caa-ad7b-b3fce348db84
@@ -4,7 +4,7 @@ type: workplan
 title: "k3s and Kubernetes Platform Baseline"
 domain: financials
 repo: railiance-cluster
-status: completed
+status: finished
 owner: railiance
 topic_slug: railiance
 repo_goal_id: "70ab2379-fb9d-4fec-a09d-b2a717e4ace8"
@@ -4,7 +4,7 @@ type: bug-report
 title: "pgpool CrashLoopBackOff on PostgreSQL HA failover — missing secret key"
 domain: financials
 repo: railiance-cluster
-status: completed
+status: finished
 owner: tegwick
 created: "2026-03-10"
 updated: "2026-03-10"
@@ -4,7 +4,7 @@ type: workplan
 title: "Integrated Backup — S2 Kubernetes Runtime Layer"
 domain: financials
 repo: railiance-cluster
-status: done
+status: finished
 owner: tegwick
 topic_slug: railiance
 state_hub_workstream_id: "7e8b0c20-51eb-40c9-9e3b-85dd380d7625"
@@ -4,7 +4,7 @@ type: workplan
 title: "Kubeconfig delivery for netkingdom SSO/MFA stack apply"
 domain: financials
 repo: railiance-cluster
-status: done
+status: finished
 owner: railiance-worker
 topic_slug: railiance
 capability_request_id: "34b97d89-e80a-42ae-a623-a9185e5b17f5"