4.2 KiB
id, type, title, domain, repo, status, owner, topic_slug, created, updated, state_hub_workstream_id
| id | type | title | domain | repo | status | owner | topic_slug | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|
| RAIL-BS-WP-0008 | workplan | activity-core WP-0016 triage-output robustness deploy | financials | railiance-cluster | active | railiance-cluster | railiance | 2026-07-01 | 2026-07-02 | 7cbbe0d6-fea9-41c6-840c-46d0d8e8edde |
activity-core WP-0016 triage-output robustness deploy
Context
Inbox message 87952ff1 (activity-core, 2026-06-26): the scheduled daily WSJF
triage run on 2026-06-26 failed schema validation and the whole run was
discarded, resetting the WP-0006-T03 three-clean-run streak. ACTIVITY-WP-0016
hardened the instruction-executor output contract in-repo (commits
5eb33bd..bf877b7 on activity-core main, 220 tests passed). The remaining
work is operator/cluster-owned on railiance01.
Deploy coupling constraint: schemas/daily-triage-report.json is now
strict per-item and is consumed by both the llm-connect hint and the
whole-doc validator. It MUST ship together with the new executor.py
(T03 per-item quarantine parser). Never deploy the schema ahead of the code.
Deploy activity-core with coupled schema and executor
id: RAIL-BS-WP-0008-T01
status: todo
priority: high
state_hub_task_id: "079e39a9-f938-4d03-a5bc-4d3d2f7b1d83"
Rebuild/import the activity-core image from main (bf877b7 or later) into
the railiance01 k3s runtime and reconcile the activity-core deployment so the
new executor and the strict per-item schema ship together.
2026-07-02: Added make deploy-activity-core-triage-robustness /
bin/railiance deploy-triage-robustness as the repeatable operator path. The
command gates the remote activity-core repo on bf877b7 or later, checks the
runtime bundle contract before applying it, restarts the activity-core
deployments by default, waits for migrate/sync jobs and rollouts, then records
non-secret State Hub evidence. Live execution on railiance01 remains pending.
Update daily-statehub-wsjf-triage runtime-bundle Instruction
id: RAIL-BS-WP-0008-T02
status: todo
priority: high
state_hub_task_id: "129fb472-41e8-4e5c-bcbb-0995a96e223b"
In the runtime projection (not the activity-core repo), update the
daily-statehub-wsjf-triage Instruction:
- raise
max_tokens(currently ~1200; give clear headroom above the ~1300–1500-token 16-workstream list); - prompt: bounded top-N (≤7) ranked recommendations, "if uncertain emit fewer well-formed items rather than more";
- prompt: per-item NDJSON framing (leading summary object, then one recommendation JSON object per line) so the T03 parser recovers items independently.
2026-07-02: The new deploy command enforces this contract against
k8s/railiance/20-runtime.yaml before it will touch the cluster: it requires
the daily instruction, a top-7 bound, the "fewer well-formed" fallback, NDJSON
or one-object-per-line framing, and max_tokens headroom of at least 1800.
Pull raw llm-connect response for the 2026-06-26 run
id: RAIL-BS-WP-0008-T03
status: todo
priority: medium
state_hub_task_id: "59559f1d-821f-4660-8a7d-c623c6631864"
From the llm-connect pod logs / response store on railiance01, capture the
full raw response and finish_reason for the 2026-06-26 05:20:57Z run
(activity-core retained only a 4000-char preview; the JSON break is at char
5268). Send to activity-core to close ACTIVITY-WP-0016-T01. Logs only, no
secrets.
Acceptance smoke
id: RAIL-BS-WP-0008-T04
status: todo
priority: high
state_hub_task_id: "8096621a-54ee-4be5-943e-5dc2da19ed28"
Trigger one daily-triage run against the reconciled runtime and confirm it
either (i) returns a clean schema-valid report, or (ii) degrades gracefully
(valid recommendations with output_validated=true, partial=true,
quarantined_count>0) instead of discarding the run. Confirm the State Hub
shows a matching daily_triage progress event. Closes ACTIVITY-WP-0016-T05
and unblocks the three-clean-run streak for ACTIVITY-WP-0010-T04 /
WP-0006-T03.
2026-07-02: The deploy command now triggers the daily-triage definition after
reconcile and polls State Hub for a post-trigger daily_triage event with
output_validated=true. If the run is partial, it also requires
quarantined_count>0 before posting pass evidence.