Record daily triage clean streak checkpoint
@@ -1,6 +1,6 @@
 # Daily-Triage Stabilization Status

-Updated: 2026-06-27
+Updated: 2026-06-30

 ## Purpose

@@ -20,10 +20,16 @@ Recent scheduled run evidence:
 | 2026-06-25 | `cbba6bc0-14cb-492b-ab23-74b9349326c8` | schema-valid daily triage, working memory written |
 | 2026-06-26 | `97fd20a0-eee0-45ea-8290-6d91874e1515` | validation failed at char 5268, working memory written |
 | 2026-06-27 | `c5ab50a8-404b-4e30-849f-841b059ace65` | validation failed at char 5246, working memory written |
+| 2026-06-28 | `f0d8477e-1db9-4c07-bb8c-d28cbb868abc` | schema-valid daily triage, working memory written; still emitted 10 recommendations |
+| 2026-06-29 | `176d2ea7-f0e3-48cd-999b-4ab6055c6a55` | schema-valid daily triage, working memory written; still emitted 10 recommendations |
+| 2026-06-30 | `27d695b2-a537-481b-ada6-ca84ec24cd96` | schema-valid daily triage, working memory written; still emitted 10 recommendations |

 The 2026-06-26 and 2026-06-27 failures are both overlong malformed JSON
 responses from `daily-triage-report`. They are not missed schedules and they are
-not silent sink failures.
+not silent sink failures. The 2026-06-28 through 2026-06-30 events restore a
+three-run schema-valid streak, but they do not prove the bounded WP-0016
+contract because the reports still emit 10 recommendations instead of the
+targeted top-N framing.

 ## Current Blocker

@@ -40,15 +46,16 @@ malformed tail.
 - producer guardrails and ADR-004;
 - regression tests for the 2026-06-26 failure shape.

-The remaining gate is the live deployment/smoke path:
+The remaining gate is the live contract/smoke path:

 1. Deploy the WP-0016 code and schema together.
 2. Update the Railiance runtime prompt bundle with bounded top-N instructions,
 per-item framing, value vocabularies, and sufficient `max_tokens` headroom.
 3. Run a live daily-triage smoke on railiance01 and confirm malformed-tail
 output degrades to partial valid output with quarantined items.
-4. Resume the three-clean-scheduled-run gate for `ACTIVITY-WP-0006-T03` and
-`ACTIVITY-WP-0010-T04`.
+4. Record the 2026-06-28 / 2026-06-29 / 2026-06-30 three-clean-run
+calibration result with the caveat that top-N contract adoption is still
+pending.

 ## Hygiene Note

@@ -66,3 +73,14 @@ to remove or reconcile stale duplicate task rows from the State Hub index.
 runner evidence proves the State Hub sink and working-memory path are reachable.
 The live human-needed notes now sit on the post-deployment smoke, WP-0016 live
 proof, and three-clean-run calibration tasks.
+
+2026-06-30 recheck: State Hub now has schema-valid scheduled `daily_triage`
+events for 2026-06-28 (`f0d8477e-1db9-4c07-bb8c-d28cbb868abc`), 2026-06-29
+(`176d2ea7-f0e3-48cd-999b-4ab6055c6a55`), and 2026-06-30
+(`27d695b2-a537-481b-ada6-ca84ec24cd96`), all with working-memory notes. This
+is enough to bank the scheduling/sink/schema-validity streak for calibration,
+but not enough to close the WP-0016 live-proof gate: the reports still contain
+10 recommendations rather than the bounded top-N contract, and the local
+activity-core worktree already has separate in-flight diagnostic/status changes
+that should be committed by their owner before Custodian treats them as source
+truth.

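The status file above separates two things: banking a schema-valid streak, and closing the WP-0016 live-proof gate. A minimal sketch of those two checks follows. It is not activity-core code: the `Run` record, both function names, and the `TOP_N = 5` bound are hypothetical. The source only says the reports emit 10 recommendations instead of a bounded top-N. Recommendation counts that the source does not record are modeled as `None`.

```python
from dataclasses import dataclass
from typing import Optional

TOP_N = 5  # hypothetical bound; the source does not state the actual N


@dataclass(frozen=True)
class Run:
    date: str
    schema_valid: bool
    memory_written: bool
    recommendations: Optional[int]  # None when the count is not recorded


def clean_streak(runs: list[Run]) -> int:
    """Count trailing runs that are schema-valid and wrote working memory."""
    streak = 0
    for run in reversed(runs):
        if not (run.schema_valid and run.memory_written):
            break
        streak += 1
    return streak


def top_n_adopted(runs: list[Run], required: int = 3) -> bool:
    """True only if the last `required` runs each respect the top-N bound."""
    recent = runs[-required:]
    return len(recent) == required and all(
        r.recommendations is not None and r.recommendations <= TOP_N
        for r in recent
    )


# Evidence rows from the table above, oldest first.
runs = [
    Run("2026-06-25", True, True, None),
    Run("2026-06-26", False, True, None),
    Run("2026-06-27", False, True, None),
    Run("2026-06-28", True, True, 10),
    Run("2026-06-29", True, True, 10),
    Run("2026-06-30", True, True, 10),
]

print(clean_streak(runs))   # 3
print(top_n_adopted(runs))  # False
```

Keeping the checks separate mirrors the recheck note: the streak can be banked for calibration while the contract gate stays open.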
@@ -1,6 +1,6 @@
 # Infrastructure Stabilization Pickup Checkpoint

-Updated: 2026-06-27
+Updated: 2026-06-30
 Coordinator workplan: `CUST-WP-0051`

 ## Purpose
@@ -68,7 +68,7 @@ separate ops-warden worker.
 | State Hub fallback retirement | Custodian/operator approval; `CUST-WP-0038-T08` | HA failover drill id, restore drill id, stabilization pass | Keep deferred until after HA drills; do not retire WSL2 fallback early. |
 | Inter-Hub ops-hub bootstrap | `inter-hub-bootstrap-ssh`, `openbao-api-key`, `ssh-cert-host-access` as needed | Hub id, manifest id, widget count, runtime key prefix only, smoke result | Legacy/fallback only. Prefer Core Hub deployed smoke; run attended Inter-Hub bootstrap only by explicit operator supersede/rollback decision. |
 | Ops-hub runtime evidence key | `openbao-api-key` / OpenBao custody | OpenBao path/version or populated key count, event smoke id | Do not materialize legacy `OPS_HUB_KEY` until a deployed Core Hub smoke or explicit legacy Inter-Hub smoke is ready to use it. |
-| Daily-triage live proof | activity-core deploy/runtime operator | State Hub `daily_triage` id, output-valid or partial/quarantine status, working-memory path | Deploy WP-0016 code/schema and bounded runtime prompt bundle, then run railiance01 smoke. |
+| Daily-triage live proof | activity-core deploy/runtime operator | State Hub `daily_triage` id, output-valid or partial/quarantine status, working-memory path | Bank the 2026-06-28 / 2026-06-29 / 2026-06-30 clean streak, then have the activity-core owner land/sync the in-flight WP-0016 diagnostics and prove bounded top-N plus graceful-degradation smoke. |
 | activity-core to issue-core | route `activity-core-issue-sink` | `actcore-runtime-secret` has key, activity-core points to issue-core port `8765`, HTTP 201, Gitea issue id | Inject `ISSUE_CORE_API_KEY` through approved custody, set REST sink env, restart/sync, run safe emission. |
 | Forgejo production design | Forgejo/operator decisions plus OpenBao/KeyCape/ops-bridge routes as needed | Decision id, SMTP smoke, backup/restore drill, package/action smoke, cutover approval id | Resolve T02 production choices before any production cutover work. |
 | OpenBao unseal and credential helper | `openbao-api-key`, `railiance-infra-principals`, `ssh-cert-host-access`, `key-cape-oidc-login` | Policy names, role names, token accessor only, allow/deny smoke | `warden-sign` lane is verified/banked; broader custody profile and issuer automation remain separate operator-design gates. |
@@ -77,23 +77,32 @@ separate ops-warden worker.
 ## Daily Automation Evidence

 The scheduled daily-triage runner is alive and writing State Hub plus working
-memory evidence. The current blocker is output validation, not scheduling or
-sink reachability.
+memory evidence. The current blocker is bounded output-contract adoption and
+live graceful-degradation proof, not scheduling or sink reachability.

-Latest clean scheduled run:
+Latest clean scheduled streak:

-- 2026-06-25: State Hub event `cbba6bc0-14cb-492b-ab23-74b9349326c8`,
-schema-valid daily triage, working memory written.
+- 2026-06-28: event `f0d8477e-1db9-4c07-bb8c-d28cbb868abc`, schema-valid daily
+triage, working memory written.
+- 2026-06-29: event `176d2ea7-f0e3-48cd-999b-4ab6055c6a55`, schema-valid daily
+triage, working memory written.
+- 2026-06-30: event `27d695b2-a537-481b-ada6-ca84ec24cd96`, schema-valid daily
+triage, working memory written.

-Latest failed scheduled runs:
+Latest failed scheduled runs before the clean streak:

 - 2026-06-26: event `97fd20a0-eee0-45ea-8290-6d91874e1515`, validation failed
 at char 5268, working memory written.
 - 2026-06-27: event `c5ab50a8-404b-4e30-849f-841b059ace65`, validation failed
 at char 5246, working memory written.

-Resume from `docs/daily-triage-stabilization-status.md` and
-`ACTIVITY-WP-0016` before restarting the three-clean-run gate.
+Bank the three-run calibration streak, but keep the WP-0016 live-proof gate open
+until the bounded top-N contract and graceful-degradation smoke are proven. The
+activity-core worktree currently has in-flight uncommitted ACTIVITY-WP-0016
+and ACTIVITY-WP-0018/0019 changes, so Custodian should wait for that owner to
+commit/sync or explicitly hand off before treating those files as source truth.
+Use activity-core repo-native automation status surface once it lands; do not
+use assistant-provided scheduling as operational evidence.

 ## Production Service Summary

@@ -117,8 +126,10 @@ Resume from `docs/daily-triage-stabilization-status.md` and
 2. Keep `CUST-WP-0047` and `CUST-WP-0049` as legacy evidence/fallback until
 Core Hub deployed smoke evidence or an explicit supersede decision closes
 them.
-3. Deploy the activity-core WP-0016 code/schema and bounded runtime prompt
-bundle, then run the railiance01 daily-triage smoke.
+3. Bank the 2026-06-28 / 2026-06-29 / 2026-06-30 clean daily-triage
+streak for calibration, then have the activity-core owner land/sync the
+in-flight WP-0016 diagnostics/status work and prove the bounded top-N plus
+graceful-degradation smoke.
 4. Complete the issue-core handoff by wiring activity-core to port `8765` with
 `ISSUE_SINK_TYPE=rest` and one known-safe emission smoke.
 5. Request explicit State Hub cutover approval for `CUST-WP-0011-T07`, or

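The graceful-degradation smoke named in these files expects a malformed-tail response to degrade to partial valid output with quarantined items. One way such a salvage step could look is sketched below. This is an illustration only, not the WP-0016 implementation: it assumes the report is a JSON array of objects, and the `REQUIRED_KEYS` schema and the `salvage_items` name are hypothetical.

```python
import json

REQUIRED_KEYS = {"title", "value"}  # hypothetical item schema


def salvage_items(raw: str) -> tuple[list[dict], list[str]]:
    """Decode array items one at a time; return (valid items, quarantined fragments)."""
    decoder = json.JSONDecoder()
    valid: list[dict] = []
    quarantined: list[str] = []
    pos = raw.find("[")
    if pos < 0:
        return valid, [raw]  # no array at all: quarantine everything
    pos += 1
    while True:
        # Skip whitespace and separating commas before the next item.
        while pos < len(raw) and raw[pos] in " \t\r\n,":
            pos += 1
        if pos >= len(raw) or raw[pos] == "]":
            return valid, quarantined
        try:
            item, end = decoder.raw_decode(raw, pos)
        except json.JSONDecodeError:
            quarantined.append(raw[pos:])  # keep the malformed tail for inspection
            return valid, quarantined
        if isinstance(item, dict) and REQUIRED_KEYS.issubset(item):
            valid.append(item)
        else:
            quarantined.append(raw[pos:end])  # well-formed JSON, wrong shape
        pos = end


truncated = (
    '[{"title": "a", "value": "high"}, '
    '{"title": "b", "value": "low"}, '
    '{"title": "c", "val'
)
valid, quarantined = salvage_items(truncated)
print(len(valid), quarantined)
```

Decoding item by item with `json.JSONDecoder.raw_decode` keeps every item that parsed before the break, where a single `json.loads` call would reject the whole response.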
@@ -10,7 +10,7 @@ topic_slug: custodian
 planning_priority: high
 planning_order: 51
 created: "2026-06-27"
-updated: "2026-06-27"
+updated: "2026-06-30"
 state_hub_workstream_id: "21cabc98-3f80-4d00-b3b7-06e2ac2af88f"
 ---

@@ -270,6 +270,24 @@ Progress 2026-06-27:
 - Cleared the stale human-needed flag from the completed bridge/config task and
 moved live intervention notes onto the deploy/smoke/calibration gate.

+Progress 2026-06-30 daily-triage recheck:
+
+- State Hub now shows three consecutive schema-valid scheduled `daily_triage`
+events after the malformed 2026-06-26 and 2026-06-27 outputs:
+2026-06-28 `f0d8477e-1db9-4c07-bb8c-d28cbb868abc`, 2026-06-29
+`176d2ea7-f0e3-48cd-999b-4ab6055c6a55`, and 2026-06-30
+`27d695b2-a537-481b-ada6-ca84ec24cd96`; all wrote working memory.
+- This banks the scheduling/sink/schema-validity streak for
+`ACTIVITY-WP-0006-T03` calibration feedback, but not the full WP-0016
+live-proof gate because the reports still emit 10 recommendations instead of
+the bounded top-N contract.
+- /home/worsch/activity-core currently has in-flight uncommitted changes for
+ACTIVITY-WP-0016 diagnostics and new ACTIVITY-WP-0018/0019
+automation-status/inventory workplans. Custodian should not overwrite or
+commit that worktree; the next clean handoff is for the activity-core owner to
+commit/sync or explicitly hand it off, then use the repo-native automation
+status surface as evidence.
+
 ## Task: Finish Near-Term Production Service Lanes

 ```task