generated from coulomb/repo-seed
Harden WSJF triage report recovery
@@ -169,7 +169,7 @@ TEMPORAL_HOST=localhost:7233 \
  STATE_HUB_URL=http://127.0.0.1:8000 \
  uv run python scripts/verify_daily_triage.py \
  --activity-id "$DAILY_TRIAGE_ACTIVITY_ID" \
- --working-memory-dir /home/worsch/the-custodian/working-memory \
+ --working-memory-dir /home/worsch/the-custodian/memory/working \
  --live
  ```

@@ -182,9 +182,9 @@ The verification is complete when all of these agree:
  - `activity_runs` has a row for the daily triage ActivityDefinition with today's
  `scheduled_for` or `fired_at` date.
  - State Hub `/progress/` contains a `daily_triage` event whose detail includes
- the same `activity_core_run_id`.
+ the same `activity_core_run_id` and its `output_validated` flag.
  - The working-memory sink wrote `daily-triage-YYYY-MM-DD-<run>.md` and its
- frontmatter contains the same `activity_core_run_id`.
+ frontmatter contains the same `activity_core_run_id` and validation metadata.
  - The ActivityDefinition's instruction model, token budget, and sink timeouts fit
  under `ACTIVITY_TIMEOUT_SECONDS` (default 900 seconds). Temporal retries each
  activity up to 10 attempts, so a slow LLM or sink failure should show as
@@ -280,6 +280,8 @@ Leave a State Hub progress note, but do not page, when:

  - A planned outage caused one skipped run and the schedule is healthy again.
  - A sink idempotency check reports `exists` for the expected run id.
+ - An instruction report has `output_validated=false` but still emitted a
+ validation-failure note preserving partial model output for review.
  - The report completed but calibration feedback says the recommendations were
  noisy, too long, or under-sensitive.

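
The agreement criteria in the hunk at line 182 can be illustrated with a small check. This is only a sketch under stated assumptions, not the logic of `scripts/verify_daily_triage.py`: the function names are hypothetical, the State Hub event detail is assumed to be available as a plain dict, and the working-memory note is assumed to carry flat `key: value` frontmatter between `---` markers.

```python
from __future__ import annotations

import re


def frontmatter_fields(markdown_text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading `---` frontmatter block.

    Hypothetical helper: assumes simple one-line scalar values only.
    """
    lines = markdown_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of the frontmatter block
        match = re.match(r"^([A-Za-z0-9_]+):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip().strip("\"'")
    return fields


def run_id_mismatches(
    activity_run_id: str, progress_detail: dict, note_text: str
) -> list[str]:
    """Return mismatch descriptions; an empty list means the sources agree.

    `progress_detail` stands in for the detail of the `daily_triage`
    progress event (its real shape is an assumption here).
    """
    problems: list[str] = []
    if progress_detail.get("activity_core_run_id") != activity_run_id:
        problems.append("progress event run id differs")
    if "output_validated" not in progress_detail:
        problems.append("progress event lacks output_validated")
    note_fields = frontmatter_fields(note_text)
    if note_fields.get("activity_core_run_id") != activity_run_id:
        problems.append("working-memory frontmatter run id differs")
    return problems
```

Returning a list of problems rather than a boolean keeps every disagreement visible in one pass, which suits a verification step that an operator reads after a failed run.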