test(ACTIVITY-WP-0016-T05): regression coverage incl. real 06-26 payload + over-depth
Add a test driving the actual captured 2026-06-26 failure payload (tests/fixtures/wp0016/...partial.json): it now recovers 6+ valid recommendations and quarantines the truncated tail, where before WP-0016 it discarded the whole run. Add an over-depth guardrail test.

Together with T03/T04 the regression set now covers truncation, one-bad-item, oversized-string, over-depth, allow-list/injection-shaped, and happy-path count cap.

In-repo portion of T05 complete; the live railiance01 graceful-degradation smoke is operator-owned cluster work (deploy-coupled with the T02 bundle changes) and remains outstanding. Hand-back notes posted to WP-0006-T03 and WP-0010-T04.

Full suite: 220 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -349,6 +349,25 @@ Done when:
that the output-robustness blocker is cleared so the three-clean-run gate can
resume on its own.

2026-06-26 progress (in-repo portion complete):

- **Regression coverage complete.** Across T03/T04/T05: truncated-mid-list,
one-bad-item-among-good (quarantine + partial), oversized-string and over-depth
guardrail rejection, allow-list (injection-shaped) rejection, happy-path count
cap, and a test driving the **actual captured 2026-06-26 payload**
(`tests/fixtures/wp0016/daily_triage_2026-06-26_validation_failure.partial.json`)
— it now recovers 6+ valid recommendations and quarantines the truncated tail,
where before it discarded the whole run.
- **Full suite green:** 218 passed, 1 skipped (recorded at T04; the T05 fixture +
over-depth tests add to this — see the commit).
- **Hand-back notes posted** to `ACTIVITY-WP-0006-T03` (State Hub event
`b6b8c2b8`) and `ACTIVITY-WP-0010-T04` (`b813f0dc`).
- **Remaining (remote, operator-owned):** the live daily-triage smoke on
`railiance01` proving end-to-end graceful degradation. It depends on deploying
the T02 bundle prompt/`max_tokens`/NDJSON changes together with this code, which
is cluster/operator work outside this repo's SCOPE. T05 therefore stays
`progress` until that live run exists; the in-repo deliverables are done.
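The recovery behaviour listed under "Regression coverage complete" can be sketched roughly as below. This is an illustrative sketch only, not the code in this commit: it assumes the model output is NDJSON (one recommendation per line), and every name and limit in it (`recover_recommendations`, `check_item`, `MAX_ITEMS`, `MAX_STRING_LEN`, `MAX_DEPTH`, `ALLOWED_ACTIONS`, the `action` field) is hypothetical.

```python
import json

# Hypothetical limits and allow-list, chosen only for illustration.
MAX_ITEMS = 10
MAX_STRING_LEN = 2000
MAX_DEPTH = 4
ALLOWED_ACTIONS = {"close", "label", "escalate"}


def depth(value):
    """Nesting depth of a decoded JSON value; scalars count as 0."""
    if isinstance(value, dict):
        children = value.values()
    elif isinstance(value, list):
        children = value
    else:
        return 0
    return 1 + max((depth(child) for child in children), default=0)


def iter_strings(value):
    """Yield every string (keys included) found anywhere in the value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for key, child in value.items():
            yield key
            yield from iter_strings(child)
    elif isinstance(value, list):
        for child in value:
            yield from iter_strings(child)


def check_item(item):
    """Return a rejection reason, or None if the item passes all guardrails."""
    if not isinstance(item, dict):
        return "not an object"
    if depth(item) > MAX_DEPTH:
        return "over-depth"
    if any(len(s) > MAX_STRING_LEN for s in iter_strings(item)):
        return "oversized string"
    if item.get("action") not in ALLOWED_ACTIONS:
        return "action not allow-listed"
    return None


def recover_recommendations(raw):
    """Keep every valid line; quarantine the rest instead of failing the run."""
    valid, quarantined = [], []
    for line in raw.splitlines():
        if not line.strip():
            continue
        try:
            item = json.loads(line)
        except (ValueError, RecursionError):
            # A truncated tail or otherwise malformed line lands here.
            quarantined.append((line, "unparseable"))
            continue
        reason = check_item(item)
        if reason is not None:
            quarantined.append((line, reason))
        elif len(valid) >= MAX_ITEMS:
            quarantined.append((line, "over count cap"))
        else:
            valid.append(item)
    return valid, quarantined
```

Under these assumptions, a payload cut off mid-line yields the complete items before the cut plus one quarantined entry for the tail, which is the partial-result shape the fixture test checks for.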
## Relationships

- **Blocks / feeds:** `ACTIVITY-WP-0006-T03` (three clean scheduled runs) and