chore(consistency): renormalize lifecycle state [auto]

Updated by fix-consistency on 2026-06-26:
- workplan status: proposed → active
2026-06-26 17:52:28 +02:00
parent 61f278d643
commit caa2608092
1 changed file with 34 additions and 1 deletion
--- a/workplans/ACTIVITY-WP-0016-llm-output-robustness-trust-boundary.md
+++ b/workplans/ACTIVITY-WP-0016-llm-output-robustness-trust-boundary.md
@@ -4,7 +4,7 @@ type: workplan
 title: "LLM Output Robustness & The Producer Trust Boundary"
 domain: custodian
 repo: activity-core
-status: proposed
+status: active
 owner: codex
 topic_slug: custodian
 created: "2026-06-26"
@@ -238,6 +238,39 @@ Done when:
 - the existing monolithic-document path remains as the fallback when framing is
  absent (backward compatible with task-only instructions).

+2026-06-26 progress (implemented in `src/activity_core/rules/executor.py`):
+
+- **Resilient recovery wired into `_execute`.** When the whole-document parse and
+  its single retry both fail, report instructions (those with `report_sinks`) now
+  run `_resilient_report` *before* the total-loss `_invalid_output_report`. If it
+  recovers ≥1 valid item it returns a partial report; otherwise it returns `None`
+  and the prior total-loss path is preserved unchanged.
+- **Brace/quote-aware object scanner, not line-splitting.** The real 2026-06-26
+  output was pretty-printed (multi-line objects), so naive NDJSON line recovery
+  would have failed. `_extract_object_spans` walks the `recommendations` array
+  tracking brace depth and string state, so it recovers each recommendation
+  object whether pretty-printed across many lines *or* emitted one-per-line
+  (NDJSON). The truncated trailing object is returned with `complete=False`.
+- **Layered mitigation per item:** `json.loads` → on failure for a truncated
+  tail, a best-effort `_try_repair` (balance open string/brackets/braces) →
+  then `_partition_items` validates each recovered object against the T02 item
+  schema. Valid items survive; malformed or over-`maxItems` items are
+  quarantined with provenance (`index`, `error`, `raw` snippet, `reason`).
+- **Report shape on degradation:** `output_validated=True` over the survivors,
+  `review_required=True`, `partial=True`, `quarantined_count`, and a bounded
+  `quarantined_items` list (cap 20). Degraded-but-usable is now reported
+  distinctly from total loss.
+- **Verified against the real failure shape.** New tests reconstruct a
+  pretty-printed report with 7 valid recommendations + a truncated tail (the
+  06-26 shape) and a one-bad-item-among-valid case. The 7-item run now recovers
+  all 7 and quarantines the broken tail (previously: whole run discarded);
+  log line `instruction_output_recovered: kept=7, quarantined=1`. The bad-item
+  run keeps 2 and quarantines the rank-less one.
+- **Deferred to T04 (clean scope boundary):** enforcing `maxItems` top-N on the
+  *happy* path (valid JSON, all items schema-valid, but > N items) — the resilient
+  path only runs on failure, so over-limit-on-success is a guardrail/count-cap
+  concern, which is exactly T04's remit.
+
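The recovery layers described in the progress note can be sketched roughly as follows. This is a minimal illustration, not the code in `src/activity_core/rules/executor.py`. The function names drop the leading underscore, and the signatures, the `Span` tuple, the `reason` strings, the 80-character snippet length, and the sample payload are all assumptions. The T02 item schema is not shown in the note, so a single hypothetical check (`rank` must be an integer) stands in for it.

```python
from __future__ import annotations

import json
from typing import Any, Iterator, NamedTuple


class Span(NamedTuple):
    text: str
    complete: bool  # False for a truncated trailing object


def extract_object_spans(raw: str, key: str = "recommendations") -> Iterator[Span]:
    """Yield each top-level {...} in the array after `key`, ignoring braces in strings."""
    start = raw.find(f'"{key}"')
    start = raw.find("[", start) if start != -1 else -1
    if start == -1:
        return
    depth, in_string, escaped, obj_start = 0, False, False, None
    for i in range(start + 1, len(raw)):
        ch = raw[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            if depth == 0:
                obj_start = i
            depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
            if depth == 0 and obj_start is not None:
                yield Span(raw[obj_start:i + 1], True)
                obj_start = None
        elif ch == "]" and depth == 0:
            return  # end of the array
    if obj_start is not None:
        yield Span(raw[obj_start:], False)  # input ended mid-object


def try_repair(fragment: str) -> Any | None:
    """Best effort: close an open string, then open brackets/braces, then parse."""
    stack, in_string, escaped = [], False, False
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    closing = ('"' if in_string else "") + "".join(reversed(stack))
    try:
        return json.loads(fragment + closing)
    except json.JSONDecodeError:
        return None


def partition_items(raw: str, max_items: int = 10) -> tuple[list, list]:
    """Split recovered objects into survivors and quarantined records."""
    kept, quarantined = [], []
    for index, span in enumerate(extract_object_spans(raw)):
        item, error = None, None
        try:
            item = json.loads(span.text)
        except json.JSONDecodeError as exc:
            error = str(exc)
            if not span.complete:
                item = try_repair(span.text)
        if item is None:
            reason = "unparseable"
        elif not isinstance(item, dict) or not isinstance(item.get("rank"), int):
            reason = "schema"  # hypothetical stand-in for the T02 item schema
        elif len(kept) >= max_items:
            reason = "over_max_items"
        else:
            kept.append(item)
            continue
        quarantined.append(
            {"index": index, "reason": reason, "error": error, "raw": span.text[:80]}
        )
    return kept, quarantined


# Hypothetical payload: pretty-printed and one-per-line objects, one rank-less
# item, and a truncated tail.
RAW = '''{
  "recommendations": [
    {
      "rank": 1,
      "title": "Keep {braces} and \\"quotes\\" inside strings"
    },
    {"title": "No rank here"},
    {"rank": 3, "title": "One-per-line object"},
    {"rank": 4, "ti'''

kept, quarantined = partition_items(RAW)
```

On this sample the two well-formed ranked objects survive, while the rank-less object and the truncated tail are quarantined with their index and a raw snippet. Tracking string state separately from brace depth is what keeps a `{` or `}` inside a title from being mistaken for an object boundary.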
 ## Producer Guardrails + ADR-004

 ```task