feat(ACTIVITY-WP-0016-T04): producer trust-boundary guardrails + ADR-004

Add ADR-004 documenting the producer trust boundary: untrusted producers (LLM, agent, human; erroneous and malicious), the trust-but-handle vs verify-and-mitigate postures, error-locality and quarantine-with-provenance principles, and the concrete activity-core mechanisms.

Implement producer-agnostic guardrails in executor.py, applied uniformly on the happy path and the recovery path via _partition_items: structural-type -> schema -> structural caps (_MAX_DEPTH, _MAX_STRING_LEN) -> reference allow-list -> count cap. Each quarantine carries a reason.

Closes the happy-path maxItems count cap deferred from T03 (valid 9-item report keeps 7, quarantines 2).

Reference allow-list reads context["known_candidates"] via _allow_list_from_context; inert until a resolver populates it.

SCOPE.md updated (executor bullet + ADR list); no INTENT drift.

New tests: happy-path count cap, oversized-string guardrail, allow-list rejection. Full suite: 218 passed, 1 skipped.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 18:10:17 +02:00
parent c5440e8429
commit 9be4ddbdb7
5 changed files with 373 additions and 12 deletions
--- a/workplans/ACTIVITY-WP-0016-llm-output-robustness-trust-boundary.md
+++ b/workplans/ACTIVITY-WP-0016-llm-output-robustness-trust-boundary.md
@@ -297,6 +297,33 @@ Done when:
 - SCOPE.md / INTENT.md are checked for drift and updated if the boundary stance
  changes the documented contract.

+2026-06-26 progress:
+
+- **ADR-004 written** — `docs/adr/adr-004-producer-trust-boundary.md` documents
+  the untrusted-producer premise (erroneous + malicious; LLM/agent/human), the
+  A-vs-B posture taxonomy, the four governing principles, the concrete
+  activity-core mechanisms, a posture-by-layer table, consequences, and
+  alternatives considered. Accepted, scope cross-repo.
+- **Producer guardrails implemented** in `executor.py`, applied uniformly on the
+  happy path *and* the recovery path via `_partition_items`: per-item order is
+  structural-type → schema → structural caps (`_MAX_DEPTH=8`,
+  `_MAX_STRING_LEN=4000`) → reference allow-list → count cap (`maxItems`). Each
+  quarantine carries a `reason` (`malformed`/`schema`/`guardrail`/`allow_list`/
+  `over_limit`).
+- **Happy-path count cap closed** (the item deferred from T03): a syntactically
+  valid 9-item report now keeps 7 and quarantines 2 as `over_limit`, emitting a
+  `partial` report — without a retry.
+- **Reference allow-list wired but inert.** `_allow_list_from_context` reads
+  `context["known_candidates"]`; when present, recommendations with an unknown
+  `candidate` are quarantined (`reason: allow_list`). The key is absent today, so
+  the check is inert; activating it is a one-line context-resolver change. This
+  keeps the guardrail producer-agnostic (principle #4) and ready to switch on.
+- **SCOPE.md updated** — instruction-executor bullet now names the quarantine
+  lane + guardrails; ADR-004 added to the Architecture Decisions list. No INTENT
+  drift: this hardens the existing output contract, it does not extend scope.
+- New tests: happy-path count cap, oversized-string guardrail, allow-list
+  rejection (all green).
+
 ## Tests + Calibration Re-Entry

 ```task