session-memory Phase 2: verify + catalog artifacts (T07)

End-to-end verification over real local sessions: ingest 94->93 -> 72 digests; detect 3 candidates (2 cross-flavor); curate --auto-approve cataloged 3 SolutionPatterns (2 cross-flavor approved/distribution_ready, 1 Claude-only), re-run fully idempotent, 3 hub decisions queued (API offline). Commits the 3 catalog artifacts as the source of truth. PRD §12 OQ4/OQ5/OQ6 marked resolved; README + design refreshed. Workplan finished; suite 72/72. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 10:08:52 +02:00
parent 519e76442a
commit d06791f070
5 changed files with 301 additions and 12 deletions
--- a/docs/PRD-helix-forge.md
+++ b/docs/PRD-helix-forge.md
@@ -255,12 +255,26 @@ record:
  three flavors?
 - **OQ3** Where does detection logic run — local batch jobs, hub-side, or a dedicated
  service? What volume do we actually expect?
- **OQ4** Pattern format: how do we keep one agnostic representation while giving each
-  distributor enough to render high-quality native artifacts?
- **OQ5** What's the minimum trustworthy evidence bar before a pattern is allowed to be
-  distributed to live agent environments?
- **OQ6** How do we prevent pattern bloat — too many low-value instructions degrading
-  agent context budgets (cf. the token-budget policy in global instructions)?
+- ~~**OQ4** Pattern format: how do we keep one agnostic representation while giving each
+  distributor enough to render high-quality native artifacts?~~ **Resolved (Phase 2,
+  AGENTIC-WP-0004):** the `SolutionPattern` core is flavor-agnostic (problem,
+  resolutions, scope, provenance) and carries per-flavor knowledge only in a separate
+  `rendering_hints` sub-structure keyed by flavor — distributors read the hints, the
+  core stays neutral. Catalogued as versioned files-first artifacts (FR-U3).
+- ~~**OQ5** What's the minimum trustworthy evidence bar before a pattern is allowed to be
+  distributed to live agent environments?~~ **Resolved (Phase 2):** a two-tier
+  evidence bar (`[curate.gate]`). A *promote* floor (frequency / distinct sessions /
+  cost-impact) admits a candidate as `provisional`; a stricter *distribution* floor
+  (higher frequency, optional cross-flavor requirement, cost-impact) is required to
+  mark a pattern `approved` + `distribution_ready`. Defaults are conservative and
+  config-tunable.
+- ~~**OQ6** How do we prevent pattern bloat — too many low-value instructions degrading
+  agent context budgets (cf. the token-budget policy in global instructions)?~~
+  **Resolved (Phase 2):** a bloat guard flags duplicate (same id) and near-duplicate
+  (same signal-type+locus) candidates at review time, and the catalog dedups
+  structurally on the source-candidate key so re-promotion never multiplies entries.
+  Thin candidates stay `provisional` (not distributed) rather than padding live
+  context.

 ## 13. Risks