session-memory Phase 2: verify + catalog artifacts (T07)

End-to-end verification over real local sessions: ingest 94->93 -> 72 digests;
detect 3 candidates (2 cross-flavor); curate --auto-approve cataloged 3
SolutionPatterns (2 cross-flavor approved/distribution_ready, 1 Claude-only),
re-run fully idempotent, 3 hub decisions queued (API offline). Commits the 3
catalog artifacts as the source of truth. PRD §12 OQ4/OQ5/OQ6 marked resolved;
README + design refreshed. Workplan finished; suite 72/72.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 10:08:52 +02:00
parent 519e76442a
commit d06791f070
5 changed files with 301 additions and 12 deletions

@@ -255,12 +255,26 @@ record:
three flavors?
- **OQ3** Where does detection logic run — local batch jobs, hub-side, or a dedicated
service? What volume do we actually expect?
- **OQ4** Pattern format: how do we keep one agnostic representation while giving each
distributor enough to render high-quality native artifacts?
- **OQ5** What's the minimum trustworthy evidence bar before a pattern is allowed to be
distributed to live agent environments?
- **OQ6** How do we prevent pattern bloat — too many low-value instructions degrading
agent context budgets (cf. the token-budget policy in global instructions)?
- ~~**OQ4** Pattern format: how do we keep one agnostic representation while giving each
distributor enough to render high-quality native artifacts?~~ **Resolved (Phase 2,
AGENTIC-WP-0004):** the `SolutionPattern` core is flavor-agnostic (problem,
resolutions, scope, provenance) and carries per-flavor knowledge only in a separate
`rendering_hints` sub-structure keyed by flavor — distributors read the hints, the
core stays neutral. Catalogued as versioned files-first artifacts (FR-U3).
- ~~**OQ5** What's the minimum trustworthy evidence bar before a pattern is allowed to be
distributed to live agent environments?~~ **Resolved (Phase 2):** a two-tier
evidence bar (`[curate.gate]`). A *promote* floor (frequency / distinct sessions /
cost-impact) admits a candidate as `provisional`; a stricter *distribution* floor
(higher frequency, optional cross-flavor requirement, cost-impact) is required to
mark a pattern `approved` + `distribution_ready`. Defaults are conservative and
config-tunable.
- ~~**OQ6** How do we prevent pattern bloat — too many low-value instructions degrading
agent context budgets (cf. the token-budget policy in global instructions)?~~
**Resolved (Phase 2):** a bloat guard flags duplicate (same id) and near-duplicate
(same signal-type+locus) candidates at review time, and the catalog dedups
structurally on the source-candidate key so re-promotion never multiplies entries.
Thin candidates stay `provisional` (not distributed) rather than padding live
context.
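
The OQ5 and OQ6 resolutions above can be sketched roughly as follows, in Python. This is an illustration under stated assumptions, not the project's implementation: `Floor`, `Candidate`, `gate`, and `Catalog` are hypothetical names, and the threshold values are placeholders, since the real `[curate.gate]` keys and defaults are not shown here.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional


@dataclass(frozen=True)
class Floor:
    """One evidence floor (hypothetical shape of a [curate.gate] tier)."""
    min_frequency: int
    min_distinct_sessions: int
    min_cost_impact: float
    require_cross_flavor: bool = False


@dataclass(frozen=True)
class Candidate:
    """A detected candidate; field names are assumptions for this sketch."""
    candidate_id: str
    frequency: int
    distinct_sessions: int
    cost_impact: float
    flavors: FrozenSet[str]


# Placeholder thresholds, NOT the project's defaults.
PROMOTE = Floor(min_frequency=2, min_distinct_sessions=2, min_cost_impact=0.1)
DISTRIBUTE = Floor(min_frequency=4, min_distinct_sessions=2,
                   min_cost_impact=0.1, require_cross_flavor=True)


def meets(c: Candidate, floor: Floor) -> bool:
    """True if the candidate clears every threshold of the given floor."""
    if floor.require_cross_flavor and len(c.flavors) < 2:
        return False
    return (c.frequency >= floor.min_frequency
            and c.distinct_sessions >= floor.min_distinct_sessions
            and c.cost_impact >= floor.min_cost_impact)


def gate(c: Candidate) -> Optional[str]:
    """Return 'approved', 'provisional', or None if below the promote floor."""
    if not meets(c, PROMOTE):
        return None
    return "approved" if meets(c, DISTRIBUTE) else "provisional"


class Catalog:
    """Keyed on the source-candidate id, so re-promotion overwrites in place."""

    def __init__(self) -> None:
        self._by_source: Dict[str, dict] = {}

    def promote(self, c: Candidate) -> Optional[dict]:
        status = gate(c)
        if status is None:
            return None  # too thin to catalog at all
        entry = {
            "source_candidate": c.candidate_id,
            "status": status,
            "distribution_ready": status == "approved",
        }
        self._by_source[c.candidate_id] = entry
        return entry

    def __len__(self) -> int:
        return len(self._by_source)
```

With these placeholder floors, a candidate seen in two flavors with high frequency comes out `approved` and distribution-ready, a single-flavor candidate with the same counts stays `provisional`, and promoting the same candidate twice leaves the catalog size unchanged.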
## 13. Risks