End-to-end verification over real local sessions: ingest 94->93 -> 72 digests; detect 3 candidates (2 cross-flavor); curate --auto-approve cataloged 3 SolutionPatterns (2 cross-flavor approved/distribution_ready, 1 Claude-only), re-run fully idempotent, 3 hub decisions queued (API offline). Commits the 3 catalog artifacts as the source of truth. PRD §12 OQ4/OQ5/OQ6 marked resolved; README + design refreshed. Workplan finished; suite 72/72. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
id: AGENTIC-WP-0004
type: workplan
title: "Coding Session Memory — Phase 2 (Curate: review workflow + Pattern Catalog)"
domain: helix_forge
repo: agentic-resources
status: finished
owner: codex
topic_slug: helix-forge
created: "2026-06-06"
updated: "2026-06-07"
state_hub_workstream_id: "b3703684-f60e-42f3-b03e-dabe3e8ce3f4"
---

# Coding Session Memory — Phase 2 (Curate)

Implements the **Curate** phase (PRD §6.3, FR-U1–FR-U4) of
[PRD-helix-forge](../docs/PRD-helix-forge.md), continuing
[AGENTIC-WP-0003](AGENTIC-WP-0003-session-memory-phase1.md) (Detect).

Phase 1 surfaces ranked **candidate** problem/success patterns with evidence
(`python -m session_memory.detect --json`, persisted to the Tier 2 `patterns`
table by `detect/cluster.py::Pattern`). Phase 2 turns those candidates into
**reviewed, versioned Solution Patterns** held in an in-repo **Pattern Catalog**,
the source of truth that Phase 3 (Distribute) renders into per-flavor artifacts.

Design boundary (ADR-001 / PRD §9): the catalog is **files-first**. Solution
patterns originate as versioned files in this repo; the State Hub indexes them and
records each promote/reject as an auditable decision. The agnostic core stays
flavor-neutral; per-flavor knowledge lives only in **rendering hints** consumed
later by distributor adapters (PRD §6.4 / FR-A2). New code lands under a new
`session_memory/curate/` package, mirroring the `detect/` layout from Phase 1.

Relevant design open questions this phase resolves: **OQ4** (one agnostic
representation that still gives distributors enough to render natively), **OQ5**
(minimum trustworthy evidence bar before a pattern is distribution-eligible), and
**OQ6** (preventing pattern bloat / context-budget degradation).

## Solution Pattern Schema + Per-Flavor Rendering Hints

```task
id: AGENTIC-WP-0004-T01
status: done
priority: high
state_hub_task_id: "c6d20bb6-7b6c-48fd-bd25-30a349514f41"
```

Define the agnostic **Solution Pattern** artifact (FR-U2, OQ4) in
`session_memory/curate/schema.py`: stable id, name, semantic `version`, problem
description, one or more recommended resolutions, applicability scope
(repos/domains/flavors), provenance (source candidate `key` + an evidence
snapshot copied from the detect `Pattern`), and **per-flavor rendering hints**
kept in a separate sub-structure so the core stays flavor-agnostic while
distributors get enough to render high-quality native artifacts. Use a dataclass
with deterministic serialization (sorted keys), reusing the `Pattern.to_dict()`
contract for the embedded evidence, and unit-test it for round-trip stability.
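
As a rough illustration of the schema described above, the sketch below shows one way a dataclass with sorted-key JSON serialization could look. Every field name, the `to_json`/`from_json` helpers, and the placeholder values are assumptions made for this example; the actual `schema.py` may be shaped differently.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from typing import Any


@dataclass(frozen=True)
class SolutionPattern:
    """Illustrative agnostic Solution Pattern; every field name is an assumption."""

    id: str
    name: str
    version: str                    # semantic version, e.g. "1.0.0"
    problem: str
    resolutions: list[str]          # one or more recommended resolutions
    scope: dict[str, list[str]]     # assumed keys: "repos", "domains", "flavors"
    source_key: str                 # provenance: the detect candidate key
    evidence: dict[str, Any]        # snapshot shaped like Pattern.to_dict()
    # Flavor-specific knowledge is confined to this sub-structure.
    rendering_hints: dict[str, dict[str, Any]] = field(default_factory=dict)

    def to_json(self) -> str:
        # sort_keys makes the output independent of dict insertion order.
        return json.dumps(asdict(self), sort_keys=True, indent=2) + "\n"

    @classmethod
    def from_json(cls, text: str) -> "SolutionPattern":
        return cls(**json.loads(text))


# Placeholder values only; none of these come from real detect output.
example = SolutionPattern(
    id="sp-example",
    name="Example pattern",
    version="1.0.0",
    problem="(problem description goes here)",
    resolutions=["(recommended resolution goes here)"],
    scope={"flavors": ["claude", "grok"]},
    source_key="example-key",
    evidence={"frequency": 3, "cross_flavor": True},
    rendering_hints={"claude": {}, "grok": {}},
)

# Round-trip stability: serialize, parse, and compare.
assert SolutionPattern.from_json(example.to_json()) == example
```

Sorting keys at serialization time means two logically equal patterns always produce byte-identical files, which keeps diffs in the catalog meaningful.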

## Versioned Pattern Catalog Store (files-first)

```task
id: AGENTIC-WP-0004-T02
status: done
priority: high
state_hub_task_id: "d40c7810-fd1e-4b14-8577-b8a64ddd337b"
```

Implement the in-repo **Pattern Catalog** as the source of truth (FR-U3, ADR-001)
in `session_memory/curate/catalog.py`: versioned solution-pattern files under a
catalog dir (e.g. `session_memory/catalog/<pattern-id>.json`), stable IDs, a
version bump on edit (supersede-in-place with history preserved), and
load/save/list with **dedup on pattern identity** (the source candidate key).
Work originates in files; the hub only indexes them. Verify that save→load is
lossless and that re-saving an unchanged pattern is a no-op (no spurious version
bump).

## Review Workflow (discuss / approve / reject → promote)

```task
id: AGENTIC-WP-0004-T03
status: done
priority: high
state_hub_task_id: "e303d01f-564e-4499-9ce5-22cf959ed84c"
```

Implement the curation workflow (FR-U1/FR-U2) in
`session_memory/curate/review.py`: load Phase 1 detect candidates with their
evidence (cross-flavor first), present each candidate, accept a
**discuss/approve/reject** action, and on **approve** promote the candidate into
a Solution Pattern written to the catalog (T02) with default rendering-hint
stubs the reviewer can refine. Re-review is **idempotent**: candidates already
promoted are matched on source key and updated in place, never duplicated; a
prior reject is remembered so it is not re-surfaced unless its evidence has
changed.
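
One way to get the idempotent re-review behavior described above is to remember each decision together with a fingerprint of the evidence it was based on. The sketch below assumes that approach; the `review` function, the in-memory `catalog` and `ledger` dicts, and the candidate field names (`key`, `evidence`, `cross_flavor`) are illustrative stand-ins, not the real `review.py` interfaces.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any, Callable


def evidence_fingerprint(candidate: dict[str, Any]) -> str:
    """Stable hash of the evidence, used to notice when it has changed."""
    blob = json.dumps(candidate["evidence"], sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def review(
    candidates: list[dict[str, Any]],
    decide: Callable[[dict[str, Any]], str],   # returns discuss/approve/reject
    catalog: dict[str, dict[str, Any]],        # stand-in for the file catalog
    ledger: dict[str, dict[str, str]],         # remembered decisions per key
) -> dict[str, list[str]]:
    summary: dict[str, list[str]] = {
        "promoted": [], "updated": [], "rejected": [], "deferred": [], "skipped": [],
    }
    # Cross-flavor candidates are presented first (stable sort keeps rank order).
    ordered = sorted(candidates, key=lambda c: not c["evidence"].get("cross_flavor", False))
    for cand in ordered:
        key, fingerprint = cand["key"], evidence_fingerprint(cand)
        prior = ledger.get(key)
        settled = prior is not None and prior["action"] in ("approve", "reject")
        if settled and prior["fingerprint"] == fingerprint:
            summary["skipped"].append(key)  # same evidence: do not re-surface
            continue
        action = decide(cand)
        if action == "approve":
            bucket = "updated" if key in catalog else "promoted"
            # Matched on source key, so a re-approval updates in place.
            catalog[key] = {
                "source_key": key,
                "evidence": cand["evidence"],
                "rendering_hints": {},  # default stubs for the reviewer to refine
            }
            summary[bucket].append(key)
        elif action == "reject":
            summary["rejected"].append(key)
        else:
            summary["deferred"].append(key)  # "discuss": ask again next time
        ledger[key] = {"action": action, "fingerprint": fingerprint}
    return summary
```

Because only settled decisions with an unchanged fingerprint are skipped, a rejected candidate comes back for review as soon as new evidence accumulates, while a `discuss` outcome is always presented again.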

## Promotion Evidence-Bar + Bloat Guard

```task
id: AGENTIC-WP-0004-T04
status: done
priority: medium
state_hub_task_id: "d474425d-18af-48e4-8f5b-7716b2da0057"
```

Gate promotion on a **minimum trustworthy evidence threshold** (OQ5):
configurable floors on `frequency`, distinct supporting sessions, and, for
*distribution-eligible* patterns, `cross_flavor` and/or a `cost_impact` floor.
Candidates below the bar can be cataloged as `provisional` but not marked
distribution-ready. Add a **bloat guard** (OQ6): flag low-value or
near-duplicate patterns (same locus/signal-type already cataloged) so the
catalog stays lean and agent context budgets are protected. Knobs live in
`config.toml` alongside the existing retention/detect settings.
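
To make the gating rules above concrete, here is a small sketch of an evidence bar and a near-duplicate check. The threshold defaults, the `sessions`, `locus`, and `signal_type` field names, the returned status shape, and the reading of "and/or" as "either condition suffices" are all assumptions for illustration; the real knobs live in `config.toml` and may differ.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class EvidenceBar:
    """Hypothetical floors; the default values here are invented."""

    min_frequency: int = 3
    min_sessions: int = 2
    min_cost_impact: float = 1000.0  # units are not specified in the source


def gate(evidence: dict[str, Any], bar: EvidenceBar) -> dict[str, Any]:
    """Classify a candidate as provisional, approved, or distribution-ready."""
    distinct_sessions = len(set(evidence.get("sessions", [])))
    meets_floor = (
        evidence.get("frequency", 0) >= bar.min_frequency
        and distinct_sessions >= bar.min_sessions
    )
    if not meets_floor:
        # Below the bar: may be cataloged, but never distribution-ready.
        return {"status": "provisional", "distribution_ready": False}
    # Assumed reading of "and/or": either signal is enough for eligibility.
    eligible = bool(evidence.get("cross_flavor")) or (
        evidence.get("cost_impact", 0.0) >= bar.min_cost_impact
    )
    return {"status": "approved", "distribution_ready": eligible}


def bloat_flags(candidate: dict[str, Any], cataloged: list[dict[str, Any]]) -> list[str]:
    """Flag a candidate whose locus/signal-type pair is already cataloged."""
    seen = {(p["locus"], p["signal_type"]) for p in cataloged}
    flags = []
    if (candidate["locus"], candidate["signal_type"]) in seen:
        flags.append("near_duplicate")
    return flags
```

Under these assumptions a single-flavor pattern that clears the frequency and session floors is still approved, but it only becomes distribution-ready if its cost impact reaches the floor. Scoring for "low-value" patterns is left out because the source does not say how value is measured.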

## State Hub Decision Integration

```task
id: AGENTIC-WP-0004-T05
status: done
priority: medium
state_hub_task_id: "449f12d4-fae0-450d-873f-143b3a570b5a"
```

Record every promote/reject as an **auditable hub decision** (FR-U4) via the
decision API (`record_decision` / `resolve_decision`), capturing rationale, the
source candidate key, and the evidence snapshot. **Degrade gracefully** when the
hub API is down: queue decisions locally and sync later (mirrors Phase 1's
after-the-fact status sync, recorded in the milestone for `055713a`). Keep the
hub a read model: the catalog file is the durable artifact; the decision is the
audit trail.
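
The graceful-degradation path above can be pictured as a local append-only queue that is drained once the hub is reachable again. The sketch below is only an assumption about how that might be done: `send` stands in for the hub client call (the real `record_decision` signature is not shown in this workplan), and the JSONL queue file and function names are hypothetical.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path
from typing import Any, Callable


def record_or_queue(
    decision: dict[str, Any],
    send: Callable[[dict[str, Any]], Any],  # stand-in for the hub decision call
    queue_path: Path,
) -> str:
    """Try the hub first; on a connection failure, append to a local queue."""
    try:
        send(decision)
        return "recorded"
    except OSError:  # covers ConnectionError and TimeoutError
        with queue_path.open("a", encoding="utf-8") as handle:
            handle.write(json.dumps(decision, sort_keys=True) + "\n")
        return "queued"


def sync_queue(send: Callable[[dict[str, Any]], Any], queue_path: Path) -> int:
    """Replay queued decisions in order; stop at the first failure."""
    if not queue_path.exists():
        return 0
    lines = queue_path.read_text(encoding="utf-8").splitlines()
    pending = [json.loads(line) for line in lines if line.strip()]
    sent = 0
    for decision in pending:
        try:
            send(decision)
        except OSError:
            break
        sent += 1
    remaining = pending[sent:]
    if remaining:
        text = "".join(json.dumps(d, sort_keys=True) + "\n" for d in remaining)
        queue_path.write_text(text, encoding="utf-8")
    else:
        queue_path.unlink()  # nothing left to sync
    return sent


def demo() -> tuple[list[str], int, bool, list[str]]:
    def offline(_decision: dict[str, Any]) -> None:
        raise ConnectionError("hub API offline")

    delivered: list[dict[str, Any]] = []
    with tempfile.TemporaryDirectory() as tmp:
        queue = Path(tmp) / "pending_decisions.jsonl"
        outcomes = [
            record_or_queue({"source_key": key, "action": "promote"}, offline, queue)
            for key in ("a", "b", "c")
        ]
        synced = sync_queue(delivered.append, queue)  # hub is back online
        return outcomes, synced, queue.exists(), [d["source_key"] for d in delivered]
```

The catalog write never depends on the hub call succeeding, which matches the split above: the file is the durable artifact and the decision is the audit trail.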

## Curate Entrypoint (`python -m session_memory.curate`)

```task
id: AGENTIC-WP-0004-T06
status: done
priority: medium
state_hub_task_id: "95d7747e-8407-41af-9a60-b919a4ee5e06"
```

Add a `session_memory/curate/__main__.py` entrypoint consuming detect candidates
(ranked cross-flavor first): an **interactive** review mode plus a
**batch/non-interactive** mode (e.g. `--auto-approve` above the evidence bar, for
kaizen-agent review). It emits a **catalog diff summary** (added / version-bumped /
rejected) and machine-readable JSON. Document usage in `session_memory/README.md`
next to the existing `detect` instructions, including the
detect → curate → (Phase 3) distribute flow.
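
A minimal sketch of how such an entrypoint could parse its flags and build the diff summary is shown below. Only `--auto-approve` is named in this workplan; the `--json` flag on `curate`, the helper names, and the `skipped` bucket are assumptions made for the example.

```python
from __future__ import annotations

import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="python -m session_memory.curate")
    parser.add_argument(
        "--auto-approve",
        action="store_true",
        help="batch mode: approve candidates above the evidence bar without prompting",
    )
    parser.add_argument(
        "--json",  # assumed flag name, mirroring `detect --json`
        action="store_true",
        help="print the catalog diff summary as JSON",
    )
    return parser


def diff_summary(results: dict[str, str]) -> dict[str, list[str]]:
    """Group pattern ids by outcome; ids are sorted for stable output."""
    summary: dict[str, list[str]] = {
        "added": [], "version_bumped": [], "rejected": [], "skipped": [],
    }
    for pattern_id, outcome in sorted(results.items()):
        summary[outcome].append(pattern_id)
    return summary


def render(summary: dict[str, list[str]], as_json: bool) -> str:
    if as_json:
        return json.dumps(summary, sort_keys=True, indent=2)
    # Human-readable form: one count per outcome.
    return "\n".join(f"{name}: {len(ids)}" for name, ids in summary.items())


def main(argv: list[str] | None = None) -> int:
    args = build_parser().parse_args(argv)
    # The review itself is out of scope here; an empty result set is used.
    results: dict[str, str] = {}
    print(render(diff_summary(results), as_json=args.json))
    return 0
```

Keeping `main` parameterized on `argv` makes the entrypoint callable from tests without touching `sys.argv`.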

## Tests + Verify Against Live Phase 1 Candidates

```task
id: AGENTIC-WP-0004-T07
status: done
priority: medium
state_hub_task_id: "20407007-0a8b-4999-a470-fa3c84e17eba"
```

Unit tests for schema/catalog/review/gating on synthetic candidates, plus an
**end-to-end** run that promotes at least one **real cross-flavor** candidate from
the live detect output (the Claude+Grok "clean pass" / "abandoned" patterns from
the WP-0003 verification) into the catalog and confirms a hub decision is logged
(or queued if the API is down). Confirm catalog round-trips and versioning is
idempotent on re-run. Refresh design open questions **OQ4/OQ5/OQ6** (PRD §12).
After workplan file updates, notify the custodian operator to run from
`~/state-hub`:

```bash
make fix-consistency REPO=agentic-resources
```

**Verification results (2026-06-07):** full suite 72/72 green (26 new curate
tests across schema/catalog/review/gating/decisions/entrypoint). Live pipeline
over real local sessions: fresh ingest 94→93 → 72 digests; detect surfaced 3
candidates, **2 cross-flavor** (Claude+Grok). `curate --auto-approve` promoted
all 3 into the files-first catalog: `sp-success-clean_pass-outcome` and
`sp-problem-abandoned-outcome` (both cross-flavor, `approved`/`distribution_ready`)
plus `sp-problem-budget_overrun-tokens` (Claude-only). 3 hub decisions queued
(API offline). Re-run was fully idempotent (3 skipped, 0 catalog writes, no
version bump). PRD §12 OQ4/OQ5/OQ6 resolved. The 3 catalog artifacts are
committed as the source of truth; operator runs `make fix-consistency` to index
them in the hub.