agentic-resources/workplans/AGENTIC-WP-0004-session-memory-phase2.md
tegwick e51fd8154d session-memory Phase 2: review workflow (T03)
UI-free discuss/approve/reject engine driving detect candidates into the
catalog via a decide callback. candidate_to_pattern builds a provisional
SolutionPattern with per-flavor rendering-hint stubs. ReviewLog makes
re-review idempotent: prior rejects remembered, re-surfaced only when the
evidence fingerprint changes. 6 new tests; suite 58/58 green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-07 00:25:10 +02:00

---
id: AGENTIC-WP-0004
type: workplan
title: "Coding Session Memory — Phase 2 (Curate: review workflow + Pattern Catalog)"
domain: helix_forge
repo: agentic-resources
status: ready
owner: codex
topic_slug: helix-forge
created: "2026-06-06"
updated: "2026-06-06"
state_hub_workstream_id: "b3703684-f60e-42f3-b03e-dabe3e8ce3f4"
---
# Coding Session Memory — Phase 2 (Curate)
Implements the **Curate** phase (PRD §6.3, FR-U1 to FR-U4) of
[PRD-helix-forge](../docs/PRD-helix-forge.md), continuing
[AGENTIC-WP-0003](AGENTIC-WP-0003-session-memory-phase1.md) (Detect).

Phase 1 surfaces ranked **candidate** problem/success patterns with evidence
(`python -m session_memory.detect --json`, persisted to the Tier 2 `patterns`
table by `detect/cluster.py::Pattern`). Phase 2 turns those candidates into
**reviewed, versioned Solution Patterns** held in an in-repo **Pattern Catalog**,
the source of truth that Phase 3 (Distribute) renders into per-flavor artifacts.

Design boundary (ADR-001 / PRD §9): the catalog is **files-first**. Solution
patterns originate as versioned files in this repo; the State Hub indexes them and
records each promote/reject as an auditable decision. The agnostic core stays
flavor-neutral; per-flavor knowledge lives only in **rendering hints** consumed
later by distributor adapters (PRD §6.4 / FR-A2). New code lands under a new
`session_memory/curate/` package, mirroring the `detect/` layout from Phase 1.

Relevant design open questions this phase resolves: **OQ4** (one agnostic
representation that still gives distributors enough to render natively), **OQ5**
(minimum trustworthy evidence bar before a pattern is distribution-eligible),
**OQ6** (preventing pattern bloat / context-budget degradation).
## Solution Pattern Schema + Per-Flavor Rendering Hints
```task
id: AGENTIC-WP-0004-T01
status: done
priority: high
state_hub_task_id: "c6d20bb6-7b6c-48fd-bd25-30a349514f41"
```
Define the agnostic **Solution Pattern** artifact (FR-U2, OQ4) in
`session_memory/curate/schema.py`: stable id, name, semantic `version`, problem
description, one or more recommended resolutions, applicability scope
(repos/domains/flavors), provenance (source candidate `key` + an evidence
snapshot copied from the detect `Pattern`), and **per-flavor rendering hints**
kept in a separate sub-structure so the core stays flavor-agnostic while
distributors get enough to render high-quality native artifacts. Dataclass +
deterministic serialization (sorted keys), reusing the `Pattern.to_dict()`
contract for the embedded evidence. Unit-tested for round-trip stability.
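
As a rough illustration of the shape described above, the sketch below shows a dataclass with sorted-key JSON serialization and a round-trip check. Every field name, the `to_json`/`from_json` helpers, and the sample values are assumptions made for the example; the real `schema.py` may differ, and the actual evidence snapshot would follow the `Pattern.to_dict()` contract rather than the made-up dict used here.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SolutionPattern:
    """Illustrative shape only; every field name here is an assumption."""

    id: str
    name: str
    version: str
    problem: str
    resolutions: list
    scope: dict = field(default_factory=dict)
    source_key: str = ""
    evidence: dict = field(default_factory=dict)
    # Flavor-specific knowledge is kept apart from the agnostic core fields.
    rendering_hints: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # sort_keys gives a byte-stable serialization for identical content.
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "SolutionPattern":
        return cls(**json.loads(text))


# Made-up sample data, used only to exercise the round trip.
pattern = SolutionPattern(
    id="sp-example",
    name="Example pattern",
    version="0.1.0",
    problem="Agent retries a failing command without reading the error.",
    resolutions=["Read the error output before retrying."],
    scope={"flavors": ["claude", "grok"]},
    source_key="candidate-key-123",
    evidence={"frequency": 4, "cross_flavor": True},
    rendering_hints={"claude": {}, "grok": {}},
)
restored = SolutionPattern.from_json(pattern.to_json())
```

Keeping `rendering_hints` as its own mapping keyed by flavor is one way to let the core fields stay flavor-neutral while still carrying distributor-specific detail.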
## Versioned Pattern Catalog Store (files-first)
```task
id: AGENTIC-WP-0004-T02
status: done
priority: high
state_hub_task_id: "d40c7810-fd1e-4b14-8577-b8a64ddd337b"
```
Implement the in-repo **Pattern Catalog** as the source of truth (FR-U3, ADR-001)
in `session_memory/curate/catalog.py`: versioned solution-pattern files under a
catalog dir (e.g. `session_memory/catalog/<pattern-id>.json`), stable IDs, a
version bump on edit (supersede-in-place with history preserved), and
load/save/list with **dedup on pattern identity** (the source candidate key).
Work originates in files; the hub only indexes them. Verify that save→load is lossless and
re-saving an unchanged pattern is a no-op (no spurious version bump).
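
The no-op and version-bump behavior can be sketched as follows. This is a simplified illustration, not the real `catalog.py`: patterns are plain dicts, the version is an integer counter rather than a semantic version, and `save_pattern`, `load_pattern`, and the `history` field are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def _content(record: dict) -> dict:
    # Everything except the bookkeeping fields managed by the store.
    return {k: v for k, v in record.items() if k not in ("version", "history")}


def load_pattern(catalog_dir: Path, pattern_id: str) -> dict:
    path = catalog_dir / f"{pattern_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))


def save_pattern(catalog_dir: Path, pattern: dict) -> bool:
    """Write <catalog_dir>/<id>.json; return True only if the file changed."""
    path = catalog_dir / f"{pattern['id']}.json"
    if path.exists():
        stored = json.loads(path.read_text(encoding="utf-8"))
        if _content(stored) == _content(pattern):
            return False  # unchanged content: no write, no version bump
        version = stored["version"] + 1
        # Supersede in place, keeping the previous content as history.
        history = stored["history"] + [
            {"version": stored["version"], **_content(stored)}
        ]
    else:
        version, history = 1, []
    record = {**_content(pattern), "version": version, "history": history}
    path.write_text(json.dumps(record, sort_keys=True, indent=2), encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    catalog = Path(tmp)
    draft = {"id": "sp-example", "source_key": "cand-a", "problem": "flaky tests"}
    first = save_pattern(catalog, draft)
    second = save_pattern(catalog, draft)  # same content again
    third = save_pattern(catalog, {**draft, "problem": "flaky integration tests"})
    final = load_pattern(catalog, "sp-example")
```

Comparing content with the bookkeeping fields stripped is what keeps a repeated save from producing a spurious bump.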
## Review Workflow (discuss / approve / reject → promote)
```task
id: AGENTIC-WP-0004-T03
status: done
priority: high
state_hub_task_id: "e303d01f-564e-4499-9ce5-22cf959ed84c"
```
Implement the curation workflow (FR-U1/FR-U2) in
`session_memory/curate/review.py`: load Phase 1 detect candidates with their
evidence (cross-flavor first), present each candidate, accept a
**discuss/approve/reject** action, and on **approve** promote the candidate into
a Solution Pattern written to the catalog (T02) with default rendering-hint
stubs the reviewer can refine. Re-review is **idempotent**: candidates already
promoted are matched on source key and updated in place, never duplicated; a
prior reject is remembered so it is not re-surfaced unless evidence changed.
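
A minimal sketch of the idempotent review loop, assuming a `decide` callback that returns one of `"approve"`, `"reject"`, or `"discuss"`. The in-memory `catalog` and `rejected` dicts, the `evidence_fingerprint` helper, and the candidate dict layout are hypothetical stand-ins for the real catalog store and review log.

```python
import hashlib
import json


def evidence_fingerprint(candidate: dict) -> str:
    # Stable hash of the evidence, so a change in evidence is detectable.
    blob = json.dumps(candidate["evidence"], sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def review(candidates, decide, catalog: dict, rejected: dict) -> list:
    """Run one review pass; return the keys that were actually surfaced."""
    surfaced = []
    for cand in candidates:
        key, fingerprint = cand["key"], evidence_fingerprint(cand)
        if rejected.get(key) == fingerprint:
            continue  # rejected before on identical evidence: stay quiet
        surfaced.append(key)
        action = decide(cand)
        if action == "approve":
            # Keyed by source key, so a repeat approval updates in place.
            catalog[key] = {
                "source_key": key,
                "evidence": cand["evidence"],
                "rendering_hints": {},  # stubs for the reviewer to refine
            }
            rejected.pop(key, None)
        elif action == "reject":
            rejected[key] = fingerprint
        # "discuss" leaves the candidate pending for a later pass.
    return surfaced


catalog, rejected = {}, {}
candidates = [
    {"key": "a", "evidence": {"frequency": 3}},
    {"key": "b", "evidence": {"frequency": 1}},
]


def decide(cand):
    return "approve" if cand["key"] == "a" else "reject"


first_pass = review(candidates, decide, catalog, rejected)
second_pass = review(candidates, decide, catalog, rejected)
candidates[1]["evidence"]["frequency"] = 2  # evidence for "b" changes
third_pass = review(candidates, decide, catalog, rejected)
```

Because the decision comes from a callback, the same loop can serve an interactive prompt or a scripted policy without any UI code in the engine.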
## Promotion Evidence-Bar + Bloat Guard
```task
id: AGENTIC-WP-0004-T04
status: todo
priority: medium
state_hub_task_id: "d474425d-18af-48e4-8f5b-7716b2da0057"
```
Gate promotion on a **minimum trustworthy evidence threshold** (OQ5):
configurable floors on `frequency`, distinct supporting sessions, and — for
*distribution-eligible* patterns — `cross_flavor` and/or a `cost_impact` floor.
Candidates below the bar can be cataloged as `provisional` but not marked
distribution-ready. Add a **bloat guard** (OQ6): flag low-value or
near-duplicate patterns (same locus/signal-type already cataloged) so the
catalog stays lean and agent context budgets are protected. Knobs live in
`config.toml` alongside the existing retention/detect settings.
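
One possible reading of the gate and the bloat guard is sketched below. The threshold values, the `EvidenceBar` class, the `sessions`, `locus`, and `signal_type` keys, and the choice to treat "and/or" as a plain "or" are all assumptions for illustration; in practice the floors would come from `config.toml`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceBar:
    # Hypothetical defaults; real values would be read from config.toml.
    min_frequency: int = 3
    min_sessions: int = 2
    min_cost_impact: float = 10.0


def classify(evidence: dict, bar: EvidenceBar) -> str:
    """Return 'distribution_ready' or 'provisional' for a candidate."""
    meets_base = (
        evidence.get("frequency", 0) >= bar.min_frequency
        and evidence.get("sessions", 0) >= bar.min_sessions
    )
    # Assumption: either cross-flavor support or enough cost impact suffices.
    meets_distribution = (
        bool(evidence.get("cross_flavor"))
        or evidence.get("cost_impact", 0.0) >= bar.min_cost_impact
    )
    return "distribution_ready" if meets_base and meets_distribution else "provisional"


def is_near_duplicate(candidate: dict, cataloged: list) -> bool:
    # Bloat guard: same locus and signal type is already in the catalog.
    seen = {(p["locus"], p["signal_type"]) for p in cataloged}
    return (candidate["locus"], candidate["signal_type"]) in seen


bar = EvidenceBar()
strong = {"frequency": 5, "sessions": 3, "cross_flavor": True}
weak = {"frequency": 1, "sessions": 1, "cross_flavor": True}
costly = {"frequency": 5, "sessions": 3, "cross_flavor": False, "cost_impact": 12.5}
existing = [{"locus": "tests/", "signal_type": "abandoned"}]
```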
## State Hub Decision Integration
```task
id: AGENTIC-WP-0004-T05
status: todo
priority: medium
state_hub_task_id: "449f12d4-fae0-450d-873f-143b3a570b5a"
```
Record every promote/reject as an **auditable hub decision** (FR-U4) via the
decision API (`record_decision` / `resolve_decision`), capturing rationale, the
source candidate key, and the evidence snapshot. **Degrade gracefully** when the
hub API is down — queue decisions locally and sync later (mirrors Phase 1's
after-the-fact status sync, recorded in the milestone for `055713a`). Keep the
hub a read model: the catalog file is the durable artifact; the decision is the
audit trail.
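
The queue-and-sync fallback might look roughly like this. The `send` callable stands in for whatever wraps the real decision API (its signature is not shown in this workplan), and `record_or_queue`, `flush_queue`, and the JSON Lines queue file are hypothetical names and choices.

```python
import json
import tempfile
from pathlib import Path


def record_or_queue(decision: dict, send, queue_path: Path) -> bool:
    """Try the hub first; on a connection failure append to a local queue."""
    try:
        send(decision)
        return True
    except OSError:  # ConnectionError and socket timeouts are OSError subclasses
        with queue_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(decision, sort_keys=True) + "\n")
        return False


def flush_queue(send, queue_path: Path) -> int:
    """Replay queued decisions in order; keep whatever still fails."""
    if not queue_path.exists():
        return 0
    lines = queue_path.read_text(encoding="utf-8").splitlines()
    pending = [json.loads(line) for line in lines if line.strip()]
    sent = 0
    for decision in pending:
        try:
            send(decision)
        except OSError:
            break  # hub still down: stop and keep the rest queued
        sent += 1
    remaining = pending[sent:]
    queue_path.write_text(
        "".join(json.dumps(d, sort_keys=True) + "\n" for d in remaining),
        encoding="utf-8",
    )
    return sent


def hub_down(decision):
    raise ConnectionError("hub unreachable")


delivered = []
with tempfile.TemporaryDirectory() as tmp:
    queue = Path(tmp) / "pending_decisions.jsonl"
    decision = {"source_key": "cand-a", "action": "reject", "rationale": "too rare"}
    accepted = record_or_queue(decision, hub_down, queue)
    flushed = flush_queue(delivered.append, queue)
    leftover = queue.read_text(encoding="utf-8")
```

Since the catalog file is written regardless of the hub's availability, a queued decision only delays the audit record.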
## Curate Entrypoint (`python -m session_memory.curate`)
```task
id: AGENTIC-WP-0004-T06
status: todo
priority: medium
state_hub_task_id: "95d7747e-8407-41af-9a60-b919a4ee5e06"
```
Add a `session_memory/curate/__main__.py` entrypoint consuming detect candidates
(ranked cross-flavor first): an **interactive** review mode plus a
**batch/non-interactive** mode (e.g. `--auto-approve` above the evidence bar, for
kaizen-agent review). Emits a **catalog diff summary** (added / version-bumped /
rejected) and machine-readable JSON. Document usage in `session_memory/README.md`
next to the existing `detect` instructions, including the
detect → curate → (Phase 3) distribute flow.
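
A sketch of how the command-line surface and the diff summary could be wired with `argparse`. Only `--auto-approve` comes from the text above; the `--json` flag on this entrypoint, the `build_parser` and `render` helpers, and the summary key names are assumptions.

```python
import argparse
import json


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="python -m session_memory.curate")
    parser.add_argument(
        "--auto-approve",
        action="store_true",
        help="batch mode: approve candidates above the evidence bar",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        help="emit the catalog diff summary as JSON (assumed flag)",
    )
    return parser


def render(summary: dict, as_json: bool) -> str:
    """Format the added / version-bumped / rejected summary."""
    if as_json:
        return json.dumps(summary, sort_keys=True)
    return "\n".join(
        f"{label}: {', '.join(ids) or '-'}" for label, ids in summary.items()
    )


args = build_parser().parse_args(["--auto-approve", "--json"])
# Made-up summary for illustration; a real run would build it from the review.
summary = {"added": ["sp-example"], "version_bumped": [], "rejected": ["cand-b"]}
machine_output = render(summary, as_json=args.json)
human_output = render(summary, as_json=False)
```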
## Tests + Verify Against Live Phase 1 Candidates
```task
id: AGENTIC-WP-0004-T07
status: todo
priority: medium
state_hub_task_id: "20407007-0a8b-4999-a470-fa3c84e17eba"
```
Unit tests for schema/catalog/review/gating on synthetic candidates, plus an
**end-to-end** run that promotes at least one **real cross-flavor** candidate from
the live detect output (the Claude+Grok "clean pass" / "abandoned" patterns from
the WP-0003 verification) into the catalog and confirms a hub decision is logged
(or queued if the API is down). Confirm catalog round-trips and versioning is
idempotent on re-run. Refresh design open questions **OQ4/OQ5/OQ6** in
[DESIGN-session-memory.md](../docs/DESIGN-session-memory.md). After workplan file
updates, notify the custodian operator to run from `~/state-hub`:
```bash
make fix-consistency REPO=agentic-resources
```