From 5aea22f24f0b992cbe55ccb42ed604accc3a29bd Mon Sep 17 00:00:00 2001
From: tegwick
Date: Sat, 6 Jun 2026 21:50:23 +0200
Subject: [PATCH] Register AGENTIC-WP-0003 (session-memory Phase 1) with State Hub

Codex + Grok adapters, multi-file session merge, and the Detect pipeline
(signals -> clustering -> evidence -> candidate report).

Co-Authored-By: Claude Opus 4.8
---
 .../AGENTIC-WP-0003-session-memory-phase1.md | 163 ++++++++++++++++++
 1 file changed, 163 insertions(+)
 create mode 100644 workplans/AGENTIC-WP-0003-session-memory-phase1.md

diff --git a/workplans/AGENTIC-WP-0003-session-memory-phase1.md b/workplans/AGENTIC-WP-0003-session-memory-phase1.md
new file mode 100644
index 0000000..d7b55d8
--- /dev/null
+++ b/workplans/AGENTIC-WP-0003-session-memory-phase1.md
@@ -0,0 +1,163 @@
+---
+id: AGENTIC-WP-0003
+type: workplan
+title: "Coding Session Memory — Phase 1 (Codex + Grok adapters, Detect)"
+domain: helix_forge
+repo: agentic-resources
+status: active
+owner: codex
+topic_slug: helix-forge
+created: "2026-06-06"
+updated: "2026-06-06"
+state_hub_workstream_id: "88c75b47-1c89-43bc-bb3e-739ec3c8f7d4"
+---
+
+# Coding Session Memory — Phase 1
+
+Extends Phase 0 ([AGENTIC-WP-0002](AGENTIC-WP-0002-session-memory-phase0.md)) along
+two axes of [PRD-helix-forge](../docs/PRD-helix-forge.md):
+
+1. **Multi-flavor capture (G1/G6):** add the Codex and Grok collector adapters so
+   the agnostic core ingests all three families through thin edges.
+2. **Detect (PRD §6.2):** run signal extractors over normalized sessions, cluster
+   recurring signals into candidate problem/success patterns, attach evidence, and
+   flag cross-flavor patterns.
+
+Both flavors' on-disk schemas are already confirmed in
+[DESIGN-session-memory.md](../docs/DESIGN-session-memory.md) §2.2 (Codex) and §2.3
+(Grok), with the native→`kind` mapping in §4.3 — so the adapters are written
+against known structures, not discovered ones.
+
+## Codex Collector Adapter
+
+```task
+id: AGENTIC-WP-0003-T01
+status: todo
+priority: high
+state_hub_task_id: "91264fd4-ba99-4add-b317-e2320c3c932c"
+```
+
+Implement `adapters/codex.py` reading `~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl`
+per design §2.2: line wrapper `{timestamp,type,payload}`; map `session_meta`→Session
+fields, `turn_context`→model, `response_item/message`→`user_msg`/`assistant_msg`,
+`function_call`+`function_call_output` (joined on `call_id`)→`tool_call`/`tool_result`,
+`reasoning`→`thinking`, `event_msg/task_*`→`lifecycle`/`completion`,
+`event_msg/token_count`→cost. Codex is flat: assign `seq`/`parent_seq` by temporal
+order (no native DAG). Version-detect on `session_meta.cli_version`. Reuse the
+`Normalized` bundle contract. Tests use synthetic rollout fixtures; confirm the
+`token_count` payload field names against a real install if Codex is present
+(design OQ1 residual).
+
+## Grok Collector Adapter
+
+```task
+id: AGENTIC-WP-0003-T02
+status: todo
+priority: high
+state_hub_task_id: "fe3d7d1c-110e-4f16-8d56-062fa4a651aa"
+```
+
+Implement `adapters/grok.py` reading the per-session directory
+`~/.grok/sessions///` per design §2.3: `summary.json`→Session
+id/cwd/timestamps, `chat_history.jsonl`→messages, `events.jsonl`→explicit
+`lifecycle` events and `turn_number` (key `seq` off it), tool calls/results from
+`chat_history`/`updates.jsonl`, token fields from events/updates. Resolve the
+url-encoded cwd dir name back to a path. Tests against the real local Grok
+sessions on this workstation plus a synthetic dir fixture.
+
+## Multi-File / Multi-Part Session Merge
+
+```task
+id: AGENTIC-WP-0003-T03
+status: todo
+priority: medium
+state_hub_task_id: "c4acfb63-84cd-4299-a44d-91bb6857fa88"
+```
+
+Address design OQ6 (surfaced in Phase 0): several files can map to one
+`session_uid` (resume, sidechains; Grok dirs are inherently multi-file). Change
+the store/ingest path to **merge** events across parts of one session rather than
+last-file-wins upsert — stable event ordering and de-duplication keyed on native
+identity. Verify event counts are additive and idempotent on re-run.
+
+## Signal Extractors
+
+```task
+id: AGENTIC-WP-0003-T04
+status: todo
+priority: high
+state_hub_task_id: "20920c5d-16f7-43bb-9ed7-9afbfeaf7207"
+```
+
+Implement `detect/signals.py`: derive `Signal`s from normalized sessions/digests —
+e.g. repeated test failure on the same target, budget overrun (cost vs. peers),
+retry storm, fast clean resolution, human escalation, error-then-recovery. Each
+signal carries its source `session_uid`, locus (file/tool/task), polarity
+(problem|success), and magnitude. Pure functions over Tier 1 events + Tier 2
+digests; no new capture. Unit-tested on synthetic sessions.
+
+## Pattern Clusterer
+
+```task
+id: AGENTIC-WP-0003-T05
+status: todo
+priority: high
+state_hub_task_id: "f42d57f6-34dc-4a92-bf6a-4d8eab572467"
+```
+
+Implement `detect/cluster.py`: group recurring signals across sessions/repos/
+flavors into candidate `ProblemPattern`/`SuccessPattern` records (PRD §5). Start
+with deterministic keyed clustering (locus + signal-type + normalized message);
+leave embedding-based similarity as a later option. Output candidates with
+frequency and member session lists.
+
+## Pattern Evidence + Cross-Flavor Flagging
+
+```task
+id: AGENTIC-WP-0003-T06
+status: todo
+priority: medium
+state_hub_task_id: "8fd502d6-d138-4a42-acd5-6f5921859605"
+```
+
+For each candidate pattern (PRD §6.2 FR-D3/FR-D4) attach evidence: supporting
+sessions, frequency, affected repos, affected **flavors**, and estimated cost
+impact (token/retry deltas vs. baseline). Explicitly flag candidates whose
+evidence spans more than one flavor as `cross_flavor: true` — the highest-value
+reuse targets. Persist candidates to a Tier 2 `patterns` store/table.
+
+## Candidate Pattern Report
+
+```task
+id: AGENTIC-WP-0003-T07
+status: todo
+priority: medium
+state_hub_task_id: "34a96d5d-9165-4761-b91e-3643b0401410"
+```
+
+Add a `detect` entrypoint (`python -m session_memory.detect`) that runs extractors
+→ clusterer → evidence and emits a human-readable candidate report (ranked by
+cost impact × frequency, cross-flavor first), plus machine-readable JSON. This is
+the input to the Curate phase (Phase 2) review workflow. Document usage in the
+session_memory README.
+
+## Verify Across All Three Flavors
+
+```task
+id: AGENTIC-WP-0003-T08
+status: todo
+priority: medium
+state_hub_task_id: "b272c3fa-af81-4a6c-9ed9-7b42173efa81"
+```
+
+Run the full pipeline (ingest all enabled sources → digest → detect) against the
+real local Claude and Grok sessions on this workstation (Codex via fixtures if not
+installed). Confirm: normalized rows for each flavor, at least one candidate
+pattern surfaced, and at least one **cross-flavor** pattern detected if the data
+supports it (PRD success metric). Record results and refresh design open
+questions. After workplan file updates, notify the custodian operator to run from
+`~/state-hub`:
+
+```bash
+make fix-consistency REPO=agentic-resources
+```
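
Illustrative note on T05 to T07: the deterministic keyed clustering, cross-flavor flagging, and report ranking described in the workplan could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the planned `detect/cluster.py`. The `Signal` field names, the `flavor` attribute carried on each signal, the `normalize_message` rule, the dict-shaped candidates, and the use of summed magnitude as a stand-in for cost impact are all hypothetical; the workplan only names session_uid, locus, polarity, and magnitude.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    # Hypothetical shape: the workplan names session_uid, locus, polarity and
    # magnitude; flavor, signal_type and message are assumptions for this sketch.
    session_uid: str
    flavor: str        # e.g. "claude", "codex", "grok"
    signal_type: str   # e.g. "repeated_test_failure"
    locus: str         # file, tool, or task the signal points at
    polarity: str      # "problem" or "success"
    magnitude: float
    message: str = ""


def normalize_message(message: str) -> str:
    """Lowercase, replace digit runs with N, and collapse whitespace."""
    lowered = re.sub(r"\d+", "N", message.lower())
    return re.sub(r"\s+", " ", lowered).strip()


def cluster_signals(signals):
    """Group signals by (locus, signal_type, normalized message)."""
    groups = defaultdict(list)
    for sig in signals:
        key = (sig.locus, sig.signal_type, normalize_message(sig.message))
        groups[key].append(sig)

    candidates = []
    for key, members in groups.items():
        sessions = sorted({m.session_uid for m in members})
        flavors = sorted({m.flavor for m in members})
        candidates.append({
            "key": key,
            # Assumes a signal type implies a single polarity.
            "polarity": members[0].polarity,
            "frequency": len(sessions),
            "sessions": sessions,
            "flavors": flavors,
            "cross_flavor": len(flavors) > 1,
            # Stand-in for cost impact: summed magnitude of member signals.
            "impact": sum(m.magnitude for m in members),
        })

    # Cross-flavor candidates first, then descending impact x frequency.
    candidates.sort(
        key=lambda c: (not c["cross_flavor"], -(c["impact"] * c["frequency"]))
    )
    return candidates
```

Keying on an exact tuple keeps the grouping reproducible across runs, which fits the plan to start deterministic and leave embedding-based similarity for later. A real implementation would presumably look up the flavor from the session record rather than carry it on each signal.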