established rules

Add .repo-classification.yaml (CUST-WP-0050 T11 agent first-pass)
chore(consistency): sync task status from DB [auto]
2026-06-22 23:06:36 +02:00 · 2026-06-22 17:47:34 +02:00 · 2026-06-21 16:09:45 +02:00 · 2026-06-21 16:09:34 +02:00 · 2026-06-19 20:37:50 +02:00 · 2026-06-19 20:27:00 +02:00
126 changed files with 10478 additions and 15 deletions
--- a/.claude/rules/agents.md
+++ b/.claude/rules/agents.md
@@ -0,0 +1,20 @@
 ## Kaizen Agents
 Specialized agent personas available on demand via the state-hub MCP.
 **Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
 **Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them
 Common agents:
 | Agent | Category | When to use |
 |-------|----------|-------------|
 | `tdd-workflow` | testing | Step-by-step TDD8 workflow for any feature |
 | `code-refactoring` | quality | Code quality analysis and safe refactoring |
 | `test-maintenance` | testing | Diagnose and fix failing tests |
 | `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
 | `keepaTodofile` | process | Maintain TODO.md during work |
 | `project-management` | process | Track status, determine next steps |
 | `datamodel-optimization` | quality | Optimize dataclasses and data structures |
 All 17 agents: call `list_kaizen_agents()` for the full list.
--- a/.claude/rules/architecture.md
+++ b/.claude/rules/architecture.md
@@ -0,0 +1,8 @@
 ## Architecture
 <!-- TODO: Describe the key design decisions and component structure.
     Key modules, data flows, external integrations, state machines, etc. -->
 ## Quick Reference
 `~/state-hub/mcp_server/TOOLS.md` — MCP tool reference
--- a/.claude/rules/credential-routing.md
+++ b/.claude/rules/credential-routing.md
@@ -0,0 +1,50 @@
 # Credential and access routing
 **Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
 for inference. Run this check **before** requesting secrets, API keys, SSH access,
 login tokens, or database passwords — in any repo, not only `ops-warden`.
 ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
 other credential need belongs to another subsystem. **Do not** message
 `ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
 ### Lookup (do this first)
 ```bash
 warden route find "<describe your need>" --json
 warden route show <catalog-id> --json
 ```
 Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
 | Agent runtime | How to orient |
 | --- | --- |
 | **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=agentic-resources` is for coordination, not secret vending |
 | **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
 | **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
 ### Quick routing table
 | I need… | Owner | ops-warden executes? |
 | --- | --- | --- |
 | SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
 | API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
 | Login / OIDC / MFA | key-cape / Keycloak | No — route only |
 | Authorization decision | flex-auth | No — route only |
 | activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
 | SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
 ### Anti-patterns (do not do these)
 - `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
 - Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
 - Pasting secrets into Git, State Hub, workplans, logs, or chat
 ### Other capabilities (reuse-surface)
 Non-credential capabilities are usually discovered through **reuse-surface** federation
 (`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
 every repo's agent instructions because it is high-frequency, high-risk, and easy to
 get wrong.
 **Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
--- a/.claude/rules/first-session.md
+++ b/.claude/rules/first-session.md
@@ -0,0 +1,38 @@
 ## First Session Protocol
 Triggered when `get_domain_summary("infotech")` shows **no workstreams**.
 The project is registered but work has not yet been structured.
 **Step 1 — Read, don't write**
 - `~/the-custodian/canon/projects/infotech/project_charter_v0.1.md` — purpose, scope
 - `~/the-custodian/canon/projects/infotech/roadmap_v0.1.md` — planned phases
 - Scan repo root: README, directory structure, existing code or docs
 **Step 2 — Survey in-progress work**
 Look for TODOs, open branches, and half-finished files. Note what is done versus what is started but incomplete.
 **Step 3 — Propose workstreams to Bernd**
 Propose 1–3 workstreams — each a coherent strand, weeks to months, anchored to a
 roadmap phase. **Wait for approval before creating.**
 **Step 4 — Create workplan file first, then DB record (ADR-001)**
 ```
 workplans/AGENTIC-WP-NNNN-<slug>.md   ← write this first
 ```
 Then register in the hub:
 ```
 create_workstream(topic_id="f39fa2a3-c491-414c-a91b-b4c5fcc6139c", title="...", owner="...", description="...")
 create_task(workstream_id="<id>", title="...", priority="high|medium|low")
 ```
 **Step 5 — Record the setup**
 ```
 add_progress_event(
    summary="First session: structured infotech into N workstreams, M tasks",
    event_type="milestone",
    topic_id="f39fa2a3-c491-414c-a91b-b4c5fcc6139c",
    detail={"workstreams": [...], "tasks_created": M}
 )
 ```
 <!-- Delete or archive this file once past first session -->
--- a/.claude/rules/repo-boundary.md
+++ b/.claude/rules/repo-boundary.md
@@ -0,0 +1,8 @@
 ## Repo boundary
 This repo owns **agentic-resources** only. It does not own:
 <!-- TODO: List what belongs in adjacent repos, e.g.:
 - SSH key management → railiance-infra/
 - State hub code     → state-hub/
 -->
--- a/.claude/rules/repo-identity.md
+++ b/.claude/rules/repo-identity.md
@@ -0,0 +1,5 @@
 **Purpose:** Iterating towards optimal agentic performance.
 **Domain:** infotech
 **Repo slug:** agentic-resources
 **Topic ID:** f39fa2a3-c491-414c-a91b-b4c5fcc6139c
--- a/.claude/rules/session-protocol.md
+++ b/.claude/rules/session-protocol.md
@@ -0,0 +1,85 @@
 ## Session Protocol
 Dev Hub (State Hub API): http://127.0.0.1:8000
 MCP server name in `~/.claude.json`: `dev-hub`
 **Step 1 — Orient**
 Read the offline-safe brief first — it works without a live hub connection:
 ```bash
 cat .custodian-brief.md
 ```
 Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
 ```
 get_domain_summary("infotech")
 ```
 If MCP tools are unavailable in the current agent session, use the REST API:
 ```bash
 curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
 ```
 If the hub is offline: `cd ~/state-hub && make api`
 **Step 2 — Check inbox**
 With MCP tools:
 ```
 get_messages(to_agent="agentic-resources", unread_only=True)
 ```
 Mark read with `mark_message_read(message_id)`. Reply or act on coordination
 requests before proceeding.
 Without MCP tools:
 ```bash
 curl -s "http://127.0.0.1:8000/messages/?to_agent=agentic-resources&unread_only=true" \
  | python3 -m json.tool
 curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
 ```
 **Step 3 — Scan workplans**
 ```bash
 ls workplans/
 ```
 For each file with `status: ready`, `active`, or `blocked`, note pending
 `wait`/`todo`/`progress` tasks.
 **Step 4 — Present brief**
 1. **Active workstreams** for `infotech` — title, task counts, blocking decisions
 2. **Pending tasks** from `workplans/` + any `[repo:agentic-resources]` hub tasks
 3. **Goal guidance** — if `goal_guidance` in summary:
   - `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
   - `alignment_warnings`: flag if active work is not aligned with current goal
 4. **Suggested next action** — highest-priority open item
 5. **SBOM status** — flag if `last_sbom_at` is unset for this repo
 If no workstreams: follow First Session Protocol (`first-session.md`).
 **During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`
 > State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
 > are First Session Protocol only. Work structure belongs in repo files (ADR-001).
 **Session close:**
 With MCP tools:
 ```
 add_progress_event(summary="...", topic_id="f39fa2a3-c491-414c-a91b-b4c5fcc6139c", workstream_id="<uuid>")
 ```
 Without MCP tools:
 ```bash
 curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{"topic_id":"f39fa2a3-c491-414c-a91b-b4c5fcc6139c","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
 ```
 If workplan files were modified, ensure the local copy is up to date first:
 ```bash
 git -C <repo_path> pull --ff-only
 cd ~/state-hub && make fix-consistency REPO=agentic-resources
 ```
 For repos where implementation runs on a remote machine (e.g. CoulombCore),
 use the combined target which pulls before fixing:
 ```bash
 cd ~/state-hub && make fix-consistency-remote REPO=agentic-resources
 ```
 **C-15** (DB task ahead of file) is normal in multi-machine workflows; writeback
 will sync the file to match the DB. **C-16** (repo behind remote) blocks all writes
 until you pull, which is intentional: it prevents clobbering remote progress.
--- a/.claude/rules/stack-and-commands.md
+++ b/.claude/rules/stack-and-commands.md
@@ -0,0 +1,19 @@
 ## Stack
 <!-- TODO: Fill in language, frameworks, and key dependencies -->
 - **Language:**
 - **Key deps:**
 ## Dev Commands
 ```bash
 # TODO: Fill in the standard commands for this repo
 # Install dependencies
 # Run tests
 # Lint / type check
 # Build / package (if applicable)
 ```
--- a/.claude/rules/workplan-convention.md
+++ b/.claude/rules/workplan-convention.md
@@ -0,0 +1,40 @@
 ## Workplan Convention (ADR-001)
 File location: `workplans/AGENTIC-WP-NNNN-<slug>.md`
 ID prefix: `AGENTIC-WP-`
 Work items originate as files in this repo **before** being registered in the hub.
 Canonical workplan/workstream frontmatter statuses are:
 `proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
 Use `proposed` for a newly drafted plan, `ready` after review against current
 repo state, and `finished` when implementation is complete. `stalled` and
 `needs_review` are derived health labels, not stored statuses.
 Closed workplans may be moved to `workplans/archived/` with a completion-date
 prefix: `YYMMDD-AGENTIC-WP-NNNN-<slug>.md`. The frontmatter id remains
 unchanged; the prefix is only for quick visual reference.
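A minimal sketch of the archive rename described above, assuming the completion date is known and the rest of the filename is kept as-is. `archived_name` is a hypothetical helper for illustration, not part of the state-hub tooling.

```python
from datetime import date
from pathlib import PurePosixPath


def archived_name(workplan_path: str, completed: date) -> str:
    """Return the archived path with a YYMMDD completion-date prefix."""
    path = PurePosixPath(workplan_path)
    prefix = completed.strftime("%y%m%d")
    # Only the filename changes; the frontmatter id inside the file does not.
    return str(path.parent / "archived" / f"{prefix}-{path.name}")


print(archived_name("workplans/AGENTIC-WP-0005-detect-hardening.md", date(2026, 6, 7)))
```

The prefix sorts archived plans chronologically in a plain directory listing, which is the "quick visual reference" the convention asks for.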
 Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
 `workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
 `ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
 directly. Promote anything requiring analysis, design, approval, dependencies, or
 multiple planned phases into a normal workplan.
 Ecosystem todos from other agents arrive as `[repo:agentic-resources]` hub tasks —
 visible at session start. Pick one up by creating the workplan file, then registering
 the workstream.
 Task blocks use this shape:
 ```task
 id: AGENTIC-WP-NNNN-T01
 status: wait | todo | progress | done | cancel
 priority: high | medium | low
 state_hub_task_id: "<uuid>"         # written by fix-consistency — do not edit
 ```
 Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
 blocked work and `cancel` for stopped work.
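As a rough illustration of the task-block shape, the sketch below parses the `key: value` lines and rejects statuses or priorities outside the listed sets. `parse_task_block` is a hypothetical helper; real task blocks may carry more fields than the four shown above, and the consistency tooling may parse them differently.

```python
VALID_STATUSES = {"wait", "todo", "progress", "done", "cancel"}
VALID_PRIORITIES = {"high", "medium", "low"}


def parse_task_block(body: str) -> dict[str, str]:
    """Parse the `key: value` lines of one task block into a dict."""
    fields: dict[str, str] = {}
    for raw in body.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip().strip('"')
    if fields.get("status") not in VALID_STATUSES:
        raise ValueError(f"unknown status: {fields.get('status')!r}")
    if fields.get("priority") not in VALID_PRIORITIES:
        raise ValueError(f"unknown priority: {fields.get('priority')!r}")
    return fields


example = """id: AGENTIC-WP-NNNN-T01
status: todo
priority: high
state_hub_task_id: "<uuid>"   # written by fix-consistency
"""
task = parse_task_block(example)
print(task["id"], task["status"], task["priority"])
```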
 <!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->
--- a/.custodian-brief.md
+++ b/.custodian-brief.md
@@ -2,18 +2,12 @@
 # Custodian Brief — agentic-resources
 **Domain:** helix_forge  
-**Last synced:** 2026-06-05 22:10 UTC  
+**Last synced:** 2026-06-21 14:09 UTC  
 **State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*
 ## Active Workstreams
-### Bootstrap State Hub integration
+*(none — repo may need first-session setup)*
 Progress: 0/3 done  |  workstream_id: `bb9a43a3-a54f-434b-97c2-e1c7142b52f5`
 **Open tasks:**
 - · Review Generated Integration Files  `3ad7b7a9`
 - · Verify Local Developer Workflow  `db248d57`
 - · Seed First Real Workplan  `9cbb7aa5`
 ---
 ## MCP Orientation (when available)
--- a/.gitignore
+++ b/.gitignore
@@ -174,3 +174,11 @@ cython_debug/
 # PyPI configuration file
 .pypirc
 # session-memory local store
 session_memory/.store/
 # generated per-flavor distribution proposals (HITL, regenerated each run)
 session_memory/proposals/
 __pycache__/
 *.pyc
 .pytest_cache/
--- a/.repo-classification.yaml
+++ b/.repo-classification.yaml
@@ -0,0 +1,18 @@
 repo_classification:
  standard: Repo Classification Standard
  version: '1.0'
  classified_at: '2026-06-22'
  classified_by: agent
  category: project
  domain: infotech
  secondary_domains: []
  capability_tags:
  - automation
  - orchestration
  business_stake:
  - technology
  - product
  - operations
  business_mechanics:
  - coordination
  - operation
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -4,7 +4,7 @@
 **Purpose:** Iterating towards optimal agentic performance.
-**Domain:** helix_forge
+**Domain:** infotech
 **Repo slug:** agentic-resources
 **Topic ID:** `f39fa2a3-c491-414c-a91b-b4c5fcc6139c`
 **Workplan prefix:** `AGENTIC-WP-`
@@ -101,6 +101,63 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
 ---
 ## Credential and access routing
 **Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
 for inference. Run this check **before** requesting secrets, API keys, SSH access,
 login tokens, or database passwords — in any repo, not only `ops-warden`.
 ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
 other credential need belongs to another subsystem. **Do not** message
 `ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
 ### Lookup (do this first)
 ```bash
 warden route find "<describe your need>" --json
 warden route show <catalog-id> --json
 ```
 Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
 | Agent runtime | How to orient |
 | --- | --- |
 | **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=agentic-resources` is for coordination, not secret vending |
 | **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
 | **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
 ### Quick routing table
 | I need… | Owner | ops-warden executes? |
 | --- | --- | --- |
 | SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
 | API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
 | Login / OIDC / MFA | key-cape / Keycloak | No — route only |
 | Authorization decision | flex-auth | No — route only |
 | activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
 | SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
 ### Anti-patterns (do not do these)
 - `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
 - Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
 - Pasting secrets into Git, State Hub, workplans, logs, or chat
 ### Other capabilities (reuse-surface)
 Non-credential capabilities are usually discovered through **reuse-surface** federation
 (`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
 every repo's agent instructions because it is high-frequency, high-risk, and easy to
 get wrong.
 **Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
 <!-- REPO-AGENTS-EXTENSIONS -->
 <!-- Append repo-specific agent instructions below this marker.
     The state-hub template sync preserves content after this line. -->
 ---
 ## Workplan Convention (ADR-001)
 Work items originate as files in this repo — not in the hub. The hub is a
@@ -124,7 +181,7 @@ anything needing analysis, design, approval, dependencies, or multiple phases.
 id: AGENTIC-WP-NNNN
 type: workplan
 title: "..."
-domain: helix_forge
+domain: infotech
 repo: agentic-resources
 status: proposed | ready | active | blocked | backlog | finished | archived
 owner: codex
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,12 @@
 # agentic-resources — Claude Code Instructions
@SCOPE.md
@.claude/rules/repo-identity.md
@.claude/rules/session-protocol.md
@.claude/rules/first-session.md
@.claude/rules/workplan-convention.md
@.claude/rules/stack-and-commands.md
@.claude/rules/architecture.md
@.claude/rules/repo-boundary.md
@.claude/rules/credential-routing.md
@.claude/rules/agents.md
--- a/docs/ASSESSMENT-infra-friction.md
+++ b/docs/ASSESSMENT-infra-friction.md
@@ -0,0 +1,144 @@
 # Infrastructure Friction Assessment
 *Generated 2026-06-07 from captured coding-session data (Helix Forge session
 memory), after the Detect-hardening pass ([AGENTIC-WP-0005]). First data-driven
 assessment of where our agentic coding sessions spend effort on plumbing rather
 than work.*
 ## Method & data quality
 - **Corpus:** 72 sessions captured across Claude + Grok. A session-quality filter
  ([detect/quality.py]) drops health-checks, smoke-tests, and interrupted runs
  (mostly `llm-connect` *"Say hello in one word"*). **27 are real coding sessions.**
 - **Caveat:** the 41 % that were filtered out had been mislabeled `abandoned` by
  the outcome heuristic and produced a *false-positive* "cross-flavor abandoned"
  pattern in the first catalog — now purged. Treat any pre-hardening finding with
  suspicion.
 - **Key framing:** all 27 real sessions ended in `success`. So the friction here
  is **cost/efficiency, not failure** — sessions get there, but pay an avoidable
  tax to do it.
 ## The headline number
 Across the 27 real sessions, tool-call activity breaks down as:
 | Bucket | Share |
 |--------|------:|
 | shell (Bash / run_terminal) | 38.2 % |
 | edit | 30.2 % |
 | read | 12.9 % |
 | **State Hub MCP** | **10.3 %** |
 | **task-management plumbing** | **5.8 %** |
 | **schema-loading (`ToolSearch`)** | **1.5 %** |
 | other | 1.1 % |
 **~17.6 % of all tool calls in real coding sessions are coordination plumbing
 (hub + task + schema-loading), not touching the repo.** Per-session infra-overhead
 share: median **11.7 %**, p90 **26.1 %**, max **43.3 %** — it concentrates badly.
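The headline figure can be re-derived from the bucket table. The snippet below only restates the published shares; the dictionary keys are illustrative names, not identifiers from the analysis code.

```python
# Tool-call shares (percent) copied from the table above.
shares = {
    "shell": 38.2,
    "edit": 30.2,
    "read": 12.9,
    "state_hub_mcp": 10.3,
    "task_management": 5.8,
    "schema_loading": 1.5,
    "other": 1.1,
}
# The three buckets counted as coordination plumbing.
PLUMBING = ("state_hub_mcp", "task_management", "schema_loading")

plumbing_share = round(sum(shares[bucket] for bucket in PLUMBING), 1)
total = round(sum(shares.values()), 1)
print(f"plumbing: {plumbing_share} %, all buckets: {total} %")
```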
 ## Ranked friction
 ### 1. State Hub call volume — *highest cost, addressable*
 State Hub MCP is 10.3 % of all tool calls and dominates the worst sessions:
 | Repo (one session) | total calls | State Hub calls | overhead share |
 |--------------------|------:|------:|------:|
 | vergabe-teilnahme | 570 | **231** | 43 % |
 | activity-core | 488 | 98 | 23 % |
 | flex-auth | 236 | 35 (+27 task) | 29 % |
 | net-kingdom | 129 | 25 | 22 % |
 Root cause: many **fine-grained** calls — per-task status updates, per-event
 progress writes, repeated `get_domain_summary`. 231 hub calls in a single session
 is coordination overhead, not work.
 ### 2. Schema-loading thrash (`ToolSearch`) — *low cost, near-zero-effort fix*
 **106 `ToolSearch` calls across 22 of 27 sessions (81 %).** The State Hub MCP
 tools are *deferred*, so nearly every session re-discovers and re-loads the same
 tool schemas before it can call them. This is pure overhead with no work value —
 and it is **exactly the CLI/MCP-interface friction hypothesized.**
 ### 3. Task-management plumbing — 5.8 %
 `TaskUpdate` / `TaskCreate` / `todo_write` / `update_task_status`. Overlaps with
 (1); much of it is redundant status churn within a session.
 ### 4. Tool thrash — *session-shape, watch only*
 11 sessions hammer a single tool 80–230× (usually Bash or Edit). Less an infra
 problem than a sign of missing higher-level tooling; low priority.
 ### 5. Budget overrun — 3 sessions
 Token cost well above peers. Secondary; revisit once (1)–(2) are addressed.
 ## Recommendations
 **The CLI/MCP-interface hypothesis is validated as a top-2 friction, not a minor
 issue.** Two high-ROI moves (A and B), followed by a measurement step (C):
 - **A. A State Hub skill (highest ROI).** A skill (or a pre-loaded tool manifest)
  that (i) **front-loads the common hub tool schemas** so agents stop
  `ToolSearch`-ing for them — eliminates finding #2 almost entirely (81 % of
  sessions) — and (ii) **teaches batched writes** (sync N task statuses in one
  call, fewer progress events) to attack finding #1. Low effort, broad reach.
 - **B. Coarser hub operations.** Add bulk endpoints / a single "sync workplan
  statuses" op so a session doesn't make 200+ individual hub calls. This is the
  structural fix behind the skill's guidance.
 - **C. Measure the effect (Phase 4).** After A/B land, compare infra-overhead
  share on subsequent sessions against this baseline (median 11.7 %, p90 26.1 %).
  This is precisely what the Measure phase is for — the loop closes here.
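To make the batching idea behind A(ii) and B concrete, here is a sketch that collapses a session's per-task status writes into one payload holding only the final status per task. Everything in it is hypothetical: `coalesce_status_updates`, the task ids, and the notion of a single bulk payload are illustrations, since the source does not describe an existing bulk endpoint.

```python
def coalesce_status_updates(updates: list[tuple[str, str]]) -> dict[str, str]:
    """Collapse per-task status writes to the final status per task."""
    final: dict[str, str] = {}
    for task_id, status in updates:
        final[task_id] = status  # later writes overwrite earlier ones
    return final


# Hypothetical in-session churn: five individual writes touching two tasks.
session_updates = [
    ("T01", "todo"),
    ("T01", "progress"),
    ("T02", "todo"),
    ("T01", "done"),
    ("T02", "progress"),
]
batch = coalesce_status_updates(session_updates)
print(len(session_updates), "writes ->", len(batch), "entries in one payload")
```

Intermediate statuses are lost by design here; if the hub needs the full transition history, a batched list of events would be the alternative.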
 ## Content-level root causes (error-body mining)
 *Added 2026-06-07 from [AGENTIC-WP-0006] — `build_digest` now mines normalized
 error fingerprints into the durable digest, and `sig_recurring_error` clusters
 them. This is the "why" the tool-mix view above could not see.*
 **26 of 27 real sessions hit at least one error.** Top recurring error
 fingerprints across the corpus (by # sessions affected):
 | # sessions | occ | flavors | top sample |
 |-----------:|----:|---------|------------|
 | **12** | 32 | claude | `<tool_use_error>File has not been read yet. Read it first before writing to it.` |
 | **6** | 13 | claude | `<tool_use_error>File has been modified since read …` |
 | **4** | 9 | **claude + grok** | `make: *** [Makefile:227: fix-consistency] Error 1` |
 | 3 | 21 | claude | `MCP error -32602: Invalid request parameters` |
 | 3 | 6 | claude | `Error calling tool 'update_task_status': 'title'` |
 | 2 | 6 | claude | `make: *** [Makefile:21: test] Error 1` |
 Reading:
 - **#1 — Edit/Write-before-Read (12/27 sessions, 8 repos).** The single most
  common error is agents trying to edit a file they haven't read into context.
  This is a *workflow* friction, highly addressable: a Read-before-Edit reflex in
  the agent instructions / a skill, or a harness affordance. (Observed live: the
  author hit this exact error twice while writing this workplan.)
 - **#2 — stale-read conflicts (6 sessions):** "File has been modified since read"
  — same family, a re-read-before-edit discipline fixes both.
 - **#3 — cross-flavor `make fix-consistency` failures (claude + grok, 3 repos):**
  the consistency tooling itself fails across flavors — a shared infra issue worth
  a look on the state-hub side (cf. [STATE-WP-0058]).
 - **State Hub MCP instability** (`-32602`, `update_task_status 'title'`) shows up
  in 3 sessions each — corroborates the plumbing-overhead story and the live MCP
  flakiness seen during this work (REST fallback used).
 **Fingerprint noise — mostly handled.** `_is_failed` now excludes successful hub
 JSON responses (top-level no-error payloads) and file-read snapshots (numbered
 `cat -n` source lines), which cut distinct fingerprints **444 → 269 (~40 %)**
 without touching the top entries. Residual low-value items remain in the long tail
 (bare structural lines like `{`, linter "N errors" summaries); the *top*
 fingerprints are real. Note several entries (`MCP error -32602`,
 `update_task_status 'title'`) reflect the State Hub MCP instability hit live during
 this work — genuine, if self-referential, friction.
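The sketch below shows one plausible way to normalize error text into fingerprints and count affected sessions. It is not the actual `build_digest` or `sig_recurring_error` logic, which is not shown here; the regexes, function names, and the second sample line are assumptions for illustration.

```python
import re
from collections import defaultdict

# Volatile details replaced by placeholders, most specific pattern first.
_PATTERNS = [
    (re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"), "<uuid>"),
    (re.compile(r"(?:/[\w.\-]+)+"), "<path>"),
    (re.compile(r"\d+"), "<n>"),
]


def fingerprint(error_text: str) -> str:
    """Normalize the first line of an error so recurring errors collide."""
    lines = error_text.strip().splitlines()
    first_line = lines[0] if lines else ""
    for pattern, token in _PATTERNS:
        first_line = pattern.sub(token, first_line)
    return first_line


def sessions_per_fingerprint(errors):
    """Map each fingerprint to the number of distinct sessions that hit it."""
    seen = defaultdict(set)
    for session_id, text in errors:
        seen[fingerprint(text)].add(session_id)
    return {fp: len(ids) for fp, ids in seen.items()}


# The first line is the sample from the table; the rest are made up.
errors = [
    ("s1", "make: *** [Makefile:227: fix-consistency] Error 1"),
    ("s2", "make: *** [Makefile:231: fix-consistency] Error 2"),
    ("s2", "File not found: /home/user/project/a.py"),
]
counts = sessions_per_fingerprint(errors)
for fp, n in counts.items():
    print(n, fp)
```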
 ## What this assessment still can't see
 - ~~**Why** a session was expensive at the content level.~~ **Now addressed**
  (error-body mining, above), modulo the fingerprint-noise caveat.
 - Repeated *failed approaches* (as opposed to surfaced errors) — e.g. an agent
  silently retrying a wrong strategy without an error — are still invisible.
 - Grok/Codex are thin in the corpus (4 Grok, 0 Codex sessions), so cross-flavor
  friction claims are Claude-weighted for now.
 [AGENTIC-WP-0005]: ../workplans/AGENTIC-WP-0005-detect-hardening.md
 [AGENTIC-WP-0006]: ../workplans/AGENTIC-WP-0006-error-body-mining.md
 [STATE-WP-0058]: handed off to the state-hub repo worker
 [detect/quality.py]: ../session_memory/detect/quality.py
--- a/docs/DESIGN-session-memory.md
+++ b/docs/DESIGN-session-memory.md
@@ -0,0 +1,461 @@
 # Design Document — Coding Session Memory
 **Domain:** helix_forge
 **Repo:** agentic-resources
 **Status:** Draft v0.1
 **Author:** Claude (drafted with Bernd Worsch)
 **Created:** 2026-06-06
 **Updated:** 2026-06-06
 **Related:** [PRD-helix-forge.md](./PRD-helix-forge.md) (this is the Capture + storage layer, FR-C* / §8)
 ---
 ## 1. Purpose
 Helix Forge's loop (Capture → Detect → Curate → Distribute → Measure) needs a
 durable, bounded **memory of coding sessions**. This document specifies that
 memory: how we **access** each coding agent's session protocol, how we
 **normalize** those protocols into one schema, where we **store** the result, and
 how we **age it out** — preferring a *storage-budget-based* eviction that drops
 old raw content once it has been analyzed or no longer fits, rather than a naive
 fixed time window.
 The guiding asymmetry: **raw transcripts are bulky and re-derivable; the distilled
 analysis is small and precious.** So we keep a *bounded cache* of raw sessions and
 a *durable, compact* layer of extracted digests/signals. Eviction targets the
 former, never the latter.
 ## 2. Research — How to Access Each Agent's Session Protocol
 All three families persist sessions to the local filesystem as JSONL (plus, for
 Grok, a per-session directory). The Claude Code and Grok findings below were
 verified against the live installs on this workstation (`~/.claude`, `~/.grok`);
 the Codex findings come from public docs and source, since Codex is not installed
 here.
 ### 2.1 Claude Code  ✅ verified on disk
 | Aspect | Finding |
 |--------|---------|
 | Session transcripts | `~/.claude/projects/<url-encoded-cwd>/<session-uuid>.jsonl` — one JSONL per session |
 | Subagent sidechains | same dir, `agent-<id>.jsonl`; records carry `isSidechain: true` |
 | Global prompt history | `~/.claude/history.jsonl` |
 | Record format | one JSON object per line; **`type`** discriminates: `user`, `assistant`, `attachment`, `queue-operation`, `ai-title`, `last-prompt`, `summary`, plus tool-result records |
 | Key fields | `type`, `timestamp`, `sessionId`, `uuid`, `parentUuid` (turn DAG), `message` (`role` + content blocks: `text`/`thinking`/`tool_use`/`tool_result`), `cwd`, `gitBranch`, `version`, `requestId`, `toolUseResult`, `userType` |
 | Token usage | inside assistant `message.usage` (input/output/cache tokens) |
 | Model | `message.model` (e.g. `claude-opus-4-8`) |
 | Side data | `~/.claude/todos/`, `~/.claude/tasks/`, `~/.claude/file-history/`, `~/.claude/shell-snapshots/` |
 | Live capture hook | Claude Code **SessionEnd / Stop / SessionStart hooks** can fire our ingest on session close (push), in addition to batch scanning (pull) |
 The turn DAG (`uuid`/`parentUuid`) lets us reconstruct branching, retries, and
 sidechains exactly.
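A minimal sketch of that reconstruction, using only the `uuid` and `parentUuid` fields from the table. The short ids in the sample are placeholders, and skipping records that lack a `uuid` is an assumption of this sketch rather than confirmed adapter behavior.

```python
import json
from collections import defaultdict


def build_turn_dag(jsonl_text: str):
    """Return (roots, children) built from `uuid`/`parentUuid` links."""
    children = defaultdict(list)
    roots = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        uuid = record.get("uuid")
        if uuid is None:
            continue  # assumption: records without a uuid are not DAG nodes
        parent = record.get("parentUuid")
        if parent is None:
            roots.append(uuid)
        else:
            children[parent].append(uuid)
    return roots, dict(children)


def branch_points(children):
    """Parents with more than one child, i.e. retries or branches."""
    return [parent for parent, kids in children.items() if len(kids) > 1]


sample = "\n".join(json.dumps(r) for r in [
    {"type": "user", "uuid": "a", "parentUuid": None},
    {"type": "assistant", "uuid": "b", "parentUuid": "a"},
    {"type": "assistant", "uuid": "c", "parentUuid": "a"},  # retry of the same turn
    {"type": "user", "uuid": "d", "parentUuid": "c"},
])
roots, children = build_turn_dag(sample)
print(roots, children, branch_points(children))
```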
 ### 2.2 OpenAI Codex CLI  ✅ schema confirmed from source (not installed locally)
 Schema confirmed from the openai/codex source (`codex-rs/protocol/src/protocol.rs`
 via DeepWiki) and a reverse-engineering writeup with real example lines — the two
 cross-agree.
 | Aspect | Finding |
 |--------|---------|
 | Session ("rollout") files | `$CODEX_HOME/sessions/YYYY/MM/DD/rollout-*.jsonl` (default `$CODEX_HOME = ~/.codex`) |
 | Line wrapper (`RolloutLine`) | every line: **`{timestamp, type, payload}`** (UTC ts + a `RolloutItem`) |
 | `type` discriminator | `session_meta` · `response_item` · `event_msg` · `turn_context` · `compacted` |
 | `session_meta` | `{id, source, cwd, model_provider, cli_version}` (+ model) — restores env |
 | `turn_context` | `{model, approval_policy, sandbox_policy}` — per-turn settings snapshot |
 | `response_item` | raw model output / tool calls; `payload.type` ∈ `message` · `function_call` · `function_call_output` · `reasoning` |
 | → `message` | `{role: developer\|user\|assistant, content:[{type:"output_text"\|…, text}]}` |
 | → `function_call` | `{name, arguments (JSON string), call_id}` |
 | → `function_call_output` | `{call_id, output}` |
 | `event_msg` | protocol events; `payload.type` ∈ `task_started` · `task_complete` · `user_message` · `agent_message` · `token_count` · lifecycle |
 | Token usage | `event_msg` with `payload.type = token_count`, interspersed (no fixed cadence) |
 | Turn linkage | **flat — tool calls/outputs linked by `call_id`, no parent-ref DAG**; causality inferred from temporal order (unlike Claude's `uuid`/`parentUuid`) |
 | Schema versions | older installs differ ("new ≥0.44 / mid / oldest 2025/08"); adapter version-detects on `session_meta.cli_version` |
 | Naming / resume | filenames + `session_id` auto-generated; `codex resume --last`; `codex exec` for headless (trajectory-JSON is gh issue #2288) |
 | Override location | `CODEX_HOME` env var |
 **Adapter notes:** map `event_msg/task_started|task_complete` → `lifecycle`
 events and outcome; `response_item/message` → `user_msg`/`assistant_msg`;
 `function_call`+`function_call_output` → `tool_call`/`tool_result` joined on
 `call_id`; `response_item/reasoning` → `thinking`; `event_msg/token_count` → cost
 block. Because there is no parent-ref DAG, the adapter assigns `seq`/`parent_seq`
 from temporal order rather than native links.
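The `call_id` join and temporal `seq` assignment can be sketched as follows. The wrapper and payload field names follow the table above; the function name, the `shell` tool name, the sample values, and the assumption that timestamps are uniformly formatted ISO-8601 UTC strings (so they sort lexicographically) are illustrative.

```python
import json


def join_tool_calls(rollout_text: str):
    """Pair function_call and function_call_output payloads on call_id."""
    items = [json.loads(line) for line in rollout_text.splitlines() if line.strip()]
    # No parent-ref DAG: order by timestamp and derive seq from that order.
    items.sort(key=lambda item: item["timestamp"])
    pending = {}
    paired = []
    for seq, item in enumerate(items):
        if item["type"] != "response_item":
            continue
        payload = item["payload"]
        if payload["type"] == "function_call":
            pending[payload["call_id"]] = {
                "seq": seq,
                "name": payload["name"],
                "arguments": json.loads(payload["arguments"]),  # JSON string
            }
        elif payload["type"] == "function_call_output":
            call = pending.pop(payload["call_id"], None)
            if call is not None:
                paired.append({**call, "result_seq": seq, "output": payload["output"]})
    return paired


sample = "\n".join(json.dumps(x) for x in [
    {"timestamp": "2026-06-06T10:00:00Z", "type": "response_item",
     "payload": {"type": "function_call", "name": "shell",
                 "arguments": json.dumps({"command": "ls"}), "call_id": "call_1"}},
    {"timestamp": "2026-06-06T10:00:01Z", "type": "event_msg",
     "payload": {"type": "token_count"}},
    {"timestamp": "2026-06-06T10:00:02Z", "type": "response_item",
     "payload": {"type": "function_call_output", "call_id": "call_1",
                 "output": "README.md"}},
])
paired = join_tool_calls(sample)
print(paired)
```

Outputs whose `call_id` has no matching call are dropped silently in this sketch; a real adapter would probably want to record them as orphans.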
 ### 2.3 Grok CLI (xAI)  ✅ verified on disk
 Grok stores **a directory per session**, which is the richest source of the three.
 | Aspect | Finding |
 |--------|---------|
 | Session dir | `~/.grok/sessions/<url-encoded-cwd>/<session-uuid>/` |
 | `chat_history.jsonl` | full conversation; `type` = `system`/`user`/`assistant` + content |
 | `events.jsonl` | **structured lifecycle events** — `{ts, type, session_id, turn_number, model_id, yolo_mode, conversation_message_count, session_relationship, schema_version}`; types like `turn_started`, `loop_started` |
 | `updates.jsonl` | streaming incremental updates |
 | `summary.json` | `{id, cwd, session_summary, created_at, updated_at}` |
 | `prompt_context.json` | injected context, incl. which AGENTS.md/CLAUDE.md files were loaded |
 | `system_prompt.txt` | exact system prompt for the session |
 | `rewind_points.jsonl`, `plan_mode.json` | rewind/plan-mode state |
 | Per-cwd prompt history | `~/.grok/sessions/<cwd>/prompt_history.jsonl` — `{timestamp, session_id, prompt, is_bash}` |
 | Global structured log | `~/.grok/logs/unified.jsonl` — `{ts, src, pid, lvl, msg, ctx, sid, ver}` |
 | Search index | `~/.grok/sessions/session_search.sqlite` — `session_docs(session_id, cwd, updated_at, title)` + FTS5 (`session_docs_fts`) we can query directly |
 | Integration surfaces | Grok exposes **ACP (Agent Client Protocol)**, **headless mode** (`grok -p`), and **hooks** (`~/.grok/docs/user-guide/10-hooks.md`) — push-capture options |
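Querying the search index directly might look like the sketch below. It runs against an in-memory stand-in that uses the `session_docs` columns listed in the table; the column types, the timestamp format, the sample rows, and how `cwd` is encoded in the real database are assumptions, and the real file should be opened read-only.

```python
import sqlite3


def recent_sessions(conn: sqlite3.Connection, cwd: str, limit: int = 5):
    """List the most recently updated sessions for one working directory."""
    cursor = conn.execute(
        "SELECT session_id, title, updated_at FROM session_docs "
        "WHERE cwd = ? ORDER BY updated_at DESC LIMIT ?",
        (cwd, limit),
    )
    return cursor.fetchall()


# Stand-in database; column types and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE session_docs (session_id TEXT, cwd TEXT, updated_at TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO session_docs VALUES (?, ?, ?, ?)",
    [
        ("s1", "/repo/a", "2026-06-06T10:00:00Z", "first"),
        ("s2", "/repo/a", "2026-06-06T12:00:00Z", "second"),
        ("s3", "/repo/b", "2026-06-06T11:00:00Z", "other repo"),
    ],
)
rows = recent_sessions(conn, "/repo/a")
print(rows)
```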
 ### 2.4 Cross-family summary
 | | Claude Code | Codex CLI | Grok CLI |
 |--|--|--|--|
 | Root | `~/.claude/projects/` | `~/.codex/sessions/` | `~/.grok/sessions/` |
 | Unit | one `.jsonl`/session | one `rollout-*.jsonl`/session | one **dir**/session |
 | Layout | flat per-cwd dir | date-partitioned `YYYY/MM/DD` | per-cwd, per-session dir |
 | Discriminator | `type` | `type` (version-dependent) | `type` (in `chat_history`/`events`) |
 | Lifecycle events | inferred from records | inferred from records | **explicit** `events.jsonl` |
 | Token usage | `message.usage` | `event_msg` → `token_count` | from events/updates |
 | Push capture | Stop/SessionEnd hooks | `codex exec` wrappers | hooks / ACP |
 | Pull capture | scan dir by mtime | scan date partitions | scan dirs / query FTS sqlite |
 **Implication:** the common denominator is *"JSONL records discriminated by a
 `type` field, with a session id, timestamps, turn linkage, tool calls, and token
 usage."* That maps cleanly onto one normalized schema (§4). Per-family quirks
 (Grok's explicit `events.jsonl`, Codex's schema versions, Claude's sidechains) are
 handled inside each adapter.
 ## 3. Tiered Storage Model
 ```
 Tier 0  SOURCE (agents' own logs)        read-only, never mutated
         ~/.claude/projects  ~/.codex/sessions  ~/.grok/sessions
                 │  collector adapters (per family) + ingest cursor
                 ▼
 Tier 1  RAW CACHE (bounded, EVICTABLE)   normalized Session + Event records
                 │  signal extractors / digesters
                 ▼
 Tier 2  DISTILLED MEMORY (durable, small)  session digests + signals + pattern evidence
 ```
 - **Tier 0 — Source.** The agents' own logs. We treat them as read-only. We keep a
  small **ingest cursor** per source so re-scans are incremental (see §6).
 - **Tier 1 — Raw cache.** Normalized copies of sessions/events. This is the bulky
  tier and the *only* tier subject to budget eviction.
 - **Tier 2 — Distilled memory.** Per-session **digest** (outcome, costs, tool
  histogram, error/retry/intervention markers, key snippets) plus extracted
  **signals** and **pattern evidence pointers**. Compact and durable. A session can
  be fully evicted from Tier 1 once its Tier 2 digest exists.
 This is what makes "drop old content once it has been analyzed" safe: analysis
 *promotes* the valuable bits into Tier 2 before the raw bytes are dropped.
 ### 3.1 Per-session lifecycle / watermarks
 Each session row carries timestamps that drive eviction:
 ```
 discovered_at → ingested_at → analyzed_at → [evictable] → evicted_at
 ```
 - `ingested_at` set when normalized into Tier 1.
 - `analyzed_at` set when the Tier 2 digest is written. **A session is evictable iff
  `analyzed_at` is set.**
 - `evicted_at` set when raw bytes are dropped from Tier 1 (Tier 2 digest remains).
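 The watermark rule can be sketched as a pure function over a session row. This assumes the row is a plain dict carrying the timestamp fields named above; the function names are illustrative, not part of the implementation.

```python
def lifecycle_state(session):
    """Return the furthest watermark a session row has reached."""
    if session.get("evicted_at"):
        return "evicted"
    if session.get("analyzed_at"):
        return "evictable"
    if session.get("ingested_at"):
        return "ingested"
    return "discovered"


def is_evictable(session):
    """Evictable iff the Tier 2 digest exists and raw bytes are still present."""
    return lifecycle_state(session) == "evictable"
```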
 ## 4. Normalized Schema (Tier 1)
 Two record kinds. Field names are stable across all adapters.
 ### 4.1 `Session`
 ```jsonc
 {
  "session_uid": "claude:17092961-…",      // "<flavor>:<native id>", globally unique
  "flavor": "claude|codex|grok",
  "native_session_id": "17092961-…",
  "repo": "agentic-resources",             // resolved from cwd
  "domain": "helix_forge",                 // resolved from repo→domain map
  "cwd": "/home/worsch/agentic-resources",
  "git_branch": "main",
  "model": "claude-opus-4-8",
  "started_at": "2026-06-05T21:59:30Z",
  "ended_at": "2026-06-05T22:14:00Z",
  "outcome": "success|fail|abandoned|unknown",
  "cost": { "input_tokens": 0, "output_tokens": 0, "cache_tokens": 0,
            "wall_clock_s": 0, "turns": 0, "retries": 0 },
  "task_ref": "AGENTIC-WP-0002-T01",       // if derivable; else null
  "source_path": "~/.claude/projects/…/….jsonl",
  "source_bytes": 0,
  "schema_version": 1,
  "ingested_at": "…", "analyzed_at": null, "evicted_at": null
 }
 ```
 ### 4.2 `SessionEvent`
 ```jsonc
 {
  "session_uid": "claude:17092961-…",
  "seq": 12,                               // monotonic within session
  "parent_seq": 11,                        // turn DAG (Claude uuid/parentUuid)
  "ts": "2026-06-05T22:01:13Z",
  "kind": "tool_call",                     // one of: user_msg | assistant_msg | thinking | tool_call
                                           //   | tool_result | error | test_run | edit | retry
                                           //   | human_intervention | decision | lifecycle | completion
  "role": "user|assistant|system|tool",
  "tool": "Bash|Edit|Read|…",              // when kind=tool_call/result
  "summary": "ran pytest -q",              // short, human-readable
  "payload_ref": "blob://…",               // pointer to full content in Tier 1 blob store
  "tokens": 0,
  "is_sidechain": false
 }
 ```
 Adapters map native records onto `kind`. Grok's `events.jsonl` populates
 `lifecycle` and `completion` events directly; Claude/Codex lifecycle is inferred from
 the record stream. Bulky bodies live behind `payload_ref` so Tier 1 rows stay light
 and blobs can be evicted independently.
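 As a rough illustration of how `core/schema.py` might pin the event shape down, the sketch below models `SessionEvent` as a frozen dataclass that rejects unknown `kind` values. The field names come from §4.2; the defaults, the validation, and the helper `make_session_uid` are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

EVENT_KINDS = frozenset({
    "user_msg", "assistant_msg", "thinking", "tool_call", "tool_result",
    "error", "test_run", "edit", "retry", "human_intervention", "decision",
    "lifecycle", "completion",
})
FLAVORS = frozenset({"claude", "codex", "grok"})


def make_session_uid(flavor, native_session_id):
    """Build the globally unique "<flavor>:<native id>" key."""
    if flavor not in FLAVORS:
        raise ValueError(f"unknown flavor: {flavor!r}")
    return f"{flavor}:{native_session_id}"


@dataclass(frozen=True)
class SessionEvent:
    session_uid: str
    seq: int
    kind: str
    ts: str
    parent_seq: Optional[int] = None
    role: Optional[str] = None
    tool: Optional[str] = None
    summary: str = ""
    payload_ref: Optional[str] = None
    tokens: int = 0
    is_sidechain: bool = False

    def __post_init__(self):
        # Reject kinds outside the normalized vocabulary at construction time.
        if self.kind not in EVENT_KINDS:
            raise ValueError(f"unknown kind: {self.kind!r}")
```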
 ### 4.3 Native → `kind` mapping (all three families)
 Each cell is the native record/discriminator an adapter reads to emit that
 `SessionEvent.kind`. `—` = not natively present; the adapter synthesizes or omits.
 | `kind` | Claude Code (`type` / `message`) | Codex CLI (`type` → `payload.type`) | Grok CLI (file → `type`) |
 |--------|----------------------------------|--------------------------------------|---------------------------|
 | `user_msg` | `user`, `message.role=user` | `response_item` → `message` `role=user`/`developer` | `chat_history` → `user` |
 | `assistant_msg` | `assistant`, `message.role=assistant`, content `text` | `response_item` → `message` `role=assistant` (`output_text`) | `chat_history` → `assistant` |
 | `thinking` | `assistant` content block `type=thinking` | `response_item` → `reasoning` | `chat_history`/`updates` reasoning block |
 | `tool_call` | `assistant` content block `type=tool_use` (`name`,`input`) | `response_item` → `function_call` (`name`,`arguments`,`call_id`) | `chat_history`/`updates` tool-call entry |
 | `tool_result` | `user`/tool record `type=tool_result` + `toolUseResult` | `response_item` → `function_call_output` (join on `call_id`) | `updates` tool-result entry |
 | `test_run` | derived from `tool_call` (Bash running tests) | derived from `function_call` (`exec_command`) | derived from tool-call entry |
 | `edit` | `tool_use` where `name` ∈ Edit/Write/NotebookEdit | `function_call` apply-patch/file-write tool | tool-call entry (edit/write) |
 | `error` | `toolUseResult` error / non-zero result | `function_call_output` error / `event_msg` error | `events.jsonl` error / failed update |
 | `retry` | repeated `tool_use` after error (inferred via DAG) | repeated `function_call` after error (inferred, temporal) | `events.jsonl` loop/retry event |
 | `human_intervention` | `user` record mid-turn (interrupt), `userType` | `event_msg` → `user_message` mid-task | `prompt_history` mid-session / `events.jsonl` |
 | `decision` | recorded out-of-band (State Hub `/decisions`) | recorded out-of-band (State Hub) | recorded out-of-band (State Hub) |
 | `lifecycle` | inferred: first/last record, `summary`, `queue-operation` | `event_msg` → `task_started` / `task_complete` | **`events.jsonl`** → `turn_started`/`loop_started`/… (explicit) |
 | `completion` | inferred: last `assistant` + `Stop`/`SessionEnd` hook | `event_msg` → `task_complete` | `events.jsonl` turn end + `summary.json` |
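 To make the Claude column concrete, here is a hedged sketch of the content-block dispatch. It assumes `message.content` is either a plain string or a list of dicts with a `type` key, and it emits `edit` instead of `tool_call` for edit tools, which is a simplification chosen for this example. `claude_kinds` is a hypothetical helper name.

```python
EDIT_TOOLS = {"Edit", "Write", "NotebookEdit"}


def claude_kinds(record):
    """Return the SessionEvent kinds implied by one Claude transcript record."""
    record_type = record.get("type")
    if record_type not in ("user", "assistant"):
        return []  # summary, queue-operation, etc. are handled elsewhere
    msg_kind = "user_msg" if record_type == "user" else "assistant_msg"
    content = (record.get("message") or {}).get("content")
    if isinstance(content, str):
        return [msg_kind]
    kinds = []
    for block in content or []:
        block_type = block.get("type")
        if block_type == "text":
            kinds.append(msg_kind)
        elif block_type == "thinking":
            kinds.append("thinking")
        elif block_type == "tool_use":
            kinds.append("edit" if block.get("name") in EDIT_TOOLS else "tool_call")
        elif block_type == "tool_result":
            kinds.append("tool_result")
    return kinds
```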
 **Linkage note (drives `seq`/`parent_seq`):** Claude has a true turn DAG
 (`uuid`/`parentUuid`) — preserve it directly. Codex is **flat**, joined only by
 `call_id`; assign `seq` by temporal order. Grok carries explicit `turn_number` in
 `events.jsonl`; key `seq` off that plus record order.
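 For the flat Codex case, the linkage could look like the sketch below. It assumes each rollout line has a sortable ISO `timestamp` and a `payload` with `type` and `call_id`, as described in this document. Using the matching call's `seq` as `parent_seq` for a tool output is an assumption of this sketch, and `assign_codex_seq` is a hypothetical name.

```python
def assign_codex_seq(lines):
    """Order rollout lines by timestamp and link tool outputs to their calls."""
    ordered = sorted(lines, key=lambda line: line["timestamp"])  # stable sort
    call_seq = {}  # call_id -> seq of the function_call that opened it
    events = []
    for seq, line in enumerate(ordered):
        payload = line.get("payload", {})
        payload_type = payload.get("type")
        call_id = payload.get("call_id")
        parent_seq = None
        if payload_type == "function_call" and call_id:
            call_seq[call_id] = seq
        elif payload_type == "function_call_output" and call_id:
            parent_seq = call_seq.get(call_id)  # None if the call was never seen
        events.append({"seq": seq, "parent_seq": parent_seq,
                       "payload_type": payload_type})
    return events
```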
 **Cost block sources:** Claude `message.usage`; Codex `event_msg/token_count`;
 Grok `events.jsonl` / `updates.jsonl` token fields.
 ## 5. Retention & Eviction
 The user's stated preference is **storage-budget-based** retention: drop old content
 once it has been analyzed or once it no longer fits, which the user considers *better
 than* a fixed daily/weekly window. We implement budget-based eviction as the primary
 policy, with a time backstop, and run it on a scheduled cadence.
 ### 5.1 Configurable knobs
 ```toml
 [session_memory.retention]
 raw_soft_cap_bytes   = "4GiB"   # begin evicting analyzed sessions above this
 raw_hard_cap_bytes   = "6GiB"   # absolute ceiling for Tier 1
 raw_max_age_days     = 45       # backstop: analyzed raw older than this is evictable regardless of space
 distilled_cap_bytes  = "1GiB"   # Tier 2 ceiling (should grow slowly; alert, don't auto-drop)
 cadence              = "daily"  # ingest+analyze+evict sweep: daily | weekly | on-hook
 ```
 ### 5.2 Eviction algorithm (runs after each ingest+analyze sweep)
 1. **Compute** current Tier 1 usage.
 2. **Backstop pass:** evict any session where `analyzed_at` is set AND
   `age > raw_max_age_days`.
 3. **Budget pass:** while `usage > raw_soft_cap_bytes`:
   - pick the session with the **oldest `analyzed_at`** that is not yet evicted;
   - drop its Tier 1 raw rows + blobs (Tier 2 digest is kept), set `evicted_at`;
   - if **no analyzed-but-unevicted session remains**, stop the budget pass
     (we will not destroy un-analyzed data to free space) and go to step 4.
 4. **Back-pressure / overflow:** if `usage > raw_hard_cap_bytes` and the only
   remaining bulk is **un-analyzed**:
   - first try to **analyze now** (run extraction) to make those sessions
     evictable, then re-run the budget pass;
   - if still over hard cap (analysis can't keep up or fails), evict the **oldest
     un-analyzed** sessions as a last resort and emit a
     `session_memory.data_loss` warning event + a State Hub progress note. This is
     the only path that loses un-analyzed data, and it is always reported.
 5. **Tier 2 guard:** if distilled usage > `distilled_cap_bytes`, **do not
   auto-drop**; flag for human/curation review (digests are the product).
 **Invariant:** *no session's raw bytes are dropped before its Tier 2 digest
 exists, except the explicitly-reported hard-cap overflow path.*
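 The passes above can be sketched over an in-memory list of session dicts. This is an illustration under stated assumptions, not the contents of `retention.py`: "age" is measured from `analyzed_at`, "oldest un-analyzed" is ordered by `ingested_at`, timestamps are `datetime` objects, and `analyze` is a hypothetical callback returning `True` on success. The Tier 2 guard (step 5) is omitted because it only raises a flag.

```python
from datetime import datetime, timedelta, timezone


def run_eviction(sessions, soft_cap, hard_cap, max_age_days, now, analyze=None):
    """Apply backstop, budget, and overflow passes; return (evicted, lost) uids."""
    evicted, lost = [], []

    def live():
        return [s for s in sessions if s["evicted_at"] is None]

    def usage():
        return sum(s["bytes"] for s in live())

    def evict(session, target):
        session["evicted_at"] = now  # a real store would also delete rows and blobs
        target.append(session["session_uid"])

    def budget_pass():
        while usage() > soft_cap:
            analyzed = [s for s in live() if s["analyzed_at"] is not None]
            if not analyzed:
                return  # never drop un-analyzed data in this pass
            evict(min(analyzed, key=lambda s: s["analyzed_at"]), evicted)

    # Step 2: time backstop for analyzed sessions.
    cutoff = now - timedelta(days=max_age_days)
    for s in live():
        if s["analyzed_at"] is not None and s["analyzed_at"] < cutoff:
            evict(s, evicted)

    budget_pass()  # Step 3

    # Step 4: still over the hard cap, so only un-analyzed bulk can remain.
    if usage() > hard_cap:
        if analyze is not None:
            for s in live():
                if s["analyzed_at"] is None and analyze(s):
                    s["analyzed_at"] = now
            budget_pass()
        for s in sorted(live(), key=lambda s: s["ingested_at"]):
            if usage() <= hard_cap:
                break
            if s["analyzed_at"] is None:
                evict(s, lost)  # caller must report these as data loss
    return evicted, lost
```

 Returning the `lost` list separately keeps the data-loss path visible to the caller, which is then responsible for emitting the `session_memory.data_loss` warning.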
 ### 5.3 Why budget-based beats fixed-window
 A fixed daily/weekly drop either deletes data we never analyzed (lossy) or hoards
 data we already distilled (wasteful). Budget + `analyzed_at` watermark ties
 deletion to **two** real conditions the user named — *"once it has been analyzed"*
 (promoted to Tier 2) and *"doesn't fit any longer"* (over budget) — and only falls
 back to time as a backstop.
 ## 6. Ingest Cursors (incremental, idempotent)
 Per source, persist a small cursor so sweeps are cheap and re-runnable:
 - **Claude / Grok (per-cwd dirs):** track `(file_path, size, mtime)` and last
  parsed line offset; re-ingest only grown/changed files. `session_uid` dedupes.
 - **Codex (date partitions):** track last-seen `YYYY/MM/DD` + per-file offset.
 - Ingest is **idempotent** keyed on `(session_uid, seq)` — safe to re-run after a
  crash or partial sweep.
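 A minimal sketch of the per-file cursor for an append-only JSONL transcript. It assumes the cursor is a small dict persisted elsewhere between sweeps; `read_new_lines` is a hypothetical name. Leaving an unterminated trailing line for the next sweep is a choice of this sketch, so a file still being written is never half-parsed.

```python
import json
import os


def read_new_lines(path, cursor):
    """Return (records, new_cursor) for complete lines appended since cursor."""
    stat = os.stat(path)
    offset = cursor.get("offset", 0) if cursor else 0
    if stat.st_size < offset:
        offset = 0  # file was truncated or replaced: re-read from the start
    records = []
    with open(path, "rb") as fh:
        fh.seek(offset)
        while True:
            line = fh.readline()
            if not line.endswith(b"\n"):
                break  # EOF or a partially written line: retry next sweep
            offset = fh.tell()
            if line.strip():
                records.append(json.loads(line))
    new_cursor = {"size": stat.st_size, "mtime": stat.st_mtime, "offset": offset}
    return records, new_cursor
```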
 ## 7. Capture Modes
 - **Pull (default, portable):** scheduled sweep scans Tier 0 by mtime/partition.
  Works for all three families with zero coupling to the agent. Triggered on the
  configured `cadence` via the repo's scheduler (`/schedule`, cron, or `/loop`).
 - **Push (optional, low-latency):** wire the agent's own hooks to ping the ingester
  on session close — Claude `Stop`/`SessionEnd` hooks, Grok hooks/ACP, Codex
  `exec` wrappers. Push just enqueues; the same idempotent pull pipeline does the
  work.
 Capture must be **non-blocking** (PRD FR-C5): we read copies of logs out-of-band;
 we never sit in the agent's critical path.
 ## 8. Component Layout (proposed, in-repo)
 ```
 session_memory/
  adapters/
    claude.py      # Tier0→Tier1 normalizer (verified schema)
    codex.py       # version-detecting normalizer (confirm against real rollout)
    grok.py        # reads session dir incl. events.jsonl
  core/
    schema.py      # Session / SessionEvent dataclasses + versioning
    store.py       # Tier1 (rows+blobs) and Tier2 (digests) — SQLite to start
    cursor.py      # per-source ingest cursors
    retention.py   # §5 eviction algorithm
    digest.py      # Tier1→Tier2 session digest + signal stubs
  ingest.py        # one sweep: discover → normalize → analyze → evict
  config.toml      # §5.1 knobs + repo→domain map + source paths
 ```
 Storage starts as **SQLite + a blob dir** (rows in SQLite, bulky payloads as files
 under `payload_ref`); graduate to Postgres alongside the State Hub only if volume
 demands. Digests/decisions are also surfaced to the hub per ADR-001 (files-first;
 hub indexes).
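 One way the SQLite side could enforce the `(session_uid, seq)` idempotency key is a composite primary key plus `INSERT OR REPLACE`. The table name `session_events` and its column set are assumptions for illustration; the actual `store.py` schema may differ.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the store and create the (hypothetical) events table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS session_events (
               session_uid TEXT NOT NULL,
               seq         INTEGER NOT NULL,
               kind        TEXT NOT NULL,
               summary     TEXT,
               payload_ref TEXT,
               PRIMARY KEY (session_uid, seq)
           )"""
    )
    return conn


def upsert_event(conn, event):
    """Insert or overwrite one event; re-running a sweep leaves one row per key."""
    conn.execute(
        """INSERT OR REPLACE INTO session_events
               (session_uid, seq, kind, summary, payload_ref)
           VALUES (:session_uid, :seq, :kind, :summary, :payload_ref)""",
        event,
    )
    conn.commit()
```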
 ## 9. Privacy / Safety
 - Tier 0 logs can contain secrets (the Grok `auth.json` and Claude `.credentials`
  live in the same trees). The ingester reads **only** session transcripts, never
  credential files, and **redacts** obvious secret patterns before content is written
  to `payload_ref` blobs.
 - All data is local; nothing leaves the workstation. Eviction of Tier 1 is a real
  delete (not just an index drop) so the bounded cache is also a privacy bound.
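 A rough sketch of the two guards described above: skipping credential files by name and masking secret-looking substrings. The regular expressions are illustrative assumptions, not a vetted list, and `redact` / `is_credential_file` are hypothetical names.

```python
import os
import re

# File names taken from the text above; never opened by the ingester.
CREDENTIAL_NAMES = {"auth.json", ".credentials"}

# Illustrative patterns only; a real deployment would need a reviewed list.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),   # API-key-like tokens
    re.compile(r"AKIA[0-9A-Z]{16}"),        # AWS access key id shape
    re.compile(r"(?i)(password|token|secret)\s*[=:]\s*\S+"),
]


def is_credential_file(path):
    """True when the file name matches a known credential file."""
    return os.path.basename(path) in CREDENTIAL_NAMES


def redact(text, placeholder="[REDACTED]"):
    """Replace every secret-looking match with a fixed placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```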
 ## 10. Open Questions
 - ~~**OQ1** Confirm Codex `rollout-*.jsonl` per-line schema.~~ **Resolved** (§2.2):
  `{timestamp,type,payload}` lines, `type` ∈ `session_meta`/`response_item`/`event_msg`/`turn_context`/`compacted`,
  tool calls flat-linked by `call_id`, tokens via `event_msg/token_count`. Remaining
  sub-item: verify the `token_count` payload field names against a real install when
  Codex is present (older-version variance only).
 - **OQ2** Outcome inference: how do we reliably label `success/fail/abandoned`
  across flavors (exit signals differ)? Start heuristic (last-turn + test results +
  human-intervention markers), refine in Detect phase.
 - **OQ3** `task_ref` resolution — can we always map a session to a workplan task
  (via cwd + branch + state-hub), or only sometimes?
 - ~~**OQ4** Right default for `raw_soft_cap_bytes`.~~ **Measured** (Phase 0, 85
  real local Claude files / 63 distinct sessions): source bytes per session
  min 396 · **median ~49 KB** · max 48 MB (one outlier) · ~103 MB total. Claude
  defaults (4 GiB soft / 6 GiB hard) leave ample headroom; revisit once Grok dirs
  (heavier, multi-file) are ingested in Phase 1.
 - **OQ5** Should push-hooks be opt-in per machine to avoid surprising the agents?
 - **OQ6 (new, found in Phase 0)** Multi-file sessions: ~84 transcript files mapped
  to ~63 `session_uid`s; some sessions span multiple files (resume/sidechain
  sharing a `sessionId`). Current behavior upserts (last file wins per
  `(session_uid, seq)`); a future refinement is to *merge* events across files of
  one session rather than overwrite. Acceptable for Phase 0.
 ---
 ## 11. Project metrics correlation (kaizen-agentic)
 Helix Forge owns **fleet-level** session capture and digests (this repo). The
 **kaizen-agentic** framework owns **project-scoped** agent execution metrics
 (ADR-004: `.kaizen/metrics/<agent>/executions.jsonl`). The two layers correlate
 by optional `helix_session_uid` on project records — link-by-reference, no
 duplicate ingestion in either repo.
 | Layer | Owner | Storage |
 |-------|-------|---------|
 | Fleet | agentic-resources (Helix Forge) | digest store (`digests` table) |
 | Project | kaizen-agentic | `.kaizen/metrics/<agent>/executions.jsonl` |
 **Cross-repo contract:** [Helix Forge Correlation Contract](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/integrations/helix-forge-correlation.md)
 (kaizen-agentic). Field mapping from `Session.session_uid` → `helix_session_uid`,
 `digest.cost` → `tokens`, `tool_histogram` MCP share → `infra_overhead_share`.
 **Read path:** `kaizen-agentic metrics correlate <uid>` looks up a digest via
 `HELIX_STORE_DB` (this repo's session store). No write path from kaizen-agentic
 into Helix Forge.
 **Related kaizen-agentic docs:** [ADR-004 project metrics convention](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/adr/ADR-004-project-metrics-convention.md),
 [wiki/EcosystemIntegration.md](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/wiki/EcosystemIntegration.md).
 ### 11.1 Session-close env export (dual-layer agents)
 Agents that run **both** Helix Forge capture and kaizen `metrics record` should
 export the following **after** the ingest sweep has written the session digest
 (`python -m session_memory.ingest` or an equivalent Stop/SessionEnd hook). Names
 match kaizen-agentic ADR-004 — do not invent parallel aliases.
 | Variable | Source in Helix Forge | Purpose |
 |----------|----------------------|---------|
 | `HELIX_SESSION_UID` | `Session.session_uid` | Primary correlation key → `helix_session_uid` |
 | `HELIX_REPO` | `digest.repo` | Project/repo scoping |
 | `HELIX_FLAVOR` | `digest.flavor` | Agent runtime (`claude` / `codex` / `grok`) |
 | `HELIX_TOKENS` | `digest.cost.input_tokens + digest.cost.output_tokens` | Token rollup → `tokens` |
 | `HELIX_INFRA_OVERHEAD_SHARE` | infra bucket share over `tool_histogram` (see `measure.metrics.session_metrics`) | MCP/plumbing overhead → `infra_overhead_share` |
 Example (after digest exists):
 ```bash
 export HELIX_SESSION_UID="claude:abc-123"
 export HELIX_REPO="agentic-resources"
 export HELIX_FLAVOR="claude"
 export HELIX_TOKENS=125000
 export HELIX_INFRA_OVERHEAD_SHARE=0.117
 # optional — lets kaizen correlate without guessing the store location:
 export HELIX_STORE_DB="$(pwd)/session_memory/.store/mem.db"
 kaizen-agentic metrics record   # merges HELIX_* when present
 ```
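 The same values can be derived from a digest in Python. This sketch assumes the digest dict has the fields listed in §11.2 and that the caller supplies the set of tool names counted as infrastructure. The authoritative share calculation lives in `measure.metrics.session_metrics`; the ratio below is only a stand-in, and `helix_env` is a hypothetical name.

```python
def helix_env(digest, infra_tools):
    """Build the HELIX_* variables for one digest.

    infra_tools: caller-supplied set of tool names counted as infrastructure.
    """
    cost = digest["cost"]
    histogram = digest.get("tool_histogram", {})
    total_calls = sum(histogram.values())
    infra_calls = sum(n for tool, n in histogram.items() if tool in infra_tools)
    share = infra_calls / total_calls if total_calls else 0.0
    return {
        "HELIX_SESSION_UID": digest["session_uid"],
        "HELIX_REPO": digest["repo"],
        "HELIX_FLAVOR": digest["flavor"],
        "HELIX_TOKENS": str(cost["input_tokens"] + cost["output_tokens"]),
        "HELIX_INFRA_OVERHEAD_SHARE": f"{share:.3f}",
    }
```

 The returned mapping can be merged into `os.environ` or passed as `env=` to `subprocess.run` when invoking `kaizen-agentic metrics record`.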
 ### 11.2 Digest store location and read API
 - **`HELIX_STORE_DB`** — absolute path to the SQLite file holding Tier 2 digests.
  Defaults to `config.toml` `[store].db_path` (`session_memory/.store/mem.db` relative
  to the repo root). Export as an absolute path when setting the variable on session
  close so `metrics correlate` works across hosts and working directories.
 - **Thin CLI** — `python -m session_memory.digest_lookup <session_uid> [--json]`
  prints one digest without running ingest. Exit `0` on hit, `1` when missing.
 - **Programmatic** — `Store.get_digest(session_uid)` returns the JSON blob written
  by `build_digest` / `analyze`.
 **Stable digest JSON shape** (fields consumers may rely on):
 | Field | Type | Notes |
 |-------|------|-------|
 | `session_uid` | string | Normalized uid (`<flavor>:<native-id>`) |
 | `flavor`, `repo`, `domain` | string | Session attribution |
 | `model` | string | Model id when known |
 | `started_at`, `ended_at` | string | ISO timestamps |
 | `outcome` | string | `success` / `fail` / `abandoned` / `unknown` |
 | `cost` | object | `input_tokens`, `output_tokens`, `cache_tokens`, `wall_clock_s`, `turns`, `retries` |
 | `tool_histogram` | object | Tool name → call count |
 | `event_count`, `kind_counts`, `markers` | object/int | Compact activity summary |
 | `first_prompt`, `last_assistant` | string | Short text snippets |
 | `error_snippets` | array | `{fingerprint, sample, count, tool}` entries |
 | `schema_version` | int | Digest schema version |
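 A consumer might check a digest against the table above before relying on it. Which fields are treated as mandatory here is an assumption of this sketch, and `digest_problems` is a hypothetical helper, not part of the read API.

```python
REQUIRED_DIGEST_FIELDS = {
    "session_uid": str, "flavor": str, "repo": str, "domain": str,
    "outcome": str, "cost": dict, "tool_histogram": dict,
    "event_count": int, "schema_version": int,
}
VALID_OUTCOMES = {"success", "fail", "abandoned", "unknown"}


def digest_problems(digest):
    """Return human-readable problems; an empty list means the digest looks usable."""
    problems = []
    for name, expected in REQUIRED_DIGEST_FIELDS.items():
        if name not in digest:
            problems.append(f"missing field: {name}")
        elif not isinstance(digest[name], expected):
            problems.append(f"wrong type for {name}: {type(digest[name]).__name__}")
    outcome = digest.get("outcome")
    if isinstance(outcome, str) and outcome not in VALID_OUTCOMES:
        problems.append(f"unknown outcome: {outcome!r}")
    return problems
```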
 ---
 *Implemented:* Phases 0–4, weekly retro ([AGENTIC-WP-0002]–[AGENTIC-WP-0010]);
 kaizen correlation follow-up ([AGENTIC-WP-0011]).
 ## Sources
 - Claude Code session format — verified on disk: `~/.claude/projects/*/*.jsonl`, `~/.claude/history.jsonl`.
 - Grok CLI session format — verified on disk: `~/.grok/sessions/`, `~/.grok/logs/unified.jsonl`, `~/.grok/sessions/session_search.sqlite`; `~/.grok/README.md` (ACP/headless/hooks).
 - Codex CLI session format — [ccusage Codex guide](https://ccusage.com/guide/codex/), [Codex advanced config](https://developers.openai.com/codex/config-advanced), [codex-trace](https://github.com/PixelPaw-Labs/codex-trace), [codex-logs](https://github.com/wondercoms/codex-logs), [Session/Rollout Files discussion #3827](https://github.com/openai/codex/discussions/3827), [trajectory-JSON issue #2288](https://github.com/openai/codex/issues/2288).
--- a/docs/PRD-helix-forge.md
+++ b/docs/PRD-helix-forge.md
@@ -0,0 +1,319 @@
 # Product Requirements Document — Helix Forge
 **Domain:** helix_forge
 **Repo:** agentic-resources
 **Status:** Draft v0.1
 **Author:** Claude (drafted with Bernd Worsch)
 **Created:** 2026-06-06
 **Updated:** 2026-06-19
 ---
 ## 1. Summary
 Helix Forge is a system for **handling a collection of repositories and evolving
 the utility of what those repositories provide**, by treating the coding sessions
 run against them as a first-class data source.
 Concretely: across a fleet of repos worked on by multiple coding agents (Claude,
 Codex, GrokBuild), Helix Forge **inspects the sessions**, **collects data about the
 problems agents hit and the moves that resolved them**, and turns that data into
 **reusable solution patterns** that can be discussed, implemented, and re-applied —
 across every agent flavor, not just the one that discovered the pattern.
 The name is the metaphor: a *helix* of repeated turns (session → pattern → improved
 session) feeding a *forge* where the tooling, environments, and instructions for our
 agents are hammered into better shape over time. This is the operational engine
 behind the INTENT.md goal of an *antifragile, continuously-optimizing agentic
 ecosystem*.
 ## 2. Problem Statement
 We run many coding sessions, across many repos, with several different agents. Today
 the value of each session is **trapped in that session**:
 - When an agent solves a tricky problem, the solution is not captured in a form
  another agent (or the same agent next week) can reuse.
 - When an agent fails, struggles, or burns excess budget on a problem, that failure
  signal is lost — we re-encounter the same friction repeatedly.
 - Each agent flavor (Claude, Codex, GrokBuild) has its own environment, instruction
  format, and extension mechanism, so a fix discovered for one is **not portable** to
  the others without manual translation.
 - We have no systematic, evidence-based answer to "what is actually slowing our
  agents down, and what consistently makes them faster?" — decisions about tooling,
  prompts, and environments are made on anecdote.
 **The cost:** repeated mistakes, non-transferable wins, slow and uneven improvement
 of agent performance, and no feedback loop from real session data back into the
 tools/environments/instructions that shape future sessions.
 ## 3. Goals & Non-Goals
 ### 3.1 Goals
 | # | Goal |
 |---|------|
 | G1 | **Capture** coding sessions from Claude, Codex, and GrokBuild in a normalized, comparable form. |
 | G2 | **Detect** recurring *problem patterns* (failure, friction, wasted budget) and *success patterns* (efficient resolutions) from that data. |
 | G3 | **Curate** detected patterns into a reviewed catalog of *solution patterns* that humans and agents can discuss and approve. |
 | G4 | **Distribute** approved patterns back into agent environments — as instructions, tools, or extensions — in a per-flavor-appropriate form. |
 | G5 | **Measure** whether distributed patterns actually improved subsequent sessions (close the loop). |
 | G6 | Keep the whole loop **agent-flavor-agnostic at the core**, with thin per-flavor adapters at the edges. |
 ### 3.2 Non-Goals (initial release)
 - Not a replacement for the coding agents themselves; Helix Forge observes and
  improves them, it does not execute coding tasks.
 - Not a general APM/observability product; scope is coding-session improvement, not
  arbitrary infrastructure monitoring.
 - Not an autonomous self-modifying system — pattern promotion into live agent
  environments requires human approval (HITL) for the first release.
 - Not building new model training/fine-tuning pipelines; we optimize *context,
  tooling, and environment*, not model weights.
 - Not replacing the Custodian State Hub; Helix Forge is a producer/consumer of hub
  state, not a competing system of record. (See §9.)
 ## 4. Users & Personas
 | Persona | Description | What they need from Helix Forge |
 |---------|-------------|----------------------------------|
 | **Operator (Bernd)** | Owns the agentic ecosystem; decides which patterns become standards. | A reviewable catalog of patterns with evidence; control over what ships to agents. |
 | **Coding agent (Claude / Codex / GrokBuild)** | Runs tasks in a repo; both the *source* of session data and the *consumer* of patterns. | To emit session data cheaply; to receive applicable patterns in its native format at session start. |
 | **Repo maintainer agent** | The per-repo agent persona (e.g. `agentic-resources`) following AGENTS.md conventions. | Patterns scoped to its repo/domain; integration via existing workplan + state-hub flow. |
 | **Reviewer (human or kaizen agent)** | Evaluates candidate patterns before they become standards. | Clear pattern proposals, supporting evidence, and a discuss/approve/reject workflow. |
 ## 5. Core Concepts (Domain Model)
 - **Session** — one bounded run of a coding agent against a repo. Has an agent flavor,
  repo, task reference, timeline of events, outcome, and cost (tokens/time).
 - **Session Event** — a normalized atomic record within a session: tool call, edit,
  test run, error, retry, human intervention, decision, completion.
 - **Signal** — a derived indicator extracted from sessions: e.g. *repeated test
  failure on same file*, *budget overrun*, *fast clean resolution*, *retry storm*,
  *human escalation*.
 - **Problem Pattern** — a recurring negative signal cluster ("agents repeatedly fail
  X because Y").
 - **Success Pattern** — a recurring positive resolution ("doing Z reliably resolves X
  cheaply").
 - **Solution Pattern** — a curated, reviewed artifact pairing a problem with one or
  more recommended resolutions, written agent-flavor-agnostically, with per-flavor
  rendering hints.
 - **Pattern Application** — the act of distributing a solution pattern into a specific
  agent environment (an instruction snippet, a tool, an extension), plus the record of
  its effect on later sessions.
 ## 6. Functional Requirements
 ### 6.1 Capture (G1)
 - **FR-C1** Ingest session transcripts/logs from each supported agent flavor via a
  per-flavor **collector adapter**.
 - **FR-C2** Normalize raw logs into the common `Session` + `Session Event` schema,
  regardless of source flavor.
 - **FR-C3** Tag every session with: agent flavor, repo, domain, task/workplan id (if
  any), outcome (success/fail/abandoned), and cost metrics (tokens, wall-clock,
  retries).
 - **FR-C4** Support both **batch import** (historical logs) and **incremental ingest**
  (new sessions as they close).
 - **FR-C5** Collection must be low-friction and non-blocking — an agent emitting
  session data must never slow or break the actual coding task.
 ### 6.2 Detect (G2)
 - **FR-D1** Run signal extractors over normalized sessions to surface problem and
  success signals.
 - **FR-D2** Cluster recurring signals across sessions/repos/flavors into candidate
  Problem Patterns and Success Patterns.
 - **FR-D3** For each candidate pattern, attach **evidence**: the supporting sessions,
  frequency, affected repos, affected flavors, and estimated cost impact.
 - **FR-D4** Flag **cross-flavor** patterns explicitly (a problem seen in Claude that
  Codex also hits) — these are the highest-value reuse targets.
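 As a toy illustration of FR-D2 to FR-D4, the sketch below groups signals by an exact `fingerprint` and flags groups seen in more than one flavor. Exact-match grouping stands in for real clustering, and the signal fields (`fingerprint`, `session_uid`, `flavor`, `repo`) are assumptions made for this example.

```python
from collections import defaultdict


def candidate_patterns(signals, min_sessions=2):
    """Group signals by fingerprint; keep groups spanning enough sessions."""
    groups = defaultdict(list)
    for signal in signals:
        groups[signal["fingerprint"]].append(signal)
    candidates = []
    for fingerprint, members in groups.items():
        sessions = {m["session_uid"] for m in members}
        if len(sessions) < min_sessions:
            continue  # a one-off, not a recurring pattern
        flavors = sorted({m["flavor"] for m in members})
        candidates.append({
            "fingerprint": fingerprint,
            "frequency": len(members),
            "sessions": sorted(sessions),
            "repos": sorted({m["repo"] for m in members}),
            "flavors": flavors,
            "cross_flavor": len(flavors) > 1,
        })
    # Cross-flavor candidates first, then by descending frequency.
    return sorted(candidates, key=lambda c: (not c["cross_flavor"], -c["frequency"]))
```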
 ### 6.3 Curate (G3)
 - **FR-U1** Present candidate patterns for review with their evidence in a
  discuss/approve/reject workflow.
 - **FR-U2** Allow a reviewer (human or kaizen agent) to promote a candidate into a
  **Solution Pattern**: a named, versioned artifact with problem description,
  recommended resolution(s), applicability scope, and per-flavor rendering hints.
 - **FR-U3** Maintain a **Pattern Catalog** as the source of truth for approved
  solution patterns, versioned and stored as files in-repo (consistent with ADR-001:
  files originate work, the hub indexes them).
 - **FR-U4** Record pattern decisions through the State Hub decision mechanism so
  rationale is auditable.
 ### 6.4 Distribute (G4)
 - **FR-X1** Render each approved solution pattern into per-flavor artifacts via
  **distributor adapters**:
  - Claude → `CLAUDE.md` snippets, skills, or settings/hooks.
  - Codex → `AGENTS.md` snippets / repo conventions.
  - GrokBuild → its native instruction/extension format.
 - **FR-X2** Scope distribution by repo and domain, so a pattern only lands where it
  applies.
 - **FR-X3** Distribution is **proposed, not auto-applied** in v1 — output is a
  reviewable change (e.g. a workplan or PR), gated by human approval.
 - **FR-X4** Track which patterns are currently active in which environments.
 ### 6.5 Measure (G5)
 - **FR-M1** After a pattern is applied, compare subsequent sessions touching the same
  signal against the pre-application baseline (cost, retry rate, success rate,
  human-intervention rate).
 - **FR-M2** Surface per-pattern **effectiveness** so ineffective patterns can be
  revised or retired.
 - **FR-M3** Provide a fleet-level view: are sessions across the collection getting
  cheaper / more reliable over time? (This is the helix turning.)
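 The before/after comparison in FR-M1 might be computed along these lines. This is a simplified sketch: it assumes each session is a dict with `started_at` (comparable ISO string), `tokens`, and `outcome`, and it reports raw deltas without any significance testing. `pattern_effect` is a hypothetical name.

```python
from statistics import median


def pattern_effect(sessions, applied_at):
    """Compare sessions before and after applied_at on cost and success rate."""

    def summarize(group):
        if not group:
            return None
        return {
            "n": len(group),
            "median_tokens": median(s["tokens"] for s in group),
            "success_rate": sum(s["outcome"] == "success" for s in group) / len(group),
        }

    before = summarize([s for s in sessions if s["started_at"] < applied_at])
    after = summarize([s for s in sessions if s["started_at"] >= applied_at])
    if before is None or after is None:
        return {"before": before, "after": after, "delta": None}
    return {
        "before": before,
        "after": after,
        "delta": {
            "median_tokens": after["median_tokens"] - before["median_tokens"],
            "success_rate": after["success_rate"] - before["success_rate"],
        },
    }
```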
 ### 6.6 Multi-Agent Support (G6)
 - **FR-A1** The core schema, detection, catalog, and measurement are **flavor-agnostic**.
 - **FR-A2** All flavor-specific knowledge lives in **collector adapters** (input) and
  **distributor adapters** (output). Adding a fourth agent = adding one collector +
  one distributor, no core changes.
 - **FR-A3** A successful pattern discovered via one flavor MUST be expressible for all
  other supported flavors.
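 The adapter boundary in FR-A1 to FR-A3 could be expressed as two small protocols plus a registry. The method names (`discover`, `normalize`, `render`) and the `AdapterRegistry` class are assumptions of this sketch; the requirement only fixes that flavor knowledge stays in adapters.

```python
from typing import Iterable, Protocol


class CollectorAdapter(Protocol):
    flavor: str

    def discover(self) -> Iterable[str]:
        """Yield source paths of sessions to ingest."""

    def normalize(self, source_path: str) -> dict:
        """Return one normalized session with its events."""


class DistributorAdapter(Protocol):
    flavor: str

    def render(self, pattern: dict) -> str:
        """Render a solution pattern into this flavor's native artifact text."""


class AdapterRegistry:
    """Core code looks adapters up by flavor and never imports one directly."""

    def __init__(self):
        self.collectors = {}
        self.distributors = {}

    def register(self, collector, distributor):
        if collector.flavor != distributor.flavor:
            raise ValueError("collector and distributor flavors differ")
        self.collectors[collector.flavor] = collector
        self.distributors[collector.flavor] = distributor

    def render_everywhere(self, pattern):
        """FR-A3: express one pattern for every registered flavor."""
        return {flavor: d.render(pattern) for flavor, d in self.distributors.items()}
```

 Under this shape, adding a fourth flavor is one `register(...)` call with a new collector and distributor pair.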
 ## 7. Architecture Overview
 ```
   ┌──────────── per-flavor edges ────────────┐         ┌──── flavor-agnostic core ────┐
   │                                           │         │                              │
 Claude ─┐                                     │         │                              │
 Codex  ─┼─► Collector Adapters ──► Normalizer ─┼────────►│  Session + Event Store       │
 Grok   ─┘                                     │         │           │                  │
                                               │         │           ▼                  │
                                               │         │  Signal Extractors           │
                                               │         │           │                  │
                                               │         │           ▼                  │
                                               │         │  Pattern Detector / Clusterer│
                                               │         │           │                  │
                                               │         │           ▼                  │
                                               │         │  Curation + Pattern Catalog  │  ◄─ reviewer (human/kaizen)
                                               │         │           │                  │
 Claude ◄┐                                     │         │           ▼                  │
 Codex  ◄┼── Distributor Adapters ◄────────────┼─────────│  Effectiveness Measurement   │
 Grok   ◄┘                                     │         │                              │
   └───────────────────────────────────────────┘         └──────────────────────────────┘
                                  ▲ feeds back into ▲  tools / environments / instructions
 ```
 **Design principle:** *agnostic core, thin adapters at the edges.* The expensive,
 reusable intelligence (normalized sessions, detection, catalog, measurement) is built
 once; each agent flavor only needs an input adapter and an output adapter.
 ## 8. Data & Storage
 - **Pattern Catalog** and **workplans**: files in `agentic-resources` (per ADR-001 in
  AGENTS.md — files are the source of truth, the hub indexes them).
 - **Session/event data**: a local store (start simple: structured files / SQLite;
  graduate to Postgres alongside the State Hub if volume warrants).
 - **Decisions & progress**: recorded through the Custodian State Hub so the broader
  ecosystem stays aware of Helix Forge's activity.
 ## 9. Integration with the Custodian State Hub
 Helix Forge runs inside the `helix_forge` domain and is **not** a competing system of
 record:
 - Work originates as **workplans** in this repo (`AGENTIC-WP-NNNN`), synced via
  `make fix-consistency REPO=agentic-resources`.
 - Pattern-promotion and distribution decisions are logged via the hub's decision API.
 - Each Helix Forge run logs at least one `add_progress_event()` / `POST /progress/`.
 - The hub remains a **read model**; Helix Forge writes its durable artifacts as files
  and lets the hub index them.
 ### 9.1 Downstream: kaizen-agentic project metrics correlation
 Helix Forge is a **fleet-level** producer of normalized session digests. The
 **kaizen-agentic** framework is a **project-scoped** consumer of optional
 correlation fields on its execution metrics (ADR-004). The two layers link
 **by reference** — kaizen-agentic does not re-implement JSONL ingestion or write
 into the Helix Forge store.
 | Layer | Owner | What it stores |
 |-------|-------|----------------|
 | Fleet | agentic-resources (`session_memory`) | Per-session digests in the local SQLite store |
 | Project | kaizen-agentic | `.kaizen/metrics/<agent>/executions.jsonl` |
 **Canonical spec in this repo:** [DESIGN-session-memory.md §11](DESIGN-session-memory.md#11-project-metrics-correlation-kaizen-agentic)
 (session-close env export, digest read path, stable JSON shape).
 **Authoritative cross-repo contract (kaizen-agentic):**
 [Helix Forge Correlation Contract](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/integrations/helix-forge-correlation.md).
 Field mapping: `Session.session_uid` → `helix_session_uid`; digest token totals →
 `tokens`; MCP/tool overhead share → `infra_overhead_share`.
 **Read path for consumers:** `HELIX_STORE_DB` points at the digest SQLite file
 (default `session_memory/.store/mem.db`); `python -m session_memory.digest_lookup
 <uid> --json` or `kaizen-agentic metrics correlate <uid>` performs a read-only
 lookup. No ingestion code belongs in kaizen-agentic.
 ## 10. Success Metrics
 | Metric | Meaning | Target (directional, v1) |
 |--------|---------|--------------------------|
 | Sessions captured | Coverage of real work | ≥ 90% of sessions across the 3 flavors normalized |
 | Patterns cataloged | Knowledge made reusable | A growing, non-trivial catalog of reviewed solution patterns |
 | Cross-flavor patterns | Reuse leverage | ≥ 1 pattern proven to transfer across flavors |
 | Pattern effectiveness | Loop is closing | Applied patterns show measurable cost/reliability improvement vs. baseline |
 | Fleet trend | The helix turns | Median session cost ↓ and success rate ↑ over time |
 | Repeated-failure rate | Friction eliminated | Known problem patterns recur less after distribution |
 ## 11. Phasing / Roadmap
 - **Phase 0 — Foundations.** Define the Session/Event schema and Pattern Catalog
  format. One collector adapter (Claude) + batch import. Manual inspection only.
 - **Phase 1 — Detect.** Signal extractors + pattern clustering over captured sessions;
  candidate patterns surfaced with evidence. Add Codex + GrokBuild collectors.
 - **Phase 2 — Curate.** Review workflow + versioned Pattern Catalog, wired to hub
  decisions.
 - **Phase 3 — Distribute.** Distributor adapters for all three flavors; patterns ship
  as reviewable workplans/PRs (HITL).
 - **Phase 4 — Measure.** Baseline-vs-after effectiveness and fleet-level trend
  reporting; retire ineffective patterns. Loop is closed.
 ## 12. Open Questions
 - **OQ1** What is the canonical raw log format available from each of Claude, Codex,
  and GrokBuild today, and how lossy is normalization from each?
 - **OQ2** How are sessions reliably bounded and attributed to a repo/task across the
  three flavors?
 - **OQ3** Where does detection logic run — local batch jobs, hub-side, or a dedicated
  service? What volume do we actually expect?
 - ~~**OQ4** Pattern format: how do we keep one agnostic representation while giving each
  distributor enough to render high-quality native artifacts?~~ **Resolved (Phase 2,
  AGENTIC-WP-0004):** the `SolutionPattern` core is flavor-agnostic (problem,
  resolutions, scope, provenance) and carries per-flavor knowledge only in a separate
  `rendering_hints` sub-structure keyed by flavor — distributors read the hints, the
  core stays neutral. Catalogued as versioned files-first artifacts (FR-U3).
 - ~~**OQ5** What's the minimum trustworthy evidence bar before a pattern is allowed to be
  distributed to live agent environments?~~ **Resolved (Phase 2):** a two-tier
  evidence bar (`[curate.gate]`). A *promote* floor (frequency / distinct sessions /
  cost-impact) admits a candidate as `provisional`; a stricter *distribution* floor
  (higher frequency, optional cross-flavor requirement, cost-impact) is required to
  mark a pattern `approved` + `distribution_ready`. Defaults are conservative and
  config-tunable.
 - ~~**OQ6** How do we prevent pattern bloat — too many low-value instructions degrading
  agent context budgets (cf. the token-budget policy in global instructions)?~~
  **Resolved (Phase 2):** a bloat guard flags duplicate (same id) and near-duplicate
  (same signal-type+locus) candidates at review time, and the catalog dedups
  structurally on the source-candidate key so re-promotion never multiplies entries.
  Thin candidates stay `provisional` (not distributed) rather than padding live
  context.
 ## 13. Risks
 | Risk | Mitigation |
 |------|------------|
 | Capture overhead slows real coding sessions | Async, non-blocking collection (FR-C5); never in the agent's critical path. |
 | Patterns become noise / context bloat | Effectiveness gating (FR-M2) + retirement; measure before broad distribution. |
 | Over-fitting to one flavor | Agnostic core + explicit cross-flavor flagging (FR-D4, FR-A3). |
 | Bad pattern degrades agents | HITL approval before distribution (FR-X3); baseline measurement to catch regressions. |
 | Drift from State Hub conventions | Files-first per ADR-001; log via hub; no competing source of record. |
 ---
 *This PRD is a draft for discussion. Next step: a `proposed` workplan
 (`AGENTIC-WP-0002`) scoping Phase 0 — the Session/Event schema and the first
 (Claude) collector adapter.*
--- a/registry/README.md
+++ b/registry/README.md
@@ -0,0 +1,12 @@
 # Capability Registry
 Markdown-first capability index for federation and reuse planning.
 ## Authoring
 1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
 2. Add the row to `indexes/capabilities.yaml`.
 3. Run `reuse-surface validate` from a checkout with the CLI installed.
 4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.
 Federation contract: reuse-surface `docs/RegistryFederation.md`.
--- a/registry/capabilities/.gitkeep
+++ b/registry/capabilities/.gitkeep
--- a/registry/indexes/capabilities.yaml
+++ b/registry/indexes/capabilities.yaml
@@ -0,0 +1,4 @@
 version: 1
 updated: '2026-06-16'
 domain: helix_forge
 capabilities: []
--- a/session_memory/README.md
+++ b/session_memory/README.md
@@ -0,0 +1,260 @@
 # session_memory
 Capture + retention layer for Helix Forge — the **Capture** stage of the loop in
 [../docs/PRD-helix-forge.md](../docs/PRD-helix-forge.md), built to the
 [../docs/DESIGN-session-memory.md](../docs/DESIGN-session-memory.md) spec.
 It scans coding-agent session logs, normalizes them into one schema, distills a
 compact per-session digest, and ages out raw bulk under a **storage budget**
 (dropping sessions once analyzed and once space is needed) rather than a fixed
 time window.
 ## Layout
 ```
 session_memory/
  adapters/common.py   # shared Normalized bundle + helpers
  adapters/claude.py   # Tier0 -> Tier1 normalizers, one per flavor
  adapters/codex.py    #   (rollout {timestamp,type,payload}, flat call_id join)
  adapters/grok.py     #   (per-session dir: chat_history + events + updates)
  core/schema.py       # Session / SessionEvent / Cost
  core/store.py        # SQLite rows + blob-dir bodies (Tier1) + digests/patterns (Tier2)
  core/cursor.py       # incremental ingest cursors
  core/digest.py       # Tier1 -> Tier2 promotion + outcome heuristic
  core/retention.py    # budget-based eviction sweep
  ingest.py            # one sweep: discover -> normalize -> store -> digest -> evict
  detect/signals.py    # signal extractors over digests
  detect/cluster.py    # cluster signals -> candidate patterns + cross-flavor flag
  detect/__main__.py   # python -m session_memory.detect (ranked report)
  curate/schema.py     # SolutionPattern artifact + per-flavor rendering hints
  curate/catalog.py    # versioned, files-first Pattern Catalog (dedup on id)
  curate/gating.py     # promotion evidence bar + bloat guard
  curate/review.py     # discuss/approve/reject -> promote workflow
  curate/decisions.py  # hub decision audit trail (graceful local-queue fallback)
  curate/__main__.py   # python -m session_memory.curate (interactive / --auto-approve)
  catalog/             # the committed Pattern Catalog (source of truth)
  distribute/base.py   # Artifact + Distributor protocol + idempotent snippet markers
  distribute/claude.py # CLAUDE.md (or skill) renderer    } per-flavor edges
  distribute/codex.py  # AGENTS.md renderer                } (agnostic body,
  distribute/grok.py   # native instruction renderer       }  different targets)
  distribute/proposals.py  # scoping + proposed-not-applied output + active registry
  distribute/__main__.py   # python -m session_memory.distribute
  measure/metrics.py   # fleet metrics + persisted baseline snapshots
  measure/effect.py    # before/after per-pattern effectiveness
  measure/__main__.py  # python -m session_memory.measure
  retro/build.py       # windowed top-3-per-repo suggestions
  retro/publish.py     # hub coding_retro read model + local report
  retro/__main__.py    # python -m session_memory.retro
  digest_lookup.py     # python -m session_memory.digest_lookup (read one digest, no ingest)
  config.toml          # store paths, retention caps, sources, repo->domain map, curate gate
 ```
 The local store lives under `session_memory/.store/` (gitignored).
 ## Run a sweep
 ```bash
 # from the repo root
 python -m session_memory.ingest                 # ingest + analyze + evict
 python -m session_memory.ingest --dry-run       # discover + parse only, writes nothing
 python -m session_memory.ingest --config path/to/config.toml
 ```
 Output reports `discovered / ingested / skipped_unchanged / analyzed` and a
 retention line (`freed`, `final_usage`, and per-pass eviction counts). Sweeps are
 idempotent — re-running skips unchanged files via the cursor.
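A minimal sketch of how a cursor can make a sweep idempotent. It assumes a file counts as "unchanged" when its size and modification time match the last recorded values. The real key used by `core/cursor.py` is not shown here, so `file_fingerprint` and `plan_sweep` are hypothetical.

```python
import os
import tempfile


def file_fingerprint(path: str) -> tuple[int, int]:
    """(size, mtime_ns) as a cheap change detector (an assumption, see above)."""
    st = os.stat(path)
    return st.st_size, st.st_mtime_ns


def plan_sweep(paths: list[str], cursor: dict[str, tuple[int, int]]) -> tuple[list[str], int]:
    """Return (paths to ingest, skipped_unchanged) and update the cursor in place."""
    to_ingest: list[str] = []
    skipped = 0
    for path in paths:
        fingerprint = file_fingerprint(path)
        if cursor.get(path) == fingerprint:
            skipped += 1          # nothing new since the last sweep
            continue
        cursor[path] = fingerprint
        to_ingest.append(path)
    return to_ingest, skipped


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "session.jsonl")
        with open(log, "w", encoding="utf-8") as fh:
            fh.write("{}\n")
        cursor: dict[str, tuple[int, int]] = {}
        print(plan_sweep([log], cursor))   # first sweep ingests the file
        print(plan_sweep([log], cursor))   # second sweep skips it
```

A production cursor would probably record the fingerprint only after the store write succeeds, so that a crashed sweep is retried.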
 ## Scheduling (cadence)
 Retention is budget-based; the `cadence` in `config.toml` only decides how often
 the sweep *runs*. Trigger it with the repo scheduler, e.g. daily:
 ```bash
 # Claude Code: schedule a daily routine that runs the sweep
 /schedule "daily session-memory sweep" -- python -m session_memory.ingest
 ```
 A cron entry or `/loop` on a timer works equally well. Push-capture (agent
 Stop/SessionEnd hooks) can also enqueue a sweep; see design §7.
 ## Detect candidate patterns
 After ingesting, mine the digests for recurring problem/success patterns:
 ```bash
 python -m session_memory.detect                 # ranked report, cross-flavor first
 python -m session_memory.detect --json          # machine-readable candidates
 python -m session_memory.detect --min-frequency 3
 ```
 Candidates are persisted to a Tier 2 `patterns` table and are the input to the
 Curate phase (Phase 2). Patterns whose evidence spans more than one agent flavor
 are flagged `[CROSS-FLAVOR]` — the highest-value reuse targets.
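One way to picture the cross-flavor flag and the "cross-flavor first" ranking is sketched below. The signal dictionaries and the `(type, locus)` grouping key are assumptions made for illustration; `detect/cluster.py` may group and score differently.

```python
from collections import defaultdict


def cluster(signals: list[dict], min_frequency: int = 2) -> list[dict]:
    """Group signals by (type, locus) and flag groups seen in more than one flavor."""
    groups: dict[tuple[str, str], list[dict]] = defaultdict(list)
    for sig in signals:
        groups[(sig["type"], sig["locus"])].append(sig)

    candidates = []
    for key, members in groups.items():
        if len(members) < min_frequency:
            continue  # too rare to count as a recurring pattern
        flavors = sorted({m["flavor"] for m in members})
        candidates.append({
            "key": key,
            "frequency": len(members),
            "flavors": flavors,
            "cross_flavor": len(flavors) > 1,
        })
    # Cross-flavor candidates first, then by descending frequency.
    candidates.sort(key=lambda c: (not c["cross_flavor"], -c["frequency"]))
    return candidates


if __name__ == "__main__":
    demo = [
        {"type": "error", "locus": "pytest", "flavor": "claude"},
        {"type": "error", "locus": "pytest", "flavor": "codex"},
        {"type": "thrash", "locus": "schema", "flavor": "claude"},
        {"type": "thrash", "locus": "schema", "flavor": "claude"},
        {"type": "thrash", "locus": "schema", "flavor": "claude"},
    ]
    for cand in cluster(demo):
        tag = "[CROSS-FLAVOR] " if cand["cross_flavor"] else ""
        print(f"{tag}{cand['key']} x{cand['frequency']}")
```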
 ## Curate candidates into the Pattern Catalog
 Review detect candidates into versioned **Solution Patterns** held in the
 files-first catalog (`session_memory/catalog/`). The flow is **detect → curate →
 (Phase 3) distribute**; `curate` refreshes candidates by running detect first.
 ```bash
 python -m session_memory.curate                 # interactive review (a/r/d per candidate)
 python -m session_memory.curate --auto-approve  # batch: promote all that clear the evidence bar
 python -m session_memory.curate --json          # machine-readable result
 ```
 - **Promotion** writes a `SolutionPattern` file. Its id is the source candidate
  key, so re-promoting the same candidate deduplicates; content changes bump the
  semver and archive the prior version to `<id>.history.jsonl`.
 - The **evidence bar** (`[curate.gate]`) sets two floors: a promote floor and a
  stricter *distribution* floor. A thin-but-real candidate lands `provisional`;
  one clearing the distribution floor lands `approved` + `distribution_ready`.
 - A **bloat guard** flags duplicate / near-duplicate candidates so the catalog
  stays lean.
 - Re-review is **idempotent** — a remembered decision is skipped unless the
  candidate's evidence changed; a prior reject is not re-surfaced.
 - Each final promote/reject is recorded as a **hub decision**; if the hub is
  offline the decision is queued to `[curate].decision_queue` for later sync
  (the same after-the-fact pattern used in Phase 1).
 ### Curate knobs (`[curate]` / `[curate.gate]` in config.toml)
 | Key | Meaning |
 |-----|---------|
 | `catalog_dir` | committed Pattern Catalog dir (source of truth) |
 | `review_log` / `decision_queue` | remembered decisions + pending hub decisions (gitignored) |
 | `min_frequency` / `min_sessions` / `min_cost_impact` | floor to promote at all |
 | `dist_require_cross_flavor` | require cross-flavor evidence to be distribution-eligible |
 | `dist_min_frequency` / `dist_min_cost_impact` | stricter floor for `distribution_ready` |
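The two floors in the table can be read as a small decision function. The sketch below reuses the config key names from the table, but the default values, the `Candidate` fields, and the `below_bar` label are placeholders invented for the example, not the shipped defaults.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    # Promote floor (placeholder defaults, not the real config values).
    min_frequency: int = 2
    min_sessions: int = 2
    min_cost_impact: float = 0.0
    # Stricter distribution floor.
    dist_require_cross_flavor: bool = False
    dist_min_frequency: int = 5
    dist_min_cost_impact: float = 0.0


@dataclass
class Candidate:
    frequency: int
    sessions: int
    cost_impact: float
    cross_flavor: bool


def classify(c: Candidate, g: Gate) -> str:
    """Return 'below_bar', 'provisional', or 'distribution_ready'."""
    clears_promote = (
        c.frequency >= g.min_frequency
        and c.sessions >= g.min_sessions
        and c.cost_impact >= g.min_cost_impact
    )
    if not clears_promote:
        return "below_bar"
    clears_distribution = (
        c.frequency >= g.dist_min_frequency
        and c.cost_impact >= g.dist_min_cost_impact
        and (c.cross_flavor or not g.dist_require_cross_flavor)
    )
    return "distribution_ready" if clears_distribution else "provisional"


if __name__ == "__main__":
    gate = Gate(dist_require_cross_flavor=True)
    print(classify(Candidate(6, 3, 0.0, cross_flavor=False), gate))
    print(classify(Candidate(6, 3, 0.0, cross_flavor=True), gate))
```

Keeping the promote check separate from the distribution check means a thin candidate can still be recorded without ever reaching live agent context.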
 ## Distribute patterns as per-flavor proposals
 Render approved catalog patterns into per-flavor artifacts — **proposed, never
 auto-applied** (HITL). Extends the pipeline to **detect → curate → distribute**;
 the Measure step closes the loop.
 ```bash
 python -m session_memory.distribute                 # proposals for all repos/flavors
 python -m session_memory.distribute --repo state-hub --flavor claude
 python -m session_memory.distribute --json
 ```
 - Only `approved` + `distribution_ready` patterns are rendered; each pattern's
  `Scope` (repos/domains/flavors) decides where it lands (FR-X2).
 - Each flavor renders the **same agnostic body** to its own target (Claude →
  `CLAUDE.md`/skill, Codex → `AGENTS.md`, Grok → native) via `rendering_hints`
  (FR-A3); blocks carry stable `BEGIN/END` markers so re-running updates in place.
 - Output goes to `session_memory/proposals/<repo>/<target>` (gitignored,
  regenerated) — a reviewable diff a human applies (FR-X3). The committed
  `distribute/active_patterns.json` records which pattern+version is proposed in
  which `(repo, flavor)` (FR-X4).
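The in-place update behind the `BEGIN/END` markers could look roughly like this. The exact marker text is an assumption (HTML comments carrying a pattern id), and `upsert_block` is a hypothetical helper, not the function in `distribute/base.py`.

```python
import re


def upsert_block(document: str, pattern_id: str, body: str) -> str:
    """Insert or replace the marked block for pattern_id; safe to call repeatedly."""
    begin = f"<!-- BEGIN pattern:{pattern_id} -->"   # assumed marker format
    end = f"<!-- END pattern:{pattern_id} -->"
    block = f"{begin}\n{body.strip()}\n{end}"

    existing = re.compile(re.escape(begin) + r".*?" + re.escape(end), re.DOTALL)
    if existing.search(document):
        # Replace the old block where it stands; the lambda avoids backslash
        # interpretation in the replacement text.
        return existing.sub(lambda _match: block, document, count=1)

    if not document or document.endswith("\n\n"):
        separator = ""
    elif document.endswith("\n"):
        separator = "\n"
    else:
        separator = "\n\n"
    return f"{document}{separator}{block}\n"


if __name__ == "__main__":
    first = upsert_block("# Notes\n", "p1", "Prefer the narrow query tool.")
    second = upsert_block(first, "p1", "Prefer the narrow query tool.")
    print(first == second)  # re-running with the same body changes nothing
```

Because the markers are keyed by pattern id, a new version of the same pattern overwrites its own block and leaves hand-written text around it alone.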
 ## Measure effectiveness (closing the loop)
 Track whether the fleet is getting cheaper / more reliable, and whether a
 distributed pattern actually helped.
 ```bash
 python -m session_memory.measure --label "baseline"      # snapshot + trend
 python -m session_memory.measure --since 2026-06-07      # before/after a change
 python -m session_memory.measure --no-save --json
 ```
 - A **snapshot** (infra-overhead share, error rate, schema-thrash, token
  percentiles, success rate) is appended to `measure/baselines.jsonl` to build a
  trend (FR-M3).
 - `--since DATE` splits sessions before/after a change and diffs the metrics, with
  an `improved` verdict per metric (FR-M1/FR-M2) — so ineffective patterns can be
  retired. Recorded pre-fix baseline (2026-06-07): 27 sessions, infra-overhead
  median 11.7 %, error rate 0.96, schema-thrash 8 sessions.
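A before/after comparison of the kind `--since` performs can be sketched as below. It assumes each session is a dictionary with a `day` and numeric metric fields, and that the median is the aggregate being compared; both are illustrative choices, and `split_and_diff` is a hypothetical name.

```python
from datetime import date
from statistics import median
from typing import Optional


def split_and_diff(sessions: list[dict], since: date, metric: str,
                   lower_is_better: bool = True) -> Optional[dict]:
    """Compare the median of `metric` before vs. on-or-after `since`."""
    before = [s[metric] for s in sessions if s["day"] < since]
    after = [s[metric] for s in sessions if s["day"] >= since]
    if not before or not after:
        return None  # no verdict without data on both sides
    b, a = median(before), median(after)
    improved = a < b if lower_is_better else a > b
    return {"before": b, "after": a, "delta": a - b, "improved": improved}


if __name__ == "__main__":
    demo = [
        {"day": date(2026, 6, 1), "infra_overhead_share": 0.12},
        {"day": date(2026, 6, 3), "infra_overhead_share": 0.10},
        {"day": date(2026, 6, 8), "infra_overhead_share": 0.05},
        {"day": date(2026, 6, 9), "infra_overhead_share": 0.07},
    ]
    print(split_and_diff(demo, date(2026, 6, 7), "infra_overhead_share"))
```

For metrics where higher is better, such as success rate, pass `lower_is_better=False`.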
 ## Weekly retro (the input to the scheduled retrospection)
 A windowed roll-up: detect + measure over the last N days → the **top-3
 improvement suggestions per repo** (cross-flavor first; recommendations pulled
 from the Pattern Catalog) → published to the hub as the `coding_retro` read model.
 ```bash
 python -m session_memory.retro                      # last 7 days, local report
 python -m session_memory.retro --window-days 30 --json
 python -m session_memory.retro --publish            # also post coding_retro to the hub
 ```
 Writes `retro/last_retro.{json,md}` and (with `--publish`) posts an
 `event_type=coding_retro` progress event. This is consumed by activity-core's
 **Weekly Coding Retrospection** schedule (ACTIVITY-WP-0008, Saturday 19:00 Berlin),
 which emits one improvement task per relevant repo. Hub publish degrades
 gracefully when the hub is unreachable.
 ## Correlation with kaizen-agentic
 Helix Forge owns **fleet-level** session digests; **kaizen-agentic** owns
 **project-scoped** execution metrics (ADR-004). The two layers correlate by
 optional `helix_session_uid` on project records — **link-by-reference only**;
 kaizen-agentic does not ingest JSONL into this store.
 | Layer | Storage |
 |-------|---------|
 | Fleet (here) | `session_memory/.store/mem.db` → `digests` table |
 | Project (kaizen) | `.kaizen/metrics/<agent>/executions.jsonl` |
 - **Spec:** [DESIGN-session-memory.md §11](../docs/DESIGN-session-memory.md#11-project-metrics-correlation-kaizen-agentic)
 - **Contract (kaizen-agentic):** [Helix Forge Correlation Contract](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/integrations/helix-forge-correlation.md)
 ### Session-close env export
 After ingest has written the digest, agents using both layers export `HELIX_*`
 vars for `kaizen-agentic metrics record` to merge (names match ADR-004):
 `HELIX_SESSION_UID`, `HELIX_REPO`, `HELIX_FLAVOR`, `HELIX_TOKENS`,
 `HELIX_INFRA_OVERHEAD_SHARE`, and optionally `HELIX_STORE_DB` (absolute path to
 `mem.db`). See DESIGN §11.1 for field sources.
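As an illustration of the export step, the helper below builds the variable set named above from already-known digest values. The value formatting (integer tokens, four-decimal share) is an assumption, and `helix_env` is a hypothetical helper; ADR-004 and DESIGN §11.1 remain the reference for the actual encoding.

```python
import os
from typing import Optional


def helix_env(session_uid: str, repo: str, flavor: str, tokens: int,
              infra_overhead_share: float,
              store_db: Optional[str] = None) -> dict[str, str]:
    """Build the HELIX_* variables for a closing session (formatting is assumed)."""
    env = {
        "HELIX_SESSION_UID": session_uid,
        "HELIX_REPO": repo,
        "HELIX_FLAVOR": flavor,
        "HELIX_TOKENS": str(int(tokens)),
        "HELIX_INFRA_OVERHEAD_SHARE": f"{infra_overhead_share:.4f}",
    }
    if store_db:
        # The variable is documented as an absolute path to mem.db.
        env["HELIX_STORE_DB"] = os.path.abspath(store_db)
    return env


if __name__ == "__main__":
    exported = helix_env("claude:abc-123", "state-hub", "claude", 18250, 0.117)
    for name, value in exported.items():
        print(f"export {name}={value}")
```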
 ### Read one digest (for `metrics correlate`)
 ```bash
 python -m session_memory.digest_lookup claude:abc-123 --json
 HELIX_STORE_DB=/abs/path/to/mem.db python -m session_memory.digest_lookup <uid>
 ```
 Defaults to `[store].db_path` in `config.toml`. Read-only — does not run ingest.
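For consumers that want the same read-only guarantee from their own code, SQLite can be opened in `mode=ro` through a URI. The sketch below assumes a `digests` table with `session_uid` and `body` (JSON text) columns; those column names are hypothetical, and `lookup_digest` is not the shipped `digest_lookup` module.

```python
import json
import os
import sqlite3
from pathlib import Path
from typing import Optional


def lookup_digest(session_uid: str, db_path: Optional[str] = None) -> Optional[dict]:
    """Read one digest without ever opening the store for writing."""
    path = db_path or os.environ.get("HELIX_STORE_DB", "session_memory/.store/mem.db")
    # mode=ro makes SQLite refuse writes and fail if the file does not exist.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        row = conn.execute(
            "SELECT body FROM digests WHERE session_uid = ?",  # assumed columns
            (session_uid,),
        ).fetchone()
    finally:
        conn.close()
    return json.loads(row[0]) if row else None
```

Opening with `mode=ro` also avoids silently creating an empty database when the path is wrong, which a plain `sqlite3.connect(path)` would do.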
 ## Retention knobs (`[retention]` in config.toml)
 | Key | Meaning |
 |-----|---------|
 | `raw_soft_cap_bytes` | begin evicting **analyzed** sessions above this (oldest first) |
 | `raw_hard_cap_bytes` | absolute Tier 1 ceiling; overflow path may, as a last resort, evict un-analyzed sessions and report `data_loss` |
 | `raw_max_age_days` | backstop: analyzed raw older than this is evictable regardless of space |
 | `distilled_cap_bytes` | Tier 2 ceiling — **alert only**, never auto-dropped |
 **Invariant:** a session's raw bytes are never dropped before its Tier 2 digest
 exists, except the explicitly-reported hard-cap overflow path.
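The knobs and the invariant can be combined into a three-pass eviction plan, sketched below. The pass order (age backstop, then soft cap, then hard-cap overflow) and the `RawSession` fields are assumptions for illustration; `core/retention.py` may order or account for things differently.

```python
from dataclasses import dataclass


@dataclass
class RawSession:
    uid: str
    raw_bytes: int
    age_days: float
    analyzed: bool  # True once the Tier 2 digest exists


def plan_eviction(sessions: list[RawSession], soft_cap: int, hard_cap: int,
                  max_age_days: float) -> tuple[list[str], list[str], int]:
    """Return (evicted uids, data_loss uids, final usage in bytes)."""
    usage = sum(s.raw_bytes for s in sessions)
    oldest_first = sorted(sessions, key=lambda s: s.age_days, reverse=True)
    gone: set[str] = set()
    evicted: list[str] = []
    data_loss: list[str] = []

    def drop(s: RawSession, bucket: list[str]) -> None:
        nonlocal usage
        gone.add(s.uid)
        bucket.append(s.uid)
        usage -= s.raw_bytes

    # Pass 1: age backstop, analyzed sessions only.
    for s in oldest_first:
        if s.analyzed and s.age_days > max_age_days:
            drop(s, evicted)
    # Pass 2: soft cap, analyzed sessions only, oldest first.
    for s in oldest_first:
        if usage <= soft_cap:
            break
        if s.analyzed and s.uid not in gone:
            drop(s, evicted)
    # Pass 3: hard-cap overflow, last resort, reported as data loss.
    for s in oldest_first:
        if usage <= hard_cap:
            break
        if not s.analyzed and s.uid not in gone:
            drop(s, data_loss)
    return evicted, data_loss, usage
```

Un-analyzed sessions only ever appear in the `data_loss` list, which mirrors the stated invariant: raw bytes without a digest are dropped solely on the reported overflow path.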
 ## Tests
 ```bash
 python -m pytest          # schema, adapters, store, digest, retention, ingest, detect, curate
 ```
 ## Status
 - **Phase 0** (AGENTIC-WP-0002): schema, store, digest, budget retention, Claude
  adapter, ingest sweep.
 - **Phase 1** (AGENTIC-WP-0003): Codex + Grok adapters, multi-file session merge,
  and the Detect pipeline (signals → clustering → cross-flavor candidate patterns).
 - **Phase 2** (AGENTIC-WP-0004): Curate — Solution Pattern schema, versioned
  files-first Pattern Catalog, discuss/approve/reject review with an evidence bar +
  bloat guard, and hub-decision audit trail.
 - **Detect hardening** (AGENTIC-WP-0005): session-quality filter + tool-mix /
  infra-overhead signals. **Error mining** (AGENTIC-WP-0006): recurring error
  fingerprints → root-cause patterns.
 - **Phase 3** (AGENTIC-WP-0007): Distribute — per-flavor distributor adapters
  render approved patterns into proposed (HITL) artifacts, scoped by repo/domain,
  with an active-pattern registry.
 - **Phase 4** (AGENTIC-WP-0009): Measure — fleet baseline/trend + before/after
  per-pattern effectiveness. The Capture → Detect → Curate → Distribute → Measure
  loop is closed.
 - **Weekly retro** (AGENTIC-WP-0010): windowed top-3-per-repo + hub `coding_retro`
  publish.
 - **Kaizen correlation** (AGENTIC-WP-0011): bidirectional doc links, session-close
  `HELIX_*` env convention, `digest_lookup` read path.
--- a/session_memory/__init__.py
+++ b/session_memory/__init__.py
@@ -0,0 +1,7 @@
 """Coding Session Memory — Helix Forge capture + retention layer.
 See docs/DESIGN-session-memory.md. Importable package name uses an underscore
 (``session_memory``) where the design doc writes ``session-memory/``.
 """
 __all__ = ["core", "adapters"]
--- a/session_memory/adapters/__init__.py
+++ b/session_memory/adapters/__init__.py
@@ -0,0 +1 @@
 """Per-flavor collector adapters (Tier 0 -> Tier 1 normalization)."""
--- a/session_memory/adapters/claude.py
+++ b/session_memory/adapters/claude.py
@@ -0,0 +1,162 @@
 """Claude Code collector adapter — Tier 0 -> Tier 1 (design §2.1, §4.3).
 Reads ``~/.claude/projects/<url-encoded-cwd>/<session-uuid>.jsonl`` (and
 ``agent-*.jsonl`` sidechains), discriminates on the record ``type``, reconstructs
 the turn DAG via ``uuid``/``parentUuid``, and emits normalized records.
 Returns a :class:`Normalized` bundle: the ``Session``, its ordered
 ``SessionEvent`` list, and a ``blobs`` map (``payload_ref -> full text body``)
 that the store persists out-of-line so Tier 1 rows stay light.
 """
 from __future__ import annotations
 import os
 from typing import Any, Optional
 from ..core.schema import Cost, Session, SessionEvent
 from .common import (  # noqa: F401  (Normalized re-exported for back-compat)
    Normalized,
    classify_tool,
    first_line as _first_line,
    iter_jsonl as _iter_records,
    now_iso as _now,
    resolve_repo as _resolve_repo,
    seconds_between as _seconds_between,
    stringify as _stringify,
 )
 FLAVOR = "claude"
 def _content_blocks(message: dict[str, Any]) -> list[dict[str, Any]]:
    content = message.get("content")
    if isinstance(content, str):
        return [{"type": "text", "text": content}]
    if isinstance(content, list):
        return [b for b in content if isinstance(b, dict)]
    return []
 def parse_session(path: str, repo_domain_map: Optional[dict[str, str]] = None) -> Optional[Normalized]:
    """Parse one Claude transcript file into a Normalized bundle.
    Returns None if the file has no usable session records.
    """
    repo_domain_map = repo_domain_map or {}
    records = list(_iter_records(path))
    if not records:
        return None
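    # Resolve the session id up front so events built from records that precede
    # the first ``sessionId`` do not get an "unknown" uid or a ``blob://None`` ref.
    session_id: Optional[str] = next(
        (r.get("sessionId") for r in records if r.get("sessionId")), None
    )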
    cwd = git_branch = version = model = None
    timestamps: list[str] = []
    file_is_sidechain = os.path.basename(path).startswith("agent-")
    events: list[SessionEvent] = []
    blobs: dict[str, str] = {}
    uuid_to_seq: dict[str, int] = {}
    cost = Cost()
    seq = 0
    def add_event(uuid: Optional[str], parent_uuid: Optional[str], ts, kind, *,
                  role=None, tool=None, summary=None, body=None, tokens=0, sidechain=False):
        nonlocal seq
        s = seq
        seq += 1
        if uuid:
            uuid_to_seq[uuid] = s
        parent_seq = uuid_to_seq.get(parent_uuid) if parent_uuid else None
        payload_ref = None
        if body:
            payload_ref = f"blob://{session_id}/{s}"
            blobs[payload_ref] = body
        events.append(SessionEvent(
            session_uid=Session.make_uid(FLAVOR, session_id or "unknown"),
            seq=s, parent_seq=parent_seq, ts=ts, kind=kind, role=role, tool=tool,
            summary=(summary or "")[:300] or None, payload_ref=payload_ref,
            tokens=tokens, is_sidechain=sidechain or file_is_sidechain,
        ))
    for rec in records:
        rtype = rec.get("type")
        ts = rec.get("timestamp")
        if ts:
            timestamps.append(ts)
        session_id = session_id or rec.get("sessionId")
        cwd = cwd or rec.get("cwd")
        git_branch = git_branch or rec.get("gitBranch")
        version = version or rec.get("version")
        uuid = rec.get("uuid")
        parent = rec.get("parentUuid")
        sidechain = bool(rec.get("isSidechain"))
        if rtype == "user":
            msg = rec.get("message", {})
            for b in _content_blocks(msg):
                bt = b.get("type")
                if bt == "tool_result":
                    body = _stringify(b.get("content"))
                    add_event(uuid, parent, ts, "tool_result", role="tool",
                              summary="tool result", body=body, sidechain=sidechain)
                else:
                    text = b.get("text", "")
                    add_event(uuid, parent, ts, "user_msg", role="user",
                              summary=_first_line(text), body=text, sidechain=sidechain)
        elif rtype == "assistant":
            msg = rec.get("message", {})
            model = model or msg.get("model")
            usage = msg.get("usage") or {}
            cost.input_tokens += int(usage.get("input_tokens", 0) or 0)
            cost.output_tokens += int(usage.get("output_tokens", 0) or 0)
            cost.cache_tokens += int(
                (usage.get("cache_read_input_tokens", 0) or 0)
                + (usage.get("cache_creation_input_tokens", 0) or 0)
            )
            out_tokens = int(usage.get("output_tokens", 0) or 0)
            for b in _content_blocks(msg):
                bt = b.get("type")
                if bt == "thinking":
                    add_event(uuid, parent, ts, "thinking", role="assistant",
                              summary="thinking", body=b.get("thinking", ""), sidechain=sidechain)
                elif bt == "text":
                    text = b.get("text", "")
                    add_event(uuid, parent, ts, "assistant_msg", role="assistant",
                              summary=_first_line(text), body=text, tokens=out_tokens, sidechain=sidechain)
                elif bt == "tool_use":
                    name = b.get("name", "")
                    inp = b.get("input", {})
                    body = _stringify(inp)
                    cmd = inp.get("command", "") if isinstance(inp, dict) else ""
                    kind = classify_tool(name, _stringify(cmd))
                    add_event(uuid, parent, ts, kind, role="assistant", tool=name,
                              summary=f"{name}", body=body, sidechain=sidechain)
        elif rtype == "summary":
            add_event(uuid, parent, ts, "lifecycle", summary="summary",
                      body=_stringify(rec.get("summary")), sidechain=sidechain)
        # queue-operation / ai-title / last-prompt / attachment: skipped as events
    if session_id is None:
        return None
    cost.turns = sum(1 for e in events if e.kind == "user_msg")
    started = min(timestamps) if timestamps else None
    ended = max(timestamps) if timestamps else None
    cost.wall_clock_s = _seconds_between(started, ended)
    repo, domain = _resolve_repo(cwd, repo_domain_map)
    session = Session(
        session_uid=Session.make_uid(FLAVOR, session_id),
        flavor=FLAVOR,
        native_session_id=session_id,
        repo=repo, domain=domain, cwd=cwd, git_branch=git_branch,
        model=model, started_at=started, ended_at=ended,
        outcome="unknown",  # outcome inference happens in the digest step (T04)
        cost=cost,
        source_path=path,
        source_bytes=os.path.getsize(path) if os.path.exists(path) else 0,
        discovered_at=_now(),
    )
    return Normalized(session=session, events=events, blobs=blobs)
--- a/session_memory/adapters/codex.py
+++ b/session_memory/adapters/codex.py
@@ -0,0 +1,167 @@
 """OpenAI Codex CLI collector adapter — Tier 0 -> Tier 1 (design §2.2, §4.3).
 Reads ``$CODEX_HOME/sessions/YYYY/MM/DD/rollout-*.jsonl``. Each line is a
 ``RolloutLine`` wrapper ``{timestamp, type, payload}``; ``type`` discriminates
 ``session_meta`` / ``response_item`` / ``event_msg`` / ``turn_context`` /
 ``compacted``.
 Codex is **flat** — tool calls and outputs are joined only by ``call_id`` with no
 parent-ref DAG — so ``seq`` is assigned by temporal (line) order and
 ``parent_seq`` is set for ``function_call_output`` back to its ``function_call``.
 """
 from __future__ import annotations
 import os
 from typing import Any, Optional
 from ..core.schema import Cost, Session, SessionEvent
 from .common import (
    Normalized,
    classify_tool,
    first_line,
    iter_jsonl,
    now_iso,
    resolve_repo,
    seconds_between,
    stringify,
 )
 FLAVOR = "codex"
 def _message_text(payload: dict[str, Any]) -> str:
    content = payload.get("content")
    if isinstance(content, str):
        return content
    parts = []
    if isinstance(content, list):
        for b in content:
            if isinstance(b, dict):
                parts.append(b.get("text") or b.get("output_text") or "")
            elif isinstance(b, str):
                parts.append(b)
    return "\n".join(p for p in parts if p)
 def _extract_tokens(payload: dict[str, Any]) -> tuple[int, int, int]:
    """Best-effort (input, output, cache) from a token_count payload.
    Field shapes vary across Codex versions; probe the known locations in order
    and fall back to ``(0, 0, 0)`` when none of them carries token counts.
    """
    for scope in (payload, payload.get("info") or {}, payload.get("usage") or {},
                  (payload.get("info") or {}).get("total_token_usage") or {}):
        if isinstance(scope, dict):
            i = scope.get("input_tokens") or scope.get("prompt_tokens")
            o = scope.get("output_tokens") or scope.get("completion_tokens")
            if i is not None or o is not None:
                cache = scope.get("cached_input_tokens") or scope.get("cache_read_input_tokens") or 0
                return int(i or 0), int(o or 0), int(cache or 0)
    return 0, 0, 0
 def parse_session(path: str, repo_domain_map: Optional[dict[str, str]] = None) -> Optional[Normalized]:
    repo_domain_map = repo_domain_map or {}
    records = list(iter_jsonl(path))
    if not records:
        return None
    session_id: Optional[str] = None
    cwd = model = cli_version = None
    timestamps: list[str] = []
    events: list[SessionEvent] = []
    blobs: dict[str, str] = {}
    call_seq: dict[str, int] = {}  # call_id -> seq of its function_call
    cost = Cost()
    seq = 0
    def add_event(ts, kind, *, role=None, tool=None, summary=None, body=None,
                  tokens=0, parent_seq=None) -> int:
        nonlocal seq
        s = seq
        seq += 1
        payload_ref = None
        if body:
            payload_ref = f"blob://{session_id}/{s}"
            blobs[payload_ref] = body
        events.append(SessionEvent(
            session_uid=Session.make_uid(FLAVOR, session_id or "unknown"),
            seq=s, parent_seq=parent_seq, ts=ts, kind=kind, role=role, tool=tool,
            summary=(summary or "")[:300] or None, payload_ref=payload_ref, tokens=tokens,
        ))
        return s
    for rec in records:
        rtype = rec.get("type")
        ts = rec.get("timestamp")
        if ts:
            timestamps.append(ts)
        payload = rec.get("payload") or {}
        if rtype == "session_meta":
            session_id = session_id or payload.get("id")
            cwd = cwd or payload.get("cwd")
            model = model or payload.get("model")
            cli_version = cli_version or payload.get("cli_version")
        elif rtype == "turn_context":
            model = model or payload.get("model")
        elif rtype == "response_item":
            ptype = payload.get("type")
            if ptype == "message":
                role = payload.get("role", "assistant")
                text = _message_text(payload)
                kind = "assistant_msg" if role == "assistant" else "user_msg"
                add_event(ts, kind, role=role, summary=first_line(text), body=text)
            elif ptype == "function_call":
                name = payload.get("name", "")
                args = stringify(payload.get("arguments"))
                kind = classify_tool(name, args)
                s = add_event(ts, kind, role="assistant", tool=name,
                              summary=name, body=args)
                call_id = payload.get("call_id")
                if call_id:
                    call_seq[call_id] = s
            elif ptype == "function_call_output":
                call_id = payload.get("call_id")
                parent = call_seq.get(call_id)
                body = stringify(payload.get("output"))
                add_event(ts, "tool_result", role="tool", tool=None,
                          summary="tool result", body=body, parent_seq=parent)
            elif ptype == "reasoning":
                body = _message_text(payload) or stringify(payload.get("summary"))
                add_event(ts, "thinking", role="assistant", summary="reasoning", body=body)
        elif rtype == "event_msg":
            ptype = payload.get("type")
            if ptype == "task_started":
                add_event(ts, "lifecycle", summary="task_started")
            elif ptype == "task_complete":
                add_event(ts, "completion", summary="task_complete")
            elif ptype == "token_count":
                i, o, c = _extract_tokens(payload)
                cost.input_tokens += i
                cost.output_tokens += o
                cost.cache_tokens += c
            # user_message / agent_message echoes are duplicated by response_item
            # messages on modern Codex; skipped to avoid double counting.
    if session_id is None:
        return None
    cost.turns = sum(1 for e in events if e.kind == "user_msg")
    started = min(timestamps) if timestamps else None
    ended = max(timestamps) if timestamps else None
    cost.wall_clock_s = seconds_between(started, ended)
    repo, domain = resolve_repo(cwd, repo_domain_map)
    session = Session(
        session_uid=Session.make_uid(FLAVOR, session_id),
        flavor=FLAVOR, native_session_id=session_id,
        repo=repo, domain=domain, cwd=cwd, model=model,
        started_at=started, ended_at=ended, outcome="unknown", cost=cost,
        source_path=path, source_bytes=os.path.getsize(path) if os.path.exists(path) else 0,
        discovered_at=now_iso(),
    )
    return Normalized(session=session, events=events, blobs=blobs)
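Review note: the `function_call` / `function_call_output` branches above link a tool result to its call through `call_id`. The sketch below isolates that correlation step. The payloads are hypothetical, and the enumerate index stands in for the sequence number that `add_event` returns in the adapter.

```python
from typing import Optional

# Hypothetical response_item payloads; field names follow the adapter above.
payloads = [
    {"type": "function_call", "name": "shell", "call_id": "c1"},
    {"type": "function_call", "name": "apply_patch", "call_id": "c2"},
    {"type": "function_call_output", "call_id": "c2", "output": "ok"},
    {"type": "function_call_output", "call_id": "c1", "output": "done"},
    {"type": "function_call_output", "call_id": "missing", "output": "?"},
]

call_seq: dict[str, int] = {}          # call_id -> seq of the originating call
parents: list[Optional[int]] = []      # parent seq recorded for each output

for seq, payload in enumerate(payloads):
    call_id = payload.get("call_id")
    if payload["type"] == "function_call":
        if call_id:
            call_seq[call_id] = seq
    else:
        # Unknown call_ids resolve to None instead of raising.
        parents.append(call_seq.get(call_id))

print(parents)
```

Keying on `call_id` lets results arrive out of order and still find their call. A result whose call was never seen keeps `parent_seq=None`.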
--- a/session_memory/adapters/common.py
+++ b/session_memory/adapters/common.py
@@ -0,0 +1,100 @@
 """Shared adapter helpers (Tier 0 -> Tier 1).
 The ``Normalized`` bundle contract and small flavor-agnostic helpers used by every
 collector adapter. Per-flavor parsing lives in the individual adapter modules.
 """
 from __future__ import annotations
 import json
 import os
 from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from typing import Any, Optional
 from ..core.schema import Session, SessionEvent
 # tool names that mutate files -> kind "edit" (union across flavors)
 EDIT_TOOLS = {
    "Edit", "Write", "NotebookEdit", "MultiEdit",  # Claude
    "apply_patch", "write_file", "edit_file",        # Codex / Grok variants
 }
 # substrings in a shell/tool command that indicate a test run -> kind "test_run"
 TEST_HINTS = (
    "pytest", "unittest", "npm test", "npm run test", "go test",
    "cargo test", "jest", "vitest", "make test", "tox",
 )
 @dataclass
 class Normalized:
    session: Session
    events: list[SessionEvent]
    blobs: dict[str, str] = field(default_factory=dict)
 def resolve_repo(cwd: Optional[str], repo_domain_map: dict[str, str]) -> tuple[Optional[str], Optional[str]]:
    """cwd -> (repo, domain). repo is the cwd basename; domain via map."""
    if not cwd:
        return None, None
    repo = os.path.basename(cwd.rstrip("/")) or None
    domain = repo_domain_map.get(repo) if repo else None
    return repo, domain
 def is_test_command(text: str) -> bool:
    low = (text or "").lower()
    return any(h in low for h in TEST_HINTS)
 def classify_tool(name: str, command_text: str = "") -> str:
    """Map a tool invocation to an event kind: edit | test_run | tool_call."""
    if name in EDIT_TOOLS:
        return "edit"
    if is_test_command(command_text) or is_test_command(name):
        return "test_run"
    return "tool_call"
 def stringify(v: Any, limit: int = 20000) -> str:
    if v is None:
        return ""
    if isinstance(v, str):
        return v[:limit]
    try:
        return json.dumps(v, ensure_ascii=False)[:limit]
    except (TypeError, ValueError):
        return str(v)[:limit]
 def first_line(text: str) -> str:
    t = (text or "").strip()
    return t.splitlines()[0] if t else ""
 def seconds_between(start: Optional[str], end: Optional[str]) -> float:
    if not start or not end:
        return 0.0
    try:
        a = datetime.fromisoformat(start.replace("Z", "+00:00"))
        b = datetime.fromisoformat(end.replace("Z", "+00:00"))
        return max(0.0, (b - a).total_seconds())
    except (ValueError, TypeError):  # TypeError: naive vs. aware datetimes
        return 0.0
 def iter_jsonl(path: str):
    """Yield parsed JSON objects from a JSONL file, tolerating bad lines."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                continue
 def now_iso() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
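Review note: a minimal, self-contained sketch of how the classification helpers in `common.py` behave. The constants and functions are restated from the file above so the snippet runs without the package. The sample tool names and commands are illustrative assumptions, not taken from real transcripts.

```python
# Restated from common.py above so the sketch runs standalone.
EDIT_TOOLS = {
    "Edit", "Write", "NotebookEdit", "MultiEdit",
    "apply_patch", "write_file", "edit_file",
}
TEST_HINTS = (
    "pytest", "unittest", "npm test", "npm run test", "go test",
    "cargo test", "jest", "vitest", "make test", "tox",
)


def is_test_command(text: str) -> bool:
    low = (text or "").lower()
    return any(h in low for h in TEST_HINTS)


def classify_tool(name: str, command_text: str = "") -> str:
    if name in EDIT_TOOLS:
        return "edit"
    if is_test_command(command_text) or is_test_command(name):
        return "test_run"
    return "tool_call"


# Hypothetical invocations (names and commands are made up for illustration).
samples = [
    ("apply_patch", "*** Begin Patch"),
    ("shell", '{"command": "pytest -q tests/"}'),
    ("shell", '{"command": "ls -la"}'),
    # Substring matching: "tox" also matches inside "detox".
    ("shell", '{"command": "pip install detox"}'),
]
results = [classify_tool(name, cmd) for name, cmd in samples]
print(results)
```

Because the hints are matched as plain substrings, the last sample is classified as `test_run` even though no tests run. A word-boundary match would be stricter, at the risk of missing command variants, so whether to tighten it is a judgment call.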
--- a/session_memory/adapters/grok.py
+++ b/session_memory/adapters/grok.py
@@ -0,0 +1,182 @@
 """Grok CLI collector adapter — Tier 0 -> Tier 1 (design §2.3, §4.3).
 A Grok session is a *directory* ``~/.grok/sessions/<enc-cwd>/<uuid>/`` containing
 ``summary.json`` (metadata), ``chat_history.jsonl`` (the canonical transcript),
 ``events.jsonl`` (explicit lifecycle + ``turn_number``), and ``updates.jsonl``
 (ACP ``session/update`` stream, which carries tool-call names/args).
 The ingest glob matches ``chat_history.jsonl``; this adapter derives its sibling
 files from the same directory. Conversation order is taken from
 ``chat_history.jsonl``; tool-call names are paired, in order, from
 ``updates.jsonl`` ``tool_call`` entries to classify edits/test runs.
 """
 from __future__ import annotations
 import json
 import os
 from typing import Any, Optional
 from ..core.schema import Cost, Session, SessionEvent
 from .common import (
    Normalized,
    classify_tool,
    first_line,
    iter_jsonl,
    now_iso,
    resolve_repo,
    seconds_between,
    stringify,
 )
 FLAVOR = "grok"
 def _text_content(content: Any) -> str:
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return "\n".join(
            (b.get("text") or "") for b in content if isinstance(b, dict)
        )
    return ""
 def _tool_calls_in_order(session_dir: str) -> list[dict[str, Any]]:
    """Ordered list of {title, rawInput, id} from updates.jsonl tool_call entries."""
    calls: list[dict[str, Any]] = []
    upd = os.path.join(session_dir, "updates.jsonl")
    if not os.path.exists(upd):
        return calls
    for rec in iter_jsonl(upd):
        u = (rec.get("params") or {}).get("update") or {}
        if u.get("sessionUpdate") == "tool_call":
            calls.append({"title": u.get("title") or "", "rawInput": u.get("rawInput") or {},
                          "id": u.get("toolCallId")})
    return calls
 def _session_meta(session_dir: str) -> dict[str, Any]:
    p = os.path.join(session_dir, "summary.json")
    if not os.path.exists(p):
        return {}
    try:
        with open(p, "r", encoding="utf-8") as f:
            return json.load(f)
    except (OSError, ValueError):
        return {}
 def _lifecycle(session_dir: str) -> tuple[list[dict[str, Any]], Optional[str]]:
    """events.jsonl records + the model id seen there."""
    evs, model = [], None
    p = os.path.join(session_dir, "events.jsonl")
    if os.path.exists(p):
        for rec in iter_jsonl(p):
            evs.append(rec)
            model = model or rec.get("model_id")
    return evs, model
 def parse_session(path: str, repo_domain_map: Optional[dict[str, str]] = None) -> Optional[Normalized]:
    repo_domain_map = repo_domain_map or {}
    # accept either the chat_history.jsonl path or the session dir
    session_dir = path if os.path.isdir(path) else os.path.dirname(path)
    chat = os.path.join(session_dir, "chat_history.jsonl")
    if not os.path.exists(chat):
        return None
    meta = _session_meta(session_dir)
    info = meta.get("info") or {}
    session_id = info.get("id") or os.path.basename(session_dir.rstrip("/"))
    cwd = info.get("cwd") or meta.get("git_root_dir")
    life_events, life_model = _lifecycle(session_dir)
    model = meta.get("current_model_id") or life_model
    pending_calls = _tool_calls_in_order(session_dir)
    call_idx = 0
    events: list[SessionEvent] = []
    blobs: dict[str, str] = {}
    seq = 0
    def add(kind, *, role=None, tool=None, summary=None, body=None, parent_seq=None) -> int:
        nonlocal seq
        s = seq
        seq += 1
        ref = None
        if body:
            ref = f"blob://{session_id}/{s}"
            blobs[ref] = body
        events.append(SessionEvent(
            session_uid=Session.make_uid(FLAVOR, session_id), seq=s, parent_seq=parent_seq,
            ts=None, kind=kind, role=role, tool=tool,
            summary=(summary or "")[:300] or None, payload_ref=ref,
        ))
        return s
    # explicit lifecycle first (turn_started/turn_ended carry no bodies)
    for le in life_events:
        t = le.get("type")
        if t in ("turn_started", "loop_started", "turn_ended", "phase_changed"):
            add("lifecycle", summary=t)
    for rec in iter_jsonl(chat):
        rtype = rec.get("type")
        content = rec.get("content")
        if rtype == "user":
            text = _text_content(content)
            if text.strip():
                add("user_msg", role="user", summary=first_line(text), body=text)
        elif rtype == "reasoning":
            text = _text_content(content)
            if text.strip():
                add("thinking", role="assistant", summary="reasoning", body=text)
        elif rtype == "assistant":
            text = _text_content(content)
            if text.strip():
                add("assistant_msg", role="assistant", summary=first_line(text), body=text)
        elif rtype == "tool_result":
            # pair with the next tool_call (in order) to recover name/args
            tool = None
            parent = None
            if call_idx < len(pending_calls):
                call = pending_calls[call_idx]
                call_idx += 1
                tool = call["title"]
                cmd = stringify(call["rawInput"])
                kind = classify_tool(tool, cmd)
                parent = add(kind, role="assistant", tool=tool, summary=tool, body=cmd)
            body = _text_content(content) if not isinstance(content, str) else content
            add("tool_result", role="tool", tool=tool, summary="tool result",
                body=stringify(body), parent_seq=parent)
    if not events:
        return None
    cost = Cost(turns=sum(1 for e in events if e.kind == "user_msg"))
    started = info.get("created_at") or meta.get("created_at")
    ended = meta.get("last_active_at") or info.get("updated_at") or meta.get("updated_at")
    cost.wall_clock_s = seconds_between(started, ended)
    repo, domain = resolve_repo(cwd, repo_domain_map)
    session = Session(
        session_uid=Session.make_uid(FLAVOR, session_id), flavor=FLAVOR,
        native_session_id=session_id, repo=repo, domain=domain, cwd=cwd,
        git_branch=meta.get("head_branch"), model=model,
        started_at=started, ended_at=ended, outcome="unknown", cost=cost,
        source_path=chat,
        source_bytes=_dir_bytes(session_dir),
        discovered_at=now_iso(),
    )
    return Normalized(session=session, events=events, blobs=blobs)
 def _dir_bytes(d: str) -> int:
    total = 0
    for root, _, files in os.walk(d):
        for f in files:
            try:
                total += os.path.getsize(os.path.join(root, f))
            except OSError:
                pass
    return total
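Review note: `parse_session` pairs each `tool_result` in `chat_history.jsonl` with the next unused `tool_call` from `updates.jsonl`, purely by position. The sketch below reproduces that pairing on hypothetical data. `pair_in_order` and the sample titles are illustrative names, not part of the adapter.

```python
from typing import Any, Optional


def pair_in_order(
    results: list[str], calls: list[dict[str, Any]]
) -> list[tuple[Optional[str], str]]:
    """Pair the i-th tool result with the i-th recorded tool call, if any."""
    paired = []
    for idx, body in enumerate(results):
        # Results beyond the recorded calls keep no tool name.
        title = calls[idx]["title"] if idx < len(calls) else None
        paired.append((title, body))
    return paired


# Hypothetical data: two recorded calls, three results.
calls = [{"title": "run_terminal_cmd"}, {"title": "edit_file"}]
results = ["exit 0", "patched", "orphan output"]
pairs = pair_in_order(results, calls)
print(pairs)
```

Positional pairing stays correct only while both files list the same calls in the same order. If one `tool_call` entry is missing, every later result is attributed to the wrong tool. `_tool_calls_in_order` already collects `toolCallId` as `id`, but `parse_session` as shown does not use it.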
--- a/session_memory/catalog/sp-problem-budget_overrun-tokens.history.jsonl
+++ b/session_memory/catalog/sp-problem-budget_overrun-tokens.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-problem-budget_overrun-tokens", "name": "problem: budget overrun", "polarity": "problem", "problem": "problem: budget overrun", "provenance": {"detected_at": null, "evidence": {"cost_impact": 10.667, "cross_flavor": false, "flavors": ["claude"], "frequency": 3, "key": "problem:budget_overrun:tokens", "locus": "tokens", "polarity": "problem", "repos": ["artifact-store", "citation-evidence", "infospace-bench"], "score": 32.001, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:6e0d3d68-872b-4d93-bb09-0691e091314b", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78"], "signal_type": "budget_overrun", "title": "problem: budget overrun"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:budget_overrun:tokens"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["artifact-store", "citation-evidence", "infospace-bench"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-budget_overrun-tokens.json
+++ b/session_memory/catalog/sp-problem-budget_overrun-tokens.json
@@ -0,0 +1,77 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": true,
  "id": "sp-problem-budget_overrun-tokens",
  "name": "Budget overrun: token cost above peers",
  "polarity": "problem",
  "problem": "A session's token cost lands well above its peers (>p90). Usually driven by re-reading large files or tool outputs, carrying redundant context, or long exploratory loops without checkpoints.",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 10.667,
      "cross_flavor": false,
      "flavors": [
        "claude"
      ],
      "frequency": 3,
      "key": "problem:budget_overrun:tokens",
      "locus": "tokens",
      "polarity": "problem",
      "repos": [
        "artifact-store",
        "citation-evidence",
        "infospace-bench"
      ],
      "score": 32.001,
      "sessions": [
        "claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca",
        "claude:6e0d3d68-872b-4d93-bb09-0691e091314b",
        "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78"
      ],
      "signal_type": "budget_overrun",
      "title": "problem: budget overrun"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "problem:budget_overrun:tokens"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    }
  },
  "resolutions": [
    {
      "detail": "Use offset/limit; don't re-Read a file already in the transcript.",
      "steps": [
        "Locate with grep/glob first",
        "Read only the relevant span"
      ],
      "summary": "Read narrowly \u2014 target the region you need, not whole large files"
    },
    {
      "detail": "Summarize progress; avoid re-pulling outputs already shown.",
      "steps": [],
      "summary": "Checkpoint and prune context instead of re-fetching it"
    },
    {
      "detail": "grep/glob narrows scope far cheaper than reading whole trees.",
      "steps": [],
      "summary": "Prefer targeted search over broad reads to locate code"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude"
    ],
    "repos": [
      "artifact-store",
      "citation-evidence",
      "infospace-bench"
    ]
  },
  "status": "approved",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-problem-file_not_read-edit.history.jsonl
+++ b/session_memory/catalog/sp-problem-file_not_read-edit.history.jsonl
@@ -0,0 +1 @@
 {"covers": [], "created_at": "2026-06-07T13:26:25Z", "distribution_ready": true, "id": "sp-problem-file_not_read-edit", "name": "Read before you Edit", "polarity": "problem", "problem": "Agents call Edit/Write on a file they have not read in the current session, or after it changed under them. The edit tools reject this ('File has not been read yet' / 'File has been modified since read'), and the retry burns a turn. Top recurring error in the corpus (12/27 sessions, 8 repos).", "provenance": {"detected_at": null, "evidence": {"frequency": 32, "origin": "AGENTIC-WP-0006 error mining / ASSESSMENT-infra-friction.md", "polarity": "problem", "repos": 8, "sessions": 12}, "promoted_at": null, "source_key": "problem:file_not_read:edit"}, "rendering_hints": {"claude": {"target": "CLAUDE.md"}, "codex": {"target": "AGENTS.md"}, "grok": {"target": ".grok/instructions.md"}}, "resolutions": [{"detail": "Never blind-write a file you haven't read this session.", "steps": ["Read the target file", "Then Edit/Write"], "summary": "Read the file (or the region you'll touch) before Edit/Write"}, {"detail": "A stale read means the file changed under you; refresh, don't loop.", "steps": ["Re-Read the file", "Re-apply the Edit"], "summary": "On 'modified since read', re-Read then re-Edit"}], "schema_version": 1, "scope": {"domains": [], "flavors": [], "repos": []}, "status": "superseded", "updated_at": "2026-06-07T13:26:25Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-file_not_read-edit.json
+++ b/session_memory/catalog/sp-problem-file_not_read-edit.json
@@ -0,0 +1,63 @@
 {
  "covers": [
    "file has not been read",
    "modified since read",
    "file_not_read"
  ],
  "created_at": "2026-06-07T13:26:25Z",
  "distribution_ready": true,
  "id": "sp-problem-file_not_read-edit",
  "name": "Read before you Edit",
  "polarity": "problem",
  "problem": "Agents call Edit/Write on a file they have not read in the current session, or after it changed under them. The edit tools reject this ('File has not been read yet' / 'File has been modified since read'), and the retry burns a turn. Top recurring error in the corpus (12/27 sessions, 8 repos).",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "frequency": 32,
      "origin": "AGENTIC-WP-0006 error mining / ASSESSMENT-infra-friction.md",
      "polarity": "problem",
      "repos": 8,
      "sessions": 12
    },
    "promoted_at": null,
    "source_key": "problem:file_not_read:edit"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    },
    "codex": {
      "target": "AGENTS.md"
    },
    "grok": {
      "target": ".grok/instructions.md"
    }
  },
  "resolutions": [
    {
      "detail": "Never blind-write a file you haven't read this session.",
      "steps": [
        "Read the target file",
        "Then Edit/Write"
      ],
      "summary": "Read the file (or the region you'll touch) before Edit/Write"
    },
    {
      "detail": "A stale read means the file changed under you; refresh, don't loop.",
      "steps": [
        "Re-Read the file",
        "Re-apply the Edit"
      ],
      "summary": "On 'modified since read', re-Read then re-Edit"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [],
    "repos": []
  },
  "status": "approved",
  "updated_at": "2026-06-07T19:06:45Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-problem-infra_overhead-infra_overhead.history.jsonl
+++ b/session_memory/catalog/sp-problem-infra_overhead-infra_overhead.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": false, "id": "sp-problem-infra_overhead-infra_overhead", "name": "problem: infra overhead", "polarity": "problem", "problem": "problem: infra overhead", "provenance": {"detected_at": null, "evidence": {"cost_impact": 0.801, "cross_flavor": false, "flavors": ["claude"], "frequency": 2, "key": "problem:infra_overhead:infra_overhead", "locus": "infra_overhead", "polarity": "problem", "repos": ["markitect-main", "vergabe-teilnahme"], "score": 1.602, "sessions": ["claude:135002f9-98d2-4d1b-b8fb-543b20388782", "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74"], "signal_type": "infra_overhead", "title": "problem: infra overhead"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:infra_overhead:infra_overhead"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["markitect-main", "vergabe-teilnahme"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-infra_overhead-infra_overhead.json
+++ b/session_memory/catalog/sp-problem-infra_overhead-infra_overhead.json
@@ -0,0 +1,74 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": false,
  "id": "sp-problem-infra_overhead-infra_overhead",
  "name": "Infrastructure overhead: too much coordination plumbing",
  "polarity": "problem",
  "problem": "A large share of the session's tool calls are State Hub / task-management / schema-loading plumbing rather than touching the repo (corpus median 11.7%, up to 43% in the worst sessions; one session made 231 hub calls).",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 0.801,
      "cross_flavor": false,
      "flavors": [
        "claude"
      ],
      "frequency": 2,
      "key": "problem:infra_overhead:infra_overhead",
      "locus": "infra_overhead",
      "polarity": "problem",
      "repos": [
        "markitect-main",
        "vergabe-teilnahme"
      ],
      "score": 1.602,
      "sessions": [
        "claude:135002f9-98d2-4d1b-b8fb-543b20388782",
        "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74"
      ],
      "signal_type": "infra_overhead",
      "title": "problem: infra overhead"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "problem:infra_overhead:infra_overhead"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    }
  },
  "resolutions": [
    {
      "detail": "Update several task statuses together; emit fewer, coarser progress events.",
      "steps": [
        "Do a chunk of work",
        "Then sync statuses in one pass"
      ],
      "summary": "Batch hub writes \u2014 sync at checkpoints, not per event"
    },
    {
      "detail": "One scoped summary at session start beats many broad reads.",
      "steps": [],
      "summary": "Orient once with get_domain_summary, don't re-query repeatedly"
    },
    {
      "detail": "See STATE-WP-0058 \u2014 stops the repeated ToolSearch for hub tools.",
      "steps": [],
      "summary": "Front-load hub tool knowledge via the State Hub skill"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude"
    ],
    "repos": [
      "markitect-main",
      "vergabe-teilnahme"
    ]
  },
  "status": "provisional",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-problem-schema_thrash-schema_load.history.jsonl
+++ b/session_memory/catalog/sp-problem-schema_thrash-schema_load.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-problem-schema_thrash-schema_load", "name": "problem: schema thrash", "polarity": "problem", "problem": "problem: schema thrash", "provenance": {"detected_at": null, "evidence": {"cost_impact": 79.0, "cross_flavor": false, "flavors": ["claude"], "frequency": 8, "key": "problem:schema_thrash:schema_load", "locus": "schema_load", "polarity": "problem", "repos": ["activity-core", "citation-evidence", "flex-auth", "infospace-bench", "ops-bridge", "vergabe-teilnahme"], "score": 632.0, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:30dbad62-c042-41f2-80c1-5953a1100e7f", "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8", "claude:63fd4df2-5add-4748-af21-c1544825e006", "claude:8313f946-f008-4e98-9915-31950380e39e", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78", "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74", "claude:bbcf1c2b-14be-40e4-826b-4b2b49b9d212"], "signal_type": "schema_thrash", "title": "problem: schema thrash"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:schema_thrash:schema_load"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["activity-core", "citation-evidence", "flex-auth", "infospace-bench", "ops-bridge", "vergabe-teilnahme"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-schema_thrash-schema_load.json
+++ b/session_memory/catalog/sp-problem-schema_thrash-schema_load.json
@@ -0,0 +1,83 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": true,
  "id": "sp-problem-schema_thrash-schema_load",
  "name": "Schema thrash: repeated ToolSearch",
  "polarity": "problem",
  "problem": "ToolSearch fires repeatedly within a session (seen in 81% of sessions) because the State Hub MCP tools are deferred and their schemas get re-loaded each time they are needed \u2014 pure overhead with no work value.",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 79.0,
      "cross_flavor": false,
      "flavors": [
        "claude"
      ],
      "frequency": 8,
      "key": "problem:schema_thrash:schema_load",
      "locus": "schema_load",
      "polarity": "problem",
      "repos": [
        "activity-core",
        "citation-evidence",
        "flex-auth",
        "infospace-bench",
        "ops-bridge",
        "vergabe-teilnahme"
      ],
      "score": 632.0,
      "sessions": [
        "claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca",
        "claude:30dbad62-c042-41f2-80c1-5953a1100e7f",
        "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8",
        "claude:63fd4df2-5add-4748-af21-c1544825e006",
        "claude:8313f946-f008-4e98-9915-31950380e39e",
        "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78",
        "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74",
        "claude:bbcf1c2b-14be-40e4-826b-4b2b49b9d212"
      ],
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "problem:schema_thrash:schema_load"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    }
  },
  "resolutions": [
    {
      "detail": "Resolve them by name in one ToolSearch (select:...) rather than searching ad hoc.",
      "steps": [
        "List the hub tools the session needs",
        "Load them once at the start"
      ],
      "summary": "Load the tool schemas you'll need once, up front"
    },
    {
      "detail": "The skill carries the schemas so no per-use discovery is needed.",
      "steps": [],
      "summary": "Adopt the State Hub skill that front-loads common hub tool signatures"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude"
    ],
    "repos": [
      "activity-core",
      "citation-evidence",
      "flex-auth",
      "infospace-bench",
      "ops-bridge",
      "vergabe-teilnahme"
    ]
  },
  "status": "approved",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-problem-tool_thrash-tool-bash.history.jsonl
+++ b/session_memory/catalog/sp-problem-tool_thrash-tool-bash.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-problem-tool_thrash-tool-bash", "name": "problem: tool thrash", "polarity": "problem", "problem": "problem: tool thrash", "provenance": {"detected_at": null, "evidence": {"cost_impact": 1990.0, "cross_flavor": false, "flavors": ["claude"], "frequency": 11, "key": "problem:tool_thrash:tool:Bash", "locus": "tool:Bash", "polarity": "problem", "repos": ["activity-core", "artifact-store", "citation-evidence", "ihp-railiance-probe", "infospace-bench", "railiance-apps", "state-hub", "vergabe-teilnahme"], "score": 21890.0, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:2c0d14e1-d089-4076-bf35-b134737a261d", "claude:30dbad62-c042-41f2-80c1-5953a1100e7f", "claude:4307eff6-cd39-4189-be58-79a3acb69d6c", "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8", "claude:6e0d3d68-872b-4d93-bb09-0691e091314b", "claude:8313f946-f008-4e98-9915-31950380e39e", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78", "claude:a9483f07-c9dc-4f71-9fa0-831790ea965e", "claude:b1dfbcfa-91f9-4540-823a-26fcfaab7fc8", "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74"], "signal_type": "tool_thrash", "title": "problem: tool thrash"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:tool_thrash:tool:Bash"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["activity-core", "artifact-store", "citation-evidence", "ihp-railiance-probe", "infospace-bench", "railiance-apps", "state-hub", "vergabe-teilnahme"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-tool_thrash-tool-bash.json
+++ b/session_memory/catalog/sp-problem-tool_thrash-tool-bash.json
@@ -0,0 +1,95 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": true,
  "id": "sp-problem-tool_thrash-tool-bash",
  "name": "Tool thrash: one tool hammered",
  "polarity": "problem",
  "problem": "A single tool (often Bash or Edit) is invoked far more than any other in a session \u2014 a sign of trial-and-error churn or missing higher-level tooling.",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 1990.0,
      "cross_flavor": false,
      "flavors": [
        "claude"
      ],
      "frequency": 11,
      "key": "problem:tool_thrash:tool:Bash",
      "locus": "tool:Bash",
      "polarity": "problem",
      "repos": [
        "activity-core",
        "artifact-store",
        "citation-evidence",
        "ihp-railiance-probe",
        "infospace-bench",
        "railiance-apps",
        "state-hub",
        "vergabe-teilnahme"
      ],
      "score": 21890.0,
      "sessions": [
        "claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca",
        "claude:2c0d14e1-d089-4076-bf35-b134737a261d",
        "claude:30dbad62-c042-41f2-80c1-5953a1100e7f",
        "claude:4307eff6-cd39-4189-be58-79a3acb69d6c",
        "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8",
        "claude:6e0d3d68-872b-4d93-bb09-0691e091314b",
        "claude:8313f946-f008-4e98-9915-31950380e39e",
        "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78",
        "claude:a9483f07-c9dc-4f71-9fa0-831790ea965e",
        "claude:b1dfbcfa-91f9-4540-823a-26fcfaab7fc8",
        "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74"
      ],
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "problem:tool_thrash:tool:Bash"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    }
  },
  "resolutions": [
    {
      "detail": "Compose a single command/script; run independent calls in parallel.",
      "steps": [
        "Group the steps",
        "Run them as one block"
      ],
      "summary": "Batch related shell work into one script, not many small Bash calls"
    },
    {
      "detail": "Read the region, then one substantive Edit beats many tiny ones.",
      "steps": [],
      "summary": "Make fewer, larger edits with full context"
    },
    {
      "detail": "If the same invocation recurs, wrap it once.",
      "steps": [],
      "summary": "Factor a repeated command pattern into a helper"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude"
    ],
    "repos": [
      "activity-core",
      "artifact-store",
      "citation-evidence",
      "ihp-railiance-probe",
      "infospace-bench",
      "railiance-apps",
      "state-hub",
      "vergabe-teilnahme"
    ]
  },
  "status": "approved",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
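Review note: the `score` values in these catalog entries are consistent with `cost_impact * frequency`, with an extra factor of 1.5 on the one cross-flavor record (the `clean_pass` history line that follows). This is an inference from the numbers in the diff, not the confirmed scoring code, which is not part of this section. The sketch below checks that arithmetic. `inferred_score` and `CROSS_FLAVOR_FACTOR` are hypothetical names.

```python
import math

# (cost_impact, frequency, cross_flavor, score) copied from the evidence blocks.
evidence = {
    "budget_overrun": (10.667, 3, False, 32.001),
    "infra_overhead": (0.801, 2, False, 1.602),
    "schema_thrash": (79.0, 8, False, 632.0),
    "tool_thrash": (1990.0, 11, False, 21890.0),
    "clean_pass": (17.0, 17, True, 433.5),
}

CROSS_FLAVOR_FACTOR = 1.5  # assumption inferred from the clean_pass record


def inferred_score(cost_impact: float, frequency: int, cross_flavor: bool) -> float:
    """Hypothetical reconstruction of the score; not the project's own formula."""
    base = cost_impact * frequency
    return base * CROSS_FLAVOR_FACTOR if cross_flavor else base


# True where the reconstructed score matches the recorded one.
matches = {
    key: math.isclose(inferred_score(c, f, x), s, rel_tol=1e-9)
    for key, (c, f, x, s) in evidence.items()
}
print(matches)
```

With only one cross-flavor record available, the 1.5 factor is a single data point and should be verified against the scoring module before anyone relies on it.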
--- a/session_memory/catalog/sp-success-clean_pass-outcome.history.jsonl
+++ b/session_memory/catalog/sp-success-clean_pass-outcome.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-success-clean_pass-outcome", "name": "cross-flavor success: clean pass", "polarity": "success", "problem": "cross-flavor success: clean pass", "provenance": {"detected_at": null, "evidence": {"cost_impact": 17.0, "cross_flavor": true, "flavors": ["claude", "grok"], "frequency": 17, "key": "success:clean_pass:outcome", "locus": "outcome", "polarity": "success", "repos": ["activity-core", "agentic-resources", "artifact-store", "can-you-assist", "citation-evidence", "infospace-bench", "issue-facade", "ops-bridge", "railiance-apps", "state-hub", "the-custodian", "vergabe-teilnahme"], "score": 433.5, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:16bdbec4-b018-4902-9fb5-336f8f3d61c8", "claude:2c0d14e1-d089-4076-bf35-b134737a261d", "claude:30dbad62-c042-41f2-80c1-5953a1100e7f", "claude:4307eff6-cd39-4189-be58-79a3acb69d6c", "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8", "claude:631de76e-fdee-43b5-b091-7b7675467ad1", "claude:63fd4df2-5add-4748-af21-c1544825e006", "claude:6e0d3d68-872b-4d93-bb09-0691e091314b", "claude:8313f946-f008-4e98-9915-31950380e39e", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78", "claude:a9483f07-c9dc-4f71-9fa0-831790ea965e", "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74", "claude:eb837dd1-5b8e-472e-b9e1-4537b10e03e6", "claude:ee9e84f2-bc35-4eb5-a7ad-aaec5f31d965", "claude:f1b25697-0e5f-45f0-81d1-af0f1762c438", "grok:019e6122-00c0-79f3-b4e5-9c70b77c015d"], "signal_type": "clean_pass", "title": "cross-flavor success: clean pass"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "success:clean_pass:outcome"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}, "grok": {"note": "TODO: refine rendering", "target": "instructions"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude", "grok"], "repos": ["activity-core", "agentic-resources", "artifact-store", "can-you-assist", "citation-evidence", "infospace-bench", "issue-facade", "ops-bridge", "railiance-apps", "state-hub", "the-custodian", "vergabe-teilnahme"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-success-clean_pass-outcome.json
+++ b/session_memory/catalog/sp-success-clean_pass-outcome.json
@@ -0,0 +1,110 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": true,
  "id": "sp-success-clean_pass-outcome",
  "name": "Clean pass: tests green, no retries",
  "polarity": "success",
  "problem": "The target session shape: ends in success, runs the test suite, with no errors and no retries \u2014 resolves cheaply and reliably. Seen across many sessions and both Claude and Grok (the highest-value pattern to reinforce).",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 17.0,
      "cross_flavor": true,
      "flavors": [
        "claude",
        "grok"
      ],
      "frequency": 17,
      "key": "success:clean_pass:outcome",
      "locus": "outcome",
      "polarity": "success",
      "repos": [
        "activity-core",
        "agentic-resources",
        "artifact-store",
        "can-you-assist",
        "citation-evidence",
        "infospace-bench",
        "issue-facade",
        "ops-bridge",
        "railiance-apps",
        "state-hub",
        "the-custodian",
        "vergabe-teilnahme"
      ],
      "score": 433.5,
      "sessions": [
        "claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca",
        "claude:16bdbec4-b018-4902-9fb5-336f8f3d61c8",
        "claude:2c0d14e1-d089-4076-bf35-b134737a261d",
        "claude:30dbad62-c042-41f2-80c1-5953a1100e7f",
        "claude:4307eff6-cd39-4189-be58-79a3acb69d6c",
        "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8",
        "claude:631de76e-fdee-43b5-b091-7b7675467ad1",
        "claude:63fd4df2-5add-4748-af21-c1544825e006",
        "claude:6e0d3d68-872b-4d93-bb09-0691e091314b",
        "claude:8313f946-f008-4e98-9915-31950380e39e",
        "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78",
        "claude:a9483f07-c9dc-4f71-9fa0-831790ea965e",
        "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74",
        "claude:eb837dd1-5b8e-472e-b9e1-4537b10e03e6",
        "claude:ee9e84f2-bc35-4eb5-a7ad-aaec5f31d965",
        "claude:f1b25697-0e5f-45f0-81d1-af0f1762c438",
        "grok:019e6122-00c0-79f3-b4e5-9c70b77c015d"
      ],
      "signal_type": "clean_pass",
      "title": "cross-flavor success: clean pass"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "success:clean_pass:outcome"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    },
    "grok": {
      "target": "instructions"
    }
  },
  "resolutions": [
    {
      "detail": "A passing suite is the cheapest proof the change works.",
      "steps": [
        "Make the change",
        "Run the suite",
        "Only then report done"
      ],
      "summary": "Run the test suite before declaring done; let green gate completion"
    },
    {
      "detail": "Small verified steps beat large unverified ones that bounce.",
      "steps": [],
      "summary": "Work incrementally and verify as you go to avoid retries"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude",
      "grok"
    ],
    "repos": [
      "activity-core",
      "agentic-resources",
      "artifact-store",
      "can-you-assist",
      "citation-evidence",
      "infospace-bench",
      "issue-facade",
      "ops-bridge",
      "railiance-apps",
      "state-hub",
      "the-custodian",
      "vergabe-teilnahme"
    ]
  },
  "status": "approved",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
--- a/session_memory/config.toml
+++ b/session_memory/config.toml
@@ -0,0 +1,83 @@
 # Coding Session Memory — configuration (design §5.1, §8).
 # Paths support ~ expansion. Edit caps to taste; see docs/DESIGN-session-memory.md.
 [store]
 # Local store lives under the repo by default (gitignored).
 db_path   = "session_memory/.store/mem.db"
 blob_dir  = "session_memory/.store/blobs"
 cursor    = "session_memory/.store/cursors.json"
 [retention]
 raw_soft_cap_bytes  = 4294967296   # 4 GiB — begin evicting analyzed sessions above this
 raw_hard_cap_bytes  = 6442450944   # 6 GiB — absolute Tier 1 ceiling
 raw_max_age_days    = 45           # backstop: analyzed raw older than this is evictable
 distilled_cap_bytes = 1073741824   # 1 GiB — Tier 2 ceiling (alert, never auto-drop)
 cadence             = "daily"      # sweep trigger: daily | weekly | on-hook
 [sources.claude]
 enabled = true
 root    = "~/.claude/projects"
 # glob, relative to root; covers sessions and agent-* sidechains
 glob    = "*/*.jsonl"
 # Codex / Grok adapters added in Phase 1 (AGENTIC-WP-0003).
 [sources.codex]
 enabled = true
 root    = "~/.codex/sessions"
 glob    = "*/*/*/rollout-*.jsonl"
 [sources.grok]
 enabled = true
 root    = "~/.grok/sessions"
 glob    = "*/*/chat_history.jsonl"
 # Detect phase (AGENTIC-WP-0005): quality filter — drop non-coding/trivial sessions
 # before signals form, so health-checks don't mint false-positive patterns.
 [detect.quality]
 min_events      = 20   # below this many events, not a real coding session
 min_substantive = 3    # require >= this many substantive (edit/read/shell) tool calls
 min_prompt_len  = 25   # first prompt shorter than this is treated as trivial
 # Measure phase (AGENTIC-WP-0009): persisted baseline/trend of fleet metrics.
 [measure]
 baselines = "session_memory/measure/baselines.jsonl"  # timestamped metric snapshots (committed)
 # Weekly retro (AGENTIC-WP-0010): windowed top-3-per-repo report, published to the
 # hub as the coding_retro read model that activity-core's weekly schedule consumes.
 [retro]
 window_days = 7
 report_json = "session_memory/retro/last_retro.json"  # latest report (committed)
 report_md   = "session_memory/retro/last_retro.md"    # human-readable mirror
 hub_url     = "http://127.0.0.1:8000"                 # for --publish (best-effort)
 # Distribute phase (AGENTIC-WP-0007): where per-flavor proposals + the active
 # registry are written. Proposals are HITL — reviewed, never auto-applied.
 [distribute]
 proposals_dir   = "session_memory/proposals"                  # reviewable proposals (gitignored, regenerated)
 active_registry = "session_memory/distribute/active_patterns.json"  # what's proposed/active where (committed)
 # Curate phase (AGENTIC-WP-0004): catalog location + promotion evidence bar.
 [curate]
 catalog_dir    = "session_memory/catalog"               # files-first Pattern Catalog (committed)
 review_log     = "session_memory/.store/reviews.jsonl"  # remembered decisions (gitignored)
 decision_queue = "session_memory/.store/decisions.queue.jsonl"  # hub decisions pending sync
 state_hub_workstream_id = "b3703684-f60e-42f3-b03e-dabe3e8ce3f4"  # AGENTIC-WP-0004
 # Evidence bar (OQ5): floors to promote at all, and stricter floors to be
 # distribution-eligible (status=approved, distribution_ready=true).
 [curate.gate]
 min_frequency             = 2      # >= this many supporting signals to promote
 min_sessions              = 2      # >= this many distinct sessions
 min_cost_impact           = 0.0
 dist_require_cross_flavor = false  # require cross-flavor evidence to distribute
 dist_min_frequency        = 3
 dist_min_cost_impact      = 0.0
 # cwd basename -> domain slug. Used to tag sessions with their Custodian domain.
 [repo_domain_map]
 agentic-resources = "helix_forge"
 the-custodian     = "custodian"
 state-hub         = "custodian"
 ops-bridge        = "custodian"
 net-kingdom       = "netkingdom"
 can-you-assist    = "coulomb_social"
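The `[curate.gate]` floors above can be read as a two-stage check: a pattern must clear the promotion floors, then the stricter distribution floors. The following is a minimal sketch of that reading, not the project's actual gate code. The `evaluate` function, the `GATE` dict, and the session ids are hypothetical; only the threshold names and values come from the config, and the strong example reuses the frequency, cost impact, and cross-flavor flag of the `sp-success-clean_pass-outcome` catalog entry.

```python
# Hypothetical sketch of the evidence bar; thresholds copied from [curate.gate].
GATE = {
    "min_frequency": 2,
    "min_sessions": 2,
    "min_cost_impact": 0.0,
    "dist_require_cross_flavor": False,
    "dist_min_frequency": 3,
    "dist_min_cost_impact": 0.0,
}


def evaluate(evidence, gate=GATE):
    """Return promotion and distribution verdicts for one evidence record."""
    promote = (
        evidence["frequency"] >= gate["min_frequency"]
        and len(set(evidence["sessions"])) >= gate["min_sessions"]
        and evidence["cost_impact"] >= gate["min_cost_impact"]
    )
    distribute = (
        promote
        and evidence["frequency"] >= gate["dist_min_frequency"]
        and evidence["cost_impact"] >= gate["dist_min_cost_impact"]
        and (evidence["cross_flavor"] or not gate["dist_require_cross_flavor"])
    )
    return {"promote": promote, "distribution_ready": distribute}


# Placeholder session ids; counts mirror the clean-pass catalog entry.
strong = {
    "frequency": 17,
    "sessions": [f"claude:s{i}" for i in range(16)] + ["grok:s0"],
    "cost_impact": 17.0,
    "cross_flavor": True,
}
# Just enough evidence to promote, not enough to distribute.
weak = {
    "frequency": 2,
    "sessions": ["claude:x", "grok:y"],
    "cost_impact": 0.0,
    "cross_flavor": True,
}

strong_verdict = evaluate(strong)
weak_verdict = evaluate(weak)
```

Under these assumptions the weak record clears `min_frequency` but falls short of `dist_min_frequency`, so it would be promoted without becoming distribution-ready.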
--- a/session_memory/core/__init__.py
+++ b/session_memory/core/__init__.py
@@ -0,0 +1 @@
 """Flavor-agnostic core: schema, store, cursor, digest, retention."""
--- a/session_memory/core/cursor.py
+++ b/session_memory/core/cursor.py
@@ -0,0 +1,49 @@
 """Per-source ingest cursors (design §6; T06).
 Tracks ``(path -> size, mtime)`` so sweeps re-ingest only changed/grown files.
 Persisted as a small JSON sidecar. Ingest itself is idempotent on
 ``(session_uid, seq)`` in the store, so the cursor is an optimization, not a
 correctness requirement — a lost cursor just means a full (still-idempotent)
 re-scan.
 """
 from __future__ import annotations
 import json
 import os
 from typing import Optional
 class Cursors:
    def __init__(self, path: str):
        self.path = path
        self._data: dict[str, dict] = {}
        if os.path.exists(path):
            try:
                with open(path, "r", encoding="utf-8") as f:
                    self._data = json.load(f)
            except (OSError, ValueError):
                self._data = {}
    def is_changed(self, file_path: str) -> bool:
        """True if the file is new or has changed size/mtime since last seen."""
        try:
            stat = os.stat(file_path)
        except OSError:
            return False
        prev = self._data.get(file_path)
        return prev is None or prev.get("size") != stat.st_size or prev.get("mtime") != stat.st_mtime
    def mark(self, file_path: str) -> None:
        try:
            stat = os.stat(file_path)
        except OSError:
            return
        self._data[file_path] = {"size": stat.st_size, "mtime": stat.st_mtime}
    def save(self) -> None:
        os.makedirs(os.path.dirname(self.path) or ".", exist_ok=True)
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self._data, f)
        os.replace(tmp, self.path)
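The cursor in `cursor.py` above skips files whose `(size, mtime)` pair is unchanged since the last sweep. The sketch below shows that idea in isolation, without importing the module. `snapshot`, `sweep_once`, and `demo` are hypothetical names, and a plain dict stands in for the persisted JSON sidecar.

```python
import os
import tempfile


def snapshot(path):
    """Return the size/mtime pair the cursor compares, or None if unreadable."""
    try:
        st = os.stat(path)
    except OSError:
        return None
    return {"size": st.st_size, "mtime": st.st_mtime}


def sweep_once(paths, seen, ingest):
    """Ingest only files whose snapshot differs from the remembered one."""
    ingested = []
    for p in paths:
        snap = snapshot(p)
        if snap is None or seen.get(p) == snap:
            continue  # missing or unchanged: nothing to do
        ingest(p)
        seen[p] = snap  # equivalent of Cursors.mark()
        ingested.append(p)
    return ingested


def demo():
    """Sweep a new file, sweep it again unchanged, then sweep after it grows."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "session.jsonl")
        with open(path, "w", encoding="utf-8") as f:
            f.write('{"seq": 0}\n')
        seen = {}
        first = sweep_once([path], seen, ingest=lambda p: None)
        second = sweep_once([path], seen, ingest=lambda p: None)
        with open(path, "a", encoding="utf-8") as f:
            f.write('{"seq": 1}\n')  # file grows, so its size changes
        third = sweep_once([path], seen, ingest=lambda p: None)
        return len(first), len(second), len(third)
```

Because ingest is described as idempotent on `(session_uid, seq)`, dropping `seen` entirely would only cost a full re-scan, which is why the cursor can be treated as an optimization.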
--- a/session_memory/core/digest.py
+++ b/session_memory/core/digest.py
@@ -0,0 +1,286 @@
 """Session digest — Tier 1 -> Tier 2 promotion (design §3, §4; T04).
 Compresses a session's events into a small, durable digest: outcome heuristic,
 cost totals, tool histogram, and counts of error/retry/test/edit/human markers,
 plus a few key snippets. Writing the digest sets ``analyzed_at``, which is what
 makes a session evictable under budget-based retention (design §5).
 Signal extraction beyond this digest is intentionally out of scope here — it
 belongs to the Detect phase (PRD §6.2).
 """
 from __future__ import annotations
 import collections
 import json
 import re
 from typing import Any
 from .schema import Session, SessionEvent
 # Substrings in tool_result bodies / summaries that suggest a failure.
 _FAIL_HINTS = ("error", "failed", "exception", "traceback", "fatal", "non-zero")
 # Substrings suggesting a clean test pass.
 _PASS_HINTS = ("passed", "0 failed", "ok", "success")
 # A line that is numbered source content from a Read result (`cat -n` style),
 # e.g. "229\t    raise InfospaceError(" — code text, never a runtime error.
 _NUMBERED_LINE_RE = re.compile(r"^\s*\d+\t")
 # Top-level keys that mark a JSON tool-result as an actual error (vs. success).
 _JSON_ERROR_KEYS = ("error", "errors", "detail")
 # Normalization patterns so the same error collapses to one fingerprint
 # regardless of paths / ids / counts (WP-0006 T01).
 _UUID_RE = re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I)
 _HEXADDR_RE = re.compile(r"\b0x[0-9a-f]+\b", re.I)
 _PATH_RE = re.compile(r"(?:/[\w.\-]+)+/?|[A-Za-z]:\\[\w.\\\-]+")
 _NUM_RE = re.compile(r"\b\d+\b")
 _WS_RE = re.compile(r"\s+")
 _ERR_SAMPLE_MAX = 200
 _ERR_FP_MAX = 160
 def infer_outcome(events: list[SessionEvent], blobs: dict[str, str] | None = None) -> str:
    """Heuristic outcome label across flavors (design OQ2).
    - ``abandoned`` if the session has no assistant output at all.
    - ``fail`` if the last substantive signal is an error / failing test.
    - ``success`` if it ends on assistant output or a passing test.
    - ``unknown`` otherwise.
    """
    blobs = blobs or {}
    assistant = [e for e in events if e.kind == "assistant_msg"]
    if not assistant:
        return "abandoned"
    # Look at error and test signals; weight the latest ones.
    last_fail = _last_index(events, lambda e: e.kind == "error")
    last_test = _last_index(events, lambda e: e.kind == "test_run")
    last_completion = _last_index(events, lambda e: e.kind in ("completion", "assistant_msg"))
    test_passed = None
    if last_test is not None:
        # inspect the nearest following tool_result body for pass/fail hints
        body = _nearby_result_body(events, last_test, blobs)
        if body:
            low = body.lower()
            if any(h in low for h in _FAIL_HINTS):
                test_passed = False
            elif any(h in low for h in _PASS_HINTS):
                test_passed = True
    if test_passed is False and (last_test or 0) >= (last_completion or 0):
        return "fail"
    if last_fail is not None and last_completion is not None and last_fail > last_completion:
        return "fail"
    if test_passed is True:
        return "success"
    if last_completion is not None:
        return "success"
    return "unknown"
 def build_digest(session: Session, events: list[SessionEvent],
                 blobs: dict[str, str] | None = None) -> dict[str, Any]:
    """Produce the compact Tier 2 digest dict for a session."""
    blobs = blobs or {}
    kind_counts = collections.Counter(e.kind for e in events)
    tool_hist = collections.Counter(e.tool for e in events if e.tool)
    retries = kind_counts.get("retry", 0)
    outcome = infer_outcome(events, blobs)
    return {
        "session_uid": session.session_uid,
        "flavor": session.flavor,
        "repo": session.repo,
        "domain": session.domain,
        "model": session.model,
        "started_at": session.started_at,
        "ended_at": session.ended_at,
        "outcome": outcome,
        "cost": {
            "input_tokens": session.cost.input_tokens,
            "output_tokens": session.cost.output_tokens,
            "cache_tokens": session.cost.cache_tokens,
            "wall_clock_s": session.cost.wall_clock_s,
            "turns": session.cost.turns,
            "retries": retries,
        },
        "event_count": len(events),
        "kind_counts": dict(kind_counts),
        "tool_histogram": dict(tool_hist),
        "markers": {
            "errors": kind_counts.get("error", 0),
            "retries": retries,
            "test_runs": kind_counts.get("test_run", 0),
            "edits": kind_counts.get("edit", 0),
            "human_interventions": kind_counts.get("human_intervention", 0),
        },
        "first_prompt": _first_prompt(events, blobs),
        "last_assistant": _last_assistant(events, blobs),
        "error_snippets": _error_snippets(events, blobs),
        "schema_version": session.schema_version,
    }
 def analyze(store, session_uid: str) -> dict[str, Any]:
    """Read a session from the store, write its digest, return the digest."""
    session = store.get_session(session_uid)
    if session is None:
        raise KeyError(session_uid)
    events = store.get_events(session_uid)
    blobs = {e.payload_ref: _read_blob(store, e.payload_ref)
             for e in events if e.payload_ref}
    digest = build_digest(session, events, blobs)
    store.write_digest(session_uid, digest)
    return digest
 # ---- helpers ---------------------------------------------------------------
 def _last_index(events, pred):
    idx = None
    for i, e in enumerate(events):
        if pred(e):
            idx = i
    return idx
 def _nearby_result_body(events, idx, blobs):
    for e in events[idx + 1: idx + 4]:
        if e.kind == "tool_result" and e.payload_ref in blobs:
            return blobs[e.payload_ref]
    return None
 def _first_prompt(events, blobs):
    for e in events:
        if e.kind == "user_msg":
            return (blobs.get(e.payload_ref) or e.summary or "")[:280]
    return None
 def _last_assistant(events, blobs):
    for e in reversed(events):
        if e.kind == "assistant_msg":
            return (blobs.get(e.payload_ref) or e.summary or "")[:280]
    return None
 def _error_line(text: str) -> str:
    """Pick the most error-like line from a body.
    Prefers the *last* line matching a fail hint — in a Python traceback the
    actual exception is the final line, while the bare ``Traceback (most recent
    call last):`` header is just noise and is skipped.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    matches = [ln for ln in lines
               if any(h in ln.lower() for h in _FAIL_HINTS)
               and not ln.lower().startswith("traceback")]
    if matches:
        return matches[-1]
    # fall back to any fail-hint line (e.g. only the traceback header), else first
    any_hint = [ln for ln in lines if any(h in ln.lower() for h in _FAIL_HINTS)]
    return any_hint[-1] if any_hint else (lines[0] if lines else "")
 def _error_fingerprint(text: str) -> str:
    """Stable, content-addressable key for an error, paths/ids/numbers removed."""
    s = _error_line(text).lower()
    s = _UUID_RE.sub("<uuid>", s)
    s = _HEXADDR_RE.sub("<addr>", s)
    s = _PATH_RE.sub("<path>", s)
    s = _NUM_RE.sub("<n>", s)
    return _WS_RE.sub(" ", s).strip()[:_ERR_FP_MAX]
 def _error_body(event: SessionEvent, blobs: dict) -> str:
    """Best available text for a failed event."""
    if event.payload_ref and event.payload_ref in blobs:
        return blobs[event.payload_ref]
    return event.summary or ""
 def _looks_like_file_read(body: str) -> bool:
    """True if the body is mostly numbered source lines (a Read result), not an error."""
    lines = [ln for ln in body.splitlines() if ln.strip()]
    if not lines:
        return False
    numbered = sum(1 for ln in lines if _NUMBERED_LINE_RE.match(ln))
    return numbered >= max(3, len(lines) // 2)
 def _json_verdict(body: str):
    """Classify a JSON tool-result body: 'error', 'success', or None (not JSON).
    Hub MCP successes look like ``{"result": "..."}`` and mention 'error' deep
    inside summaries but are not failures ('success'). A payload with a top-level
    error key (``{"detail": ...}`` / ``{"error": ...}``) is 'error'. Non-JSON text
    returns None so the plain fail-hint heuristic still applies.
    """
    s = body.strip()
    if not s or s[0] not in "{[":
        return None
    try:
        obj = json.loads(s)
    except (ValueError, TypeError):
        return None
    if isinstance(obj, dict) and any(k in obj for k in _JSON_ERROR_KEYS):
        return "error"
    return "success"
 def _is_failed(event: SessionEvent, blobs: dict) -> bool:
    if event.kind == "error":
        return True
    if event.kind == "tool_result":
        body = _error_body(event, blobs)
        if not body.strip():
            return False
        if _looks_like_file_read(body):
            return False
        verdict = _json_verdict(body)
        if verdict is not None:
            return verdict == "error"
        return any(h in body.lower() for h in _FAIL_HINTS)
    return False
 def _error_snippets(events: list[SessionEvent], blobs: dict) -> list[dict]:
    """Collapse a session's failures into deduped, normalized error fingerprints.
    Durable in Tier 2 (the raw blobs may be evicted): each entry is
    ``{fingerprint, sample, count, tool}`` with same-fingerprint occurrences
    counted. Ordered by frequency (then first appearance) for stable output.
    """
    agg: dict[str, dict] = {}
    order: list[str] = []
    for e in events:
        if not _is_failed(e, blobs):
            continue
        body = _error_body(e, blobs)
        if not body.strip():
            continue
        fp = _error_fingerprint(body)
        if not fp:
            continue
        if fp not in agg:
            agg[fp] = {"fingerprint": fp, "sample": _error_line(body)[:_ERR_SAMPLE_MAX],
                       "count": 0, "tool": e.tool}
            order.append(fp)
        agg[fp]["count"] += 1
    snippets = [agg[fp] for fp in order]
    snippets.sort(key=lambda s: (-s["count"], order.index(s["fingerprint"])))
    return snippets
 def _read_blob(store, ref):
    row = store.db.execute("SELECT path FROM blobs WHERE ref=?", (ref,)).fetchone()
    if not row:
        return ""
    try:
        with open(row["path"], "r", encoding="utf-8") as f:
            return f.read()
    except OSError:
        return ""
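The fingerprint normalization in `digest.py` above is easiest to see on two error lines that differ only in path, spacing, and counts. This sketch copies the regular expressions from the diff and applies them in the same order. The `normalize` helper and the two sample error strings are made up for illustration, and it skips the `_error_line` selection and the length cap.

```python
import re

# Patterns copied from digest.py; applied in the same order as _error_fingerprint.
_UUID_RE = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I
)
_HEXADDR_RE = re.compile(r"\b0x[0-9a-f]+\b", re.I)
_PATH_RE = re.compile(r"(?:/[\w.\-]+)+/?|[A-Za-z]:\\[\w.\\\-]+")
_NUM_RE = re.compile(r"\b\d+\b")
_WS_RE = re.compile(r"\s+")


def normalize(line):
    """Replace volatile tokens with placeholders so equal errors compare equal."""
    s = line.lower()
    s = _UUID_RE.sub("<uuid>", s)
    s = _HEXADDR_RE.sub("<addr>", s)
    s = _PATH_RE.sub("<path>", s)
    s = _NUM_RE.sub("<n>", s)
    return _WS_RE.sub(" ", s).strip()


# Hypothetical error lines: same failure, different path, spacing, and attempt count.
fp_a = normalize("FileNotFoundError: /home/alice/repo/data.csv not found (attempt 3)")
fp_b = normalize("FileNotFoundError: /srv/build/42/data.csv   not found (attempt 17)")
```

Both lines reduce to the same string, so a per-session aggregation keyed on it would count them as one error with `count == 2`.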
--- a/session_memory/core/retention.py
+++ b/session_memory/core/retention.py
@@ -0,0 +1,144 @@
 """Budget-based retention sweep (design §5; T05).
 Eviction is tied to the two conditions the design names — a session is dropped
 from Tier 1 once it has been *analyzed* (its digest is in Tier 2) **and** space is
 needed, with a max-age backstop. The invariant: raw bytes are never dropped
 before the Tier 2 digest exists, except the explicitly-reported hard-cap overflow
 path.
 Order of passes per sweep:
  1. backstop  — evict analyzed sessions older than ``raw_max_age_days``
  2. budget    — while over ``raw_soft_cap_bytes``, evict oldest-analyzed first
  3. overflow  — if still over ``raw_hard_cap_bytes`` and only un-analyzed bulk
                 remains: analyze-now, retry budget; last resort evict oldest
                 un-analyzed and emit a reported ``data_loss`` event.
 """
 from __future__ import annotations
 from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from typing import Callable, Optional
 from .schema import Session
 @dataclass
 class RetentionConfig:
    raw_soft_cap_bytes: int = 4 * 1024**3   # 4 GiB
    raw_hard_cap_bytes: int = 6 * 1024**3   # 6 GiB
    raw_max_age_days: int = 45
    distilled_cap_bytes: int = 1 * 1024**3  # 1 GiB (alert only, never auto-drop)
 @dataclass
 class EvictionReport:
    backstop_evicted: list[str] = field(default_factory=list)
    budget_evicted: list[str] = field(default_factory=list)
    overflow_analyzed: list[str] = field(default_factory=list)
    overflow_data_loss: list[str] = field(default_factory=list)
    bytes_freed: int = 0
    final_usage_bytes: int = 0
    over_hard_cap: bool = False
    tier2_over_cap: bool = False
    warnings: list[str] = field(default_factory=list)
    @property
    def lost_data(self) -> bool:
        return bool(self.overflow_data_loss)
 def _parse_ts(ts: Optional[str]) -> Optional[datetime]:
    if not ts:
        return None
    try:
        return datetime.fromisoformat(ts.replace("Z", "+00:00"))
    except ValueError:
        return None
 def _age_days(s: Session, now: datetime) -> Optional[float]:
    ref = _parse_ts(s.ended_at) or _parse_ts(s.started_at) or _parse_ts(s.ingested_at)
    if ref is None:
        return None
    if ref.tzinfo is None:
        ref = ref.replace(tzinfo=timezone.utc)
    return (now - ref).total_seconds() / 86400.0
 def _sort_key(s: Session) -> str:
    # oldest-analyzed-first; fall back through timestamps
    return s.analyzed_at or s.ended_at or s.ingested_at or ""
 def sweep(store, config: RetentionConfig, *,
          analyze_fn: Optional[Callable[[object, str], object]] = None,
          now: Optional[datetime] = None) -> EvictionReport:
    """Run one retention sweep against ``store``. Returns an EvictionReport.
    ``analyze_fn(store, session_uid)`` is used by the overflow path to make
    un-analyzed sessions evictable; pass ``digest.analyze``.
    """
    now = now or datetime.now(timezone.utc)
    report = EvictionReport()
    def live_sessions() -> list[Session]:
        return [s for s in store.list_sessions() if s.evicted_at is None]
    # 1. backstop pass — analyzed + older than max age
    for s in sorted(live_sessions(), key=_sort_key):
        age = _age_days(s, now)
        if s.is_evictable and age is not None and age > config.raw_max_age_days:
            report.bytes_freed += store.evict_raw(s.session_uid)
            report.backstop_evicted.append(s.session_uid)
    # 2. budget pass — evict oldest analyzed while over soft cap
    while store.tier1_usage_bytes() > config.raw_soft_cap_bytes:
        candidates = [s for s in live_sessions() if s.is_evictable]
        if not candidates:
            break  # will not destroy un-analyzed data for space
        victim = min(candidates, key=_sort_key)
        report.bytes_freed += store.evict_raw(victim.session_uid)
        report.budget_evicted.append(victim.session_uid)
    # 3. overflow path — only if still over HARD cap with un-analyzed bulk left
    if store.tier1_usage_bytes() > config.raw_hard_cap_bytes:
        # 3a. try to analyze now so those sessions become evictable
        if analyze_fn is not None:
            for s in sorted(live_sessions(), key=_sort_key):
                if not s.is_evictable:
                    try:
                        analyze_fn(store, s.session_uid)
                        report.overflow_analyzed.append(s.session_uid)
                    except Exception as e:  # analysis may fail; keep going
                        report.warnings.append(f"analyze failed for {s.session_uid}: {e}")
            # retry budget pass on the freshly-analyzed sessions
            while store.tier1_usage_bytes() > config.raw_soft_cap_bytes:
                candidates = [s for s in live_sessions() if s.is_evictable]
                if not candidates:
                    break
                victim = min(candidates, key=_sort_key)
                report.bytes_freed += store.evict_raw(victim.session_uid)
                report.budget_evicted.append(victim.session_uid)
        # 3b. last resort — evict oldest un-analyzed, REPORTED as data loss
        while store.tier1_usage_bytes() > config.raw_hard_cap_bytes:
            remaining = [s for s in live_sessions() if not s.is_evictable]
            if not remaining:
                break
            victim = min(remaining, key=_sort_key)
            report.bytes_freed += store.evict_raw(victim.session_uid)
            report.overflow_data_loss.append(victim.session_uid)
            report.warnings.append(
                f"data_loss: evicted un-analyzed {victim.session_uid} to stay under hard cap"
            )
    usage = store.tier1_usage_bytes()
    report.final_usage_bytes = usage
    report.over_hard_cap = usage > config.raw_hard_cap_bytes
    report.tier2_over_cap = store.tier2_usage_bytes() > config.distilled_cap_bytes
    if report.tier2_over_cap:
        report.warnings.append(
            "tier2 distilled store over cap — flag for curation review (do not auto-drop)"
        )
    return report
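The budget pass in `retention.py` above evicts the oldest analyzed session until usage drops under the soft cap, and stops rather than touch un-analyzed data. The sketch below models only that pass with an in-memory list. `FakeSession` and `budget_pass` are hypothetical stand-ins for the real `Session` and `Store`, and the sizes are arbitrary units.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeSession:
    """Hypothetical stand-in for a Tier 1 session with a raw size."""
    uid: str
    size: int
    analyzed_at: Optional[str] = None  # ISO-8601 string; None means not analyzed
    evicted: bool = False


def budget_pass(sessions, soft_cap):
    """Evict oldest-analyzed sessions while usage exceeds soft_cap."""
    evicted = []

    def usage():
        return sum(s.size for s in sessions if not s.evicted)

    while usage() > soft_cap:
        candidates = [s for s in sessions if s.analyzed_at and not s.evicted]
        if not candidates:
            break  # only un-analyzed data is left; the budget pass never drops it
        victim = min(candidates, key=lambda s: s.analyzed_at)
        victim.evicted = True
        evicted.append(victim.uid)
    return evicted, usage()


sessions = [
    FakeSession("claude:a", 40, analyzed_at="2026-06-01T00:00:00Z"),
    FakeSession("claude:b", 40, analyzed_at="2026-06-03T00:00:00Z"),
    FakeSession("grok:c", 40),  # not analyzed yet
]
evicted, remaining = budget_pass(sessions, soft_cap=30)
```

Here usage stays above the cap after both analyzed sessions are gone. In the real sweep, that is the situation the hard-cap overflow path handles by analyzing first and reporting any forced eviction as data loss.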
--- a/session_memory/core/schema.py
+++ b/session_memory/core/schema.py
@@ -0,0 +1,156 @@
 """Normalized session schema (Tier 1) — design doc §4.
 Two record kinds, ``Session`` and ``SessionEvent``, plus the small enums every
 adapter targets. Field names here are the stable contract; per-flavor quirks are
 absorbed inside each adapter (see design §4.3 native -> kind mapping).
 """
 from __future__ import annotations
 import json
 from dataclasses import asdict, dataclass, field, fields
 from typing import Any, Optional
 SCHEMA_VERSION = 2  # v2: digest carries error_snippets (WP-0006 T01)
 # Supported agent flavors. ``session_uid`` is always "<flavor>:<native id>".
 FLAVORS = ("claude", "codex", "grok")
 # SessionEvent.kind universe (design §4.2 / §4.3).
 KINDS = (
    "user_msg",
    "assistant_msg",
    "thinking",
    "tool_call",
    "tool_result",
    "error",
    "test_run",
    "edit",
    "retry",
    "human_intervention",
    "decision",
    "lifecycle",
    "completion",
 )
 # Session.outcome universe.
 OUTCOMES = ("success", "fail", "abandoned", "unknown")
 @dataclass
 class Cost:
    """Token + effort accounting for a session."""
    input_tokens: int = 0
    output_tokens: int = 0
    cache_tokens: int = 0
    wall_clock_s: float = 0.0
    turns: int = 0
    retries: int = 0
 @dataclass
 class Session:
    """One bounded run of a coding agent against a repo (design §4.1)."""
    session_uid: str  # "<flavor>:<native id>" — globally unique
    flavor: str
    native_session_id: str
    repo: Optional[str] = None
    domain: Optional[str] = None
    cwd: Optional[str] = None
    git_branch: Optional[str] = None
    model: Optional[str] = None
    started_at: Optional[str] = None  # ISO-8601 UTC
    ended_at: Optional[str] = None
    outcome: str = "unknown"
    cost: Cost = field(default_factory=Cost)
    task_ref: Optional[str] = None
    source_path: Optional[str] = None
    source_bytes: int = 0
    schema_version: int = SCHEMA_VERSION
    # watermarks (design §3.1): discovered -> ingested -> analyzed -> evicted
    discovered_at: Optional[str] = None
    ingested_at: Optional[str] = None
    analyzed_at: Optional[str] = None
    evicted_at: Optional[str] = None
    def __post_init__(self) -> None:
        if self.flavor not in FLAVORS:
            raise ValueError(f"unknown flavor {self.flavor!r}; expected one of {FLAVORS}")
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome {self.outcome!r}; expected one of {OUTCOMES}")
        expected_prefix = f"{self.flavor}:"
        if not self.session_uid.startswith(expected_prefix):
            raise ValueError(
                f"session_uid {self.session_uid!r} must start with {expected_prefix!r}"
            )
    @property
    def is_evictable(self) -> bool:
        """A session may be evicted from Tier 1 only once analyzed (design §3.1)."""
        return self.analyzed_at is not None and self.evicted_at is None
    @staticmethod
    def make_uid(flavor: str, native_session_id: str) -> str:
        return f"{flavor}:{native_session_id}"
    def to_dict(self) -> dict[str, Any]:
        d = asdict(self)
        return d
    def to_json(self) -> str:
        return json.dumps(self.to_dict(), sort_keys=True)
    @classmethod
    def from_dict(cls, d: dict[str, Any]) -> "Session":
        d = dict(d)
        cost = d.pop("cost", None)
        obj = cls(**{k: v for k, v in d.items() if k in _SESSION_FIELDS})
        if cost is not None:
            obj.cost = Cost(**{k: v for k, v in cost.items() if k in _COST_FIELDS})
        return obj
    @classmethod
    def from_json(cls, s: str) -> "Session":
        return cls.from_dict(json.loads(s))
 @dataclass
 class SessionEvent:
    """One atomic record within a session (design §4.2)."""
    session_uid: str
    seq: int  # monotonic within session
    ts: Optional[str] = None
    kind: str = "lifecycle"
    parent_seq: Optional[int] = None  # turn DAG (Claude); None for flat flavors
    role: Optional[str] = None  # user|assistant|system|tool
    tool: Optional[str] = None  # when kind in {tool_call, tool_result}
    summary: Optional[str] = None  # short, human-readable
    payload_ref: Optional[str] = None  # pointer to full body in Tier 1 blob store
    tokens: int = 0
    is_sidechain: bool = False
    def __post_init__(self) -> None:
        if self.kind not in KINDS:
            raise ValueError(f"unknown kind {self.kind!r}; expected one of {KINDS}")
    def to_dict(self) -> dict[str, Any]:
        return asdict(self)
    def to_json(self) -> str:
        return json.dumps(self.to_dict(), sort_keys=True)
    @classmethod
    def from_dict(cls, d: dict[str, Any]) -> "SessionEvent":
        return cls(**{k: v for k, v in d.items() if k in _EVENT_FIELDS})
    @classmethod
    def from_json(cls, s: str) -> "SessionEvent":
        return cls.from_dict(json.loads(s))
 _SESSION_FIELDS = {f.name for f in fields(Session)}
 _COST_FIELDS = {f.name for f in fields(Cost)}
 _EVENT_FIELDS = {f.name for f in fields(SessionEvent)}
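The `from_dict` methods in `schema.py` above filter incoming keys against the dataclass field names, so a row written by a newer schema version still loads. A minimal sketch of that pattern follows, using a cut-down `Event` class that is hypothetical and much smaller than the real `SessionEvent`. The `added_in_v3` key is invented to represent an unknown future field.

```python
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class Event:
    """Cut-down, hypothetical analogue of SessionEvent."""
    session_uid: str
    seq: int
    kind: str = "lifecycle"
    summary: Optional[str] = None

    @classmethod
    def from_dict(cls, d):
        # Keep only keys that are declared fields; silently ignore the rest.
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in d.items() if k in known})


# A stored row carrying a key this version of the class does not know about.
row = {"session_uid": "claude:abc", "seq": 0, "kind": "user_msg", "added_in_v3": True}
event = Event.from_dict(row)
round_tripped = asdict(event)
```

The trade-off of this design is that unknown keys are dropped on load, so re-serializing an old reader's object would lose data a newer writer added.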
--- a/session_memory/core/store.py
+++ b/session_memory/core/store.py
@@ -0,0 +1,315 @@
 """Two-tier store (design §3, §8).
 Tier 1 (bulky, evictable): ``Session`` + ``SessionEvent`` rows in SQLite, with
 event bodies written out-of-line as files under a blob dir (referenced by
 ``payload_ref``). Tier 2 (compact, durable): per-session ``digest`` rows.
 Writes are idempotent on ``(session_uid, seq)`` for events and on
 ``session_uid`` for sessions/digests, so sweeps are safely re-runnable. Eviction
 (:meth:`evict_raw`) deletes Tier 1 rows + blobs but keeps the session row and its
 Tier 2 digest — the invariant that makes budget-based retention non-lossy.
 """
 from __future__ import annotations
 import hashlib
 import json
 import os
 import re
 import sqlite3
 from datetime import datetime, timezone
 from typing import Any, Optional
 from .schema import Cost, Session, SessionEvent
 _SAFE = re.compile(r"[^A-Za-z0-9._-]+")
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _fingerprint(ev: SessionEvent, body: Optional[str]) -> str:
    """Stable content fingerprint, independent of seq/payload_ref, for dedup."""
    h = hashlib.sha1()
    parts = [ev.ts or "", ev.kind, ev.role or "", ev.tool or "", ev.summary or "",
             str(ev.is_sidechain)]
    h.update("\x1f".join(parts).encode("utf-8"))
    if body is not None:
        h.update(b"\x1e")
        h.update(body.encode("utf-8"))
    return h.hexdigest()
 class Store:
    def __init__(self, db_path: str, blob_dir: str):
        self.db_path = db_path
        self.blob_dir = blob_dir
        os.makedirs(os.path.dirname(db_path) or ".", exist_ok=True)
        os.makedirs(blob_dir, exist_ok=True)
        self.db = sqlite3.connect(db_path)
        self.db.row_factory = sqlite3.Row
        self.db.execute("PRAGMA journal_mode=WAL")
        self._init_schema()
    def close(self) -> None:
        self.db.close()
    def __enter__(self) -> "Store":
        return self
    def __exit__(self, *exc) -> None:
        self.close()
    def _init_schema(self) -> None:
        self.db.executescript(
            """
            CREATE TABLE IF NOT EXISTS sessions (
                session_uid TEXT PRIMARY KEY,
                json        TEXT NOT NULL,
                analyzed_at TEXT,
                evicted_at  TEXT
            );
            CREATE TABLE IF NOT EXISTS events (
                session_uid TEXT NOT NULL,
                seq         INTEGER NOT NULL,
                json        TEXT NOT NULL,
                PRIMARY KEY (session_uid, seq)
            );
            CREATE TABLE IF NOT EXISTS blobs (
                ref         TEXT PRIMARY KEY,
                session_uid TEXT NOT NULL,
                path        TEXT NOT NULL,
                nbytes      INTEGER NOT NULL
            );
            CREATE TABLE IF NOT EXISTS digests (
                session_uid TEXT PRIMARY KEY,
                json        TEXT NOT NULL,
                nbytes      INTEGER NOT NULL
            );
            CREATE INDEX IF NOT EXISTS ix_events_uid ON events(session_uid);
            CREATE INDEX IF NOT EXISTS ix_blobs_uid  ON blobs(session_uid);
            """
        )
        self.db.commit()
    # ---- Tier 1 writes -----------------------------------------------------
    def upsert_session(self, s: Session) -> None:
        self.db.execute(
            "INSERT INTO sessions(session_uid, json, analyzed_at, evicted_at) "
            "VALUES(?,?,?,?) ON CONFLICT(session_uid) DO UPDATE SET "
            "json=excluded.json, analyzed_at=excluded.analyzed_at, evicted_at=excluded.evicted_at",
            (s.session_uid, s.to_json(), s.analyzed_at, s.evicted_at),
        )
        self.db.commit()
    def upsert_events(self, events: list[SessionEvent]) -> int:
        rows = [(e.session_uid, e.seq, e.to_json()) for e in events]
        self.db.executemany(
            "INSERT INTO events(session_uid, seq, json) VALUES(?,?,?) "
            "ON CONFLICT(session_uid, seq) DO UPDATE SET json=excluded.json",
            rows,
        )
        self.db.commit()
        return len(rows)
    def write_blobs(self, session_uid: str, blobs: dict[str, str]) -> int:
        """Write event bodies as files; record path + size. Returns bytes written."""
        total = 0
        sub = os.path.join(self.blob_dir, _SAFE.sub("_", session_uid))
        os.makedirs(sub, exist_ok=True)
        for ref, body in blobs.items():
            data = body.encode("utf-8")
            fname = _SAFE.sub("_", ref) + ".txt"
            path = os.path.join(sub, fname)
            # write the encoded bytes so the file size always matches nbytes
            with open(path, "wb") as f:
                f.write(data)
            self.db.execute(
                "INSERT INTO blobs(ref, session_uid, path, nbytes) VALUES(?,?,?,?) "
                "ON CONFLICT(ref) DO UPDATE SET path=excluded.path, nbytes=excluded.nbytes",
                (ref, session_uid, path, len(data)),
            )
            total += len(data)
        self.db.commit()
        return total
    def ingest(self, bundle) -> int:
        """Persist a Normalized bundle, merging into any existing session.
        Multiple files can map to one ``session_uid`` (Claude resume/sidechains;
        Grok multi-file dirs). Events are de-duplicated by content fingerprint and
        genuinely-new events are appended with offset ``seq`` (design OQ6 / T03).
        Returns the number of new events written. Idempotent: re-ingesting the
        same bundle adds nothing.
        """
        s = bundle.session
        existing = self.get_session(s.session_uid)
        if existing is None:
            if s.ingested_at is None:
                s.ingested_at = _now()
            self.upsert_session(s)
        # known fingerprints + current max seq for this session
        seen = self._event_fingerprints(s.session_uid)
        next_seq = self._max_seq(s.session_uid) + 1
        new_events: list[SessionEvent] = []
        new_blobs: dict[str, str] = {}
        old_to_new: dict[int, int] = {}
        for ev in bundle.events:
            body = bundle.blobs.get(ev.payload_ref) if ev.payload_ref else None
            fp = _fingerprint(ev, body)
            if fp in seen:
                continue  # already stored (prior file or prior sweep)
            new_seq = next_seq
            next_seq += 1
            old_to_new[ev.seq] = new_seq
            # remap parent within this bundle; cross-file parents become None
            parent = old_to_new.get(ev.parent_seq) if ev.parent_seq is not None else None
            ref = None
            if body is not None:
                ref = f"blob://{s.session_uid}/{new_seq}"
                new_blobs[ref] = body
            merged = SessionEvent(
                session_uid=s.session_uid, seq=new_seq, parent_seq=parent, ts=ev.ts,
                kind=ev.kind, role=ev.role, tool=ev.tool, summary=ev.summary,
                payload_ref=ref, tokens=ev.tokens, is_sidechain=ev.is_sidechain,
            )
            new_events.append(merged)
            seen.add(fp)
        if new_events:
            self.upsert_events(new_events)
            self.write_blobs(s.session_uid, new_blobs)
        return len(new_events)
    def _max_seq(self, session_uid: str) -> int:
        row = self.db.execute(
            "SELECT COALESCE(MAX(seq), -1) m FROM events WHERE session_uid=?", (session_uid,)
        ).fetchone()
        return int(row["m"])
    def _event_fingerprints(self, session_uid: str) -> set[str]:
        fps: set[str] = set()
        for e in self.get_events(session_uid):
            body = None
            if e.payload_ref:
                r = self.db.execute("SELECT path FROM blobs WHERE ref=?", (e.payload_ref,)).fetchone()
                if r:
                    try:
                        with open(r["path"], "r", encoding="utf-8") as f:
                            body = f.read()
                    except OSError:
                        body = None
            fps.add(_fingerprint(e, body))
        return fps
    # ---- Tier 2 (digest) ---------------------------------------------------
    def write_digest(self, session_uid: str, digest: dict[str, Any], analyzed_at: Optional[str] = None) -> None:
        payload = json.dumps(digest, sort_keys=True)
        self.db.execute(
            "INSERT INTO digests(session_uid, json, nbytes) VALUES(?,?,?) "
            "ON CONFLICT(session_uid) DO UPDATE SET json=excluded.json, nbytes=excluded.nbytes",
            (session_uid, payload, len(payload.encode("utf-8"))),
        )
        self.db.execute(
            "UPDATE sessions SET analyzed_at=? WHERE session_uid=?",
            (analyzed_at or _now(), session_uid),
        )
        self.db.commit()
    def get_digest(self, session_uid: str) -> Optional[dict[str, Any]]:
        row = self.db.execute("SELECT json FROM digests WHERE session_uid=?", (session_uid,)).fetchone()
        return json.loads(row["json"]) if row else None
    def list_digests(self) -> list[dict[str, Any]]:
        return [json.loads(r["json"]) for r in self.db.execute("SELECT json FROM digests")]
    def save_patterns(self, patterns: list[dict[str, Any]]) -> None:
        """Persist candidate patterns to a Tier 2 table (replace prior run)."""
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS patterns ("
            "key TEXT PRIMARY KEY, json TEXT NOT NULL, detected_at TEXT NOT NULL)"
        )
        self.db.execute("DELETE FROM patterns")
        self.db.executemany(
            "INSERT INTO patterns(key, json, detected_at) VALUES(?,?,?)",
            [(p["key"], json.dumps(p, sort_keys=True), _now()) for p in patterns],
        )
        self.db.commit()
    # ---- reads -------------------------------------------------------------
    def get_session(self, session_uid: str) -> Optional[Session]:
        row = self.db.execute(
            "SELECT json, analyzed_at, evicted_at FROM sessions WHERE session_uid=?", (session_uid,)
        ).fetchone()
        return self._row_to_session(row) if row else None
    def list_sessions(self) -> list[Session]:
        rows = self.db.execute("SELECT json, analyzed_at, evicted_at FROM sessions")
        return [self._row_to_session(r) for r in rows]
    @staticmethod
    def _row_to_session(row) -> Session:
        """Rebuild a Session, treating the watermark columns as authoritative."""
        s = Session.from_json(row["json"])
        s.analyzed_at = row["analyzed_at"]
        s.evicted_at = row["evicted_at"]
        return s
    def get_events(self, session_uid: str) -> list[SessionEvent]:
        rows = self.db.execute(
            "SELECT json FROM events WHERE session_uid=? ORDER BY seq", (session_uid,)
        ).fetchall()
        return [SessionEvent.from_json(r["json"]) for r in rows]
    def count_events(self, session_uid: str) -> int:
        return self.db.execute(
            "SELECT COUNT(*) c FROM events WHERE session_uid=?", (session_uid,)
        ).fetchone()["c"]
    # ---- usage accounting (drives retention) -------------------------------
    def tier1_usage_bytes(self) -> int:
        """Bytes held in Tier 1: event-row JSON + blob bytes for non-evicted sessions."""
        row = self.db.execute(
            "SELECT COALESCE(SUM(LENGTH(json)),0) b FROM events e "
            "WHERE NOT EXISTS (SELECT 1 FROM sessions s "
            "WHERE s.session_uid=e.session_uid AND s.evicted_at IS NOT NULL)"
        ).fetchone()
        blob = self.db.execute("SELECT COALESCE(SUM(nbytes),0) b FROM blobs").fetchone()
        return int(row["b"]) + int(blob["b"])
    def session_tier1_bytes(self, session_uid: str) -> int:
        ev = self.db.execute(
            "SELECT COALESCE(SUM(LENGTH(json)),0) b FROM events WHERE session_uid=?", (session_uid,)
        ).fetchone()["b"]
        bl = self.db.execute(
            "SELECT COALESCE(SUM(nbytes),0) b FROM blobs WHERE session_uid=?", (session_uid,)
        ).fetchone()["b"]
        return int(ev) + int(bl)
    def tier2_usage_bytes(self) -> int:
        return int(self.db.execute("SELECT COALESCE(SUM(nbytes),0) b FROM digests").fetchone()["b"])
    # ---- eviction ----------------------------------------------------------
    def evict_raw(self, session_uid: str) -> int:
        """Drop Tier 1 raw (events + blob files) for a session; keep digest + row.
        Sets ``evicted_at``. Returns bytes freed. Safe to call on an
        already-evicted session: nothing is left to delete, so it frees 0 bytes
        and only refreshes ``evicted_at``.
        """
        freed = self.session_tier1_bytes(session_uid)
        for r in self.db.execute("SELECT path FROM blobs WHERE session_uid=?", (session_uid,)).fetchall():
            try:
                os.remove(r["path"])
            except FileNotFoundError:
                pass
        self.db.execute("DELETE FROM blobs WHERE session_uid=?", (session_uid,))
        self.db.execute("DELETE FROM events WHERE session_uid=?", (session_uid,))
        self.db.execute("UPDATE sessions SET evicted_at=? WHERE session_uid=?", (_now(), session_uid))
        self.db.commit()
        return freed
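Review note: the merge step in `Store.ingest` (dedup by content fingerprint, then append genuinely new events at `max(seq) + 1`) can be sketched in memory as below. This is an illustration under stated assumptions, not the store itself: events are plain dicts, the fingerprint covers only three fields, bodies are inlined under a hypothetical `body` key, and parent remapping is omitted.

```python
from __future__ import annotations

import hashlib
from typing import Optional


def fingerprint(ev: dict, body: Optional[str]) -> str:
    """Hash content fields only; seq is deliberately excluded."""
    h = hashlib.sha1()
    parts = [ev.get("ts") or "", ev["kind"], ev.get("summary") or ""]
    h.update("\x1f".join(parts).encode("utf-8"))
    if body is not None:
        h.update(b"\x1e")
        h.update(body.encode("utf-8"))
    return h.hexdigest()


def merge(stored: list[dict], incoming: list[dict]) -> int:
    """Append unseen events to `stored` with fresh seq values; return count added."""
    seen = {fingerprint(e, e.get("body")) for e in stored}
    next_seq = max((e["seq"] for e in stored), default=-1) + 1
    added = 0
    for ev in incoming:
        fp = fingerprint(ev, ev.get("body"))
        if fp in seen:
            continue  # same content already stored, whatever seq it arrived with
        stored.append({**ev, "seq": next_seq})
        seen.add(fp)
        next_seq += 1
        added += 1
    return added


stored: list[dict] = []
file_a = [
    {"seq": 0, "kind": "lifecycle", "summary": "start", "body": "session opened"},
    {"seq": 1, "kind": "tool_call", "summary": "ls", "body": None},
]
# A second file for the same session: one overlapping event, one new event.
file_b = [
    {"seq": 0, "kind": "lifecycle", "summary": "start", "body": "session opened"},
    {"seq": 1, "kind": "lifecycle", "summary": "end", "body": None},
]

first = merge(stored, file_a)   # both events are new
second = merge(stored, file_b)  # only the "end" event is new
again = merge(stored, file_b)   # re-ingesting adds nothing
```

Because the fingerprint ignores `seq`, the overlapping event in `file_b` is recognised even though both files number their events from zero.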
--- a/session_memory/curate/__init__.py
+++ b/session_memory/curate/__init__.py
@@ -0,0 +1,9 @@
 """Curate phase (PRD §6.3) — review candidate patterns into versioned Solution
 Patterns held in an in-repo Pattern Catalog.
 Layout mirrors ``detect/``:
    schema.py    Solution Pattern artifact + per-flavor rendering hints (T01)
    catalog.py   versioned, files-first catalog store (T02)
    review.py    discuss/approve/reject -> promote workflow (T03)
    __main__.py  `python -m session_memory.curate` entrypoint (T06)
 """
--- a/session_memory/curate/__main__.py
+++ b/session_memory/curate/__main__.py
@@ -0,0 +1,130 @@
 """Curate entrypoint (T06): review detect candidates into the Pattern Catalog.
    python -m session_memory.curate [--config PATH] [--auto-approve] [--json]
                                    [--min-frequency N] [--workstream-id ID]
 Refreshes candidate patterns (runs the detect pipeline), then drives them through
 the review workflow — **interactive** by default, or **batch** with
 ``--auto-approve`` (promote everything clearing the evidence bar, reject the rest)
 for kaizen-agent runs. Candidates are presented cross-flavor first (detect's
 ranking). Emits a catalog diff summary and, with ``--json``, a machine-readable
 result. Approvals land in the files-first catalog; each final decision is logged
 as a hub decision (queued if the hub is down).
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..detect.__main__ import run_detect
 from ..ingest import _expand, load_config
 from .catalog import Catalog
 from .decisions import DecisionRecorder
 from .gating import bloat_warnings, evaluate, gate_config
 from .review import APPROVE, DISCUSS, REJECT, ReviewLog, review
 def _curate_paths(config: dict):
    c = config.get("curate", {})
    catalog_dir = _expand(c.get("catalog_dir", "session_memory/catalog"))
    review_log = _expand(c.get("review_log", "session_memory/.store/reviews.jsonl"))
    queue = _expand(c.get("decision_queue", "session_memory/.store/decisions.queue.jsonl"))
    ws_id = c.get("state_hub_workstream_id")
    return catalog_dir, review_log, queue, ws_id
 def _render_candidate(cand: dict, gate, existing) -> str:
    g = evaluate(cand, gate)
    flag = " [CROSS-FLAVOR]" if cand.get("cross_flavor") else ""
    lines = [
        f"\n{cand['title']}{flag}",
        f"  key={cand['key']}  score={cand.get('score')} freq={cand['frequency']} "
        f"impact={cand.get('cost_impact')}",
        f"  flavors={','.join(cand.get('flavors', []))}  "
        f"repos={','.join(cand.get('repos', [])) or '-'}  sessions={len(cand.get('sessions', []))}",
        f"  gate: promotable={g.promotable} distribution_ready={g.distribution_ready}"
        + (f"  ({'; '.join(g.reasons)})" if g.reasons else ""),
    ]
    for w in bloat_warnings(cand, existing):
        lines.append(f"  bloat: {w}")
    return "\n".join(lines)
 def _interactive_decider(gate, catalog):
    def decide(cand):
        print(_render_candidate(cand, gate, catalog.list()))
        while True:
            choice = input("  [a]pprove / [r]eject / [d]iscuss ? ").strip().lower()
            if choice in ("a", "approve"):
                return (APPROVE, input("  rationale: ").strip() or "approved")
            if choice in ("r", "reject"):
                return (REJECT, input("  rationale: ").strip() or "rejected")
            if choice in ("d", "discuss"):
                return (DISCUSS, "deferred for discussion")
    return decide
 def _auto_decider(gate):
    """Batch policy: approve candidates clearing the promote floor, reject the rest."""
    def decide(cand):
        g = evaluate(cand, gate)
        if g.promotable:
            return (APPROVE, "auto-approved: clears evidence bar")
        return (REJECT, "auto-rejected: " + "; ".join(g.reasons))
    return decide
 def _summary(result, n_candidates: int) -> str:
    added = [k for k, a in result.approved if a in ("added", "versioned", "updated")]
    lines = [
        f"# Curate summary  ({n_candidates} candidates reviewed)",
        f"  approved : {len(result.approved)}  ({', '.join(f'{k}:{a}' for k, a in result.approved) or '-'})",
        f"  rejected : {len(result.rejected)}  ({', '.join(result.rejected) or '-'})",
        f"  deferred : {len(result.deferred)}  ({', '.join(result.deferred) or '-'})",
        f"  skipped  : {len(result.skipped)}  (already decided)",
        f"  catalog writes: {len(added)}",
    ]
    return "\n".join(lines)
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Curate detect candidates into the Pattern Catalog.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--auto-approve", action="store_true",
                    help="batch mode: promote everything clearing the evidence bar")
    ap.add_argument("--min-frequency", type=int, default=2)
    ap.add_argument("--workstream-id", default=None, help="hub workstream for decisions")
    ap.add_argument("--json", action="store_true", help="emit machine-readable JSON")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    candidates = run_detect(config, min_frequency=args.min_frequency)
    catalog_dir, review_log_path, queue_path, ws_id = _curate_paths(config)
    gate = gate_config(config)
    catalog = Catalog(catalog_dir)
    log = ReviewLog(review_log_path)
    recorder = DecisionRecorder(queue_path, workstream_id=args.workstream_id or ws_id)
    decide = _auto_decider(gate) if args.auto_approve else _interactive_decider(gate, catalog)
    result = review(candidates, decide, catalog, log, gate=gate, recorder=recorder)
    if args.json:
        print(json.dumps({
            "approved": result.approved, "rejected": result.rejected,
            "deferred": result.deferred, "skipped": result.skipped,
            "decisions_queued": len(recorder.pending()),
        }, indent=2))
    else:
        print(_summary(result, len(candidates)))
        if recorder.pending():
            print(f"  decisions queued (hub offline): {len(recorder.pending())} "
                  f"-> {queue_path}")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/curate/catalog.py
+++ b/session_memory/curate/catalog.py
@@ -0,0 +1,148 @@
 """Versioned Pattern Catalog — files-first source of truth (FR-U3; T02).
 The catalog is a directory of one JSON file per Solution Pattern
 (``<catalog_dir>/<pattern-id>.json``). Files originate the work; the State Hub
 indexes them (ADR-001 / PRD §9). Identity is the pattern ``id`` (derived from the
 source candidate key), so re-promoting the same detect candidate maps to the same
 file — dedup is structural, not heuristic.
 :meth:`Catalog.upsert` is the one write path and is **idempotent**:
 * new id                       -> written as-is                  (``added``)
 * same id, identical content   -> no write, no version bump      (``unchanged``)
 * same id, only status/flags   -> updated in place, no bump      (``updated``)
 * same id, content changed     -> version bumped, prior snapshot
                                  appended to ``<id>.history.jsonl`` (``versioned``)
 History is append-only alongside the current file, so the catalog dir stays one
 clean current file per pattern while every superseded version is recoverable.
 """
 from __future__ import annotations
 import json
 import os
 from datetime import datetime, timezone
 from typing import Optional
 from .schema import SolutionPattern
 # Content fields that define a pattern's substance. Version, timestamps, status,
 # and distribution_ready are metadata — changes to them never bump the version.
 _CONTENT_KEYS = ("name", "polarity", "problem", "resolutions", "scope",
                 "provenance", "rendering_hints", "covers")
 ADDED = "added"
 UNCHANGED = "unchanged"
 UPDATED = "updated"
 VERSIONED = "versioned"
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _content(p: SolutionPattern) -> str:
    d = p.to_dict()
    return json.dumps({k: d[k] for k in _CONTENT_KEYS}, sort_keys=True)
 class Catalog:
    """File-backed catalog of versioned :class:`SolutionPattern` artifacts."""
    def __init__(self, catalog_dir: str) -> None:
        self.dir = catalog_dir
        os.makedirs(self.dir, exist_ok=True)
    # --- paths --------------------------------------------------------------
    def _path(self, pattern_id: str) -> str:
        return os.path.join(self.dir, f"{pattern_id}.json")
    def _history_path(self, pattern_id: str) -> str:
        return os.path.join(self.dir, f"{pattern_id}.history.jsonl")
    # --- reads --------------------------------------------------------------
    def load(self, pattern_id: str) -> Optional[SolutionPattern]:
        path = self._path(pattern_id)
        if not os.path.exists(path):
            return None
        with open(path, encoding="utf-8") as fh:
            return SolutionPattern.from_json(fh.read())
    def list(self) -> list[SolutionPattern]:
        out: list[SolutionPattern] = []
        for name in sorted(os.listdir(self.dir)):
            if name.endswith(".json") and not name.endswith(".history.jsonl"):
                with open(os.path.join(self.dir, name), encoding="utf-8") as fh:
                    out.append(SolutionPattern.from_json(fh.read()))
        return out
    def history(self, pattern_id: str) -> list[dict]:
        path = self._history_path(pattern_id)
        if not os.path.exists(path):
            return []
        with open(path, encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
    def find_for(self, signal_key: str, locus: str = "") -> Optional[SolutionPattern]:
        """Best catalog pattern for a detect signal: exact id first, then ``covers``.
        Lets a signal that doesn't share a pattern's exact key (e.g. a
        ``recurring_error`` fingerprint) inherit the curated recommendation when a
        pattern declares it covers that text.
        """
        exact = self.load(SolutionPattern.make_id(signal_key))
        if exact is not None:
            return exact
        hay = f"{signal_key} {locus}".lower()
        for p in self.list():  # sorted by id -> deterministic
            if any(c.lower() in hay for c in p.covers):
                return p
        return None
    # --- the single write path ---------------------------------------------
    def upsert(self, pattern: SolutionPattern) -> str:
        """Insert or version-update a pattern. Returns the action taken."""
        existing = self.load(pattern.id)
        now = _now()
        if existing is None:
            pattern.created_at = pattern.created_at or now
            pattern.updated_at = now
            self._write(pattern)
            return ADDED
        if _content(existing) == _content(pattern):
            # substance unchanged — only persist a metadata (status/flag) change
            if (existing.status == pattern.status
                    and existing.distribution_ready == pattern.distribution_ready):
                return UNCHANGED
            existing.status = pattern.status
            existing.distribution_ready = pattern.distribution_ready
            existing.updated_at = now
            self._write(existing)
            return UPDATED
        # substance changed: archive the old version, bump, write the new one
        self._append_history(existing)
        pattern.version = SolutionPattern.bump_version(existing.version)
        pattern.created_at = existing.created_at or now
        pattern.updated_at = now
        self._write(pattern)
        return VERSIONED
    # --- internals ----------------------------------------------------------
    def _write(self, pattern: SolutionPattern) -> None:
        with open(self._path(pattern.id), "w", encoding="utf-8") as fh:
            fh.write(pattern.to_json())
            fh.write("\n")
    def _append_history(self, superseded: SolutionPattern) -> None:
        superseded.status = "superseded"
        with open(self._history_path(superseded.id), "a", encoding="utf-8") as fh:
            fh.write(json.dumps(superseded.to_dict(), sort_keys=True))
            fh.write("\n")
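Review note: the four outcomes of `Catalog.upsert` can be exercised with a small in-memory sketch. Everything here is hypothetical scaffolding: patterns are plain dicts, only two content keys are compared, and versions are assumed to be plain integers (the behaviour of the real `SolutionPattern.bump_version` is not shown in this diff).

```python
import json

CONTENT_KEYS = ("name", "problem")  # trimmed subset, for illustration only


def content(p: dict) -> str:
    return json.dumps({k: p[k] for k in CONTENT_KEYS}, sort_keys=True)


def upsert(catalog: dict, history: dict, pattern: dict) -> str:
    existing = catalog.get(pattern["id"])
    if existing is None:
        catalog[pattern["id"]] = dict(pattern)
        return "added"
    if content(existing) == content(pattern):
        if existing["status"] == pattern["status"]:
            return "unchanged"
        existing["status"] = pattern["status"]  # metadata only, no version bump
        return "updated"
    # Substance changed: archive the old version, then store the new one.
    history.setdefault(pattern["id"], []).append({**existing, "status": "superseded"})
    # Assumption: integer versions; the real bump_version may use another scheme.
    catalog[pattern["id"]] = {**pattern, "version": existing["version"] + 1}
    return "versioned"


catalog, history = {}, {}
p = {"id": "p1", "name": "Retry", "problem": "flaky test",
     "status": "provisional", "version": 1}
actions = [
    upsert(catalog, history, p),                                     # new id
    upsert(catalog, history, p),                                     # identical
    upsert(catalog, history, {**p, "status": "approved"}),           # status only
    upsert(catalog, history, {**p, "problem": "flaky network test"}),  # content
]
```

Comparing a canonical JSON rendering of the content keys keeps the "did the substance change" check independent of timestamps and status flags.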
--- a/session_memory/curate/decisions.py
+++ b/session_memory/curate/decisions.py
@@ -0,0 +1,114 @@
 """State Hub decision integration (FR-U4; T05).
 Every final promote/reject is recorded as an auditable decision so the rationale,
 the source candidate key, and an evidence snapshot are traceable. The catalog
 file remains the durable artifact (ADR-001); the decision is the audit trail.
 The recorder is **graceful under a hub outage** — exactly the condition hit during
 Phase 1, where statuses were synced after the fact. A pluggable ``sink`` does the
 actual write (HTTP to the hub, or the MCP ``record_decision`` tool driven by the
 operator). If the sink is absent or raises, the decision is appended to a local
 queue (``decisions.queue.jsonl``) and can be replayed later with :meth:`flush`.
 """
 from __future__ import annotations
 import json
 import os
 from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from typing import Callable, Optional
 # A sink takes a hub-shaped decision payload and persists it (may raise on failure).
 Sink = Callable[[dict], None]
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def build_decision(candidate: dict, action: str, rationale: str,
                   *, workstream_id: Optional[str] = None,
                   decided_by: str = "curator") -> dict:
    """Shape a curate decision as a State Hub ``record_decision`` payload."""
    key = candidate["key"]
    verb = "Promote" if action == "approve" else "Reject"
    return {
        "title": f"{verb} pattern candidate {key}",
        "decision_type": "made",
        "workstream_id": workstream_id,
        "rationale": rationale,
        "decided_by": decided_by,
        "description": json.dumps({
            "action": action,
            "source_key": key,
            "evidence": candidate,
        }, sort_keys=True),
        "recorded_at": _now(),
    }
 @dataclass
 class DecisionRecorder:
    """Records decisions through ``sink`` with a durable local-queue fallback."""
    queue_path: str
    sink: Optional[Sink] = None
    workstream_id: Optional[str] = None
    decided_by: str = "curator"
    _queued: int = field(default=0, init=False)
    def record(self, candidate: dict, action: str, rationale: str) -> bool:
        """Record one decision. Returns True if the sink accepted it; otherwise
        the decision is appended to the local queue and False is returned."""
        payload = build_decision(candidate, action, rationale,
                                 workstream_id=self.workstream_id, decided_by=self.decided_by)
        if self.sink is not None:
            try:
                self.sink(payload)
                return True
            except Exception:  # hub down / transient — fall through to the queue
                pass
        self._append(payload)
        return False
    def pending(self) -> list[dict]:
        if not os.path.exists(self.queue_path):
            return []
        with open(self.queue_path, encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
    def flush(self, sink: Optional[Sink] = None) -> int:
        """Replay queued decisions through ``sink``. Returns count synced.
        Stops at the first failure so ordering is preserved; the unsynced tail is
        rewritten back to the queue.
        """
        sink = sink or self.sink
        if sink is None:
            return 0
        items = self.pending()
        synced = 0
        for i, payload in enumerate(items):
            try:
                sink(payload)
                synced += 1
            except Exception:
                self._rewrite(items[i:])
                return synced
        self._rewrite([])
        return synced
    # --- internals ----------------------------------------------------------
    def _append(self, payload: dict) -> None:
        os.makedirs(os.path.dirname(self.queue_path) or ".", exist_ok=True)
        with open(self.queue_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(payload, sort_keys=True))
            fh.write("\n")
        self._queued += 1
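    def _rewrite(self, items: list[dict]) -> None:
        # flush() can reach here before anything was queued, so the dir may not exist
        os.makedirs(os.path.dirname(self.queue_path) or ".", exist_ok=True)
        with open(self.queue_path, "w", encoding="utf-8") as fh: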
            for payload in items:
                fh.write(json.dumps(payload, sort_keys=True))
                fh.write("\n")
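Review note: the queue-and-replay behaviour of `DecisionRecorder` (queue on sink failure, replay in order, stop at the first failure and keep the unsynced tail) can be sketched with free functions and a temporary JSONL file. The function names and the two demo sinks are hypothetical; only the control flow mirrors the class above.

```python
import json
import os
import tempfile


def record(queue_path: str, payload: dict, sink=None) -> bool:
    """Try the sink first; on absence or failure, append to the local queue."""
    if sink is not None:
        try:
            sink(payload)
            return True
        except Exception:
            pass  # fall through to the queue
    with open(queue_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(payload, sort_keys=True) + "\n")
    return False


def pending(queue_path: str) -> list:
    if not os.path.exists(queue_path):
        return []
    with open(queue_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


def flush(queue_path: str, sink) -> int:
    """Replay in order; stop at the first failure and keep the unsynced tail."""
    items = pending(queue_path)
    synced = 0
    for payload in items:
        try:
            sink(payload)
        except Exception:
            break
        synced += 1
    with open(queue_path, "w", encoding="utf-8") as fh:
        for payload in items[synced:]:
            fh.write(json.dumps(payload, sort_keys=True) + "\n")
    return synced


delivered = []


def hub_down(payload: dict) -> None:
    raise ConnectionError("hub offline")


def flaky(payload: dict) -> None:
    if payload["title"] == "b":
        raise ConnectionError("hub dropped mid-replay")
    delivered.append(payload["title"])


with tempfile.TemporaryDirectory() as tmp:
    q = os.path.join(tmp, "decisions.queue.jsonl")
    accepted = [record(q, {"title": t}, hub_down) for t in ("a", "b", "c")]
    synced_first = flush(q, flaky)  # delivers "a", stops at "b"
    left_after_first = [p["title"] for p in pending(q)]
    synced_second = flush(q, lambda p: delivered.append(p["title"]))
    left_after_second = pending(q)
```

Stopping at the first failure, rather than skipping it, is what keeps the replayed decisions in their original order.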
--- a/session_memory/curate/gating.py
+++ b/session_memory/curate/gating.py
@@ -0,0 +1,117 @@
 """Promotion evidence-bar + bloat guard (design OQ5/OQ6; T04).
 Two gates protect the catalog:
 * **Evidence bar (OQ5)** — a candidate must clear configurable floors
  (frequency, distinct supporting sessions) before it may be promoted at all.
  A separate, stricter bar decides whether the promoted pattern is
  *distribution-eligible* (``status="approved"``, ``distribution_ready=True``)
  vs. merely ``provisional`` — the minimum trustworthy evidence before a pattern
  is allowed near live agent environments.
 * **Bloat guard (OQ6)** — flags candidates that would add little: a duplicate of
  an already-catalogued pattern, or a near-duplicate sharing the same
  signal-type+locus. Keeps the catalog lean so agent context budgets aren't
  degraded by low-value instructions.
 Knobs live under ``[curate]`` in ``config.toml``; :func:`gate_config` reads them
 with safe defaults so the module also works config-free (tests).
 """
 from __future__ import annotations
 from dataclasses import dataclass, field
 from typing import Optional
 from .schema import SolutionPattern
 @dataclass
 class GateConfig:
    # promotion floor (OQ5)
    min_frequency: int = 2
    min_sessions: int = 2
    min_cost_impact: float = 0.0
    # distribution-eligibility floor (stricter; OQ5)
    dist_require_cross_flavor: bool = False
    dist_min_frequency: int = 3
    dist_min_cost_impact: float = 0.0
 def gate_config(config: Optional[dict] = None) -> GateConfig:
    c = (config or {}).get("curate", {}) if config else {}
    g = c.get("gate", {}) if isinstance(c, dict) else {}
    return GateConfig(
        min_frequency=g.get("min_frequency", 2),
        min_sessions=g.get("min_sessions", 2),
        min_cost_impact=g.get("min_cost_impact", 0.0),
        dist_require_cross_flavor=g.get("dist_require_cross_flavor", False),
        dist_min_frequency=g.get("dist_min_frequency", 3),
        dist_min_cost_impact=g.get("dist_min_cost_impact", 0.0),
    )
 @dataclass
 class GateResult:
    promotable: bool
    distribution_ready: bool
    status: str  # "approved" if distribution-ready else "provisional"
    reasons: list = field(default_factory=list)
 def _n_sessions(candidate: dict) -> int:
    return len(candidate.get("sessions", []) or [])
 def evaluate(candidate: dict, config: Optional[GateConfig] = None) -> GateResult:
    """Decide whether a candidate may be promoted, and at what trust level."""
    cfg = config or GateConfig()
    reasons: list[str] = []
    freq = candidate.get("frequency", 0)
    sessions = _n_sessions(candidate)
    impact = candidate.get("cost_impact", 0.0)
    promotable = True
    if freq < cfg.min_frequency:
        promotable = False
        reasons.append(f"frequency {freq} < min {cfg.min_frequency}")
    if sessions < cfg.min_sessions:
        promotable = False
        reasons.append(f"sessions {sessions} < min {cfg.min_sessions}")
    if impact < cfg.min_cost_impact:
        promotable = False
        reasons.append(f"cost_impact {impact} < min {cfg.min_cost_impact}")
    dist = promotable
    if cfg.dist_require_cross_flavor and not candidate.get("cross_flavor", False):
        dist = False
        reasons.append("not cross-flavor (required for distribution)")
    if freq < cfg.dist_min_frequency:
        dist = False
        reasons.append(f"frequency {freq} < distribution min {cfg.dist_min_frequency}")
    if impact < cfg.dist_min_cost_impact:
        dist = False
        reasons.append(f"cost_impact {impact} < distribution min {cfg.dist_min_cost_impact}")
    return GateResult(
        promotable=promotable,
        distribution_ready=bool(dist),
        status="approved" if dist else "provisional",
        reasons=reasons,
    )
 def bloat_warnings(candidate: dict, existing: list[SolutionPattern]) -> list[str]:
    """Flag low-value adds against what is already catalogued (OQ6)."""
    warnings: list[str] = []
    cand_id = SolutionPattern.make_id(candidate["key"])
    _, sig_type, locus = (candidate["key"].split(":", 2) + ["", ""])[:3]
    for p in existing:
        if p.id == cand_id:
            warnings.append(f"duplicate of catalogued pattern {p.id}")
            continue
        p_parts = (p.provenance.source_key.split(":", 2) + ["", ""])[:3]
        if (p_parts[1], p_parts[2]) == (sig_type, locus):
            warnings.append(f"near-duplicate of {p.id} (same {sig_type}/{locus})")
    return warnings
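The two evidence floors in `evaluate` can be illustrated with a reduced, standalone sketch. `Thresholds` and `gate` are hypothetical stand-ins for `GateConfig` and `evaluate`, not part of the module; the cost-impact and cross-flavor checks are left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Thresholds:  # hypothetical, reduced stand-in for GateConfig
    min_frequency: int = 2
    min_sessions: int = 2
    dist_min_frequency: int = 3


def gate(candidate: dict, t: Thresholds = Thresholds()) -> tuple[bool, str]:
    """Return (promotable, status) using the two-floor logic."""
    freq = candidate.get("frequency", 0)
    sessions = len(candidate.get("sessions") or [])
    # Promotion floor: enough occurrences across enough sessions.
    promotable = freq >= t.min_frequency and sessions >= t.min_sessions
    # Distribution floor: stricter, and only reachable if promotable.
    dist_ready = promotable and freq >= t.dist_min_frequency
    return promotable, "approved" if dist_ready else "provisional"


# Illustrative candidates (made-up session ids).
thin = {"frequency": 2, "sessions": ["claude:a", "codex:b"]}
solid = {"frequency": 4, "sessions": ["claude:a", "codex:b", "grok:c"]}
thin_result = gate(thin)
solid_result = gate(solid)
```

Under these assumed thresholds the thin candidate clears the promotion floor but lands as `provisional`, while the better-evidenced one is `approved`.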
--- a/session_memory/curate/review.py
+++ b/session_memory/curate/review.py
@@ -0,0 +1,158 @@
 """Curation review workflow (FR-U1/FR-U2; T03).
 Drives Phase 1 detect candidates through a **discuss / approve / reject** review
 and, on approve, promotes the candidate into a :class:`SolutionPattern` written to
 the :class:`Catalog`. The actual decision is supplied by a ``decide`` callback so
 this engine stays UI-free — the ``__main__`` entrypoint (T06) plugs in interactive
 or batch (auto-approve) logic.
 Re-review is **idempotent** via a :class:`ReviewLog`: a candidate already decided
 is skipped unless its *evidence fingerprint* changed (new sessions/frequency), so
 a prior **reject** is remembered and not re-surfaced, and a prior **approve** is
 updated in place rather than duplicated (catalog dedup does the rest).
 """
 from __future__ import annotations
 import hashlib
 import json
 import os
 from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from typing import Callable, Optional
 from .catalog import Catalog
 from .decisions import DecisionRecorder
 from .gating import GateConfig, evaluate
 from .schema import Provenance, Resolution, Scope, SolutionPattern
 APPROVE = "approve"
 REJECT = "reject"
 DISCUSS = "discuss"  # defer — no final decision recorded
 # Default per-flavor rendering-hint stubs a reviewer can later refine (OQ4).
 _DEFAULT_TARGET = {"claude": "CLAUDE.md", "codex": "AGENTS.md", "grok": "instructions"}
 # A decision callback: (candidate dict) -> (action, rationale)
 Decider = Callable[[dict], tuple]
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def evidence_fingerprint(candidate: dict) -> str:
    """Stable hash of the evidence that would justify (re)reviewing a candidate."""
    keys = ("frequency", "cost_impact", "flavors", "repos", "sessions", "cross_flavor")
    payload = {k: candidate.get(k) for k in keys}
    return hashlib.sha1(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()
 def candidate_to_pattern(candidate: dict, *, status: str = "provisional",
                         distribution_ready: bool = False) -> SolutionPattern:
    """Build a Solution Pattern from a detect candidate.
    ``status``/``distribution_ready`` come from the evidence gate (T04); they
    default to a provisional, non-distribution-ready pattern when ungated.
    """
    src = candidate["key"]
    flavors = list(candidate.get("flavors", []))
    hints = {f: {"target": _DEFAULT_TARGET.get(f, ""), "note": "TODO: refine rendering"}
             for f in flavors}
    return SolutionPattern(
        id=SolutionPattern.make_id(src),
        name=candidate.get("title") or src,
        version="1.0.0",
        polarity=candidate.get("polarity", "problem"),
        problem=candidate.get("title") or src,
        resolutions=[Resolution(summary="TODO: capture the recommended resolution")],
        scope=Scope(flavors=flavors, repos=list(candidate.get("repos", []))),
        provenance=Provenance(source_key=src, evidence=dict(candidate), promoted_at=_now()),
        rendering_hints=hints,
        status=status,
        distribution_ready=distribution_ready,
    )
@dataclass
 class ReviewLog:
    """Append-only record of final decisions, keyed by candidate source key."""
    path: str
    _by_key: dict = field(default_factory=dict)
    def __post_init__(self) -> None:
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as fh:
                for line in fh:
                    if line.strip():
                        rec = json.loads(line)
                        self._by_key[rec["source_key"]] = rec  # last write wins
    def prior(self, source_key: str) -> Optional[dict]:
        return self._by_key.get(source_key)
    def already_decided(self, candidate: dict) -> bool:
        rec = self._by_key.get(candidate["key"])
        return bool(rec) and rec["fingerprint"] == evidence_fingerprint(candidate)
    def record(self, candidate: dict, action: str, rationale: str) -> None:
        rec = {
            "source_key": candidate["key"],
            "action": action,
            "rationale": rationale,
            "fingerprint": evidence_fingerprint(candidate),
            "ts": _now(),
        }
        self._by_key[candidate["key"]] = rec
        os.makedirs(os.path.dirname(self.path) or ".", exist_ok=True)
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(rec, sort_keys=True))
            fh.write("\n")
@dataclass
 class ReviewResult:
    approved: list = field(default_factory=list)   # (source_key, catalog_action)
    rejected: list = field(default_factory=list)   # source_key
    deferred: list = field(default_factory=list)   # source_key (discuss)
    skipped: list = field(default_factory=list)    # source_key (already decided)
 def review(candidates: list[dict], decide: Decider, catalog: Catalog,
           log: ReviewLog, gate: Optional[GateConfig] = None,
           recorder: Optional[DecisionRecorder] = None) -> ReviewResult:
    """Run each candidate through ``decide``; promote approvals into ``catalog``.
    When a ``gate`` (T04 evidence bar) is supplied, the promoted pattern's
    ``status``/``distribution_ready`` are set from the gate evaluation, so an
    approved-but-thin candidate lands as ``provisional`` rather than
    distribution-ready. When a ``recorder`` (T05) is supplied, each final
    promote/reject is logged as an auditable hub decision (queued if the hub is
    down).
    """
    result = ReviewResult()
    for cand in candidates:
        key = cand["key"]
        if log.already_decided(cand):
            result.skipped.append(key)
            continue
        action, rationale = decide(cand)
        if action == DISCUSS:
            result.deferred.append(key)
            continue  # not a final decision — leave for a later pass
        if action == APPROVE:
            g = evaluate(cand, gate) if gate is not None else None
            pattern = (candidate_to_pattern(cand, status=g.status,
                                            distribution_ready=g.distribution_ready)
                       if g is not None else candidate_to_pattern(cand))
            cat_action = catalog.upsert(pattern)
            result.approved.append((key, cat_action))
        elif action == REJECT:
            result.rejected.append(key)
        else:
            raise ValueError(f"unknown review action {action!r}")
        log.record(cand, action, rationale)
        if recorder is not None:
            recorder.record(cand, action, rationale)
    return result
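The idempotent re-review described in the module docstring hinges on the evidence fingerprint. The sketch below copies the hashing approach into a standalone form; the in-memory `decided` dict and the `needs_review` helper are hypothetical substitutes for `ReviewLog`, and the sample candidate is made up.

```python
import hashlib
import json

# Same evidence fields as evidence_fingerprint() hashes.
KEYS = ("frequency", "cost_impact", "flavors", "repos", "sessions", "cross_flavor")


def fingerprint(candidate: dict) -> str:
    payload = {k: candidate.get(k) for k in KEYS}
    return hashlib.sha1(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()


first = {
    "key": "problem:retry_storm:retries",
    "title": "problem: retry storm",
    "frequency": 2,
    "sessions": ["claude:a", "codex:b"],
}

# Hypothetical in-memory stand-in for the review log: source key -> fingerprint.
decided = {first["key"]: fingerprint(first)}


def needs_review(candidate: dict) -> bool:
    """True when the candidate is new or its evidence changed since the decision."""
    return decided.get(candidate["key"]) != fingerprint(candidate)


reworded = dict(first, title="a different title")  # title is not evidence
grown = dict(first, frequency=3, sessions=first["sessions"] + ["grok:c"])
```

A reworded title leaves the fingerprint untouched, so the candidate is skipped; new sessions or a higher frequency change it and bring the candidate back for review.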
--- a/session_memory/curate/schema.py
+++ b/session_memory/curate/schema.py
@@ -0,0 +1,160 @@
 """Solution Pattern schema (PRD §6.3 FR-U2; design OQ4) — T01.
 A **Solution Pattern** is the curated, reviewed artifact a candidate pattern is
 promoted into: a named, versioned record pairing a problem (or success) with one
 or more recommended resolutions, written **flavor-agnostically**. Everything a
 distributor needs to render a native artifact lives in a *separate*
 ``rendering_hints`` sub-structure, keyed by flavor — so the core stays neutral
 (FR-A1/FR-A2) while Phase 3 distributors still get enough to render well (OQ4).
 The artifact is the durable unit of the Pattern Catalog (T02): files originate,
 the State Hub indexes (ADR-001). Serialization is deterministic (sorted keys) so
 catalog files diff cleanly and re-saving an unchanged pattern is a no-op.
 """
 from __future__ import annotations
 import json
 import re
 from dataclasses import asdict, dataclass, field, fields
 from typing import Any, Optional
 from ..core.schema import FLAVORS
 SCHEMA_VERSION = 1
 # Lifecycle of a catalogued pattern.
 #   provisional — promoted but below the distribution evidence bar (OQ5)
 #   approved    — meets the bar; distribution-eligible (Phase 3)
 #   rejected    — reviewed and declined; remembered so it is not re-surfaced
 #   superseded  — replaced by a newer version of the same pattern id
 STATUSES = ("provisional", "approved", "rejected", "superseded")
 POLARITIES = ("problem", "success")
@dataclass
 class Resolution:
    """One recommended resolution for the pattern's problem (FR-U2)."""
    summary: str
    detail: str = ""
    steps: list[str] = field(default_factory=list)
@dataclass
 class Scope:
    """Where the pattern applies (FR-X2 input). Empty list == unrestricted."""
    repos: list[str] = field(default_factory=list)
    domains: list[str] = field(default_factory=list)
    flavors: list[str] = field(default_factory=list)
    def __post_init__(self) -> None:
        bad = [f for f in self.flavors if f not in FLAVORS]
        if bad:
            raise ValueError(f"unknown flavor(s) in scope {bad!r}; expected {FLAVORS}")
@dataclass
 class Provenance:
    """Trace back to the detect candidate this pattern was promoted from."""
    source_key: str  # the detect Pattern.key — stable cluster identity
    evidence: dict[str, Any] = field(default_factory=dict)  # snapshot of the candidate
    detected_at: Optional[str] = None
    promoted_at: Optional[str] = None
@dataclass
 class SolutionPattern:
    """A curated, versioned solution pattern (PRD §5 / §6.3)."""
    id: str  # stable, derived from provenance.source_key
    name: str
    version: str  # semantic, e.g. "1.0.0"
    polarity: str  # problem | success
    problem: str  # human-readable description of the recurring situation
    resolutions: list[Resolution] = field(default_factory=list)
    scope: Scope = field(default_factory=Scope)
    provenance: Provenance = field(default_factory=lambda: Provenance(source_key=""))
    # per-flavor rendering hints, kept OUT of the agnostic core (OQ4):
    #   {"claude": {...}, "codex": {...}, "grok": {...}}
    rendering_hints: dict[str, dict[str, Any]] = field(default_factory=dict)
    # other signal keys/loci this pattern's recommendation also applies to —
    # lowercase substrings matched against a candidate signal's key+locus, so a
    # detect signal that doesn't share this pattern's exact key (e.g. a
    # recurring_error fingerprint) can still inherit the curated resolution.
    covers: list[str] = field(default_factory=list)
    status: str = "provisional"
    distribution_ready: bool = False
    created_at: Optional[str] = None
    updated_at: Optional[str] = None
    schema_version: int = SCHEMA_VERSION
    def __post_init__(self) -> None:
        if self.polarity not in POLARITIES:
            raise ValueError(f"unknown polarity {self.polarity!r}; expected {POLARITIES}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status {self.status!r}; expected {STATUSES}")
        bad = [f for f in self.rendering_hints if f not in FLAVORS]
        if bad:
            raise ValueError(f"unknown flavor(s) in rendering_hints {bad!r}; expected {FLAVORS}")
    # --- identity / versioning helpers -------------------------------------
    @staticmethod
    def make_id(source_key: str) -> str:
        """Stable catalog id from a detect candidate key (``polarity:type:locus``).
        Identity is the source key, so re-promoting the same candidate maps to the
        same pattern (dedup in T02), independent of wording or version.
        """
        slug = re.sub(r"[^a-z0-9_]+", "-", source_key.lower()).strip("-")
        return f"sp-{slug}"
    @staticmethod
    def bump_version(version: str, level: str = "patch") -> str:
        """Increment a ``major.minor.patch`` version string."""
        parts = (version.split(".") + ["0", "0", "0"])[:3]
        major, minor, patch = (int(p) for p in parts)
        if level == "major":
            major, minor, patch = major + 1, 0, 0
        elif level == "minor":
            minor, patch = minor + 1, 0
        else:
            patch += 1
        return f"{major}.{minor}.{patch}"
    # --- serialization ------------------------------------------------------
    def to_dict(self) -> dict[str, Any]:
        return asdict(self)
    def to_json(self) -> str:
        return json.dumps(self.to_dict(), sort_keys=True, indent=2)
    @classmethod
    def from_dict(cls, d: dict[str, Any]) -> "SolutionPattern":
        d = dict(d)
        resolutions = [Resolution(**{k: v for k, v in r.items() if k in _RESOLUTION_FIELDS})
                       for r in d.pop("resolutions", [])]
        scope = d.pop("scope", None)
        prov = d.pop("provenance", None)
        obj = cls(**{k: v for k, v in d.items() if k in _PATTERN_FIELDS})
        obj.resolutions = resolutions
        if scope is not None:
            obj.scope = Scope(**{k: v for k, v in scope.items() if k in _SCOPE_FIELDS})
        if prov is not None:
            obj.provenance = Provenance(**{k: v for k, v in prov.items() if k in _PROV_FIELDS})
        return obj
    @classmethod
    def from_json(cls, s: str) -> "SolutionPattern":
        return cls.from_dict(json.loads(s))
 _PATTERN_FIELDS = {f.name for f in fields(SolutionPattern)}
 _RESOLUTION_FIELDS = {f.name for f in fields(Resolution)}
 _SCOPE_FIELDS = {f.name for f in fields(Scope)}
 _PROV_FIELDS = {f.name for f in fields(Provenance)}
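A standalone sketch of the two identity/versioning helpers, lifted out of `SolutionPattern` as free functions so they can be tried without the package. The sample keys are illustrative only.

```python
import re


def make_id(source_key: str) -> str:
    # Collapse every run of characters outside [a-z0-9_] into a single "-".
    slug = re.sub(r"[^a-z0-9_]+", "-", source_key.lower()).strip("-")
    return f"sp-{slug}"


def bump_version(version: str, level: str = "patch") -> str:
    # Pad short versions such as "2" to three components before parsing.
    parts = (version.split(".") + ["0", "0", "0"])[:3]
    major, minor, patch = (int(p) for p in parts)
    if level == "major":
        major, minor, patch = major + 1, 0, 0
    elif level == "minor":
        minor, patch = minor + 1, 0
    else:
        patch += 1
    return f"{major}.{minor}.{patch}"


pattern_id = make_id("problem:retry_storm:retries")
next_patch = bump_version("1.0.0")
next_minor = bump_version("1.2.3", "minor")
```

Because the id depends only on the source key, re-promoting the same candidate with different wording or a bumped version still maps to the same catalog entry.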
--- a/session_memory/detect/__init__.py
+++ b/session_memory/detect/__init__.py
@@ -0,0 +1 @@
 """Detect: extract signals from sessions, cluster into candidate patterns."""
--- a/session_memory/detect/__main__.py
+++ b/session_memory/detect/__main__.py
@@ -0,0 +1,72 @@
 """Detect entrypoint (T07): digests -> signals -> clusters -> report.
    python -m session_memory.detect [--config PATH] [--json] [--min-frequency N]
 Reads Tier 2 digests from the store, extracts signals, clusters them into
 candidate patterns, persists the candidates, and prints a ranked report
 (cross-flavor first) — the input to the Curate phase (Phase 2).
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..core.store import Store
 from ..ingest import _expand, load_config
 from .cluster import cluster
 from .quality import filter_real, quality_config
 from .signals import extract_signals
 def run_detect(config: dict, *, min_frequency: int = 2) -> list[dict]:
    store_cfg = config.get("store", {})
    store = Store(_expand(store_cfg["db_path"]), _expand(store_cfg["blob_dir"]))
    try:
        digests = filter_real(store.list_digests(), quality_config(config))
        signals = extract_signals(digests)
        patterns = [p.to_dict() for p in cluster(signals, min_frequency=min_frequency)]
        store.save_patterns(patterns)
    finally:
        # close the store even if extraction or clustering raises
        store.close()
    return patterns
 def _format_report(patterns: list[dict], n_digests: int) -> str:
    lines = [f"# Candidate Patterns  ({len(patterns)} from {n_digests} sessions)", ""]
    if not patterns:
        lines.append("No recurring patterns above the frequency threshold yet.")
        return "\n".join(lines)
    for i, p in enumerate(patterns, 1):
        flag = " [CROSS-FLAVOR]" if p["cross_flavor"] else ""
        lines.append(f"{i}. {p['title']}{flag}")
        lines.append(f"   score={p['score']} freq={p['frequency']} "
                     f"impact={p['cost_impact']} flavors={','.join(p['flavors'])}")
        lines.append(f"   repos={','.join(p['repos']) or '-'}  "
                     f"sessions={len(p['sessions'])}")
        lines.append("")
    return "\n".join(lines)
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Detect candidate patterns from session digests.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--min-frequency", type=int, default=2)
    ap.add_argument("--json", action="store_true", help="emit machine-readable JSON")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    store_cfg = config.get("store", {})
    store = Store(_expand(store_cfg["db_path"]), _expand(store_cfg["blob_dir"]))
    try:
        all_digests = store.list_digests()
    finally:
        store.close()  # this handle is only needed for the session count
    n = len(filter_real(all_digests, quality_config(config)))
    patterns = run_detect(config, min_frequency=args.min_frequency)
    if args.json:
        print(json.dumps(patterns, indent=2))
    else:
        print(_format_report(patterns, n))
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/detect/cluster.py
+++ b/session_memory/detect/cluster.py
@@ -0,0 +1,78 @@
 """Pattern clusterer + evidence (PRD §5, §6.2; T05/T06).
 Groups recurring :class:`Signal`s into candidate ``Pattern`` records. Clustering
 is deterministic and keyed on ``(polarity, signal-type, locus)`` — enough to
 surface "the same thing keeps happening" without embeddings (a later option).
 Each candidate carries evidence (FR-D3): supporting sessions, frequency, affected
 repos, affected **flavors**, and an estimated cost-impact score. Candidates whose
 evidence spans more than one flavor are flagged ``cross_flavor`` (FR-D4) — the
 highest-value reuse targets.
 """
 from __future__ import annotations
 import collections
 from dataclasses import asdict, dataclass, field
 from typing import Any
 from .signals import PROBLEM, Signal
@dataclass
 class Pattern:
    key: str                       # stable cluster key
    polarity: str                  # problem | success
    signal_type: str
    locus: str
    frequency: int                 # number of supporting signals
    sessions: list[str] = field(default_factory=list)
    repos: list[str] = field(default_factory=list)
    flavors: list[str] = field(default_factory=list)
    cross_flavor: bool = False
    cost_impact: float = 0.0       # summed magnitude of supporting signals
    score: float = 0.0             # ranking score (impact x frequency)
    title: str = ""
    def to_dict(self) -> dict[str, Any]:
        return asdict(self)
 def _key(s: Signal) -> str:
    return f"{s.polarity}:{s.type}:{s.locus}"
 def _title(polarity: str, signal_type: str, n_flavors: int) -> str:
    scope = "cross-flavor " if n_flavors > 1 else ""
    verb = "problem" if polarity == PROBLEM else "success"
    return f"{scope}{verb}: {signal_type.replace('_', ' ')}"
 def cluster(signals: list[Signal], *, min_frequency: int = 2) -> list[Pattern]:
    """Group signals into candidate patterns; keep clusters >= min_frequency."""
    groups: dict[str, list[Signal]] = collections.defaultdict(list)
    for s in signals:
        groups[_key(s)].append(s)
    patterns: list[Pattern] = []
    for key, members in groups.items():
        if len(members) < min_frequency:
            continue
        sessions = sorted({m.session_uid for m in members})
        repos = sorted({m.repo for m in members if m.repo})
        flavors = sorted({m.flavor for m in members})
        cost_impact = sum(m.magnitude for m in members)
        first = members[0]
        p = Pattern(
            key=key, polarity=first.polarity, signal_type=first.type, locus=first.locus,
            frequency=len(members), sessions=sessions, repos=repos, flavors=flavors,
            cross_flavor=len(flavors) > 1, cost_impact=round(cost_impact, 3),
            title=_title(first.polarity, first.type, len(flavors)),
        )
        # rank: impact x frequency, with a boost for cross-flavor reuse value
        p.score = round(p.cost_impact * p.frequency * (1.5 if p.cross_flavor else 1.0), 3)
        patterns.append(p)
    # cross-flavor first, then by score
    patterns.sort(key=lambda p: (not p.cross_flavor, -p.score))
    return patterns
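The grouping and ranking rule can be shown with plain tuples instead of `Signal` objects. This is a reduced sketch under that assumption: the sample signals are invented, and only the key, the cross-flavor flag and the score are kept.

```python
from collections import defaultdict

# (session_uid, flavor, polarity, type, locus, magnitude): made-up sample signals
signals = [
    ("claude:a", "claude", "problem", "retry_storm", "retries", 4.0),
    ("codex:b", "codex", "problem", "retry_storm", "retries", 3.0),
    ("claude:c", "claude", "problem", "tool_thrash", "tool:Bash", 90.0),
    ("claude:d", "claude", "problem", "tool_thrash", "tool:Bash", 85.0),
    ("claude:e", "claude", "success", "clean_pass", "outcome", 1.0),
]

# Group on the same "polarity:type:locus" key the clusterer uses.
groups = defaultdict(list)
for s in signals:
    groups[f"{s[2]}:{s[3]}:{s[4]}"].append(s)

ranked = []
for key, members in groups.items():
    if len(members) < 2:  # min_frequency
        continue
    cross = len({m[1] for m in members}) > 1
    impact = sum(m[5] for m in members)
    score = round(impact * len(members) * (1.5 if cross else 1.0), 3)
    ranked.append((key, cross, score))

# Cross-flavor first, then by descending score.
ranked.sort(key=lambda r: (not r[1], -r[2]))
```

Here the cross-flavor retry storm (score 21.0) sorts ahead of the single-flavor tool thrash (score 350.0), because the cross-flavor flag is the primary sort key and the score only breaks ties within each group. The lone `clean_pass` signal falls below the frequency floor and is dropped.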
--- a/session_memory/detect/quality.py
+++ b/session_memory/detect/quality.py
@@ -0,0 +1,75 @@
 """Session-quality filter (T01).
 The capture layer ingests *every* session it finds — including API health-checks,
 smoke-tests, and interrupted runs (e.g. ``llm-connect`` firing "Say hello in one
 word", or a transcript that is just ``[Request interrupted by user]``). These are
 not real coding work, but the outcome heuristic labels the short ones ``abandoned``
 and the clusterer then mints false-positive "problem" patterns from them.
 :func:`is_real_coding_session` gates those out so Detect signals/clusters form only
 over genuine coding sessions. It is intentionally conservative — a session counts
 as real if it shows substantive activity, and is dropped only on clear trivial
 markers. Thresholds come from ``[detect.quality]`` in ``config.toml``.
 """
 from __future__ import annotations
 from dataclasses import dataclass
 from typing import Optional
 # Prompt prefixes/markers that indicate a non-coding or interrupted session.
 _TRIVIAL_PROMPTS = (
    "say hello", "hello", "[request interrupted", "return only this json",
    "ping", "ok", "<system-reminder>",
 )
 # Tool buckets that count as "substantive" coding activity.
 _SUBSTANTIVE_TOOLS = (
    "Edit", "Write", "Read", "Bash", "search_replace", "write", "read_file",
    "run_terminal_command", "grep", "Grep", "glob", "Glob", "NotebookEdit",
 )
@dataclass
 class QualityConfig:
    min_events: int = 20          # below this, not a real coding session
    min_substantive: int = 3      # >= this many substantive tool calls required
    min_prompt_len: int = 25      # first prompt shorter than this is suspect
 def quality_config(config: Optional[dict] = None) -> QualityConfig:
    d = (config or {}).get("detect", {}).get("quality", {})
    return QualityConfig(
        min_events=d.get("min_events", 20),
        min_substantive=d.get("min_substantive", 3),
        min_prompt_len=d.get("min_prompt_len", 25),
    )
 def _substantive_calls(digest: dict) -> int:
    hist = digest.get("tool_histogram") or {}
    return sum(n for t, n in hist.items() if t in _SUBSTANTIVE_TOOLS)
 def is_real_coding_session(digest: dict, config: Optional[QualityConfig] = None) -> bool:
    cfg = config or QualityConfig()
    if not digest.get("repo"):
        return False
    if digest.get("event_count", 0) < cfg.min_events:
        return False
    if _substantive_calls(digest) < cfg.min_substantive:
        return False
    prompt = (digest.get("first_prompt") or "").strip().lower()
    if len(prompt) < cfg.min_prompt_len:
        return False
    if any(prompt.startswith(p) for p in _TRIVIAL_PROMPTS):
        return False
    return True
 def filter_real(digests: list[dict], config: Optional[QualityConfig] = None) -> list[dict]:
    cfg = config or QualityConfig()
    return [d for d in digests if is_real_coding_session(d, cfg)]
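A reduced, standalone sketch of the quality gate, assuming digests shaped like the ones `is_real_coding_session` reads. `looks_real` is a hypothetical name, the tool set is shortened, and the trivial-prompt prefix check is omitted; the two digests are invented examples.

```python
SUBSTANTIVE = {"Edit", "Write", "Read", "Bash"}  # shortened list for the sketch


def looks_real(digest: dict, min_events: int = 20, min_substantive: int = 3,
               min_prompt_len: int = 25) -> bool:
    """Keep a session only if every activity floor is met."""
    if not digest.get("repo"):
        return False
    if digest.get("event_count", 0) < min_events:
        return False
    hist = digest.get("tool_histogram") or {}
    if sum(n for t, n in hist.items() if t in SUBSTANTIVE) < min_substantive:
        return False
    prompt = (digest.get("first_prompt") or "").strip()
    return len(prompt) >= min_prompt_len


health_check = {
    "repo": "helix", "event_count": 4,
    "tool_histogram": {"Bash": 1},
    "first_prompt": "Say hello in one word",
}
real_work = {
    "repo": "helix", "event_count": 120,
    "tool_histogram": {"Edit": 12, "Read": 30, "TaskUpdate": 5},
    "first_prompt": "Refactor the gating module to read thresholds from config",
}
kept = [d for d in (health_check, real_work) if looks_real(d)]
```

Only the second digest survives, so a health-check session never reaches the signal extractors.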
--- a/session_memory/detect/signals.py
+++ b/session_memory/detect/signals.py
@@ -0,0 +1,205 @@
 """Signal extractors (PRD §6.2; T04).
 Pure functions over a session digest (Tier 2) — the compact, durable view. Each
 extractor emits zero or more :class:`Signal`s. A signal records its source
 session, a *locus* (what it's about), a *polarity* (problem vs. success), and a
 *magnitude*. Signals are the atoms the clusterer groups into candidate patterns.
 No new capture happens here; everything is derived from digests already written
 by the Capture layer, so detection is cheap and re-runnable.
 """
 from __future__ import annotations
 from dataclasses import dataclass, field
 from typing import Any, Callable, Optional
 # polarity
 PROBLEM = "problem"
 SUCCESS = "success"
@dataclass
 class Signal:
    session_uid: str
    flavor: str
    repo: Optional[str]
    type: str               # e.g. "budget_overrun", "clean_pass"
    polarity: str           # PROBLEM | SUCCESS
    locus: str              # normalized subject key (tool, marker, ...)
    magnitude: float = 1.0  # strength / cost weight
    detail: dict[str, Any] = field(default_factory=dict)
 # --- individual extractors --------------------------------------------------
 # Each takes (digest, ctx) and returns a list[Signal]. ctx carries corpus-level
 # stats (e.g. cost percentiles) so extractors can compare a session to its peers.
 def _base(digest, type_, polarity, locus, magnitude=1.0, **detail) -> Signal:
    return Signal(
        session_uid=digest["session_uid"], flavor=digest["flavor"],
        repo=digest.get("repo"), type=type_, polarity=polarity, locus=locus,
        magnitude=magnitude, detail=detail,
    )
 def sig_retry_storm(digest, ctx) -> list[Signal]:
    retries = digest.get("markers", {}).get("retries", 0)
    if retries >= ctx.get("retry_storm_threshold", 3):
        return [_base(digest, "retry_storm", PROBLEM, "retries", float(retries), retries=retries)]
    return []
 def sig_repeated_errors(digest, ctx) -> list[Signal]:
    errors = digest.get("markers", {}).get("errors", 0)
    if errors >= ctx.get("error_threshold", 3):
        return [_base(digest, "repeated_errors", PROBLEM, "errors", float(errors), errors=errors)]
    return []
 def sig_budget_overrun(digest, ctx) -> list[Signal]:
    total = digest.get("cost", {}).get("input_tokens", 0) + digest.get("cost", {}).get("output_tokens", 0)
    p90 = ctx.get("tokens_p90", 0)
    if p90 and total > p90:
        return [_base(digest, "budget_overrun", PROBLEM, "tokens",
                      float(total) / max(p90, 1), tokens=total, p90=p90)]
    return []
 def sig_abandoned(digest, ctx) -> list[Signal]:
    if digest.get("outcome") == "abandoned":
        return [_base(digest, "abandoned", PROBLEM, "outcome", 1.0)]
    return []
 def sig_clean_pass(digest, ctx) -> list[Signal]:
    """Success: ended in success, ran tests, with no errors and no retries."""
    m = digest.get("markers", {})
    if (digest.get("outcome") == "success" and m.get("test_runs", 0) >= 1
            and m.get("errors", 0) == 0 and m.get("retries", 0) == 0):
        return [_base(digest, "clean_pass", SUCCESS, "outcome", 1.0,
                      test_runs=m.get("test_runs"))]
    return []
 def sig_error_then_recovery(digest, ctx) -> list[Signal]:
    """Success despite hitting errors — a recovery worth learning from."""
    m = digest.get("markers", {})
    if digest.get("outcome") == "success" and m.get("errors", 0) >= 1:
        return [_base(digest, "error_then_recovery", SUCCESS, "errors",
                      float(m.get("errors", 1)), errors=m.get("errors"))]
    return []
 # --- tool-mix / infrastructure-overhead signals (WP-0005 T02) ----------------
 # These read the captured ``tool_histogram`` — friction that the outcome+marker
 # signals above are blind to (sessions still "succeed", just expensively).
 def tool_bucket(tool: str) -> str:
    """Group a tool name into a coarse activity bucket (flavor-agnostic)."""
    if tool.startswith("mcp__state-hub"):
        return "statehub_mcp"
    if tool in ("TaskUpdate", "TaskCreate", "TaskGet", "TaskList", "TaskOutput",
                "TaskStop", "todo_write", "update_task_status"):
        return "task_mgmt"
    if tool == "ToolSearch":
        return "schema_load"
    if tool in ("Bash", "run_terminal_command"):
        return "shell"
    if tool in ("Edit", "Write", "search_replace", "write", "NotebookEdit"):
        return "edit"
    if tool in ("Read", "read_file", "grep", "Grep", "glob", "Glob"):
        return "read"
    return "other"
 def _bucketed(digest) -> tuple[dict, int]:
    buckets: dict[str, int] = {}
    for tool, n in (digest.get("tool_histogram") or {}).items():
        buckets[tool_bucket(tool)] = buckets.get(tool_bucket(tool), 0) + n
    return buckets, sum(buckets.values())
 def sig_infra_overhead(digest, ctx) -> list[Signal]:
    """Problem: a large share of tool calls is hub/task/schema plumbing, not work."""
    buckets, total = _bucketed(digest)
    if total < ctx.get("infra_min_calls", 20):
        return []
    overhead = buckets.get("statehub_mcp", 0) + buckets.get("task_mgmt", 0) + buckets.get("schema_load", 0)
    share = overhead / total
    if share >= ctx.get("infra_overhead_threshold", 0.30):
        return [_base(digest, "infra_overhead", PROBLEM, "infra_overhead", round(share, 3),
                      overhead_calls=overhead, total_calls=total,
                      statehub=buckets.get("statehub_mcp", 0),
                      task_mgmt=buckets.get("task_mgmt", 0),
                      schema_load=buckets.get("schema_load", 0))]
    return []
 def sig_schema_thrash(digest, ctx) -> list[Signal]:
    """Problem: repeated ToolSearch — deferred-tool schemas reloaded over and over."""
    buckets, _ = _bucketed(digest)
    n = buckets.get("schema_load", 0)
    if n >= ctx.get("schema_thrash_threshold", 5):
        return [_base(digest, "schema_thrash", PROBLEM, "schema_load", float(n), tool_searches=n)]
    return []
 def sig_tool_thrash(digest, ctx) -> list[Signal]:
    """Problem: the most-used tool was called at least ``tool_thrash_threshold`` times, which likely indicates churn."""
    hist = digest.get("tool_histogram") or {}
    if not hist:
        return []
    tool, n = max(hist.items(), key=lambda kv: kv[1])
    if n >= ctx.get("tool_thrash_threshold", 80):
        return [_base(digest, "tool_thrash", PROBLEM, f"tool:{tool}", float(n), tool=tool, calls=n)]
    return []
 def sig_recurring_error(digest, ctx) -> list[Signal]:
    """Problem: a normalized error fingerprint (WP-0006) — one signal per distinct
    error in the session, so the same error across sessions/repos/flavors clusters
    into a candidate root-cause pattern (locus = fingerprint, magnitude = in-session
    occurrences). This is the content-level 'why', not just a coarse error count.
    """
    out: list[Signal] = []
    for snip in digest.get("error_snippets", []) or []:
        fp = snip.get("fingerprint")
        if not fp:
            continue
        out.append(_base(digest, "recurring_error", PROBLEM, fp, float(snip.get("count", 1)),
                         sample=snip.get("sample", ""), tool=snip.get("tool"),
                         occurrences=snip.get("count", 1)))
    return out
 EXTRACTORS: list[Callable] = [
    sig_retry_storm, sig_repeated_errors, sig_budget_overrun, sig_abandoned,
    sig_clean_pass, sig_error_then_recovery,
    sig_infra_overhead, sig_schema_thrash, sig_tool_thrash,
    sig_recurring_error,
 ]
 def build_context(digests: list[dict]) -> dict[str, Any]:
    """Corpus-level stats so extractors can compare a session to its peers."""
    totals = sorted(
        d.get("cost", {}).get("input_tokens", 0) + d.get("cost", {}).get("output_tokens", 0)
        for d in digests
    )
    p90 = totals[int(0.9 * (len(totals) - 1))] if totals else 0
    return {
        "tokens_p90": p90, "retry_storm_threshold": 3, "error_threshold": 3,
        # tool-mix / infra-overhead thresholds (WP-0005 T02)
        "infra_min_calls": 20, "infra_overhead_threshold": 0.30,
        "schema_thrash_threshold": 5, "tool_thrash_threshold": 80,
    }
 def extract_signals(digests: list[dict], ctx: Optional[dict] = None) -> list[Signal]:
    ctx = ctx or build_context(digests)
    out: list[Signal] = []
    for d in digests:
        for ex in EXTRACTORS:
            out.extend(ex(d, ctx))
    return out
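A minimal sketch of how the `tokens_p90` statistic from `build_context` behaves, assuming digests shaped like the ones above (a `cost` dict holding `input_tokens` and `output_tokens`). The helper name `tokens_p90` and the sample data are hypothetical; only the indexing expression is taken from the source.

```python
def tokens_p90(digests: list[dict]) -> int:
    """Total tokens at the floor-indexed 90th-percentile position."""
    totals = sorted(
        d.get("cost", {}).get("input_tokens", 0) + d.get("cost", {}).get("output_tokens", 0)
        for d in digests
    )
    # Same expression as build_context: floor(0.9 * (n - 1)), or 0 for an empty corpus.
    return totals[int(0.9 * (len(totals) - 1))] if totals else 0


# Eleven hypothetical sessions whose totals are 100, 200, ..., 1100 tokens.
sample = [{"cost": {"input_tokens": 60 * i, "output_tokens": 40 * i}} for i in range(1, 12)]
print(tokens_p90(sample))
print(tokens_p90([]))
```

Because the index is floored, small corpora pick a value at or below the true 90th percentile; a digest with no `cost` entry simply contributes a total of 0.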
--- a/session_memory/digest_lookup.py
+++ b/session_memory/digest_lookup.py
@@ -0,0 +1,76 @@
 """Read a single session digest from the local store (AGENTIC-WP-0011 T03).
 Thin read path for ``kaizen-agentic metrics correlate`` and other consumers.
 Does not run ingest.
 Usage:
    python -m session_memory.digest_lookup <session_uid> [--json]
    HELIX_STORE_DB=/abs/path/to/mem.db python -m session_memory.digest_lookup <uid>
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 import sys
 from .core.store import Store
 from .ingest import _expand, load_config
 def resolve_store_paths(*, config_path: str | None = None) -> tuple[str, str]:
    """Resolve db + blob paths from HELIX_STORE_DB or config.toml [store]."""
    env_db = os.environ.get("HELIX_STORE_DB")
    if env_db:
        db_path = _expand(env_db)
        blob_dir = os.path.join(os.path.dirname(db_path), "blobs")
        return db_path, blob_dir
    here = os.path.dirname(os.path.abspath(__file__))
    cfg_path = config_path or os.path.join(here, "config.toml")
    store_cfg = load_config(cfg_path).get("store", {})
    return _expand(store_cfg.get("db_path", "session_memory/.store/mem.db")), _expand(
        store_cfg.get("blob_dir", "session_memory/.store/blobs")
    )
 def lookup_digest(session_uid: str, *, config_path: str | None = None) -> dict | None:
    db_path, blob_dir = resolve_store_paths(config_path=config_path)
    store = Store(db_path, blob_dir)
    try:
        return store.get_digest(session_uid)
    finally:
        store.close()
 def main(argv: list[str] | None = None) -> int:
    here = os.path.dirname(os.path.abspath(__file__))
    ap = argparse.ArgumentParser(
        description="Read one session digest from the Helix Forge store (no ingest)."
    )
    ap.add_argument("session_uid", help="Normalized session uid, e.g. claude:abc-123")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"),
                    help="config.toml when HELIX_STORE_DB is unset")
    ap.add_argument("--json", action="store_true", help="print digest JSON to stdout")
    args = ap.parse_args(argv)
    digest = lookup_digest(args.session_uid, config_path=args.config)
    if digest is None:
        print(f"digest not found: {args.session_uid}", file=sys.stderr)
        return 1
    if args.json:
        print(json.dumps(digest, indent=2, sort_keys=True))
    else:
        cost = digest.get("cost") or {}
        tokens = cost.get("input_tokens", 0) + cost.get("output_tokens", 0)
        print(f"session_uid: {digest.get('session_uid')}")
        print(f"repo: {digest.get('repo')}  flavor: {digest.get('flavor')}")
        print(f"outcome: {digest.get('outcome')}  tokens: {tokens}")
        print(f"started_at: {digest.get('started_at')}  ended_at: {digest.get('ended_at')}")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/distribute/__init__.py
+++ b/session_memory/distribute/__init__.py
@@ -0,0 +1,9 @@
 """Distribute phase (PRD §6.4) — render approved Solution Patterns into per-flavor
 artifacts. Mirror of the collector design: agnostic core, thin distributor edges.
    base.py      Artifact + Distributor protocol + idempotent snippet markers (T01)
    claude.py    CLAUDE.md snippet distributor (T02)
    codex.py     AGENTS.md snippet distributor (T03)
    grok.py      native instruction distributor (T03)
    __main__.py  `python -m session_memory.distribute` (T05)
 """
--- a/session_memory/distribute/__main__.py
+++ b/session_memory/distribute/__main__.py
@@ -0,0 +1,89 @@
 """Distribute entrypoint (T05): catalog -> per-flavor proposals (HITL).
    python -m session_memory.distribute [--config PATH] [--repo R] [--flavor F] [--json]
 Reads approved / distribution-ready Solution Patterns from the Pattern Catalog and
 renders them into per-flavor **proposals** (never auto-applied) scoped by
 repo/domain, recording what is proposed where in the active-pattern registry.
 Targets are the repo->domain map in ``config.toml`` crossed with the known
 distributor flavors; each pattern's own ``Scope`` filters where it actually lands.
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..curate.catalog import Catalog
 from ..ingest import _expand, load_config
 from .proposals import ActiveRegistry, Target, propose
 from .registry import all_flavors
 def build_targets(config: dict, repo_filter=None, flavor_filter=None) -> list[Target]:
    repo_map = config.get("repo_domain_map", {})
    flavors = [flavor_filter] if flavor_filter else all_flavors()
    targets = []
    for repo, domain in repo_map.items():
        if repo_filter and repo != repo_filter:
            continue
        for flavor in flavors:
            targets.append(Target(repo=repo, domain=domain, flavor=flavor))
    return targets
 def run_distribute(config: dict, *, repo_filter=None, flavor_filter=None):
    cur = config.get("curate", {})
    dist = config.get("distribute", {})
    catalog = Catalog(_expand(cur.get("catalog_dir", "session_memory/catalog")))
    patterns = catalog.list()
    targets = build_targets(config, repo_filter, flavor_filter)
    registry = ActiveRegistry(_expand(dist.get("active_registry",
                                               "session_memory/distribute/active_patterns.json")))
    out_dir = _expand(dist.get("proposals_dir", "session_memory/proposals"))
    return propose(patterns, targets, out_dir, registry)
 def _summary(res) -> str:
    by_repo = {}
    for repo, flavor, pid, _ in res.proposals:
        by_repo.setdefault(repo, []).append(f"{pid}[{flavor}]")
    lines = [f"# Distribute proposals  ({len(res.proposals)} renders, "
             f"{len(res.files_written)} files)"]
    for repo in sorted(by_repo):
        lines.append(f"  {repo}: {', '.join(sorted(by_repo[repo]))}")
    if res.skipped_not_distributable:
        lines.append(f"  skipped (not distribution-ready): "
                     f"{len(set(res.skipped_not_distributable))} pattern(s)")
    if not res.proposals:
        lines.append("  (no approved/distribution-ready patterns matched any target)")
    return "\n".join(lines)
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Distribute approved patterns as per-flavor proposals.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--repo", default=None, help="limit to one target repo")
    ap.add_argument("--flavor", default=None, help="limit to one flavor")
    ap.add_argument("--json", action="store_true")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    res = run_distribute(config, repo_filter=args.repo, flavor_filter=args.flavor)
    if args.json:
        print(json.dumps({
            "proposals": [{"repo": r, "flavor": f, "pattern_id": p, "path": path}
                          for r, f, p, path in res.proposals],
            "files_written": res.files_written,
            "skipped": sorted(set(res.skipped_not_distributable)),
        }, indent=2))
    else:
        print(_summary(res))
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/distribute/active_patterns.json
+++ b/session_memory/distribute/active_patterns.json
@@ -0,0 +1,242 @@
 [
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "net-kingdom",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "net-kingdom",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "net-kingdom",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-schema_thrash-schema_load",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-tool_thrash-tool-bash",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  }
 ]
--- a/session_memory/distribute/base.py
+++ b/session_memory/distribute/base.py
@@ -0,0 +1,115 @@
 """Distributor base — Artifact, the Distributor protocol, and idempotent markers
 (PRD §6.4 FR-X1; T01).
 A **distributor** turns one agnostic :class:`SolutionPattern` into a per-flavor
 :class:`Artifact` (a target path + a snippet of content). Everything flavor-neutral
 lives here; each flavor adapter (T02/T03) only supplies its target filename and may
 override the rendered body using the pattern's ``rendering_hints``.
 Snippets carry stable ``BEGIN/END`` markers keyed on the pattern id, so
 re-distributing a pattern **updates its block in place** instead of duplicating it
 — the property that lets Distribute run repeatedly (HITL) without drift.
 """
 from __future__ import annotations
 import re
 from dataclasses import dataclass
 from typing import Any, Optional, Protocol, runtime_checkable
 from ..curate.schema import SolutionPattern
@dataclass
 class Artifact:
    """A proposed per-flavor rendering of a pattern (FR-X1/FR-X3 — proposed, not applied)."""
    flavor: str
    target_path: str        # repo-relative file the snippet belongs in (e.g. "CLAUDE.md")
    pattern_id: str
    content: str            # the marker-wrapped snippet block
@runtime_checkable
 class Distributor(Protocol):
    flavor: str
    target_path: str
    def render(self, pattern: SolutionPattern) -> Artifact: ...
 # --- idempotent snippet markers ---------------------------------------------
 _MARK = "helix-forge pattern"
 def begin_marker(pattern_id: str) -> str:
    return f"<!-- BEGIN {_MARK}:{pattern_id} -->"
 def end_marker(pattern_id: str) -> str:
    return f"<!-- END {_MARK}:{pattern_id} -->"
 def wrap_block(pattern_id: str, body: str, version: str = "") -> str:
    """Wrap a rendered body in stable BEGIN/END markers."""
    ver = f" v{version}" if version else ""
    return f"{begin_marker(pattern_id)}{ver}\n{body.strip()}\n{end_marker(pattern_id)}"
 def upsert_block(doc_text: str, pattern_id: str, block: str) -> str:
    """Insert or replace a pattern's marked block within a document (idempotent)."""
    pat = re.compile(
        re.escape(begin_marker(pattern_id)) + r".*?" + re.escape(end_marker(pattern_id)),
        re.DOTALL,
    )
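    if pat.search(doc_text):
        # Callable replacement: a plain string would have its backslashes and
        # group references (e.g. "\1", "\g<0>") interpreted by re.sub.
        return pat.sub(lambda _m: block, doc_text)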
    sep = "" if doc_text.endswith("\n\n") or not doc_text else "\n\n"
    return f"{doc_text}{sep}{block}\n"
 # --- agnostic body rendering ------------------------------------------------
 def render_markdown_body(pattern: SolutionPattern) -> str:
    """Default flavor-neutral snippet body from the agnostic pattern fields."""
    label = "Avoid" if pattern.polarity == "problem" else "Prefer"
    lines = [f"### {pattern.name}", "", pattern.problem.strip(), ""]
    if pattern.resolutions:
        lines.append(f"**{label}:**")
        for r in pattern.resolutions:
            detail = f" — {r.detail}" if r.detail else ""
            lines.append(f"- {r.summary}{detail}")
            for step in r.steps:
                lines.append(f"  - {step}")
    return "\n".join(lines).strip()
 def hint(pattern: SolutionPattern, flavor: str, key: str, default: Any = None) -> Any:
    """Read a per-flavor rendering hint, falling back to ``default``."""
    return (pattern.rendering_hints.get(flavor) or {}).get(key, default)
 class BaseDistributor:
    """Shared distributor: renders the agnostic body, honouring a ``body`` hint
    override and a ``target`` hint, then wraps it in idempotent markers."""
    flavor: str = ""
    target_path: str = ""
    def __init__(self, flavor: Optional[str] = None, target_path: Optional[str] = None) -> None:
        if flavor is not None:
            self.flavor = flavor
        if target_path is not None:
            self.target_path = target_path
    def body(self, pattern: SolutionPattern) -> str:
        return hint(pattern, self.flavor, "body") or render_markdown_body(pattern)
    def target(self, pattern: SolutionPattern) -> str:
        return hint(pattern, self.flavor, "target") or self.target_path
    def render(self, pattern: SolutionPattern) -> Artifact:
        block = wrap_block(pattern.id, self.body(pattern), pattern.version)
        return Artifact(flavor=self.flavor, target_path=self.target(pattern),
                        pattern_id=pattern.id, content=block)
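The in-place update property described in the `base.py` docstring can be sketched as below. This is a self-contained illustration, not the module itself: the marker helpers mirror the diff above, the pattern id `sp-demo` and the snippet bodies are made up, and the sketch passes a callable to `sub` so that backslashes in a block stay literal.

```python
import re

MARK = "helix-forge pattern"  # same marker text as base.py


def begin_marker(pattern_id: str) -> str:
    return f"<!-- BEGIN {MARK}:{pattern_id} -->"


def end_marker(pattern_id: str) -> str:
    return f"<!-- END {MARK}:{pattern_id} -->"


def wrap_block(pattern_id: str, body: str, version: str = "") -> str:
    ver = f" v{version}" if version else ""
    return f"{begin_marker(pattern_id)}{ver}\n{body.strip()}\n{end_marker(pattern_id)}"


def upsert_block(doc_text: str, pattern_id: str, block: str) -> str:
    pat = re.compile(
        re.escape(begin_marker(pattern_id)) + r".*?" + re.escape(end_marker(pattern_id)),
        re.DOTALL,
    )
    if pat.search(doc_text):
        # Replace the existing block; the callable keeps the block text literal.
        return pat.sub(lambda _m: block, doc_text)
    sep = "" if doc_text.endswith("\n\n") or not doc_text else "\n\n"
    return f"{doc_text}{sep}{block}\n"


# Distribute the same (hypothetical) pattern twice, with a new version the second time.
doc = "# Project notes\n"
doc = upsert_block(doc, "sp-demo", wrap_block("sp-demo", "Read a file before editing it.", "1.0.0"))
doc = upsert_block(doc, "sp-demo", wrap_block("sp-demo", "Read the file first, then edit.", "1.0.1"))
print(doc)
```

After the second call the document still holds a single `sp-demo` block, now carrying the `v1.0.1` body; the version suffix sits between the markers, so it is replaced along with the body.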
--- a/session_memory/distribute/claude.py
+++ b/session_memory/distribute/claude.py
@@ -0,0 +1,42 @@
 """Claude distributor (PRD §6.4 FR-X1; T02).
 Renders an approved Solution Pattern into a ``CLAUDE.md`` snippet block. Most logic
 is inherited from :class:`BaseDistributor`; the Claude-specific touch is an
 optional **skill** rendering mode (``rendering_hints["claude"]["as"] == "skill"``)
 that emits a skill-style stub instead of a plain instruction snippet — Claude's
 native distribution targets are CLAUDE.md snippets, skills, or hooks.
 """
 from __future__ import annotations
 from ..curate.schema import SolutionPattern
 from .base import BaseDistributor, hint, render_markdown_body
 class ClaudeDistributor(BaseDistributor):
    flavor = "claude"
    target_path = "CLAUDE.md"
    def body(self, pattern: SolutionPattern) -> str:
        override = hint(pattern, self.flavor, "body")
        if override:
            return override
        if hint(pattern, self.flavor, "as") == "skill":
            return self._skill_stub(pattern)
        return render_markdown_body(pattern)
    @staticmethod
    def _skill_stub(pattern: SolutionPattern) -> str:
        trigger = "avoid" if pattern.polarity == "problem" else "apply"
        lines = [
            f"## Skill: {pattern.name}",
            "",
            f"**When:** situations where you would {trigger} — {pattern.problem.strip()}",
            "",
            "**Steps:**",
        ]
        for r in pattern.resolutions:
            lines.append(f"- {r.summary}" + (f" — {r.detail}" if r.detail else ""))
            for step in r.steps:
                lines.append(f"  - {step}")
        return "\n".join(lines).strip()
--- a/session_memory/distribute/codex.py
+++ b/session_memory/distribute/codex.py
@@ -0,0 +1,15 @@
 """Codex distributor (PRD §6.4 FR-X1; T03).
 Renders an approved Solution Pattern into an ``AGENTS.md`` snippet — Codex's native
 repo-convention surface. Identical agnostic body to the other flavors (FR-A3: one
 pattern, expressible everywhere); only the target file differs.
 """
 from __future__ import annotations
 from .base import BaseDistributor
 class CodexDistributor(BaseDistributor):
    flavor = "codex"
    target_path = "AGENTS.md"
--- a/session_memory/distribute/grok.py
+++ b/session_memory/distribute/grok.py
@@ -0,0 +1,15 @@
 """Grok distributor (PRD §6.4 FR-X1; T03).
 Renders an approved Solution Pattern into Grok's native instruction format. Defaults
 to a ``.grok/instructions.md`` snippet; the same agnostic body as the other flavors
 (FR-A3), overridable via ``rendering_hints["grok"]``.
 """
 from __future__ import annotations
 from .base import BaseDistributor
 class GrokDistributor(BaseDistributor):
    flavor = "grok"
    target_path = ".grok/instructions.md"
--- a/session_memory/distribute/proposals.py
+++ b/session_memory/distribute/proposals.py
@@ -0,0 +1,136 @@
 """Scoping, proposed-not-applied output, and the active-pattern registry
 (PRD §6.4 FR-X2/FR-X3/FR-X4; T04).
 * **Scope (FR-X2):** a pattern lands in a target environment only if the target's
  repo/domain/flavor are within the pattern's :class:`Scope` (an empty scope list
  means "unrestricted on that axis").
 * **Proposed, not applied (FR-X3):** rendered artifacts are written under a
  ``proposals/`` tree mirroring the target path — a reviewable diff a human applies,
  never auto-written into the live file. Re-running upserts each pattern's block in
  place (idempotent), so proposals don't accumulate duplicates.
 * **Active-pattern registry (FR-X4):** a JSON record of which pattern (and version)
  is proposed/active in which (repo, flavor) environment.
 """
 from __future__ import annotations
 import json
 import os
 from dataclasses import dataclass
 from datetime import datetime, timezone
 from ..curate.schema import SolutionPattern
 from .base import upsert_block
 from .registry import get_distributor
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
@dataclass(frozen=True)
 class Target:
    """An environment a pattern could be distributed to."""
    repo: str
    domain: str = ""
    flavor: str = "claude"
 def applies(pattern: SolutionPattern, target: Target) -> bool:
    """True if ``target`` is within the pattern's scope (empty axis == any)."""
    sc = pattern.scope
    if sc.repos and target.repo not in sc.repos:
        return False
    if sc.domains and target.domain and target.domain not in sc.domains:
        return False
    if sc.flavors and target.flavor not in sc.flavors:
        return False
    return True
 def is_distributable(pattern: SolutionPattern) -> bool:
    return pattern.status == "approved" and pattern.distribution_ready
 class ActiveRegistry:
    """JSON record of patterns proposed/active per (repo, flavor) — FR-X4."""
    def __init__(self, path: str) -> None:
        self.path = path
        self._entries: dict[str, dict] = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                for e in json.load(fh):
                    self._entries[self._key(e["pattern_id"], e["repo"], e["flavor"])] = e
    @staticmethod
    def _key(pid: str, repo: str, flavor: str) -> str:
        return f"{pid}|{repo}|{flavor}"
    def record(self, pid: str, repo: str, flavor: str, version: str,
               status: str = "proposed") -> None:
        self._entries[self._key(pid, repo, flavor)] = {
            "pattern_id": pid, "repo": repo, "flavor": flavor,
            "version": version, "status": status, "updated_at": _now(),
        }
    def entries(self) -> list[dict]:
        return [self._entries[k] for k in sorted(self._entries)]
    def save(self) -> None:
        os.makedirs(os.path.dirname(self.path) or ".", exist_ok=True)
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.entries(), fh, indent=2, sort_keys=True)
            fh.write("\n")
@dataclass
 class ProposalResult:
    proposals: list = None        # (repo, flavor, pattern_id, proposal_path)
    files_written: list = None    # absolute proposal paths
    skipped_not_distributable: list = None  # pattern ids
    def __post_init__(self):
        self.proposals = self.proposals or []
        self.files_written = self.files_written or []
        self.skipped_not_distributable = self.skipped_not_distributable or []
 def propose(patterns: list[SolutionPattern], targets: list[Target], out_dir: str,
            registry: ActiveRegistry) -> ProposalResult:
    """Render in-scope, distributable patterns into per-target proposal files."""
    result = ProposalResult()
    pending: dict[str, str] = {}  # proposal path -> accumulated content
    for p in patterns:
        if not is_distributable(p):
            result.skipped_not_distributable.append(p.id)
            continue
        for t in targets:
            dist = get_distributor(t.flavor)
            if dist is None or not applies(p, t):
                continue
            art = dist.render(p)
            path = os.path.join(out_dir, t.repo, art.target_path)
            if path not in pending:
                pending[path] = _read(path)
            pending[path] = upsert_block(pending[path], p.id, art.content)
            registry.record(p.id, t.repo, t.flavor, p.version)
            result.proposals.append((t.repo, t.flavor, p.id, path))
    for path, content in pending.items():
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(content if content.endswith("\n") else content + "\n")
        result.files_written.append(path)
    registry.save()
    return result
 def _read(path: str) -> str:
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    return ""
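The scope rule from the `proposals.py` docstring (an empty axis means "unrestricted") can be sketched as follows. `Scope` here is a hypothetical stand-in for the real class in `curate/schema.py`, assumed only to expose `repos`, `domains` and `flavors` lists as `applies()` uses them; `in_scope` is an illustrative name that repeats the same three checks.

```python
from dataclasses import dataclass, field


@dataclass
class Scope:
    """Hypothetical stand-in for the real Scope (assumed list-valued axes)."""
    repos: list[str] = field(default_factory=list)
    domains: list[str] = field(default_factory=list)
    flavors: list[str] = field(default_factory=list)


@dataclass(frozen=True)
class Target:
    repo: str
    domain: str = ""
    flavor: str = "claude"


def in_scope(scope: Scope, target: Target) -> bool:
    """Same axis checks as applies(): an empty list means 'any' on that axis."""
    if scope.repos and target.repo not in scope.repos:
        return False
    if scope.domains and target.domain and target.domain not in scope.domains:
        return False
    if scope.flavors and target.flavor not in scope.flavors:
        return False
    return True


narrow = Scope(repos=["state-hub"], flavors=["claude"])
print(in_scope(narrow, Target("state-hub")))                  # repo and flavor both listed
print(in_scope(narrow, Target("state-hub", flavor="grok")))   # flavor not listed
print(in_scope(Scope(), Target("ops-bridge", flavor="codex")))  # empty scope: unrestricted
print(in_scope(Scope(domains=["infra"]), Target("ops-bridge")))  # target has no domain
```

Note the last case: as written in the source, a target whose `domain` is empty passes a domain-restricted scope, because the domain check only runs when the target declares a domain.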
--- a/session_memory/distribute/registry.py
+++ b/session_memory/distribute/registry.py
@@ -0,0 +1,26 @@
 """Distributor registry (T03) — flavor -> distributor, the one place that knows
 about all flavor edges. Adding a flavor = one entry here + one adapter module.
 """
 from __future__ import annotations
 from typing import Optional
 from .base import BaseDistributor
 from .claude import ClaudeDistributor
 from .codex import CodexDistributor
 from .grok import GrokDistributor
 _REGISTRY: dict[str, BaseDistributor] = {
    "claude": ClaudeDistributor(),
    "codex": CodexDistributor(),
    "grok": GrokDistributor(),
 }
 def get_distributor(flavor: str) -> Optional[BaseDistributor]:
    return _REGISTRY.get(flavor)
 def all_flavors() -> list[str]:
    return list(_REGISTRY)
--- a/session_memory/ingest.py
+++ b/session_memory/ingest.py
@@ -0,0 +1,134 @@
 """Session-memory sweep entrypoint (design §7; T06).
 One sweep: discover (per enabled source) -> normalize (adapter) -> store ->
 digest -> retention-evict. Idempotent and re-runnable; intended to be triggered
 on the configured cadence (``/schedule`` daily/weekly) or by an agent hook.
 Usage:
    python -m session_memory.ingest [--config PATH] [--once] [--dry-run]
 """
 from __future__ import annotations
 import argparse
 import glob
 import os
 import sys
 import tomllib
 from dataclasses import dataclass, field
 from typing import Any
 from .adapters import claude as claude_adapter
 from .adapters import codex as codex_adapter
 from .adapters import grok as grok_adapter
 from .core import digest as digest_mod
 from .core.cursor import Cursors
 from .core.retention import RetentionConfig, sweep as retention_sweep
 from .core.store import Store
 # adapter dispatch by source name
 _ADAPTERS = {
    "claude": claude_adapter.parse_session,
    "codex": codex_adapter.parse_session,
    "grok": grok_adapter.parse_session,
 }
@dataclass
 class SweepResult:
    discovered: int = 0
    ingested: int = 0
    skipped_unchanged: int = 0
    analyzed: int = 0
    warnings: list[str] = field(default_factory=list)
    retention: Any = None
 def _expand(p: str) -> str:
    return os.path.expanduser(p)
 def load_config(path: str) -> dict[str, Any]:
    with open(path, "rb") as f:
        return tomllib.load(f)
 def run_sweep(config: dict[str, Any], *, dry_run: bool = False) -> SweepResult:
    store_cfg = config.get("store", {})
    ret_cfg = config.get("retention", {})
    repo_map = config.get("repo_domain_map", {})
    res = SweepResult()
    # In dry-run we only discover + parse: no store is created or written.
    store = None if dry_run else Store(_expand(store_cfg["db_path"]), _expand(store_cfg["blob_dir"]))
    cursors = Cursors(_expand(store_cfg["cursor"]))
    for name, src in config.get("sources", {}).items():
        if not src.get("enabled"):
            continue
        parse = _ADAPTERS.get(name)
        if parse is None:
            res.warnings.append(f"no adapter for source {name!r} (Phase 1)")
            continue
        root = _expand(src["root"])
        for fp in sorted(glob.glob(os.path.join(root, src["glob"]))):
            res.discovered += 1
            if not cursors.is_changed(fp):
                res.skipped_unchanged += 1
                continue
            try:
                bundle = parse(fp, repo_map)
            except Exception as e:  # one bad file must not abort the sweep
                res.warnings.append(f"parse failed {fp}: {e}")
                continue
            if bundle is None:
                cursors.mark(fp)
                continue
            if not dry_run:
                store.ingest(bundle)
                digest_mod.analyze(store, bundle.session.session_uid)
                res.analyzed += 1
            res.ingested += 1
            cursors.mark(fp)
    if not dry_run and store is not None:
        cursors.save()
        rc = RetentionConfig(
            raw_soft_cap_bytes=int(ret_cfg.get("raw_soft_cap_bytes", RetentionConfig.raw_soft_cap_bytes)),
            raw_hard_cap_bytes=int(ret_cfg.get("raw_hard_cap_bytes", RetentionConfig.raw_hard_cap_bytes)),
            raw_max_age_days=int(ret_cfg.get("raw_max_age_days", RetentionConfig.raw_max_age_days)),
            distilled_cap_bytes=int(ret_cfg.get("distilled_cap_bytes", RetentionConfig.distilled_cap_bytes)),
        )
        res.retention = retention_sweep(store, rc, analyze_fn=digest_mod.analyze)
        res.warnings.extend(res.retention.warnings)
    if store is not None:
        store.close()
    return res
 def main(argv: list[str] | None = None) -> int:
    here = os.path.dirname(os.path.abspath(__file__))
    ap = argparse.ArgumentParser(description="Run one coding-session-memory sweep.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--dry-run", action="store_true", help="discover + parse, but do not write or evict")
    ap.add_argument("--once", action="store_true", help="(default) run a single sweep")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    res = run_sweep(config, dry_run=args.dry_run)
    print(f"discovered={res.discovered} ingested={res.ingested} "
          f"skipped_unchanged={res.skipped_unchanged} analyzed={res.analyzed}")
    if res.retention is not None:
        r = res.retention
        print(f"retention: freed={r.bytes_freed}B final_usage={r.final_usage_bytes}B "
              f"backstop={len(r.backstop_evicted)} budget={len(r.budget_evicted)} "
              f"overflow_analyzed={len(r.overflow_analyzed)} data_loss={len(r.overflow_data_loss)}")
    for w in res.warnings:
        print(f"  WARN: {w}", file=sys.stderr)
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/measure/__init__.py
+++ b/session_memory/measure/__init__.py
@@ -0,0 +1,9 @@
 """Measure phase (PRD §6.5) — the loop-closer.
    metrics.py    fleet metrics + persisted baseline snapshots (T01)
    effect.py     before/after per-pattern effectiveness (T02)
    __main__.py   python -m session_memory.measure (T03)
 Computation over existing digests (reusing WP-0005 tool buckets + WP-0006 error
 mining); no new capture.
 """
--- a/session_memory/measure/__main__.py
+++ b/session_memory/measure/__main__.py
@@ -0,0 +1,101 @@
 """Measure entrypoint (T03): fleet trend + per-pattern effectiveness.
    python -m session_memory.measure [--config PATH] [--label L] [--since DATE]
                                     [--no-save] [--json]
 Computes current fleet metrics over the real (quality-filtered) sessions, appends
 them to the baseline trend, and reports whether the fleet is getting cheaper /
 more reliable over time (FR-M3). With ``--since DATE`` it also reports before/after
 effectiveness around a change (FR-M1/FR-M2).
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..core.store import Store
 from ..detect.quality import filter_real, quality_config
 from ..ingest import _expand, load_config
 from .effect import effectiveness
 from .metrics import load_baselines, save_baseline, snapshot
 _TREND_KEYS = ("infra_overhead_share_median", "error_rate", "schema_thrash_sessions",
               "tokens_p50", "success_rate")
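 def real_digests(config: dict) -> list[dict]:
    """Load all digests from the store and keep only the real (quality-filtered) sessions."""
    s = config.get("store", {})
    store = Store(_expand(s["db_path"]), _expand(s["blob_dir"]))
    try:
        return filter_real(store.list_digests(), quality_config(config))
    finally:
        store.close()  # release the store even if listing or filtering raises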
 def _fmt_trend(baselines: list[dict]) -> str:
    if not baselines:
        return "  (no prior snapshots)"
    lines = []
    recent = baselines[-5:]
    for b in recent:
        when = (b.get("captured_at") or "")[:10]
        lbl = f" {b['label']}" if b.get("label") else ""
        lines.append(f"  {when}{lbl}: overhead_med={b.get('infra_overhead_share_median')} "
                     f"err_rate={b.get('error_rate')} schema_thrash={b.get('schema_thrash_sessions')} "
                     f"tok_p50={b.get('tokens_p50')} success={b.get('success_rate')} "
                     f"(n={b.get('n_sessions')})")
    return "\n".join(lines)
 def _report(current: dict, baselines: list[dict], eff: dict | None) -> str:
    lines = [f"# Fleet metrics  (n={current.get('n_sessions')} real sessions)"]
    for k in _TREND_KEYS:
        lines.append(f"  {k} = {current.get(k)}")
    lines.append("\n## Trend (recent snapshots)")
    lines.append(_fmt_trend(baselines))
    if eff is not None:
        lines.append(f"\n## Effectiveness since {eff['applied_at']} "
                     f"(before={eff['n_before']}, after={eff['n_after']})")
        if eff["insufficient_data"]:
            lines.append("  insufficient data on one side of the date")
        else:
            for k in _TREND_KEYS:
                d = eff["deltas"].get(k, {})
                mark = {True: "improved", False: "worse", None: "—"}[d.get("improved")]
                lines.append(f"  {k}: {d.get('before')} -> {d.get('after')} "
                             f"({d.get('change'):+}) {mark}")
    return "\n".join(lines)
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Measure fleet metrics + per-pattern effectiveness.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--label", default="")
    ap.add_argument("--since", default=None, help="ISO date for before/after effectiveness")
    ap.add_argument("--no-save", action="store_true", help="don't append to the baseline trend")
    ap.add_argument("--json", action="store_true")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    digests = real_digests(config)
    current = snapshot(digests, label=args.label)
    path = _expand(config.get("measure", {}).get("baselines", "session_memory/measure/baselines.jsonl"))
    prior = load_baselines(path)
    if not args.no_save:
        save_baseline(current, path)
    eff = effectiveness(digests, args.since, label=args.label) if args.since else None
    if args.json:
        print(json.dumps({"current": current, "trend": prior + [current], "effectiveness": eff},
                         indent=2))
    else:
        print(_report(current, prior + [current], eff))
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/measure/baselines.jsonl
+++ b/session_memory/measure/baselines.jsonl
@@ -0,0 +1 @@
 {"captured_at": "2026-06-07T13:30:14Z", "error_rate": 0.963, "infra_overhead_share_median": 0.117, "infra_overhead_share_p90": 0.261, "label": "phase4-baseline (pre-fixes)", "n_sessions": 27, "recurring_error_occurrences": 505, "schema_thrash_sessions": 8, "success_rate": 1.0, "tokens_p50": 250725, "tokens_p90": 1423966}
--- a/session_memory/measure/effect.py
+++ b/session_memory/measure/effect.py
@@ -0,0 +1,60 @@
 """Before/after per-pattern effectiveness (PRD §6.5 FR-M1/FR-M2; T02).
 Given a change/pattern with an ``applied_at`` date, split sessions into *before*
 and *after* by their start time, aggregate each side, and diff the headline
 metrics — so we can say whether a distributed pattern (e.g. the Read-before-Edit
 reflex, or the State Hub skill) actually moved the numbers, and retire it if not.
 """
 from __future__ import annotations
 from .metrics import aggregate
 # Metrics where a *lower* value after the change means improvement.
 _LOWER_IS_BETTER = {
    "infra_overhead_share_median", "infra_overhead_share_p90", "error_rate",
    "recurring_error_occurrences", "schema_thrash_sessions", "tokens_p50", "tokens_p90",
 }
 # Metrics where a *higher* value is improvement.
 _HIGHER_IS_BETTER = {"success_rate"}
 def split_by_date(digests: list[dict], applied_at: str) -> tuple[list[dict], list[dict]]:
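    """Partition digests into (before, after) by ``started_at`` vs ``applied_at``.
    Timestamps are compared as ISO-8601 strings, so both sides must share one
    format and timezone. A digest with no ``started_at`` is counted as *before*.
    """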
    before, after = [], []
    for d in digests:
        ts = d.get("started_at") or ""
        (after if ts and ts >= applied_at else before).append(d)
    return before, after
 def _delta(metric: str, before: float, after: float) -> dict:
    change = round(after - before, 3)
    if change == 0:
        improved = None  # no movement: report neither "improved" nor "worse"
    elif metric in _LOWER_IS_BETTER:
        improved = change < 0
    elif metric in _HIGHER_IS_BETTER:
        improved = change > 0
    else:
        improved = None
    return {"before": before, "after": after, "change": change, "improved": improved}
 def effectiveness(digests: list[dict], applied_at: str, *, label: str = "") -> dict:
    """Compare fleet metrics after ``applied_at`` against the prior period."""
    before, after = split_by_date(digests, applied_at)
    b_agg, a_agg = aggregate(before), aggregate(after)
    metrics = (_LOWER_IS_BETTER | _HIGHER_IS_BETTER)
    deltas = {}
    if before and after:
        for m in metrics:
            deltas[m] = _delta(m, b_agg.get(m, 0.0), a_agg.get(m, 0.0))
    return {
        "label": label,
        "applied_at": applied_at,
        "n_before": len(before),
        "n_after": len(after),
        "before": b_agg,
        "after": a_agg,
        "deltas": deltas,
        "insufficient_data": not (before and after),
    }
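The before/after split and the improvement direction described in the effect.py docstring can be sketched in isolation. This is a minimal stand-alone illustration, not the module itself: it imports nothing from `session_memory`, uses only a subset of the metric names, and the `direction` helper (including its choice to treat a zero change as undecided) is an assumption of the sketch.

```python
# Stand-alone sketch; mirrors the shape of split_by_date but imports nothing
# from the package. Metric sets are a subset chosen for illustration.
LOWER_IS_BETTER = {"error_rate", "tokens_p50"}
HIGHER_IS_BETTER = {"success_rate"}


def split_by_date(digests, applied_at):
    """Partition digests into (before, after) by comparing ISO-8601 strings."""
    before, after = [], []
    for d in digests:
        ts = d.get("started_at") or ""
        # A missing timestamp falls into "before". ISO strings sort
        # chronologically only if they share one format and timezone.
        (after if ts and ts >= applied_at else before).append(d)
    return before, after


def direction(metric, change):
    """True = improved, False = worse, None = undecided (sketch assumption)."""
    if change == 0:
        return None
    if metric in LOWER_IS_BETTER:
        return change < 0
    if metric in HIGHER_IS_BETTER:
        return change > 0
    return None


digests = [
    {"started_at": "2026-06-01T09:00:00Z"},
    {"started_at": "2026-06-07T13:30:14Z"},
    {"started_at": None},
]
before, after = split_by_date(digests, "2026-06-07")
print(len(before), len(after))
print(direction("error_rate", -0.063), direction("success_rate", 0.0))
```

A date-only `applied_at` such as `2026-06-07` works here because any timestamp on that day is a longer string with the same prefix and therefore compares as greater or equal.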
--- a/session_memory/measure/metrics.py
+++ b/session_memory/measure/metrics.py
@@ -0,0 +1,102 @@
 """Fleet metrics + persisted baselines (PRD §6.5 FR-M3; T01).
 Computes the headline health metrics of the captured corpus — the same quantities
 the friction assessment reported — so they can be tracked over time and compared
 before/after a change. Reuses :func:`detect.signals.tool_bucket` (WP-0005) and the
 digest ``error_snippets`` (WP-0006); no new capture.
 A **baseline** is a timestamped metrics snapshot appended to a JSONL file, so
 successive runs build a trend the entrypoint (T03) can chart.
 """
 from __future__ import annotations
 import collections
 import json
 import os
 from datetime import datetime, timezone
 from ..detect.signals import tool_bucket
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _pct(values: list[float], q: float) -> float:
    """Floor-index percentile of ``values`` (no interpolation); 0.0 when empty."""
    if not values:
        return 0.0
    s = sorted(values)
    return round(s[int(q * (len(s) - 1))], 3)
 def _median(values: list[float]) -> float:
    return _pct(values, 0.5)
 def _buckets(digest: dict) -> collections.Counter:
    b: collections.Counter = collections.Counter()
    for tool, n in (digest.get("tool_histogram") or {}).items():
        b[tool_bucket(tool)] += n
    return b
 def session_metrics(digest: dict) -> dict:
    """Per-session metrics used to build fleet aggregates."""
    b = _buckets(digest)
    total = sum(b.values()) or 1
    overhead = b["statehub_mcp"] + b["task_mgmt"] + b["schema_load"]
    cost = digest.get("cost", {})
    tokens = cost.get("input_tokens", 0) + cost.get("output_tokens", 0)
    return {
        "infra_overhead_share": overhead / total,
        "tool_calls": total,
        "schema_load": b["schema_load"],
        "error_occurrences": sum(s.get("count", 1) for s in (digest.get("error_snippets") or [])),
        "has_error": bool(digest.get("error_snippets")),
        "tokens": tokens,
        "success": digest.get("outcome") == "success",
    }
 def aggregate(digests: list[dict], *, schema_thrash_threshold: int = 5) -> dict:
    """Fleet-level metrics over a set of (already quality-filtered) digests."""
    per = [session_metrics(d) for d in digests]
    n = len(per)
    if n == 0:
        return {"n_sessions": 0}
    shares = [m["infra_overhead_share"] for m in per]
    tokens = [m["tokens"] for m in per]
    return {
        "n_sessions": n,
        "infra_overhead_share_median": _median(shares),
        "infra_overhead_share_p90": _pct(shares, 0.9),
        "error_rate": round(sum(m["has_error"] for m in per) / n, 3),
        "recurring_error_occurrences": sum(m["error_occurrences"] for m in per),
        "schema_thrash_sessions": sum(1 for m in per if m["schema_load"] >= schema_thrash_threshold),
        "tokens_p50": _pct(tokens, 0.5),
        "tokens_p90": _pct(tokens, 0.9),
        "success_rate": round(sum(m["success"] for m in per) / n, 3),
    }
 def snapshot(digests: list[dict], *, label: str = "") -> dict:
    m = aggregate(digests)
    m["captured_at"] = _now()
    m["label"] = label
    return m
 def save_baseline(metrics: dict, path: str) -> None:
    """Append a metrics snapshot to the baseline JSONL trend file."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(metrics, sort_keys=True))
        fh.write("\n")
 def load_baselines(path: str) -> list[dict]:
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
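Two mechanics in metrics.py are easy to misread: the percentile rule, which picks an element by floored index rather than interpolating, and the baseline trend, which is a JSONL file that only ever grows by appending. The following is a hedged, self-contained sketch of both; the function names `pct`, `append_snapshot`, and `load_snapshots` are illustrative stand-ins, and the temporary path is hypothetical.

```python
import json
import os
import tempfile


def pct(values, q):
    """Floor-index percentile: sorted(values)[int(q * (n - 1))], no interpolation."""
    if not values:
        return 0.0
    s = sorted(values)
    return round(s[int(q * (len(s) - 1))], 3)


def append_snapshot(snapshot, path):
    """Append one snapshot as a single JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(snapshot, sort_keys=True) + "\n")


def load_snapshots(path):
    """Read every non-blank line back as a dict, oldest first."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


tokens = [40, 10, 30, 20]
median = pct(tokens, 0.5)  # index int(0.5 * 3) = 1, the lower middle value
p90 = pct(tokens, 0.9)     # index int(0.9 * 3) = 2

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "baselines.jsonl")  # hypothetical location
    append_snapshot({"label": "first", "tokens_p50": median}, path)
    append_snapshot({"label": "second", "tokens_p50": p90}, path)
    trend = load_snapshots(path)

print(median, p90, [t["label"] for t in trend])
```

With an even number of sessions the "median" under this rule is the lower of the two middle values, which is worth keeping in mind when comparing against tools that average them.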
--- a/session_memory/retro/__init__.py
+++ b/session_memory/retro/__init__.py
@@ -0,0 +1,9 @@
 """Weekly retro (AGENTIC-WP-0010) — the analysis half of the coding retrospection.
    build.py     windowed detect + measure -> ranked top-3 suggestions per repo (T01)
    publish.py   publish the retro to the hub read model + local report (T02)
    __main__.py  python -m session_memory.retro (T03)
 Consumed by activity-core's weekly-coding-retro schedule (ACTIVITY-WP-0008) via
 the ``event_type=coding_retro`` read model.
 """
--- a/session_memory/retro/__main__.py
+++ b/session_memory/retro/__main__.py
@@ -0,0 +1,68 @@
 """Weekly retro entrypoint (AGENTIC-WP-0010 T03).
    python -m session_memory.retro [--window-days 7] [--since D] [--until D]
                                   [--publish] [--json]
 Builds the windowed top-3-per-repo retro over the captured sessions, writes a local
 JSON + markdown report, and (with ``--publish``) posts it to the hub as the
 ``coding_retro`` read model that activity-core's weekly schedule consumes.
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..core.store import Store
 from ..curate.catalog import Catalog
 from ..ingest import _expand, load_config
 from .build import weekly_retro
 from .publish import publish_to_hub, render_markdown, write_local
 def run_retro(config: dict, *, window_days=None, since=None, until=None):
    s = config.get("store", {})
    store = Store(_expand(s["db_path"]), _expand(s["blob_dir"]))
    try:
        digests = store.list_digests()
    finally:
        store.close()  # release the store even if listing raises
    cur = config.get("curate", {})
    catalog = Catalog(_expand(cur.get("catalog_dir", "session_memory/catalog")))
    rcfg = config.get("retro", {})
    return weekly_retro(digests, catalog, since=since, until=until,
                        window_days=window_days or rcfg.get("window_days", 7))
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Build (and optionally publish) the weekly coding retro.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--window-days", type=int, default=None)
    ap.add_argument("--since", default=None)
    ap.add_argument("--until", default=None)
    ap.add_argument("--publish", action="store_true", help="post to the hub coding_retro read model")
    ap.add_argument("--json", action="store_true")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    report = run_retro(config, window_days=args.window_days, since=args.since, until=args.until)
    rcfg = config.get("retro", {})
    write_local(report, _expand(rcfg.get("report_json", "session_memory/retro/last_retro.json")),
                _expand(rcfg.get("report_md", "session_memory/retro/last_retro.md")))
    published = None
    if args.publish:
        published = publish_to_hub(report, base_url=rcfg.get("hub_url", "http://127.0.0.1:8000"))
    if args.json:
        print(json.dumps({"report": report, "published": published}, indent=2))
    else:
        print(render_markdown(report))
        if args.publish:
            print(f"\npublished to hub: {published}")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/retro/build.py
+++ b/session_memory/retro/build.py
@@ -0,0 +1,99 @@
 """Windowed weekly retro report (AGENTIC-WP-0010 T01).
 Runs the existing detect pipeline over a date window, ranks the recurring problem
 patterns into **per-repo improvement suggestions** (top 3, cross-flavor first),
 attaches a recommendation from the Pattern Catalog where one exists, and bundles a
 fleet measure snapshot for context. Pure function over digests — the entrypoint
 (T03) handles store/publish.
 """
 from __future__ import annotations
 import collections
 from dataclasses import asdict, dataclass
 from datetime import datetime, timedelta, timezone
 from typing import Optional
 from ..detect.cluster import cluster
 from ..detect.quality import QualityConfig, filter_real
 from ..detect.signals import extract_signals
 from ..measure.metrics import aggregate
 # score at/above which a suggestion is "high" priority even when single-flavor
 _HIGH_SCORE = 100.0
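 def _parse(ts: str) -> datetime:
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    # Date-only or offset-less input parses as naive; treat it as UTC so it can
    # be compared with the timezone-aware window bounds without a TypeError.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)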
 def _iso(dt: datetime) -> str:
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _now() -> datetime:
    return datetime.now(timezone.utc)
 @dataclass
 class Suggestion:
    repo: str
    title: str
    recommendation: str
    priority: str          # high | medium
    score: float
    signal_type: str
    cross_flavor: bool
    pattern_key: str
 def _recommendation(pattern_key: str, locus: str, catalog) -> Optional[str]:
    if catalog is None:
        return None
    sp = catalog.find_for(pattern_key, locus)
    if sp and sp.resolutions:
        return sp.resolutions[0].summary
    return None
 def weekly_retro(digests: list[dict], catalog=None, *, since: Optional[str] = None,
                 until: Optional[str] = None, window_days: int = 7,
                 max_per_repo: int = 3, min_frequency: int = 2,
                 quality: Optional[QualityConfig] = None) -> dict:
    """Build the ranked weekly retro report over a date window."""
    until_dt = _parse(until) if until else _now()
    since_dt = _parse(since) if since else until_dt - timedelta(days=window_days)
    windowed = [d for d in digests
                if d.get("started_at") and since_dt <= _parse(d["started_at"]) < until_dt]
    real = filter_real(windowed, quality or QualityConfig())
    patterns = cluster(extract_signals(real), min_frequency=min_frequency)
    by_repo: dict[str, list[Suggestion]] = collections.defaultdict(list)
    for p in patterns:
        if p.polarity != "problem":
            continue  # improvements come from problems
        rec = (_recommendation(p.key, p.locus, catalog)
               or f"Investigate {p.signal_type.replace('_', ' ')} on {p.locus}")
        priority = "high" if (p.cross_flavor or p.score >= _HIGH_SCORE) else "medium"
        for repo in (p.repos or ["(unknown)"]):
            by_repo[repo].append(Suggestion(
                repo=repo, title=p.title, recommendation=rec, priority=priority,
                score=p.score, signal_type=p.signal_type, cross_flavor=p.cross_flavor,
                pattern_key=p.key))
    suggestions: list[Suggestion] = []
    for repo in sorted(by_repo):
        items = sorted(by_repo[repo], key=lambda s: -s.score)
        suggestions.extend(items[:max_per_repo])
    # cross-flavor first, then by score (global ordering for the report)
    suggestions.sort(key=lambda s: (not s.cross_flavor, -s.score))
    return {
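        # Report the actual span, which differs from window_days when since/until are explicit.
        "window": {"since": _iso(since_dt), "until": _iso(until_dt),
                   "days": (until_dt - since_dt).days},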
        "generated_at": _iso(_now()),
        "n_sessions": len(real),
        "suggestions": [asdict(s) for s in suggestions],
        "measure": aggregate(real),
    }
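The ranking in `weekly_retro` happens in two passes: first a per-repo cut to the highest-scoring suggestions, then a global sort that puts cross-flavor items first. A minimal sketch of just that ordering follows, using plain dicts instead of the `Suggestion` dataclass; the helper name `rank` and the sample data are hypothetical.

```python
from collections import defaultdict


def rank(suggestions, max_per_repo=3):
    """Keep the top-N per repo by score, then order cross-flavor first."""
    by_repo = defaultdict(list)
    for s in suggestions:
        by_repo[s["repo"]].append(s)

    picked = []
    for repo in sorted(by_repo):
        items = sorted(by_repo[repo], key=lambda s: -s["score"])
        picked.extend(items[:max_per_repo])

    # False sorts before True, so "not cross_flavor" puts cross-flavor first.
    picked.sort(key=lambda s: (not s["cross_flavor"], -s["score"]))
    return picked


sample = [
    {"repo": "a", "score": 5.0, "cross_flavor": False},
    {"repo": "a", "score": 4.0, "cross_flavor": False},
    {"repo": "a", "score": 3.0, "cross_flavor": False},
    {"repo": "a", "score": 2.0, "cross_flavor": False},  # cut by the per-repo cap
    {"repo": "b", "score": 1.0, "cross_flavor": True},   # low score, still first
]
ranked = rank(sample)
print([(s["repo"], s["score"]) for s in ranked])
```

This is why a cross-flavor suggestion with a modest score can appear above a single-flavor suggestion with a far larger one in the final report.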
--- a/session_memory/retro/last_retro.json
+++ b/session_memory/retro/last_retro.json
@@ -0,0 +1,322 @@
 {
  "generated_at": "2026-06-07T19:30:56Z",
  "measure": {
    "error_rate": 0.957,
    "infra_overhead_share_median": 0.167,
    "infra_overhead_share_p90": 0.23,
    "n_sessions": 23,
    "recurring_error_occurrences": 463,
    "schema_thrash_sessions": 7,
    "success_rate": 1.0,
    "tokens_p50": 250725,
    "tokens_p90": 901422
  },
  "n_sessions": 23,
  "suggestions": [
    {
      "cross_flavor": true,
      "pattern_key": "problem:recurring_error:make: *** [makefile:<n>: fix-consistency] error <n>",
      "priority": "high",
      "recommendation": "Investigate recurring error on make: *** [makefile:<n>: fix-consistency] error <n>",
      "repo": "net-kingdom",
      "score": 54.0,
      "signal_type": "recurring_error",
      "title": "cross-flavor problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "activity-core",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "artifact-store",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "citation-evidence",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "infospace-bench",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "railiance-apps",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "state-hub",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "activity-core",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "citation-evidence",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "flex-auth",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "infospace-bench",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "ops-bridge",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "activity-core",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "citation-evidence",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "infospace-bench",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "issue-facade",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "railiance-apps",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "state-hub",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "the-custodian",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "vergabe-teilnahme",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has been modified since read, either by the user or by a linter. read it again before attempting to write it.<<path>>",
      "priority": "medium",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "artifact-store",
      "score": 78.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has been modified since read, either by the user or by a linter. read it again before attempting to write it.<<path>>",
      "priority": "medium",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "issue-facade",
      "score": 78.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has been modified since read, either by the user or by a linter. read it again before attempting to write it.<<path>>",
      "priority": "medium",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "railiance-apps",
      "score": 78.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has been modified since read, either by the user or by a linter. read it again before attempting to write it.<<path>>",
      "priority": "medium",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "state-hub",
      "score": 78.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:budget_overrun:tokens",
      "priority": "medium",
      "recommendation": "Read narrowly \u2014 target the region you need, not whole large files",
      "repo": "artifact-store",
      "score": 50.55,
      "signal_type": "budget_overrun",
      "title": "problem: budget overrun"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:{",
      "priority": "medium",
      "recommendation": "Investigate recurring error on {",
      "repo": "vergabe-teilnahme",
      "score": 12.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:found <n> errors (<n> fixed, <n> remaining).",
      "priority": "medium",
      "recommendation": "Investigate recurring error on found <n> errors (<n> fixed, <n> remaining).",
      "repo": "ops-bridge",
      "score": 10.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:(note: edit also tried swapping \\uxxxx escapes and their characters; neither form matched, so the mismatch is likely elsewhere in old_string. re-read the file a",
      "priority": "medium",
      "recommendation": "Investigate recurring error on (note: edit also tried swapping \\uxxxx escapes and their characters; neither form matched, so the mismatch is likely elsewhere in old_string. re-read the file a",
      "repo": "net-kingdom",
      "score": 6.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:found <n> error (<n> fixed, <n> remaining).",
      "priority": "medium",
      "recommendation": "Investigate recurring error on found <n> error (<n> fixed, <n> remaining).",
      "repo": "ops-bridge",
      "score": 6.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<n> failed, <n> passed in <n>.00s",
      "priority": "medium",
      "recommendation": "Investigate recurring error on <n> failed, <n> passed in <n>.00s",
      "repo": "agentic-resources",
      "score": 4.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    }
  ],
  "window": {
    "days": 30,
    "since": "2026-05-08T19:30:56Z",
    "until": "2026-06-07T19:30:56Z"
  }
 }
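The `window` block in this report is consistent with `since = until - window_days` for a 30-day window. A small sketch of that derivation and of the half-open membership test is given here, assuming UTC throughout; `parse_utc`, `window`, and `in_window` are illustrative names, and treating offset-less input as UTC is an assumption of the sketch.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(ts):
    """Parse an ISO-8601 string; offset-less input is assumed to be UTC."""
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)


def iso(dt):
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def window(until, days):
    """Return (since, until) as aware datetimes for a trailing window."""
    until_dt = parse_utc(until)
    return until_dt - timedelta(days=days), until_dt


def in_window(started_at, since_dt, until_dt):
    """Half-open interval: since is included, until is excluded."""
    return since_dt <= parse_utc(started_at) < until_dt


since_dt, until_dt = window("2026-06-07T19:30:56Z", 30)
print(iso(since_dt), iso(until_dt))
print(in_window("2026-05-09", since_dt, until_dt),
      in_window("2026-06-07T19:30:56Z", since_dt, until_dt))
```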
--- a/session_memory/retro/last_retro.md
+++ b/session_memory/retro/last_retro.md
@@ -0,0 +1,39 @@
 # Weekly Coding Retro  (2026-05-08 → 2026-06-07)
 _23 real sessions · generated 2026-06-07T19:30:56Z_
 ## Top improvement suggestions (cross-flavor first, ≤3 per repo)
 - **net-kingdom** (high, score=54.0) [CROSS-FLAVOR]: cross-flavor problem: recurring error — Investigate recurring error on make: *** [makefile:<n>: fix-consistency] error <n>
 - **activity-core** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **artifact-store** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **citation-evidence** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **infospace-bench** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **railiance-apps** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **state-hub** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **activity-core** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **citation-evidence** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **flex-auth** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **infospace-bench** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **ops-bridge** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **activity-core** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **citation-evidence** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **infospace-bench** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **issue-facade** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **railiance-apps** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **state-hub** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **the-custodian** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **vergabe-teilnahme** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **artifact-store** (medium, score=78.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **issue-facade** (medium, score=78.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **railiance-apps** (medium, score=78.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **state-hub** (medium, score=78.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **artifact-store** (medium, score=50.55): problem: budget overrun — Read narrowly — target the region you need, not whole large files
 - **vergabe-teilnahme** (medium, score=12.0): problem: recurring error — Investigate recurring error on {
 - **ops-bridge** (medium, score=10.0): problem: recurring error — Investigate recurring error on found <n> errors (<n> fixed, <n> remaining).
 - **net-kingdom** (medium, score=6.0): problem: recurring error — Investigate recurring error on (note: edit also tried swapping \uxxxx escapes and their characters; neither form matched, so the mismatch is likely elsewhere in old_string. re-read the file a
 - **ops-bridge** (medium, score=6.0): problem: recurring error — Investigate recurring error on found <n> error (<n> fixed, <n> remaining).
 - **agentic-resources** (medium, score=4.0): problem: recurring error — Investigate recurring error on <n> failed, <n> passed in <n>.00s
 ## Fleet snapshot
 - infra-overhead median: 0.167
 - error rate: 0.957  ·  schema-thrash: 7
 - success rate: 1.0  ·  tokens p50: 250725
--- a/session_memory/retro/publish.py
+++ b/session_memory/retro/publish.py
@@ -0,0 +1,78 @@
 """Publish the weekly retro (AGENTIC-WP-0010 T02).
 The retro is published to the State Hub as a **read model** — a progress event of
 ``event_type=coding_retro`` whose ``detail`` carries the structured report. This is
 exactly how ``daily-triage-report`` surfaces, and it is what activity-core's
 ``coding_retro`` resolver (ACTIVITY-WP-0008) reads. A local JSON + markdown report
 is always written; the hub publish is best-effort and **degrades gracefully** when
 the hub is unreachable.
 """
 from __future__ import annotations
 import json
 import os
 import urllib.request
 from typing import Callable, Optional
 DEFAULT_HUB = "http://127.0.0.1:8000"
 def render_markdown(report: dict) -> str:
    w = report.get("window", {})
    lines = [
        f"# Weekly Coding Retro  ({w.get('since', '')[:10]} → {w.get('until', '')[:10]})",
        f"_{report.get('n_sessions', 0)} real sessions · generated {report.get('generated_at', '')}_",
        "",
        "## Top improvement suggestions (cross-flavor first, ≤3 per repo)",
    ]
    if not report.get("suggestions"):
        lines.append("- (no recurring problems above threshold this week)")
    for s in report.get("suggestions", []):
        flag = " [CROSS-FLAVOR]" if s.get("cross_flavor") else ""
        lines.append(f"- **{s['repo']}** ({s['priority']}, score={s['score']}){flag}: "
                     f"{s['title']} — {s['recommendation']}")
    m = report.get("measure", {})
    lines += ["", "## Fleet snapshot",
              f"- infra-overhead median: {m.get('infra_overhead_share_median')}",
              f"- error rate: {m.get('error_rate')}  ·  schema-thrash: {m.get('schema_thrash_sessions')}",
              f"- success rate: {m.get('success_rate')}  ·  tokens p50: {m.get('tokens_p50')}"]
    return "\n".join(lines)
 def write_local(report: dict, json_path: str, md_path: Optional[str] = None) -> None:
    os.makedirs(os.path.dirname(json_path) or ".", exist_ok=True)
    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump(report, fh, indent=2, sort_keys=True)
        fh.write("\n")
    if md_path:
        with open(md_path, "w", encoding="utf-8") as fh:
            fh.write(render_markdown(report))
            fh.write("\n")
 def _http_post(url: str, payload: dict) -> None:
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req, timeout=10) as r:
        r.read()
 def publish_to_hub(report: dict, *, base_url: str = DEFAULT_HUB,
                   poster: Optional[Callable[[str, dict], None]] = None) -> bool:
    """POST the retro as an event_type=coding_retro progress event. Best-effort."""
    poster = poster or _http_post
    n = report.get("n_sessions", 0)
    k = len(report.get("suggestions", []))
    payload = {
        "event_type": "coding_retro",
        "author": "helix-forge",
        "summary": f"Weekly coding retro: {k} ranked suggestions across "
                   f"{report.get('window', {}).get('days', 7)} days ({n} sessions).",
        "detail": report,
    }
    try:
        poster(f"{base_url.rstrip('/')}/progress/", payload)
        return True
    except Exception:
        return False
--- a/tests/test_catalog_covers.py
+++ b/tests/test_catalog_covers.py
@@ -0,0 +1,62 @@
 """find_for / covers tests (AGENTIC-WP-0010 follow-up)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.schema import (  # noqa: E402
    Provenance,
    Resolution,
    SolutionPattern,
 )
 def _pattern(pid, src, covers=None, name="P"):
    return SolutionPattern(
        id=pid, name=name, version="1.0.0", polarity="problem", problem="p",
        resolutions=[Resolution(summary="do x")],
        provenance=Provenance(source_key=src), covers=covers or [])
 def test_covers_round_trips(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern("sp-a", "problem:file_not_read:edit",
                        covers=["file has not been read"]))
    assert cat.load("sp-a").covers == ["file has not been read"]
 def test_find_for_exact_key(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern(SolutionPattern.make_id("problem:retry_storm:retries"),
                        "problem:retry_storm:retries"))
    got = cat.find_for("problem:retry_storm:retries")
    assert got is not None and got.id == "sp-problem-retry_storm-retries"
 def test_find_for_covers_match(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern("sp-rbe", "problem:file_not_read:edit",
                        covers=["file has not been read", "modified since read"]))
    # a recurring_error signal with a different key but matching fingerprint locus
    got = cat.find_for(
        "problem:recurring_error:<tool_use_error>file has not been read yet...",
        locus="<tool_use_error>file has not been read yet. read it first...")
    assert got is not None and got.id == "sp-rbe"
 def test_find_for_no_match_returns_none(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern("sp-rbe", "problem:file_not_read:edit",
                        covers=["file has not been read"]))
    assert cat.find_for("problem:recurring_error:some unrelated error") is None
 def test_covers_change_versions(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern("sp-a", "problem:x:y"))
    p = cat.load("sp-a")
    p.covers = ["new coverage"]
    assert cat.upsert(p) == "versioned"  # covers is substantive content
    assert cat.load("sp-a").version == "1.0.1"
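Reviewer note: the lookup order these tests exercise (exact source key first, then a `covers` phrase found in the signal text) can be sketched as below. This is a minimal illustration, not the `Catalog.find_for` implementation, which is not shown in this part of the diff. The `Pattern` stand-in, the case-insensitive substring rule, and the choice to search both the key and the locus are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Pattern:
    """Hypothetical stand-in for SolutionPattern; only the fields used here."""
    id: str
    source_key: str
    covers: list = field(default_factory=list)


def find_for(patterns: list, key: str, locus: str = "") -> Optional[Pattern]:
    """Return the pattern for `key`: exact source-key match first, then covers."""
    for p in patterns:
        if p.source_key == key:
            return p
    # Assumed fallback: a covers phrase appearing anywhere in the key or locus.
    haystack = f"{key} {locus}".lower()
    for p in patterns:
        if any(phrase.lower() in haystack for phrase in p.covers):
            return p
    return None


patterns = [Pattern("sp-rbe", "problem:file_not_read:edit",
                    covers=["file has not been read", "modified since read"])]
hit = find_for(
    patterns,
    "problem:recurring_error:<tool_use_error>file has not been read yet...",
    locus="<tool_use_error>file has not been read yet. read it first...")
miss = find_for(patterns, "problem:recurring_error:some unrelated error")
```

Checking the exact key before `covers` keeps a broad coverage phrase from shadowing a pattern that was curated for that specific key.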
--- a/tests/test_claude_adapter.py
+++ b/tests/test_claude_adapter.py
@@ -0,0 +1,99 @@
 """Claude adapter tests (T02): synthetic fixture + a real on-disk session."""
 import glob
 import json
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.adapters.claude import parse_session  # noqa: E402
 REPO_MAP = {"agentic-resources": "helix_forge"}
 def _write_jsonl(path, records):
    with open(path, "w", encoding="utf-8") as f:
        for r in records:
            f.write(json.dumps(r) + "\n")
 def test_synthetic_session(tmp_path):
    p = tmp_path / "11111111-2222-3333-4444-555555555555.jsonl"
    _write_jsonl(p, [
        {"type": "user", "uuid": "u1", "parentUuid": None,
         "timestamp": "2026-06-06T10:00:00Z", "sessionId": "sess-1",
         "cwd": "/home/worsch/agentic-resources", "gitBranch": "main",
         "version": "1.0", "message": {"role": "user", "content": "fix the bug"}},
        {"type": "assistant", "uuid": "a1", "parentUuid": "u1",
         "timestamp": "2026-06-06T10:00:05Z", "sessionId": "sess-1",
         "message": {"role": "assistant", "model": "claude-opus-4-8",
                     "usage": {"input_tokens": 100, "output_tokens": 20,
                               "cache_read_input_tokens": 10},
                     "content": [
                         {"type": "thinking", "thinking": "let me look"},
                         {"type": "text", "text": "I'll edit the file."},
                         {"type": "tool_use", "name": "Edit",
                          "input": {"file_path": "x.py", "old_string": "a", "new_string": "b"}},
                         {"type": "tool_use", "name": "Bash",
                          "input": {"command": "pytest -q"}},
                     ]}},
        {"type": "user", "uuid": "u2", "parentUuid": "a1",
         "timestamp": "2026-06-06T10:00:10Z", "sessionId": "sess-1",
         "message": {"role": "user",
                     "content": [{"type": "tool_result", "content": "6 passed"}]}},
    ])
    norm = parse_session(str(p), REPO_MAP)
    assert norm is not None
    s = norm.session
    assert s.session_uid == "claude:sess-1"
    assert s.repo == "agentic-resources" and s.domain == "helix_forge"
    assert s.model == "claude-opus-4-8"
    assert s.cost.input_tokens == 100 and s.cost.output_tokens == 20
    assert s.cost.cache_tokens == 10
    assert s.cost.turns == 1
    assert s.cost.wall_clock_s == 10.0
    kinds = [e.kind for e in norm.events]
    assert kinds == ["user_msg", "thinking", "assistant_msg", "edit", "test_run", "tool_result"]
    # turn DAG: assistant events link back to the first user msg (seq 0)
    edit_ev = next(e for e in norm.events if e.kind == "edit")
    assert edit_ev.parent_seq == 0
    assert edit_ev.tool == "Edit"
    # bodies captured as blobs, referenced by payload_ref
    assert edit_ev.payload_ref in norm.blobs
    assert "x.py" in norm.blobs[edit_ev.payload_ref]
 def test_sidechain_filename_marks_events(tmp_path):
    p = tmp_path / "agent-deadbeef.jsonl"
    _write_jsonl(p, [
        {"type": "assistant", "uuid": "a1", "sessionId": "side-1",
         "timestamp": "2026-06-06T10:00:00Z",
         "message": {"role": "assistant", "content": [{"type": "text", "text": "hi"}]}},
    ])
    norm = parse_session(str(p), REPO_MAP)
    assert norm.events[0].is_sidechain is True
 def test_real_local_session_if_available():
    """Smoke-parse a real Claude transcript on this workstation, if present."""
    base = os.path.expanduser("~/.claude/projects/-home-worsch-agentic-resources")
    files = sorted(glob.glob(os.path.join(base, "*.jsonl")))
    if not files:
        import pytest  # local import: only needed when no transcripts exist
        pytest.skip("no local Claude transcripts found; synthetic tests cover the logic")
    parsed = 0
    for fp in files:
        norm = parse_session(fp, REPO_MAP)
        if norm is None:
            continue
        parsed += 1
        assert norm.session.session_uid.startswith("claude:")
        # seq is monotonic and unique
        seqs = [e.seq for e in norm.events]
        assert seqs == sorted(seqs)
        assert len(seqs) == len(set(seqs))
    assert parsed >= 1
--- a/tests/test_cluster.py
+++ b/tests/test_cluster.py
@@ -0,0 +1,54 @@
 """Clusterer + evidence + cross-flavor tests (T05/T06)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.detect.cluster import cluster  # noqa: E402
 from session_memory.detect.signals import PROBLEM, SUCCESS, Signal  # noqa: E402
 def _sig(uid, flavor, repo, type_, polarity, locus, mag=1.0):
    return Signal(session_uid=uid, flavor=flavor, repo=repo, type=type_,
                  polarity=polarity, locus=locus, magnitude=mag)
 def test_min_frequency_filters_singletons():
    sigs = [_sig("claude:a", "claude", "r1", "retry_storm", PROBLEM, "retries")]
    assert cluster(sigs, min_frequency=2) == []
 def test_clusters_recurring_signal_with_evidence():
    sigs = [
        _sig("claude:a", "claude", "r1", "retry_storm", PROBLEM, "retries", 5),
        _sig("claude:b", "claude", "r2", "retry_storm", PROBLEM, "retries", 3),
    ]
    pats = cluster(sigs, min_frequency=2)
    assert len(pats) == 1
    p = pats[0]
    assert p.frequency == 2
    assert p.sessions == ["claude:a", "claude:b"]
    assert sorted(p.repos) == ["r1", "r2"]
    assert p.flavors == ["claude"]
    assert p.cross_flavor is False
    assert p.cost_impact == 8.0
 def test_cross_flavor_flagged_and_ranked_first():
    sigs = [
        # cross-flavor problem (claude + codex)
        _sig("claude:a", "claude", "r1", "repeated_errors", PROBLEM, "errors", 3),
        _sig("codex:b", "codex", "r2", "repeated_errors", PROBLEM, "errors", 3),
        # single-flavor success cluster with higher raw impact
        _sig("grok:c", "grok", "r3", "clean_pass", SUCCESS, "outcome", 5),
        _sig("grok:d", "grok", "r4", "clean_pass", SUCCESS, "outcome", 5),
    ]
    pats = cluster(sigs, min_frequency=2)
    assert len(pats) == 2
    xf = next(p for p in pats if p.signal_type == "repeated_errors")
    assert xf.cross_flavor is True
    assert sorted(xf.flavors) == ["claude", "codex"]
    # cross-flavor pattern is ranked first even if another has higher raw impact
    assert pats[0].cross_flavor is True
    assert "cross-flavor" in pats[0].title
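Reviewer note: the grouping and ranking behaviour asserted above can be sketched as follows. This is an illustration under stated assumptions, not the `cluster` implementation: signals are plain dicts rather than `Signal` objects, frequency is taken as the number of signals in a group, cost impact as the sum of magnitudes, and the `polarity:type:locus` key format is inferred from the keys used elsewhere in these tests.

```python
from collections import defaultdict


def cluster_sketch(signals, min_frequency=2):
    """Group signals by (polarity, type, locus) and rank cross-flavor groups first."""
    groups = defaultdict(list)
    for s in signals:
        groups[(s["polarity"], s["type"], s["locus"])].append(s)

    patterns = []
    for (polarity, type_, locus), members in groups.items():
        if len(members) < min_frequency:
            continue  # singletons are not recurring
        flavors = sorted({m["flavor"] for m in members})
        patterns.append({
            "key": f"{polarity}:{type_}:{locus}",
            "frequency": len(members),
            "flavors": flavors,
            "cross_flavor": len(flavors) > 1,
            "cost_impact": float(sum(m["magnitude"] for m in members)),
        })
    # Cross-flavor first (False sorts before True), then higher impact first.
    patterns.sort(key=lambda p: (not p["cross_flavor"], -p["cost_impact"]))
    return patterns


def _s(flavor, type_, polarity, locus, magnitude):
    return {"flavor": flavor, "type": type_, "polarity": polarity,
            "locus": locus, "magnitude": magnitude}


ranked = cluster_sketch([
    _s("claude", "repeated_errors", "problem", "errors", 3),
    _s("codex", "repeated_errors", "problem", "errors", 3),
    _s("grok", "clean_pass", "success", "outcome", 5),
    _s("grok", "clean_pass", "success", "outcome", 5),
    _s("claude", "retry_storm", "problem", "retries", 9),  # singleton, dropped
])
```

Sorting on the tuple `(not cross_flavor, -cost_impact)` is one simple way to make the cross-flavor flag dominate raw impact, which is the property the last test checks.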
--- a/tests/test_codex_adapter.py
+++ b/tests/test_codex_adapter.py
@@ -0,0 +1,86 @@
 """Codex adapter tests (T01): synthetic rollout fixture."""
 import json
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.adapters.codex import parse_session  # noqa: E402
 REPO_MAP = {"agentic-resources": "helix_forge"}
 def _rollout(path, lines):
    with open(path, "w", encoding="utf-8") as f:
        for ln in lines:
            f.write(json.dumps(ln) + "\n")
 def test_codex_rollout_parse(tmp_path):
    p = tmp_path / "rollout-2026-06-06-abc.jsonl"
    _rollout(p, [
        {"timestamp": "2026-06-06T10:00:00Z", "type": "session_meta",
         "payload": {"id": "cdx-1", "cwd": "/home/worsch/agentic-resources",
                     "model_provider": "openai", "cli_version": "0.44.0", "model": "gpt-5-codex"}},
        {"timestamp": "2026-06-06T10:00:01Z", "type": "turn_context",
         "payload": {"model": "gpt-5-codex", "approval_policy": "on-request"}},
        {"timestamp": "2026-06-06T10:00:02Z", "type": "event_msg",
         "payload": {"type": "task_started"}},
        {"timestamp": "2026-06-06T10:00:03Z", "type": "response_item",
         "payload": {"type": "message", "role": "user",
                     "content": [{"type": "input_text", "text": "fix the bug"}]}},
        {"timestamp": "2026-06-06T10:00:04Z", "type": "response_item",
         "payload": {"type": "reasoning", "summary": "think about it"}},
        {"timestamp": "2026-06-06T10:00:05Z", "type": "response_item",
         "payload": {"type": "function_call", "name": "apply_patch",
                     "arguments": "{\"path\":\"x.py\"}", "call_id": "call_1"}},
        {"timestamp": "2026-06-06T10:00:06Z", "type": "response_item",
         "payload": {"type": "function_call", "name": "shell",
                     "arguments": "{\"command\":\"pytest -q\"}", "call_id": "call_2"}},
        {"timestamp": "2026-06-06T10:00:07Z", "type": "response_item",
         "payload": {"type": "function_call_output", "call_id": "call_2", "output": "2 passed"}},
        {"timestamp": "2026-06-06T10:00:08Z", "type": "response_item",
         "payload": {"type": "message", "role": "assistant",
                     "content": [{"type": "output_text", "text": "done"}]}},
        {"timestamp": "2026-06-06T10:00:09Z", "type": "event_msg",
         "payload": {"type": "token_count",
                     "info": {"total_token_usage": {"input_tokens": 200, "output_tokens": 30,
                                                    "cached_input_tokens": 15}}}},
        {"timestamp": "2026-06-06T10:00:10Z", "type": "event_msg",
         "payload": {"type": "task_complete"}},
    ])
    norm = parse_session(str(p), REPO_MAP)
    assert norm is not None
    s = norm.session
    assert s.session_uid == "codex:cdx-1"
    assert s.flavor == "codex"
    assert s.repo == "agentic-resources" and s.domain == "helix_forge"
    assert s.model == "gpt-5-codex"
    assert s.cost.input_tokens == 200 and s.cost.output_tokens == 30 and s.cost.cache_tokens == 15
    assert s.cost.turns == 1
    assert s.cost.wall_clock_s == 10.0
    kinds = [e.kind for e in norm.events]
    assert kinds == ["lifecycle", "user_msg", "thinking", "edit", "test_run",
                     "tool_result", "assistant_msg", "completion"]
    # flat linkage: function_call_output links to its function_call by call_id
    out = next(e for e in norm.events if e.kind == "tool_result")
    test_call = next(e for e in norm.events if e.kind == "test_run")
    assert out.parent_seq == test_call.seq
    # apply_patch classified as edit; pytest as test_run
    edit = next(e for e in norm.events if e.kind == "edit")
    assert edit.tool == "apply_patch"
 def test_codex_empty_or_no_meta_returns_none(tmp_path):
    p = tmp_path / "rollout-empty.jsonl"
    p.write_text("")
    assert parse_session(str(p), REPO_MAP) is None
    p2 = tmp_path / "rollout-nometa.jsonl"
    _rollout(p2, [{"timestamp": "t", "type": "event_msg", "payload": {"type": "task_started"}}])
    assert parse_session(str(p2), REPO_MAP) is None  # no session_meta -> no id
--- a/tests/test_curate_catalog.py
+++ b/tests/test_curate_catalog.py
@@ -0,0 +1,86 @@
 """Versioned Pattern Catalog tests (T02): round-trip, dedup, idempotent upsert."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import (  # noqa: E402
    ADDED,
    UNCHANGED,
    UPDATED,
    VERSIONED,
    Catalog,
 )
 from session_memory.curate.schema import (  # noqa: E402
    Provenance,
    Resolution,
    Scope,
    SolutionPattern,
 )
 def _pattern(src="success:clean_pass:outcome", problem="ran tests, clean finish"):
    return SolutionPattern(
        id=SolutionPattern.make_id(src),
        name="Run tests before declaring success",
        version="1.0.0",
        polarity="success",
        problem=problem,
        resolutions=[Resolution(summary="run the suite")],
        scope=Scope(flavors=["claude", "grok"]),
        provenance=Provenance(source_key=src, evidence={"frequency": 18}),
    )
 def test_add_then_load_round_trips(tmp_path):
    cat = Catalog(str(tmp_path))
    assert cat.upsert(_pattern()) == ADDED
    loaded = cat.load(SolutionPattern.make_id("success:clean_pass:outcome"))
    assert loaded is not None
    assert loaded.problem == "ran tests, clean finish"
    assert loaded.created_at and loaded.updated_at
    assert [p.id for p in cat.list()] == [loaded.id]
 def test_resave_identical_is_noop(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern())
    assert cat.upsert(_pattern()) == UNCHANGED
    # version not bumped, no history written
    assert cat.load(_pattern().id).version == "1.0.0"
    assert cat.history(_pattern().id) == []
 def test_dedup_on_source_key(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern())
    cat.upsert(_pattern())  # same source key -> same id -> one file
    assert len(cat.list()) == 1
 def test_content_change_bumps_version_and_archives(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern())
    assert cat.upsert(_pattern(problem="now with more nuance")) == VERSIONED
    current = cat.load(_pattern().id)
    assert current.version == "1.0.1"
    assert current.problem == "now with more nuance"
    hist = cat.history(_pattern().id)
    assert len(hist) == 1
    assert hist[0]["version"] == "1.0.0"
    assert hist[0]["status"] == "superseded"
 def test_status_only_change_updates_without_bump(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern())
    p = _pattern()
    p.status = "approved"
    p.distribution_ready = True
    assert cat.upsert(p) == UPDATED
    current = cat.load(p.id)
    assert current.status == "approved"
    assert current.distribution_ready is True
    assert current.version == "1.0.0"  # metadata change, no bump
    assert cat.history(p.id) == []
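Reviewer note: the four upsert outcomes these tests distinguish can be sketched as a pure function. This is a hedged illustration, not the `Catalog.upsert` code. The split into `CONTENT_FIELDS` and `META_FIELDS` is hypothetical, and only the string value `"versioned"` is confirmed by the tests in this diff; the other three outcome strings are assumed.

```python
from typing import Optional, Tuple

# Hypothetical split between substantive content and review metadata.
CONTENT_FIELDS = ("problem", "resolutions", "covers")
META_FIELDS = ("status", "distribution_ready")


def bump_patch(version: str) -> str:
    """Increment the patch component of a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def classify_upsert(stored: Optional[dict], incoming: dict) -> Tuple[str, str]:
    """Return (outcome, resulting_version) for writing `incoming` over `stored`."""
    if stored is None:
        return "added", incoming["version"]
    if any(stored.get(f) != incoming.get(f) for f in CONTENT_FIELDS):
        # Content changed: the old revision would be archived as superseded.
        return "versioned", bump_patch(stored["version"])
    if any(stored.get(f) != incoming.get(f) for f in META_FIELDS):
        return "updated", stored["version"]  # metadata only, no bump
    return "unchanged", stored["version"]


base = {"version": "1.0.0", "problem": "ran tests, clean finish",
        "status": "provisional", "distribution_ready": False}
```

For example, `classify_upsert(base, {**base, "problem": "now with more nuance"})` yields a patch bump, while changing only `status` leaves the version alone.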
--- a/tests/test_curate_decisions.py
+++ b/tests/test_curate_decisions.py
@@ -0,0 +1,70 @@
 """Hub decision integration tests (T05): payload shape + graceful queue/flush."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.decisions import DecisionRecorder, build_decision  # noqa: E402
 from session_memory.curate.review import APPROVE, REJECT, ReviewLog, review  # noqa: E402
 def _candidate(key="success:clean_pass:outcome"):
    return {"key": key, "frequency": 18, "sessions": ["a", "b"],
            "cost_impact": 9.0, "cross_flavor": True, "flavors": ["claude", "grok"]}
 def test_build_decision_payload_shape():
    d = build_decision(_candidate(), "approve", "looks solid", workstream_id="ws-1")
    assert d["decision_type"] == "made"
    assert d["workstream_id"] == "ws-1"
    assert "Promote" in d["title"]
    assert d["rationale"] == "looks solid"
    assert "success:clean_pass:outcome" in d["description"]
 def test_sink_accepts_decision(tmp_path):
    captured = []
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"), sink=captured.append)
    assert rec.record(_candidate(), "approve", "ok") is True
    assert rec.pending() == []
    assert len(captured) == 1
 def test_queues_when_sink_down(tmp_path):
    def boom(_):
        raise RuntimeError("hub down")
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"), sink=boom)
    assert rec.record(_candidate(), "reject", "noise") is False
    assert len(rec.pending()) == 1
 def test_no_sink_defaults_to_queue(tmp_path):
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"))
    rec.record(_candidate(), "approve", "ok")
    assert len(rec.pending()) == 1
 def test_flush_replays_queue(tmp_path):
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"))  # offline -> queue
    rec.record(_candidate("problem:abandoned:outcome"), "reject", "x")
    rec.record(_candidate("success:clean_pass:outcome"), "approve", "y")
    captured = []
    assert rec.flush(sink=captured.append) == 2
    assert rec.pending() == []
    assert len(captured) == 2
 def test_review_records_each_final_decision(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log = ReviewLog(str(tmp_path / "reviews.jsonl"))
    captured = []
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"), sink=captured.append, workstream_id="ws")
    cands = [_candidate("success:clean_pass:outcome"), _candidate("problem:abandoned:outcome")]
    review(cands, lambda c: (APPROVE if "success" in c["key"] else REJECT, "r"), cat, log,
           recorder=rec)
    assert len(captured) == 2
    actions = sorted("Promote" in d["title"] for d in captured)
    assert actions == [False, True]
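Reviewer note: the queue-then-flush behaviour tested above can be sketched with a JSONL file as the offline queue. The class below is a hypothetical stand-in for `DecisionRecorder`: its constructor and method signatures are simplified, and the policy of keeping entries that fail during a flush is an assumption.

```python
import json
import os
import tempfile


class QueueingRecorder:
    """Deliver to a sink when possible; otherwise append to a JSONL queue."""

    def __init__(self, queue_path, sink=None):
        self.queue_path = queue_path
        self.sink = sink

    def pending(self):
        if not os.path.exists(self.queue_path):
            return []
        with open(self.queue_path, encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]

    def record(self, decision):
        """Return True if delivered immediately, False if queued."""
        if self.sink is not None:
            try:
                self.sink(decision)
                return True
            except Exception:
                pass  # sink down: fall through to the queue
        with open(self.queue_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(decision) + "\n")
        return False

    def flush(self, sink):
        """Replay queued decisions; keep the ones that still fail."""
        sent, remaining = 0, []
        for decision in self.pending():
            try:
                sink(decision)
                sent += 1
            except Exception:
                remaining.append(decision)
        with open(self.queue_path, "w", encoding="utf-8") as fh:
            for decision in remaining:
                fh.write(json.dumps(decision) + "\n")
        return sent


with tempfile.TemporaryDirectory() as tmp:
    rec = QueueingRecorder(os.path.join(tmp, "q.jsonl"))  # no sink: offline
    queued = rec.record({"title": "Promote success:clean_pass:outcome"})
    delivered = []
    flushed = rec.flush(delivered.append)
    left = rec.pending()
```

An append-only JSONL queue keeps the offline path simple to inspect, since each pending decision is one line in a text file.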
--- a/tests/test_curate_entrypoint.py
+++ b/tests/test_curate_entrypoint.py
@@ -0,0 +1,84 @@
 """Curate entrypoint tests (T06): batch auto-approve end-to-end via the store."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.store import Store  # noqa: E402
 from session_memory.curate.__main__ import main  # noqa: E402
 from session_memory.curate.catalog import Catalog  # noqa: E402
 def _digest(uid, flavor, repo, **markers):
    return {
        "session_uid": uid, "flavor": flavor, "repo": repo, "outcome": "fail",
        "cost": {"input_tokens": 10, "output_tokens": 1},
        "markers": {"errors": markers.get("errors", 0), "retries": markers.get("retries", 0),
                    "test_runs": 0, "edits": 0, "human_interventions": 0},
        # real coding session per the quality filter (WP-0005 T01)
        "event_count": 40, "first_prompt": "Fix the failing build and retry the suite",
        "tool_histogram": {"Bash": 20, "Edit": 12, "Read": 8},
    }
 def _write_config(tmp_path) -> tuple[str, str, str]:
    store = tmp_path / ".store"
    catalog = tmp_path / "catalog"
    cfg = f"""
 [store]
 db_path = "{store / 'm.db'}"
 blob_dir = "{store / 'blobs'}"
 cursor = "{store / 'c.json'}"
 [curate]
 catalog_dir = "{catalog}"
 review_log = "{store / 'reviews.jsonl'}"
 decision_queue = "{store / 'decisions.queue.jsonl'}"
 [curate.gate]
 min_frequency = 2
 min_sessions = 2
 """
    path = tmp_path / "config.toml"
    path.write_text(cfg)
    return str(path), str(store), str(catalog)
 def test_auto_approve_promotes_cross_flavor(tmp_path, capsys):
    cfg_path, store_dir, catalog_dir = _write_config(tmp_path)
    st = Store(os.path.join(store_dir, "m.db"), os.path.join(store_dir, "blobs"))
    st.write_digest("claude:a", _digest("claude:a", "claude", "r1", retries=5))
    st.write_digest("codex:b", _digest("codex:b", "codex", "r2", retries=4))
    st.close()
    rc = main(["--config", cfg_path, "--auto-approve"])
    assert rc == 0
    cat = Catalog(catalog_dir)
    patterns = cat.list()
    assert len(patterns) == 1
    assert patterns[0].polarity == "problem"
    # clears the promote floor (freq>=2) but below the default distribution
    # floor (freq>=3) -> promoted as provisional, not distribution-ready
    assert patterns[0].status == "provisional"
    assert patterns[0].distribution_ready is False
    out = capsys.readouterr().out
    assert "Curate summary" in out
    # hub offline in tests -> decision queued
    assert "decisions queued" in out
 def test_rerun_is_idempotent(tmp_path):
    cfg_path, store_dir, catalog_dir = _write_config(tmp_path)
    st = Store(os.path.join(store_dir, "m.db"), os.path.join(store_dir, "blobs"))
    st.write_digest("claude:a", _digest("claude:a", "claude", "r1", retries=5))
    st.write_digest("codex:b", _digest("codex:b", "codex", "r2", retries=4))
    st.close()
    main(["--config", cfg_path, "--auto-approve"])
    main(["--config", cfg_path, "--auto-approve"])  # second pass: already decided
    cat = Catalog(catalog_dir)
    assert len(cat.list()) == 1
    assert cat.load(cat.list()[0].id).version == "1.0.0"  # no spurious bump
--- a/tests/test_curate_gating.py
+++ b/tests/test_curate_gating.py
@@ -0,0 +1,76 @@
 """Evidence-bar + bloat-guard tests (T04)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.gating import (  # noqa: E402
    GateConfig,
    bloat_warnings,
    evaluate,
    gate_config,
 )
 from session_memory.curate.review import candidate_to_pattern  # noqa: E402
 def _candidate(key="success:clean_pass:outcome", freq=5, sessions=5, impact=10.0,
               cross=True, flavors=("claude", "grok")):
    return {
        "key": key,
        "frequency": freq,
        "sessions": [f"s{i}" for i in range(sessions)],
        "cost_impact": impact,
        "cross_flavor": cross,
        "flavors": list(flavors),
    }
 def test_clears_bar_and_distribution_ready():
    r = evaluate(_candidate(), GateConfig(dist_min_frequency=3))
    assert r.promotable and r.distribution_ready
    assert r.status == "approved"
 def test_thin_candidate_promotable_but_provisional():
    # meets promote floor (freq>=2) but below distribution floor (freq<3)
    r = evaluate(_candidate(freq=2, sessions=2), GateConfig(dist_min_frequency=3))
    assert r.promotable
    assert not r.distribution_ready
    assert r.status == "provisional"
 def test_below_promote_floor_not_promotable():
    r = evaluate(_candidate(freq=1, sessions=1))
    assert not r.promotable
    assert any("frequency" in reason for reason in r.reasons)
 def test_cross_flavor_required_for_distribution():
    r = evaluate(_candidate(cross=False), GateConfig(dist_require_cross_flavor=True))
    assert r.promotable
    assert not r.distribution_ready
    assert any("cross-flavor" in reason for reason in r.reasons)
 def test_gate_config_reads_toml_dict():
    cfg = gate_config({"curate": {"gate": {"min_frequency": 9, "dist_require_cross_flavor": True}}})
    assert cfg.min_frequency == 9
    assert cfg.dist_require_cross_flavor is True
    # defaults preserved for unspecified keys
    assert cfg.dist_min_frequency == 3
 def test_bloat_flags_duplicate_and_near_duplicate(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(candidate_to_pattern(_candidate(key="success:clean_pass:outcome")))
    existing = cat.list()
    # exact same key -> duplicate
    dup = bloat_warnings(_candidate(key="success:clean_pass:outcome"), existing)
    assert any("duplicate" in w for w in dup)
    # different polarity, same signal_type+locus -> near-duplicate
    near = bloat_warnings(_candidate(key="problem:clean_pass:outcome"), existing)
    assert any("near-duplicate" in w for w in near)
    # unrelated -> no warnings
    assert bloat_warnings(_candidate(key="problem:retry_storm:retries"), existing) == []
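Reviewer note: the two-floor evidence bar these tests describe (a promote floor and a stricter distribution floor) can be sketched as below. This is not the `evaluate` implementation. `GateSketch` and `Verdict` are hypothetical names; the default of 3 for `dist_min_frequency` comes from the test above, while the promote floor of 2, the `False` default for `dist_require_cross_flavor`, and the `"rejected"` placeholder status are assumptions. `min_sessions` is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class GateSketch:
    min_frequency: int = 2            # assumed promote floor
    dist_min_frequency: int = 3       # default confirmed by the tests above
    dist_require_cross_flavor: bool = False  # assumed default


@dataclass
class Verdict:
    promotable: bool
    distribution_ready: bool
    status: str
    reasons: list = field(default_factory=list)


def evaluate_sketch(cand: dict, cfg: GateSketch = GateSketch()) -> Verdict:
    """Apply the promote floor, then the stricter distribution floor."""
    reasons = []
    freq = cand["frequency"]
    promotable = freq >= cfg.min_frequency
    if not promotable:
        reasons.append(f"frequency {freq} is below the promote floor {cfg.min_frequency}")
    ready = promotable
    if promotable and freq < cfg.dist_min_frequency:
        ready = False
        reasons.append(f"frequency {freq} is below the distribution floor {cfg.dist_min_frequency}")
    if cfg.dist_require_cross_flavor and not cand.get("cross_flavor"):
        ready = False
        reasons.append("cross-flavor evidence is required for distribution")
    # "rejected" is a placeholder; the real status for this case is not shown.
    status = "approved" if ready else ("provisional" if promotable else "rejected")
    return Verdict(promotable, ready, status, reasons)


thin = evaluate_sketch({"frequency": 2, "cross_flavor": True})
```

Separating the two floors lets a thin pattern enter the catalog as provisional without being pushed out to other repos before it has more evidence behind it.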
--- a/tests/test_curate_review.py
+++ b/tests/test_curate_review.py
@@ -0,0 +1,93 @@
 """Review workflow tests (T03): promote/reject/discuss + idempotent re-review."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.review import (  # noqa: E402
    APPROVE,
    DISCUSS,
    REJECT,
    ReviewLog,
    candidate_to_pattern,
    review,
 )
 from session_memory.curate.schema import SolutionPattern  # noqa: E402
 def _candidate(key="success:clean_pass:outcome", freq=18, flavors=("claude", "grok")):
    return {
        "key": key,
        "polarity": key.split(":")[0],
        "signal_type": key.split(":")[1],
        "locus": key.split(":")[2],
        "title": "cross-flavor success: clean pass",
        "frequency": freq,
        "flavors": list(flavors),
        "repos": ["agentic-resources"],
        "sessions": [f"s{i}" for i in range(freq)],
        "cross_flavor": len(flavors) > 1,
        "cost_impact": 12.5,
    }
 def _decider(action, rationale="because"):
    return lambda cand: (action, rationale)
 def test_approve_promotes_to_catalog(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log = ReviewLog(str(tmp_path / "reviews.jsonl"))
    res = review([_candidate()], _decider(APPROVE), cat, log)
    assert len(res.approved) == 1
    p = cat.load(SolutionPattern.make_id("success:clean_pass:outcome"))
    assert p is not None
    assert p.scope.flavors == ["claude", "grok"]
    assert set(p.rendering_hints) == {"claude", "grok"}
    assert p.provenance.evidence["frequency"] == 18
 def test_reject_records_no_catalog_write(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log = ReviewLog(str(tmp_path / "reviews.jsonl"))
    res = review([_candidate()], _decider(REJECT), cat, log)
    assert res.rejected == ["success:clean_pass:outcome"]
    assert cat.list() == []
 def test_discuss_defers_and_is_not_final(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log = ReviewLog(str(tmp_path / "reviews.jsonl"))
    res = review([_candidate()], _decider(DISCUSS), cat, log)
    assert res.deferred == ["success:clean_pass:outcome"]
    # not recorded as final -> a later pass re-surfaces it
    res2 = review([_candidate()], _decider(APPROVE), cat, log)
    assert len(res2.approved) == 1
 def test_prior_reject_remembered_same_evidence(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log_path = str(tmp_path / "reviews.jsonl")
    review([_candidate()], _decider(REJECT), cat, ReviewLog(log_path))
    # fresh log instance (reloads from disk) + same evidence -> skipped
    res = review([_candidate()], _decider(APPROVE), cat, ReviewLog(log_path))
    assert res.skipped == ["success:clean_pass:outcome"]
    assert cat.list() == []
 def test_changed_evidence_resurfaces(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log_path = str(tmp_path / "reviews.jsonl")
    review([_candidate(freq=18)], _decider(REJECT), cat, ReviewLog(log_path))
    # more evidence now -> not skipped, gets re-reviewed
    res = review([_candidate(freq=40)], _decider(APPROVE), cat, ReviewLog(log_path))
    assert len(res.approved) == 1
 def test_candidate_to_pattern_defaults():
    p = candidate_to_pattern(_candidate(flavors=("claude",)))
    assert p.status == "provisional"
    assert p.rendering_hints["claude"]["target"] == "CLAUDE.md"
    assert p.polarity == "success"
--- a/tests/test_curate_schema.py
+++ b/tests/test_curate_schema.py
@@ -0,0 +1,80 @@
 """Round-trip + validation tests for the Solution Pattern schema (T01)."""
 import os
 import sys
 import pytest
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import (  # noqa: E402
    Provenance,
    Resolution,
    Scope,
    SolutionPattern,
 )
 def _sample() -> SolutionPattern:
    src = "success:clean_pass:outcome"
    return SolutionPattern(
        id=SolutionPattern.make_id(src),
        name="Run tests before declaring success",
        version="1.0.0",
        polarity="success",
        problem="Sessions that run tests and finish with no retries resolve cheaply.",
        resolutions=[Resolution(summary="Always run the suite", steps=["edit", "test", "commit"])],
        scope=Scope(flavors=["claude", "grok"]),
        provenance=Provenance(source_key=src, evidence={"frequency": 18, "cross_flavor": True}),
        rendering_hints={"claude": {"target": "CLAUDE.md"}, "codex": {"target": "AGENTS.md"}},
        status="approved",
        distribution_ready=True,
    )
 def test_round_trip_is_lossless():
    p = _sample()
    again = SolutionPattern.from_json(p.to_json())
    assert again.to_dict() == p.to_dict()
    assert again.resolutions[0].steps == ["edit", "test", "commit"]
    assert again.scope.flavors == ["claude", "grok"]
    assert again.provenance.evidence["cross_flavor"] is True
 def test_serialization_is_deterministic():
    p = _sample()
    assert p.to_json() == p.to_json()
    assert SolutionPattern.from_json(p.to_json()).to_json() == p.to_json()
 def test_make_id_is_stable_and_slugged():
    assert SolutionPattern.make_id("success:clean_pass:outcome") == "sp-success-clean_pass-outcome"
    # same source key -> same id regardless of later wording
    assert SolutionPattern.make_id("problem:abandoned:outcome") == SolutionPattern.make_id(
        "problem:abandoned:outcome"
    )
 def test_bump_version():
    assert SolutionPattern.bump_version("1.0.0") == "1.0.1"
    assert SolutionPattern.bump_version("1.2.3", "minor") == "1.3.0"
    assert SolutionPattern.bump_version("1.2.3", "major") == "2.0.0"
 def test_rejects_unknown_polarity():
    with pytest.raises(ValueError):
        SolutionPattern(id="x", name="n", version="1.0.0", polarity="meh", problem="p")
 def test_rejects_unknown_status():
    with pytest.raises(ValueError):
        SolutionPattern(id="x", name="n", version="1.0.0", polarity="problem",
                        problem="p", status="bogus")
 def test_rejects_unknown_flavor_in_hints_and_scope():
    with pytest.raises(ValueError):
        SolutionPattern(id="x", name="n", version="1.0.0", polarity="problem",
                        problem="p", rendering_hints={"gpt": {}})
    with pytest.raises(ValueError):
        Scope(flavors=["gpt"])
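Reviewer note: the schema tests above pin down two small helpers, `make_id` and `bump_version`. A minimal sketch of behaviour consistent with those assertions is shown below. It is written as free functions for illustration only; the real implementations are `SolutionPattern` methods whose bodies are not part of this diff, so the `"patch"` default name and the error handling are assumptions.

```python
def make_id(source_key: str) -> str:
    """Derive a stable pattern id from a 'polarity:signal_type:locus' key.

    Assumption: the id is just the key with ':' replaced by '-' and an
    'sp-' prefix, which is all the tests above require.
    """
    return "sp-" + source_key.replace(":", "-")


def bump_version(version: str, part: str = "patch") -> str:
    """Bump one component of a MAJOR.MINOR.PATCH version string.

    Assumption: the default part is called 'patch'; the tests only show
    that omitting the argument bumps the last component.
    """
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")
```

Because the id depends only on the source key, re-promoting the same candidate with different wording would map to the same catalog entry, which is what `test_make_id_is_stable_and_slugged` checks.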
--- a/tests/test_detect_entrypoint.py
+++ b/tests/test_detect_entrypoint.py
@@ -0,0 +1,47 @@
 """Detect entrypoint tests (T07): end-to-end digests -> patterns, persisted."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.store import Store  # noqa: E402
 from session_memory.detect.__main__ import run_detect  # noqa: E402
 def _digest(uid, flavor, repo, **markers):
    return {
        "session_uid": uid, "flavor": flavor, "repo": repo, "outcome": "fail",
        "cost": {"input_tokens": 10, "output_tokens": 1},
        "markers": {"errors": markers.get("errors", 0), "retries": markers.get("retries", 0),
                    "test_runs": 0, "edits": 0, "human_interventions": 0},
        # fields the quality filter (WP-0005 T01) checks — real coding session
        "event_count": 40, "first_prompt": "Fix the failing build and retry the suite",
        "tool_histogram": {"Bash": 20, "Edit": 12, "Read": 8},
    }
 def _config(tmp_path):
    return {"store": {"db_path": str(tmp_path / ".store/m.db"),
                      "blob_dir": str(tmp_path / ".store/blobs"),
                      "cursor": str(tmp_path / ".store/c.json")}}
 def test_run_detect_persists_cross_flavor_pattern(tmp_path):
    cfg = _config(tmp_path)
    st = Store(cfg["store"]["db_path"], cfg["store"]["blob_dir"])
    # same problem (retry_storm) across two flavors -> cross-flavor candidate
    st.write_digest("claude:a", _digest("claude:a", "claude", "r1", retries=5))
    st.write_digest("codex:b", _digest("codex:b", "codex", "r2", retries=4))
    st.close()
    patterns = run_detect(cfg, min_frequency=2)
    assert len(patterns) == 1
    assert patterns[0]["cross_flavor"] is True
    assert patterns[0]["signal_type"] == "retry_storm"
    # persisted to the Tier 2 patterns table
    st2 = Store(cfg["store"]["db_path"], cfg["store"]["blob_dir"])
    rows = st2.db.execute("SELECT key FROM patterns").fetchall()
    assert len(rows) == 1
    st2.close()
--- a/tests/test_detect_infra_signals.py
+++ b/tests/test_detect_infra_signals.py
@@ -0,0 +1,80 @@
 """Infra-overhead + thrash signal tests (WP-0005 T02)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.detect.signals import (  # noqa: E402
    build_context,
    extract_signals,
    sig_infra_overhead,
    sig_schema_thrash,
    sig_tool_thrash,
    tool_bucket,
 )
 def _digest(uid="claude:a", repo="r1", tools=None):
    return {"session_uid": uid, "flavor": "claude", "repo": repo, "outcome": "success",
            "cost": {"input_tokens": 1, "output_tokens": 1},
            "markers": {"errors": 0, "retries": 0, "test_runs": 0},
            "tool_histogram": tools or {}}
 CTX = {"infra_min_calls": 20, "infra_overhead_threshold": 0.30,
       "schema_thrash_threshold": 5, "tool_thrash_threshold": 80}
 def test_tool_bucket_mapping():
    assert tool_bucket("mcp__state-hub__update_task_status") == "statehub_mcp"
    assert tool_bucket("ToolSearch") == "schema_load"
    assert tool_bucket("TaskUpdate") == "task_mgmt"
    assert tool_bucket("Bash") == "shell"
    assert tool_bucket("Edit") == "edit"
 def test_infra_overhead_fires_above_share():
    # 18 statehub of 30 total = 60% overhead
    d = _digest(tools={"mcp__state-hub__create_task": 18, "Bash": 8, "Edit": 4})
    sig = sig_infra_overhead(d, CTX)
    assert sig and sig[0].type == "infra_overhead"
    assert sig[0].magnitude >= 0.30
    assert sig[0].detail["statehub"] == 18
 def test_infra_overhead_quiet_when_mostly_work():
    d = _digest(tools={"mcp__state-hub__create_task": 3, "Bash": 40, "Edit": 30})
    assert sig_infra_overhead(d, CTX) == []
 def test_infra_overhead_ignores_tiny_sessions():
    d = _digest(tools={"mcp__state-hub__create_task": 5})  # below infra_min_calls
    assert sig_infra_overhead(d, CTX) == []
 def test_schema_thrash_fires():
    d = _digest(tools={"ToolSearch": 9, "Bash": 5})
    sig = sig_schema_thrash(d, CTX)
    assert sig and sig[0].type == "schema_thrash"
    assert sig[0].detail["tool_searches"] == 9
 def test_tool_thrash_fires_on_dominant_tool():
    d = _digest(tools={"Bash": 120, "Edit": 5})
    sig = sig_tool_thrash(d, CTX)
    assert sig and sig[0].locus == "tool:Bash"
 def test_extract_signals_includes_infra():
    d = _digest(tools={"mcp__state-hub__create_task": 18, "Bash": 8, "Edit": 4,
                       "ToolSearch": 6})
    types = {s.type for s in extract_signals([d])}
    assert "infra_overhead" in types
    assert "schema_thrash" in types
 def test_build_context_has_infra_defaults():
    ctx = build_context([])
    assert ctx["infra_overhead_threshold"] == 0.30
    assert ctx["schema_thrash_threshold"] == 5
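Reviewer note: the infra-overhead tests above rely on the arithmetic "18 statehub of 30 total = 60% overhead". The sketch below shows one way that share could be computed. The bucket names come from `test_tool_bucket_mapping`; which buckets count as infrastructure, the `"other"` fallback, and the `infra_overhead_share` helper itself are assumptions, not the code in `session_memory.detect.signals`.

```python
from typing import Dict, Optional

# Assumption: these three buckets are treated as infrastructure overhead.
INFRA_BUCKETS = {"statehub_mcp", "schema_load", "task_mgmt"}

_EXACT = {
    "ToolSearch": "schema_load",
    "TaskUpdate": "task_mgmt",
    "Bash": "shell",
    "Edit": "edit",
}


def tool_bucket(tool: str) -> str:
    """Map a tool name to a coarse bucket (hypothetical re-implementation)."""
    if tool.startswith("mcp__state-hub__"):
        return "statehub_mcp"
    return _EXACT.get(tool, "other")


def infra_overhead_share(histogram: Dict[str, int], min_calls: int = 20) -> Optional[float]:
    """Return infra calls / total calls, or None for sessions too small to judge."""
    total = sum(histogram.values())
    if total < min_calls:
        return None
    infra = sum(n for tool, n in histogram.items() if tool_bucket(tool) in INFRA_BUCKETS)
    return infra / total
```

Under these assumptions a signal would fire when the share is at least `infra_overhead_threshold` (0.30 in `CTX`), and sessions below `infra_min_calls` would be skipped entirely.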
--- a/tests/test_detect_quality.py
+++ b/tests/test_detect_quality.py
@@ -0,0 +1,61 @@
 """Session-quality filter tests (T01)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.detect.quality import (  # noqa: E402
    QualityConfig,
    filter_real,
    is_real_coding_session,
    quality_config,
 )
 def _digest(repo="agentic-resources", events=60, prompt="Implement the curate entrypoint",
            tools=None):
    return {
        "session_uid": "claude:x", "flavor": "claude", "repo": repo,
        "event_count": events, "first_prompt": prompt,
        "tool_histogram": tools if tools is not None else {"Bash": 20, "Edit": 15, "Read": 8},
    }
 def test_real_session_passes():
    assert is_real_coding_session(_digest()) is True
 def test_healthcheck_prompt_dropped():
    assert is_real_coding_session(_digest(events=3, prompt="Say hello in one word.",
                                          tools={})) is False
 def test_interrupted_dropped():
    assert is_real_coding_session(_digest(events=1, prompt="[Request interrupted by user]",
                                          tools={})) is False
 def test_too_short_dropped():
    assert is_real_coding_session(_digest(events=5)) is False
 def test_no_repo_dropped():
    assert is_real_coding_session(_digest(repo=None)) is False
 def test_no_substantive_tools_dropped():
    # plenty of events but only plumbing calls -> not real coding
    assert is_real_coding_session(
        _digest(tools={"mcp__state-hub__update_task_status": 40})) is False
 def test_filter_real_keeps_only_real():
    digs = [_digest(), _digest(events=3, prompt="hello", tools={}), _digest(repo=None)]
    assert len(filter_real(digs)) == 1
 def test_quality_config_from_toml():
    cfg = quality_config({"detect": {"quality": {"min_events": 50}}})
    assert cfg.min_events == 50
    assert cfg.min_substantive == 3  # default preserved
--- a/tests/test_detect_recurring_error.py
+++ b/tests/test_detect_recurring_error.py
@@ -0,0 +1,59 @@
 """Recurring-error signal + clustering (WP-0006 T02)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.detect.cluster import cluster  # noqa: E402
 from session_memory.detect.signals import (  # noqa: E402
    extract_signals,
    sig_recurring_error,
 )
 def _digest(uid, repo, flavor="claude", snippets=None):
    return {
        "session_uid": uid, "flavor": flavor, "repo": repo, "outcome": "success",
        "cost": {"input_tokens": 1, "output_tokens": 1},
        "markers": {"errors": 0, "retries": 0, "test_runs": 0},
        "tool_histogram": {}, "error_snippets": snippets or [],
    }
 _FP = "modulenotfounderror: no module named 'foo' at <path>:<n>"
 def test_signal_per_distinct_fingerprint():
    d = _digest("claude:a", "r1", snippets=[
        {"fingerprint": _FP, "sample": "ModuleNotFoundError ...", "count": 3, "tool": "Bash"},
        {"fingerprint": "keyerror: <str>", "sample": "KeyError", "count": 1, "tool": None},
    ])
    sigs = sig_recurring_error(d, {})
    assert len(sigs) == 2
    top = [s for s in sigs if s.locus == _FP][0]
    assert top.type == "recurring_error"
    assert top.magnitude == 3.0
    assert top.detail["sample"].startswith("ModuleNotFound")
 def test_clusters_across_sessions_and_flavors():
    # same fingerprint in a claude and a grok session -> cross-flavor candidate
    digs = [
        _digest("claude:a", "r1", "claude",
                [{"fingerprint": _FP, "sample": "ModuleNotFoundError", "count": 2, "tool": "Bash"}]),
        _digest("grok:b", "r2", "grok",
                [{"fingerprint": _FP, "sample": "ModuleNotFoundError", "count": 1, "tool": None}]),
    ]
    signals = extract_signals(digs)
    pats = cluster([s for s in signals if s.type == "recurring_error"], min_frequency=2)
    assert len(pats) == 1
    p = pats[0]
    assert p.signal_type == "recurring_error"
    assert p.cross_flavor is True
    assert sorted(p.flavors) == ["claude", "grok"]
    assert p.frequency == 2
 def test_no_snippets_no_signal():
    assert sig_recurring_error(_digest("claude:a", "r1"), {}) == []
--- a/tests/test_digest.py
+++ b/tests/test_digest.py
@@ -0,0 +1,82 @@
 """Digest tests (T04): outcome heuristic + Tier 2 promotion."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.adapters.claude import Normalized  # noqa: E402
 from session_memory.core.digest import analyze, build_digest, infer_outcome  # noqa: E402
 from session_memory.core.schema import Cost, Session, SessionEvent  # noqa: E402
 from session_memory.core.store import Store  # noqa: E402
 def _ev(uid, seq, kind, **kw):
    return SessionEvent(session_uid=uid, seq=seq, kind=kind, **kw)
 def test_infer_outcome_abandoned():
    uid = "claude:s"
    assert infer_outcome([_ev(uid, 0, "user_msg")]) == "abandoned"
 def test_infer_outcome_success_on_passing_test():
    uid = "claude:s"
    events = [
        _ev(uid, 0, "user_msg"),
        _ev(uid, 1, "assistant_msg"),
        _ev(uid, 2, "test_run", tool="Bash"),
        _ev(uid, 3, "tool_result", payload_ref="b3"),
    ]
    assert infer_outcome(events, {"b3": "6 passed in 0.4s"}) == "success"
 def test_infer_outcome_fail_on_failing_test():
    uid = "claude:s"
    events = [
        _ev(uid, 0, "user_msg"),
        _ev(uid, 1, "assistant_msg"),
        _ev(uid, 2, "test_run", tool="Bash"),
        _ev(uid, 3, "tool_result", payload_ref="b3"),
    ]
    assert infer_outcome(events, {"b3": "1 failed, traceback ..."}) == "fail"
 def test_build_digest_histograms_and_markers():
    uid = "claude:s"
    s = Session(session_uid=uid, flavor="claude", native_session_id="s",
                repo="agentic-resources", cost=Cost(input_tokens=100, output_tokens=40, turns=2))
    events = [
        _ev(uid, 0, "user_msg"),
        _ev(uid, 1, "edit", tool="Edit"),
        _ev(uid, 2, "edit", tool="Write"),
        _ev(uid, 3, "test_run", tool="Bash"),
        _ev(uid, 4, "error"),
        _ev(uid, 5, "assistant_msg"),
    ]
    d = build_digest(s, events)
    assert d["tool_histogram"] == {"Edit": 1, "Write": 1, "Bash": 1}
    assert d["markers"]["edits"] == 2
    assert d["markers"]["errors"] == 1
    assert d["markers"]["test_runs"] == 1
    assert d["event_count"] == 6
    assert d["cost"]["input_tokens"] == 100
 def test_analyze_writes_digest_and_sets_analyzed(tmp_path):
    st = Store(str(tmp_path / "m.db"), str(tmp_path / "blobs"))
    uid = Session.make_uid("claude", "s1")
    s = Session(session_uid=uid, flavor="claude", native_session_id="s1")
    events = [
        SessionEvent(session_uid=uid, seq=0, kind="user_msg", payload_ref="b0"),
        SessionEvent(session_uid=uid, seq=1, kind="assistant_msg", payload_ref="b1"),
    ]
    blobs = {"b0": "please help", "b1": "done"}
    st.ingest(Normalized(session=s, events=events, blobs=blobs))
    assert st.get_session(uid).is_evictable is False
    d = analyze(st, uid)
    assert d["outcome"] == "success"
    assert d["first_prompt"] == "please help"
    assert st.get_session(uid).analyzed_at is not None
    assert st.get_session(uid).is_evictable is True  # now promoted -> evictable
--- a/tests/test_digest_errors.py
+++ b/tests/test_digest_errors.py
@@ -0,0 +1,101 @@
 """Error-body mining into the digest (WP-0006 T01)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.digest import (  # noqa: E402
    _error_fingerprint,
    _error_snippets,
    build_digest,
 )
 from session_memory.core.schema import SCHEMA_VERSION, Session, SessionEvent  # noqa: E402
 def _ev(seq, kind, **kw):
    return SessionEvent(session_uid="claude:s", seq=seq, kind=kind, **kw)
 def test_fingerprint_normalizes_paths_numbers_ids():
    a = _error_fingerprint("ModuleNotFoundError: No module named 'foo' at /home/x/a.py:42")
    b = _error_fingerprint("ModuleNotFoundError: No module named 'foo' at /srv/y/b.py:9991")
    assert a == b  # paths + line numbers stripped -> same fingerprint
    assert "<path>" in a and "<n>" in a
 def test_fingerprint_uuid_and_addr():
    fp = _error_fingerprint("connection 0xDEADBEEF to 1972d1d9-fc35-4912-8126-1fe64cc51425 failed")
    assert "<addr>" in fp and "<uuid>" in fp
 def test_snippets_dedup_and_count():
    blobs = {"b1": "Traceback...\nValueError: bad thing at /p/x.py:10",
             "b2": "Traceback...\nValueError: bad thing at /q/y.py:99",
             "b3": "KeyError: 'id'"}
    events = [
        _ev(0, "error", payload_ref="b1"),
        _ev(1, "error", payload_ref="b2"),       # same fingerprint as b1
        _ev(2, "error", payload_ref="b3"),
    ]
    snips = _error_snippets(events, blobs)
    assert len(snips) == 2
    top = snips[0]
    assert top["count"] == 2  # the ValueError collapsed
    assert "ValueError" in top["sample"]
 def test_failed_tool_result_mined():
    blobs = {"b1": "npm ERR! something failed with non-zero exit"}
    events = [_ev(0, "tool_result", tool="Bash", payload_ref="b1")]
    snips = _error_snippets(events, blobs)
    assert len(snips) == 1
    assert snips[0]["tool"] == "Bash"
 def test_clean_tool_result_not_mined():
    blobs = {"b1": "6 passed in 0.4s"}
    events = [_ev(0, "tool_result", tool="Bash", payload_ref="b1")]
    assert _error_snippets(events, blobs) == []
 def test_success_json_not_mined():
    # a hub MCP success payload mentioning 'error' deep inside is NOT a failure
    blobs = {"b1": '{"result": "{\\"domain\\": \\"custodian\\", \\"note\\": \\"no errors\\"}"}'}
    events = [_ev(0, "tool_result", tool="mcp__state-hub__get_domain_summary", payload_ref="b1")]
    assert _error_snippets(events, blobs) == []
 def test_error_json_still_mined():
    blobs = {"b1": '{"detail": "Invalid request parameters"}'}
    events = [_ev(0, "tool_result", tool="Bash", payload_ref="b1")]
    snips = _error_snippets(events, blobs)
    assert len(snips) == 1
 def test_plain_mcp_error_still_mined():
    blobs = {"b1": "MCP error -32602: Invalid request parameters"}
    events = [_ev(0, "tool_result", tool="Bash", payload_ref="b1")]
    assert len(_error_snippets(events, blobs)) == 1
 def test_file_read_snapshot_not_mined():
    # a Read result of source code containing 'raise ...Error' is not a runtime error
    blobs = {"b1": "227\t    def f():\n228\t        x = 1\n229\t        raise InfospaceError()\n"}
    events = [_ev(0, "tool_result", tool="Read", payload_ref="b1")]
    assert _error_snippets(events, blobs) == []
 def test_build_digest_includes_error_snippets_and_v2():
    s = Session(session_uid="claude:s", flavor="claude", native_session_id="s", repo="r")
    events = [_ev(0, "user_msg"), _ev(1, "error", payload_ref="b1"), _ev(2, "assistant_msg")]
    d = build_digest(s, events, {"b1": "RuntimeError: kaboom at /a/b.py:3"})
    assert d["schema_version"] == SCHEMA_VERSION == 2
    assert d["error_snippets"][0]["count"] == 1
    assert "RuntimeError" in d["error_snippets"][0]["sample"]
 def test_no_errors_empty_list():
    s = Session(session_uid="claude:s", flavor="claude", native_session_id="s", repo="r")
    d = build_digest(s, [_ev(0, "user_msg"), _ev(1, "assistant_msg")])
    assert d["error_snippets"] == []
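Reviewer note: the fingerprint tests above expect paths, line numbers, hex addresses and UUIDs to collapse into placeholders so that the same error from different locations clusters together. A minimal sketch of such a normalizer follows. The function name `error_fingerprint`, the regexes and their ordering are assumptions; the private `_error_fingerprint` in `session_memory.core.digest` is not shown in this diff and may differ.

```python
import re

# Order matters: UUIDs and addresses contain digits, so replace them
# before the generic number rule runs.
_RULES = [
    (re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"), "<uuid>"),
    (re.compile(r"0x[0-9a-f]+"), "<addr>"),
    (re.compile(r"(?:/[\w.\-]+)+"), "<path>"),
    (re.compile(r"\d+"), "<n>"),
]


def error_fingerprint(text: str) -> str:
    """Lowercase the message and replace volatile tokens with placeholders."""
    fp = text.strip().lower()
    for pattern, placeholder in _RULES:
        fp = pattern.sub(placeholder, fp)
    return fp
```

With this sketch, the two `ModuleNotFoundError` messages in `test_fingerprint_normalizes_paths_numbers_ids` reduce to the same string, matching the `_FP` constant used in `tests/test_detect_recurring_error.py`.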
--- a/tests/test_digest_lookup.py
+++ b/tests/test_digest_lookup.py
@@ -0,0 +1,78 @@
 """digest_lookup entrypoint tests (AGENTIC-WP-0011 T03)."""
 import json
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.store import Store  # noqa: E402
 from session_memory.digest_lookup import lookup_digest, main, resolve_store_paths  # noqa: E402
 def _write_config(tmp_path) -> tuple[str, str]:
    store = tmp_path / ".store"
    toml = tmp_path / "config.toml"
    toml.write_text(
        f'[store]\ndb_path = "{store / "m.db"}"\nblob_dir = "{store / "blobs"}"\n'
        f'cursor = "{store / "c.json"}"\n')
    return str(toml), str(store)
 def _seed(store_dir, uid="claude:test-uid"):
    st = Store(os.path.join(store_dir, "m.db"), os.path.join(store_dir, "blobs"))
    st.write_digest(uid, {
        "session_uid": uid,
        "flavor": "claude",
        "repo": "agentic-resources",
        "outcome": "success",
        "started_at": "2026-06-19T10:00:00Z",
        "ended_at": "2026-06-19T11:00:00Z",
        "cost": {"input_tokens": 100, "output_tokens": 25},
        "tool_histogram": {"Bash": 10, "Edit": 5},
    })
    st.close()
    return uid
 def test_resolve_store_paths_from_config(tmp_path):
    cfg_path, store_dir = _write_config(tmp_path)
    db, blob = resolve_store_paths(config_path=cfg_path)
    assert db.endswith("m.db")
    assert blob.endswith("blobs")
    assert store_dir in db
 def test_resolve_store_paths_from_env(tmp_path, monkeypatch):
    db = tmp_path / "custom" / "mem.db"
    db.parent.mkdir(parents=True)
    monkeypatch.setenv("HELIX_STORE_DB", str(db))
    resolved_db, blob = resolve_store_paths()
    assert resolved_db == str(db)
    assert blob == str(tmp_path / "custom" / "blobs")
 def test_lookup_digest_found_and_missing(tmp_path):
    cfg_path, store_dir = _write_config(tmp_path)
    uid = _seed(store_dir)
    found = lookup_digest(uid, config_path=cfg_path)
    assert found is not None and found["outcome"] == "success"
    assert lookup_digest("claude:missing", config_path=cfg_path) is None
 def test_main_json_success(tmp_path, capsys):
    cfg_path, store_dir = _write_config(tmp_path)
    uid = _seed(store_dir)
    rc = main(["--config", cfg_path, uid, "--json"])
    assert rc == 0
    data = json.loads(capsys.readouterr().out)
    assert data["session_uid"] == uid
    assert data["repo"] == "agentic-resources"
 def test_main_not_found(tmp_path, capsys):
    cfg_path, store_dir = _write_config(tmp_path)
    _seed(store_dir)
    rc = main(["--config", cfg_path, "claude:missing"])
    assert rc == 1
    assert "not found" in capsys.readouterr().err.lower()
--- a/tests/test_distribute_base.py
+++ b/tests/test_distribute_base.py
@@ -0,0 +1,88 @@
 """Distributor base tests (WP-0007 T01): markers, idempotent upsert, rendering."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import Resolution, SolutionPattern  # noqa: E402
 from session_memory.distribute.base import (  # noqa: E402
    Artifact,
    BaseDistributor,
    Distributor,
    render_markdown_body,
    upsert_block,
    wrap_block,
 )
 def _pattern(pid="sp-x", polarity="problem"):
    return SolutionPattern(
        id=pid, name="Read before edit", version="1.2.0", polarity=polarity,
        problem="Agents edit files they have not read.",
        resolutions=[Resolution(summary="Read the file first", detail="then Edit",
                                steps=["Read", "Edit"])],
        rendering_hints={"claude": {"target": "CLAUDE.md"}},
    )
 def test_render_markdown_body_has_problem_and_resolution():
    body = render_markdown_body(_pattern())
    assert "### Read before edit" in body
    assert "Agents edit files" in body
    assert "**Avoid:**" in body  # problem polarity
    assert "- Read the file first — then Edit" in body
    assert "  - Read" in body
 def test_success_polarity_label():
    assert "**Prefer:**" in render_markdown_body(_pattern(polarity="success"))
 def test_wrap_block_has_markers_and_version():
    block = wrap_block("sp-x", "hello", "1.2.0")
    assert block.startswith("<!-- BEGIN helix-forge pattern:sp-x --> v1.2.0")
    assert block.rstrip().endswith("<!-- END helix-forge pattern:sp-x -->")
 def test_upsert_inserts_then_replaces_in_place():
    doc = "# Title\n\nsome text\n"
    b1 = wrap_block("sp-x", "first", "1")
    once = upsert_block(doc, "sp-x", b1)
    assert "first" in once and once.count("BEGIN helix-forge pattern:sp-x") == 1
    # re-distributing the same id replaces, does not duplicate
    b2 = wrap_block("sp-x", "second", "2")
    twice = upsert_block(once, "sp-x", b2)
    assert "second" in twice and "first" not in twice
    assert twice.count("BEGIN helix-forge pattern:sp-x") == 1
 def test_upsert_keeps_other_patterns():
    doc = upsert_block("", "sp-a", wrap_block("sp-a", "A"))
    doc = upsert_block(doc, "sp-b", wrap_block("sp-b", "B"))
    assert "sp-a" in doc and "sp-b" in doc
 def test_base_distributor_renders_artifact():
    d = BaseDistributor(flavor="claude", target_path="CLAUDE.md")
    art = d.render(_pattern())
    assert isinstance(art, Artifact)
    assert isinstance(d, Distributor)  # satisfies the protocol
    assert art.flavor == "claude"
    assert art.target_path == "CLAUDE.md"
    assert "BEGIN helix-forge pattern:sp-x" in art.content
    assert "Read before edit" in art.content
 def test_body_hint_overrides_default():
    p = _pattern()
    p.rendering_hints["claude"]["body"] = "custom claude body"
    d = BaseDistributor(flavor="claude", target_path="CLAUDE.md")
    assert "custom claude body" in d.render(p).content
 def test_target_hint_overrides_default():
    p = _pattern()
    p.rendering_hints["claude"]["target"] = "docs/CLAUDE.md"
    d = BaseDistributor(flavor="claude", target_path="CLAUDE.md")
    assert d.render(p).target_path == "docs/CLAUDE.md"
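Reviewer note: the distributor tests above depend on marker-delimited blocks that can be inserted once and then replaced in place. The sketch below shows one way to get that idempotent upsert. The marker strings are taken from the assertions; the function bodies, the optional `version` default and the blank-line handling are assumptions rather than the code in `session_memory.distribute.base`.

```python
import re

BEGIN = "<!-- BEGIN helix-forge pattern:{pid} -->"
END = "<!-- END helix-forge pattern:{pid} -->"


def wrap_block(pid: str, body: str, version: str = "") -> str:
    """Wrap a rendered body in BEGIN/END markers, tagging the version if given."""
    head = BEGIN.format(pid=pid)
    if version:
        head += f" v{version}"
    return f"{head}\n{body}\n{END.format(pid=pid)}"


def upsert_block(doc: str, pid: str, block: str) -> str:
    """Replace the existing block for pid, or append it if none exists."""
    existing = re.compile(
        re.escape(BEGIN.format(pid=pid)) + r".*?" + re.escape(END.format(pid=pid)),
        re.DOTALL,
    )
    if existing.search(doc):
        # A callable replacement avoids backslash escapes in block being interpreted.
        return existing.sub(lambda _m: block, doc, count=1)
    if not doc:
        return block + "\n"
    sep = "\n" if doc.endswith("\n") else "\n\n"
    return f"{doc}{sep}{block}\n"
```

Keying the search on the full marker, including the closing ` -->`, keeps `sp-x` from matching a longer id such as `sp-xy`, so unrelated patterns in the same target file are left untouched.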
--- a/tests/test_distribute_claude.py
+++ b/tests/test_distribute_claude.py
@@ -0,0 +1,40 @@
 """Claude distributor tests (WP-0007 T02)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import Resolution, SolutionPattern  # noqa: E402
 from session_memory.distribute.claude import ClaudeDistributor  # noqa: E402
 def _pattern(hints=None):
    return SolutionPattern(
        id="sp-read-before-edit", name="Read before edit", version="1.0.0",
        polarity="problem", problem="Agents edit files they have not read.",
        resolutions=[Resolution(summary="Read the file first", steps=["Read", "Edit"])],
        rendering_hints=hints or {"claude": {}},
    )
 def test_default_targets_claude_md():
    art = ClaudeDistributor().render(_pattern())
    assert art.flavor == "claude"
    assert art.target_path == "CLAUDE.md"
    assert "BEGIN helix-forge pattern:sp-read-before-edit" in art.content
    assert "### Read before edit" in art.content
 def test_skill_mode_emits_skill_stub():
    art = ClaudeDistributor().render(_pattern({"claude": {"as": "skill"}}))
    assert "## Skill: Read before edit" in art.content
    assert "**When:**" in art.content
    assert "  - Read" in art.content
 def test_idempotent_marker_present_for_reupsert():
    art = ClaudeDistributor().render(_pattern())
    # same id in both renders -> caller can upsert in place
    art2 = ClaudeDistributor().render(_pattern())
    assert art.pattern_id == art2.pattern_id == "sp-read-before-edit"
--- a/tests/test_distribute_codex_grok.py
+++ b/tests/test_distribute_codex_grok.py
@@ -0,0 +1,49 @@
 """Codex + Grok distributor + registry tests (WP-0007 T03)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import Resolution, SolutionPattern  # noqa: E402
 from session_memory.distribute.codex import CodexDistributor  # noqa: E402
 from session_memory.distribute.grok import GrokDistributor  # noqa: E402
 from session_memory.distribute.registry import all_flavors, get_distributor  # noqa: E402
 def _pattern():
    return SolutionPattern(
        id="sp-x", name="Read before edit", version="1.0.0", polarity="problem",
        problem="Agents edit files they have not read.",
        resolutions=[Resolution(summary="Read the file first")],
    )
 def test_codex_targets_agents_md():
    art = CodexDistributor().render(_pattern())
    assert art.flavor == "codex" and art.target_path == "AGENTS.md"
    assert "Read before edit" in art.content
 def test_grok_targets_native_instructions():
    art = GrokDistributor().render(_pattern())
    assert art.flavor == "grok" and art.target_path == ".grok/instructions.md"
 def test_same_pattern_expressible_for_all_flavors():
    # FR-A3: one pattern, rendered for every flavor (same body, different targets)
    p = _pattern()
    bodies = {}
    for f in all_flavors():
        art = get_distributor(f).render(p)
        # strip markers -> compare agnostic body
        inner = art.content.split("\n", 1)[1].rsplit("\n", 1)[0]
        bodies[f] = inner
    targets = {get_distributor(f).render(p).target_path for f in all_flavors()}
    assert len(targets) == 3                 # distinct per-flavor targets
    assert len(set(bodies.values())) == 1    # identical agnostic body
 def test_registry_unknown_flavor():
    assert get_distributor("gpt") is None
    assert set(all_flavors()) == {"claude", "codex", "grok"}
		`@@ -0,0 +1 @@`
							`"""Per-flavor collector adapters (Tier 0 -> Tier 1 normalization)."""`
		`@@ -0,0 +1 @@`
							{"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-problem-budget_overrun-tokens", "name": "problem: budget overrun", "polarity": "problem", "problem": "problem: budget overrun", "provenance": {"detected_at": null, "evidence": {"cost_impact": 10.667, "cross_flavor": false, "flavors": ["claude"], "frequency": 3, "key": "problem:budget_overrun:tokens", "locus": "tokens", "polarity": "problem", "repos": ["artifact-store", "citation-evidence", "infospace-bench"], "score": 32.001, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:6e0d3d68-872b-4d93-bb09-0691e091314b", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78"], "signal_type": "budget_overrun", "title": "problem: budget overrun"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:budget_overrun:tokens"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["artifact-store", "citation-evidence", "infospace-bench"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
		`@@ -0,0 +1 @@`
							{"covers": [], "created_at": "2026-06-07T13:26:25Z", "distribution_ready": true, "id": "sp-problem-file_not_read-edit", "name": "Read before you Edit", "polarity": "problem", "problem": "Agents call Edit/Write on a file they have not read in the current session, or after it changed under them. The edit tools reject this ('File has not been read yet' / 'File has been modified since read'), and the retry burns a turn. Top recurring error in the corpus (12/27 sessions, 8 repos).", "provenance": {"detected_at": null, "evidence": {"frequency": 32, "origin": "AGENTIC-WP-0006 error mining / ASSESSMENT-infra-friction.md", "polarity": "problem", "repos": 8, "sessions": 12}, "promoted_at": null, "source_key": "problem:file_not_read:edit"}, "rendering_hints": {"claude": {"target": "CLAUDE.md"}, "codex": {"target": "AGENTS.md"}, "grok": {"target": ".grok/instructions.md"}}, "resolutions": [{"detail": "Never blind-write a file you haven't read this session.", "steps": ["Read the target file", "Then Edit/Write"], "summary": "Read the file (or the region you'll touch) before Edit/Write"}, {"detail": "A stale read means the file changed under you; refresh, don't loop.", "steps": ["Re-Read the file", "Re-apply the Edit"], "summary": "On 'modified since read', re-Read then re-Edit"}], "schema_version": 1, "scope": {"domains": [], "flavors": [], "repos": []}, "status": "superseded", "updated_at": "2026-06-07T13:26:25Z", "version": "1.0.0"}
		`@@ -0,0 +1 @@`
							`"""Flavor-agnostic core: schema, store, cursor, digest, retention."""`
		`@@ -0,0 +1 @@`
							`"""Detect: extract signals from sessions, cluster into candidate patterns."""`
		`@@ -0,0 +1 @@`
							`{"captured_at": "2026-06-07T13:30:14Z", "error_rate": 0.963, "infra_overhead_share_median": 0.117, "infra_overhead_share_p90": 0.261, "label": "phase4-baseline (pre-fixes)", "n_sessions": 27, "recurring_error_occurrences": 505, "schema_thrash_sessions": 8, "success_rate": 1.0, "tokens_p50": 250725, "tokens_p90": 1423966}`