established rules

Add .repo-classification.yaml (CUST-WP-0050 T11 agent first-pass)
chore(consistency): sync task status from DB [auto]
2026-06-22 23:06:36 +02:00 · 2026-06-22 17:47:34 +02:00 · 2026-06-21 16:09:45 +02:00 · 2026-06-21 16:09:34 +02:00 · 2026-06-19 20:37:50 +02:00 · 2026-06-19 20:27:00 +02:00
114 changed files with 8016 additions and 121 deletions
--- a/.claude/rules/agents.md
+++ b/.claude/rules/agents.md
@@ -0,0 +1,20 @@
 ## Kaizen Agents
 Specialized agent personas available on demand via the state-hub MCP.
 **Discover:** `list_kaizen_agents()` — returns all agents with name, description, category
 **Load:** `get_kaizen_agent("tdd-workflow")` — returns full instructions; read and follow them
 Common agents:
 | Agent | Category | When to use |
 |-------|----------|-------------|
 | `tdd-workflow` | testing | Step-by-step TDD8 workflow for any feature |
 | `code-refactoring` | quality | Code quality analysis and safe refactoring |
 | `test-maintenance` | testing | Diagnose and fix failing tests |
 | `requirements-engineering` | process | Prevent interface/mock mismatches upfront |
 | `keepaTodofile` | process | Maintain TODO.md during work |
 | `project-management` | process | Track status, determine next steps |
 | `datamodel-optimization` | quality | Optimize dataclasses and data structures |
 All 17 agents: call `list_kaizen_agents()` for the full list.
--- a/.claude/rules/architecture.md
+++ b/.claude/rules/architecture.md
@@ -0,0 +1,8 @@
 ## Architecture
 <!-- TODO: Describe the key design decisions and component structure.
     Key modules, data flows, external integrations, state machines, etc. -->
 ## Quick Reference
 `~/state-hub/mcp_server/TOOLS.md` — MCP tool reference
--- a/.claude/rules/credential-routing.md
+++ b/.claude/rules/credential-routing.md
@@ -0,0 +1,50 @@
 # Credential and access routing
 **Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
 for inference. Run this check **before** requesting secrets, API keys, SSH access,
 login tokens, or database passwords — in any repo, not only `ops-warden`.
 ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
 other credential need belongs to another subsystem. **Do not** message
 `ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
 ### Lookup (do this first)
 ```bash
 warden route find "<describe your need>" --json
 warden route show <catalog-id> --json
 ```
 Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
 | Agent runtime | How to orient |
 | --- | --- |
 | **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=agentic-resources` is for coordination, not secret vending |
 | **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
 | **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
 ### Quick routing table
 | I need… | Owner | ops-warden executes? |
 | --- | --- | --- |
 | SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
 | API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
 | Login / OIDC / MFA | key-cape / Keycloak | No — route only |
 | Authorization decision | flex-auth | No — route only |
 | activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
 | SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
 ### Anti-patterns (do not do these)
 - `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
 - Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
 - Pasting secrets into Git, State Hub, workplans, logs, or chat
 ### Other capabilities (reuse-surface)
 Non-credential capabilities are usually discovered through **reuse-surface** federation
 (`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
 every repo's agent instructions because it is high-frequency, high-risk, and easy to
 get wrong.
 **Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
--- a/.claude/rules/first-session.md
+++ b/.claude/rules/first-session.md
@@ -0,0 +1,38 @@
 ## First Session Protocol
 Triggered when `get_domain_summary("infotech")` shows **no workstreams**.
 The project is registered but work has not yet been structured.
 **Step 1 — Read, don't write**
 - `~/the-custodian/canon/projects/infotech/project_charter_v0.1.md` — purpose, scope
 - `~/the-custodian/canon/projects/infotech/roadmap_v0.1.md` — planned phases
 - Scan repo root: README, directory structure, existing code or docs
 **Step 2 — Survey in-progress work**
 Look for TODOs, open branches, half-finished files. Note which work is done and which was started but left incomplete.
 **Step 3 — Propose workstreams to Bernd**
 Propose 1–3 workstreams — each a coherent strand, weeks to months, anchored to a
 roadmap phase. **Wait for approval before creating.**
 **Step 4 — Create workplan file first, then DB record (ADR-001)**
 ```
 workplans/AGENTIC-WP-NNNN-<slug>.md   ← write this first
 ```
 Then register in the hub:
 ```
 create_workstream(topic_id="f39fa2a3-c491-414c-a91b-b4c5fcc6139c", title="...", owner="...", description="...")
 create_task(workstream_id="<id>", title="...", priority="high|medium|low")
 ```
 **Step 5 — Record the setup**
 ```
 add_progress_event(
    summary="First session: structured infotech into N workstreams, M tasks",
    event_type="milestone",
    topic_id="f39fa2a3-c491-414c-a91b-b4c5fcc6139c",
    detail={"workstreams": [...], "tasks_created": M}
 )
 ```
 <!-- Delete or archive this file once past first session -->
--- a/.claude/rules/repo-boundary.md
+++ b/.claude/rules/repo-boundary.md
@@ -0,0 +1,8 @@
 ## Repo boundary
 This repo owns **agentic-resources** only. It does not own:
 <!-- TODO: List what belongs in adjacent repos, e.g.:
 - SSH key management → railiance-infra/
 - State hub code     → state-hub/
 -->
--- a/.claude/rules/repo-identity.md
+++ b/.claude/rules/repo-identity.md
@@ -0,0 +1,5 @@
 **Purpose:** Iterating towards optimal agentic performance.
 **Domain:** infotech
 **Repo slug:** agentic-resources
 **Topic ID:** f39fa2a3-c491-414c-a91b-b4c5fcc6139c
--- a/.claude/rules/session-protocol.md
+++ b/.claude/rules/session-protocol.md
@@ -0,0 +1,85 @@
 ## Session Protocol
 Dev Hub (State Hub API): http://127.0.0.1:8000
 MCP server name in `~/.claude.json`: `dev-hub`
 **Step 1 — Orient**
 Read the offline-safe brief first — it works without a live hub connection:
 ```bash
 cat .custodian-brief.md
 ```
 Then call the MCP tool for richer cross-domain context when MCP tools are exposed:
 ```
 get_domain_summary("infotech")
 ```
 If MCP tools are unavailable in the current agent session, use the REST API:
 ```bash
 curl -s "http://127.0.0.1:8000/state/summary" | python3 -m json.tool
 ```
 If the hub is offline: `cd ~/state-hub && make api`
 **Step 2 — Check inbox**
 With MCP tools:
 ```
 get_messages(to_agent="agentic-resources", unread_only=True)
 ```
 Mark read with `mark_message_read(message_id)`. Reply or act on coordination
 requests before proceeding.
 Without MCP tools:
 ```bash
 curl -s "http://127.0.0.1:8000/messages/?to_agent=agentic-resources&unread_only=true" \
  | python3 -m json.tool
 curl -s -X PATCH "http://127.0.0.1:8000/messages/<id>/read" \
  -H "Content-Type: application/json" -d '{}'
 ```
 **Step 3 — Scan workplans**
 ```bash
 ls workplans/
 ```
 For each file with `status: ready`, `active`, or `blocked`, note pending
 `wait`/`todo`/`progress` tasks.
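The scan in Step 3 could be sketched in Python as below. This is a minimal illustration, not the hub's own tooling: it assumes workplan frontmatter is delimited by `---` lines at the top of the file and that task blocks use the `task` fence shown in the workplan convention. The helper names and the sample id `AGENTIC-WP-0001` are hypothetical.

```python
import re

FENCE = "`" * 3  # built at runtime so this example contains no literal fence
TASK_BLOCK = re.compile(FENCE + r"task\n(.*?)" + FENCE, re.DOTALL)
OPEN_WORKPLAN = {"ready", "active", "blocked"}
PENDING_TASK = {"wait", "todo", "progress"}


def parse_fields(block):
    """Parse simple 'key: value' lines, dropping trailing '#' comments."""
    fields = {}
    for line in block.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.split("#", 1)[0].strip().strip('"')
    return fields


def pending_tasks(text):
    """Return ids of pending tasks if the workplan status is open, else []."""
    parts = text.split("---", 2)  # assumption: '---' delimited frontmatter
    if len(parts) < 3:
        return []
    if parse_fields(parts[1]).get("status") not in OPEN_WORKPLAN:
        return []
    tasks = [parse_fields(body) for body in TASK_BLOCK.findall(parts[2])]
    return [t["id"] for t in tasks if t.get("status") in PENDING_TASK and "id" in t]


# Hypothetical workplan text used only to exercise the sketch.
sample = "\n".join([
    "---", "id: AGENTIC-WP-0001", "status: active", "---",
    FENCE + "task", "id: AGENTIC-WP-0001-T01", "status: todo", "priority: high", FENCE,
    FENCE + "task", "id: AGENTIC-WP-0001-T02", "status: done", "priority: low", FENCE,
])
print(pending_tasks(sample))
```

Reading each file under `workplans/` and passing its text to `pending_tasks` would give the per-file list that Step 4 presents.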
 **Step 4 — Present brief**
 1. **Active workstreams** for `infotech` — title, task counts, blocking decisions
 2. **Pending tasks** from `workplans/` + any `[repo:agentic-resources]` hub tasks
 3. **Goal guidance** — if `goal_guidance` in summary:
   - `needs_workplan`: surface as top action — *"Repo goal '{title}' has no workplan yet"*
   - `alignment_warnings`: flag if active work is not aligned with current goal
 4. **Suggested next action** — highest-priority open item
 5. **SBOM status** — flag if `last_sbom_at` is unset for this repo
 If no workstreams: follow First Session Protocol (`first-session.md`).
 **During work:** `record_decision()` · `add_progress_event()` · `resolve_decision()`
 > State Hub is a *read model*. Bootstrap tools (`create_workstream`, `create_task`)
 > are First Session Protocol only. Work structure belongs in repo files (ADR-001).
 **Session close:**
 With MCP tools:
 ```
 add_progress_event(summary="...", topic_id="f39fa2a3-c491-414c-a91b-b4c5fcc6139c", workstream_id="<uuid>")
 ```
 Without MCP tools:
 ```bash
 curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{"topic_id":"f39fa2a3-c491-414c-a91b-b4c5fcc6139c","workstream_id":"<uuid>","event_type":"note","summary":"what changed","author":"codex"}'
 ```
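For agents that prefer Python over `curl`, the same session-close request can be built with the standard library. This is a sketch under the assumption that the endpoint and payload are exactly those of the `curl` call above; the function name `progress_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

HUB = "http://127.0.0.1:8000"  # State Hub address from this protocol
TOPIC_ID = "f39fa2a3-c491-414c-a91b-b4c5fcc6139c"


def progress_request(summary, workstream_id, author="codex"):
    """Build (but do not send) the POST /progress/ request."""
    payload = {
        "topic_id": TOPIC_ID,
        "workstream_id": workstream_id,
        "event_type": "note",
        "summary": summary,
        "author": author,
    }
    return urllib.request.Request(
        HUB + "/progress/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = progress_request("what changed", "<uuid>")
print(req.get_method(), req.full_url)

# Sending requires a running hub:
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(resp.status)
```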
 If workplan files were modified, ensure the local copy is up to date first:
 ```bash
 git -C <repo_path> pull --ff-only
 cd ~/state-hub && make fix-consistency REPO=agentic-resources
 ```
 For repos where implementation runs on a remote machine (e.g. CoulombCore),
 use the combined target which pulls before fixing:
 ```bash
 cd ~/state-hub && make fix-consistency-remote REPO=agentic-resources
 ```
 **C-15** (DB task ahead of file) is normal in multi-machine workflows; writeback
 will sync the file to match the DB. **C-16** (repo behind remote) blocks all writes
 until you pull, which is intentional: it prevents clobbering remote progress.
--- a/.claude/rules/stack-and-commands.md
+++ b/.claude/rules/stack-and-commands.md
@@ -0,0 +1,19 @@
 ## Stack
 <!-- TODO: Fill in language, frameworks, and key dependencies -->
 - **Language:**
 - **Key deps:**
 ## Dev Commands
 ```bash
 # TODO: Fill in the standard commands for this repo
 # Install dependencies
 # Run tests
 # Lint / type check
 # Build / package (if applicable)
 ```
--- a/.claude/rules/workplan-convention.md
+++ b/.claude/rules/workplan-convention.md
@@ -0,0 +1,40 @@
 ## Workplan Convention (ADR-001)
 File location: `workplans/AGENTIC-WP-NNNN-<slug>.md`
 ID prefix: `AGENTIC-WP-`
 Work items originate as files in this repo **before** being registered in the hub.
 Canonical workplan/workstream frontmatter statuses are:
 `proposed`, `ready`, `active`, `blocked`, `backlog`, `finished`, `archived`.
 Use `proposed` for a newly drafted plan, `ready` after review against current
 repo state, and `finished` when implementation is complete. `stalled` and
 `needs_review` are derived health labels, not stored statuses.
 Closed workplans may be moved to `workplans/archived/` with a completion-date
 prefix: `YYMMDD-AGENTIC-WP-NNNN-<slug>.md`. The frontmatter id remains
 unchanged; the prefix is only for quick visual reference.
 Small opportunistic tasks discovered during another session use **Ad Hoc Tasks**:
 `workplans/ADHOC-YYYY-MM-DD.md`, workstream slug `adhoc-YYYY-MM-DD`, and task ids
 `ADHOC-YYYY-MM-DD-T01`, `T02`, etc. Use adhocs only for low-risk work completed
 directly. Promote anything requiring analysis, design, approval, dependencies, or
 multiple planned phases into a normal workplan.
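The naming rules above can be checked mechanically. The sketch below is an illustration only: it assumes `NNNN` is exactly four digits, slugs are lowercase letters, digits, and hyphens, and ad hoc task numbers have two digits. None of these constraints are stated explicitly in the convention, and the names `WORKPLAN`, `ADHOC_FILE`, `ADHOC_TASK`, and `classify` are hypothetical.

```python
import re

# Optional six-digit completion-date prefix marks an archived workplan.
WORKPLAN = re.compile(
    r"^(?:(?P<closed>\d{6})-)?AGENTIC-WP-(?P<num>\d{4})-(?P<slug>[a-z0-9-]+)\.md$"
)
ADHOC_FILE = re.compile(r"^ADHOC-\d{4}-\d{2}-\d{2}\.md$")
ADHOC_TASK = re.compile(r"^ADHOC-(?P<date>\d{4}-\d{2}-\d{2})-T(?P<n>\d{2})$")


def classify(filename):
    """Classify a file name under workplans/ by the convention it follows."""
    match = WORKPLAN.match(filename)
    if match:
        return "archived" if match.group("closed") else "workplan"
    if ADHOC_FILE.match(filename):
        return "adhoc"
    return "unknown"


for name in (
    "AGENTIC-WP-0005-detect-hardening.md",
    "260607-AGENTIC-WP-0005-detect-hardening.md",
    "ADHOC-2026-06-07.md",
    "notes.md",
):
    print(name, "->", classify(name))
```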
 Ecosystem todos from other agents arrive as `[repo:agentic-resources]` hub tasks —
 visible at session start. Pick one up by creating the workplan file, then registering
 the workstream.
 Task blocks use this shape:
 ```task
 id: AGENTIC-WP-NNNN-T01
 status: wait | todo | progress | done | cancel
 priority: high | medium | low
 state_hub_task_id: "<uuid>"         # written by fix-consistency — do not edit
 ```
 Status progression is `todo` → `progress` → `done`; use `wait` for waiting or
 blocked work and `cancel` for stopped work.
 <!-- Ralph Loop rules and HEUREKA sequence: ~/.claude/CLAUDE.md — do not duplicate here -->
--- a/.custodian-brief.md
+++ b/.custodian-brief.md
@@ -2,18 +2,12 @@
 # Custodian Brief — agentic-resources
 **Domain:** helix_forge  
-**Last synced:** 2026-06-05 22:10 UTC  
+**Last synced:** 2026-06-21 14:09 UTC  
 **State Hub:** http://127.0.0.1:8000 *(adjust if running on a remote machine)*
 ## Active Workstreams
-### Bootstrap State Hub integration
-Progress: 0/3 done  |  workstream_id: `bb9a43a3-a54f-434b-97c2-e1c7142b52f5`
-**Open tasks:**
-- · Review Generated Integration Files  `3ad7b7a9`
-- · Verify Local Developer Workflow  `db248d57`
-- · Seed First Real Workplan  `9cbb7aa5`
+*(none — repo may need first-session setup)*
 ---
 ## MCP Orientation (when available)
--- a/.gitignore
+++ b/.gitignore
@@ -177,6 +177,8 @@ cython_debug/
 # session-memory local store
 session_memory/.store/
 # generated per-flavor distribution proposals (HITL, regenerated each run)
 session_memory/proposals/
 __pycache__/
 *.pyc
 .pytest_cache/
--- a/.repo-classification.yaml
+++ b/.repo-classification.yaml
@@ -0,0 +1,18 @@
 repo_classification:
  standard: Repo Classification Standard
  version: '1.0'
  classified_at: '2026-06-22'
  classified_by: agent
  category: project
  domain: infotech
  secondary_domains: []
  capability_tags:
  - automation
  - orchestration
  business_stake:
  - technology
  - product
  - operations
  business_mechanics:
  - coordination
  - operation
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -4,7 +4,7 @@
 **Purpose:** Iterating towards optimal agentic performance.
-**Domain:** helix_forge
+**Domain:** infotech
 **Repo slug:** agentic-resources
 **Topic ID:** `f39fa2a3-c491-414c-a91b-b4c5fcc6139c`
 **Workplan prefix:** `AGENTIC-WP-`
@@ -101,6 +101,63 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
 ---
 ## Credential and access routing
 **Audience:** Codex, Claude Code, Grok, and custodian agents that call **llm-connect**
 for inference. Run this check **before** requesting secrets, API keys, SSH access,
 login tokens, or database passwords — in any repo, not only `ops-warden`.
 ops-warden **issues SSH certificates only** (`warden sign`, `cert_command`). Every
 other credential need belongs to another subsystem. **Do not** message
 `ops-warden` on State Hub expecting a secret value; the reply is a pointer, not a key.
 ### Lookup (do this first)
 ```bash
 warden route find "<describe your need>" --json
 warden route show <catalog-id> --json
 ```
 Requires the `warden` CLI from `~/ops-warden` (`uv tool install .` or `uv run warden`).
 | Agent runtime | How to orient |
 | --- | --- |
 | **Codex / Grok** (shell, HTTP State Hub) | `warden route` commands above; inbox `to_agent=agentic-resources` is for coordination, not secret vending |
 | **Claude Code** (MCP when available) | `get_domain_summary("custodian")` for workstreams; **still** use `warden route` for credential ownership |
 | **llm-connect** (inference service) | Never put secret retrieval in prompts; route custody to OpenBao/operator paths surfaced by `warden route` |
 ### Quick routing table
 | I need… | Owner | ops-warden executes? |
 | --- | --- | --- |
 | SSH cert (`adm`/`agt`/`atm`) | ops-warden | **Yes** — `warden sign` |
 | API key, DB password, provider token | OpenBao (`railiance-platform`) | No — route only |
 | Login / OIDC / MFA | key-cape / Keycloak | No — route only |
 | Authorization decision | flex-auth | No — route only |
 | activity-core → issue-core emission | activity-core + issue-core | No — `warden route show activity-core-issue-sink` |
 | SSH tunnel | ops-bridge (+ `cert_command` from warden) | No — route only |
 ### Anti-patterns (do not do these)
 - `POST /messages/` to `ops-warden` asking for `ISSUE_CORE_API_KEY`, `OPENROUTER_API_KEY`, etc.
 - Inventing `warden secret`, `warden login`, `warden bao`, `warden tunnel` — they do not exist
 - Pasting secrets into Git, State Hub, workplans, logs, or chat
 ### Other capabilities (reuse-surface)
 Non-credential capabilities are usually discovered through **reuse-surface** federation
 (`reuse-surface` registry / `capability.*` indexes). Credential routing is inlined in
 every repo's agent instructions because it is high-frequency, high-risk, and easy to
 get wrong.
 **Canon:** `~/ops-warden/wiki/CredentialRouting.md` · catalog `~/ops-warden/registry/routing/catalog.yaml`
 <!-- REPO-AGENTS-EXTENSIONS -->
 <!-- Append repo-specific agent instructions below this marker.
     The state-hub template sync preserves content after this line. -->
 ---
 ## Workplan Convention (ADR-001)
 Work items originate as files in this repo — not in the hub. The hub is a
@@ -124,7 +181,7 @@ anything needing analysis, design, approval, dependencies, or multiple phases.
 id: AGENTIC-WP-NNNN
 type: workplan
 title: "..."
-domain: helix_forge
+domain: infotech
 repo: agentic-resources
 status: proposed | ready | active | blocked | backlog | finished | archived
 owner: codex
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -0,0 +1,12 @@
 # agentic-resources — Claude Code Instructions
@SCOPE.md
@.claude/rules/repo-identity.md
@.claude/rules/session-protocol.md
@.claude/rules/first-session.md
@.claude/rules/workplan-convention.md
@.claude/rules/stack-and-commands.md
@.claude/rules/architecture.md
@.claude/rules/repo-boundary.md
@.claude/rules/credential-routing.md
@.claude/rules/agents.md
--- a/docs/ASSESSMENT-infra-friction.md
+++ b/docs/ASSESSMENT-infra-friction.md
@@ -0,0 +1,144 @@
 # Infrastructure Friction Assessment
 *Generated 2026-06-07 from captured coding-session data (Helix Forge session
 memory), after the Detect-hardening pass ([AGENTIC-WP-0005]). First data-driven
 assessment of where our agentic coding sessions spend effort on plumbing rather
 than work.*
 ## Method & data quality
 - **Corpus:** 72 sessions captured across Claude + Grok. A session-quality filter
  ([detect/quality.py]) drops health-checks, smoke-tests, and interrupted runs
  (mostly `llm-connect` *"Say hello in one word"*). **27 are real coding sessions.**
 - **Caveat:** the 41 % that were filtered out had been mislabeled `abandoned` by
  the outcome heuristic and produced a *false-positive* "cross-flavor abandoned"
  pattern in the first catalog — now purged. Treat any pre-hardening finding with
  suspicion.
 - **Key framing:** all 27 real sessions ended in `success`. So the friction here
  is **cost/efficiency, not failure** — sessions get there, but pay an avoidable
  tax to do it.
 ## The headline number
 Across the 27 real sessions, tool-call activity breaks down as:
 | Bucket | Share |
 |--------|------:|
 | shell (Bash / run_terminal) | 38.2 % |
 | edit | 30.2 % |
 | read | 12.9 % |
 | **State Hub MCP** | **10.3 %** |
 | **task-management plumbing** | **5.8 %** |
 | **schema-loading (`ToolSearch`)** | **1.5 %** |
 | other | 1.1 % |
 **~17.6 % of all tool calls in real coding sessions are coordination plumbing
 (hub + task + schema-loading), not touching the repo.** Per-session infra-overhead
 share: median **11.7 %**, p90 **26.1 %**, max **43.3 %**, so the overhead is concentrated in a minority of sessions.
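The 17.6 % figure is the sum of the three bold rows in the table. A short sketch makes the arithmetic explicit and shows one way a per-session share could be computed. The bucket keys and the `overhead_share` helper are hypothetical names, and the per-session example counts are invented for illustration, not taken from the corpus.

```python
# Aggregate shares copied from the table above (percent of all tool calls).
SHARES = {
    "shell": 38.2,
    "edit": 30.2,
    "read": 12.9,
    "state_hub_mcp": 10.3,
    "task_mgmt": 5.8,
    "schema_loading": 1.5,
    "other": 1.1,
}
INFRA_BUCKETS = ("state_hub_mcp", "task_mgmt", "schema_loading")

infra_share = round(sum(SHARES[b] for b in INFRA_BUCKETS), 1)
total_share = round(sum(SHARES.values()), 1)


def overhead_share(bucket_counts):
    """Fraction of one session's tool calls that fall in the infra buckets."""
    total = sum(bucket_counts.values())
    if total == 0:
        return 0.0
    return sum(bucket_counts.get(b, 0) for b in INFRA_BUCKETS) / total


# Invented per-session call counts, for illustration only.
example = {"shell": 40, "edit": 30, "read": 10,
           "state_hub_mcp": 15, "task_mgmt": 4, "schema_loading": 1}

print(infra_share)                         # 17.6
print(total_share)                         # 100.0
print(round(overhead_share(example), 2))   # 0.2
```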
 ## Ranked friction
 ### 1. State Hub call volume — *highest cost, addressable*
 State Hub MCP is 10.3 % of all tool calls and dominates the worst sessions:
 | Repo (one session) | total calls | State Hub calls | overhead share |
 |--------------------|------:|------:|------:|
 | vergabe-teilnahme | 570 | **231** | 43 % |
 | activity-core | 488 | 98 | 23 % |
 | flex-auth | 236 | 35 (+27 task) | 29 % |
 | net-kingdom | 129 | 25 | 22 % |
 Root cause: many **fine-grained** calls — per-task status updates, per-event
 progress writes, repeated `get_domain_summary`. 231 hub calls in a single session
 is coordination overhead, not work.
 ### 2. Schema-loading thrash (`ToolSearch`) — *low cost, near-zero-effort fix*
 **106 `ToolSearch` calls across 22 of 27 sessions (81 %).** The State Hub MCP
 tools are *deferred*, so nearly every session re-discovers and re-loads the same
 tool schemas before it can call them. This is pure overhead with no work value —
 and it is **exactly the CLI/MCP-interface friction hypothesized.**
 ### 3. Task-management plumbing — 5.8 %
 `TaskUpdate` / `TaskCreate` / `todo_write` / `update_task_status`. Overlaps with
 (1); much of it is redundant status churn within a session.
 ### 4. Tool thrash — *session-shape, watch only*
 11 sessions hammer a single tool 80–230× (usually Bash or Edit). Less an infra
 problem than a sign of missing higher-level tooling; low priority.
 ### 5. Budget overrun — 3 sessions
 Token cost well above peers. Secondary; revisit once (1)–(2) are addressed.
 ## Recommendations
 **The CLI/MCP-interface hypothesis is validated as a top-2 friction, not a minor
 issue.** Two high-ROI moves (A, B), plus a measurement step (C):
 - **A. A State Hub skill (highest ROI).** A skill (or a pre-loaded tool manifest)
  that (i) **front-loads the common hub tool schemas** so agents stop
  `ToolSearch`-ing for them — eliminates finding #2 almost entirely (81 % of
  sessions) — and (ii) **teaches batched writes** (sync N task statuses in one
  call, fewer progress events) to attack finding #1. Low effort, broad reach.
 - **B. Coarser hub operations.** Add bulk endpoints / a single "sync workplan
  statuses" op so a session doesn't make 200+ individual hub calls. This is the
  structural fix behind the skill's guidance.
 - **C. Measure the effect (Phase 4).** After A/B land, compare infra-overhead
  share on subsequent sessions against this baseline (median 11.7 %, p90 26.1 %).
  This is precisely what the Measure phase is for — the loop closes here.
 ## Content-level root causes (error-body mining)
 *Added 2026-06-07 from [AGENTIC-WP-0006] — `build_digest` now mines normalized
 error fingerprints into the durable digest, and `sig_recurring_error` clusters
 them. This is the "why" the tool-mix view above could not see.*
 **26 of 27 real sessions hit at least one error.** Top recurring error
 fingerprints across the corpus (by # sessions affected):
 | # sessions | occ | flavors | top sample |
 |-----------:|----:|---------|------------|
 | **12** | 32 | claude | `<tool_use_error>File has not been read yet. Read it first before writing to it.` |
 | **6** | 13 | claude | `<tool_use_error>File has been modified since read …` |
 | **4** | 9 | **claude + grok** | `make: *** [Makefile:227: fix-consistency] Error 1` |
 | 3 | 21 | claude | `MCP error -32602: Invalid request parameters` |
 | 3 | 6 | claude | `Error calling tool 'update_task_status': 'title'` |
 | 2 | 6 | claude | `make: *** [Makefile:21: test] Error 1` |
 Reading:
 - **#1 — Edit/Write-before-Read (12/27 sessions, 8 repos).** The single most
  common error is agents trying to edit a file they haven't read into context.
  This is a *workflow* friction, highly addressable: a Read-before-Edit reflex in
  the agent instructions / a skill, or a harness affordance. (Observed live: the
  author hit this exact error twice while writing this workplan.)
 - **#2 — stale-read conflicts (6 sessions):** "File has been modified since read"
  — same family, a re-read-before-edit discipline fixes both.
 - **#3 — cross-flavor `make fix-consistency` failures (claude + grok, 3 repos):**
  the consistency tooling itself fails across flavors — a shared infra issue worth
  a look on the state-hub side (cf. [STATE-WP-0058]).
 - **State Hub MCP instability** (`-32602`, `update_task_status 'title'`) shows up
  in 3 sessions each — corroborates the plumbing-overhead story and the live MCP
  flakiness seen during this work (REST fallback used).
 **Fingerprint noise — mostly handled.** `_is_failed` now excludes successful hub
 JSON responses (top-level no-error payloads) and file-read snapshots (numbered
 `cat -n` source lines), which cut distinct fingerprints **444 → 269 (~40 %)**
 without touching the top entries. Residual low-value items remain in the long tail
 (bare structural lines like `{`, linter "N errors" summaries); the *top*
 fingerprints are real. Note several entries (`MCP error -32602`,
 `update_task_status 'title'`) reflect the State Hub MCP instability hit live during
 this work — genuine, if self-referential, friction.
 ## What this assessment still can't see
 - ~~**Why** a session was expensive at the content level.~~ **Now addressed**
  (error-body mining, above), modulo the fingerprint-noise caveat.
 - Repeated *failed approaches* (as opposed to surfaced errors) — e.g. an agent
  silently retrying a wrong strategy without an error — are still invisible.
 - Grok/Codex are thin in the corpus (4 Grok, 0 Codex sessions), so cross-flavor
  friction claims are Claude-weighted for now.
 [AGENTIC-WP-0005]: ../workplans/AGENTIC-WP-0005-detect-hardening.md
 [AGENTIC-WP-0006]: ../workplans/AGENTIC-WP-0006-error-body-mining.md
 [STATE-WP-0058]: handed off to the state-hub repo worker
 [detect/quality.py]: ../session_memory/detect/quality.py
--- a/docs/DESIGN-session-memory.md
+++ b/docs/DESIGN-session-memory.md
@@ -370,8 +370,89 @@ hub indexes).
 ---
-*Next step: [AGENTIC-WP-0002] implements Phase 0 — the schema, the Claude
-collector, the Tier1/Tier2 store, and the budget-based eviction sweep.*
+## 11. Project metrics correlation (kaizen-agentic)
+
 Helix Forge owns **fleet-level** session capture and digests (this repo). The
 **kaizen-agentic** framework owns **project-scoped** agent execution metrics
 (ADR-004: `.kaizen/metrics/<agent>/executions.jsonl`). The two layers correlate
 by optional `helix_session_uid` on project records — link-by-reference, no
 duplicate ingestion in either repo.
 | Layer | Owner | Storage |
 |-------|-------|---------|
 | Fleet | agentic-resources (Helix Forge) | digest store (`digests` table) |
 | Project | kaizen-agentic | `.kaizen/metrics/<agent>/executions.jsonl` |
 **Cross-repo contract:** [Helix Forge Correlation Contract](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/integrations/helix-forge-correlation.md)
 (kaizen-agentic). Field mapping from `Session.session_uid` → `helix_session_uid`,
 `digest.cost` → `tokens`, `tool_histogram` MCP share → `infra_overhead_share`.
 **Read path:** `kaizen-agentic metrics correlate <uid>` looks up a digest via
 `HELIX_STORE_DB` (this repo's session store). No write path from kaizen-agentic
 into Helix Forge.
 **Related kaizen-agentic docs:** [ADR-004 project metrics convention](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/adr/ADR-004-project-metrics-convention.md),
 [wiki/EcosystemIntegration.md](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/wiki/EcosystemIntegration.md).
 ### 11.1 Session-close env export (dual-layer agents)
 Agents that run **both** Helix Forge capture and kaizen `metrics record` should
 export the following **after** the ingest sweep has written the session digest
 (`python -m session_memory.ingest` or an equivalent Stop/SessionEnd hook). Names
 match kaizen-agentic ADR-004 — do not invent parallel aliases.
 | Variable | Source in Helix Forge | Purpose |
 |----------|----------------------|---------|
 | `HELIX_SESSION_UID` | `Session.session_uid` | Primary correlation key → `helix_session_uid` |
 | `HELIX_REPO` | `digest.repo` | Project/repo scoping |
 | `HELIX_FLAVOR` | `digest.flavor` | Agent runtime (`claude` / `codex` / `grok`) |
 | `HELIX_TOKENS` | `digest.cost.input_tokens + digest.cost.output_tokens` | Token rollup → `tokens` |
 | `HELIX_INFRA_OVERHEAD_SHARE` | infra bucket share over `tool_histogram` (see `measure.metrics.session_metrics`) | MCP/plumbing overhead → `infra_overhead_share` |
 Example (after digest exists):
 ```bash
 export HELIX_SESSION_UID="claude:abc-123"
 export HELIX_REPO="agentic-resources"
 export HELIX_FLAVOR="claude"
 export HELIX_TOKENS=125000
 export HELIX_INFRA_OVERHEAD_SHARE=0.117
 # optional — lets kaizen correlate without guessing the store location:
 export HELIX_STORE_DB="$(pwd)/session_memory/.store/mem.db"
 kaizen-agentic metrics record   # merges HELIX_* when present
 ```
 ### 11.2 Digest store location and read API
 - **`HELIX_STORE_DB`** — absolute path to the SQLite file holding Tier 2 digests.
  Defaults to `config.toml` `[store].db_path` (`session_memory/.store/mem.db` relative
  to the repo root). Export as an absolute path when setting the variable on session
  close so `metrics correlate` works across hosts and working directories.
 - **Thin CLI** — `python -m session_memory.digest_lookup <session_uid> [--json]`
  prints one digest without running ingest. Exit `0` on hit, `1` when missing.
 - **Programmatic** — `Store.get_digest(session_uid)` returns the JSON blob written
  by `build_digest` / `analyze`.
 **Stable digest JSON shape** (fields consumers may rely on):
 | Field | Type | Notes |
 |-------|------|-------|
 | `session_uid` | string | Normalized uid (`<flavor>:<native-id>`) |
 | `flavor`, `repo`, `domain` | string | Session attribution |
 | `model` | string | Model id when known |
 | `started_at`, `ended_at` | string | ISO timestamps |
 | `outcome` | string | `success` / `fail` / `abandoned` / `unknown` |
 | `cost` | object | `input_tokens`, `output_tokens`, `cache_tokens`, `wall_clock_s`, `turns`, `retries` |
 | `tool_histogram` | object | Tool name → call count |
 | `event_count`, `kind_counts`, `markers` | object/int | Compact activity summary |
 | `first_prompt`, `last_assistant` | string | Short text snippets |
 | `error_snippets` | array | `{fingerprint, sample, count, tool}` entries |
 | `schema_version` | int | Digest schema version |
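As an illustration of how the stable digest shape feeds the session-close export in §11.1, the sketch below derives the `HELIX_*` values from a digest dict. It relies only on fields listed in the table. The function name `helix_env` and the sample digest are hypothetical, and the infra-overhead share is passed in as a number because the signature of `measure.metrics.session_metrics` is not shown here.

```python
def helix_env(digest, infra_overhead_share):
    """Map a digest dict to the HELIX_* environment variables of section 11.1."""
    cost = digest["cost"]
    return {
        "HELIX_SESSION_UID": digest["session_uid"],
        "HELIX_REPO": digest["repo"],
        "HELIX_FLAVOR": digest["flavor"],
        # Token rollup: input plus output tokens, as specified in 11.1.
        "HELIX_TOKENS": str(cost["input_tokens"] + cost["output_tokens"]),
        "HELIX_INFRA_OVERHEAD_SHARE": str(infra_overhead_share),
    }


# Hypothetical digest containing only the fields this sketch reads.
sample_digest = {
    "session_uid": "claude:abc-123",
    "repo": "agentic-resources",
    "flavor": "claude",
    "cost": {"input_tokens": 100000, "output_tokens": 25000},
}

env = helix_env(sample_digest, 0.117)
for key, value in env.items():
    print(f'export {key}="{value}"')
```

Keeping the mapping in one function means the variable names stay aligned with ADR-004 in a single place rather than being retyped in each hook.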
 ---
 *Implemented:* Phases 0–4, weekly retro ([AGENTIC-WP-0002]–[AGENTIC-WP-0010]);
 kaizen correlation follow-up ([AGENTIC-WP-0011]).
 ## Sources
--- a/docs/PRD-helix-forge.md
+++ b/docs/PRD-helix-forge.md
@@ -5,7 +5,7 @@
 **Status:** Draft v0.1
 **Author:** Claude (drafted with Bernd Worsch)
 **Created:** 2026-06-06
-**Updated:** 2026-06-06
+**Updated:** 2026-06-19
 ---
@@ -223,6 +223,32 @@ record:
 - The hub remains a **read model**; Helix Forge writes its durable artifacts as files
  and lets the hub index them.
 ### 9.1 Downstream: kaizen-agentic project metrics correlation
 Helix Forge is a **fleet-level** producer of normalized session digests. The
 **kaizen-agentic** framework is a **project-scoped** consumer of optional
 correlation fields on its execution metrics (ADR-004). The two layers link
 **by reference** — kaizen-agentic does not re-implement JSONL ingestion or write
 into the Helix Forge store.
 | Layer | Owner | What it stores |
 |-------|-------|----------------|
 | Fleet | agentic-resources (`session_memory`) | Per-session digests in the local SQLite store |
 | Project | kaizen-agentic | `.kaizen/metrics/<agent>/executions.jsonl` |
 **Canonical spec in this repo:** [DESIGN-session-memory.md §11](DESIGN-session-memory.md#11-project-metrics-correlation-kaizen-agentic)
 (session-close env export, digest read path, stable JSON shape).
 **Authoritative cross-repo contract (kaizen-agentic):**
 [Helix Forge Correlation Contract](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/integrations/helix-forge-correlation.md).
 Field mapping: `Session.session_uid` → `helix_session_uid`; digest token totals →
 `tokens`; MCP/tool overhead share → `infra_overhead_share`.
 **Read path for consumers:** `HELIX_STORE_DB` points at the digest SQLite file
 (default `session_memory/.store/mem.db`); `python -m session_memory.digest_lookup
 <uid> --json` or `kaizen-agentic metrics correlate <uid>` performs a read-only
 lookup. No ingestion code belongs in kaizen-agentic.
 ## 10. Success Metrics
 | Metric | Meaning | Target (directional, v1) |
@@ -255,12 +281,26 @@ record:
  three flavors?
 - **OQ3** Where does detection logic run — local batch jobs, hub-side, or a dedicated
  service? What volume do we actually expect?
- **OQ4** Pattern format: how do we keep one agnostic representation while giving each
-  distributor enough to render high-quality native artifacts?
- **OQ5** What's the minimum trustworthy evidence bar before a pattern is allowed to be
-  distributed to live agent environments?
- **OQ6** How do we prevent pattern bloat — too many low-value instructions degrading
-  agent context budgets (cf. the token-budget policy in global instructions)?
+- ~~**OQ4** Pattern format: how do we keep one agnostic representation while giving each
+  distributor enough to render high-quality native artifacts?~~ **Resolved (Phase 2,
+  AGENTIC-WP-0004):** the `SolutionPattern` core is flavor-agnostic (problem,
+  resolutions, scope, provenance) and carries per-flavor knowledge only in a separate
+  `rendering_hints` sub-structure keyed by flavor — distributors read the hints, the
+  core stays neutral. Catalogued as versioned files-first artifacts (FR-U3).
 - ~~**OQ5** What's the minimum trustworthy evidence bar before a pattern is allowed to be
  distributed to live agent environments?~~ **Resolved (Phase 2):** a two-tier
  evidence bar (`[curate.gate]`). A *promote* floor (frequency / distinct sessions /
  cost-impact) admits a candidate as `provisional`; a stricter *distribution* floor
  (higher frequency, optional cross-flavor requirement, cost-impact) is required to
  mark a pattern `approved` + `distribution_ready`. Defaults are conservative and
  config-tunable.
 - ~~**OQ6** How do we prevent pattern bloat — too many low-value instructions degrading
  agent context budgets (cf. the token-budget policy in global instructions)?~~
  **Resolved (Phase 2):** a bloat guard flags duplicate (same id) and near-duplicate
  (same signal-type+locus) candidates at review time, and the catalog dedups
  structurally on the source-candidate key so re-promotion never multiplies entries.
  Thin candidates stay `provisional` (not distributed) rather than padding live
  context.
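The bloat guard described for OQ6 can be sketched roughly as follows. The `Candidate` fields and the flag strings are illustrative assumptions; the real check lives in `curate/gating.py` and may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical minimal shape: only the fields the guard compares.
    id: str
    signal_type: str
    locus: str


def bloat_flags(candidate: Candidate, catalog: list[Candidate]) -> list[str]:
    """Flag exact duplicates (same id) and near-duplicates (same signal-type+locus)."""
    flags = []
    for existing in catalog:
        if existing.id == candidate.id:
            flags.append(f"duplicate:{existing.id}")
        elif (existing.signal_type, existing.locus) == (candidate.signal_type, candidate.locus):
            flags.append(f"near-duplicate:{existing.id}")
    return flags
```

The flags are advisory at review time; structural dedup on the source-candidate key is a separate mechanism in the catalog.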
 ## 13. Risks
--- a/registry/README.md
+++ b/registry/README.md
@@ -0,0 +1,12 @@
 # Capability Registry
 Markdown-first capability index for federation and reuse planning.
 ## Authoring
 1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
 2. Add the row to `indexes/capabilities.yaml`.
 3. Run `reuse-surface validate` from a checkout with the CLI installed.
 4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.
 Federation contract: reuse-surface `docs/RegistryFederation.md`.
--- a/registry/capabilities/.gitkeep
+++ b/registry/capabilities/.gitkeep
--- a/registry/indexes/capabilities.yaml
+++ b/registry/indexes/capabilities.yaml
@@ -0,0 +1,4 @@
 version: 1
 updated: '2026-06-16'
 domain: helix_forge
 capabilities: []
--- a/session_memory/README.md
+++ b/session_memory/README.md
@@ -13,14 +13,40 @@ time window.
 ```
 session_memory/
-  adapters/claude.py   # Tier0 -> Tier1 normalizer (Codex/Grok land in Phase 1)
+  adapters/common.py   # shared Normalized bundle + helpers
  adapters/claude.py   # Tier0 -> Tier1 normalizers, one per flavor
  adapters/codex.py    #   (rollout {timestamp,type,payload}, flat call_id join)
  adapters/grok.py     #   (per-session dir: chat_history + events + updates)
  core/schema.py       # Session / SessionEvent / Cost
-  core/store.py        # SQLite rows + blob-dir bodies (Tier1) + digests (Tier2)
+  core/store.py        # SQLite rows + blob-dir bodies (Tier1) + digests/patterns (Tier2)
  core/cursor.py       # incremental ingest cursors
  core/digest.py       # Tier1 -> Tier2 promotion + outcome heuristic
  core/retention.py    # budget-based eviction sweep
  ingest.py            # one sweep: discover -> normalize -> store -> digest -> evict
-  config.toml          # store paths, retention caps, sources, repo->domain map
+  detect/signals.py    # signal extractors over digests
  detect/cluster.py    # cluster signals -> candidate patterns + cross-flavor flag
  detect/__main__.py   # python -m session_memory.detect (ranked report)
  curate/schema.py     # SolutionPattern artifact + per-flavor rendering hints
  curate/catalog.py    # versioned, files-first Pattern Catalog (dedup on id)
  curate/gating.py     # promotion evidence bar + bloat guard
  curate/review.py     # discuss/approve/reject -> promote workflow
  curate/decisions.py  # hub decision audit trail (graceful local-queue fallback)
  curate/__main__.py   # python -m session_memory.curate (interactive / --auto-approve)
  catalog/             # the committed Pattern Catalog (source of truth)
  distribute/base.py   # Artifact + Distributor protocol + idempotent snippet markers
  distribute/claude.py # CLAUDE.md (or skill) renderer    } per-flavor edges
  distribute/codex.py  # AGENTS.md renderer                } (agnostic body,
  distribute/grok.py   # native instruction renderer       }  different targets)
  distribute/proposals.py  # scoping + proposed-not-applied output + active registry
  distribute/__main__.py   # python -m session_memory.distribute
  measure/metrics.py   # fleet metrics + persisted baseline snapshots
  measure/effect.py    # before/after per-pattern effectiveness
  measure/__main__.py  # python -m session_memory.measure
  retro/build.py       # windowed top-3-per-repo suggestions
  retro/publish.py     # hub coding_retro read model + local report
  retro/__main__.py    # python -m session_memory.retro
  digest_lookup.py     # python -m session_memory.digest_lookup (read one digest, no ingest)
  config.toml          # store paths, retention caps, sources, repo->domain map, curate gate
 ```
 The local store lives under `session_memory/.store/` (gitignored).
@@ -51,6 +77,147 @@ the sweep *runs*. Trigger it with the repo scheduler, e.g. daily:
 or a cron entry / `/loop` on a timer. Push-capture (agent Stop/SessionEnd hooks)
 can also enqueue a sweep; see design §7.
 ## Detect candidate patterns
 After ingesting, mine the digests for recurring problem/success patterns:
 ```bash
 python -m session_memory.detect                 # ranked report, cross-flavor first
 python -m session_memory.detect --json          # machine-readable candidates
 python -m session_memory.detect --min-frequency 3
 ```
 Candidates are persisted to a Tier 2 `patterns` table and are the input to the
 Curate phase (Phase 2). Patterns whose evidence spans more than one agent flavor
 are flagged `[CROSS-FLAVOR]` — the highest-value reuse targets.
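A minimal sketch of how clustering and the cross-flavor flag might work, assuming each signal is a dict with `signal_type`, `locus`, and `flavor` keys. Those names and the function `cluster_signals` are hypothetical; `detect/cluster.py` is the actual implementation.

```python
from collections import defaultdict
from typing import Any


def cluster_signals(signals: list[dict[str, Any]], min_frequency: int = 2) -> list[dict[str, Any]]:
    """Group signals by (signal_type, locus) and rank cross-flavor clusters first."""
    groups: dict[tuple[str, str], list[dict[str, Any]]] = defaultdict(list)
    for sig in signals:
        groups[(sig["signal_type"], sig["locus"])].append(sig)

    candidates = []
    for key, members in groups.items():
        if len(members) < min_frequency:
            continue  # below the frequency floor, not reported
        flavors = sorted({m["flavor"] for m in members})
        candidates.append({
            "key": key,
            "frequency": len(members),
            "flavors": flavors,
            "cross_flavor": len(flavors) > 1,
        })
    # Cross-flavor clusters sort first, then by descending frequency.
    candidates.sort(key=lambda c: (not c["cross_flavor"], -c["frequency"]))
    return candidates
```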
 ## Curate candidates into the Pattern Catalog
 Review detect candidates into versioned **Solution Patterns** held in the
 files-first catalog (`session_memory/catalog/`). The flow is **detect → curate →
 (Phase 3) distribute**; `curate` refreshes candidates by running detect first.
 ```bash
 python -m session_memory.curate                 # interactive review (a/r/d per candidate)
 python -m session_memory.curate --auto-approve  # batch: promote all that clear the evidence bar
 python -m session_memory.curate --json          # machine-readable result
 ```
 - **Promotion** writes a `SolutionPattern` file (id = source candidate key, so
  re-promoting the same candidate dedups; content changes bump the semver and
  archive the prior version to `<id>.history.jsonl`).
 - The **evidence bar** (`[curate.gate]`) sets two floors: a promote floor and a
  stricter *distribution* floor. A thin-but-real candidate lands `provisional`;
  one clearing the distribution floor lands `approved` + `distribution_ready`.
 - A **bloat guard** flags duplicate / near-duplicate candidates so the catalog
  stays lean.
 - Re-review is **idempotent** — a remembered decision is skipped unless the
  candidate's evidence changed; a prior reject is not re-surfaced.
 - Each final promote/reject is recorded as a **hub decision**; if the hub is
  offline the decision is queued to `[curate].decision_queue` for later sync
  (the same after-the-fact pattern used in Phase 1).
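The promotion rule (dedup on id, version bump on content change, prior version archived) could look roughly like this. The on-disk JSON layout, the `promote` helper, and the choice of a minor-version bump are assumptions for illustration; only the `<id>.history.jsonl` archive name comes from the text above.

```python
import json
from pathlib import Path
from typing import Any


def promote(catalog_dir: Path, pattern_id: str, content: dict[str, Any]) -> str:
    """Write or update one pattern file and return its version string."""
    catalog_dir.mkdir(parents=True, exist_ok=True)
    path = catalog_dir / f"{pattern_id}.json"  # assumed file naming
    version = "1.0.0"
    if path.exists():
        current = json.loads(path.read_text(encoding="utf-8"))
        if current["content"] == content:
            return current["version"]  # same candidate, same content: nothing to add
        # Content changed: archive the prior version before overwriting.
        history = catalog_dir / f"{pattern_id}.history.jsonl"
        with history.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(current) + "\n")
        major, minor, _patch = (int(p) for p in current["version"].split("."))
        version = f"{major}.{minor + 1}.0"  # assumed bump policy
    record = {"id": pattern_id, "version": version, "content": content}
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return version
```

Keying the file name on the source candidate id is what makes re-promotion a no-op rather than a second catalog entry.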
 ### Curate knobs (`[curate]` / `[curate.gate]` in config.toml)
 | Key | Meaning |
 |-----|---------|
 | `catalog_dir` | committed Pattern Catalog dir (source of truth) |
 | `review_log` / `decision_queue` | remembered decisions + pending hub decisions (gitignored) |
 | `min_frequency` / `min_sessions` / `min_cost_impact` | floor to promote at all |
 | `dist_require_cross_flavor` | require cross-flavor evidence to be distribution-eligible |
 | `dist_min_frequency` / `dist_min_cost_impact` | stricter floor for `distribution_ready` |
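As a sketch of how these knobs might combine into the two-tier evidence bar: the key names mirror the table, but the default values, the `Gate` class, the `evaluate` function, and the `below_bar` status are placeholders, not the shipped configuration.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    # Placeholder defaults for illustration only; real values live in config.toml.
    min_frequency: int = 2
    min_sessions: int = 2
    min_cost_impact: float = 0.0
    dist_require_cross_flavor: bool = False
    dist_min_frequency: int = 5
    dist_min_cost_impact: float = 0.0


def evaluate(gate: Gate, *, frequency: int, sessions: int,
             cost_impact: float, cross_flavor: bool) -> tuple[str, bool]:
    """Return (status, distribution_ready) for one candidate."""
    clears_promote = (
        frequency >= gate.min_frequency
        and sessions >= gate.min_sessions
        and cost_impact >= gate.min_cost_impact
    )
    if not clears_promote:
        return "below_bar", False  # hypothetical label for "not promoted"
    clears_distribution = (
        frequency >= gate.dist_min_frequency
        and cost_impact >= gate.dist_min_cost_impact
        and (cross_flavor or not gate.dist_require_cross_flavor)
    )
    if clears_distribution:
        return "approved", True
    return "provisional", False
```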
 ## Distribute patterns as per-flavor proposals
 Render approved catalog patterns into per-flavor artifacts — **proposed, never
 auto-applied** (HITL). Completes the loop: **detect → curate → distribute**.
 ```bash
 python -m session_memory.distribute                 # proposals for all repos/flavors
 python -m session_memory.distribute --repo state-hub --flavor claude
 python -m session_memory.distribute --json
 ```
 - Only `approved` + `distribution_ready` patterns are rendered; each pattern's
  `Scope` (repos/domains/flavors) decides where it lands (FR-X2).
 - Each flavor renders the **same agnostic body** to its own target (Claude →
  `CLAUDE.md`/skill, Codex → `AGENTS.md`, Grok → native) via `rendering_hints`
  (FR-A3); blocks carry stable `BEGIN/END` markers so re-running updates in place.
 - Output goes to `session_memory/proposals/<repo>/<target>` (gitignored,
  regenerated) — a reviewable diff a human applies (FR-X3). The committed
  `distribute/active_patterns.json` records which pattern+version is proposed in
  which `(repo, flavor)` (FR-X4).
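The "re-running updates in place" behaviour of the `BEGIN/END` markers can be sketched as below. The exact marker syntax (HTML comments carrying a `pattern:<id>` tag) and the `upsert_block` helper are assumptions; `distribute/base.py` defines the real markers.

```python
import re


def upsert_block(text: str, pattern_id: str, body: str) -> str:
    """Insert or replace the marked block for one pattern, leaving other text alone."""
    begin = f"<!-- BEGIN pattern:{pattern_id} -->"  # assumed marker format
    end = f"<!-- END pattern:{pattern_id} -->"
    block = f"{begin}\n{body.rstrip()}\n{end}"
    existing = re.compile(re.escape(begin) + r".*?" + re.escape(end), re.DOTALL)
    if existing.search(text):
        # A lambda replacement avoids backslash interpretation in the body.
        return existing.sub(lambda _m: block, text, count=1)
    separator = "" if not text or text.endswith("\n") else "\n"
    return f"{text}{separator}{block}\n"
```

Calling it twice with the same body returns identical text, so a regenerated proposal produces an empty diff when nothing changed.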
 ## Measure effectiveness (closing the loop)
 Track whether the fleet is getting cheaper / more reliable, and whether a
 distributed pattern actually helped.
 ```bash
 python -m session_memory.measure --label "baseline"      # snapshot + trend
 python -m session_memory.measure --since 2026-06-07      # before/after a change
 python -m session_memory.measure --no-save --json
 ```
 - A **snapshot** (infra-overhead share, error rate, schema-thrash, token
  percentiles, success rate) is appended to `measure/baselines.jsonl` to build a
  trend (FR-M3).
 - `--since DATE` splits sessions before/after a change and diffs the metrics, with
  an `improved` verdict per metric (FR-M1/FR-M2) — so ineffective patterns can be
  retired. Recorded pre-fix baseline (2026-06-07): 27 sessions, infra-overhead
  median 11.7 %, error rate 0.96, schema-thrash 8 sessions.
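A rough sketch of the `--since` split, assuming each session is a dict with an ISO `started_at` timestamp and numeric metrics where lower is better. The key names, the use of the median, and the `before_after` helper are illustrative assumptions rather than the logic in `measure/effect.py`.

```python
from datetime import date
from statistics import median
from typing import Any, Optional


def before_after(sessions: list[dict[str, Any]], since: str,
                 metrics: tuple[str, ...] = ("infra_overhead_share", "error_rate"),
                 ) -> dict[str, Optional[dict[str, Any]]]:
    """Split sessions at `since` (YYYY-MM-DD) and compare medians per metric."""
    cutoff = date.fromisoformat(since)

    def day(session: dict[str, Any]) -> date:
        return date.fromisoformat(session["started_at"][:10])

    before = [s for s in sessions if day(s) < cutoff]
    after = [s for s in sessions if day(s) >= cutoff]
    report: dict[str, Optional[dict[str, Any]]] = {}
    for metric in metrics:
        if not before or not after:
            report[metric] = None  # no verdict without data on both sides
            continue
        b = median(s[metric] for s in before)
        a = median(s[metric] for s in after)
        report[metric] = {"before": b, "after": a, "improved": a < b}
    return report
```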
 ## Weekly retro (the input to the scheduled retrospection)
 A windowed roll-up: detect + measure over the last N days → the **top-3
 improvement suggestions per repo** (cross-flavor first; recommendations pulled
 from the Pattern Catalog) → published to the hub as the `coding_retro` read model.
 ```bash
 python -m session_memory.retro                      # last 7 days, local report
 python -m session_memory.retro --window-days 30 --json
 python -m session_memory.retro --publish            # also post coding_retro to the hub
 ```
 Writes `retro/last_retro.{json,md}` and (with `--publish`) posts an
 `event_type=coding_retro` progress event. This is consumed by activity-core's
 **Weekly Coding Retrospection** schedule (ACTIVITY-WP-0008, Saturday 19:00 Berlin),
 which emits one improvement task per relevant repo. Hub publish degrades
 gracefully when the hub is unreachable.
 ## Correlation with kaizen-agentic
 Helix Forge owns **fleet-level** session digests; **kaizen-agentic** owns
 **project-scoped** execution metrics (ADR-004). The two layers correlate by
 optional `helix_session_uid` on project records — **link-by-reference only**;
 kaizen-agentic does not ingest JSONL into this store.
 | Layer | Storage |
 |-------|---------|
 | Fleet (here) | `session_memory/.store/mem.db` → `digests` table |
 | Project (kaizen) | `.kaizen/metrics/<agent>/executions.jsonl` |
 - **Spec:** [DESIGN-session-memory.md §11](../docs/DESIGN-session-memory.md#11-project-metrics-correlation-kaizen-agentic)
 - **Contract (kaizen-agentic):** [Helix Forge Correlation Contract](https://gitea.coulomb.social/coulomb/kaizen-agentic/src/branch/main/docs/integrations/helix-forge-correlation.md)
 ### Session-close env export
 After ingest has written the digest, agents using both layers export `HELIX_*`
 vars for `kaizen-agentic metrics record` to merge (names match ADR-004):
 `HELIX_SESSION_UID`, `HELIX_REPO`, `HELIX_FLAVOR`, `HELIX_TOKENS`,
 `HELIX_INFRA_OVERHEAD_SHARE`, and optionally `HELIX_STORE_DB` (absolute path to
 `mem.db`). See DESIGN §11.1 for field sources.
 ### Read one digest (for `metrics correlate`)
 ```bash
 python -m session_memory.digest_lookup claude:abc-123 --json
 HELIX_STORE_DB=/abs/path/to/mem.db python -m session_memory.digest_lookup <uid>
 ```
 Defaults to `[store].db_path` in `config.toml`. Read-only — does not run ingest.
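For consumers that prefer a library call over the CLI, a read-only lookup might look like the sketch below. It assumes the `digests` table has a `session_uid` column; that column name and the `lookup_digest` helper are hypothetical. SQLite's `mode=ro` URI parameter makes the connection refuse writes.

```python
import os
import sqlite3
from pathlib import Path
from typing import Any, Optional


def lookup_digest(session_uid: str, db_path: Optional[str] = None) -> Optional[dict[str, Any]]:
    """Fetch one digest row as a dict, or None when the uid is unknown."""
    path = db_path or os.environ.get("HELIX_STORE_DB", "session_memory/.store/mem.db")
    # as_uri() needs an absolute path; mode=ro opens the file read-only.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    conn.row_factory = sqlite3.Row
    try:
        row = conn.execute(
            "SELECT * FROM digests WHERE session_uid = ?",  # assumed column name
            (session_uid,),
        ).fetchone()
    finally:
        conn.close()
    return dict(row) if row is not None else None
```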
 ## Retention knobs (`[retention]` in config.toml)
 | Key | Meaning |
@@ -66,10 +233,28 @@ exists, except the explicitly-reported hard-cap overflow path.
 ## Tests
 ```bash
-python -m pytest          # 26 tests: schema, adapter, store, digest, retention, ingest
+python -m pytest          # schema, adapters, store, digest, retention, ingest, detect, curate
 ```
 ## Status
-Phase 0 (AGENTIC-WP-0002): Claude adapter only, end to end. Codex and Grok
-adapters are designed (schemas confirmed in the design doc) and land in Phase 1.
+- **Phase 0** (AGENTIC-WP-0002): schema, store, digest, budget retention, Claude
+  adapter, ingest sweep.
 - **Phase 1** (AGENTIC-WP-0003): Codex + Grok adapters, multi-file session merge,
  and the Detect pipeline (signals → clustering → cross-flavor candidate patterns).
 - **Phase 2** (AGENTIC-WP-0004): Curate — Solution Pattern schema, versioned
  files-first Pattern Catalog, discuss/approve/reject review with an evidence bar +
  bloat guard, and hub-decision audit trail.
 - **Detect hardening** (AGENTIC-WP-0005): session-quality filter + tool-mix /
  infra-overhead signals. **Error mining** (AGENTIC-WP-0006): recurring error
  fingerprints → root-cause patterns.
 - **Phase 3** (AGENTIC-WP-0007): Distribute — per-flavor distributor adapters
  render approved patterns into proposed (HITL) artifacts, scoped by repo/domain,
  with an active-pattern registry.
 - **Phase 4** (AGENTIC-WP-0009): Measure — fleet baseline/trend + before/after
  per-pattern effectiveness. The Capture → Detect → Curate → Distribute → Measure
  loop is closed.
 - **Weekly retro** (AGENTIC-WP-0010): windowed top-3-per-repo + hub `coding_retro`
  publish.
 - **Kaizen correlation** (AGENTIC-WP-0011): bidirectional doc links, session-close
  `HELIX_*` env convention, `digest_lookup` read path.
--- a/session_memory/adapters/claude.py
+++ b/session_memory/adapters/claude.py
@@ -11,54 +11,23 @@ that the store persists out-of-line so Tier 1 rows stay light.
 from __future__ import annotations
 import json
 import os
-from dataclasses import dataclass, field
+from typing import Any, Optional
 from datetime import datetime, timezone
 from typing import Any, Iterable, Optional
 from ..core.schema import Cost, Session, SessionEvent
 from .common import (  # noqa: F401  (Normalized re-exported for back-compat)
    Normalized,
    classify_tool,
    first_line as _first_line,
    iter_jsonl as _iter_records,
    now_iso as _now,
    resolve_repo as _resolve_repo,
    seconds_between as _seconds_between,
    stringify as _stringify,
 )
 FLAVOR = "claude"
 # tool_use names that mutate files -> kind "edit"
 _EDIT_TOOLS = {"Edit", "Write", "NotebookEdit", "MultiEdit"}
 # crude test-runner detection inside Bash commands -> kind "test_run"
 _TEST_HINTS = ("pytest", "unittest", "npm test", "npm run test", "go test", "cargo test", "jest", "vitest")
@dataclass
 class Normalized:
    session: Session
    events: list[SessionEvent]
    blobs: dict[str, str] = field(default_factory=dict)
 def _iter_records(path: str) -> Iterable[dict[str, Any]]:
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                continue  # tolerate partial/corrupt trailing lines
 def _resolve_repo(cwd: Optional[str], repo_domain_map: dict[str, str]) -> tuple[Optional[str], Optional[str]]:
    """cwd -> (repo, domain). repo is the cwd basename; domain via map."""
    if not cwd:
        return None, None
    repo = os.path.basename(cwd.rstrip("/")) or None
    domain = repo_domain_map.get(repo) if repo else None
    return repo, domain
 def _is_test_command(text: str) -> bool:
    low = text.lower()
    return any(h in low for h in _TEST_HINTS)
 def _content_blocks(message: dict[str, Any]) -> list[dict[str, Any]]:
    content = message.get("content")
@@ -159,11 +128,8 @@ def parse_session(path: str, repo_domain_map: Optional[dict[str, str]] = None) -
                    name = b.get("name", "")
                    inp = b.get("input", {})
                    body = _stringify(inp)
-                    kind = "tool_call"
-                    if name in _EDIT_TOOLS:
-                        kind = "edit"
-                    elif name == "Bash" and _is_test_command(_stringify(inp.get("command", ""))):
-                        kind = "test_run"
+                    cmd = inp.get("command", "") if isinstance(inp, dict) else ""
+                    kind = classify_tool(name, _stringify(cmd))
                    add_event(uuid, parent, ts, kind, role="assistant", tool=name,
                              summary=f"{name}", body=body, sidechain=sidechain)
@@ -194,35 +160,3 @@ def parse_session(path: str, repo_domain_map: Optional[dict[str, str]] = None) -
        discovered_at=_now(),
    )
    return Normalized(session=session, events=events, blobs=blobs)
 # ---- helpers ---------------------------------------------------------------
 def _stringify(v: Any) -> str:
    if v is None:
        return ""
    if isinstance(v, str):
        return v
    try:
        return json.dumps(v, ensure_ascii=False)[:20000]
    except (TypeError, ValueError):
        return str(v)[:20000]
 def _first_line(text: str) -> str:
    return (text or "").strip().splitlines()[0] if (text or "").strip() else ""
 def _seconds_between(start: Optional[str], end: Optional[str]) -> float:
    if not start or not end:
        return 0.0
    try:
        a = datetime.fromisoformat(start.replace("Z", "+00:00"))
        b = datetime.fromisoformat(end.replace("Z", "+00:00"))
        return max(0.0, (b - a).total_seconds())
    except ValueError:
        return 0.0
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
--- a/session_memory/adapters/codex.py
+++ b/session_memory/adapters/codex.py
@@ -0,0 +1,167 @@
 """OpenAI Codex CLI collector adapter — Tier 0 -> Tier 1 (design §2.2, §4.3).
 Reads ``$CODEX_HOME/sessions/YYYY/MM/DD/rollout-*.jsonl``. Each line is a
 ``RolloutLine`` wrapper ``{timestamp, type, payload}``; ``type`` discriminates
 ``session_meta`` / ``response_item`` / ``event_msg`` / ``turn_context`` /
 ``compacted``.
 Codex is **flat** — tool calls and outputs are joined only by ``call_id`` with no
 parent-ref DAG — so ``seq`` is assigned by temporal (line) order and
 ``parent_seq`` is set for ``function_call_output`` back to its ``function_call``.
 """
 from __future__ import annotations
 import os
 from typing import Any, Optional
 from ..core.schema import Cost, Session, SessionEvent
 from .common import (
    Normalized,
    classify_tool,
    first_line,
    iter_jsonl,
    now_iso,
    resolve_repo,
    seconds_between,
    stringify,
 )
 FLAVOR = "codex"
 def _message_text(payload: dict[str, Any]) -> str:
    content = payload.get("content")
    if isinstance(content, str):
        return content
    parts = []
    if isinstance(content, list):
        for b in content:
            if isinstance(b, dict):
                parts.append(b.get("text") or b.get("output_text") or "")
            elif isinstance(b, str):
                parts.append(b)
    return "\n".join(p for p in parts if p)
 def _extract_tokens(payload: dict[str, Any]) -> tuple[int, int, int]:
    """Best-effort (input, output, cache) from a token_count payload.
     Field shapes vary across Codex versions; probe the known locations in order
     and return zeros when none of them carries token counts.
    """
    for scope in (payload, payload.get("info") or {}, payload.get("usage") or {},
                  (payload.get("info") or {}).get("total_token_usage") or {}):
        if isinstance(scope, dict):
            i = scope.get("input_tokens") or scope.get("prompt_tokens")
            o = scope.get("output_tokens") or scope.get("completion_tokens")
            if i is not None or o is not None:
                cache = scope.get("cached_input_tokens") or scope.get("cache_read_input_tokens") or 0
                return int(i or 0), int(o or 0), int(cache or 0)
    return 0, 0, 0
 def parse_session(path: str, repo_domain_map: Optional[dict[str, str]] = None) -> Optional[Normalized]:
    repo_domain_map = repo_domain_map or {}
    records = list(iter_jsonl(path))
    if not records:
        return None
    session_id: Optional[str] = None
    cwd = model = cli_version = None
    timestamps: list[str] = []
    events: list[SessionEvent] = []
    blobs: dict[str, str] = {}
    call_seq: dict[str, int] = {}  # call_id -> seq of its function_call
    cost = Cost()
    seq = 0
    def add_event(ts, kind, *, role=None, tool=None, summary=None, body=None,
                  tokens=0, parent_seq=None) -> int:
        nonlocal seq
        s = seq
        seq += 1
        payload_ref = None
        if body:
            payload_ref = f"blob://{session_id}/{s}"
            blobs[payload_ref] = body
        events.append(SessionEvent(
            session_uid=Session.make_uid(FLAVOR, session_id or "unknown"),
            seq=s, parent_seq=parent_seq, ts=ts, kind=kind, role=role, tool=tool,
            summary=(summary or "")[:300] or None, payload_ref=payload_ref, tokens=tokens,
        ))
        return s
    for rec in records:
        rtype = rec.get("type")
        ts = rec.get("timestamp")
        if ts:
            timestamps.append(ts)
        payload = rec.get("payload") or {}
        if rtype == "session_meta":
            session_id = session_id or payload.get("id")
            cwd = cwd or payload.get("cwd")
            model = model or payload.get("model")
            cli_version = cli_version or payload.get("cli_version")
        elif rtype == "turn_context":
            model = model or payload.get("model")
        elif rtype == "response_item":
            ptype = payload.get("type")
            if ptype == "message":
                role = payload.get("role", "assistant")
                text = _message_text(payload)
                kind = "assistant_msg" if role == "assistant" else "user_msg"
                add_event(ts, kind, role=role, summary=first_line(text), body=text)
            elif ptype == "function_call":
                name = payload.get("name", "")
                args = stringify(payload.get("arguments"))
                kind = classify_tool(name, args)
                s = add_event(ts, kind, role="assistant", tool=name,
                              summary=name, body=args)
                call_id = payload.get("call_id")
                if call_id:
                    call_seq[call_id] = s
            elif ptype == "function_call_output":
                call_id = payload.get("call_id")
                parent = call_seq.get(call_id)
                body = stringify(payload.get("output"))
                add_event(ts, "tool_result", role="tool", tool=None,
                          summary="tool result", body=body, parent_seq=parent)
            elif ptype == "reasoning":
                body = _message_text(payload) or stringify(payload.get("summary"))
                add_event(ts, "thinking", role="assistant", summary="reasoning", body=body)
        elif rtype == "event_msg":
            ptype = payload.get("type")
            if ptype == "task_started":
                add_event(ts, "lifecycle", summary="task_started")
            elif ptype == "task_complete":
                add_event(ts, "completion", summary="task_complete")
            elif ptype == "token_count":
                i, o, c = _extract_tokens(payload)
                cost.input_tokens += i
                cost.output_tokens += o
                cost.cache_tokens += c
            # user_message / agent_message echoes are duplicated by response_item
            # messages on modern Codex; skipped to avoid double counting.
    if session_id is None:
        return None
    cost.turns = sum(1 for e in events if e.kind == "user_msg")
    started = min(timestamps) if timestamps else None
    ended = max(timestamps) if timestamps else None
    cost.wall_clock_s = seconds_between(started, ended)
    repo, domain = resolve_repo(cwd, repo_domain_map)
    session = Session(
        session_uid=Session.make_uid(FLAVOR, session_id),
        flavor=FLAVOR, native_session_id=session_id,
        repo=repo, domain=domain, cwd=cwd, model=model,
        started_at=started, ended_at=ended, outcome="unknown", cost=cost,
        source_path=path, source_bytes=os.path.getsize(path) if os.path.exists(path) else 0,
        discovered_at=now_iso(),
    )
    return Normalized(session=session, events=events, blobs=blobs)
--- a/session_memory/adapters/common.py
+++ b/session_memory/adapters/common.py
@@ -0,0 +1,100 @@
 """Shared adapter helpers (Tier 0 -> Tier 1).
 The ``Normalized`` bundle contract and small flavor-agnostic helpers used by every
 collector adapter. Per-flavor parsing lives in the individual adapter modules.
 """
 from __future__ import annotations
 import json
 import os
 from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from typing import Any, Optional
 from ..core.schema import Session, SessionEvent
 # tool names that mutate files -> kind "edit" (union across flavors)
 EDIT_TOOLS = {
    "Edit", "Write", "NotebookEdit", "MultiEdit",  # Claude
    "apply_patch", "write_file", "edit_file",        # Codex / Grok variants
 }
 # substrings in a shell/tool command that indicate a test run -> kind "test_run"
 TEST_HINTS = (
    "pytest", "unittest", "npm test", "npm run test", "go test",
    "cargo test", "jest", "vitest", "make test", "tox",
 )
@dataclass
 class Normalized:
    session: Session
    events: list[SessionEvent]
    blobs: dict[str, str] = field(default_factory=dict)
 def resolve_repo(cwd: Optional[str], repo_domain_map: dict[str, str]) -> tuple[Optional[str], Optional[str]]:
    """cwd -> (repo, domain). repo is the cwd basename; domain via map."""
    if not cwd:
        return None, None
    repo = os.path.basename(cwd.rstrip("/")) or None
    domain = repo_domain_map.get(repo) if repo else None
    return repo, domain
 def is_test_command(text: str) -> bool:
    low = (text or "").lower()
    return any(h in low for h in TEST_HINTS)
 def classify_tool(name: str, command_text: str = "") -> str:
    """Map a tool invocation to an event kind: edit | test_run | tool_call."""
    if name in EDIT_TOOLS:
        return "edit"
    if is_test_command(command_text) or is_test_command(name):
        return "test_run"
    return "tool_call"
 def stringify(v: Any, limit: int = 20000) -> str:
    if v is None:
        return ""
    if isinstance(v, str):
        return v[:limit]
    try:
        return json.dumps(v, ensure_ascii=False)[:limit]
    except (TypeError, ValueError):
        return str(v)[:limit]
 def first_line(text: str) -> str:
    t = (text or "").strip()
    return t.splitlines()[0] if t else ""
 def seconds_between(start: Optional[str], end: Optional[str]) -> float:
    if not start or not end:
        return 0.0
    try:
        a = datetime.fromisoformat(start.replace("Z", "+00:00"))
        b = datetime.fromisoformat(end.replace("Z", "+00:00"))
        return max(0.0, (b - a).total_seconds())
    except ValueError:
        return 0.0
 def iter_jsonl(path: str):
    """Yield parsed JSON objects from a JSONL file, tolerating bad lines."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                continue
 def now_iso() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
--- a/session_memory/adapters/grok.py
+++ b/session_memory/adapters/grok.py
@@ -0,0 +1,182 @@
 """Grok CLI collector adapter — Tier 0 -> Tier 1 (design §2.3, §4.3).
 A Grok session is a *directory* ``~/.grok/sessions/<enc-cwd>/<uuid>/`` containing
 ``summary.json`` (metadata), ``chat_history.jsonl`` (the canonical transcript),
 ``events.jsonl`` (explicit lifecycle + ``turn_number``), and ``updates.jsonl``
 (ACP ``session/update`` stream, which carries tool-call names/args).
 The ingest glob matches ``chat_history.jsonl``; this adapter derives its sibling
 files from the same directory. Conversation order is taken from
 ``chat_history.jsonl``; tool-call names are paired, in order, from
 ``updates.jsonl`` ``tool_call`` entries to classify edits/test runs.
 """
 from __future__ import annotations
 import json
 import os
 from typing import Any, Optional
 from ..core.schema import Cost, Session, SessionEvent
 from .common import (
    Normalized,
    classify_tool,
    first_line,
    iter_jsonl,
    now_iso,
    resolve_repo,
    seconds_between,
    stringify,
 )
 FLAVOR = "grok"
 def _text_content(content: Any) -> str:
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        return "\n".join(
            (b.get("text") or "") for b in content if isinstance(b, dict)
        )
    return ""
 def _tool_calls_in_order(session_dir: str) -> list[dict[str, Any]]:
    """Ordered list of {title, rawInput} from updates.jsonl tool_call entries."""
    calls: list[dict[str, Any]] = []
    upd = os.path.join(session_dir, "updates.jsonl")
    if not os.path.exists(upd):
        return calls
    for rec in iter_jsonl(upd):
        u = (rec.get("params") or {}).get("update") or {}
        if u.get("sessionUpdate") == "tool_call":
            calls.append({"title": u.get("title") or "", "rawInput": u.get("rawInput") or {},
                          "id": u.get("toolCallId")})
    return calls
 def _session_meta(session_dir: str) -> dict[str, Any]:
    p = os.path.join(session_dir, "summary.json")
    if not os.path.exists(p):
        return {}
    try:
        with open(p, "r", encoding="utf-8") as f:
            return json.load(f)
    except (OSError, ValueError):
        return {}
 def _lifecycle(session_dir: str) -> tuple[list[dict[str, Any]], Optional[str]]:
    """events.jsonl records + the model id seen there."""
    evs, model = [], None
    p = os.path.join(session_dir, "events.jsonl")
    if os.path.exists(p):
        for rec in iter_jsonl(p):
            evs.append(rec)
            model = model or rec.get("model_id")
    return evs, model
 def parse_session(path: str, repo_domain_map: Optional[dict[str, str]] = None) -> Optional[Normalized]:
    repo_domain_map = repo_domain_map or {}
    # accept either the chat_history.jsonl path or the session dir
    session_dir = path if os.path.isdir(path) else os.path.dirname(path)
    chat = os.path.join(session_dir, "chat_history.jsonl")
    if not os.path.exists(chat):
        return None
    meta = _session_meta(session_dir)
    info = meta.get("info") or {}
    session_id = info.get("id") or os.path.basename(session_dir.rstrip("/"))
    cwd = info.get("cwd") or meta.get("git_root_dir")
    life_events, life_model = _lifecycle(session_dir)
    model = meta.get("current_model_id") or life_model
    pending_calls = _tool_calls_in_order(session_dir)
    call_idx = 0
    events: list[SessionEvent] = []
    blobs: dict[str, str] = {}
    seq = 0
    def add(kind, *, role=None, tool=None, summary=None, body=None, parent_seq=None) -> int:
        nonlocal seq
        s = seq
        seq += 1
        ref = None
        if body:
            ref = f"blob://{session_id}/{s}"
            blobs[ref] = body
        events.append(SessionEvent(
            session_uid=Session.make_uid(FLAVOR, session_id), seq=s, parent_seq=parent_seq,
            ts=None, kind=kind, role=role, tool=tool,
            summary=(summary or "")[:300] or None, payload_ref=ref,
        ))
        return s
    # explicit lifecycle first (turn_started/turn_ended carry no bodies)
    for le in life_events:
        t = le.get("type")
        if t in ("turn_started", "loop_started", "turn_ended", "phase_changed"):
            add("lifecycle", summary=t)
    for rec in iter_jsonl(chat):
        rtype = rec.get("type")
        content = rec.get("content")
        if rtype == "user":
            text = _text_content(content)
            if text.strip():
                add("user_msg", role="user", summary=first_line(text), body=text)
        elif rtype == "reasoning":
            text = _text_content(content)
            if text.strip():
                add("thinking", role="assistant", summary="reasoning", body=text)
        elif rtype == "assistant":
            text = _text_content(content)
            if text.strip():
                add("assistant_msg", role="assistant", summary=first_line(text), body=text)
        elif rtype == "tool_result":
            # pair with the next tool_call (in order) to recover name/args
            tool = None
            parent = None
            if call_idx < len(pending_calls):
                call = pending_calls[call_idx]
                call_idx += 1
                tool = call["title"]
                cmd = stringify(call["rawInput"])
                kind = classify_tool(tool, cmd)
                parent = add(kind, role="assistant", tool=tool, summary=tool, body=cmd)
            body = _text_content(content) if not isinstance(content, str) else content
            add("tool_result", role="tool", tool=tool, summary="tool result",
                body=stringify(body), parent_seq=parent)
    if not events:
        return None
    cost = Cost(turns=sum(1 for e in events if e.kind == "user_msg"))
    started = info.get("created_at") or meta.get("created_at")
    ended = meta.get("last_active_at") or info.get("updated_at") or meta.get("updated_at")
    cost.wall_clock_s = seconds_between(started, ended)
    repo, domain = resolve_repo(cwd, repo_domain_map)
    session = Session(
        session_uid=Session.make_uid(FLAVOR, session_id), flavor=FLAVOR,
        native_session_id=session_id, repo=repo, domain=domain, cwd=cwd,
        git_branch=meta.get("head_branch"), model=model,
        started_at=started, ended_at=ended, outcome="unknown", cost=cost,
        source_path=chat,
        source_bytes=_dir_bytes(session_dir),
        discovered_at=now_iso(),
    )
    return Normalized(session=session, events=events, blobs=blobs)
 def _dir_bytes(d: str) -> int:
    total = 0
    for root, _, files in os.walk(d):
        for f in files:
            try:
                total += os.path.getsize(os.path.join(root, f))
            except OSError:
                pass
    return total
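Review note: `parse_session` above recovers tool names by pairing each `tool_result` record with the next unconsumed entry from `_tool_calls_in_order`, purely by position. The sketch below isolates that pairing so it can be checked on its own. `pair_results_with_calls` and the sample records are hypothetical names for illustration, not part of this change, and the record shape (`type`, `content`, `title`) is assumed from the code above.

```python
from typing import Any, Optional


def pair_results_with_calls(
    records: list[dict[str, Any]],
    calls: list[dict[str, Any]],
) -> list[tuple[Optional[str], Any]]:
    """Pair each tool_result record with the next unused call, in order.

    Hypothetical helper: mirrors the call_idx bookkeeping in parse_session.
    Results that arrive after the calls run out are kept with tool=None.
    """
    paired: list[tuple[Optional[str], Any]] = []
    call_idx = 0
    for rec in records:
        if rec.get("type") != "tool_result":
            continue  # only tool results consume a pending call
        tool = None
        if call_idx < len(calls):
            tool = calls[call_idx].get("title")
            call_idx += 1
        paired.append((tool, rec.get("content")))
    return paired


sample_records = [
    {"type": "user", "content": "run the tests"},
    {"type": "tool_result", "content": "3 passed"},
    {"type": "tool_result", "content": "file contents"},
    {"type": "tool_result", "content": "orphan result"},
]
sample_calls = [{"title": "Bash"}, {"title": "Read"}]
pairs = pair_results_with_calls(sample_records, sample_calls)
```

Positional pairing is only as good as the assumption that both logs stay in step: if a call was recorded without a matching result, every later result would presumably be attributed to the wrong tool. Matching on a shared call id, where the session format provides one, would be more robust.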
--- a/session_memory/catalog/sp-problem-budget_overrun-tokens.history.jsonl
+++ b/session_memory/catalog/sp-problem-budget_overrun-tokens.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-problem-budget_overrun-tokens", "name": "problem: budget overrun", "polarity": "problem", "problem": "problem: budget overrun", "provenance": {"detected_at": null, "evidence": {"cost_impact": 10.667, "cross_flavor": false, "flavors": ["claude"], "frequency": 3, "key": "problem:budget_overrun:tokens", "locus": "tokens", "polarity": "problem", "repos": ["artifact-store", "citation-evidence", "infospace-bench"], "score": 32.001, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:6e0d3d68-872b-4d93-bb09-0691e091314b", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78"], "signal_type": "budget_overrun", "title": "problem: budget overrun"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:budget_overrun:tokens"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["artifact-store", "citation-evidence", "infospace-bench"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-budget_overrun-tokens.json
+++ b/session_memory/catalog/sp-problem-budget_overrun-tokens.json
@@ -0,0 +1,77 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": true,
  "id": "sp-problem-budget_overrun-tokens",
  "name": "Budget overrun: token cost above peers",
  "polarity": "problem",
  "problem": "A session's token cost lands well above its peers (>p90). Usually driven by re-reading large files or tool outputs, carrying redundant context, or long exploratory loops without checkpoints.",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 10.667,
      "cross_flavor": false,
      "flavors": [
        "claude"
      ],
      "frequency": 3,
      "key": "problem:budget_overrun:tokens",
      "locus": "tokens",
      "polarity": "problem",
      "repos": [
        "artifact-store",
        "citation-evidence",
        "infospace-bench"
      ],
      "score": 32.001,
      "sessions": [
        "claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca",
        "claude:6e0d3d68-872b-4d93-bb09-0691e091314b",
        "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78"
      ],
      "signal_type": "budget_overrun",
      "title": "problem: budget overrun"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "problem:budget_overrun:tokens"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    }
  },
  "resolutions": [
    {
      "detail": "Use offset/limit; don't re-Read a file already in the transcript.",
      "steps": [
        "Locate with grep/glob first",
        "Read only the relevant span"
      ],
      "summary": "Read narrowly \u2014 target the region you need, not whole large files"
    },
    {
      "detail": "Summarize progress; avoid re-pulling outputs already shown.",
      "steps": [],
      "summary": "Checkpoint and prune context instead of re-fetching it"
    },
    {
      "detail": "grep/glob narrows scope far cheaper than reading whole trees.",
      "steps": [],
      "summary": "Prefer targeted search over broad reads to locate code"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude"
    ],
    "repos": [
      "artifact-store",
      "citation-evidence",
      "infospace-bench"
    ]
  },
  "status": "approved",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-problem-file_not_read-edit.history.jsonl
+++ b/session_memory/catalog/sp-problem-file_not_read-edit.history.jsonl
@@ -0,0 +1 @@
 {"covers": [], "created_at": "2026-06-07T13:26:25Z", "distribution_ready": true, "id": "sp-problem-file_not_read-edit", "name": "Read before you Edit", "polarity": "problem", "problem": "Agents call Edit/Write on a file they have not read in the current session, or after it changed under them. The edit tools reject this ('File has not been read yet' / 'File has been modified since read'), and the retry burns a turn. Top recurring error in the corpus (12/27 sessions, 8 repos).", "provenance": {"detected_at": null, "evidence": {"frequency": 32, "origin": "AGENTIC-WP-0006 error mining / ASSESSMENT-infra-friction.md", "polarity": "problem", "repos": 8, "sessions": 12}, "promoted_at": null, "source_key": "problem:file_not_read:edit"}, "rendering_hints": {"claude": {"target": "CLAUDE.md"}, "codex": {"target": "AGENTS.md"}, "grok": {"target": ".grok/instructions.md"}}, "resolutions": [{"detail": "Never blind-write a file you haven't read this session.", "steps": ["Read the target file", "Then Edit/Write"], "summary": "Read the file (or the region you'll touch) before Edit/Write"}, {"detail": "A stale read means the file changed under you; refresh, don't loop.", "steps": ["Re-Read the file", "Re-apply the Edit"], "summary": "On 'modified since read', re-Read then re-Edit"}], "schema_version": 1, "scope": {"domains": [], "flavors": [], "repos": []}, "status": "superseded", "updated_at": "2026-06-07T13:26:25Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-file_not_read-edit.json
+++ b/session_memory/catalog/sp-problem-file_not_read-edit.json
@@ -0,0 +1,63 @@
 {
  "covers": [
    "file has not been read",
    "modified since read",
    "file_not_read"
  ],
  "created_at": "2026-06-07T13:26:25Z",
  "distribution_ready": true,
  "id": "sp-problem-file_not_read-edit",
  "name": "Read before you Edit",
  "polarity": "problem",
  "problem": "Agents call Edit/Write on a file they have not read in the current session, or after it changed under them. The edit tools reject this ('File has not been read yet' / 'File has been modified since read'), and the retry burns a turn. Top recurring error in the corpus (12/27 sessions, 8 repos).",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "frequency": 32,
      "origin": "AGENTIC-WP-0006 error mining / ASSESSMENT-infra-friction.md",
      "polarity": "problem",
      "repos": 8,
      "sessions": 12
    },
    "promoted_at": null,
    "source_key": "problem:file_not_read:edit"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    },
    "codex": {
      "target": "AGENTS.md"
    },
    "grok": {
      "target": ".grok/instructions.md"
    }
  },
  "resolutions": [
    {
      "detail": "Never blind-write a file you haven't read this session.",
      "steps": [
        "Read the target file",
        "Then Edit/Write"
      ],
      "summary": "Read the file (or the region you'll touch) before Edit/Write"
    },
    {
      "detail": "A stale read means the file changed under you; refresh, don't loop.",
      "steps": [
        "Re-Read the file",
        "Re-apply the Edit"
      ],
      "summary": "On 'modified since read', re-Read then re-Edit"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [],
    "repos": []
  },
  "status": "approved",
  "updated_at": "2026-06-07T19:06:45Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-problem-infra_overhead-infra_overhead.history.jsonl
+++ b/session_memory/catalog/sp-problem-infra_overhead-infra_overhead.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": false, "id": "sp-problem-infra_overhead-infra_overhead", "name": "problem: infra overhead", "polarity": "problem", "problem": "problem: infra overhead", "provenance": {"detected_at": null, "evidence": {"cost_impact": 0.801, "cross_flavor": false, "flavors": ["claude"], "frequency": 2, "key": "problem:infra_overhead:infra_overhead", "locus": "infra_overhead", "polarity": "problem", "repos": ["markitect-main", "vergabe-teilnahme"], "score": 1.602, "sessions": ["claude:135002f9-98d2-4d1b-b8fb-543b20388782", "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74"], "signal_type": "infra_overhead", "title": "problem: infra overhead"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:infra_overhead:infra_overhead"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["markitect-main", "vergabe-teilnahme"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-infra_overhead-infra_overhead.json
+++ b/session_memory/catalog/sp-problem-infra_overhead-infra_overhead.json
@@ -0,0 +1,74 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": false,
  "id": "sp-problem-infra_overhead-infra_overhead",
  "name": "Infrastructure overhead: too much coordination plumbing",
  "polarity": "problem",
  "problem": "A large share of the session's tool calls are State Hub / task-management / schema-loading plumbing rather than touching the repo (corpus median 11.7%, up to 43% in the worst sessions; one session made 231 hub calls).",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 0.801,
      "cross_flavor": false,
      "flavors": [
        "claude"
      ],
      "frequency": 2,
      "key": "problem:infra_overhead:infra_overhead",
      "locus": "infra_overhead",
      "polarity": "problem",
      "repos": [
        "markitect-main",
        "vergabe-teilnahme"
      ],
      "score": 1.602,
      "sessions": [
        "claude:135002f9-98d2-4d1b-b8fb-543b20388782",
        "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74"
      ],
      "signal_type": "infra_overhead",
      "title": "problem: infra overhead"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "problem:infra_overhead:infra_overhead"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    }
  },
  "resolutions": [
    {
      "detail": "Update several task statuses together; emit fewer, coarser progress events.",
      "steps": [
        "Do a chunk of work",
        "Then sync statuses in one pass"
      ],
      "summary": "Batch hub writes \u2014 sync at checkpoints, not per event"
    },
    {
      "detail": "One scoped summary at session start beats many broad reads.",
      "steps": [],
      "summary": "Orient once with get_domain_summary, don't re-query repeatedly"
    },
    {
      "detail": "See STATE-WP-0058 \u2014 stops the repeated ToolSearch for hub tools.",
      "steps": [],
      "summary": "Front-load hub tool knowledge via the State Hub skill"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude"
    ],
    "repos": [
      "markitect-main",
      "vergabe-teilnahme"
    ]
  },
  "status": "provisional",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-problem-schema_thrash-schema_load.history.jsonl
+++ b/session_memory/catalog/sp-problem-schema_thrash-schema_load.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-problem-schema_thrash-schema_load", "name": "problem: schema thrash", "polarity": "problem", "problem": "problem: schema thrash", "provenance": {"detected_at": null, "evidence": {"cost_impact": 79.0, "cross_flavor": false, "flavors": ["claude"], "frequency": 8, "key": "problem:schema_thrash:schema_load", "locus": "schema_load", "polarity": "problem", "repos": ["activity-core", "citation-evidence", "flex-auth", "infospace-bench", "ops-bridge", "vergabe-teilnahme"], "score": 632.0, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:30dbad62-c042-41f2-80c1-5953a1100e7f", "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8", "claude:63fd4df2-5add-4748-af21-c1544825e006", "claude:8313f946-f008-4e98-9915-31950380e39e", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78", "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74", "claude:bbcf1c2b-14be-40e4-826b-4b2b49b9d212"], "signal_type": "schema_thrash", "title": "problem: schema thrash"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:schema_thrash:schema_load"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["activity-core", "citation-evidence", "flex-auth", "infospace-bench", "ops-bridge", "vergabe-teilnahme"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-schema_thrash-schema_load.json
+++ b/session_memory/catalog/sp-problem-schema_thrash-schema_load.json
@@ -0,0 +1,83 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": true,
  "id": "sp-problem-schema_thrash-schema_load",
  "name": "Schema thrash: repeated ToolSearch",
  "polarity": "problem",
  "problem": "ToolSearch fires repeatedly within a session (seen in 81% of sessions) because the State Hub MCP tools are deferred and their schemas get re-loaded each time they are needed \u2014 pure overhead with no work value.",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 79.0,
      "cross_flavor": false,
      "flavors": [
        "claude"
      ],
      "frequency": 8,
      "key": "problem:schema_thrash:schema_load",
      "locus": "schema_load",
      "polarity": "problem",
      "repos": [
        "activity-core",
        "citation-evidence",
        "flex-auth",
        "infospace-bench",
        "ops-bridge",
        "vergabe-teilnahme"
      ],
      "score": 632.0,
      "sessions": [
        "claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca",
        "claude:30dbad62-c042-41f2-80c1-5953a1100e7f",
        "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8",
        "claude:63fd4df2-5add-4748-af21-c1544825e006",
        "claude:8313f946-f008-4e98-9915-31950380e39e",
        "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78",
        "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74",
        "claude:bbcf1c2b-14be-40e4-826b-4b2b49b9d212"
      ],
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "problem:schema_thrash:schema_load"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    }
  },
  "resolutions": [
    {
      "detail": "Resolve them by name in one ToolSearch (select:...) rather than searching ad hoc.",
      "steps": [
        "List the hub tools the session needs",
        "Load them once at the start"
      ],
      "summary": "Load the tool schemas you'll need once, up front"
    },
    {
      "detail": "The skill carries the schemas so no per-use discovery is needed.",
      "steps": [],
      "summary": "Adopt the State Hub skill that front-loads common hub tool signatures"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude"
    ],
    "repos": [
      "activity-core",
      "citation-evidence",
      "flex-auth",
      "infospace-bench",
      "ops-bridge",
      "vergabe-teilnahme"
    ]
  },
  "status": "approved",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-problem-tool_thrash-tool-bash.history.jsonl
+++ b/session_memory/catalog/sp-problem-tool_thrash-tool-bash.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-problem-tool_thrash-tool-bash", "name": "problem: tool thrash", "polarity": "problem", "problem": "problem: tool thrash", "provenance": {"detected_at": null, "evidence": {"cost_impact": 1990.0, "cross_flavor": false, "flavors": ["claude"], "frequency": 11, "key": "problem:tool_thrash:tool:Bash", "locus": "tool:Bash", "polarity": "problem", "repos": ["activity-core", "artifact-store", "citation-evidence", "ihp-railiance-probe", "infospace-bench", "railiance-apps", "state-hub", "vergabe-teilnahme"], "score": 21890.0, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:2c0d14e1-d089-4076-bf35-b134737a261d", "claude:30dbad62-c042-41f2-80c1-5953a1100e7f", "claude:4307eff6-cd39-4189-be58-79a3acb69d6c", "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8", "claude:6e0d3d68-872b-4d93-bb09-0691e091314b", "claude:8313f946-f008-4e98-9915-31950380e39e", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78", "claude:a9483f07-c9dc-4f71-9fa0-831790ea965e", "claude:b1dfbcfa-91f9-4540-823a-26fcfaab7fc8", "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74"], "signal_type": "tool_thrash", "title": "problem: tool thrash"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:tool_thrash:tool:Bash"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["activity-core", "artifact-store", "citation-evidence", "ihp-railiance-probe", "infospace-bench", "railiance-apps", "state-hub", "vergabe-teilnahme"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-problem-tool_thrash-tool-bash.json
+++ b/session_memory/catalog/sp-problem-tool_thrash-tool-bash.json
@@ -0,0 +1,95 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": true,
  "id": "sp-problem-tool_thrash-tool-bash",
  "name": "Tool thrash: one tool hammered",
  "polarity": "problem",
  "problem": "A single tool (often Bash or Edit) is invoked far more than any other in a session \u2014 a sign of trial-and-error churn or missing higher-level tooling.",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 1990.0,
      "cross_flavor": false,
      "flavors": [
        "claude"
      ],
      "frequency": 11,
      "key": "problem:tool_thrash:tool:Bash",
      "locus": "tool:Bash",
      "polarity": "problem",
      "repos": [
        "activity-core",
        "artifact-store",
        "citation-evidence",
        "ihp-railiance-probe",
        "infospace-bench",
        "railiance-apps",
        "state-hub",
        "vergabe-teilnahme"
      ],
      "score": 21890.0,
      "sessions": [
        "claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca",
        "claude:2c0d14e1-d089-4076-bf35-b134737a261d",
        "claude:30dbad62-c042-41f2-80c1-5953a1100e7f",
        "claude:4307eff6-cd39-4189-be58-79a3acb69d6c",
        "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8",
        "claude:6e0d3d68-872b-4d93-bb09-0691e091314b",
        "claude:8313f946-f008-4e98-9915-31950380e39e",
        "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78",
        "claude:a9483f07-c9dc-4f71-9fa0-831790ea965e",
        "claude:b1dfbcfa-91f9-4540-823a-26fcfaab7fc8",
        "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74"
      ],
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "problem:tool_thrash:tool:Bash"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    }
  },
  "resolutions": [
    {
      "detail": "Compose a single command/script; run independent calls in parallel.",
      "steps": [
        "Group the steps",
        "Run them as one block"
      ],
      "summary": "Batch related shell work into one script, not many small Bash calls"
    },
    {
      "detail": "Read the region, then one substantive Edit beats many tiny ones.",
      "steps": [],
      "summary": "Make fewer, larger edits with full context"
    },
    {
      "detail": "If the same invocation recurs, wrap it once.",
      "steps": [],
      "summary": "Factor a repeated command pattern into a helper"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude"
    ],
    "repos": [
      "activity-core",
      "artifact-store",
      "citation-evidence",
      "ihp-railiance-probe",
      "infospace-bench",
      "railiance-apps",
      "state-hub",
      "vergabe-teilnahme"
    ]
  },
  "status": "approved",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
--- a/session_memory/catalog/sp-success-clean_pass-outcome.history.jsonl
+++ b/session_memory/catalog/sp-success-clean_pass-outcome.history.jsonl
@@ -0,0 +1 @@
 {"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-success-clean_pass-outcome", "name": "cross-flavor success: clean pass", "polarity": "success", "problem": "cross-flavor success: clean pass", "provenance": {"detected_at": null, "evidence": {"cost_impact": 17.0, "cross_flavor": true, "flavors": ["claude", "grok"], "frequency": 17, "key": "success:clean_pass:outcome", "locus": "outcome", "polarity": "success", "repos": ["activity-core", "agentic-resources", "artifact-store", "can-you-assist", "citation-evidence", "infospace-bench", "issue-facade", "ops-bridge", "railiance-apps", "state-hub", "the-custodian", "vergabe-teilnahme"], "score": 433.5, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:16bdbec4-b018-4902-9fb5-336f8f3d61c8", "claude:2c0d14e1-d089-4076-bf35-b134737a261d", "claude:30dbad62-c042-41f2-80c1-5953a1100e7f", "claude:4307eff6-cd39-4189-be58-79a3acb69d6c", "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8", "claude:631de76e-fdee-43b5-b091-7b7675467ad1", "claude:63fd4df2-5add-4748-af21-c1544825e006", "claude:6e0d3d68-872b-4d93-bb09-0691e091314b", "claude:8313f946-f008-4e98-9915-31950380e39e", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78", "claude:a9483f07-c9dc-4f71-9fa0-831790ea965e", "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74", "claude:eb837dd1-5b8e-472e-b9e1-4537b10e03e6", "claude:ee9e84f2-bc35-4eb5-a7ad-aaec5f31d965", "claude:f1b25697-0e5f-45f0-81d1-af0f1762c438", "grok:019e6122-00c0-79f3-b4e5-9c70b77c015d"], "signal_type": "clean_pass", "title": "cross-flavor success: clean pass"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "success:clean_pass:outcome"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}, "grok": {"note": "TODO: refine rendering", "target": "instructions"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude", "grok"], "repos": ["activity-core", "agentic-resources", "artifact-store", "can-you-assist", "citation-evidence", "infospace-bench", "issue-facade", "ops-bridge", "railiance-apps", "state-hub", "the-custodian", "vergabe-teilnahme"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
--- a/session_memory/catalog/sp-success-clean_pass-outcome.json
+++ b/session_memory/catalog/sp-success-clean_pass-outcome.json
@@ -0,0 +1,110 @@
 {
  "created_at": "2026-06-07T09:13:20Z",
  "distribution_ready": true,
  "id": "sp-success-clean_pass-outcome",
  "name": "Clean pass: tests green, no retries",
  "polarity": "success",
  "problem": "The target session shape: ends in success, runs the test suite, with no errors and no retries \u2014 resolves cheaply and reliably. Seen across many sessions and both Claude and Grok (the highest-value pattern to reinforce).",
  "provenance": {
    "detected_at": null,
    "evidence": {
      "cost_impact": 17.0,
      "cross_flavor": true,
      "flavors": [
        "claude",
        "grok"
      ],
      "frequency": 17,
      "key": "success:clean_pass:outcome",
      "locus": "outcome",
      "polarity": "success",
      "repos": [
        "activity-core",
        "agentic-resources",
        "artifact-store",
        "can-you-assist",
        "citation-evidence",
        "infospace-bench",
        "issue-facade",
        "ops-bridge",
        "railiance-apps",
        "state-hub",
        "the-custodian",
        "vergabe-teilnahme"
      ],
      "score": 433.5,
      "sessions": [
        "claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca",
        "claude:16bdbec4-b018-4902-9fb5-336f8f3d61c8",
        "claude:2c0d14e1-d089-4076-bf35-b134737a261d",
        "claude:30dbad62-c042-41f2-80c1-5953a1100e7f",
        "claude:4307eff6-cd39-4189-be58-79a3acb69d6c",
        "claude:4340b160-2fb6-47d0-897c-3cac0a8855d8",
        "claude:631de76e-fdee-43b5-b091-7b7675467ad1",
        "claude:63fd4df2-5add-4748-af21-c1544825e006",
        "claude:6e0d3d68-872b-4d93-bb09-0691e091314b",
        "claude:8313f946-f008-4e98-9915-31950380e39e",
        "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78",
        "claude:a9483f07-c9dc-4f71-9fa0-831790ea965e",
        "claude:b4ae9631-a7eb-42a6-acb1-c65b660c4b74",
        "claude:eb837dd1-5b8e-472e-b9e1-4537b10e03e6",
        "claude:ee9e84f2-bc35-4eb5-a7ad-aaec5f31d965",
        "claude:f1b25697-0e5f-45f0-81d1-af0f1762c438",
        "grok:019e6122-00c0-79f3-b4e5-9c70b77c015d"
      ],
      "signal_type": "clean_pass",
      "title": "cross-flavor success: clean pass"
    },
    "promoted_at": "2026-06-07T09:13:20Z",
    "source_key": "success:clean_pass:outcome"
  },
  "rendering_hints": {
    "claude": {
      "target": "CLAUDE.md"
    },
    "grok": {
      "target": "instructions"
    }
  },
  "resolutions": [
    {
      "detail": "A passing suite is the cheapest proof the change works.",
      "steps": [
        "Make the change",
        "Run the suite",
        "Only then report done"
      ],
      "summary": "Run the test suite before declaring done; let green gate completion"
    },
    {
      "detail": "Small verified steps beat large unverified ones that bounce.",
      "steps": [],
      "summary": "Work incrementally and verify as you go to avoid retries"
    }
  ],
  "schema_version": 1,
  "scope": {
    "domains": [],
    "flavors": [
      "claude",
      "grok"
    ],
    "repos": [
      "activity-core",
      "agentic-resources",
      "artifact-store",
      "can-you-assist",
      "citation-evidence",
      "infospace-bench",
      "issue-facade",
      "ops-bridge",
      "railiance-apps",
      "state-hub",
      "the-custodian",
      "vergabe-teilnahme"
    ]
  },
  "status": "approved",
  "updated_at": "2026-06-07T14:21:06Z",
  "version": "1.0.1"
 }
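Review note: the `score` values in the catalog records above appear to follow a simple rule. This is an inference from the five records that carry a `score`, not something the diff states: `frequency * cost_impact` reproduces the four single-flavor scores, and the one cross-flavor record matches only with an extra factor of 1.5. The function name and the 1.5 multiplier below are assumptions for illustration.

```python
def evidence_score(frequency: int, cost_impact: float, cross_flavor: bool = False) -> float:
    """Assumed scoring rule, reverse-engineered from the catalog records.

    The 1.5 cross-flavor multiplier is a guess that fits the single
    cross-flavor record (clean_pass); treat it as unconfirmed.
    """
    bonus = 1.5 if cross_flavor else 1.0
    # Round to three decimals, the precision the catalog files use.
    return round(frequency * cost_impact * bonus, 3)


# (frequency, cost_impact, cross_flavor) copied from the records above.
catalog_evidence = {
    "budget_overrun": (3, 10.667, False),
    "infra_overhead": (2, 0.801, False),
    "schema_thrash": (8, 79.0, False),
    "tool_thrash": (11, 1990.0, False),
    "clean_pass": (17, 17.0, True),
}
scores = {name: evidence_score(*ev) for name, ev in catalog_evidence.items()}
```

If the rule holds, `tool_thrash` outranks everything else by two orders of magnitude because of its `cost_impact` of 1990.0, so the units of `cost_impact` per signal type are worth confirming before scores are compared across patterns.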
--- a/session_memory/config.toml
+++ b/session_memory/config.toml
@@ -20,20 +20,64 @@ root    = "~/.claude/projects"
 # glob, relative to root; covers sessions and agent-* sidechains
 glob    = "*/*.jsonl"
-# Codex / Grok adapters land in Phase 1 (schemas confirmed in the design doc).
+# Codex / Grok adapters added in Phase 1 (AGENTIC-WP-0003).
 [sources.codex]
-enabled = false
+enabled = true
 root    = "~/.codex/sessions"
 glob    = "*/*/*/rollout-*.jsonl"
 [sources.grok]
-enabled = false
+enabled = true
 root    = "~/.grok/sessions"
 glob    = "*/*/chat_history.jsonl"
 # Detect phase (AGENTIC-WP-0005): quality filter — drop non-coding/trivial sessions
 # before signals form, so health-checks don't mint false-positive patterns.
 [detect.quality]
 min_events      = 20   # below this many events, not a real coding session
 min_substantive = 3    # require >= this many substantive (edit/read/shell) tool calls
 min_prompt_len  = 25   # first prompt shorter than this is treated as trivial
 # Curate phase (AGENTIC-WP-0004): catalog location + promotion evidence bar.
 # Measure phase (AGENTIC-WP-0009): persisted baseline/trend of fleet metrics.
 [measure]
 baselines = "session_memory/measure/baselines.jsonl"  # timestamped metric snapshots (committed)
 # Weekly retro (AGENTIC-WP-0010): windowed top-3-per-repo report, published to the
 # hub as the coding_retro read model that activity-core's weekly schedule consumes.
 [retro]
 window_days = 7
 report_json = "session_memory/retro/last_retro.json"  # latest report (committed)
 report_md   = "session_memory/retro/last_retro.md"    # human-readable mirror
 hub_url     = "http://127.0.0.1:8000"                 # for --publish (best-effort)
 # Distribute phase (AGENTIC-WP-0007): where per-flavor proposals + the active
 # registry are written. Proposals are HITL — reviewed, never auto-applied.
 [distribute]
 proposals_dir   = "session_memory/proposals"                  # reviewable proposals (gitignored, regenerated)
 active_registry = "session_memory/distribute/active_patterns.json"  # what's proposed/active where (committed)
 [curate]
 catalog_dir    = "session_memory/catalog"               # files-first Pattern Catalog (committed)
 review_log     = "session_memory/.store/reviews.jsonl"  # remembered decisions (gitignored)
 decision_queue = "session_memory/.store/decisions.queue.jsonl"  # hub decisions pending sync
 state_hub_workstream_id = "b3703684-f60e-42f3-b03e-dabe3e8ce3f4"  # AGENTIC-WP-0004
 # Evidence bar (OQ5): floors to promote at all, and stricter floors to be
 # distribution-eligible (status=approved, distribution_ready=true).
 [curate.gate]
 min_frequency             = 2      # >= this many supporting signals to promote
 min_sessions              = 2      # >= this many distinct sessions
 min_cost_impact           = 0.0
 dist_require_cross_flavor = false  # require cross-flavor evidence to distribute
 dist_min_frequency        = 3
 dist_min_cost_impact      = 0.0
 # cwd basename -> domain slug. Used to tag sessions with their Custodian domain.
 [repo_domain_map]
 agentic-resources = "helix_forge"
 the-custodian     = "custodian"
 state-hub         = "custodian"
 ops-bridge        = "custodian"
 net-kingdom       = "netkingdom"
 can-you-assist    = "coulomb_social"
--- a/session_memory/core/digest.py
+++ b/session_memory/core/digest.py
@@ -12,6 +12,8 @@ belongs to the Detect phase (PRD §6.2).
 from __future__ import annotations
 import collections
 import json
 import re
 from typing import Any
 from .schema import Session, SessionEvent
@@ -21,6 +23,22 @@ _FAIL_HINTS = ("error", "failed", "exception", "traceback", "fatal", "non-zero")
 # Substrings suggesting a clean test pass.
 _PASS_HINTS = ("passed", "0 failed", "ok", "success")
 # A line that is numbered source content from a Read result (`cat -n` style),
 # e.g. "229\t    raise InfospaceError(" — code text, never a runtime error.
 _NUMBERED_LINE_RE = re.compile(r"^\s*\d+\t")
 # Top-level keys that mark a JSON tool-result as an actual error (vs. success).
 _JSON_ERROR_KEYS = ("error", "errors", "detail")
 # Normalization patterns so the same error collapses to one fingerprint
 # regardless of paths / ids / counts (WP-0006 T01).
 _UUID_RE = re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I)
 _HEXADDR_RE = re.compile(r"\b0x[0-9a-f]+\b", re.I)
 _PATH_RE = re.compile(r"(?:/[\w.\-]+)+/?|[A-Za-z]:\\[\w.\\\-]+")
 _NUM_RE = re.compile(r"\b\d+\b")
 _WS_RE = re.compile(r"\s+")
 _ERR_SAMPLE_MAX = 200
 _ERR_FP_MAX = 160
 def infer_outcome(events: list[SessionEvent], blobs: dict[str, str] | None = None) -> str:
    """Heuristic outcome label across flavors (design OQ2).
@@ -100,6 +118,7 @@ def build_digest(session: Session, events: list[SessionEvent],
        },
        "first_prompt": _first_prompt(events, blobs),
        "last_assistant": _last_assistant(events, blobs),
        "error_snippets": _error_snippets(events, blobs),
        "schema_version": session.schema_version,
    }
@@ -148,6 +167,114 @@ def _last_assistant(events, blobs):
    return None
 def _error_line(text: str) -> str:
    """Pick the most error-like line from a body.
    Prefers the *last* line matching a fail hint — in a Python traceback the
    actual exception is the final line, while the bare ``Traceback (most recent
    call last):`` header is just noise and is skipped.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    matches = [ln for ln in lines
               if any(h in ln.lower() for h in _FAIL_HINTS)
               and not ln.lower().startswith("traceback")]
    if matches:
        return matches[-1]
    # fall back to any fail-hint line (e.g. only the traceback header), else first
    any_hint = [ln for ln in lines if any(h in ln.lower() for h in _FAIL_HINTS)]
    return any_hint[-1] if any_hint else (lines[0] if lines else "")
 def _error_fingerprint(text: str) -> str:
    """Stable, content-addressable key for an error, paths/ids/numbers removed."""
    s = _error_line(text).lower()
    s = _UUID_RE.sub("<uuid>", s)
    s = _HEXADDR_RE.sub("<addr>", s)
    s = _PATH_RE.sub("<path>", s)
    s = _NUM_RE.sub("<n>", s)
    return _WS_RE.sub(" ", s).strip()[:_ERR_FP_MAX]
 def _error_body(event: SessionEvent, blobs: dict) -> str:
    """Best available text for a failed event."""
    if event.payload_ref and event.payload_ref in blobs:
        return blobs[event.payload_ref]
    return event.summary or ""
 def _looks_like_file_read(body: str) -> bool:
    """True if the body is mostly numbered source lines (a Read result), not an error."""
    lines = [ln for ln in body.splitlines() if ln.strip()]
    if not lines:
        return False
    numbered = sum(1 for ln in lines if _NUMBERED_LINE_RE.match(ln))
    return numbered >= max(3, len(lines) // 2)
 def _json_verdict(body: str) -> str | None:
    """Classify a JSON tool-result body: 'error', 'success', or None (not JSON).
    Hub MCP successes look like ``{"result": "..."}`` and mention 'error' deep
    inside summaries but are not failures ('success'). A payload with a top-level
    error key (``{"detail": ...}`` / ``{"error": ...}``) is 'error'. Non-JSON text
    returns None so the plain fail-hint heuristic still applies.
    """
    s = body.strip()
    if not s or s[0] not in "{[":
        return None
    try:
        obj = json.loads(s)
    except (ValueError, TypeError):
        return None
    if isinstance(obj, dict) and any(k in obj for k in _JSON_ERROR_KEYS):
        return "error"
    return "success"
 def _is_failed(event: SessionEvent, blobs: dict) -> bool:
    if event.kind == "error":
        return True
    if event.kind == "tool_result":
        body = _error_body(event, blobs)
        if not body.strip():
            return False
        if _looks_like_file_read(body):
            return False
        verdict = _json_verdict(body)
        if verdict is not None:
            return verdict == "error"
        return any(h in body.lower() for h in _FAIL_HINTS)
    return False
 def _error_snippets(events: list[SessionEvent], blobs: dict) -> list[dict]:
    """Collapse a session's failures into deduped, normalized error fingerprints.
    Durable in Tier 2 (the raw blobs may be evicted): each entry is
    ``{fingerprint, sample, count, tool}`` with same-fingerprint occurrences
    counted. Ordered by frequency (then first appearance) for stable output.
    """
    agg: dict[str, dict] = {}
    order: list[str] = []
    for e in events:
        if not _is_failed(e, blobs):
            continue
        body = _error_body(e, blobs)
        if not body.strip():
            continue
        fp = _error_fingerprint(body)
        if not fp:
            continue
        if fp not in agg:
            agg[fp] = {"fingerprint": fp, "sample": _error_line(body)[:_ERR_SAMPLE_MAX],
                       "count": 0, "tool": e.tool}
            order.append(fp)
        agg[fp]["count"] += 1
    snippets = [agg[fp] for fp in order]
     # list.sort is stable, so ties keep first-appearance order from ``order``
     snippets.sort(key=lambda s: -s["count"])
    return snippets
 def _read_blob(store, ref):
    row = store.db.execute("SELECT path FROM blobs WHERE ref=?", (ref,)).fetchone()
    if not row:
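
Note on the fingerprint normalization in digest.py: the sketch below re-creates the substitution chain from `_error_fingerprint` in isolation to show how two failures that differ only in paths and counters collapse to one key. The function name `fingerprint` and the sample error strings are illustrative assumptions, not part of the change; the regexes are copied from the diff.

```python
import re

# Patterns copied from the diff; applied in the same order
# (UUIDs, hex addresses, paths, then bare numbers).
UUID_RE = re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I)
HEXADDR_RE = re.compile(r"\b0x[0-9a-f]+\b", re.I)
PATH_RE = re.compile(r"(?:/[\w.\-]+)+/?|[A-Za-z]:\\[\w.\\\-]+")
NUM_RE = re.compile(r"\b\d+\b")
WS_RE = re.compile(r"\s+")


def fingerprint(line: str, max_len: int = 160) -> str:
    """Hypothetical stand-in for _error_fingerprint, minus the line-picking step."""
    s = line.lower()
    s = UUID_RE.sub("<uuid>", s)
    s = HEXADDR_RE.sub("<addr>", s)
    s = PATH_RE.sub("<path>", s)
    s = NUM_RE.sub("<n>", s)
    return WS_RE.sub(" ", s).strip()[:max_len]


# Two made-up failures that differ only in path and attempt counter.
a = fingerprint("FileNotFoundError: /tmp/run-17/out.json not found (attempt 3)")
b = fingerprint("FileNotFoundError: /home/ci/build/out.json not found (attempt 12)")
print(a)
print(a == b)
```

Order matters here: paths are replaced before bare numbers, so digits inside a path never produce a stray `<n>` token.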
--- a/session_memory/core/schema.py
+++ b/session_memory/core/schema.py
@@ -11,7 +11,7 @@ import json
 from dataclasses import asdict, dataclass, field, fields
 from typing import Any, Optional
-SCHEMA_VERSION = 1
+SCHEMA_VERSION = 2  # v2: digest carries error_snippets (WP-0006 T01)
 # Supported agent flavors. ``session_uid`` is always "<flavor>:<native id>".
 FLAVORS = ("claude", "codex", "grok")
--- a/session_memory/core/store.py
+++ b/session_memory/core/store.py
@@ -12,6 +12,7 @@ Tier 2 digest — the invariant that makes budget-based retention non-lossy.
 from __future__ import annotations
 import hashlib
 import json
 import os
 import re
@@ -28,6 +29,18 @@ def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _fingerprint(ev: SessionEvent, body: Optional[str]) -> str:
    """Stable content fingerprint, independent of seq/payload_ref, for dedup."""
    h = hashlib.sha1()
     parts = [ev.ts or "", ev.kind, ev.role or "", ev.tool or "", ev.summary or "",
              str(ev.is_sidechain)]
    h.update("\x1f".join(parts).encode("utf-8"))
    if body is not None:
        h.update(b"\x1e")
        h.update(body.encode("utf-8"))
    return h.hexdigest()
 class Store:
    def __init__(self, db_path: str, blob_dir: str):
        self.db_path = db_path
@@ -121,14 +134,75 @@ class Store:
        self.db.commit()
        return total
-    def ingest(self, bundle) -> None:
+    def ingest(self, bundle) -> int:
-        """Persist a full Normalized bundle (session + events + blobs)."""
+        """Persist a Normalized bundle, merging into any existing session.
        Multiple files can map to one ``session_uid`` (Claude resume/sidechains;
        Grok multi-file dirs). Events are de-duplicated by content fingerprint and
        genuinely-new events are appended with offset ``seq`` (design OQ6 / T03).
        Returns the number of new events written. Idempotent: re-ingesting the
        same bundle adds nothing.
        """
        s = bundle.session
-        if s.ingested_at is None:
-            s.ingested_at = _now()
-        self.upsert_session(s)
-        self.upsert_events(bundle.events)
-        self.write_blobs(s.session_uid, bundle.blobs)
+        existing = self.get_session(s.session_uid)
+        if existing is None:
+            if s.ingested_at is None:
+                s.ingested_at = _now()
+            self.upsert_session(s)
        # known fingerprints + current max seq for this session
        seen = self._event_fingerprints(s.session_uid)
        next_seq = self._max_seq(s.session_uid) + 1
        new_events: list[SessionEvent] = []
        new_blobs: dict[str, str] = {}
        old_to_new: dict[int, int] = {}
        for ev in bundle.events:
            body = bundle.blobs.get(ev.payload_ref) if ev.payload_ref else None
            fp = _fingerprint(ev, body)
            if fp in seen:
                continue  # already stored (prior file or prior sweep)
            new_seq = next_seq
            next_seq += 1
            old_to_new[ev.seq] = new_seq
            # remap parent within this bundle; cross-file parents become None
            parent = old_to_new.get(ev.parent_seq) if ev.parent_seq is not None else None
            ref = None
            if body is not None:
                ref = f"blob://{s.session_uid}/{new_seq}"
                new_blobs[ref] = body
            merged = SessionEvent(
                session_uid=s.session_uid, seq=new_seq, parent_seq=parent, ts=ev.ts,
                kind=ev.kind, role=ev.role, tool=ev.tool, summary=ev.summary,
                payload_ref=ref, tokens=ev.tokens, is_sidechain=ev.is_sidechain,
            )
            new_events.append(merged)
            seen.add(fp)
        if new_events:
            self.upsert_events(new_events)
            self.write_blobs(s.session_uid, new_blobs)
        return len(new_events)
    def _max_seq(self, session_uid: str) -> int:
        row = self.db.execute(
            "SELECT COALESCE(MAX(seq), -1) m FROM events WHERE session_uid=?", (session_uid,)
        ).fetchone()
        return int(row["m"])
    def _event_fingerprints(self, session_uid: str) -> set[str]:
        fps: set[str] = set()
        for e in self.get_events(session_uid):
            body = None
            if e.payload_ref:
                r = self.db.execute("SELECT path FROM blobs WHERE ref=?", (e.payload_ref,)).fetchone()
                if r:
                    try:
                        with open(r["path"], "r", encoding="utf-8") as f:
                            body = f.read()
                    except OSError:
                        body = None
            fps.add(_fingerprint(e, body))
        return fps
    # ---- Tier 2 (digest) ---------------------------------------------------
@@ -149,6 +223,22 @@ class Store:
        row = self.db.execute("SELECT json FROM digests WHERE session_uid=?", (session_uid,)).fetchone()
        return json.loads(row["json"]) if row else None
    def list_digests(self) -> list[dict[str, Any]]:
        return [json.loads(r["json"]) for r in self.db.execute("SELECT json FROM digests")]
    def save_patterns(self, patterns: list[dict[str, Any]]) -> None:
        """Persist candidate patterns to a Tier 2 table (replace prior run)."""
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS patterns ("
            "key TEXT PRIMARY KEY, json TEXT NOT NULL, detected_at TEXT NOT NULL)"
        )
        self.db.execute("DELETE FROM patterns")
        self.db.executemany(
            "INSERT INTO patterns(key, json, detected_at) VALUES(?,?,?)",
            [(p["key"], json.dumps(p, sort_keys=True), _now()) for p in patterns],
        )
        self.db.commit()
    # ---- reads -------------------------------------------------------------
    def get_session(self, session_uid: str) -> Optional[Session]:
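
Note on the merge-ingest in store.py: a minimal in-memory sketch of the same idea, assuming a simplified `Event` type and a `merge` helper that are hypothetical and stand in for `SessionEvent` and `Store.ingest`. It shows content-fingerprint dedup, `seq` offsetting, and parents that fall outside the new events becoming `None`.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:  # hypothetical, much smaller than SessionEvent
    seq: int
    kind: str
    summary: str
    parent_seq: Optional[int] = None


def fingerprint(ev: Event) -> str:
    # Hash content only; seq and parent_seq are deliberately excluded.
    return hashlib.sha1(f"{ev.kind}\x1f{ev.summary}".encode("utf-8")).hexdigest()


def merge(stored: list[Event], incoming: list[Event]) -> int:
    """Append genuinely new events to ``stored``; return how many were added."""
    seen = {fingerprint(e) for e in stored}
    next_seq = max((e.seq for e in stored), default=-1) + 1
    old_to_new: dict[int, int] = {}
    added = 0
    for ev in incoming:
        fp = fingerprint(ev)
        if fp in seen:
            continue  # already stored by an earlier file or sweep
        old_to_new[ev.seq] = next_seq
        # Parents are remapped only if they were added in this same call.
        parent = old_to_new.get(ev.parent_seq) if ev.parent_seq is not None else None
        stored.append(Event(next_seq, ev.kind, ev.summary, parent))
        seen.add(fp)
        next_seq += 1
        added += 1
    return added


stored: list[Event] = []
file_a = [Event(0, "user", "hi"), Event(1, "tool_result", "boom", parent_seq=0)]
file_b = [Event(0, "user", "hi"), Event(1, "assistant", "done", parent_seq=0)]
first = merge(stored, file_a)
again = merge(stored, file_a)   # re-ingest: nothing new
second = merge(stored, file_b)  # only the assistant event is new
```

In this sketch the last event's parent ends up `None`, because its parent was deduplicated rather than added in the same call.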
--- a/session_memory/curate/__init__.py
+++ b/session_memory/curate/__init__.py
@@ -0,0 +1,9 @@
 """Curate phase (PRD §6.3) — review candidate patterns into versioned Solution
 Patterns held in an in-repo Pattern Catalog.
 Layout mirrors ``detect/``:
    schema.py    Solution Pattern artifact + per-flavor rendering hints (T01)
    catalog.py   versioned, files-first catalog store (T02)
    review.py    discuss/approve/reject -> promote workflow (T03)
    __main__.py  `python -m session_memory.curate` entrypoint (T06)
 """
--- a/session_memory/curate/__main__.py
+++ b/session_memory/curate/__main__.py
@@ -0,0 +1,130 @@
 """Curate entrypoint (T06): review detect candidates into the Pattern Catalog.
     python -m session_memory.curate [--config PATH] [--auto-approve] [--json]
                                     [--min-frequency N] [--workstream-id ID]
 Refreshes candidate patterns (runs the detect pipeline), then drives them through
 the review workflow — **interactive** by default, or **batch** with
 ``--auto-approve`` (promote everything clearing the evidence bar, reject the rest)
 for kaizen-agent runs. Candidates are presented cross-flavor first (detect's
 ranking). Emits a catalog diff summary and, with ``--json``, a machine-readable
 result. Approvals land in the files-first catalog; each final decision is logged
 as a hub decision (queued if the hub is down).
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..detect.__main__ import run_detect
 from ..ingest import _expand, load_config
 from .catalog import Catalog
 from .decisions import DecisionRecorder
 from .gating import bloat_warnings, evaluate, gate_config
 from .review import APPROVE, DISCUSS, REJECT, ReviewLog, review
 def _curate_paths(config: dict):
    c = config.get("curate", {})
    catalog_dir = _expand(c.get("catalog_dir", "session_memory/catalog"))
    review_log = _expand(c.get("review_log", "session_memory/.store/reviews.jsonl"))
    queue = _expand(c.get("decision_queue", "session_memory/.store/decisions.queue.jsonl"))
    ws_id = c.get("state_hub_workstream_id")
    return catalog_dir, review_log, queue, ws_id
 def _render_candidate(cand: dict, gate, existing) -> str:
    g = evaluate(cand, gate)
    flag = " [CROSS-FLAVOR]" if cand.get("cross_flavor") else ""
    lines = [
        f"\n{cand['title']}{flag}",
        f"  key={cand['key']}  score={cand.get('score')} freq={cand['frequency']} "
        f"impact={cand.get('cost_impact')}",
        f"  flavors={','.join(cand.get('flavors', []))}  "
        f"repos={','.join(cand.get('repos', [])) or '-'}  sessions={len(cand.get('sessions', []))}",
        f"  gate: promotable={g.promotable} distribution_ready={g.distribution_ready}"
        + (f"  ({'; '.join(g.reasons)})" if g.reasons else ""),
    ]
    for w in bloat_warnings(cand, existing):
        lines.append(f"  bloat: {w}")
    return "\n".join(lines)
 def _interactive_decider(gate, catalog):
    def decide(cand):
        print(_render_candidate(cand, gate, catalog.list()))
        while True:
            choice = input("  [a]pprove / [r]eject / [d]iscuss ? ").strip().lower()
            if choice in ("a", "approve"):
                return (APPROVE, input("  rationale: ").strip() or "approved")
            if choice in ("r", "reject"):
                return (REJECT, input("  rationale: ").strip() or "rejected")
            if choice in ("d", "discuss"):
                return (DISCUSS, "deferred for discussion")
    return decide
 def _auto_decider(gate):
    """Batch policy: approve candidates clearing the promote floor, reject the rest."""
    def decide(cand):
        g = evaluate(cand, gate)
        if g.promotable:
            return (APPROVE, "auto-approved: clears evidence bar")
        return (REJECT, "auto-rejected: " + "; ".join(g.reasons))
    return decide
 def _summary(result, n_candidates: int) -> str:
    added = [k for k, a in result.approved if a in ("added", "versioned", "updated")]
    lines = [
        f"# Curate summary  ({n_candidates} candidates reviewed)",
        f"  approved : {len(result.approved)}  ({', '.join(f'{k}:{a}' for k, a in result.approved) or '-'})",
        f"  rejected : {len(result.rejected)}  ({', '.join(result.rejected) or '-'})",
        f"  deferred : {len(result.deferred)}  ({', '.join(result.deferred) or '-'})",
        f"  skipped  : {len(result.skipped)}  (already decided)",
        f"  catalog writes: {len(added)}",
    ]
    return "\n".join(lines)
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Curate detect candidates into the Pattern Catalog.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--auto-approve", action="store_true",
                    help="batch mode: promote everything clearing the evidence bar")
    ap.add_argument("--min-frequency", type=int, default=2)
    ap.add_argument("--workstream-id", default=None, help="hub workstream for decisions")
    ap.add_argument("--json", action="store_true", help="emit machine-readable JSON")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    candidates = run_detect(config, min_frequency=args.min_frequency)
    catalog_dir, review_log_path, queue_path, ws_id = _curate_paths(config)
    gate = gate_config(config)
    catalog = Catalog(catalog_dir)
    log = ReviewLog(review_log_path)
    recorder = DecisionRecorder(queue_path, workstream_id=args.workstream_id or ws_id)
    decide = _auto_decider(gate) if args.auto_approve else _interactive_decider(gate, catalog)
    result = review(candidates, decide, catalog, log, gate=gate, recorder=recorder)
    if args.json:
        print(json.dumps({
            "approved": result.approved, "rejected": result.rejected,
            "deferred": result.deferred, "skipped": result.skipped,
            "decisions_queued": len(recorder.pending()),
        }, indent=2))
    else:
        print(_summary(result, len(candidates)))
        if recorder.pending():
            print(f"  decisions queued (hub offline): {len(recorder.pending())} "
                  f"-> {queue_path}")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/curate/catalog.py
+++ b/session_memory/curate/catalog.py
@@ -0,0 +1,148 @@
 """Versioned Pattern Catalog — files-first source of truth (FR-U3; T02).
 The catalog is a directory of one JSON file per Solution Pattern
 (``<catalog_dir>/<pattern-id>.json``). Files originate the work; the State Hub
 indexes them (ADR-001 / PRD §9). Identity is the pattern ``id`` (derived from the
 source candidate key), so re-promoting the same detect candidate maps to the same
 file — dedup is structural, not heuristic.
 :meth:`Catalog.upsert` is the one write path and is **idempotent**:
 * new id                       -> written as-is                  (``added``)
 * same id, identical content   -> no write, no version bump      (``unchanged``)
 * same id, only status/flags   -> updated in place, no bump      (``updated``)
 * same id, content changed     -> version bumped, prior snapshot
                                  appended to ``<id>.history.jsonl`` (``versioned``)
 History is append-only alongside the current file, so the catalog dir stays one
 clean current file per pattern while every superseded version is recoverable.
 """
 from __future__ import annotations
 import json
 import os
 from datetime import datetime, timezone
 from typing import Optional
 from .schema import SolutionPattern
 # Content fields that define a pattern's substance. Version, timestamps, status,
 # and distribution_ready are metadata — changes to them never bump the version.
 _CONTENT_KEYS = ("name", "polarity", "problem", "resolutions", "scope",
                 "provenance", "rendering_hints", "covers")
 ADDED = "added"
 UNCHANGED = "unchanged"
 UPDATED = "updated"
 VERSIONED = "versioned"
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _content(p: SolutionPattern) -> str:
    d = p.to_dict()
    return json.dumps({k: d[k] for k in _CONTENT_KEYS}, sort_keys=True)
 class Catalog:
    """File-backed catalog of versioned :class:`SolutionPattern` artifacts."""
    def __init__(self, catalog_dir: str) -> None:
        self.dir = catalog_dir
        os.makedirs(self.dir, exist_ok=True)
    # --- paths --------------------------------------------------------------
    def _path(self, pattern_id: str) -> str:
        return os.path.join(self.dir, f"{pattern_id}.json")
    def _history_path(self, pattern_id: str) -> str:
        return os.path.join(self.dir, f"{pattern_id}.history.jsonl")
    # --- reads --------------------------------------------------------------
    def load(self, pattern_id: str) -> Optional[SolutionPattern]:
        path = self._path(pattern_id)
        if not os.path.exists(path):
            return None
        with open(path, encoding="utf-8") as fh:
            return SolutionPattern.from_json(fh.read())
    def list(self) -> list[SolutionPattern]:
        out: list[SolutionPattern] = []
        for name in sorted(os.listdir(self.dir)):
            if name.endswith(".json") and not name.endswith(".history.jsonl"):
                with open(os.path.join(self.dir, name), encoding="utf-8") as fh:
                    out.append(SolutionPattern.from_json(fh.read()))
        return out
    def history(self, pattern_id: str) -> list[dict]:
        path = self._history_path(pattern_id)
        if not os.path.exists(path):
            return []
        with open(path, encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
    def find_for(self, signal_key: str, locus: str = "") -> Optional[SolutionPattern]:
        """Best catalog pattern for a detect signal: exact id first, then ``covers``.
        Lets a signal that doesn't share a pattern's exact key (e.g. a
        ``recurring_error`` fingerprint) inherit the curated recommendation when a
        pattern declares it covers that text.
        """
        exact = self.load(SolutionPattern.make_id(signal_key))
        if exact is not None:
            return exact
        hay = f"{signal_key} {locus}".lower()
        for p in self.list():  # sorted by id -> deterministic
            if any(c.lower() in hay for c in p.covers):
                return p
        return None
    # --- the single write path ---------------------------------------------
    def upsert(self, pattern: SolutionPattern) -> str:
        """Insert or version-update a pattern. Returns the action taken."""
        existing = self.load(pattern.id)
        now = _now()
        if existing is None:
            pattern.created_at = pattern.created_at or now
            pattern.updated_at = now
            self._write(pattern)
            return ADDED
        if _content(existing) == _content(pattern):
            # substance unchanged — only persist a metadata (status/flag) change
            if (existing.status == pattern.status
                    and existing.distribution_ready == pattern.distribution_ready):
                return UNCHANGED
            existing.status = pattern.status
            existing.distribution_ready = pattern.distribution_ready
            existing.updated_at = now
            self._write(existing)
            return UPDATED
        # substance changed: archive the old version, bump, write the new one
        self._append_history(existing)
        pattern.version = SolutionPattern.bump_version(existing.version)
        pattern.created_at = existing.created_at or now
        pattern.updated_at = now
        self._write(pattern)
        return VERSIONED
    # --- internals ----------------------------------------------------------
    def _write(self, pattern: SolutionPattern) -> None:
        with open(self._path(pattern.id), "w", encoding="utf-8") as fh:
            fh.write(pattern.to_json())
            fh.write("\n")
    def _append_history(self, superseded: SolutionPattern) -> None:
        superseded.status = "superseded"
        with open(self._history_path(superseded.id), "a", encoding="utf-8") as fh:
            fh.write(json.dumps(superseded.to_dict(), sort_keys=True))
            fh.write("\n")
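
Note on `Catalog.upsert`: the four outcomes listed in the module docstring can be read as a pure decision over the existing and incoming records. The sketch below assumes plain dicts instead of `SolutionPattern`, a reduced `CONTENT_KEYS` tuple, and a hypothetical `classify` helper; it performs no file writes and no version bump.

```python
import json
from typing import Optional

# Illustrative subset; the real _CONTENT_KEYS in the diff is longer.
CONTENT_KEYS = ("name", "problem", "resolutions")


def _content(p: dict) -> str:
    # Canonical JSON of the substance fields only.
    return json.dumps({k: p[k] for k in CONTENT_KEYS}, sort_keys=True)


def classify(existing: Optional[dict], new: dict) -> str:
    """Return the action upsert would report, without writing anything."""
    if existing is None:
        return "added"
    if _content(existing) != _content(new):
        return "versioned"  # substance changed: archive old, bump version
    same_meta = (existing["status"] == new["status"]
                 and existing["distribution_ready"] == new["distribution_ready"])
    return "unchanged" if same_meta else "updated"


base = {"name": "x", "problem": "p", "resolutions": ["r"],
        "status": "provisional", "distribution_ready": False}
promoted = {**base, "status": "approved", "distribution_ready": True}
reworded = {**base, "problem": "q"}
```

Comparing canonical JSON of the content fields keeps status and flag changes from ever triggering a version bump.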
--- a/session_memory/curate/decisions.py
+++ b/session_memory/curate/decisions.py
@@ -0,0 +1,114 @@
 """State Hub decision integration (FR-U4; T05).
 Every final promote/reject is recorded as an auditable decision so the rationale,
 the source candidate key, and an evidence snapshot are traceable. The catalog
 file remains the durable artifact (ADR-001); the decision is the audit trail.
 The recorder is **graceful under a hub outage** — exactly the condition hit during
 Phase 1, where statuses were synced after the fact. A pluggable ``sink`` does the
 actual write (HTTP to the hub, or the MCP ``record_decision`` tool driven by the
 operator). If the sink is absent or raises, the decision is appended to a local
 queue (``decisions.queue.jsonl``) and can be replayed later with :meth:`flush`.
 """
 from __future__ import annotations
 import json
 import os
 from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from typing import Callable, Optional
 # A sink takes a hub-shaped decision payload and persists it (may raise on failure).
 Sink = Callable[[dict], None]
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def build_decision(candidate: dict, action: str, rationale: str,
                   *, workstream_id: Optional[str] = None,
                   decided_by: str = "curator") -> dict:
    """Shape a curate decision as a State Hub ``record_decision`` payload."""
    key = candidate["key"]
    verb = "Promote" if action == "approve" else "Reject"
    return {
        "title": f"{verb} pattern candidate {key}",
        "decision_type": "made",
        "workstream_id": workstream_id,
        "rationale": rationale,
        "decided_by": decided_by,
        "description": json.dumps({
            "action": action,
            "source_key": key,
            "evidence": candidate,
        }, sort_keys=True),
        "recorded_at": _now(),
    }
 @dataclass
 class DecisionRecorder:
    """Records decisions through ``sink`` with a durable local-queue fallback."""
    queue_path: str
    sink: Optional[Sink] = None
    workstream_id: Optional[str] = None
    decided_by: str = "curator"
    _queued: int = field(default=0, init=False)
    def record(self, candidate: dict, action: str, rationale: str) -> bool:
        """Record one decision. Returns True if the sink accepted it, else queued."""
        payload = build_decision(candidate, action, rationale,
                                 workstream_id=self.workstream_id, decided_by=self.decided_by)
        if self.sink is not None:
            try:
                self.sink(payload)
                return True
            except Exception:  # hub down / transient — fall through to the queue
                pass
        self._append(payload)
        return False
    def pending(self) -> list[dict]:
        if not os.path.exists(self.queue_path):
            return []
        with open(self.queue_path, encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
    def flush(self, sink: Optional[Sink] = None) -> int:
        """Replay queued decisions through ``sink``. Returns count synced.
        Stops at the first failure so ordering is preserved; the unsynced tail is
        rewritten back to the queue.
        """
        sink = sink or self.sink
        if sink is None:
            return 0
         items = self.pending()
         if not items:
             return 0  # nothing queued; do not create or truncate the queue file
         synced = 0
        for i, payload in enumerate(items):
            try:
                sink(payload)
                synced += 1
            except Exception:
                self._rewrite(items[i:])
                return synced
        self._rewrite([])
        return synced
    # --- internals ----------------------------------------------------------
    def _append(self, payload: dict) -> None:
        os.makedirs(os.path.dirname(self.queue_path) or ".", exist_ok=True)
        with open(self.queue_path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(payload, sort_keys=True))
            fh.write("\n")
        self._queued += 1
    def _rewrite(self, items: list[dict]) -> None:
        with open(self.queue_path, "w", encoding="utf-8") as fh:
            for payload in items:
                fh.write(json.dumps(payload, sort_keys=True))
                fh.write("\n")
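
Note on the queue fallback in decisions.py: a self-contained sketch of the record-then-flush cycle, assuming free functions `record` and `flush` and a throwaway temp directory (all hypothetical; the diff implements this as `DecisionRecorder` methods). The sink that raises stands in for a hub outage.

```python
import json
import os
import tempfile


def record(queue_path: str, payload: dict, sink=None) -> bool:
    """Try the sink; on absence or failure append to the local JSONL queue."""
    if sink is not None:
        try:
            sink(payload)
            return True
        except Exception:
            pass  # hub down: fall through to the queue
    with open(queue_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(payload, sort_keys=True) + "\n")
    return False


def flush(queue_path: str, sink) -> int:
    """Replay queued payloads in order, stopping at the first failure."""
    if not os.path.exists(queue_path):
        return 0
    with open(queue_path, encoding="utf-8") as fh:
        items = [json.loads(line) for line in fh if line.strip()]
    synced = 0
    for payload in items:
        try:
            sink(payload)
        except Exception:
            break
        synced += 1
    # Keep only the unsynced tail so ordering is preserved for the next flush.
    with open(queue_path, "w", encoding="utf-8") as fh:
        for payload in items[synced:]:
            fh.write(json.dumps(payload, sort_keys=True) + "\n")
    return synced


def hub_down(_payload: dict) -> None:
    raise ConnectionError("hub offline")


delivered: list = []
queue = os.path.join(tempfile.mkdtemp(), "decisions.queue.jsonl")
accepted = [record(queue, {"n": n}, sink=hub_down) for n in range(3)]
synced = flush(queue, delivered.append)
```

Stopping at the first failure, rather than skipping it, is what keeps the audit trail in decision order.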
--- a/session_memory/curate/gating.py
+++ b/session_memory/curate/gating.py
@@ -0,0 +1,117 @@
 """Promotion evidence-bar + bloat guard (design OQ5/OQ6; T04).
 Two gates protect the catalog:
 * **Evidence bar (OQ5)** — a candidate must clear configurable floors
  (frequency, distinct supporting sessions) before it may be promoted at all.
  A separate, stricter bar decides whether the promoted pattern is
  *distribution-eligible* (``status="approved"``, ``distribution_ready=True``)
  vs. merely ``provisional`` — the minimum trustworthy evidence before a pattern
  is allowed near live agent environments.
 * **Bloat guard (OQ6)** — flags candidates that would add little: a duplicate of
  an already-cataloged pattern, or a near-duplicate sharing the same
  signal-type+locus. Keeps the catalog lean so agent context budgets aren't
  degraded by low-value instructions.
 Knobs live under ``[curate.gate]`` in ``config.toml``; :func:`gate_config` reads them
 with safe defaults so the module also works config-free (tests).
 """
 from __future__ import annotations
 from dataclasses import dataclass, field
 from typing import Optional
 from .schema import SolutionPattern
 @dataclass
 class GateConfig:
    # promotion floor (OQ5)
    min_frequency: int = 2
    min_sessions: int = 2
    min_cost_impact: float = 0.0
    # distribution-eligibility floor (stricter; OQ5)
    dist_require_cross_flavor: bool = False
    dist_min_frequency: int = 3
    dist_min_cost_impact: float = 0.0
 def gate_config(config: Optional[dict] = None) -> GateConfig:
     c = (config or {}).get("curate", {})
    g = c.get("gate", {}) if isinstance(c, dict) else {}
    return GateConfig(
        min_frequency=g.get("min_frequency", 2),
        min_sessions=g.get("min_sessions", 2),
        min_cost_impact=g.get("min_cost_impact", 0.0),
        dist_require_cross_flavor=g.get("dist_require_cross_flavor", False),
        dist_min_frequency=g.get("dist_min_frequency", 3),
        dist_min_cost_impact=g.get("dist_min_cost_impact", 0.0),
    )
 @dataclass
 class GateResult:
    promotable: bool
    distribution_ready: bool
    status: str  # "approved" if distribution-ready else "provisional"
    reasons: list = field(default_factory=list)
 def _n_sessions(candidate: dict) -> int:
    return len(candidate.get("sessions", []) or [])
 def evaluate(candidate: dict, config: Optional[GateConfig] = None) -> GateResult:
    """Decide whether a candidate may be promoted, and at what trust level."""
    cfg = config or GateConfig()
    reasons: list[str] = []
    freq = candidate.get("frequency", 0)
    sessions = _n_sessions(candidate)
    impact = candidate.get("cost_impact", 0.0)
    promotable = True
    if freq < cfg.min_frequency:
        promotable = False
        reasons.append(f"frequency {freq} < min {cfg.min_frequency}")
    if sessions < cfg.min_sessions:
        promotable = False
        reasons.append(f"sessions {sessions} < min {cfg.min_sessions}")
    if impact < cfg.min_cost_impact:
        promotable = False
        reasons.append(f"cost_impact {impact} < min {cfg.min_cost_impact}")
    dist = promotable
    if cfg.dist_require_cross_flavor and not candidate.get("cross_flavor", False):
        dist = False
        reasons.append("not cross-flavor (required for distribution)")
    if freq < cfg.dist_min_frequency:
        dist = False
        reasons.append(f"frequency {freq} < distribution min {cfg.dist_min_frequency}")
    if impact < cfg.dist_min_cost_impact:
        dist = False
        reasons.append(f"cost_impact {impact} < distribution min {cfg.dist_min_cost_impact}")
    return GateResult(
        promotable=promotable,
        distribution_ready=bool(dist),
        status="approved" if dist else "provisional",
        reasons=reasons,
    )
 def bloat_warnings(candidate: dict, existing: list[SolutionPattern]) -> list[str]:
    """Flag low-value adds against what is already catalogued (OQ6)."""
    warnings: list[str] = []
    cand_id = SolutionPattern.make_id(candidate["key"])
    _, sig_type, locus = (candidate["key"].split(":", 2) + ["", ""])[:3]
    for p in existing:
        if p.id == cand_id:
            warnings.append(f"duplicate of catalogued pattern {p.id}")
            continue
        p_parts = (p.provenance.source_key.split(":", 2) + ["", ""])[:3]
        if (p_parts[1], p_parts[2]) == (sig_type, locus):
            warnings.append(f"near-duplicate of {p.id} (same {sig_type}/{locus})")
    return warnings
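A minimal standalone sketch of the two-tier gate above, to make the promotion floor and the stricter distribution floor concrete. `Floors` and `gate` are hypothetical stand-ins for `GateConfig` and `evaluate`, not the module's API; only the default thresholds (frequency 2, sessions 2, distribution frequency 3) are taken from the diff, and the cost-impact and cross-flavor checks are left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Floors:
    # Hypothetical stand-in for GateConfig; defaults copied from the diff.
    min_frequency: int = 2
    min_sessions: int = 2
    dist_min_frequency: int = 3


def gate(candidate: dict, floors: Floors = Floors()) -> tuple[bool, bool]:
    """Return (promotable, distribution_ready) for a candidate dict."""
    freq = candidate.get("frequency", 0)
    n_sessions = len(candidate.get("sessions") or [])
    promotable = freq >= floors.min_frequency and n_sessions >= floors.min_sessions
    # Distribution readiness starts from promotability and only gets stricter.
    distribution_ready = promotable and freq >= floors.dist_min_frequency
    return promotable, distribution_ready


thin = {"frequency": 2, "sessions": ["claude:a", "codex:b"]}
solid = {"frequency": 4, "sessions": ["claude:a", "codex:b", "grok:c"]}
print(gate(thin))   # promotable, but below the distribution floor
print(gate(solid))  # clears both floors
```

Under these assumptions a thin candidate would land as `provisional` and a solid one as `approved`, which matches how `status` is derived from `distribution_ready` in `evaluate`.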
--- a/session_memory/curate/review.py
+++ b/session_memory/curate/review.py
@@ -0,0 +1,158 @@
 """Curation review workflow (FR-U1/FR-U2; T03).
 Drives Phase 1 detect candidates through a **discuss / approve / reject** review
 and, on approve, promotes the candidate into a :class:`SolutionPattern` written to
 the :class:`Catalog`. The actual decision is supplied by a ``decide`` callback so
 this engine stays UI-free — the ``__main__`` entrypoint (T06) plugs in interactive
 or batch (auto-approve) logic.
 Re-review is **idempotent** via a :class:`ReviewLog`: a candidate already decided
 is skipped unless its *evidence fingerprint* changed (new sessions/frequency), so
 a prior **reject** is remembered and not re-surfaced, and a prior **approve** is
 updated in place rather than duplicated (catalog dedup does the rest).
 """
 from __future__ import annotations
 import hashlib
 import json
 import os
 from dataclasses import dataclass, field
 from datetime import datetime, timezone
 from typing import Callable, Optional
 from .catalog import Catalog
 from .decisions import DecisionRecorder
 from .gating import GateConfig, evaluate
 from .schema import Provenance, Resolution, Scope, SolutionPattern
 APPROVE = "approve"
 REJECT = "reject"
 DISCUSS = "discuss"  # defer — no final decision recorded
 # Default per-flavor rendering-hint stubs a reviewer can later refine (OQ4).
 _DEFAULT_TARGET = {"claude": "CLAUDE.md", "codex": "AGENTS.md", "grok": "instructions"}
 # A decision callback: (candidate dict) -> (action, rationale)
 Decider = Callable[[dict], tuple]
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def evidence_fingerprint(candidate: dict) -> str:
    """Stable hash of the evidence that would justify (re)reviewing a candidate."""
    keys = ("frequency", "cost_impact", "flavors", "repos", "sessions", "cross_flavor")
    payload = {k: candidate.get(k) for k in keys}
    return hashlib.sha1(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()
 def candidate_to_pattern(candidate: dict, *, status: str = "provisional",
                         distribution_ready: bool = False) -> SolutionPattern:
    """Build a Solution Pattern from a detect candidate.
    ``status``/``distribution_ready`` come from the evidence gate (T04); they
    default to a provisional, non-distribution-ready pattern when ungated.
    """
    src = candidate["key"]
    flavors = list(candidate.get("flavors", []))
    hints = {f: {"target": _DEFAULT_TARGET.get(f, ""), "note": "TODO: refine rendering"}
             for f in flavors}
    return SolutionPattern(
        id=SolutionPattern.make_id(src),
        name=candidate.get("title") or src,
        version="1.0.0",
        polarity=candidate.get("polarity", "problem"),
        problem=candidate.get("title") or src,
        resolutions=[Resolution(summary="TODO: capture the recommended resolution")],
        scope=Scope(flavors=flavors, repos=list(candidate.get("repos", []))),
        provenance=Provenance(source_key=src, evidence=dict(candidate), promoted_at=_now()),
        rendering_hints=hints,
        status=status,
        distribution_ready=distribution_ready,
    )
 @dataclass
 class ReviewLog:
    """Append-only record of final decisions, keyed by candidate source key."""
    path: str
    _by_key: dict = field(default_factory=dict)
    def __post_init__(self) -> None:
        if os.path.exists(self.path):
            with open(self.path, encoding="utf-8") as fh:
                for line in fh:
                    if line.strip():
                        rec = json.loads(line)
                        self._by_key[rec["source_key"]] = rec  # last write wins
    def prior(self, source_key: str) -> Optional[dict]:
        return self._by_key.get(source_key)
    def already_decided(self, candidate: dict) -> bool:
        rec = self._by_key.get(candidate["key"])
        return bool(rec) and rec["fingerprint"] == evidence_fingerprint(candidate)
    def record(self, candidate: dict, action: str, rationale: str) -> None:
        rec = {
            "source_key": candidate["key"],
            "action": action,
            "rationale": rationale,
            "fingerprint": evidence_fingerprint(candidate),
            "ts": _now(),
        }
        self._by_key[candidate["key"]] = rec
        os.makedirs(os.path.dirname(self.path) or ".", exist_ok=True)
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(rec, sort_keys=True))
            fh.write("\n")
 @dataclass
 class ReviewResult:
    approved: list = field(default_factory=list)   # (source_key, catalog_action)
    rejected: list = field(default_factory=list)   # source_key
    deferred: list = field(default_factory=list)   # source_key (discuss)
    skipped: list = field(default_factory=list)    # source_key (already decided)
 def review(candidates: list[dict], decide: Decider, catalog: Catalog,
           log: ReviewLog, gate: Optional[GateConfig] = None,
           recorder: Optional[DecisionRecorder] = None) -> ReviewResult:
    """Run each candidate through ``decide``; promote approvals into ``catalog``.
    When a ``gate`` (T04 evidence bar) is supplied, the promoted pattern's
    ``status``/``distribution_ready`` are set from the gate evaluation, so an
    approved-but-thin candidate lands as ``provisional`` rather than
    distribution-ready. When a ``recorder`` (T05) is supplied, each final
    promote/reject is logged as an auditable hub decision (queued if the hub is
    down).
    """
    result = ReviewResult()
    for cand in candidates:
        key = cand["key"]
        if log.already_decided(cand):
            result.skipped.append(key)
            continue
        action, rationale = decide(cand)
        if action == DISCUSS:
            result.deferred.append(key)
            continue  # not a final decision — leave for a later pass
        if action == APPROVE:
            g = evaluate(cand, gate) if gate is not None else None
            pattern = (candidate_to_pattern(cand, status=g.status,
                                            distribution_ready=g.distribution_ready)
                       if g is not None else candidate_to_pattern(cand))
            cat_action = catalog.upsert(pattern)
            result.approved.append((key, cat_action))
        elif action == REJECT:
            result.rejected.append(key)
        else:
            raise ValueError(f"unknown review action {action!r}")
        log.record(cand, action, rationale)
        if recorder is not None:
            recorder.record(cand, action, rationale)
    return result
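The idempotent re-review described in the module docstring rests on the evidence fingerprint. The sketch below copies the hashing rule from `evidence_fingerprint` into a standalone function (named `fingerprint` here purely for illustration) to show which changes would re-surface a decided candidate. The sample candidate dicts are made up.

```python
import hashlib
import json

# Same evidence keys that the diff's evidence_fingerprint hashes.
EVIDENCE_KEYS = ("frequency", "cost_impact", "flavors", "repos", "sessions", "cross_flavor")


def fingerprint(candidate: dict) -> str:
    payload = {k: candidate.get(k) for k in EVIDENCE_KEYS}
    return hashlib.sha1(json.dumps(payload, sort_keys=True).encode("utf-8")).hexdigest()


before = {"key": "problem:retry_storm:retries", "frequency": 2,
          "sessions": ["claude:a", "codex:b"], "title": "old wording"}
# The title is not part of the evidence, so the hash is unchanged.
retitled = dict(before, title="new wording")
# A new supporting session changes the evidence, so the hash changes.
grown = dict(before, frequency=3, sessions=before["sessions"] + ["grok:c"])

print(fingerprint(before) == fingerprint(retitled))  # True
print(fingerprint(before) == fingerprint(grown))     # False
```

So a rejected candidate would stay skipped across cosmetic changes and come back for review only once its frequency, sessions, or other evidence fields move.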
--- a/session_memory/curate/schema.py
+++ b/session_memory/curate/schema.py
@@ -0,0 +1,160 @@
 """Solution Pattern schema (PRD §6.3 FR-U2; design OQ4) — T01.
 A **Solution Pattern** is the curated, reviewed artifact a candidate pattern is
 promoted into: a named, versioned record pairing a problem (or success) with one
 or more recommended resolutions, written **flavor-agnostically**. Everything a
 distributor needs to render a native artifact lives in a *separate*
 ``rendering_hints`` sub-structure, keyed by flavor — so the core stays neutral
 (FR-A1/FR-A2) while Phase 3 distributors still get enough to render well (OQ4).
 The artifact is the durable unit of the Pattern Catalog (T02): files originate,
 the State Hub indexes (ADR-001). Serialization is deterministic (sorted keys) so
 catalog files diff cleanly and re-saving an unchanged pattern is a no-op.
 """
 from __future__ import annotations
 import json
 import re
 from dataclasses import asdict, dataclass, field, fields
 from typing import Any, Optional
 from ..core.schema import FLAVORS
 SCHEMA_VERSION = 1
 # Lifecycle of a catalogued pattern.
 #   provisional — promoted but below the distribution evidence bar (OQ5)
 #   approved    — meets the bar; distribution-eligible (Phase 3)
 #   rejected    — reviewed and declined; remembered so it is not re-surfaced
 #   superseded  — replaced by a newer version of the same pattern id
 STATUSES = ("provisional", "approved", "rejected", "superseded")
 POLARITIES = ("problem", "success")
 @dataclass
 class Resolution:
    """One recommended resolution for the pattern's problem (FR-U2)."""
    summary: str
    detail: str = ""
    steps: list[str] = field(default_factory=list)
 @dataclass
 class Scope:
    """Where the pattern applies (FR-X2 input). Empty list == unrestricted."""
    repos: list[str] = field(default_factory=list)
    domains: list[str] = field(default_factory=list)
    flavors: list[str] = field(default_factory=list)
    def __post_init__(self) -> None:
        bad = [f for f in self.flavors if f not in FLAVORS]
        if bad:
            raise ValueError(f"unknown flavor(s) in scope {bad!r}; expected {FLAVORS}")
 @dataclass
 class Provenance:
    """Trace back to the detect candidate this pattern was promoted from."""
    source_key: str  # the detect Pattern.key — stable cluster identity
    evidence: dict[str, Any] = field(default_factory=dict)  # snapshot of the candidate
    detected_at: Optional[str] = None
    promoted_at: Optional[str] = None
 @dataclass
 class SolutionPattern:
    """A curated, versioned solution pattern (PRD §5 / §6.3)."""
    id: str  # stable, derived from provenance.source_key
    name: str
    version: str  # semantic, e.g. "1.0.0"
    polarity: str  # problem | success
    problem: str  # human-readable description of the recurring situation
    resolutions: list[Resolution] = field(default_factory=list)
    scope: Scope = field(default_factory=Scope)
    provenance: Provenance = field(default_factory=lambda: Provenance(source_key=""))
    # per-flavor rendering hints, kept OUT of the agnostic core (OQ4):
    #   {"claude": {...}, "codex": {...}, "grok": {...}}
    rendering_hints: dict[str, dict[str, Any]] = field(default_factory=dict)
    # other signal keys/loci this pattern's recommendation also applies to —
    # lowercase substrings matched against a candidate signal's key+locus, so a
    # detect signal that doesn't share this pattern's exact key (e.g. a
    # recurring_error fingerprint) can still inherit the curated resolution.
    covers: list[str] = field(default_factory=list)
    status: str = "provisional"
    distribution_ready: bool = False
    created_at: Optional[str] = None
    updated_at: Optional[str] = None
    schema_version: int = SCHEMA_VERSION
    def __post_init__(self) -> None:
        if self.polarity not in POLARITIES:
            raise ValueError(f"unknown polarity {self.polarity!r}; expected {POLARITIES}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status {self.status!r}; expected {STATUSES}")
        bad = [f for f in self.rendering_hints if f not in FLAVORS]
        if bad:
            raise ValueError(f"unknown flavor(s) in rendering_hints {bad!r}; expected {FLAVORS}")
    # --- identity / versioning helpers -------------------------------------
    @staticmethod
    def make_id(source_key: str) -> str:
        """Stable catalog id from a detect candidate key (``polarity:type:locus``).
        Identity is the source key, so re-promoting the same candidate maps to the
        same pattern (dedup in T02), independent of wording or version.
        """
        slug = re.sub(r"[^a-z0-9_]+", "-", source_key.lower()).strip("-")
        return f"sp-{slug}"
    @staticmethod
    def bump_version(version: str, level: str = "patch") -> str:
        """Increment a ``major.minor.patch`` version string."""
        parts = (version.split(".") + ["0", "0", "0"])[:3]
        major, minor, patch = (int(p) for p in parts)
        if level == "major":
            major, minor, patch = major + 1, 0, 0
        elif level == "minor":
            minor, patch = minor + 1, 0
        else:
            patch += 1
        return f"{major}.{minor}.{patch}"
    # --- serialization ------------------------------------------------------
    def to_dict(self) -> dict[str, Any]:
        return asdict(self)
    def to_json(self) -> str:
        return json.dumps(self.to_dict(), sort_keys=True, indent=2)
    @classmethod
    def from_dict(cls, d: dict[str, Any]) -> "SolutionPattern":
        d = dict(d)
        resolutions = [Resolution(**{k: v for k, v in r.items() if k in _RESOLUTION_FIELDS})
                       for r in d.pop("resolutions", [])]
        scope = d.pop("scope", None)
        prov = d.pop("provenance", None)
        obj = cls(**{k: v for k, v in d.items() if k in _PATTERN_FIELDS})
        obj.resolutions = resolutions
        if scope is not None:
            obj.scope = Scope(**{k: v for k, v in scope.items() if k in _SCOPE_FIELDS})
        if prov is not None:
            obj.provenance = Provenance(**{k: v for k, v in prov.items() if k in _PROV_FIELDS})
        return obj
    @classmethod
    def from_json(cls, s: str) -> "SolutionPattern":
        return cls.from_dict(json.loads(s))
 _PATTERN_FIELDS = {f.name for f in fields(SolutionPattern)}
 _RESOLUTION_FIELDS = {f.name for f in fields(Resolution)}
 _SCOPE_FIELDS = {f.name for f in fields(Scope)}
 _PROV_FIELDS = {f.name for f in fields(Provenance)}
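To illustrate the identity and versioning helpers, here is a standalone copy of the `make_id` slug rule and the `bump_version` logic as plain functions. The sample keys are invented. The last two calls show one consequence of the slug rule that may or may not matter in practice: source keys that differ only in punctuation collapse to the same id.

```python
import re


def make_id(source_key: str) -> str:
    # Copy of the slug rule in the diff: lowercase, then collapse every run of
    # characters outside [a-z0-9_] into a single "-".
    slug = re.sub(r"[^a-z0-9_]+", "-", source_key.lower()).strip("-")
    return f"sp-{slug}"


def bump_version(version: str, level: str = "patch") -> str:
    # Missing components are padded with "0", so "1.2" is treated as "1.2.0".
    parts = (version.split(".") + ["0", "0", "0"])[:3]
    major, minor, patch = (int(p) for p in parts)
    if level == "major":
        major, minor, patch = major + 1, 0, 0
    elif level == "minor":
        minor, patch = minor + 1, 0
    else:
        patch += 1
    return f"{major}.{minor}.{patch}"


print(make_id("problem:retry_storm:retries"))   # sp-problem-retry_storm-retries
print(bump_version("1.0.0", "minor"))           # 1.1.0
print(bump_version("1.2"))                      # 1.2.1
print(make_id("problem:recurring_error:a.b"))   # sp-problem-recurring_error-a-b
print(make_id("problem:recurring_error:a/b"))   # sp-problem-recurring_error-a-b
```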
--- a/session_memory/detect/__init__.py
+++ b/session_memory/detect/__init__.py
@@ -0,0 +1 @@
 """Detect: extract signals from sessions, cluster into candidate patterns."""
--- a/session_memory/detect/__main__.py
+++ b/session_memory/detect/__main__.py
@@ -0,0 +1,72 @@
 """Detect entrypoint (T07): digests -> signals -> clusters -> report.
    python -m session_memory.detect [--config PATH] [--json] [--min-frequency N]
 Reads Tier 2 digests from the store, extracts signals, clusters them into
 candidate patterns, persists the candidates, and prints a ranked report
 (cross-flavor first) — the input to the Curate phase (Phase 2).
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..core.store import Store
 from ..ingest import _expand, load_config
 from .cluster import cluster
 from .quality import filter_real, quality_config
 from .signals import extract_signals
 def run_detect(config: dict, *, min_frequency: int = 2) -> list[dict]:
    store_cfg = config.get("store", {})
    store = Store(_expand(store_cfg["db_path"]), _expand(store_cfg["blob_dir"]))
    try:
        digests = filter_real(store.list_digests(), quality_config(config))
        signals = extract_signals(digests)
        patterns = [p.to_dict() for p in cluster(signals, min_frequency=min_frequency)]
        store.save_patterns(patterns)
    finally:
        # Release the store handle even if extraction or clustering raises.
        store.close()
    return patterns
 def _format_report(patterns: list[dict], n_digests: int) -> str:
    lines = [f"# Candidate Patterns  ({len(patterns)} from {n_digests} sessions)", ""]
    if not patterns:
        lines.append("No recurring patterns above the frequency threshold yet.")
        return "\n".join(lines)
    for i, p in enumerate(patterns, 1):
        flag = " [CROSS-FLAVOR]" if p["cross_flavor"] else ""
        lines.append(f"{i}. {p['title']}{flag}")
        lines.append(f"   score={p['score']} freq={p['frequency']} "
                     f"impact={p['cost_impact']} flavors={','.join(p['flavors'])}")
        lines.append(f"   repos={','.join(p['repos']) or '-'}  "
                     f"sessions={len(p['sessions'])}")
        lines.append("")
    return "\n".join(lines)
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Detect candidate patterns from session digests.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--min-frequency", type=int, default=2)
    ap.add_argument("--json", action="store_true", help="emit machine-readable JSON")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    store_cfg = config.get("store", {})
    # Count the real sessions for the report header, then release the store handle.
    store = Store(_expand(store_cfg["db_path"]), _expand(store_cfg["blob_dir"]))
    try:
        n = len(filter_real(store.list_digests(), quality_config(config)))
    finally:
        store.close()
    patterns = run_detect(config, min_frequency=args.min_frequency)
    if args.json:
        print(json.dumps(patterns, indent=2))
    else:
        print(_format_report(patterns, n))
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/detect/cluster.py
+++ b/session_memory/detect/cluster.py
@@ -0,0 +1,78 @@
 """Pattern clusterer + evidence (PRD §5, §6.2; T05/T06).
 Groups recurring :class:`Signal`s into candidate ``Pattern`` records. Clustering
 is deterministic and keyed on ``(polarity, signal-type, locus)`` — enough to
 surface "the same thing keeps happening" without embeddings (a later option).
 Each candidate carries evidence (FR-D3): supporting sessions, frequency, affected
 repos, affected **flavors**, and an estimated cost-impact score. Candidates whose
 evidence spans more than one flavor are flagged ``cross_flavor`` (FR-D4) — the
 highest-value reuse targets.
 """
 from __future__ import annotations
 import collections
 from dataclasses import asdict, dataclass, field
 from typing import Any
 from .signals import PROBLEM, Signal
 @dataclass
 class Pattern:
    key: str                       # stable cluster key
    polarity: str                  # problem | success
    signal_type: str
    locus: str
    frequency: int                 # number of supporting signals
    sessions: list[str] = field(default_factory=list)
    repos: list[str] = field(default_factory=list)
    flavors: list[str] = field(default_factory=list)
    cross_flavor: bool = False
    cost_impact: float = 0.0       # sum of the member signals' magnitudes
    score: float = 0.0             # ranking score (impact x frequency x cross-flavor boost)
    title: str = ""
    def to_dict(self) -> dict[str, Any]:
        return asdict(self)
 def _key(s: Signal) -> str:
    return f"{s.polarity}:{s.type}:{s.locus}"
 def _title(polarity: str, signal_type: str, n_flavors: int) -> str:
    scope = "cross-flavor " if n_flavors > 1 else ""
    verb = "problem" if polarity == PROBLEM else "success"
    return f"{scope}{verb}: {signal_type.replace('_', ' ')}"
 def cluster(signals: list[Signal], *, min_frequency: int = 2) -> list[Pattern]:
    """Group signals into candidate patterns; keep clusters >= min_frequency."""
    groups: dict[str, list[Signal]] = collections.defaultdict(list)
    for s in signals:
        groups[_key(s)].append(s)
    patterns: list[Pattern] = []
    for key, members in groups.items():
        if len(members) < min_frequency:
            continue
        sessions = sorted({m.session_uid for m in members})
        repos = sorted({m.repo for m in members if m.repo})
        flavors = sorted({m.flavor for m in members})
        cost_impact = sum(m.magnitude for m in members)
        first = members[0]
        p = Pattern(
            key=key, polarity=first.polarity, signal_type=first.type, locus=first.locus,
            frequency=len(members), sessions=sessions, repos=repos, flavors=flavors,
            cross_flavor=len(flavors) > 1, cost_impact=round(cost_impact, 3),
            title=_title(first.polarity, first.type, len(flavors)),
        )
        # rank: impact x frequency, with a boost for cross-flavor reuse value
        p.score = round(p.cost_impact * p.frequency * (1.5 if p.cross_flavor else 1.0), 3)
        patterns.append(p)
    # cross-flavor first, then by score
    patterns.sort(key=lambda p: (not p.cross_flavor, -p.score))
    return patterns
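A small worked example of the ranking rule above, using plain dicts in place of `Pattern` objects. The three candidates and their numbers are invented. It shows that the sort key puts every cross-flavor candidate ahead of single-flavor ones, even when a single-flavor candidate has the highest score.

```python
candidates = [
    {"key": "problem:budget_overrun:tokens", "frequency": 4, "cost_impact": 6.0, "cross_flavor": False},
    {"key": "problem:retry_storm:retries", "frequency": 2, "cost_impact": 7.0, "cross_flavor": True},
    {"key": "success:clean_pass:outcome", "frequency": 3, "cost_impact": 3.0, "cross_flavor": True},
]

for c in candidates:
    # Same formula as the diff: impact x frequency, boosted 1.5x when cross-flavor.
    boost = 1.5 if c["cross_flavor"] else 1.0
    c["score"] = round(c["cost_impact"] * c["frequency"] * boost, 3)

# Cross-flavor first (False sorts before True), then by descending score.
candidates.sort(key=lambda c: (not c["cross_flavor"], -c["score"]))

for c in candidates:
    print(c["key"], c["score"], c["cross_flavor"])
# problem:retry_storm:retries 21.0 True
# success:clean_pass:outcome 13.5 True
# problem:budget_overrun:tokens 24.0 False
```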
--- a/session_memory/detect/quality.py
+++ b/session_memory/detect/quality.py
@@ -0,0 +1,75 @@
 """Session-quality filter (T01).
 The capture layer ingests *every* session it finds — including API health-checks,
 smoke-tests, and interrupted runs (e.g. ``llm-connect`` firing "Say hello in one
 word", or a transcript that is just ``[Request interrupted by user]``). These are
 not real coding work, but the outcome heuristic labels the short ones ``abandoned``
 and the clusterer then mints false-positive "problem" patterns from them.
 :func:`is_real_coding_session` gates those out so Detect signals/clusters form only
 over genuine coding sessions. It is intentionally conservative — a session counts
 as real if it shows substantive activity, and is dropped only on clear trivial
 markers. Thresholds come from ``[detect.quality]`` in ``config.toml``.
 """
 from __future__ import annotations
 from dataclasses import dataclass
 from typing import Optional
 # Prompt prefixes/markers that indicate a non-coding or interrupted session.
 _TRIVIAL_PROMPTS = (
    "say hello", "hello", "[request interrupted", "return only this json",
    "ping", "ok", "<system-reminder>",
 )
 # Tool buckets that count as "substantive" coding activity.
 _SUBSTANTIVE_TOOLS = (
    "Edit", "Write", "Read", "Bash", "search_replace", "write", "read_file",
    "run_terminal_command", "grep", "Grep", "glob", "Glob", "NotebookEdit",
 )
 @dataclass
 class QualityConfig:
    min_events: int = 20          # below this, not a real coding session
    min_substantive: int = 3      # >= this many substantive tool calls required
    min_prompt_len: int = 25      # first prompt shorter than this is suspect
 def quality_config(config: Optional[dict] = None) -> QualityConfig:
    d = (config or {}).get("detect", {}).get("quality", {})
    return QualityConfig(
        min_events=d.get("min_events", 20),
        min_substantive=d.get("min_substantive", 3),
        min_prompt_len=d.get("min_prompt_len", 25),
    )
 def _substantive_calls(digest: dict) -> int:
    hist = digest.get("tool_histogram") or {}
    return sum(n for t, n in hist.items() if t in _SUBSTANTIVE_TOOLS)
 def is_real_coding_session(digest: dict, config: Optional[QualityConfig] = None) -> bool:
    cfg = config or QualityConfig()
    if not digest.get("repo"):
        return False
    if digest.get("event_count", 0) < cfg.min_events:
        return False
    if _substantive_calls(digest) < cfg.min_substantive:
        return False
    prompt = (digest.get("first_prompt") or "").strip().lower()
    if len(prompt) < cfg.min_prompt_len:
        return False
    if any(prompt.startswith(p) for p in _TRIVIAL_PROMPTS):
        return False
    return True
 def filter_real(digests: list[dict], config: Optional[QualityConfig] = None) -> list[dict]:
    cfg = config or QualityConfig()
    return [d for d in digests if is_real_coding_session(d, cfg)]
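The prompt checks in `is_real_coding_session` can be tried in isolation. The sketch below reproduces only the length and prefix tests (the hypothetical helper `looks_trivial` is not part of the module) with invented prompts. Note the third case: because the match is a plain `startswith`, a substantive prompt that happens to open with "ok" is also treated as trivial. The diff does not say whether that is intended.

```python
# Prefix markers copied from the diff's _TRIVIAL_PROMPTS.
TRIVIAL_PREFIXES = (
    "say hello", "hello", "[request interrupted", "return only this json",
    "ping", "ok", "<system-reminder>",
)


def looks_trivial(first_prompt: str, min_prompt_len: int = 25) -> bool:
    """True if the first prompt alone would disqualify the session."""
    prompt = (first_prompt or "").strip().lower()
    if len(prompt) < min_prompt_len:
        return True
    return any(prompt.startswith(p) for p in TRIVIAL_PREFIXES)


print(looks_trivial("Say hello in one word"))                           # True (too short)
print(looks_trivial("Refactor the digest store to close connections"))  # False
print(looks_trivial("Ok, now refactor the digest store loader"))        # True (prefix "ok")
```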
--- a/session_memory/detect/signals.py
+++ b/session_memory/detect/signals.py
@@ -0,0 +1,205 @@
 """Signal extractors (PRD §6.2; T04).
 Pure functions over a session digest (Tier 2) — the compact, durable view. Each
 extractor emits zero or more :class:`Signal`s. A signal records its source
 session, a *locus* (what it's about), a *polarity* (problem vs. success), and a
 *magnitude*. Signals are the atoms the clusterer groups into candidate patterns.
 No new capture happens here; everything is derived from digests already written
 by the Capture layer, so detection is cheap and re-runnable.
 """
 from __future__ import annotations
 from dataclasses import dataclass, field
 from typing import Any, Callable, Optional
 # polarity
 PROBLEM = "problem"
 SUCCESS = "success"
 @dataclass
 class Signal:
    session_uid: str
    flavor: str
    repo: Optional[str]
    type: str               # e.g. "budget_overrun", "clean_pass"
    polarity: str           # PROBLEM | SUCCESS
    locus: str              # normalized subject key (tool, marker, ...)
    magnitude: float = 1.0  # strength / cost weight
    detail: dict[str, Any] = field(default_factory=dict)
 # --- individual extractors --------------------------------------------------
 # Each takes (digest, ctx) and returns a list[Signal]. ctx carries corpus-level
 # stats (e.g. cost percentiles) so extractors can compare a session to its peers.
 def _base(digest, type_, polarity, locus, magnitude=1.0, **detail) -> Signal:
    return Signal(
        session_uid=digest["session_uid"], flavor=digest["flavor"],
        repo=digest.get("repo"), type=type_, polarity=polarity, locus=locus,
        magnitude=magnitude, detail=detail,
    )
 def sig_retry_storm(digest, ctx) -> list[Signal]:
    retries = digest.get("markers", {}).get("retries", 0)
    if retries >= ctx.get("retry_storm_threshold", 3):
        return [_base(digest, "retry_storm", PROBLEM, "retries", float(retries), retries=retries)]
    return []
 def sig_repeated_errors(digest, ctx) -> list[Signal]:
    errors = digest.get("markers", {}).get("errors", 0)
    if errors >= ctx.get("error_threshold", 3):
        return [_base(digest, "repeated_errors", PROBLEM, "errors", float(errors), errors=errors)]
    return []
 def sig_budget_overrun(digest, ctx) -> list[Signal]:
    total = digest.get("cost", {}).get("input_tokens", 0) + digest.get("cost", {}).get("output_tokens", 0)
    p90 = ctx.get("tokens_p90", 0)
    if p90 and total > p90:
        return [_base(digest, "budget_overrun", PROBLEM, "tokens",
                      float(total) / max(p90, 1), tokens=total, p90=p90)]
    return []
 def sig_abandoned(digest, ctx) -> list[Signal]:
    if digest.get("outcome") == "abandoned":
        return [_base(digest, "abandoned", PROBLEM, "outcome", 1.0)]
    return []
 def sig_clean_pass(digest, ctx) -> list[Signal]:
    """Success: ended in success, ran tests at least once, with no errors and no retries."""
    m = digest.get("markers", {})
    if (digest.get("outcome") == "success" and m.get("test_runs", 0) >= 1
            and m.get("errors", 0) == 0 and m.get("retries", 0) == 0):
        return [_base(digest, "clean_pass", SUCCESS, "outcome", 1.0,
                      test_runs=m.get("test_runs"))]
    return []
 def sig_error_then_recovery(digest, ctx) -> list[Signal]:
    """Success despite hitting errors — a recovery worth learning from."""
    m = digest.get("markers", {})
    if digest.get("outcome") == "success" and m.get("errors", 0) >= 1:
        return [_base(digest, "error_then_recovery", SUCCESS, "errors",
                      float(m.get("errors", 1)), errors=m.get("errors"))]
    return []
 # --- tool-mix / infrastructure-overhead signals (WP-0005 T02) ----------------
 # These read the captured ``tool_histogram`` — friction that the outcome+marker
 # signals above are blind to (sessions still "succeed", just expensively).
 def tool_bucket(tool: str) -> str:
    """Group a tool name into a coarse activity bucket (flavor-agnostic)."""
    if tool.startswith("mcp__state-hub"):
        return "statehub_mcp"
    if tool in ("TaskUpdate", "TaskCreate", "TaskGet", "TaskList", "TaskOutput",
                "TaskStop", "todo_write", "update_task_status"):
        return "task_mgmt"
    if tool == "ToolSearch":
        return "schema_load"
    if tool in ("Bash", "run_terminal_command"):
        return "shell"
    if tool in ("Edit", "Write", "search_replace", "write", "NotebookEdit"):
        return "edit"
    if tool in ("Read", "read_file", "grep", "Grep", "glob", "Glob"):
        return "read"
    return "other"
 def _bucketed(digest) -> tuple[dict, int]:
    buckets: dict[str, int] = {}
    for tool, n in (digest.get("tool_histogram") or {}).items():
        buckets[tool_bucket(tool)] = buckets.get(tool_bucket(tool), 0) + n
    return buckets, sum(buckets.values())
 def sig_infra_overhead(digest, ctx) -> list[Signal]:
    """Problem: a large share of tool calls is hub/task/schema plumbing, not work."""
    buckets, total = _bucketed(digest)
    if total < ctx.get("infra_min_calls", 20):
        return []
    overhead = buckets.get("statehub_mcp", 0) + buckets.get("task_mgmt", 0) + buckets.get("schema_load", 0)
    share = overhead / total
    if share >= ctx.get("infra_overhead_threshold", 0.30):
        return [_base(digest, "infra_overhead", PROBLEM, "infra_overhead", round(share, 3),
                      overhead_calls=overhead, total_calls=total,
                      statehub=buckets.get("statehub_mcp", 0),
                      task_mgmt=buckets.get("task_mgmt", 0),
                      schema_load=buckets.get("schema_load", 0))]
    return []
 def sig_schema_thrash(digest, ctx) -> list[Signal]:
    """Problem: repeated ToolSearch — deferred-tool schemas reloaded over and over."""
    buckets, _ = _bucketed(digest)
    n = buckets.get("schema_load", 0)
    if n >= ctx.get("schema_thrash_threshold", 5):
        return [_base(digest, "schema_thrash", PROBLEM, "schema_load", float(n), tool_searches=n)]
    return []
 def sig_tool_thrash(digest, ctx) -> list[Signal]:
    """Problem: the most-called tool reaches ``tool_thrash_threshold`` calls, which suggests churn."""
    hist = digest.get("tool_histogram") or {}
    if not hist:
        return []
    tool, n = max(hist.items(), key=lambda kv: kv[1])
    if n >= ctx.get("tool_thrash_threshold", 80):
        return [_base(digest, "tool_thrash", PROBLEM, f"tool:{tool}", float(n), tool=tool, calls=n)]
    return []
 def sig_recurring_error(digest, ctx) -> list[Signal]:
    """Problem: a normalized error fingerprint (WP-0006) — one signal per distinct
    error in the session, so the same error across sessions/repos/flavors clusters
    into a candidate root-cause pattern (locus = fingerprint, magnitude = in-session
    occurrences). This is the content-level 'why', not just a coarse error count.
    """
    out: list[Signal] = []
    for snip in digest.get("error_snippets", []) or []:
        fp = snip.get("fingerprint")
        if not fp:
            continue
        out.append(_base(digest, "recurring_error", PROBLEM, fp, float(snip.get("count", 1)),
                         sample=snip.get("sample", ""), tool=snip.get("tool"),
                         occurrences=snip.get("count", 1)))
    return out
 EXTRACTORS: list[Callable] = [
    sig_retry_storm, sig_repeated_errors, sig_budget_overrun, sig_abandoned,
    sig_clean_pass, sig_error_then_recovery,
    sig_infra_overhead, sig_schema_thrash, sig_tool_thrash,
    sig_recurring_error,
 ]
 def build_context(digests: list[dict]) -> dict[str, Any]:
    """Corpus-level stats so extractors can compare a session to its peers."""
    totals = sorted(
        d.get("cost", {}).get("input_tokens", 0) + d.get("cost", {}).get("output_tokens", 0)
        for d in digests
    )
    p90 = totals[int(0.9 * (len(totals) - 1))] if totals else 0
    return {
        "tokens_p90": p90, "retry_storm_threshold": 3, "error_threshold": 3,
        # tool-mix / infra-overhead thresholds (WP-0005 T02)
        "infra_min_calls": 20, "infra_overhead_threshold": 0.30,
        "schema_thrash_threshold": 5, "tool_thrash_threshold": 80,
    }
 def extract_signals(digests: list[dict], ctx: Optional[dict] = None) -> list[Signal]:
    ctx = ctx or build_context(digests)
    out: list[Signal] = []
    for d in digests:
        for ex in EXTRACTORS:
            out.extend(ex(d, ctx))
    return out
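Reviewer sketch (not part of the commit): `build_context` derives `tokens_p90` with a floor-indexed nearest-rank percentile. The standalone function below reproduces only that indexing, so its behaviour on small or empty corpora is easy to check. The name `nearest_rank_p90` and the sample totals are illustrative assumptions, not code or data from the repository.

```python
def nearest_rank_p90(values: list[int]) -> int:
    """Floor-indexed 90th percentile, using the same index as build_context."""
    ordered = sorted(values)
    if not ordered:
        return 0  # empty corpus: fall back to 0, as build_context does
    return ordered[int(0.9 * (len(ordered) - 1))]


# Illustrative per-session token totals (input + output), one large outlier.
totals = [100, 200, 300, 400, 500, 600, 700, 800, 900, 10_000]
p90 = nearest_rank_p90(totals)
print(p90)
```

Because the index is floored, a single outlier at the top of a ten-session corpus does not become the threshold itself.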
--- a/session_memory/digest_lookup.py
+++ b/session_memory/digest_lookup.py
@@ -0,0 +1,76 @@
 """Read a single session digest from the local store (AGENTIC-WP-0011 T03).
 Thin read path for ``kaizen-agentic metrics correlate`` and other consumers.
 Does not run ingest.
 Usage:
    python -m session_memory.digest_lookup <session_uid> [--json]
    HELIX_STORE_DB=/abs/path/to/mem.db python -m session_memory.digest_lookup <uid>
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 import sys
 from .core.store import Store
 from .ingest import _expand, load_config
 def resolve_store_paths(*, config_path: str | None = None) -> tuple[str, str]:
    """Resolve db + blob paths from HELIX_STORE_DB or config.toml [store]."""
    env_db = os.environ.get("HELIX_STORE_DB")
    if env_db:
        db_path = _expand(env_db)
        blob_dir = os.path.join(os.path.dirname(db_path), "blobs")
        return db_path, blob_dir
    here = os.path.dirname(os.path.abspath(__file__))
    cfg_path = config_path or os.path.join(here, "config.toml")
    store_cfg = load_config(cfg_path).get("store", {})
    return _expand(store_cfg.get("db_path", "session_memory/.store/mem.db")), _expand(
        store_cfg.get("blob_dir", "session_memory/.store/blobs")
    )
 def lookup_digest(session_uid: str, *, config_path: str | None = None) -> dict | None:
    db_path, blob_dir = resolve_store_paths(config_path=config_path)
    store = Store(db_path, blob_dir)
    try:
        return store.get_digest(session_uid)
    finally:
        store.close()
 def main(argv: list[str] | None = None) -> int:
    here = os.path.dirname(os.path.abspath(__file__))
    ap = argparse.ArgumentParser(
        description="Read one session digest from the Helix Forge store (no ingest)."
    )
    ap.add_argument("session_uid", help="Normalized session uid, e.g. claude:abc-123")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"),
                    help="config.toml when HELIX_STORE_DB is unset")
    ap.add_argument("--json", action="store_true", help="print digest JSON to stdout")
    args = ap.parse_args(argv)
    digest = lookup_digest(args.session_uid, config_path=args.config)
    if digest is None:
        print(f"digest not found: {args.session_uid}", file=sys.stderr)
        return 1
    if args.json:
        print(json.dumps(digest, indent=2, sort_keys=True))
    else:
        cost = digest.get("cost") or {}
        tokens = cost.get("input_tokens", 0) + cost.get("output_tokens", 0)
        print(f"session_uid: {digest.get('session_uid')}")
        print(f"repo: {digest.get('repo')}  flavor: {digest.get('flavor')}")
        print(f"outcome: {digest.get('outcome')}  tokens: {tokens}")
        print(f"started_at: {digest.get('started_at')}  ended_at: {digest.get('ended_at')}")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
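Reviewer sketch (not part of the commit): the precedence in `resolve_store_paths` is "environment override first, then the `[store]` table, then built-in defaults". The function below restates that rule without touching the real store. `resolve_paths` is a hypothetical name, and `os.path.expanduser` stands in for the repository's `_expand` helper, whose exact behaviour is not shown in this diff.

```python
import os


def resolve_paths(env: dict[str, str], store_cfg: dict[str, str]) -> tuple[str, str]:
    """Return (db_path, blob_dir) using env-over-config precedence."""
    env_db = env.get("HELIX_STORE_DB")
    if env_db:
        # With the override set, blobs live next to the database file.
        db_path = os.path.expanduser(env_db)
        return db_path, os.path.join(os.path.dirname(db_path), "blobs")
    # Otherwise use the [store] table, falling back to the defaults in the diff.
    return (
        store_cfg.get("db_path", "session_memory/.store/mem.db"),
        store_cfg.get("blob_dir", "session_memory/.store/blobs"),
    )


override = resolve_paths({"HELIX_STORE_DB": "/tmp/x/mem.db"}, {"db_path": "ignored.db"})
defaults = resolve_paths({}, {})
```

Note that the override ignores any `blob_dir` in the config: the blob directory is always derived from the database location in that branch.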
--- a/session_memory/distribute/__init__.py
+++ b/session_memory/distribute/__init__.py
@@ -0,0 +1,9 @@
 """Distribute phase (PRD §6.4) — render approved Solution Patterns into per-flavor
 artifacts. Mirror of the collector design: agnostic core, thin distributor edges.
    base.py      Artifact + Distributor protocol + idempotent snippet markers (T01)
    claude.py    CLAUDE.md snippet distributor (T02)
    codex.py     AGENTS.md snippet distributor (T03)
    grok.py      native instruction distributor (T03)
    __main__.py  `python -m session_memory.distribute` (T05)
 """
--- a/session_memory/distribute/__main__.py
+++ b/session_memory/distribute/__main__.py
@@ -0,0 +1,89 @@
 """Distribute entrypoint (T05): catalog -> per-flavor proposals (HITL).
    python -m session_memory.distribute [--config PATH] [--repo R] [--flavor F] [--json]
 Reads approved / distribution-ready Solution Patterns from the Pattern Catalog and
 renders them into per-flavor **proposals** (never auto-applied) scoped by
 repo/domain, recording what is proposed where in the active-pattern registry.
 Targets are the repo->domain map in ``config.toml`` crossed with the known
 distributor flavors; each pattern's own ``Scope`` filters where it actually lands.
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..curate.catalog import Catalog
 from ..ingest import _expand, load_config
 from .proposals import ActiveRegistry, Target, propose
 from .registry import all_flavors
 def build_targets(config: dict, repo_filter=None, flavor_filter=None) -> list[Target]:
    repo_map = config.get("repo_domain_map", {})
    flavors = [flavor_filter] if flavor_filter else all_flavors()
    targets = []
    for repo, domain in repo_map.items():
        if repo_filter and repo != repo_filter:
            continue
        for flavor in flavors:
            targets.append(Target(repo=repo, domain=domain, flavor=flavor))
    return targets
 def run_distribute(config: dict, *, repo_filter=None, flavor_filter=None):
    cur = config.get("curate", {})
    dist = config.get("distribute", {})
    catalog = Catalog(_expand(cur.get("catalog_dir", "session_memory/catalog")))
    patterns = catalog.list()
    targets = build_targets(config, repo_filter, flavor_filter)
    registry = ActiveRegistry(_expand(dist.get("active_registry",
                                               "session_memory/distribute/active_patterns.json")))
    out_dir = _expand(dist.get("proposals_dir", "session_memory/proposals"))
    return propose(patterns, targets, out_dir, registry)
 def _summary(res) -> str:
    by_repo = {}
    for repo, flavor, pid, _ in res.proposals:
        by_repo.setdefault(repo, []).append(f"{pid}[{flavor}]")
    lines = [f"# Distribute proposals  ({len(res.proposals)} renders, "
             f"{len(res.files_written)} files)"]
    for repo in sorted(by_repo):
        lines.append(f"  {repo}: {', '.join(sorted(by_repo[repo]))}")
    if res.skipped_not_distributable:
        lines.append(f"  skipped (not distribution-ready): "
                     f"{len(set(res.skipped_not_distributable))} pattern(s)")
    if not res.proposals:
        lines.append("  (no approved/distribution-ready patterns matched any target)")
    return "\n".join(lines)
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Distribute approved patterns as per-flavor proposals.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--repo", default=None, help="limit to one target repo")
    ap.add_argument("--flavor", default=None, help="limit to one flavor")
    ap.add_argument("--json", action="store_true")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    res = run_distribute(config, repo_filter=args.repo, flavor_filter=args.flavor)
    if args.json:
        print(json.dumps({
            "proposals": [{"repo": r, "flavor": f, "pattern_id": p, "path": path}
                          for r, f, p, path in res.proposals],
            "files_written": res.files_written,
            "skipped": sorted(set(res.skipped_not_distributable)),
        }, indent=2))
    else:
        print(_summary(res))
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
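Reviewer sketch (not part of the commit): `build_targets` is a filtered cross product of the repo-to-domain map and the known flavors. The version below returns plain tuples instead of `Target` objects so it runs on its own. `cross_targets` is a hypothetical name, and the two domains in `repo_map` are made up for illustration.

```python
def cross_targets(repo_map, flavors, repo_filter=None, flavor_filter=None):
    """Return (repo, domain, flavor) tuples for every repo/flavor pair kept by the filters."""
    chosen = [flavor_filter] if flavor_filter else list(flavors)
    return [
        (repo, domain, flavor)
        for repo, domain in repo_map.items()
        if not repo_filter or repo == repo_filter
        for flavor in chosen
    ]


repo_map = {"ops-bridge": "infra", "state-hub": "platform"}  # domains are assumed
flavors = ["claude", "codex", "grok"]

everything = cross_targets(repo_map, flavors)
one_repo = cross_targets(repo_map, flavors, repo_filter="state-hub")
one_flavor = cross_targets(repo_map, flavors, flavor_filter="grok")
```

The filters only narrow the candidate targets; whether a pattern actually lands on a target is still decided later by the pattern's own scope.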
--- a/session_memory/distribute/active_patterns.json
+++ b/session_memory/distribute/active_patterns.json
@@ -0,0 +1,242 @@
 [
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "net-kingdom",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "net-kingdom",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "net-kingdom",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "codex",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-problem-file_not_read-edit",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.0"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-schema_thrash-schema_load",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-problem-tool_thrash-tool-bash",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "agentic-resources",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "can-you-assist",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "ops-bridge",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "state-hub",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "claude",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  },
  {
    "flavor": "grok",
    "pattern_id": "sp-success-clean_pass-outcome",
    "repo": "the-custodian",
    "status": "proposed",
    "updated_at": "2026-06-07T14:25:34Z",
    "version": "1.0.1"
  }
 ]
--- a/session_memory/distribute/base.py
+++ b/session_memory/distribute/base.py
@@ -0,0 +1,115 @@
 """Distributor base — Artifact, the Distributor protocol, and idempotent markers
 (PRD §6.4 FR-X1; T01).
 A **distributor** turns one agnostic :class:`SolutionPattern` into a per-flavor
 :class:`Artifact` (a target path + a snippet of content). Everything flavor-neutral
 lives here; each flavor adapter (T02/T03) only supplies its target filename and may
 override the rendered body using the pattern's ``rendering_hints``.
 Snippets carry stable ``BEGIN/END`` markers keyed on the pattern id, so
 re-distributing a pattern **updates its block in place** instead of duplicating it
 — the property that lets Distribute run repeatedly (HITL) without drift.
 """
 from __future__ import annotations
 import re
 from dataclasses import dataclass
 from typing import Any, Optional, Protocol, runtime_checkable
 from ..curate.schema import SolutionPattern
@dataclass
 class Artifact:
    """A proposed per-flavor rendering of a pattern (FR-X1/FR-X3 — proposed, not applied)."""
    flavor: str
    target_path: str        # repo-relative file the snippet belongs in (e.g. "CLAUDE.md")
    pattern_id: str
    content: str            # the marker-wrapped snippet block
@runtime_checkable
 class Distributor(Protocol):
    flavor: str
    target_path: str
    def render(self, pattern: SolutionPattern) -> Artifact: ...
 # --- idempotent snippet markers ---------------------------------------------
 _MARK = "helix-forge pattern"
 def begin_marker(pattern_id: str) -> str:
    return f"<!-- BEGIN {_MARK}:{pattern_id} -->"
 def end_marker(pattern_id: str) -> str:
    return f"<!-- END {_MARK}:{pattern_id} -->"
 def wrap_block(pattern_id: str, body: str, version: str = "") -> str:
    """Wrap a rendered body in stable BEGIN/END markers."""
    ver = f" v{version}" if version else ""
    return f"{begin_marker(pattern_id)}{ver}\n{body.strip()}\n{end_marker(pattern_id)}"
 def upsert_block(doc_text: str, pattern_id: str, block: str) -> str:
    """Insert or replace a pattern's marked block within a document (idempotent)."""
    pat = re.compile(
        re.escape(begin_marker(pattern_id)) + r".*?" + re.escape(end_marker(pattern_id)),
        re.DOTALL,
    )
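    if pat.search(doc_text):
        # Use a callable replacement so backslashes in ``block`` stay literal;
        # a plain string would be parsed as a regex replacement template.
        return pat.sub(lambda _m: block, doc_text)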
    sep = "" if doc_text.endswith("\n\n") or not doc_text else "\n\n"
    return f"{doc_text}{sep}{block}\n"
 # --- agnostic body rendering ------------------------------------------------
 def render_markdown_body(pattern: SolutionPattern) -> str:
    """Default flavor-neutral snippet body from the agnostic pattern fields."""
    label = "Avoid" if pattern.polarity == "problem" else "Prefer"
    lines = [f"### {pattern.name}", "", pattern.problem.strip(), ""]
    if pattern.resolutions:
        lines.append(f"**{label}:**")
        for r in pattern.resolutions:
            detail = f" — {r.detail}" if r.detail else ""
            lines.append(f"- {r.summary}{detail}")
            for step in r.steps:
                lines.append(f"  - {step}")
    return "\n".join(lines).strip()
 def hint(pattern: SolutionPattern, flavor: str, key: str, default: Any = None) -> Any:
    """Read a per-flavor rendering hint, falling back to ``default``."""
    return (pattern.rendering_hints.get(flavor) or {}).get(key, default)
 class BaseDistributor:
    """Shared distributor: renders the agnostic body, honouring a ``body`` hint
    override and a ``target`` hint, then wraps it in idempotent markers."""
    flavor: str = ""
    target_path: str = ""
    def __init__(self, flavor: Optional[str] = None, target_path: Optional[str] = None) -> None:
        if flavor is not None:
            self.flavor = flavor
        if target_path is not None:
            self.target_path = target_path
    def body(self, pattern: SolutionPattern) -> str:
        return hint(pattern, self.flavor, "body") or render_markdown_body(pattern)
    def target(self, pattern: SolutionPattern) -> str:
        return hint(pattern, self.flavor, "target") or self.target_path
    def render(self, pattern: SolutionPattern) -> Artifact:
        block = wrap_block(pattern.id, self.body(pattern), pattern.version)
        return Artifact(flavor=self.flavor, target_path=self.target(pattern),
                        pattern_id=pattern.id, content=block)
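Reviewer sketch (not part of the commit): the idempotence claim for the BEGIN/END markers can be checked in isolation. The block below re-declares the marker helpers from `base.py` in compact form and upserts the same pattern three times. The pattern id `sp-demo` and the snippet bodies are illustrative. This sketch passes a callable to `re.sub` so that a body containing a backslash is inserted literally.

```python
import re

MARK = "helix-forge pattern"


def begin_marker(pid: str) -> str:
    return f"<!-- BEGIN {MARK}:{pid} -->"


def end_marker(pid: str) -> str:
    return f"<!-- END {MARK}:{pid} -->"


def wrap_block(pid: str, body: str, version: str = "") -> str:
    ver = f" v{version}" if version else ""
    return f"{begin_marker(pid)}{ver}\n{body.strip()}\n{end_marker(pid)}"


def upsert_block(doc: str, pid: str, block: str) -> str:
    pat = re.compile(
        re.escape(begin_marker(pid)) + r".*?" + re.escape(end_marker(pid)),
        re.DOTALL,
    )
    if pat.search(doc):
        # Callable replacement: backslashes in the block are not interpreted.
        return pat.sub(lambda _m: block, doc)
    sep = "" if doc.endswith("\n\n") or not doc else "\n\n"
    return f"{doc}{sep}{block}\n"


doc = "# CLAUDE.md\n"
v1 = wrap_block("sp-demo", "Read a file before editing it.", "1.0.0")
v2 = wrap_block("sp-demo", "Read a file before editing it (regex \\d+ ok).", "1.0.1")

once = upsert_block(doc, "sp-demo", v1)
twice = upsert_block(once, "sp-demo", v1)     # same block again: no change
updated = upsert_block(twice, "sp-demo", v2)  # new version replaces in place
```

Re-running with an unchanged block leaves the document byte-for-byte identical, and a version bump rewrites the existing block rather than appending a second one.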
--- a/session_memory/distribute/claude.py
+++ b/session_memory/distribute/claude.py
@@ -0,0 +1,42 @@
 """Claude distributor (PRD §6.4 FR-X1; T02).
 Renders an approved Solution Pattern into a ``CLAUDE.md`` snippet block. Most logic
 is inherited from :class:`BaseDistributor`; the Claude-specific touch is an
 optional **skill** rendering mode (``rendering_hints["claude"]["as"] == "skill"``)
 that emits a skill-style stub instead of a plain instruction snippet — Claude's
 native distribution targets are CLAUDE.md snippets, skills, or hooks.
 """
 from __future__ import annotations
 from ..curate.schema import SolutionPattern
 from .base import BaseDistributor, hint, render_markdown_body
 class ClaudeDistributor(BaseDistributor):
    flavor = "claude"
    target_path = "CLAUDE.md"
    def body(self, pattern: SolutionPattern) -> str:
        override = hint(pattern, self.flavor, "body")
        if override:
            return override
        if hint(pattern, self.flavor, "as") == "skill":
            return self._skill_stub(pattern)
        return render_markdown_body(pattern)
    @staticmethod
    def _skill_stub(pattern: SolutionPattern) -> str:
        trigger = "avoid" if pattern.polarity == "problem" else "apply"
        lines = [
            f"## Skill: {pattern.name}",
            "",
            f"**When:** situations where you would {trigger} — {pattern.problem.strip()}",
            "",
            "**Steps:**",
        ]
        for r in pattern.resolutions:
            lines.append(f"- {r.summary}" + (f" — {r.detail}" if r.detail else ""))
            for step in r.steps:
                lines.append(f"  - {step}")
        return "\n".join(lines).strip()
--- a/session_memory/distribute/codex.py
+++ b/session_memory/distribute/codex.py
@@ -0,0 +1,15 @@
 """Codex distributor (PRD §6.4 FR-X1; T03).
 Renders an approved Solution Pattern into an ``AGENTS.md`` snippet — Codex's native
 repo-convention surface. Identical agnostic body to the other flavors (FR-A3: one
 pattern, expressible everywhere); only the target file differs.
 """
 from __future__ import annotations
 from .base import BaseDistributor
 class CodexDistributor(BaseDistributor):
    flavor = "codex"
    target_path = "AGENTS.md"
--- a/session_memory/distribute/grok.py
+++ b/session_memory/distribute/grok.py
@@ -0,0 +1,15 @@
 """Grok distributor (PRD §6.4 FR-X1; T03).
 Renders an approved Solution Pattern into Grok's native instruction format. Defaults
 to a ``.grok/instructions.md`` snippet; the same agnostic body as the other flavors
 (FR-A3), overridable via ``rendering_hints["grok"]``.
 """
 from __future__ import annotations
 from .base import BaseDistributor
 class GrokDistributor(BaseDistributor):
    flavor = "grok"
    target_path = ".grok/instructions.md"
--- a/session_memory/distribute/proposals.py
+++ b/session_memory/distribute/proposals.py
@@ -0,0 +1,136 @@
 """Scoping, proposed-not-applied output, and the active-pattern registry
 (PRD §6.4 FR-X2/FR-X3/FR-X4; T04).
 * **Scope (FR-X2):** a pattern lands in a target environment only if the target's
  repo/domain/flavor are within the pattern's :class:`Scope` (an empty scope list
  means "unrestricted on that axis").
 * **Proposed, not applied (FR-X3):** rendered artifacts are written under a
  ``proposals/`` tree mirroring the target path — a reviewable diff a human applies,
  never auto-written into the live file. Re-running upserts each pattern's block in
  place (idempotent), so proposals don't accumulate duplicates.
 * **Active-pattern registry (FR-X4):** a JSON record of which pattern (and version)
  is proposed/active in which (repo, flavor) environment.
 """
 from __future__ import annotations
 import json
 import os
 from dataclasses import dataclass
 from datetime import datetime, timezone
 from ..curate.schema import SolutionPattern
 from .base import upsert_block
 from .registry import get_distributor
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
@dataclass(frozen=True)
 class Target:
    """An environment a pattern could be distributed to."""
    repo: str
    domain: str = ""
    flavor: str = "claude"
 def applies(pattern: SolutionPattern, target: Target) -> bool:
    """True if ``target`` is within the pattern's scope (empty axis == any)."""
    sc = pattern.scope
    if sc.repos and target.repo not in sc.repos:
        return False
    if sc.domains and target.domain and target.domain not in sc.domains:
        return False
    if sc.flavors and target.flavor not in sc.flavors:
        return False
    return True
 def is_distributable(pattern: SolutionPattern) -> bool:
    return pattern.status == "approved" and pattern.distribution_ready
 class ActiveRegistry:
    """JSON record of patterns proposed/active per (repo, flavor) — FR-X4."""
    def __init__(self, path: str) -> None:
        self.path = path
        self._entries: dict[str, dict] = {}
        if os.path.exists(path):
            with open(path, encoding="utf-8") as fh:
                for e in json.load(fh):
                    self._entries[self._key(e["pattern_id"], e["repo"], e["flavor"])] = e
    @staticmethod
    def _key(pid: str, repo: str, flavor: str) -> str:
        return f"{pid}|{repo}|{flavor}"
    def record(self, pid: str, repo: str, flavor: str, version: str,
               status: str = "proposed") -> None:
        self._entries[self._key(pid, repo, flavor)] = {
            "pattern_id": pid, "repo": repo, "flavor": flavor,
            "version": version, "status": status, "updated_at": _now(),
        }
    def entries(self) -> list[dict]:
        return [self._entries[k] for k in sorted(self._entries)]
    def save(self) -> None:
        os.makedirs(os.path.dirname(self.path) or ".", exist_ok=True)
        with open(self.path, "w", encoding="utf-8") as fh:
            json.dump(self.entries(), fh, indent=2, sort_keys=True)
            fh.write("\n")
@dataclass
 class ProposalResult:
    proposals: list = None        # (repo, flavor, pattern_id, proposal_path)
    files_written: list = None    # absolute proposal paths
    skipped_not_distributable: list = None  # pattern ids
    def __post_init__(self):
        self.proposals = self.proposals or []
        self.files_written = self.files_written or []
        self.skipped_not_distributable = self.skipped_not_distributable or []
 def propose(patterns: list[SolutionPattern], targets: list[Target], out_dir: str,
            registry: ActiveRegistry) -> ProposalResult:
    """Render in-scope, distributable patterns into per-target proposal files."""
    result = ProposalResult()
    pending: dict[str, str] = {}  # proposal path -> accumulated content
    for p in patterns:
        if not is_distributable(p):
            result.skipped_not_distributable.append(p.id)
            continue
        for t in targets:
            dist = get_distributor(t.flavor)
            if dist is None or not applies(p, t):
                continue
            art = dist.render(p)
            path = os.path.join(out_dir, t.repo, art.target_path)
            if path not in pending:
                pending[path] = _read(path)
            pending[path] = upsert_block(pending[path], p.id, art.content)
            registry.record(p.id, t.repo, t.flavor, p.version)
            result.proposals.append((t.repo, t.flavor, p.id, path))
    for path, content in pending.items():
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(content if content.endswith("\n") else content + "\n")
        result.files_written.append(path)
    registry.save()
    return result
 def _read(path: str) -> str:
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            return fh.read()
    return ""
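Reviewer sketch (not part of the commit): the scope rule in `applies` treats an empty axis as "unrestricted", and it also lets a target with no domain through a domain-scoped pattern. The block below restates that logic with a stand-in `Scope` dataclass. The real `Scope` lives in `curate/schema.py` and is not shown in this diff, so its field types here are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:  # hypothetical stand-in for curate.schema.Scope
    repos: tuple[str, ...] = ()
    domains: tuple[str, ...] = ()
    flavors: tuple[str, ...] = ()


@dataclass(frozen=True)
class Target:
    repo: str
    domain: str = ""
    flavor: str = "claude"


def applies(scope: Scope, target: Target) -> bool:
    """True if the target is within scope; an empty axis matches anything."""
    if scope.repos and target.repo not in scope.repos:
        return False
    # A target with no domain is not filtered on the domain axis.
    if scope.domains and target.domain and target.domain not in scope.domains:
        return False
    if scope.flavors and target.flavor not in scope.flavors:
        return False
    return True


narrow = Scope(repos=("ops-bridge",), flavors=("claude",))
by_domain = Scope(domains=("infra",))

in_scope = applies(narrow, Target("ops-bridge", "infra", "claude"))
wrong_repo = applies(narrow, Target("state-hub", "infra", "claude"))
wrong_flavor = applies(narrow, Target("ops-bridge", "infra", "codex"))
no_domain = applies(by_domain, Target("ops-bridge"))
```

The `no_domain` case is worth a second look during review: a repo missing from the domain map would receive every domain-scoped pattern.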
--- a/session_memory/distribute/registry.py
+++ b/session_memory/distribute/registry.py
@@ -0,0 +1,26 @@
 """Distributor registry (T03) — flavor -> distributor, the one place that knows
 about all flavor edges. Adding a flavor = one entry here + one adapter module.
 """
 from __future__ import annotations
 from typing import Optional
 from .base import BaseDistributor
 from .claude import ClaudeDistributor
 from .codex import CodexDistributor
 from .grok import GrokDistributor
 _REGISTRY: dict[str, BaseDistributor] = {
    "claude": ClaudeDistributor(),
    "codex": CodexDistributor(),
    "grok": GrokDistributor(),
 }
 def get_distributor(flavor: str) -> Optional[BaseDistributor]:
    return _REGISTRY.get(flavor)
 def all_flavors() -> list[str]:
    return list(_REGISTRY)
--- a/session_memory/ingest.py
+++ b/session_memory/ingest.py
@@ -19,13 +19,19 @@ from dataclasses import dataclass, field
 from typing import Any
 from .adapters import claude as claude_adapter
 from .adapters import codex as codex_adapter
 from .adapters import grok as grok_adapter
 from .core import digest as digest_mod
 from .core.cursor import Cursors
 from .core.retention import RetentionConfig, sweep as retention_sweep
 from .core.store import Store
 # adapter dispatch by source name
-_ADAPTERS = {"claude": claude_adapter.parse_session}
+_ADAPTERS = {
    "claude": claude_adapter.parse_session,
    "codex": codex_adapter.parse_session,
    "grok": grok_adapter.parse_session,
 }
@dataclass
--- a/session_memory/measure/__init__.py
+++ b/session_memory/measure/__init__.py
@@ -0,0 +1,9 @@
 """Measure phase (PRD §6.5) — the loop-closer.
    metrics.py    fleet metrics + persisted baseline snapshots (T01)
    effect.py     before/after per-pattern effectiveness (T02)
    __main__.py   python -m session_memory.measure (T03)
 Computation over existing digests (reusing WP-0005 tool buckets + WP-0006 error
 mining); no new capture.
 """
--- a/session_memory/measure/__main__.py
+++ b/session_memory/measure/__main__.py
@@ -0,0 +1,101 @@
 """Measure entrypoint (T03): fleet trend + per-pattern effectiveness.
    python -m session_memory.measure [--config PATH] [--label L] [--since DATE]
                                     [--no-save] [--json]
 Computes current fleet metrics over the real (quality-filtered) sessions, appends
 them to the baseline trend, and reports whether the fleet is getting cheaper /
 more reliable over time (FR-M3). With ``--since DATE`` it also reports before/after
 effectiveness around a change (FR-M1/FR-M2).
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..core.store import Store
 from ..detect.quality import filter_real, quality_config
 from ..ingest import _expand, load_config
 from .effect import effectiveness
 from .metrics import load_baselines, save_baseline, snapshot
 _TREND_KEYS = ("infra_overhead_share_median", "error_rate", "schema_thrash_sessions",
               "tokens_p50", "success_rate")
 def real_digests(config: dict) -> list[dict]:
    s = config.get("store", {})
    store = Store(_expand(s["db_path"]), _expand(s["blob_dir"]))
    try:
        return filter_real(store.list_digests(), quality_config(config))
    finally:
        # Close the store even if listing or filtering raises.
        store.close()
 def _fmt_trend(baselines: list[dict]) -> str:
    if not baselines:
        return "  (no prior snapshots)"
    lines = []
    recent = baselines[-5:]
    for b in recent:
        when = (b.get("captured_at") or "")[:10]
        lbl = f" {b['label']}" if b.get("label") else ""
        lines.append(f"  {when}{lbl}: overhead_med={b.get('infra_overhead_share_median')} "
                     f"err_rate={b.get('error_rate')} schema_thrash={b.get('schema_thrash_sessions')} "
                     f"tok_p50={b.get('tokens_p50')} success={b.get('success_rate')} "
                     f"(n={b.get('n_sessions')})")
    return "\n".join(lines)
 def _report(current: dict, baselines: list[dict], eff: dict | None) -> str:
    lines = [f"# Fleet metrics  (n={current.get('n_sessions')} real sessions)"]
    for k in _TREND_KEYS:
        lines.append(f"  {k} = {current.get(k)}")
    lines.append("\n## Trend (recent snapshots)")
    lines.append(_fmt_trend(baselines))
    if eff is not None:
        lines.append(f"\n## Effectiveness since {eff['applied_at']} "
                     f"(before={eff['n_before']}, after={eff['n_after']})")
        if eff["insufficient_data"]:
            lines.append("  insufficient data on one side of the date")
        else:
            for k in _TREND_KEYS:
                d = eff["deltas"].get(k, {})
                mark = {True: "improved", False: "worse", None: "—"}[d.get("improved")]
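                # Guard the signed format: a missing delta would make ``None:+`` raise.
                change = d.get("change")
                delta = f"{change:+}" if change is not None else "n/a"
                lines.append(f"  {k}: {d.get('before')} -> {d.get('after')} "
                             f"({delta}) {mark}")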
    return "\n".join(lines)
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Measure fleet metrics + per-pattern effectiveness.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--label", default="")
    ap.add_argument("--since", default=None, help="ISO date for before/after effectiveness")
    ap.add_argument("--no-save", action="store_true", help="don't append to the baseline trend")
    ap.add_argument("--json", action="store_true")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    digests = real_digests(config)
    current = snapshot(digests, label=args.label)
    path = _expand(config.get("measure", {}).get("baselines", "session_memory/measure/baselines.jsonl"))
    prior = load_baselines(path)
    if not args.no_save:
        save_baseline(current, path)
    eff = effectiveness(digests, args.since, label=args.label) if args.since else None
    if args.json:
        print(json.dumps({"current": current, "trend": prior + [current], "effectiveness": eff},
                         indent=2))
    else:
        print(_report(current, prior + [current], eff))
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/measure/baselines.jsonl
+++ b/session_memory/measure/baselines.jsonl
@@ -0,0 +1 @@
 {"captured_at": "2026-06-07T13:30:14Z", "error_rate": 0.963, "infra_overhead_share_median": 0.117, "infra_overhead_share_p90": 0.261, "label": "phase4-baseline (pre-fixes)", "n_sessions": 27, "recurring_error_occurrences": 505, "schema_thrash_sessions": 8, "success_rate": 1.0, "tokens_p50": 250725, "tokens_p90": 1423966}
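Reviewer sketch (not part of the commit): the baseline trend is stored as JSON Lines, one snapshot object per line. The block below parses such text and prints a one-line summary. `parse_jsonl` is a hypothetical helper, since the repository's `load_baselines` is not shown in this diff, and the sample line is an abbreviated copy of the snapshot above.

```python
import json


def parse_jsonl(text: str) -> list[dict]:
    """Parse JSON Lines text into a list of snapshot dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Abbreviated copy of the committed baseline snapshot.
sample = (
    '{"captured_at": "2026-06-07T13:30:14Z", "error_rate": 0.963, '
    '"n_sessions": 27, "success_rate": 1.0, "tokens_p50": 250725}\n'
)

snapshots = parse_jsonl(sample)
latest = snapshots[-1]
summary = (
    f"{latest['captured_at'][:10]}: err_rate={latest['error_rate']} "
    f"tok_p50={latest['tokens_p50']} (n={latest['n_sessions']})"
)
print(summary)
```

Appending one line per run keeps earlier snapshots untouched, which is what lets the trend report show the most recent entries in order.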
--- a/session_memory/measure/effect.py
+++ b/session_memory/measure/effect.py
@@ -0,0 +1,60 @@
 """Before/after per-pattern effectiveness (PRD §6.5 FR-M1/FR-M2; T02).
 Given a change/pattern with an ``applied_at`` date, split sessions into *before*
 and *after* by their start time, aggregate each side, and diff the headline
 metrics — so we can say whether a distributed pattern (e.g. the Read-before-Edit
 reflex, or the State Hub skill) actually moved the numbers, and retire it if not.
 """
 from __future__ import annotations
 from .metrics import aggregate
 # Metrics where a *lower* value after the change means improvement.
 _LOWER_IS_BETTER = {
    "infra_overhead_share_median", "infra_overhead_share_p90", "error_rate",
    "recurring_error_occurrences", "schema_thrash_sessions", "tokens_p50", "tokens_p90",
 }
 # Metrics where a *higher* value is improvement.
 _HIGHER_IS_BETTER = {"success_rate"}
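 def split_by_date(digests: list[dict], applied_at: str) -> tuple[list[dict], list[dict]]:
    """Partition digests into (before, after) by ``started_at`` vs ``applied_at``.

    Timestamps are compared as ISO-8601 strings, so both sides must use the same
    format and timezone. Digests without ``started_at`` are counted as *before*.
    """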
    before, after = [], []
    for d in digests:
        ts = d.get("started_at") or ""
        (after if ts and ts >= applied_at else before).append(d)
    return before, after
 def _delta(metric: str, before: float, after: float) -> dict:
    change = round(after - before, 3)
    if metric in _LOWER_IS_BETTER:
        improved = change < 0
    elif metric in _HIGHER_IS_BETTER:
        improved = change > 0
    else:
        improved = None
    return {"before": before, "after": after, "change": change, "improved": improved}
 def effectiveness(digests: list[dict], applied_at: str, *, label: str = "") -> dict:
    """Compare fleet metrics after ``applied_at`` against the prior period."""
    before, after = split_by_date(digests, applied_at)
    b_agg, a_agg = aggregate(before), aggregate(after)
    deltas = {}
    if before and after:
        # Sorted so the delta order is stable across runs (set iteration order is not).
        for m in sorted(_LOWER_IS_BETTER | _HIGHER_IS_BETTER):
            deltas[m] = _delta(m, b_agg.get(m, 0.0), a_agg.get(m, 0.0))
    return {
        "label": label,
        "applied_at": applied_at,
        "n_before": len(before),
        "n_after": len(after),
        "before": b_agg,
        "after": a_agg,
        "deltas": deltas,
        "insufficient_data": not (before and after),
    }
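A minimal standalone sketch of the before/after split and the direction-aware delta, for readers who want to try the logic without the package. It mirrors the shape of `split_by_date` and `_delta` above but is a simplified stand-in, not the module itself; the session ids and metric values are made up for illustration.

```python
# Standalone sketch; names and sample data are illustrative assumptions.
LOWER_IS_BETTER = {"error_rate", "tokens_p50"}
HIGHER_IS_BETTER = {"success_rate"}


def split_by_date(digests, applied_at):
    """Partition digests by comparing ISO-8601 strings lexicographically."""
    before, after = [], []
    for d in digests:
        ts = d.get("started_at") or ""
        (after if ts and ts >= applied_at else before).append(d)
    return before, after


def delta(metric, before, after):
    """Diff one metric and decide whether the change is an improvement."""
    change = round(after - before, 3)
    if metric in LOWER_IS_BETTER:
        improved = change < 0
    elif metric in HIGHER_IS_BETTER:
        improved = change > 0
    else:
        improved = None
    return {"before": before, "after": after, "change": change, "improved": improved}


sessions = [
    {"id": "a", "started_at": "2026-06-01T09:00:00Z"},
    {"id": "b", "started_at": "2026-06-07T00:00:00Z"},
    {"id": "c", "started_at": "2026-06-09T12:00:00Z"},
    {"id": "d"},  # no timestamp: lands in "before"
]
before, after = split_by_date(sessions, "2026-06-07")
before_ids = [d["id"] for d in before]
after_ids = [d["id"] for d in after]

error_delta = delta("error_rate", 0.963, 0.5)
success_delta = delta("success_rate", 1.0, 1.0)
print(before_ids, after_ids, error_delta, success_delta)
```

Two behaviours are worth noticing in this sketch: a bare date such as `2026-06-07` sorts before any full timestamp on that day, so the whole day counts as *after*; and an unchanged metric yields `improved=False`, so it would be reported the same way as a regression.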
--- a/session_memory/measure/metrics.py
+++ b/session_memory/measure/metrics.py
@@ -0,0 +1,102 @@
 """Fleet metrics + persisted baselines (PRD §6.5 FR-M3; T01).
 Computes the headline health metrics of the captured corpus — the same quantities
 the friction assessment reported — so they can be tracked over time and compared
 before/after a change. Reuses :func:`detect.signals.tool_bucket` (WP-0005) and the
 digest ``error_snippets`` (WP-0006); no new capture.
 A **baseline** is a timestamped metrics snapshot appended to a JSONL file, so
 successive runs build a trend the entrypoint (T03) can chart.
 """
 from __future__ import annotations
 import collections
 import json
 import os
 from datetime import datetime, timezone
 from ..detect.signals import tool_bucket
 def _now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _pct(values: list[float], q: float) -> float:
    """Nearest-rank percentile (index rounded down, no interpolation); 0.0 when empty."""
    if not values:
        return 0.0
    s = sorted(values)
    return round(s[int(q * (len(s) - 1))], 3)
 def _median(values: list[float]) -> float:
    return _pct(values, 0.5)
 def _buckets(digest: dict) -> collections.Counter:
    b: collections.Counter = collections.Counter()
    for tool, n in (digest.get("tool_histogram") or {}).items():
        b[tool_bucket(tool)] += n
    return b
 def session_metrics(digest: dict) -> dict:
    """Per-session metrics used to build fleet aggregates."""
    b = _buckets(digest)
    calls = sum(b.values())
    overhead = b["statehub_mcp"] + b["task_mgmt"] + b["schema_load"]
    cost = digest.get("cost") or {}
    tokens = cost.get("input_tokens", 0) + cost.get("output_tokens", 0)
    return {
        # Guard only the division; report the true call count (0, not 1) for empty sessions.
        "infra_overhead_share": overhead / (calls or 1),
        "tool_calls": calls,
        "schema_load": b["schema_load"],
        "error_occurrences": sum(s.get("count", 1) for s in (digest.get("error_snippets") or [])),
        "has_error": bool(digest.get("error_snippets")),
        "tokens": tokens,
        "success": digest.get("outcome") == "success",
    }
 def aggregate(digests: list[dict], *, schema_thrash_threshold: int = 5) -> dict:
    """Fleet-level metrics over a set of (already quality-filtered) digests."""
    per = [session_metrics(d) for d in digests]
    n = len(per)
    if n == 0:
        return {"n_sessions": 0}
    shares = [m["infra_overhead_share"] for m in per]
    tokens = [m["tokens"] for m in per]
    return {
        "n_sessions": n,
        "infra_overhead_share_median": _median(shares),
        "infra_overhead_share_p90": _pct(shares, 0.9),
        "error_rate": round(sum(m["has_error"] for m in per) / n, 3),
        "recurring_error_occurrences": sum(m["error_occurrences"] for m in per),
        "schema_thrash_sessions": sum(1 for m in per if m["schema_load"] >= schema_thrash_threshold),
        "tokens_p50": _pct(tokens, 0.5),
        "tokens_p90": _pct(tokens, 0.9),
        "success_rate": round(sum(m["success"] for m in per) / n, 3),
    }
 def snapshot(digests: list[dict], *, label: str = "") -> dict:
    """Aggregate the digests and stamp the result with a UTC capture time and label."""
    m = aggregate(digests)
    m["captured_at"] = _now()
    m["label"] = label
    return m
 def save_baseline(metrics: dict, path: str) -> None:
    """Append a metrics snapshot to the baseline JSONL trend file."""
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(metrics, sort_keys=True))
        fh.write("\n")
 def load_baselines(path: str) -> list[dict]:
    """Read every snapshot from the baseline JSONL file; an absent file yields []."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
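The percentile helper and the JSONL trend file can be exercised in isolation. The sketch below restates both in simplified form (the helper names `pct`, `append_snapshot` and `load_snapshots` and the sample values are assumptions for illustration, not the module's API) to show how the rounded-down nearest-rank index behaves on a small sample and that appended snapshots come back in write order.

```python
import json
import os
import tempfile


def pct(values, q):
    """Nearest-rank percentile with the index rounded down (no interpolation)."""
    if not values:
        return 0.0
    s = sorted(values)
    return round(s[int(q * (len(s) - 1))], 3)


def append_snapshot(snapshot, path):
    """Append one snapshot as a single JSON line."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(snapshot, sort_keys=True) + "\n")


def load_snapshots(path):
    """Read all snapshots back, skipping blank lines."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


shares = [0.26, 0.05, 0.20, 0.10, 0.12]
median = pct(shares, 0.5)  # sorted index int(0.5 * 4) = 2
p90 = pct(shares, 0.9)     # sorted index int(0.9 * 4) = int(3.6) = 3

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "baselines.jsonl")
    missing = load_snapshots(path)
    append_snapshot({"label": "run-1", "error_rate": 0.9}, path)
    append_snapshot({"label": "run-2", "error_rate": 0.7}, path)
    trend = load_snapshots(path)

labels = [row["label"] for row in trend]
print(median, p90, labels)
```

With only five values the 90th percentile resolves to the fourth-smallest value rather than the maximum, which is worth keeping in mind when reading `*_p90` figures from small fleets.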
--- a/session_memory/retro/__init__.py
+++ b/session_memory/retro/__init__.py
@@ -0,0 +1,9 @@
 """Weekly retro (AGENTIC-WP-0010) — the analysis half of the coding retrospection.
    build.py     windowed detect + measure -> ranked top-3 suggestions per repo (T01)
    publish.py   publish the retro to the hub read model + local report (T02)
    __main__.py  python -m session_memory.retro (T03)
 Consumed by activity-core's weekly-coding-retro schedule (ACTIVITY-WP-0008) via
 the ``event_type=coding_retro`` read model.
 """
--- a/session_memory/retro/__main__.py
+++ b/session_memory/retro/__main__.py
@@ -0,0 +1,68 @@
 """Weekly retro entrypoint (AGENTIC-WP-0010 T03).
    python -m session_memory.retro [--window-days 7] [--since D] [--until D]
                                   [--publish] [--json]
 Builds the windowed top-3-per-repo retro over the captured sessions, writes a local
 JSON + markdown report, and (with ``--publish``) posts it to the hub as the
 ``coding_retro`` read model that activity-core's weekly schedule consumes.
 """
 from __future__ import annotations
 import argparse
 import json
 import os
 from ..core.store import Store
 from ..curate.catalog import Catalog
 from ..ingest import _expand, load_config
 from .build import weekly_retro
 from .publish import publish_to_hub, render_markdown, write_local
 def run_retro(config: dict, *, window_days=None, since=None, until=None):
    """Load digests and the Pattern Catalog from the configured store, then build the retro."""
    s = config.get("store", {})
    store = Store(_expand(s["db_path"]), _expand(s["blob_dir"]))
    digests = store.list_digests()
    store.close()
    cur = config.get("curate", {})
    catalog = Catalog(_expand(cur.get("catalog_dir", "session_memory/catalog")))
    rcfg = config.get("retro", {})
    return weekly_retro(digests, catalog, since=since, until=until,
                        window_days=window_days or rcfg.get("window_days", 7))
 def main(argv=None) -> int:
    here = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    ap = argparse.ArgumentParser(description="Build (and optionally publish) the weekly coding retro.")
    ap.add_argument("--config", default=os.path.join(here, "config.toml"))
    ap.add_argument("--window-days", type=int, default=None)
    ap.add_argument("--since", default=None)
    ap.add_argument("--until", default=None)
    ap.add_argument("--publish", action="store_true", help="post to the hub coding_retro read model")
    ap.add_argument("--json", action="store_true")
    args = ap.parse_args(argv)
    config = load_config(args.config)
    report = run_retro(config, window_days=args.window_days, since=args.since, until=args.until)
    rcfg = config.get("retro", {})
    write_local(report, _expand(rcfg.get("report_json", "session_memory/retro/last_retro.json")),
                _expand(rcfg.get("report_md", "session_memory/retro/last_retro.md")))
    published = None
    if args.publish:
        published = publish_to_hub(report, base_url=rcfg.get("hub_url", "http://127.0.0.1:8000"))
    if args.json:
        print(json.dumps({"report": report, "published": published}, indent=2))
    else:
        print(render_markdown(report))
        if args.publish:
            print(f"\npublished to hub: {published}")
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
--- a/session_memory/retro/build.py
+++ b/session_memory/retro/build.py
@@ -0,0 +1,99 @@
 """Windowed weekly retro report (AGENTIC-WP-0010 T01).
 Runs the existing detect pipeline over a date window, ranks the recurring problem
 patterns into **per-repo improvement suggestions** (top 3, cross-flavor first),
 attaches a recommendation from the Pattern Catalog where one exists, and bundles a
 fleet measure snapshot for context. Pure function over digests — the entrypoint
 (T03) handles store/publish.
 """
 from __future__ import annotations
 import collections
 from dataclasses import asdict, dataclass
 from datetime import datetime, timedelta, timezone
 from typing import Optional
 from ..detect.cluster import cluster
 from ..detect.quality import QualityConfig, filter_real
 from ..detect.signals import extract_signals
 from ..measure.metrics import aggregate
 # score at/above which a suggestion is "high" priority even when single-flavor
 _HIGH_SCORE = 100.0
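 def _parse(ts: str) -> datetime:
    dt = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    # Treat naive input (e.g. a bare ``--since 2026-06-01``) as UTC so it can be
    # compared with the timezone-aware window bounds without a TypeError.
    return dt if dt.tzinfo else dt.replace(tzinfo=timezone.utc)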
 def _iso(dt: datetime) -> str:
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
 def _now() -> datetime:
    return datetime.now(timezone.utc)
 @dataclass
 class Suggestion:
    repo: str
    title: str
    recommendation: str
    priority: str          # high | medium
    score: float
    signal_type: str
    cross_flavor: bool
    pattern_key: str
 def _recommendation(pattern_key: str, locus: str, catalog) -> Optional[str]:
    if catalog is None:
        return None
    sp = catalog.find_for(pattern_key, locus)
    if sp and sp.resolutions:
        return sp.resolutions[0].summary
    return None
 def weekly_retro(digests: list[dict], catalog=None, *, since: Optional[str] = None,
                 until: Optional[str] = None, window_days: int = 7,
                 max_per_repo: int = 3, min_frequency: int = 2,
                 quality: Optional[QualityConfig] = None) -> dict:
    """Build the ranked weekly retro report over a date window."""
    until_dt = _parse(until) if until else _now()
    since_dt = _parse(since) if since else until_dt - timedelta(days=window_days)
    windowed = [d for d in digests
                if d.get("started_at") and since_dt <= _parse(d["started_at"]) < until_dt]
    real = filter_real(windowed, quality or QualityConfig())
    patterns = cluster(extract_signals(real), min_frequency=min_frequency)
    by_repo: dict[str, list[Suggestion]] = collections.defaultdict(list)
    for p in patterns:
        if p.polarity != "problem":
            continue  # improvements come from problems
        rec = (_recommendation(p.key, p.locus, catalog)
               or f"Investigate {p.signal_type.replace('_', ' ')} on {p.locus}")
        priority = "high" if (p.cross_flavor or p.score >= _HIGH_SCORE) else "medium"
        for repo in (p.repos or ["(unknown)"]):
            by_repo[repo].append(Suggestion(
                repo=repo, title=p.title, recommendation=rec, priority=priority,
                score=p.score, signal_type=p.signal_type, cross_flavor=p.cross_flavor,
                pattern_key=p.key))
    suggestions: list[Suggestion] = []
    for repo in sorted(by_repo):
        items = sorted(by_repo[repo], key=lambda s: -s.score)
        suggestions.extend(items[:max_per_repo])
    # cross-flavor first, then by score (global ordering for the report)
    suggestions.sort(key=lambda s: (not s.cross_flavor, -s.score))
    return {
        # Report the actual span so explicit --since/--until values are reflected too.
        "window": {"since": _iso(since_dt), "until": _iso(until_dt),
                   "days": (until_dt - since_dt).days},
        "generated_at": _iso(_now()),
        "n_sessions": len(real),
        "suggestions": [asdict(s) for s in suggestions],
        "measure": aggregate(real),
    }
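The two-stage ordering in `weekly_retro` (cap each repo at its top N by score, then sort the survivors cross-flavor first) can be sketched on plain dicts. This is an illustrative stand-in under stated assumptions: the `rank` helper, repo names and scores are hypothetical, and the real function works on `Suggestion` objects produced by the detect pipeline.

```python
from collections import defaultdict


def rank(suggestions, max_per_repo=3):
    """Keep the top-scoring items per repo, then order globally."""
    by_repo = defaultdict(list)
    for s in suggestions:
        by_repo[s["repo"]].append(s)

    kept = []
    for repo in sorted(by_repo):
        # Stage 1: per-repo cut, by score only.
        kept.extend(sorted(by_repo[repo], key=lambda s: -s["score"])[:max_per_repo])

    # Stage 2: cross-flavor items first, then descending score.
    kept.sort(key=lambda s: (not s["cross_flavor"], -s["score"]))
    return kept


sample = [
    {"repo": "alpha", "score": 500.0, "cross_flavor": False},
    {"repo": "alpha", "score": 40.0, "cross_flavor": False},
    {"repo": "alpha", "score": 30.0, "cross_flavor": False},
    {"repo": "alpha", "score": 20.0, "cross_flavor": True},
    {"repo": "beta", "score": 54.0, "cross_flavor": True},
]
ranked = rank(sample)
order = [(s["repo"], s["score"]) for s in ranked]
print(order)
```

In this sketch the per-repo cut ignores `cross_flavor`, so the low-scoring cross-flavor item for `alpha` is dropped before the global sort runs. If cross-flavor patterns should always survive the cut, the stage-1 key would need the same `(not cross_flavor, -score)` tuple.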
--- a/session_memory/retro/last_retro.json
+++ b/session_memory/retro/last_retro.json
@@ -0,0 +1,322 @@
 {
  "generated_at": "2026-06-07T19:30:56Z",
  "measure": {
    "error_rate": 0.957,
    "infra_overhead_share_median": 0.167,
    "infra_overhead_share_p90": 0.23,
    "n_sessions": 23,
    "recurring_error_occurrences": 463,
    "schema_thrash_sessions": 7,
    "success_rate": 1.0,
    "tokens_p50": 250725,
    "tokens_p90": 901422
  },
  "n_sessions": 23,
  "suggestions": [
    {
      "cross_flavor": true,
      "pattern_key": "problem:recurring_error:make: *** [makefile:<n>: fix-consistency] error <n>",
      "priority": "high",
      "recommendation": "Investigate recurring error on make: *** [makefile:<n>: fix-consistency] error <n>",
      "repo": "net-kingdom",
      "score": 54.0,
      "signal_type": "recurring_error",
      "title": "cross-flavor problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "activity-core",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "artifact-store",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "citation-evidence",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "infospace-bench",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "railiance-apps",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:tool_thrash:tool:Bash",
      "priority": "high",
      "recommendation": "Batch related shell work into one script, not many small Bash calls",
      "repo": "state-hub",
      "score": 13128.0,
      "signal_type": "tool_thrash",
      "title": "problem: tool thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "activity-core",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "citation-evidence",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "flex-auth",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "infospace-bench",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:schema_thrash:schema_load",
      "priority": "high",
      "recommendation": "Load the tool schemas you'll need once, up front",
      "repo": "ops-bridge",
      "score": 441.0,
      "signal_type": "schema_thrash",
      "title": "problem: schema thrash"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "activity-core",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "citation-evidence",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "infospace-bench",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "issue-facade",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "railiance-apps",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "state-hub",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "the-custodian",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has not been read yet. read it first before writing to it.<<path>>",
      "priority": "high",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "vergabe-teilnahme",
      "score": 290.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has been modified since read, either by the user or by a linter. read it again before attempting to write it.<<path>>",
      "priority": "medium",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "artifact-store",
      "score": 78.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has been modified since read, either by the user or by a linter. read it again before attempting to write it.<<path>>",
      "priority": "medium",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "issue-facade",
      "score": 78.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has been modified since read, either by the user or by a linter. read it again before attempting to write it.<<path>>",
      "priority": "medium",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "railiance-apps",
      "score": 78.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<tool_use_error>file has been modified since read, either by the user or by a linter. read it again before attempting to write it.<<path>>",
      "priority": "medium",
      "recommendation": "Read the file (or the region you'll touch) before Edit/Write",
      "repo": "state-hub",
      "score": 78.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:budget_overrun:tokens",
      "priority": "medium",
      "recommendation": "Read narrowly \u2014 target the region you need, not whole large files",
      "repo": "artifact-store",
      "score": 50.55,
      "signal_type": "budget_overrun",
      "title": "problem: budget overrun"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:{",
      "priority": "medium",
      "recommendation": "Investigate recurring error on {",
      "repo": "vergabe-teilnahme",
      "score": 12.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:found <n> errors (<n> fixed, <n> remaining).",
      "priority": "medium",
      "recommendation": "Investigate recurring error on found <n> errors (<n> fixed, <n> remaining).",
      "repo": "ops-bridge",
      "score": 10.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:(note: edit also tried swapping \\uxxxx escapes and their characters; neither form matched, so the mismatch is likely elsewhere in old_string. re-read the file a",
      "priority": "medium",
      "recommendation": "Investigate recurring error on (note: edit also tried swapping \\uxxxx escapes and their characters; neither form matched, so the mismatch is likely elsewhere in old_string. re-read the file a",
      "repo": "net-kingdom",
      "score": 6.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:found <n> error (<n> fixed, <n> remaining).",
      "priority": "medium",
      "recommendation": "Investigate recurring error on found <n> error (<n> fixed, <n> remaining).",
      "repo": "ops-bridge",
      "score": 6.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    },
    {
      "cross_flavor": false,
      "pattern_key": "problem:recurring_error:<n> failed, <n> passed in <n>.00s",
      "priority": "medium",
      "recommendation": "Investigate recurring error on <n> failed, <n> passed in <n>.00s",
      "repo": "agentic-resources",
      "score": 4.0,
      "signal_type": "recurring_error",
      "title": "problem: recurring error"
    }
  ],
  "window": {
    "days": 30,
    "since": "2026-05-08T19:30:56Z",
    "until": "2026-06-07T19:30:56Z"
  }
 }
--- a/session_memory/retro/last_retro.md
+++ b/session_memory/retro/last_retro.md
@@ -0,0 +1,39 @@
 # Weekly Coding Retro  (2026-05-08 → 2026-06-07)
 _23 real sessions · generated 2026-06-07T19:30:56Z_
 ## Top improvement suggestions (cross-flavor first, ≤3 per repo)
 - **net-kingdom** (high, score=54.0) [CROSS-FLAVOR]: cross-flavor problem: recurring error — Investigate recurring error on make: *** [makefile:<n>: fix-consistency] error <n>
 - **activity-core** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **artifact-store** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **citation-evidence** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **infospace-bench** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **railiance-apps** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **state-hub** (high, score=13128.0): problem: tool thrash — Batch related shell work into one script, not many small Bash calls
 - **activity-core** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **citation-evidence** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **flex-auth** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **infospace-bench** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **ops-bridge** (high, score=441.0): problem: schema thrash — Load the tool schemas you'll need once, up front
 - **activity-core** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **citation-evidence** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **infospace-bench** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **issue-facade** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **railiance-apps** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **state-hub** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **the-custodian** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **vergabe-teilnahme** (high, score=290.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **artifact-store** (medium, score=78.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **issue-facade** (medium, score=78.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **railiance-apps** (medium, score=78.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **state-hub** (medium, score=78.0): problem: recurring error — Read the file (or the region you'll touch) before Edit/Write
 - **artifact-store** (medium, score=50.55): problem: budget overrun — Read narrowly — target the region you need, not whole large files
 - **vergabe-teilnahme** (medium, score=12.0): problem: recurring error — Investigate recurring error on {
 - **ops-bridge** (medium, score=10.0): problem: recurring error — Investigate recurring error on found <n> errors (<n> fixed, <n> remaining).
 - **net-kingdom** (medium, score=6.0): problem: recurring error — Investigate recurring error on (note: edit also tried swapping \uxxxx escapes and their characters; neither form matched, so the mismatch is likely elsewhere in old_string. re-read the file a
 - **ops-bridge** (medium, score=6.0): problem: recurring error — Investigate recurring error on found <n> error (<n> fixed, <n> remaining).
 - **agentic-resources** (medium, score=4.0): problem: recurring error — Investigate recurring error on <n> failed, <n> passed in <n>.00s
 ## Fleet snapshot
 - infra-overhead median: 0.167
 - error rate: 0.957  ·  schema-thrash: 7
 - success rate: 1.0  ·  tokens p50: 250725
--- a/session_memory/retro/publish.py
+++ b/session_memory/retro/publish.py
@@ -0,0 +1,78 @@
 """Publish the weekly retro (AGENTIC-WP-0010 T02).
 The retro is published to the State Hub as a **read model** — a progress event of
 ``event_type=coding_retro`` whose ``detail`` carries the structured report. This is
 exactly how ``daily-triage-report`` surfaces, and it is what activity-core's
 ``coding_retro`` resolver (ACTIVITY-WP-0008) reads. A local JSON + markdown report
 is always written; the hub publish is best-effort and **degrades gracefully** when
 the hub is unreachable.
 """
 from __future__ import annotations
 import json
 import os
 import urllib.request
 from typing import Callable, Optional
 DEFAULT_HUB = "http://127.0.0.1:8000"
 def render_markdown(report: dict) -> str:
    w = report.get("window", {})
    lines = [
        f"# Weekly Coding Retro  ({w.get('since', '')[:10]} → {w.get('until', '')[:10]})",
        f"_{report.get('n_sessions', 0)} real sessions · generated {report.get('generated_at', '')}_",
        "",
        "## Top improvement suggestions (cross-flavor first, ≤3 per repo)",
    ]
    if not report.get("suggestions"):
        lines.append("- (no recurring problems above threshold this week)")
    for s in report.get("suggestions", []):
        flag = " [CROSS-FLAVOR]" if s.get("cross_flavor") else ""
        lines.append(f"- **{s['repo']}** ({s['priority']}, score={s['score']}){flag}: "
                     f"{s['title']} — {s['recommendation']}")
    m = report.get("measure", {})
    lines += ["", "## Fleet snapshot",
              f"- infra-overhead median: {m.get('infra_overhead_share_median')}",
              f"- error rate: {m.get('error_rate')}  ·  schema-thrash: {m.get('schema_thrash_sessions')}",
              f"- success rate: {m.get('success_rate')}  ·  tokens p50: {m.get('tokens_p50')}"]
    return "\n".join(lines)
 def write_local(report: dict, json_path: str, md_path: Optional[str] = None) -> None:
    os.makedirs(os.path.dirname(json_path) or ".", exist_ok=True)
    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump(report, fh, indent=2, sort_keys=True)
        fh.write("\n")
    if md_path:
        with open(md_path, "w", encoding="utf-8") as fh:
            fh.write(render_markdown(report))
            fh.write("\n")
 def _http_post(url: str, payload: dict) -> None:
    req = urllib.request.Request(url, data=json.dumps(payload).encode(),
                                 headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req, timeout=10) as r:
        r.read()
 def publish_to_hub(report: dict, *, base_url: str = DEFAULT_HUB,
                   poster: Optional[Callable[[str, dict], None]] = None) -> bool:
    """POST the retro as an event_type=coding_retro progress event. Best-effort."""
    poster = poster or _http_post
    n = report.get("n_sessions", 0)
    k = len(report.get("suggestions", []))
    payload = {
        "event_type": "coding_retro",
        "author": "helix-forge",
        "summary": f"Weekly coding retro: {k} ranked suggestions across "
                   f"{report.get('window', {}).get('days', 7)} days ({n} sessions).",
        "detail": report,
    }
    try:
        poster(f"{base_url.rstrip('/')}/progress/", payload)
        return True
    except Exception:
        return False
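
The best-effort contract of `publish_to_hub` (return `True` on success, `False` on any transport failure) is easiest to exercise by injecting a `poster`. The sketch below is standalone and does not import the module: `publish_best_effort`, `recording_poster`, and `offline_poster` are hypothetical names that mirror the pattern, not the repo's API, and the payload is a trimmed-down illustration.

```python
from typing import Callable, List, Tuple


def publish_best_effort(payload: dict, url: str,
                        poster: Callable[[str, dict], None]) -> bool:
    """Hypothetical stand-in: True if the poster succeeds, False on any exception."""
    try:
        poster(url, payload)
        return True
    except Exception:
        # Degrade gracefully: the caller still has the local JSON/markdown report.
        return False


captured: List[Tuple[str, dict]] = []


def recording_poster(url: str, payload: dict) -> None:
    """Fake transport that records the call instead of doing HTTP."""
    captured.append((url, payload))


def offline_poster(url: str, payload: dict) -> None:
    """Fake transport that simulates an unreachable hub."""
    raise ConnectionError("hub unreachable")


# Illustrative payload only; the real one carries the full report in "detail".
payload = {"event_type": "coding_retro", "summary": "demo", "detail": {"n_sessions": 0}}
url = "http://127.0.0.1:8000".rstrip("/") + "/progress/"

ok = publish_best_effort(payload, url, recording_poster)
down = publish_best_effort(payload, url, offline_poster)
```

Because the transport is a plain callable, a test can cover both the success and the hub-down path without opening a socket.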
--- a/tests/test_catalog_covers.py
+++ b/tests/test_catalog_covers.py
@@ -0,0 +1,62 @@
 """find_for / covers tests (AGENTIC-WP-0010 follow-up)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.schema import (  # noqa: E402
    Provenance,
    Resolution,
    SolutionPattern,
 )
 def _pattern(pid, src, covers=None, name="P"):
    return SolutionPattern(
        id=pid, name=name, version="1.0.0", polarity="problem", problem="p",
        resolutions=[Resolution(summary="do x")],
        provenance=Provenance(source_key=src), covers=covers or [])
 def test_covers_round_trips(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern("sp-a", "problem:file_not_read:edit",
                        covers=["file has not been read"]))
    assert cat.load("sp-a").covers == ["file has not been read"]
 def test_find_for_exact_key(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern(SolutionPattern.make_id("problem:retry_storm:retries"),
                        "problem:retry_storm:retries"))
    got = cat.find_for("problem:retry_storm:retries")
    assert got is not None and got.id == "sp-problem-retry_storm-retries"
 def test_find_for_covers_match(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern("sp-rbe", "problem:file_not_read:edit",
                        covers=["file has not been read", "modified since read"]))
    # a recurring_error signal with a different key but matching fingerprint locus
    got = cat.find_for(
        "problem:recurring_error:<tool_use_error>file has not been read yet...",
        locus="<tool_use_error>file has not been read yet. read it first...")
    assert got is not None and got.id == "sp-rbe"
 def test_find_for_no_match_returns_none(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern("sp-rbe", "problem:file_not_read:edit",
                        covers=["file has not been read"]))
    assert cat.find_for("problem:recurring_error:some unrelated error") is None
 def test_covers_change_versions(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern("sp-a", "problem:x:y"))
    p = cat.load("sp-a")
    p.covers = ["new coverage"]
    assert cat.upsert(p) == "versioned"  # covers is substantive content
    assert cat.load("sp-a").version == "1.0.1"
--- a/tests/test_cluster.py
+++ b/tests/test_cluster.py
@@ -0,0 +1,54 @@
 """Clusterer + evidence + cross-flavor tests (T05/T06)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.detect.cluster import cluster  # noqa: E402
 from session_memory.detect.signals import PROBLEM, SUCCESS, Signal  # noqa: E402
 def _sig(uid, flavor, repo, type_, polarity, locus, mag=1.0):
    return Signal(session_uid=uid, flavor=flavor, repo=repo, type=type_,
                  polarity=polarity, locus=locus, magnitude=mag)
 def test_min_frequency_filters_singletons():
    sigs = [_sig("claude:a", "claude", "r1", "retry_storm", PROBLEM, "retries")]
    assert cluster(sigs, min_frequency=2) == []
 def test_clusters_recurring_signal_with_evidence():
    sigs = [
        _sig("claude:a", "claude", "r1", "retry_storm", PROBLEM, "retries", 5),
        _sig("claude:b", "claude", "r2", "retry_storm", PROBLEM, "retries", 3),
    ]
    pats = cluster(sigs, min_frequency=2)
    assert len(pats) == 1
    p = pats[0]
    assert p.frequency == 2
    assert p.sessions == ["claude:a", "claude:b"]
    assert sorted(p.repos) == ["r1", "r2"]
    assert p.flavors == ["claude"]
    assert p.cross_flavor is False
    assert p.cost_impact == 8.0
 def test_cross_flavor_flagged_and_ranked_first():
    sigs = [
        # cross-flavor problem (claude + codex)
        _sig("claude:a", "claude", "r1", "repeated_errors", PROBLEM, "errors", 3),
        _sig("codex:b", "codex", "r2", "repeated_errors", PROBLEM, "errors", 3),
        # single-flavor success cluster with higher raw impact
        _sig("grok:c", "grok", "r3", "clean_pass", SUCCESS, "outcome", 5),
        _sig("grok:d", "grok", "r4", "clean_pass", SUCCESS, "outcome", 5),
    ]
    pats = cluster(sigs, min_frequency=2)
    assert len(pats) == 2
    xf = next(p for p in pats if p.signal_type == "repeated_errors")
    assert xf.cross_flavor is True
    assert sorted(xf.flavors) == ["claude", "codex"]
    # cross-flavor pattern is ranked first even if another has higher raw impact
    assert pats[0].cross_flavor is True
    assert "cross-flavor" in pats[0].title
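
The ranking these tests pin down (drop groups below `min_frequency`, then put cross-flavor clusters ahead of higher-impact single-flavor ones) can be sketched as follows. This is a minimal illustration under assumptions, not the `cluster` implementation: `Sig`, `cluster_sketch`, the grouping key, and the dict output shape are all hypothetical, and frequency is assumed to be the number of signals in a group.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Sig:
    """Hypothetical, trimmed-down signal record for this sketch."""
    session_uid: str
    flavor: str
    type: str
    polarity: str
    locus: str
    magnitude: float = 1.0


def cluster_sketch(signals, min_frequency=2):
    # Group signals that share polarity, type, and locus (assumed grouping key).
    groups = defaultdict(list)
    for s in signals:
        groups[(s.polarity, s.type, s.locus)].append(s)

    patterns = []
    for (polarity, type_, locus), members in groups.items():
        if len(members) < min_frequency:
            continue  # singletons are not recurring
        flavors = sorted({m.flavor for m in members})
        patterns.append({
            "key": f"{polarity}:{type_}:{locus}",
            "frequency": len(members),
            "flavors": flavors,
            "cross_flavor": len(flavors) > 1,
            "cost_impact": sum(m.magnitude for m in members),
        })

    # Cross-flavor first, then by descending impact.
    patterns.sort(key=lambda p: (not p["cross_flavor"], -p["cost_impact"]))
    return patterns


sigs = [
    Sig("claude:a", "claude", "repeated_errors", "problem", "errors", 3),
    Sig("codex:b", "codex", "repeated_errors", "problem", "errors", 3),
    Sig("grok:c", "grok", "clean_pass", "success", "outcome", 5),
    Sig("grok:d", "grok", "clean_pass", "success", "outcome", 5),
]
pats = cluster_sketch(sigs, min_frequency=2)
```

Sorting on the tuple `(not cross_flavor, -cost_impact)` keeps the cross-flavor preference strict: impact only breaks ties within the same cross-flavor class.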
--- a/tests/test_codex_adapter.py
+++ b/tests/test_codex_adapter.py
@@ -0,0 +1,86 @@
 """Codex adapter tests (T01): synthetic rollout fixture."""
 import json
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.adapters.codex import parse_session  # noqa: E402
 REPO_MAP = {"agentic-resources": "helix_forge"}
 def _rollout(path, lines):
    with open(path, "w", encoding="utf-8") as f:
        for ln in lines:
            f.write(json.dumps(ln) + "\n")
 def test_codex_rollout_parse(tmp_path):
    p = tmp_path / "rollout-2026-06-06-abc.jsonl"
    _rollout(p, [
        {"timestamp": "2026-06-06T10:00:00Z", "type": "session_meta",
         "payload": {"id": "cdx-1", "cwd": "/home/worsch/agentic-resources",
                     "model_provider": "openai", "cli_version": "0.44.0", "model": "gpt-5-codex"}},
        {"timestamp": "2026-06-06T10:00:01Z", "type": "turn_context",
         "payload": {"model": "gpt-5-codex", "approval_policy": "on-request"}},
        {"timestamp": "2026-06-06T10:00:02Z", "type": "event_msg",
         "payload": {"type": "task_started"}},
        {"timestamp": "2026-06-06T10:00:03Z", "type": "response_item",
         "payload": {"type": "message", "role": "user",
                     "content": [{"type": "input_text", "text": "fix the bug"}]}},
        {"timestamp": "2026-06-06T10:00:04Z", "type": "response_item",
         "payload": {"type": "reasoning", "summary": "think about it"}},
        {"timestamp": "2026-06-06T10:00:05Z", "type": "response_item",
         "payload": {"type": "function_call", "name": "apply_patch",
                     "arguments": "{\"path\":\"x.py\"}", "call_id": "call_1"}},
        {"timestamp": "2026-06-06T10:00:06Z", "type": "response_item",
         "payload": {"type": "function_call", "name": "shell",
                     "arguments": "{\"command\":\"pytest -q\"}", "call_id": "call_2"}},
        {"timestamp": "2026-06-06T10:00:07Z", "type": "response_item",
         "payload": {"type": "function_call_output", "call_id": "call_2", "output": "2 passed"}},
        {"timestamp": "2026-06-06T10:00:08Z", "type": "response_item",
         "payload": {"type": "message", "role": "assistant",
                     "content": [{"type": "output_text", "text": "done"}]}},
        {"timestamp": "2026-06-06T10:00:09Z", "type": "event_msg",
         "payload": {"type": "token_count",
                     "info": {"total_token_usage": {"input_tokens": 200, "output_tokens": 30,
                                                    "cached_input_tokens": 15}}}},
        {"timestamp": "2026-06-06T10:00:10Z", "type": "event_msg",
         "payload": {"type": "task_complete"}},
    ])
    norm = parse_session(str(p), REPO_MAP)
    assert norm is not None
    s = norm.session
    assert s.session_uid == "codex:cdx-1"
    assert s.flavor == "codex"
    assert s.repo == "agentic-resources" and s.domain == "helix_forge"
    assert s.model == "gpt-5-codex"
    assert s.cost.input_tokens == 200 and s.cost.output_tokens == 30 and s.cost.cache_tokens == 15
    assert s.cost.turns == 1
    assert s.cost.wall_clock_s == 10.0
    kinds = [e.kind for e in norm.events]
    assert kinds == ["lifecycle", "user_msg", "thinking", "edit", "test_run",
                     "tool_result", "assistant_msg", "completion"]
    # flat linkage: function_call_output links to its function_call by call_id
    out = next(e for e in norm.events if e.kind == "tool_result")
    test_call = next(e for e in norm.events if e.kind == "test_run")
    assert out.parent_seq == test_call.seq
    # apply_patch classified as edit; pytest as test_run
    edit = next(e for e in norm.events if e.kind == "edit")
    assert edit.tool == "apply_patch"
 def test_codex_empty_or_no_meta_returns_none(tmp_path):
    p = tmp_path / "rollout-empty.jsonl"
    p.write_text("")
    assert parse_session(str(p), REPO_MAP) is None
    p2 = tmp_path / "rollout-nometa.jsonl"
    _rollout(p2, [{"timestamp": "t", "type": "event_msg", "payload": {"type": "task_started"}}])
    assert parse_session(str(p2), REPO_MAP) is None  # no session_meta -> no id
--- a/tests/test_curate_catalog.py
+++ b/tests/test_curate_catalog.py
@@ -0,0 +1,86 @@
 """Versioned Pattern Catalog tests (T02): round-trip, dedup, idempotent upsert."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import (  # noqa: E402
    ADDED,
    UNCHANGED,
    UPDATED,
    VERSIONED,
    Catalog,
 )
 from session_memory.curate.schema import (  # noqa: E402
    Provenance,
    Resolution,
    Scope,
    SolutionPattern,
 )
 def _pattern(src="success:clean_pass:outcome", problem="ran tests, clean finish"):
    return SolutionPattern(
        id=SolutionPattern.make_id(src),
        name="Run tests before declaring success",
        version="1.0.0",
        polarity="success",
        problem=problem,
        resolutions=[Resolution(summary="run the suite")],
        scope=Scope(flavors=["claude", "grok"]),
        provenance=Provenance(source_key=src, evidence={"frequency": 18}),
    )
 def test_add_then_load_round_trips(tmp_path):
    cat = Catalog(str(tmp_path))
    assert cat.upsert(_pattern()) == ADDED
    loaded = cat.load(SolutionPattern.make_id("success:clean_pass:outcome"))
    assert loaded is not None
    assert loaded.problem == "ran tests, clean finish"
    assert loaded.created_at and loaded.updated_at
    assert [p.id for p in cat.list()] == [loaded.id]
 def test_resave_identical_is_noop(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern())
    assert cat.upsert(_pattern()) == UNCHANGED
    # version not bumped, no history written
    assert cat.load(_pattern().id).version == "1.0.0"
    assert cat.history(_pattern().id) == []
 def test_dedup_on_source_key(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern())
    cat.upsert(_pattern())  # same source key -> same id -> one file
    assert len(cat.list()) == 1
 def test_content_change_bumps_version_and_archives(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern())
    assert cat.upsert(_pattern(problem="now with more nuance")) == VERSIONED
    current = cat.load(_pattern().id)
    assert current.version == "1.0.1"
    assert current.problem == "now with more nuance"
    hist = cat.history(_pattern().id)
    assert len(hist) == 1
    assert hist[0]["version"] == "1.0.0"
    assert hist[0]["status"] == "superseded"
 def test_status_only_change_updates_without_bump(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(_pattern())
    p = _pattern()
    p.status = "approved"
    p.distribution_ready = True
    assert cat.upsert(p) == UPDATED
    current = cat.load(p.id)
    assert current.status == "approved"
    assert current.distribution_ready is True
    assert current.version == "1.0.0"  # metadata change, no bump
    assert cat.history(p.id) == []
--- a/tests/test_curate_decisions.py
+++ b/tests/test_curate_decisions.py
@@ -0,0 +1,70 @@
 """Hub decision integration tests (T05): payload shape + graceful queue/flush."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.decisions import DecisionRecorder, build_decision  # noqa: E402
 from session_memory.curate.review import APPROVE, REJECT, ReviewLog, review  # noqa: E402
 def _candidate(key="success:clean_pass:outcome"):
    return {"key": key, "frequency": 18, "sessions": ["a", "b"],
            "cost_impact": 9.0, "cross_flavor": True, "flavors": ["claude", "grok"]}
 def test_build_decision_payload_shape():
    d = build_decision(_candidate(), "approve", "looks solid", workstream_id="ws-1")
    assert d["decision_type"] == "made"
    assert d["workstream_id"] == "ws-1"
    assert "Promote" in d["title"]
    assert d["rationale"] == "looks solid"
    assert "success:clean_pass:outcome" in d["description"]
 def test_sink_accepts_decision(tmp_path):
    captured = []
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"), sink=captured.append)
    assert rec.record(_candidate(), "approve", "ok") is True
    assert rec.pending() == []
    assert len(captured) == 1
 def test_queues_when_sink_down(tmp_path):
    def boom(_):
        raise RuntimeError("hub down")
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"), sink=boom)
    assert rec.record(_candidate(), "reject", "noise") is False
    assert len(rec.pending()) == 1
 def test_no_sink_defaults_to_queue(tmp_path):
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"))
    rec.record(_candidate(), "approve", "ok")
    assert len(rec.pending()) == 1
 def test_flush_replays_queue(tmp_path):
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"))  # offline -> queue
    rec.record(_candidate("problem:abandoned:outcome"), "reject", "x")
    rec.record(_candidate("success:clean_pass:outcome"), "approve", "y")
    captured = []
    assert rec.flush(sink=captured.append) == 2
    assert rec.pending() == []
    assert len(captured) == 2
 def test_review_records_each_final_decision(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log = ReviewLog(str(tmp_path / "reviews.jsonl"))
    captured = []
    rec = DecisionRecorder(str(tmp_path / "q.jsonl"), sink=captured.append, workstream_id="ws")
    cands = [_candidate("success:clean_pass:outcome"), _candidate("problem:abandoned:outcome")]
    review(cands, lambda c: (APPROVE if "success" in c["key"] else REJECT, "r"), cat, log,
           recorder=rec)
    assert len(captured) == 2
    actions = sorted("Promote" in d["title"] for d in captured)
    assert actions == [False, True]
--- a/tests/test_curate_entrypoint.py
+++ b/tests/test_curate_entrypoint.py
@@ -0,0 +1,84 @@
 """Curate entrypoint tests (T06): batch auto-approve end-to-end via the store."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.store import Store  # noqa: E402
 from session_memory.curate.__main__ import main  # noqa: E402
 from session_memory.curate.catalog import Catalog  # noqa: E402
 def _digest(uid, flavor, repo, **markers):
    return {
        "session_uid": uid, "flavor": flavor, "repo": repo, "outcome": "fail",
        "cost": {"input_tokens": 10, "output_tokens": 1},
        "markers": {"errors": markers.get("errors", 0), "retries": markers.get("retries", 0),
                    "test_runs": 0, "edits": 0, "human_interventions": 0},
        # real coding session per the quality filter (WP-0005 T01)
        "event_count": 40, "first_prompt": "Fix the failing build and retry the suite",
        "tool_histogram": {"Bash": 20, "Edit": 12, "Read": 8},
    }
 def _write_config(tmp_path) -> tuple[str, str, str]:
    store = tmp_path / ".store"
    catalog = tmp_path / "catalog"
    cfg = f"""
 [store]
 db_path = "{store / 'm.db'}"
 blob_dir = "{store / 'blobs'}"
 cursor = "{store / 'c.json'}"
 [curate]
 catalog_dir = "{catalog}"
 review_log = "{store / 'reviews.jsonl'}"
 decision_queue = "{store / 'decisions.queue.jsonl'}"
 [curate.gate]
 min_frequency = 2
 min_sessions = 2
 """
    path = tmp_path / "config.toml"
    path.write_text(cfg)
    return str(path), str(store), str(catalog)
 def test_auto_approve_promotes_cross_flavor(tmp_path, capsys):
    cfg_path, store_dir, catalog_dir = _write_config(tmp_path)
    st = Store(os.path.join(store_dir, "m.db"), os.path.join(store_dir, "blobs"))
    st.write_digest("claude:a", _digest("claude:a", "claude", "r1", retries=5))
    st.write_digest("codex:b", _digest("codex:b", "codex", "r2", retries=4))
    st.close()
    rc = main(["--config", cfg_path, "--auto-approve"])
    assert rc == 0
    cat = Catalog(catalog_dir)
    patterns = cat.list()
    assert len(patterns) == 1
    assert patterns[0].polarity == "problem"
    # clears the promote floor (freq>=2) but below the default distribution
    # floor (freq>=3) -> promoted as provisional, not distribution-ready
    assert patterns[0].status == "provisional"
    assert patterns[0].distribution_ready is False
    out = capsys.readouterr().out
    assert "Curate summary" in out
    # hub offline in tests -> decision queued
    assert "decisions queued" in out
 def test_rerun_is_idempotent(tmp_path):
    cfg_path, store_dir, catalog_dir = _write_config(tmp_path)
    st = Store(os.path.join(store_dir, "m.db"), os.path.join(store_dir, "blobs"))
    st.write_digest("claude:a", _digest("claude:a", "claude", "r1", retries=5))
    st.write_digest("codex:b", _digest("codex:b", "codex", "r2", retries=4))
    st.close()
    main(["--config", cfg_path, "--auto-approve"])
    main(["--config", cfg_path, "--auto-approve"])  # second pass: already decided
    cat = Catalog(catalog_dir)
    assert len(cat.list()) == 1
    assert cat.load(cat.list()[0].id).version == "1.0.0"  # no spurious bump
--- a/tests/test_curate_gating.py
+++ b/tests/test_curate_gating.py
@@ -0,0 +1,76 @@
 """Evidence-bar + bloat-guard tests (T04)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.gating import (  # noqa: E402
    GateConfig,
    bloat_warnings,
    evaluate,
    gate_config,
 )
 from session_memory.curate.review import candidate_to_pattern  # noqa: E402
 def _candidate(key="success:clean_pass:outcome", freq=5, sessions=5, impact=10.0,
               cross=True, flavors=("claude", "grok")):
    return {
        "key": key,
        "frequency": freq,
        "sessions": [f"s{i}" for i in range(sessions)],
        "cost_impact": impact,
        "cross_flavor": cross,
        "flavors": list(flavors),
    }
 def test_clears_bar_and_distribution_ready():
    r = evaluate(_candidate(), GateConfig(dist_min_frequency=3))
    assert r.promotable and r.distribution_ready
    assert r.status == "approved"
 def test_thin_candidate_promotable_but_provisional():
    # meets promote floor (freq>=2) but below distribution floor (freq<3)
    r = evaluate(_candidate(freq=2, sessions=2), GateConfig(dist_min_frequency=3))
    assert r.promotable
    assert not r.distribution_ready
    assert r.status == "provisional"
 def test_below_promote_floor_not_promotable():
    r = evaluate(_candidate(freq=1, sessions=1))
    assert not r.promotable
    assert any("frequency" in reason for reason in r.reasons)
 def test_cross_flavor_required_for_distribution():
    r = evaluate(_candidate(cross=False), GateConfig(dist_require_cross_flavor=True))
    assert r.promotable
    assert not r.distribution_ready
    assert any("cross-flavor" in reason for reason in r.reasons)
 def test_gate_config_reads_toml_dict():
    cfg = gate_config({"curate": {"gate": {"min_frequency": 9, "dist_require_cross_flavor": True}}})
    assert cfg.min_frequency == 9
    assert cfg.dist_require_cross_flavor is True
    # defaults preserved for unspecified keys
    assert cfg.dist_min_frequency == 3
 def test_bloat_flags_duplicate_and_near_duplicate(tmp_path):
    cat = Catalog(str(tmp_path))
    cat.upsert(candidate_to_pattern(_candidate(key="success:clean_pass:outcome")))
    existing = cat.list()
    # exact same key -> duplicate
    dup = bloat_warnings(_candidate(key="success:clean_pass:outcome"), existing)
    assert any("duplicate" in w for w in dup)
    # different polarity, same signal_type+locus -> near-duplicate
    near = bloat_warnings(_candidate(key="problem:clean_pass:outcome"), existing)
    assert any("near-duplicate" in w for w in near)
    # unrelated -> no warnings
    assert bloat_warnings(_candidate(key="problem:retry_storm:retries"), existing) == []
--- a/tests/test_curate_review.py
+++ b/tests/test_curate_review.py
@@ -0,0 +1,93 @@
 """Review workflow tests (T03): promote/reject/discuss + idempotent re-review."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.review import (  # noqa: E402
    APPROVE,
    DISCUSS,
    REJECT,
    ReviewLog,
    candidate_to_pattern,
    review,
 )
 from session_memory.curate.schema import SolutionPattern  # noqa: E402
 def _candidate(key="success:clean_pass:outcome", freq=18, flavors=("claude", "grok")):
    return {
        "key": key,
        "polarity": key.split(":")[0],
        "signal_type": key.split(":")[1],
        "locus": key.split(":")[2],
        "title": "cross-flavor success: clean pass",
        "frequency": freq,
        "flavors": list(flavors),
        "repos": ["agentic-resources"],
        "sessions": [f"s{i}" for i in range(freq)],
        "cross_flavor": len(flavors) > 1,
        "cost_impact": 12.5,
    }
 def _decider(action, rationale="because"):
    return lambda cand: (action, rationale)
 def test_approve_promotes_to_catalog(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log = ReviewLog(str(tmp_path / "reviews.jsonl"))
    res = review([_candidate()], _decider(APPROVE), cat, log)
    assert len(res.approved) == 1
    p = cat.load(SolutionPattern.make_id("success:clean_pass:outcome"))
    assert p is not None
    assert p.scope.flavors == ["claude", "grok"]
    assert set(p.rendering_hints) == {"claude", "grok"}
    assert p.provenance.evidence["frequency"] == 18
 def test_reject_records_no_catalog_write(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log = ReviewLog(str(tmp_path / "reviews.jsonl"))
    res = review([_candidate()], _decider(REJECT), cat, log)
    assert res.rejected == ["success:clean_pass:outcome"]
    assert cat.list() == []
 def test_discuss_defers_and_is_not_final(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log = ReviewLog(str(tmp_path / "reviews.jsonl"))
    res = review([_candidate()], _decider(DISCUSS), cat, log)
    assert res.deferred == ["success:clean_pass:outcome"]
    # not recorded as final -> a later pass re-surfaces it
    res2 = review([_candidate()], _decider(APPROVE), cat, log)
    assert len(res2.approved) == 1
 def test_prior_reject_remembered_same_evidence(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log_path = str(tmp_path / "reviews.jsonl")
    review([_candidate()], _decider(REJECT), cat, ReviewLog(log_path))
    # fresh log instance (reloads from disk) + same evidence -> skipped
    res = review([_candidate()], _decider(APPROVE), cat, ReviewLog(log_path))
    assert res.skipped == ["success:clean_pass:outcome"]
    assert cat.list() == []
 def test_changed_evidence_resurfaces(tmp_path):
    cat = Catalog(str(tmp_path / "catalog"))
    log_path = str(tmp_path / "reviews.jsonl")
    review([_candidate(freq=18)], _decider(REJECT), cat, ReviewLog(log_path))
    # more evidence now -> not skipped, gets re-reviewed
    res = review([_candidate(freq=40)], _decider(APPROVE), cat, ReviewLog(log_path))
    assert len(res.approved) == 1
 def test_candidate_to_pattern_defaults():
    p = candidate_to_pattern(_candidate(flavors=("claude",)))
    assert p.status == "provisional"
    assert p.rendering_hints["claude"]["target"] == "CLAUDE.md"
    assert p.polarity == "success"
--- a/tests/test_curate_schema.py
+++ b/tests/test_curate_schema.py
@@ -0,0 +1,80 @@
 """Round-trip + validation tests for the Solution Pattern schema (T01)."""
 import os
 import sys
 import pytest
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import (  # noqa: E402
    Provenance,
    Resolution,
    Scope,
    SolutionPattern,
 )
 def _sample() -> SolutionPattern:
    src = "success:clean_pass:outcome"
    return SolutionPattern(
        id=SolutionPattern.make_id(src),
        name="Run tests before declaring success",
        version="1.0.0",
        polarity="success",
        problem="Sessions that run tests and finish with no retries resolve cheaply.",
        resolutions=[Resolution(summary="Always run the suite", steps=["edit", "test", "commit"])],
        scope=Scope(flavors=["claude", "grok"]),
        provenance=Provenance(source_key=src, evidence={"frequency": 18, "cross_flavor": True}),
        rendering_hints={"claude": {"target": "CLAUDE.md"}, "codex": {"target": "AGENTS.md"}},
        status="approved",
        distribution_ready=True,
    )
 def test_round_trip_is_lossless():
    p = _sample()
    again = SolutionPattern.from_json(p.to_json())
    assert again.to_dict() == p.to_dict()
    assert again.resolutions[0].steps == ["edit", "test", "commit"]
    assert again.scope.flavors == ["claude", "grok"]
    assert again.provenance.evidence["cross_flavor"] is True
 def test_serialization_is_deterministic():
    p = _sample()
    assert p.to_json() == p.to_json()
    assert SolutionPattern.from_json(p.to_json()).to_json() == p.to_json()
 def test_make_id_is_stable_and_slugged():
    assert SolutionPattern.make_id("success:clean_pass:outcome") == "sp-success-clean_pass-outcome"
    # same source key -> same id regardless of later wording
    assert SolutionPattern.make_id("problem:abandoned:outcome") == SolutionPattern.make_id(
        "problem:abandoned:outcome"
    )
 def test_bump_version():
    assert SolutionPattern.bump_version("1.0.0") == "1.0.1"
    assert SolutionPattern.bump_version("1.2.3", "minor") == "1.3.0"
    assert SolutionPattern.bump_version("1.2.3", "major") == "2.0.0"
 def test_rejects_unknown_polarity():
    with pytest.raises(ValueError):
        SolutionPattern(id="x", name="n", version="1.0.0", polarity="meh", problem="p")
 def test_rejects_unknown_status():
    with pytest.raises(ValueError):
        SolutionPattern(id="x", name="n", version="1.0.0", polarity="problem",
                        problem="p", status="bogus")
 def test_rejects_unknown_flavor_in_hints_and_scope():
    with pytest.raises(ValueError):
        SolutionPattern(id="x", name="n", version="1.0.0", polarity="problem",
                        problem="p", rendering_hints={"gpt": {}})
    with pytest.raises(ValueError):
        Scope(flavors=["gpt"])
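
The id and version behaviour asserted above is small enough to sketch in isolation. The two functions below are hypothetical re-implementations that satisfy the same assertions; they are not the `SolutionPattern` methods themselves. In particular, replacing only `:` with `-` is an assumption, and the real slugging may normalise more characters.

```python
def make_id(source_key: str) -> str:
    """Derive a stable pattern id from a 'polarity:type:locus' source key (sketch)."""
    return "sp-" + source_key.replace(":", "-")


def bump_version(version: str, part: str = "patch") -> str:
    """Bump one component of a MAJOR.MINOR.PATCH string, resetting the lower ones."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


pattern_id = make_id("success:clean_pass:outcome")
next_patch = bump_version("1.0.0")
next_minor = bump_version("1.2.3", "minor")
next_major = bump_version("1.2.3", "major")
```

Deriving the id from the source key alone is what makes dedup and idempotent upserts possible: rewording a pattern's name or problem statement never changes which catalog entry it maps to.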
--- a/tests/test_detect_entrypoint.py
+++ b/tests/test_detect_entrypoint.py
@@ -0,0 +1,47 @@
 """Detect entrypoint tests (T07): end-to-end digests -> patterns, persisted."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.store import Store  # noqa: E402
 from session_memory.detect.__main__ import run_detect  # noqa: E402
 def _digest(uid, flavor, repo, **markers):
    return {
        "session_uid": uid, "flavor": flavor, "repo": repo, "outcome": "fail",
        "cost": {"input_tokens": 10, "output_tokens": 1},
        "markers": {"errors": markers.get("errors", 0), "retries": markers.get("retries", 0),
                    "test_runs": 0, "edits": 0, "human_interventions": 0},
        # fields the quality filter (WP-0005 T01) checks — real coding session
        "event_count": 40, "first_prompt": "Fix the failing build and retry the suite",
        "tool_histogram": {"Bash": 20, "Edit": 12, "Read": 8},
    }
 def _config(tmp_path):
    return {"store": {"db_path": str(tmp_path / ".store/m.db"),
                      "blob_dir": str(tmp_path / ".store/blobs"),
                      "cursor": str(tmp_path / ".store/c.json")}}
 def test_run_detect_persists_cross_flavor_pattern(tmp_path):
    cfg = _config(tmp_path)
    st = Store(cfg["store"]["db_path"], cfg["store"]["blob_dir"])
    # same problem (retry_storm) across two flavors -> cross-flavor candidate
    st.write_digest("claude:a", _digest("claude:a", "claude", "r1", retries=5))
    st.write_digest("codex:b", _digest("codex:b", "codex", "r2", retries=4))
    st.close()
    patterns = run_detect(cfg, min_frequency=2)
    assert len(patterns) == 1
    assert patterns[0]["cross_flavor"] is True
    assert patterns[0]["signal_type"] == "retry_storm"
    # persisted to the Tier 2 patterns table
    st2 = Store(cfg["store"]["db_path"], cfg["store"]["blob_dir"])
    rows = st2.db.execute("SELECT key FROM patterns").fetchall()
    assert len(rows) == 1
    st2.close()
--- a/tests/test_detect_infra_signals.py
+++ b/tests/test_detect_infra_signals.py
@@ -0,0 +1,80 @@
 """Infra-overhead + thrash signal tests (WP-0005 T02)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.detect.signals import (  # noqa: E402
    build_context,
    extract_signals,
    sig_infra_overhead,
    sig_schema_thrash,
    sig_tool_thrash,
    tool_bucket,
 )
 def _digest(uid="claude:a", repo="r1", tools=None):
    return {"session_uid": uid, "flavor": "claude", "repo": repo, "outcome": "success",
            "cost": {"input_tokens": 1, "output_tokens": 1},
            "markers": {"errors": 0, "retries": 0, "test_runs": 0},
            "tool_histogram": tools or {}}
 CTX = {"infra_min_calls": 20, "infra_overhead_threshold": 0.30,
       "schema_thrash_threshold": 5, "tool_thrash_threshold": 80}
 def test_tool_bucket_mapping():
    assert tool_bucket("mcp__state-hub__update_task_status") == "statehub_mcp"
    assert tool_bucket("ToolSearch") == "schema_load"
    assert tool_bucket("TaskUpdate") == "task_mgmt"
    assert tool_bucket("Bash") == "shell"
    assert tool_bucket("Edit") == "edit"
 def test_infra_overhead_fires_above_share():
    # 18 statehub of 30 total = 60% overhead
    d = _digest(tools={"mcp__state-hub__create_task": 18, "Bash": 8, "Edit": 4})
    sig = sig_infra_overhead(d, CTX)
    assert sig and sig[0].type == "infra_overhead"
    assert sig[0].magnitude >= 0.30
    assert sig[0].detail["statehub"] == 18
 def test_infra_overhead_quiet_when_mostly_work():
    d = _digest(tools={"mcp__state-hub__create_task": 3, "Bash": 40, "Edit": 30})
    assert sig_infra_overhead(d, CTX) == []
 def test_infra_overhead_ignores_tiny_sessions():
    d = _digest(tools={"mcp__state-hub__create_task": 5})  # below infra_min_calls
    assert sig_infra_overhead(d, CTX) == []
 def test_schema_thrash_fires():
    d = _digest(tools={"ToolSearch": 9, "Bash": 5})
    sig = sig_schema_thrash(d, CTX)
    assert sig and sig[0].type == "schema_thrash"
    assert sig[0].detail["tool_searches"] == 9
 def test_tool_thrash_fires_on_dominant_tool():
    d = _digest(tools={"Bash": 120, "Edit": 5})
    sig = sig_tool_thrash(d, CTX)
    assert sig and sig[0].locus == "tool:Bash"
 def test_extract_signals_includes_infra():
    d = _digest(tools={"mcp__state-hub__create_task": 18, "Bash": 8, "Edit": 4,
                       "ToolSearch": 6})
    types = {s.type for s in extract_signals([d])}
    assert "infra_overhead" in types
    assert "schema_thrash" in types
 def test_build_context_has_infra_defaults():
    ctx = build_context([])
    assert ctx["infra_overhead_threshold"] == 0.30
    assert ctx["schema_thrash_threshold"] == 5
--- a/tests/test_detect_quality.py
+++ b/tests/test_detect_quality.py
@@ -0,0 +1,61 @@
 """Session-quality filter tests (T01)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.detect.quality import (  # noqa: E402
    QualityConfig,
    filter_real,
    is_real_coding_session,
    quality_config,
 )
 def _digest(repo="agentic-resources", events=60, prompt="Implement the curate entrypoint",
            tools=None):
    return {
        "session_uid": "claude:x", "flavor": "claude", "repo": repo,
        "event_count": events, "first_prompt": prompt,
        "tool_histogram": tools if tools is not None else {"Bash": 20, "Edit": 15, "Read": 8},
    }
 def test_real_session_passes():
    assert is_real_coding_session(_digest()) is True
 def test_healthcheck_prompt_dropped():
    assert is_real_coding_session(_digest(events=3, prompt="Say hello in one word.",
                                          tools={})) is False
 def test_interrupted_dropped():
    assert is_real_coding_session(_digest(events=1, prompt="[Request interrupted by user]",
                                          tools={})) is False
 def test_too_short_dropped():
    assert is_real_coding_session(_digest(events=5)) is False
 def test_no_repo_dropped():
    assert is_real_coding_session(_digest(repo=None)) is False
 def test_no_substantive_tools_dropped():
    # plenty of events but only plumbing calls -> not real coding
    assert is_real_coding_session(
        _digest(tools={"mcp__state-hub__update_task_status": 40})) is False
 def test_filter_real_keeps_only_real():
    digs = [_digest(), _digest(events=3, prompt="hello", tools={}), _digest(repo=None)]
    assert len(filter_real(digs)) == 1
 def test_quality_config_from_toml():
    cfg = quality_config({"detect": {"quality": {"min_events": 50}}})
    assert cfg.min_events == 50
    assert cfg.min_substantive == 3  # default preserved
--- a/tests/test_detect_recurring_error.py
+++ b/tests/test_detect_recurring_error.py
@@ -0,0 +1,59 @@
 """Recurring-error signal + clustering (WP-0006 T02)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.detect.cluster import cluster  # noqa: E402
 from session_memory.detect.signals import (  # noqa: E402
    extract_signals,
    sig_recurring_error,
 )
 def _digest(uid, repo, flavor="claude", snippets=None):
    return {
        "session_uid": uid, "flavor": flavor, "repo": repo, "outcome": "success",
        "cost": {"input_tokens": 1, "output_tokens": 1},
        "markers": {"errors": 0, "retries": 0, "test_runs": 0},
        "tool_histogram": {}, "error_snippets": snippets or [],
    }
 _FP = "modulenotfounderror: no module named 'foo' at <path>:<n>"
 def test_signal_per_distinct_fingerprint():
    d = _digest("claude:a", "r1", snippets=[
        {"fingerprint": _FP, "sample": "ModuleNotFoundError ...", "count": 3, "tool": "Bash"},
        {"fingerprint": "keyerror: <str>", "sample": "KeyError", "count": 1, "tool": None},
    ])
    sigs = sig_recurring_error(d, {})
    assert len(sigs) == 2
    top = [s for s in sigs if s.locus == _FP][0]
    assert top.type == "recurring_error"
    assert top.magnitude == 3.0
    assert top.detail["sample"].startswith("ModuleNotFound")
 def test_clusters_across_sessions_and_flavors():
    # same fingerprint in a claude and a grok session -> cross-flavor candidate
    digs = [
        _digest("claude:a", "r1", "claude",
                [{"fingerprint": _FP, "sample": "ModuleNotFoundError", "count": 2, "tool": "Bash"}]),
        _digest("grok:b", "r2", "grok",
                [{"fingerprint": _FP, "sample": "ModuleNotFoundError", "count": 1, "tool": None}]),
    ]
    signals = extract_signals(digs)
    pats = cluster([s for s in signals if s.type == "recurring_error"], min_frequency=2)
    assert len(pats) == 1
    p = pats[0]
    assert p.signal_type == "recurring_error"
    assert p.cross_flavor is True
    assert sorted(p.flavors) == ["claude", "grok"]
    assert p.frequency == 2
 def test_no_snippets_no_signal():
    assert sig_recurring_error(_digest("claude:a", "r1"), {}) == []
--- a/tests/test_digest_errors.py
+++ b/tests/test_digest_errors.py
@@ -0,0 +1,101 @@
 """Error-body mining into the digest (WP-0006 T01)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.digest import (  # noqa: E402
    _error_fingerprint,
    _error_snippets,
    build_digest,
 )
 from session_memory.core.schema import SCHEMA_VERSION, Session, SessionEvent  # noqa: E402
 def _ev(seq, kind, **kw):
    return SessionEvent(session_uid="claude:s", seq=seq, kind=kind, **kw)
 def test_fingerprint_normalizes_paths_numbers_ids():
    a = _error_fingerprint("ModuleNotFoundError: No module named 'foo' at /home/x/a.py:42")
    b = _error_fingerprint("ModuleNotFoundError: No module named 'foo' at /srv/y/b.py:9991")
    assert a == b  # paths + line numbers stripped -> same fingerprint
    assert "<path>" in a and "<n>" in a
 def test_fingerprint_uuid_and_addr():
    fp = _error_fingerprint("connection 0xDEADBEEF to 1972d1d9-fc35-4912-8126-1fe64cc51425 failed")
    assert "<addr>" in fp and "<uuid>" in fp
 def test_snippets_dedup_and_count():
    blobs = {"b1": "Traceback...\nValueError: bad thing at /p/x.py:10",
             "b2": "Traceback...\nValueError: bad thing at /q/y.py:99",
             "b3": "KeyError: 'id'"}
    events = [
        _ev(0, "error", payload_ref="b1"),
        _ev(1, "error", payload_ref="b2"),       # same fingerprint as b1
        _ev(2, "error", payload_ref="b3"),
    ]
    snips = _error_snippets(events, blobs)
    assert len(snips) == 2
    top = snips[0]
    assert top["count"] == 2  # the ValueError collapsed
    assert "ValueError" in top["sample"]
 def test_failed_tool_result_mined():
    blobs = {"b1": "npm ERR! something failed with non-zero exit"}
    events = [_ev(0, "tool_result", tool="Bash", payload_ref="b1")]
    snips = _error_snippets(events, blobs)
    assert len(snips) == 1
    assert snips[0]["tool"] == "Bash"
 def test_clean_tool_result_not_mined():
    blobs = {"b1": "6 passed in 0.4s"}
    events = [_ev(0, "tool_result", tool="Bash", payload_ref="b1")]
    assert _error_snippets(events, blobs) == []
 def test_success_json_not_mined():
    # a hub MCP success payload mentioning 'error' deep inside is NOT a failure
    blobs = {"b1": '{"result": "{\\"domain\\": \\"custodian\\", \\"note\\": \\"no errors\\"}"}'}
    events = [_ev(0, "tool_result", tool="mcp__state-hub__get_domain_summary", payload_ref="b1")]
    assert _error_snippets(events, blobs) == []
 def test_error_json_still_mined():
    blobs = {"b1": '{"detail": "Invalid request parameters"}'}
    events = [_ev(0, "tool_result", tool="Bash", payload_ref="b1")]
    snips = _error_snippets(events, blobs)
    assert len(snips) == 1
 def test_plain_mcp_error_still_mined():
    blobs = {"b1": "MCP error -32602: Invalid request parameters"}
    events = [_ev(0, "tool_result", tool="Bash", payload_ref="b1")]
    assert len(_error_snippets(events, blobs)) == 1
 def test_file_read_snapshot_not_mined():
    # a Read result of source code containing 'raise ...Error' is not a runtime error
    blobs = {"b1": "227\t    def f():\n228\t        x = 1\n229\t        raise InfospaceError()\n"}
    events = [_ev(0, "tool_result", tool="Read", payload_ref="b1")]
    assert _error_snippets(events, blobs) == []
 def test_build_digest_includes_error_snippets_and_v2():
    s = Session(session_uid="claude:s", flavor="claude", native_session_id="s", repo="r")
    events = [_ev(0, "user_msg"), _ev(1, "error", payload_ref="b1"), _ev(2, "assistant_msg")]
    d = build_digest(s, events, {"b1": "RuntimeError: kaboom at /a/b.py:3"})
    assert d["schema_version"] == SCHEMA_VERSION == 2
    assert d["error_snippets"][0]["count"] == 1
    assert "RuntimeError" in d["error_snippets"][0]["sample"]
 def test_no_errors_empty_list():
    s = Session(session_uid="claude:s", flavor="claude", native_session_id="s", repo="r")
    d = build_digest(s, [_ev(0, "user_msg"), _ev(1, "assistant_msg")])
    assert d["error_snippets"] == []
--- a/tests/test_digest_lookup.py
+++ b/tests/test_digest_lookup.py
@@ -0,0 +1,78 @@
 """digest_lookup entrypoint tests (AGENTIC-WP-0011 T03)."""
 import json
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.store import Store  # noqa: E402
 from session_memory.digest_lookup import lookup_digest, main, resolve_store_paths  # noqa: E402
 def _write_config(tmp_path) -> tuple[str, str]:
    store = tmp_path / ".store"
    toml = tmp_path / "config.toml"
    toml.write_text(
        f'[store]\ndb_path = "{store / "m.db"}"\nblob_dir = "{store / "blobs"}"\n'
        f'cursor = "{store / "c.json"}"\n')
    return str(toml), str(store)
 def _seed(store_dir, uid="claude:test-uid"):
    st = Store(os.path.join(store_dir, "m.db"), os.path.join(store_dir, "blobs"))
    st.write_digest(uid, {
        "session_uid": uid,
        "flavor": "claude",
        "repo": "agentic-resources",
        "outcome": "success",
        "started_at": "2026-06-19T10:00:00Z",
        "ended_at": "2026-06-19T11:00:00Z",
        "cost": {"input_tokens": 100, "output_tokens": 25},
        "tool_histogram": {"Bash": 10, "Edit": 5},
    })
    st.close()
    return uid
 def test_resolve_store_paths_from_config(tmp_path):
    cfg_path, store_dir = _write_config(tmp_path)
    db, blob = resolve_store_paths(config_path=cfg_path)
    assert db.endswith("m.db")
    assert blob.endswith("blobs")
    assert store_dir in db
 def test_resolve_store_paths_from_env(tmp_path, monkeypatch):
    db = tmp_path / "custom" / "mem.db"
    db.parent.mkdir(parents=True)
    monkeypatch.setenv("HELIX_STORE_DB", str(db))
    resolved_db, blob = resolve_store_paths()
    assert resolved_db == str(db)
    assert blob == str(tmp_path / "custom" / "blobs")
 def test_lookup_digest_found_and_missing(tmp_path):
    cfg_path, store_dir = _write_config(tmp_path)
    uid = _seed(store_dir)
    found = lookup_digest(uid, config_path=cfg_path)
    assert found is not None and found["outcome"] == "success"
    assert lookup_digest("claude:missing", config_path=cfg_path) is None
 def test_main_json_success(tmp_path, capsys):
    cfg_path, store_dir = _write_config(tmp_path)
    uid = _seed(store_dir)
    rc = main(["--config", cfg_path, uid, "--json"])
    assert rc == 0
    data = json.loads(capsys.readouterr().out)
    assert data["session_uid"] == uid
    assert data["repo"] == "agentic-resources"
 def test_main_not_found(tmp_path, capsys):
    cfg_path, store_dir = _write_config(tmp_path)
    _seed(store_dir)
    rc = main(["--config", cfg_path, "claude:missing"])
    assert rc == 1
    assert "not found" in capsys.readouterr().err.lower()
--- a/tests/test_distribute_base.py
+++ b/tests/test_distribute_base.py
@@ -0,0 +1,88 @@
 """Distributor base tests (WP-0007 T01): markers, idempotent upsert, rendering."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import Resolution, SolutionPattern  # noqa: E402
 from session_memory.distribute.base import (  # noqa: E402
    Artifact,
    BaseDistributor,
    Distributor,
    render_markdown_body,
    upsert_block,
    wrap_block,
 )
 def _pattern(pid="sp-x", polarity="problem"):
    return SolutionPattern(
        id=pid, name="Read before edit", version="1.2.0", polarity=polarity,
        problem="Agents edit files they have not read.",
        resolutions=[Resolution(summary="Read the file first", detail="then Edit",
                                steps=["Read", "Edit"])],
        rendering_hints={"claude": {"target": "CLAUDE.md"}},
    )
 def test_render_markdown_body_has_problem_and_resolution():
    body = render_markdown_body(_pattern())
    assert "### Read before edit" in body
    assert "Agents edit files" in body
    assert "**Avoid:**" in body  # problem polarity
    assert "- Read the file first — then Edit" in body
    assert "  - Read" in body
 def test_success_polarity_label():
    assert "**Prefer:**" in render_markdown_body(_pattern(polarity="success"))
 def test_wrap_block_has_markers_and_version():
    block = wrap_block("sp-x", "hello", "1.2.0")
    assert block.startswith("<!-- BEGIN helix-forge pattern:sp-x --> v1.2.0")
    assert block.rstrip().endswith("<!-- END helix-forge pattern:sp-x -->")
 def test_upsert_inserts_then_replaces_in_place():
    doc = "# Title\n\nsome text\n"
    b1 = wrap_block("sp-x", "first", "1")
    once = upsert_block(doc, "sp-x", b1)
    assert "first" in once and once.count("BEGIN helix-forge pattern:sp-x") == 1
    # re-distributing the same id replaces, does not duplicate
    b2 = wrap_block("sp-x", "second", "2")
    twice = upsert_block(once, "sp-x", b2)
    assert "second" in twice and "first" not in twice
    assert twice.count("BEGIN helix-forge pattern:sp-x") == 1
 def test_upsert_keeps_other_patterns():
    doc = upsert_block("", "sp-a", wrap_block("sp-a", "A"))
    doc = upsert_block(doc, "sp-b", wrap_block("sp-b", "B"))
    assert "sp-a" in doc and "sp-b" in doc
 def test_base_distributor_renders_artifact():
    d = BaseDistributor(flavor="claude", target_path="CLAUDE.md")
    art = d.render(_pattern())
    assert isinstance(art, Artifact)
    assert isinstance(d, Distributor)  # satisfies the protocol
    assert art.flavor == "claude"
    assert art.target_path == "CLAUDE.md"
    assert "BEGIN helix-forge pattern:sp-x" in art.content
    assert "Read before edit" in art.content
 def test_body_hint_overrides_default():
    p = _pattern()
    p.rendering_hints["claude"]["body"] = "custom claude body"
    d = BaseDistributor(flavor="claude", target_path="CLAUDE.md")
    assert "custom claude body" in d.render(p).content
 def test_target_hint_overrides_default():
    p = _pattern()
    p.rendering_hints["claude"]["target"] = "docs/CLAUDE.md"
    d = BaseDistributor(flavor="claude", target_path="CLAUDE.md")
    assert d.render(p).target_path == "docs/CLAUDE.md"
--- a/tests/test_distribute_claude.py
+++ b/tests/test_distribute_claude.py
@@ -0,0 +1,40 @@
 """Claude distributor tests (WP-0007 T02)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import Resolution, SolutionPattern  # noqa: E402
 from session_memory.distribute.claude import ClaudeDistributor  # noqa: E402
 def _pattern(hints=None):
    return SolutionPattern(
        id="sp-read-before-edit", name="Read before edit", version="1.0.0",
        polarity="problem", problem="Agents edit files they have not read.",
        resolutions=[Resolution(summary="Read the file first", steps=["Read", "Edit"])],
        rendering_hints=hints or {"claude": {}},
    )
 def test_default_targets_claude_md():
    art = ClaudeDistributor().render(_pattern())
    assert art.flavor == "claude"
    assert art.target_path == "CLAUDE.md"
    assert "BEGIN helix-forge pattern:sp-read-before-edit" in art.content
    assert "### Read before edit" in art.content
 def test_skill_mode_emits_skill_stub():
    art = ClaudeDistributor().render(_pattern({"claude": {"as": "skill"}}))
    assert "## Skill: Read before edit" in art.content
    assert "**When:**" in art.content
    assert "  - Read" in art.content
 def test_idempotent_marker_present_for_reupsert():
    art = ClaudeDistributor().render(_pattern())
    # same id in both renders -> caller can upsert in place
    art2 = ClaudeDistributor().render(_pattern())
    assert art.pattern_id == art2.pattern_id == "sp-read-before-edit"
--- a/tests/test_distribute_codex_grok.py
+++ b/tests/test_distribute_codex_grok.py
@@ -0,0 +1,49 @@
 """Codex + Grok distributor + registry tests (WP-0007 T03)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import Resolution, SolutionPattern  # noqa: E402
 from session_memory.distribute.codex import CodexDistributor  # noqa: E402
 from session_memory.distribute.grok import GrokDistributor  # noqa: E402
 from session_memory.distribute.registry import all_flavors, get_distributor  # noqa: E402
 def _pattern():
    return SolutionPattern(
        id="sp-x", name="Read before edit", version="1.0.0", polarity="problem",
        problem="Agents edit files they have not read.",
        resolutions=[Resolution(summary="Read the file first")],
    )
 def test_codex_targets_agents_md():
    art = CodexDistributor().render(_pattern())
    assert art.flavor == "codex" and art.target_path == "AGENTS.md"
    assert "Read before edit" in art.content
 def test_grok_targets_native_instructions():
    art = GrokDistributor().render(_pattern())
    assert art.flavor == "grok" and art.target_path == ".grok/instructions.md"
 def test_same_pattern_expressible_for_all_flavors():
    # FR-A3: one pattern, rendered for every flavor (same body, different targets)
    p = _pattern()
    bodies = {}
    for f in all_flavors():
        art = get_distributor(f).render(p)
        # strip markers -> compare agnostic body
        inner = art.content.split("\n", 1)[1].rsplit("\n", 1)[0]
        bodies[f] = inner
    targets = {get_distributor(f).render(p).target_path for f in all_flavors()}
    assert len(targets) == 3                 # distinct per-flavor targets
    assert len(set(bodies.values())) == 1    # identical agnostic body
 def test_registry_unknown_flavor():
    assert get_distributor("gpt") is None
    assert set(all_flavors()) == {"claude", "codex", "grok"}
--- a/tests/test_distribute_entrypoint.py
+++ b/tests/test_distribute_entrypoint.py
@@ -0,0 +1,76 @@
 """Distribute entrypoint tests (WP-0007 T05)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.catalog import Catalog  # noqa: E402
 from session_memory.curate.schema import Resolution, Scope, SolutionPattern  # noqa: E402
 from session_memory.distribute.__main__ import build_targets, main, run_distribute  # noqa: E402
 def _pattern(pid, repos, flavors, status="approved", ready=True):
    return SolutionPattern(
        id=pid, name=pid, version="1.0.0", polarity="problem", problem="p",
        resolutions=[Resolution(summary="do x")],
        scope=Scope(repos=repos, flavors=flavors), status=status, distribution_ready=ready,
    )
 def _config(tmp_path):
    return {
        "repo_domain_map": {"agentic-resources": "helix_forge", "state-hub": "custodian"},
        "curate": {"catalog_dir": str(tmp_path / "catalog")},
        "distribute": {"proposals_dir": str(tmp_path / "proposals"),
                       "active_registry": str(tmp_path / "active.json")},
    }
 def test_build_targets_crosses_repos_and_flavors():
    cfg = {"repo_domain_map": {"r1": "d1", "r2": "d2"}}
    targets = build_targets(cfg)
    assert len(targets) == 2 * 3  # 2 repos x 3 flavors
    r1_targets = build_targets(cfg, repo_filter="r1")
    assert r1_targets and all(t.repo == "r1" for t in r1_targets)
    assert all(t.flavor == "claude" for t in build_targets(cfg, flavor_filter="claude"))
 def test_run_distribute_scopes_to_catalog(tmp_path):
    cfg = _config(tmp_path)
    cat = Catalog(cfg["curate"]["catalog_dir"])
    # in-scope for agentic-resources/claude only
    cat.upsert(_pattern("sp-a", ["agentic-resources"], ["claude"]))
    # provisional -> must be skipped
    cat.upsert(_pattern("sp-prov", [], [], status="provisional", ready=False))
    res = run_distribute(cfg)
    rendered = {pid for _, _, pid, _ in res.proposals}
    assert "sp-a" in rendered
    assert "sp-prov" not in rendered
    assert "sp-prov" in res.skipped_not_distributable
    # landed only in the agentic-resources/CLAUDE.md proposal
    p = os.path.join(cfg["distribute"]["proposals_dir"], "agentic-resources", "CLAUDE.md")
    assert os.path.exists(p)
    assert not os.path.exists(
        os.path.join(cfg["distribute"]["proposals_dir"], "state-hub", "CLAUDE.md"))
 def test_main_runs_json(tmp_path, capsys):
    cfg = _config(tmp_path)
    cat = Catalog(cfg["curate"]["catalog_dir"])
    cat.upsert(_pattern("sp-a", [], ["claude"]))  # unrestricted repos
    import json as _json
    # main() loads TOML, so write a minimal config.toml that mirrors cfg
    toml = tmp_path / "config.toml"
    toml.write_text(
        f'[repo_domain_map]\nagentic-resources = "helix_forge"\n'
        f'[curate]\ncatalog_dir = "{cfg["curate"]["catalog_dir"]}"\n'
        f'[distribute]\nproposals_dir = "{cfg["distribute"]["proposals_dir"]}"\n'
        f'active_registry = "{cfg["distribute"]["active_registry"]}"\n')
    rc = main(["--config", str(toml), "--json"])
    assert rc == 0
    out = capsys.readouterr().out
    assert "sp-a" in out
    _json.loads(out)  # valid JSON
--- a/tests/test_distribute_proposals.py
+++ b/tests/test_distribute_proposals.py
@@ -0,0 +1,79 @@
 """Scoping + proposals + active registry tests (WP-0007 T04)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.curate.schema import Resolution, Scope, SolutionPattern  # noqa: E402
 from session_memory.distribute.proposals import (  # noqa: E402
    ActiveRegistry,
    Target,
    applies,
    propose,
 )
 def _pattern(pid="sp-x", repos=None, flavors=None, status="approved", ready=True):
    return SolutionPattern(
        id=pid, name="Read before edit", version="1.0.0", polarity="problem",
        problem="edit before read", resolutions=[Resolution(summary="read first")],
        scope=Scope(repos=repos or [], flavors=flavors or []),
        status=status, distribution_ready=ready,
    )
 def test_applies_respects_scope():
    p = _pattern(repos=["agentic-resources"], flavors=["claude"])
    assert applies(p, Target("agentic-resources", flavor="claude"))
    assert not applies(p, Target("other-repo", flavor="claude"))
    assert not applies(p, Target("agentic-resources", flavor="codex"))
 def test_empty_scope_is_unrestricted():
    assert applies(_pattern(), Target("any", flavor="grok"))
 def test_propose_writes_scoped_proposal_files(tmp_path):
    out = str(tmp_path / "proposals")
    reg = ActiveRegistry(str(tmp_path / "active.json"))
    p = _pattern(flavors=["claude"])
    res = propose([p], [Target("agentic-resources", flavor="claude"),
                        Target("agentic-resources", flavor="codex")], out, reg)
    # only claude target is in scope
    assert len(res.proposals) == 1
    path = os.path.join(out, "agentic-resources", "CLAUDE.md")
    assert os.path.exists(path)
    with open(path) as f:
        assert "BEGIN helix-forge pattern:sp-x" in f.read()
 def test_not_distributable_skipped(tmp_path):
    reg = ActiveRegistry(str(tmp_path / "active.json"))
    prov = _pattern(status="provisional", ready=False)
    res = propose([prov], [Target("r", flavor="claude")], str(tmp_path / "p"), reg)
    assert res.proposals == []
    assert "sp-x" in res.skipped_not_distributable
 def test_proposals_idempotent_on_rerun(tmp_path):
    out = str(tmp_path / "proposals")
    reg_path = str(tmp_path / "active.json")
    p = _pattern()
    propose([p], [Target("r", flavor="claude")], out, ActiveRegistry(reg_path))
    propose([p], [Target("r", flavor="claude")], out, ActiveRegistry(reg_path))
    with open(os.path.join(out, "r", "CLAUDE.md")) as f:
        content = f.read()
    assert content.count("BEGIN helix-forge pattern:sp-x") == 1  # no duplication
 def test_active_registry_records_environment(tmp_path):
    reg_path = str(tmp_path / "active.json")
    reg = ActiveRegistry(reg_path)
    propose([_pattern()], [Target("r", domain="helix_forge", flavor="claude")],
            str(tmp_path / "p"), reg)
    reg2 = ActiveRegistry(reg_path)  # reload from disk
    entries = reg2.entries()
    assert len(entries) == 1
    assert entries[0]["pattern_id"] == "sp-x"
    assert entries[0]["repo"] == "r"
    assert entries[0]["flavor"] == "claude"
    assert entries[0]["status"] == "proposed"
--- a/tests/test_grok_adapter.py
+++ b/tests/test_grok_adapter.py
@@ -0,0 +1,92 @@
 """Grok adapter tests (T02): synthetic session dir + real local sessions."""
 import glob
 import json
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.adapters.grok import parse_session  # noqa: E402
 REPO_MAP = {"agentic-resources": "helix_forge", "net-kingdom": "netkingdom",
            "can-you-assist": "coulomb_social"}
 def _mk_session(dir_path, sid):
    os.makedirs(dir_path, exist_ok=True)
    with open(os.path.join(dir_path, "summary.json"), "w") as f:
        json.dump({"info": {"id": sid, "cwd": "/home/worsch/agentic-resources"},
                   "created_at": "2026-06-06T10:00:00Z",
                   "last_active_at": "2026-06-06T10:05:00Z",
                   "current_model_id": "grok-build", "head_branch": "main"}, f)
    with open(os.path.join(dir_path, "events.jsonl"), "w") as f:
        f.write(json.dumps({"ts": "2026-06-06T10:00:00Z", "type": "turn_started",
                            "turn_number": 0, "model_id": "grok-build"}) + "\n")
        f.write(json.dumps({"ts": "2026-06-06T10:05:00Z", "type": "turn_ended",
                            "turn_number": 0}) + "\n")
    with open(os.path.join(dir_path, "chat_history.jsonl"), "w") as f:
        for rec in [
            {"type": "system", "content": "sys prompt"},
            {"type": "user", "content": [{"type": "text", "text": "fix the bug"}]},
            {"type": "reasoning", "content": [{"type": "text", "text": "thinking..."}]},
            {"type": "assistant", "content": ""},   # empty -> skipped
            {"type": "tool_result", "content": "The file x.py has been updated"},
            {"type": "assistant", "content": "done"},
            {"type": "tool_result", "content": "6 passed"},
        ]:
            f.write(json.dumps(rec) + "\n")
    with open(os.path.join(dir_path, "updates.jsonl"), "w") as f:
        for u in [
            {"sessionUpdate": "tool_call", "toolCallId": "c1", "title": "edit_file",
             "rawInput": {"target_file": "x.py"}},
            {"sessionUpdate": "tool_call", "toolCallId": "c2", "title": "shell",
             "rawInput": {"command": "pytest -q"}},
        ]:
            f.write(json.dumps({"timestamp": "t", "method": "session/update",
                                "params": {"sessionId": sid, "update": u}}) + "\n")
 def test_grok_synthetic_dir(tmp_path):
    d = tmp_path / "%2Fhome%2Fworsch%2Fagentic-resources" / "sid-1"
    _mk_session(str(d), "sid-1")
    norm = parse_session(str(d / "chat_history.jsonl"), REPO_MAP)
    assert norm is not None
    s = norm.session
    assert s.session_uid == "grok:sid-1"
    assert s.flavor == "grok"
    assert s.repo == "agentic-resources" and s.domain == "helix_forge"
    assert s.model == "grok-build"
    assert s.git_branch == "main"
    assert s.cost.turns == 1
    assert s.cost.wall_clock_s == 300.0
    kinds = [e.kind for e in norm.events]
    # events.jsonl holds turn_started + turn_ended -> 2 lifecycle events
    assert kinds.count("lifecycle") == 2
    assert "user_msg" in kinds and "thinking" in kinds and "assistant_msg" in kinds
    # paired tool calls recovered names -> edit + test_run, each followed by tool_result
    assert "edit" in kinds and "test_run" in kinds
    edit = next(e for e in norm.events if e.kind == "edit")
    assert edit.tool == "edit_file"
    # both tool_result records survive (one after the edit, one after the test run)
    tr = [e for e in norm.events if e.kind == "tool_result"]
    assert len(tr) == 2
 def test_real_local_grok_sessions_if_available():
    base = os.path.expanduser("~/.grok/sessions")
    chats = glob.glob(os.path.join(base, "*", "*", "chat_history.jsonl"))
    if not chats:
        import pytest
        pytest.skip("no local Grok sessions found under ~/.grok/sessions")
    parsed = 0
    for c in chats:
        norm = parse_session(c, REPO_MAP)
        if norm is None:
            continue
        parsed += 1
        assert norm.session.session_uid.startswith("grok:")
        seqs = [e.seq for e in norm.events]
        assert seqs == sorted(seqs) and len(seqs) == len(set(seqs))
    assert parsed >= 1
--- a/tests/test_measure_effect.py
+++ b/tests/test_measure_effect.py
@@ -0,0 +1,49 @@
 """Before/after effectiveness tests (WP-0009 T02)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.measure.effect import effectiveness, split_by_date  # noqa: E402
 def _digest(ts, tools=None, errors=0, outcome="success"):
    return {
        "started_at": ts, "outcome": outcome,
        "cost": {"input_tokens": 100, "output_tokens": 0},
        "tool_histogram": tools or {"Bash": 10},
        "error_snippets": [{"fingerprint": f"e{i}", "count": 1} for i in range(errors)],
    }
 def test_split_by_date():
    digs = [_digest("2026-06-01"), _digest("2026-06-05"), _digest("2026-06-10")]
    before, after = split_by_date(digs, "2026-06-05")
    assert len(before) == 1 and len(after) == 2  # >= applied_at goes to after
 def test_effectiveness_detects_improvement():
    # before: lots of errors + hub overhead; after: clean
    before = [_digest("2026-06-01", tools={"mcp__state-hub__x": 8, "Bash": 2}, errors=3)
              for _ in range(3)]
    after = [_digest("2026-06-10", tools={"Bash": 10}, errors=0) for _ in range(3)]
    e = effectiveness(before + after, "2026-06-05", label="read-before-edit")
    assert not e["insufficient_data"]
    assert e["n_before"] == 3 and e["n_after"] == 3
    assert e["deltas"]["error_rate"]["improved"] is True
    assert e["deltas"]["infra_overhead_share_median"]["improved"] is True
    assert e["deltas"]["error_rate"]["change"] < 0
 def test_effectiveness_insufficient_data():
    e = effectiveness([_digest("2026-06-01")], "2026-06-05")
    assert e["insufficient_data"] is True
    assert e["deltas"] == {}
 def test_success_rate_higher_is_better():
    before = [_digest("2026-06-01", outcome="fail") for _ in range(2)]
    after = [_digest("2026-06-10", outcome="success") for _ in range(2)]
    e = effectiveness(before + after, "2026-06-05")
    assert e["deltas"]["success_rate"]["improved"] is True
--- a/tests/test_measure_entrypoint.py
+++ b/tests/test_measure_entrypoint.py
@@ -0,0 +1,79 @@
 """Measure entrypoint tests (WP-0009 T03)."""
 import json
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.core.store import Store  # noqa: E402
 from session_memory.measure.__main__ import main, real_digests  # noqa: E402
 from session_memory.measure.metrics import load_baselines  # noqa: E402
 def _digest(uid, ts, tools=None):
    return {
        "session_uid": uid, "flavor": "claude", "repo": "agentic-resources",
        "outcome": "success", "started_at": ts,
        "cost": {"input_tokens": 100, "output_tokens": 10},
        "event_count": 40, "first_prompt": "Implement the measure entrypoint cleanly",
        "tool_histogram": tools or {"Bash": 20, "Edit": 12, "Read": 8},
        "error_snippets": [],
    }
 def _write_config(tmp_path) -> tuple[str, str]:
    store = tmp_path / ".store"
    toml = tmp_path / "config.toml"
    toml.write_text(
        f'[store]\ndb_path = "{store / "m.db"}"\nblob_dir = "{store / "blobs"}"\n'
        f'cursor = "{store / "c.json"}"\n'
        f'[measure]\nbaselines = "{tmp_path / "baselines.jsonl"}"\n')
    return str(toml), str(store)
 def _seed(store_dir):
    st = Store(os.path.join(store_dir, "m.db"), os.path.join(store_dir, "blobs"))
    st.write_digest("claude:a", _digest("claude:a", "2026-06-01"))
    st.write_digest("claude:b", _digest("claude:b", "2026-06-10",
                                        tools={"mcp__state-hub__x": 18, "Bash": 8, "Edit": 4}))
    st.close()
 def test_real_digests_filters_and_loads(tmp_path):
    cfg_path, store_dir = _write_config(tmp_path)
    _seed(store_dir)
    from session_memory.ingest import load_config
    digs = real_digests(load_config(cfg_path))
    assert len(digs) == 2
 def test_main_writes_baseline_and_reports(tmp_path, capsys):
    cfg_path, store_dir = _write_config(tmp_path)
    _seed(store_dir)
    rc = main(["--config", cfg_path, "--label", "first"])
    assert rc == 0
    out = capsys.readouterr().out
    assert "Fleet metrics" in out
    rows = load_baselines(str(tmp_path / "baselines.jsonl"))
    assert len(rows) == 1 and rows[0]["label"] == "first"
 def test_main_no_save_and_json(tmp_path, capsys):
    cfg_path, store_dir = _write_config(tmp_path)
    _seed(store_dir)
    rc = main(["--config", cfg_path, "--no-save", "--json"])
    assert rc == 0
    data = json.loads(capsys.readouterr().out)
    assert data["current"]["n_sessions"] == 2
    assert not os.path.exists(str(tmp_path / "baselines.jsonl"))
 def test_main_effectiveness_since(tmp_path, capsys):
    cfg_path, store_dir = _write_config(tmp_path)
    _seed(store_dir)
    rc = main(["--config", cfg_path, "--no-save", "--since", "2026-06-05", "--json"])
    assert rc == 0
    data = json.loads(capsys.readouterr().out)
    assert data["effectiveness"]["n_before"] == 1
    assert data["effectiveness"]["n_after"] == 1
--- a/tests/test_measure_metrics.py
+++ b/tests/test_measure_metrics.py
@@ -0,0 +1,63 @@
 """Fleet metrics + baseline tests (WP-0009 T01)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.measure.metrics import (  # noqa: E402
    aggregate,
    load_baselines,
    save_baseline,
    session_metrics,
    snapshot,
 )
 def _digest(tools=None, errors=0, tokens=100, outcome="success"):
    return {
        "outcome": outcome,
        "cost": {"input_tokens": tokens, "output_tokens": 0},
        "tool_histogram": tools or {"Bash": 10, "Edit": 5},
        "error_snippets": [{"fingerprint": f"e{i}", "count": 1} for i in range(errors)],
    }
 def test_session_metrics_overhead_and_errors():
    m = session_metrics(_digest(tools={"mcp__state-hub__create_task": 6, "Bash": 4}, errors=2))
    assert abs(m["infra_overhead_share"] - 0.6) < 1e-9
    assert m["error_occurrences"] == 2
    assert m["has_error"] is True
 def test_aggregate_rates_and_percentiles():
    digs = [
        _digest(tools={"mcp__state-hub__x": 8, "Bash": 2}, errors=1, tokens=50),   # 80% overhead
        _digest(tools={"Bash": 9, "Edit": 1}, errors=0, tokens=200),               # 0% overhead
        _digest(tools={"ToolSearch": 6, "Bash": 4}, errors=0, tokens=100, outcome="fail"),
    ]
    a = aggregate(digs)
    assert a["n_sessions"] == 3
    assert a["error_rate"] == round(1 / 3, 3)
    assert a["success_rate"] == round(2 / 3, 3)
    assert a["schema_thrash_sessions"] == 1   # the ToolSearch=6 session
    assert 0 <= a["infra_overhead_share_median"] <= 1
 def test_aggregate_empty():
    assert aggregate([]) == {"n_sessions": 0}
 def test_snapshot_has_timestamp_and_label():
    s = snapshot([_digest()], label="baseline")
    assert s["label"] == "baseline"
    assert "captured_at" in s and s["n_sessions"] == 1
 def test_baseline_roundtrip_appends(tmp_path):
    path = str(tmp_path / "baselines.jsonl")
    save_baseline(snapshot([_digest()], label="a"), path)
    save_baseline(snapshot([_digest(), _digest()], label="b"), path)
    rows = load_baselines(path)
    assert [r["label"] for r in rows] == ["a", "b"]
    assert rows[1]["n_sessions"] == 2
--- a/tests/test_merge.py
+++ b/tests/test_merge.py
@@ -0,0 +1,66 @@
 """Multi-file session merge tests (T03)."""
 import os
 import sys
 sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
 from session_memory.adapters.common import Normalized  # noqa: E402
 from session_memory.core.schema import Session, SessionEvent  # noqa: E402
 from session_memory.core.store import Store  # noqa: E402
 def _part(native, kinds, base_blob="b"):
    uid = Session.make_uid("claude", native)
    s = Session(session_uid=uid, flavor="claude", native_session_id=native)
    events, blobs = [], {}
    for i, k in enumerate(kinds):
        ref = f"blob://{native}/{i}"
        events.append(SessionEvent(session_uid=uid, seq=i, parent_seq=(i - 1 if i else None),
                                   kind=k, ts=f"2026-06-06T10:0{i}:00Z", payload_ref=ref))
        blobs[ref] = f"{base_blob}-{k}-{i}"
    return Normalized(session=s, events=events, blobs=blobs)
 def test_second_file_appends_not_overwrites(tmp_path):
    st = Store(str(tmp_path / "m.db"), str(tmp_path / "blobs"))
    uid = Session.make_uid("claude", "s1")
    # file 1: 3 events (seq 0..2)
    n1 = _part("s1", ["user_msg", "assistant_msg", "tool_call"])
    added1 = st.ingest(n1)
    assert added1 == 3
    assert st.count_events(uid) == 3
    # file 2 for the SAME session: repeats event 0 + adds 2 new (continuation)
    n2 = _part("s1", ["user_msg", "edit", "completion"])
    # keep the first event identical to file 1's first event (same kind, ts, blob) so it dedups
    n2.events[0].kind = "user_msg"
    n2.events[0].ts = "2026-06-06T10:00:00Z"
    n2.blobs[n2.events[0].payload_ref] = "b-user_msg-0"
    added2 = st.ingest(n2)
    # only the 2 genuinely-new events appended; total grows additively
    assert added2 == 2
    assert st.count_events(uid) == 5
    seqs = [e.seq for e in st.get_events(uid)]
    assert seqs == [0, 1, 2, 3, 4]  # contiguous, offset
 def test_reingest_same_bundle_is_idempotent(tmp_path):
    st = Store(str(tmp_path / "m.db"), str(tmp_path / "blobs"))
    uid = Session.make_uid("claude", "s2")
    n = _part("s2", ["user_msg", "assistant_msg"])
    assert st.ingest(n) == 2
    assert st.ingest(n) == 0          # nothing new on re-run
    assert st.count_events(uid) == 2
 def test_appended_event_parent_remapped_within_part(tmp_path):
    st = Store(str(tmp_path / "m.db"), str(tmp_path / "blobs"))
    uid = Session.make_uid("claude", "s3")
    st.ingest(_part("s3", ["user_msg", "assistant_msg"]))   # seq 0,1
    st.ingest(_part("s3", ["thinking", "edit"]))  # new seq 2,3
    events = {e.seq: e for e in st.get_events(uid)}
    # the 'edit' (seq 3) had parent_seq=0 within its part -> remapped to its part's first new seq (2)
    assert events[3].parent_seq == 2
--- a/Show More
+++ b/Show More
@@ -0,0 +1 @@
							{"created_at": "2026-06-07T09:13:20Z", "distribution_ready": true, "id": "sp-problem-budget_overrun-tokens", "name": "problem: budget overrun", "polarity": "problem", "problem": "problem: budget overrun", "provenance": {"detected_at": null, "evidence": {"cost_impact": 10.667, "cross_flavor": false, "flavors": ["claude"], "frequency": 3, "key": "problem:budget_overrun:tokens", "locus": "tokens", "polarity": "problem", "repos": ["artifact-store", "citation-evidence", "infospace-bench"], "score": 32.001, "sessions": ["claude:0ef1b45c-5c27-4e20-88b3-37daeaa24eca", "claude:6e0d3d68-872b-4d93-bb09-0691e091314b", "claude:8fabd5ce-6a20-4412-9a8b-0f0763394a78"], "signal_type": "budget_overrun", "title": "problem: budget overrun"}, "promoted_at": "2026-06-07T09:13:20Z", "source_key": "problem:budget_overrun:tokens"}, "rendering_hints": {"claude": {"note": "TODO: refine rendering", "target": "CLAUDE.md"}}, "resolutions": [{"detail": "", "steps": [], "summary": "TODO: capture the recommended resolution"}], "schema_version": 1, "scope": {"domains": [], "flavors": ["claude"], "repos": ["artifact-store", "citation-evidence", "infospace-bench"]}, "status": "superseded", "updated_at": "2026-06-07T09:13:20Z", "version": "1.0.0"}
@@ -0,0 +1 @@
							{"covers": [], "created_at": "2026-06-07T13:26:25Z", "distribution_ready": true, "id": "sp-problem-file_not_read-edit", "name": "Read before you Edit", "polarity": "problem", "problem": "Agents call Edit/Write on a file they have not read in the current session, or after it changed under them. The edit tools reject this ('File has not been read yet' / 'File has been modified since read'), and the retry burns a turn. Top recurring error in the corpus (12/27 sessions, 8 repos).", "provenance": {"detected_at": null, "evidence": {"frequency": 32, "origin": "AGENTIC-WP-0006 error mining / ASSESSMENT-infra-friction.md", "polarity": "problem", "repos": 8, "sessions": 12}, "promoted_at": null, "source_key": "problem:file_not_read:edit"}, "rendering_hints": {"claude": {"target": "CLAUDE.md"}, "codex": {"target": "AGENTS.md"}, "grok": {"target": ".grok/instructions.md"}}, "resolutions": [{"detail": "Never blind-write a file you haven't read this session.", "steps": ["Read the target file", "Then Edit/Write"], "summary": "Read the file (or the region you'll touch) before Edit/Write"}, {"detail": "A stale read means the file changed under you; refresh, don't loop.", "steps": ["Re-Read the file", "Re-apply the Edit"], "summary": "On 'modified since read', re-Read then re-Edit"}], "schema_version": 1, "scope": {"domains": [], "flavors": [], "repos": []}, "status": "superseded", "updated_at": "2026-06-07T13:26:25Z", "version": "1.0.0"}
@@ -0,0 +1 @@
 """Detect: extract signals from sessions, cluster into candidate patterns."""
@@ -0,0 +1 @@
 {"captured_at": "2026-06-07T13:30:14Z", "error_rate": 0.963, "infra_overhead_share_median": 0.117, "infra_overhead_share_p90": 0.261, "label": "phase4-baseline (pre-fixes)", "n_sessions": 27, "recurring_error_occurrences": 505, "schema_thrash_sessions": 8, "success_rate": 1.0, "tokens_p50": 250725, "tokens_p90": 1423966}