diff --git a/workplans/AGENTIC-WP-0006-error-body-mining.md b/workplans/AGENTIC-WP-0006-error-body-mining.md
new file mode 100644
index 0000000..86c7420
--- /dev/null
+++ b/workplans/AGENTIC-WP-0006-error-body-mining.md
@@ -0,0 +1,80 @@
+---
+id: AGENTIC-WP-0006
+type: workplan
+title: "Coding Session Memory — Error-Body Mining (content-level root causes)"
+domain: helix_forge
+repo: agentic-resources
+status: ready
+owner: codex
+topic_slug: helix-forge
+created: "2026-06-07"
+updated: "2026-06-07"
+state_hub_workstream_id: "c6e44147-15fd-4cfa-ab2d-87461a6858f1"
+---
+
+# Coding Session Memory — Error-Body Mining
+
+The friction assessment ([ASSESSMENT-infra-friction.md](../docs/ASSESSMENT-infra-friction.md))
+can see *that* a session was expensive (tool-mix, cost, overhead share) but not
+always *why* at the content level — the specific error messages and repeated
+failed approaches. The digest captures tool histograms and prompt/response
+snippets, but **not error-body text**. This workplan closes that gap so Detect can
+surface recurring root-cause errors, not just coarse markers.
+
+Approach: capture **normalized error fingerprints + samples into the durable Tier 2
+digest** (raw Tier 1 blobs are evictable, so mining must persist into the digest),
+then cluster recurring fingerprints across sessions into candidate problem
+patterns through the existing clusterer. No new capture source — this reads the
+event/blob bodies already ingested.
+
+## Capture Error-Body Snippets into the Digest
+
+```task
+id: AGENTIC-WP-0006-T01
+status: todo
+priority: high
+state_hub_task_id: "136a0a73-61c2-4390-876c-de3880a967e6"
+```
+
+Extend `core/digest.py` `build_digest` to extract, from failed events
+(`kind=error` and `tool_result` bodies matching the existing `_FAIL_HINTS`), a
+**normalized fingerprint** (strip paths, line numbers, UUIDs, hex) plus a short
+sample, stored as `digest["error_snippets"] = [{fingerprint, sample, count, tool}]`.
+Same error across a session collapses to one fingerprint with a count. Durable in
+Tier 2 (survives Tier 1 eviction). Bump `SCHEMA_VERSION`. Unit-tested on synthetic
+sessions with repeated and varied errors.
+
+## Recurring-Error Signal + Clustering
+
+```task
+id: AGENTIC-WP-0006-T02
+status: todo
+priority: high
+state_hub_task_id: "1a41b6f5-48bc-4080-bd18-94f2186ef566"
+```
+
+Add `detect/signals.py` `sig_recurring_error` keyed on the error fingerprint, so
+the same error recurring across sessions/repos/flavors clusters into a candidate
+problem pattern (locus = fingerprint; magnitude = occurrences). Feeds the existing
+clusterer + cross-flavor flagging, so a root-cause error common to multiple flavors
+is flagged cross-flavor. Respects the WP-0005 quality filter. Unit-tested on
+synthetic digests sharing a fingerprint.
+
+## Re-run Live, Extend Friction Assessment with Root Causes
+
+```task
+id: AGENTIC-WP-0006-T03
+status: todo
+priority: medium
+state_hub_task_id: "bed16d23-3971-4257-b066-d1e639fef150"
+```
+
+Re-ingest (to populate `error_snippets` — schema bump invalidates old digests) and
+re-run detect over the real local sessions. Add a **"content-level root causes"**
+section to [ASSESSMENT-infra-friction.md](../docs/ASSESSMENT-infra-friction.md):
+top recurring error fingerprints with counts and affected repos/flavors. Full suite
+green. After workplan updates, notify the operator to run from `~/state-hub`:
+
+```bash
+make fix-consistency REPO=agentic-resources
+```
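
The fingerprinting in T01 and the cross-session recurrence count in T02 could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of `core/digest.py` or `detect/signals.py`: the regexes, the placeholder tokens, the function names (`normalize_error`, `build_error_snippets`, `recurring_fingerprints`), the `max_sample_len` and `min_sessions` parameters, and the choice to keep the first-seen tool per fingerprint are all hypothetical. It also assumes the caller has already selected failed events (the `_FAIL_HINTS` filter is not reproduced here).

```python
import re
from collections import Counter

# Hypothetical normalization rules. Order matters: UUIDs are replaced before
# generic hex and integers so their digits are not rewritten piecemeal.
_UUID = re.compile(
    r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b"
)
_PATH = re.compile(r"(?:[\w.~\-]+)?(?:[/\\][\w.\-]+)+")
_HEX = re.compile(r"\b0x[0-9a-fA-F]+\b|\b[0-9a-fA-F]{8,}\b")
_NUM = re.compile(r"\b\d+\b")


def normalize_error(body: str) -> str:
    """Reduce an error body to a fingerprint with volatile details masked."""
    text = _UUID.sub("<uuid>", body)
    text = _PATH.sub("<path>", text)
    text = _HEX.sub("<hex>", text)
    text = _NUM.sub("<n>", text)  # line numbers and other bare integers
    return " ".join(text.split())  # collapse runs of whitespace


def build_error_snippets(failed_events, max_sample_len=200):
    """Collapse (tool, body) pairs from one session into error_snippets entries.

    Assumption: one entry per fingerprint, keeping the first-seen sample and tool.
    """
    counts = Counter()
    first_seen = {}
    for tool, body in failed_events:
        fp = normalize_error(body)
        counts[fp] += 1
        first_seen.setdefault(fp, (body[:max_sample_len], tool))
    return [
        {
            "fingerprint": fp,
            "sample": first_seen[fp][0],
            "count": n,
            "tool": first_seen[fp][1],
        }
        for fp, n in counts.most_common()
    ]


def recurring_fingerprints(digests, min_sessions=2):
    """Map each fingerprint seen in >= min_sessions digests to its total count."""
    sessions = Counter()
    occurrences = Counter()
    for digest in digests:
        for snippet in digest.get("error_snippets", []):
            sessions[snippet["fingerprint"]] += 1
            occurrences[snippet["fingerprint"]] += snippet["count"]
    return {fp: occurrences[fp] for fp, n in sessions.items() if n >= min_sessions}
```

Under these assumptions, two errors that differ only in path, line number, UUID, or address share one fingerprint, and the returned mapping corresponds to the "locus = fingerprint; magnitude = occurrences" shape described in T02. A real signal would still need to pass through the existing clusterer and the WP-0005 quality filter.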