session-memory: denoise error fingerprints (WP-0006 follow-up)

Tighten _is_failed: exclude successful hub JSON responses (top-level no-error
payloads) and file-read snapshots (numbered cat -n source lines) that were
polluting error_snippets. JSON verdict classifies error vs success payloads
directly. Cuts distinct fingerprints 444 -> 269 (~40%) over the real corpus with
the top errors unchanged. Assessment caveat updated. 5 new tests; suite 102/102.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
2026-06-07 13:39:08 +02:00
parent 7cce276d32
commit 1b6081cd88
3 changed files with 80 additions and 7 deletions

View File

@@ -120,11 +120,14 @@ Reading:
in 3 sessions each — corroborates the plumbing-overhead story and the live MCP
flakiness seen during this work (REST fallback used).
**Caveat — fingerprint noise:** the fail-hint heuristic also catches non-failures
(successful hub JSON responses, source lines containing `raise …Error`, linter
"N errors" summaries). The *top* fingerprints above are real; a future refinement
should tighten `_is_failed` (e.g. skip valid-JSON success payloads and code-read
snapshots) before trusting the long tail.
**Fingerprint noise — mostly handled.** `_is_failed` now excludes successful hub
JSON responses (top-level no-error payloads) and file-read snapshots (numbered
`cat -n` source lines), which cut distinct fingerprints **444 → 269 (~40 %)**
without touching the top entries. Residual low-value items remain in the long tail
(bare structural lines like `{`, linter "N errors" summaries); the *top*
fingerprints are real. Note several entries (`MCP error -32602`,
`update_task_status 'title'`) reflect the State Hub MCP instability hit live during
this work — genuine, if self-referential, friction.
## What this assessment still can't see