generated from coulomb/repo-seed
Re-ran ingest->detect with the quality filter + infra signals over real local sessions (72 captured -> 27 real). Purged the false-positive 'abandoned' catalog entry and re-curated; catalog now carries tool_thrash/schema_thrash/infra_overhead patterns. docs/ASSESSMENT-infra-friction.md ranks the friction: ~17.6% of real tool activity is hub/task/schema plumbing (State Hub 10.3%, one session 231 calls; ToolSearch in 81% of sessions). Validates the CLI/MCP-skill hypothesis as top-2; recommends a State Hub skill (front-load schemas + batched writes) + bulk hub ops. Workplan finished; suite 88/88. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
---
id: AGENTIC-WP-0005
type: workplan
title: "Coding Session Memory — Detect Hardening (quality filter + infra signals)"
domain: helix_forge
repo: agentic-resources
status: finished
owner: codex
topic_slug: helix-forge
created: "2026-06-07"
updated: "2026-06-07"
state_hub_workstream_id: "d8b7b8d1-1d85-4d2a-8ccd-7b0366a9442d"
---

# Coding Session Memory — Detect Hardening

A focused hardening pass (call it Phase 1.5) so the Detect output is trustworthy
enough to drive an **infrastructure assessment**. Triggered by ad-hoc analysis of
the live store after Phase 2:

- Of **72 captured sessions, only 31 are real coding sessions**; the rest are
  health-checks / smoke-tests / interrupted runs (mostly `llm-connect` *"Say hello
  in one word"*). The `abandoned` outcome heuristic mislabels these, and Phase 2
  cataloged a **false-positive** "cross-flavor abandoned" pattern as
  `approved`/`distribution_ready`.
- All 31 real sessions read as `success`, so the current signal set
  (outcome + markers + cost) surfaces almost no genuine friction.
- The already-captured `tool_histogram` tells the real story: **~17% of tool
  activity in real sessions is State Hub MCP + task plumbing + `ToolSearch`
  schema-loading**, concentrated to 40–70% in some sessions — but `signals.py`
  never looks at it.

No new capture is needed — this is analysis the data already supports.

## Session-Quality Filter

```task
id: AGENTIC-WP-0005-T01
status: done
priority: high
state_hub_task_id: "9f8b4304-0a37-4f66-ad34-d93e12fba0d8"
```

Add `detect/quality.py` with `is_real_coding_session(digest)` that filters out
health-checks, smoke-tests, interrupted, and trivially-short sessions (event-count
floor, repo present, substantive edit/tool activity, not a single hello/interrupt
prompt). Wire it into the detect pipeline so signals/clusters only form over real
sessions — fixing the `abandoned` false-positive. Knobs under `[detect]` in
`config.toml`. Unit-tested on synthetic trivial-vs-real digests.
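
The filter described above could look roughly like the sketch below. Only the function name `is_real_coding_session` and the `tool_histogram` field come from this workplan; the digest shape (a dict with `event_count`, `repo`, `prompts`), the `QualityConfig` class, and every threshold value are assumptions made for illustration, not the contents of the real `detect/quality.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualityConfig:
    # Hypothetical knobs; the real ones live under [detect] in config.toml.
    min_events: int = 10
    min_tool_calls: int = 3
    trivial_prompts: tuple = ("say hello", "[request interrupted")


def is_real_coding_session(digest: dict, cfg: QualityConfig = QualityConfig()) -> bool:
    """Return True when a digest looks like substantive coding work."""
    # Event-count floor: drops trivially short sessions.
    if digest.get("event_count", 0) < cfg.min_events:
        return False
    # A real coding session is tied to a repository.
    if not digest.get("repo"):
        return False
    # Substantive activity: enough tool calls in total.
    if sum(digest.get("tool_histogram", {}).values()) < cfg.min_tool_calls:
        return False
    # A single hello/interrupt prompt marks a health-check or aborted run.
    prompts = digest.get("prompts", [])
    if len(prompts) == 1:
        text = prompts[0].strip().lower()
        if any(text.startswith(p) for p in cfg.trivial_prompts):
            return False
    return True


# Synthetic trivial-vs-real digests (made-up values, not taken from the store).
smoke = {
    "event_count": 4,
    "repo": "llm-connect",
    "tool_histogram": {},
    "prompts": ["Say hello in one word"],
}
real = {
    "event_count": 120,
    "repo": "agentic-resources",
    "tool_histogram": {"Edit": 14, "Bash": 30},
    "prompts": ["Harden the detect pipeline"],
}

print(is_real_coding_session(smoke))  # False
print(is_real_coding_session(real))   # True
```

Keeping the check a pure function over a digest means the pipeline can apply it as a plain pre-filter before signals and clusters are formed.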

## Infra-Overhead + Thrash Signals

```task
id: AGENTIC-WP-0005-T02
status: done
priority: high
state_hub_task_id: "10d57b05-a731-4ece-bf45-f6a98ac77555"
```

Add `tool_histogram`-based extractors to `detect/signals.py`: a shared tool-bucket
helper (`shell` / `edit` / `read` / `statehub_mcp` / `task_mgmt` / `schema_load` /
`other`); `sig_infra_overhead` (PROBLEM when the statehub+task+schema share of tool
calls exceeds a threshold; magnitude = share; locus `infra_overhead`);
`sig_schema_thrash` (`ToolSearch` count over threshold; locus `schema_load`);
`sig_tool_thrash` (extreme single-tool repetition). Pure functions over digests;
thresholds configurable. Unit-tested.
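
As a rough illustration of the extractors listed above, the sketch below buckets a `tool_histogram` and derives the three signals. The bucket names, signal names, and loci `infra_overhead` / `schema_load` follow this workplan. The tool-name mapping, the `mcp__state-hub__` prefix, the returned dict shape, the default thresholds, the `PROBLEM` kind on the two thrash signals, and the locus chosen for `sig_tool_thrash` are all assumptions.

```python
from collections import Counter

# Hypothetical tool-name mapping; real tool names in the store may differ.
_EXACT = {
    "Bash": "shell",
    "Edit": "edit",
    "Write": "edit",
    "Read": "read",
    "Grep": "read",
    "Glob": "read",
    "TodoWrite": "task_mgmt",
    "ToolSearch": "schema_load",
}
_STATEHUB_PREFIX = "mcp__state-hub__"  # assumed MCP naming convention


def bucket_of(tool: str) -> str:
    """Map one tool name to its bucket."""
    if tool.startswith(_STATEHUB_PREFIX):
        return "statehub_mcp"
    return _EXACT.get(tool, "other")


def bucket_counts(tool_histogram: dict) -> Counter:
    """Collapse a per-tool histogram into per-bucket call counts."""
    counts = Counter()
    for tool, n in tool_histogram.items():
        counts[bucket_of(tool)] += n
    return counts


def sig_infra_overhead(digest: dict, threshold: float = 0.25):
    """PROBLEM when the statehub+task+schema share of tool calls exceeds threshold."""
    counts = bucket_counts(digest.get("tool_histogram", {}))
    total = sum(counts.values())
    if total == 0:
        return None
    infra = counts["statehub_mcp"] + counts["task_mgmt"] + counts["schema_load"]
    share = infra / total
    if share <= threshold:
        return None
    return {"kind": "PROBLEM", "locus": "infra_overhead", "magnitude": share}


def sig_schema_thrash(digest: dict, threshold: int = 5):
    """Fires when the ToolSearch count is over the threshold."""
    n = digest.get("tool_histogram", {}).get("ToolSearch", 0)
    if n <= threshold:
        return None
    return {"kind": "PROBLEM", "locus": "schema_load", "magnitude": n}


def sig_tool_thrash(digest: dict, threshold: int = 100):
    """Fires when a single tool is repeated more than threshold times."""
    hist = digest.get("tool_histogram", {})
    if not hist:
        return None
    tool, n = max(hist.items(), key=lambda kv: kv[1])
    if n <= threshold:
        return None
    # Using the tool name as locus is an assumption of this sketch.
    return {"kind": "PROBLEM", "locus": tool, "magnitude": n}


# Made-up digest: share = (12 + 4 + 4) / 50 = 0.4, above the 0.25 default.
sample = {
    "tool_histogram": {
        "Bash": 20,
        "Edit": 10,
        "mcp__state-hub__update_task": 12,
        "TodoWrite": 4,
        "ToolSearch": 4,
    }
}
print(sig_infra_overhead(sample))
```

Each extractor returns `None` when nothing fires, so a caller can collect signals by filtering out the `None` results.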

## Re-run Live, Purge False Positives, Ranked Friction Report

```task
id: AGENTIC-WP-0005-T03
status: done
priority: high
state_hub_task_id: "8b9d029a-60d0-4caf-af62-4fcc9c9a645c"
```

Re-run `ingest → detect` over the real local sessions with the filter + new
signals. Purge the false-positive catalog entries seeded in Phase 2 (the
health-check `abandoned` pattern) and re-curate so the catalog reflects real
friction. Produce a ranked **friction assessment** (`docs/ASSESSMENT-infra-friction.md`)
of the major infrastructure problems — quantified per repo/flavor, infra-overhead
share, schema-thrash — with recommendations (incl. the State Hub / MCP skill
hypothesis). After workplan file updates, notify the operator to run from
`~/state-hub`:

```bash
make fix-consistency REPO=agentic-resources
```
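
For the ranked friction assessment in this task, a per-repo ranking by infra-overhead share could be computed along the lines of the sketch below. The helper names `is_plumbing` and `rank_infra_friction`, the digest fields, and the set of tool names treated as plumbing are hypothetical; the actual report may aggregate differently (for example per flavor as well as per repo).

```python
from collections import defaultdict


def is_plumbing(tool: str) -> bool:
    # Assumed plumbing tools: State Hub MCP calls, task management, schema loading.
    return tool.startswith("mcp__state-hub__") or tool in {"TodoWrite", "ToolSearch"}


def rank_infra_friction(digests):
    """Return [(repo, share, plumbing_calls, total_calls)], highest share first."""
    totals = defaultdict(lambda: [0, 0])  # repo -> [plumbing, total]
    for d in digests:
        row = totals[d.get("repo", "unknown")]
        for tool, n in d.get("tool_histogram", {}).items():
            row[1] += n
            if is_plumbing(tool):
                row[0] += n
    # Skip repos with no tool calls to avoid dividing by zero.
    ranked = [(repo, p / t, p, t) for repo, (p, t) in totals.items() if t]
    return sorted(ranked, key=lambda r: r[1], reverse=True)


# Made-up digests for illustration only.
digests = [
    {"repo": "a", "tool_histogram": {"Bash": 8, "ToolSearch": 2}},
    {"repo": "b", "tool_histogram": {"Edit": 5, "TodoWrite": 5}},
    {"repo": "a", "tool_histogram": {"Read": 10}},
]
for repo, share, plumbing, total in rank_infra_friction(digests):
    print(f"{repo}: {share:.1%} ({plumbing}/{total} calls)")
```

Summing calls per repo before dividing weights long sessions more heavily than short ones; averaging per-session shares instead would be an equally reasonable choice with a different emphasis.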