session_memory
Capture + retention layer for Helix Forge — the Capture stage of the loop in ../docs/PRD-helix-forge.md, built to the ../docs/DESIGN-session-memory.md spec.
It scans coding-agent session logs, normalizes them into one schema, distills a compact per-session digest, and ages out raw bulk under a storage budget (dropping sessions once analyzed and once space is needed) rather than a fixed time window.
Layout
session_memory/
adapters/common.py # shared Normalized bundle + helpers
adapters/claude.py # Tier0 -> Tier1 normalizers, one per flavor
adapters/codex.py # (rollout {timestamp,type,payload}, flat call_id join)
adapters/grok.py # (per-session dir: chat_history + events + updates)
core/schema.py # Session / SessionEvent / Cost
core/store.py # SQLite rows + blob-dir bodies (Tier1) + digests/patterns (Tier2)
core/cursor.py # incremental ingest cursors
core/digest.py # Tier1 -> Tier2 promotion + outcome heuristic
core/retention.py # budget-based eviction sweep
ingest.py # one sweep: discover -> normalize -> store -> digest -> evict
detect/signals.py # signal extractors over digests
detect/cluster.py # cluster signals -> candidate patterns + cross-flavor flag
detect/__main__.py # python -m session_memory.detect (ranked report)
curate/schema.py # SolutionPattern artifact + per-flavor rendering hints
curate/catalog.py # versioned, files-first Pattern Catalog (dedup on id)
curate/gating.py # promotion evidence bar + bloat guard
curate/review.py # discuss/approve/reject -> promote workflow
curate/decisions.py # hub decision audit trail (graceful local-queue fallback)
curate/__main__.py # python -m session_memory.curate (interactive / --auto-approve)
catalog/ # the committed Pattern Catalog (source of truth)
config.toml # store paths, retention caps, sources, repo->domain map, curate gate
The local store lives under session_memory/.store/ (gitignored).
Run a sweep
# from the repo root
python -m session_memory.ingest # ingest + analyze + evict
python -m session_memory.ingest --dry-run # discover + parse only, writes nothing
python -m session_memory.ingest --config path/to/config.toml
Output reports discovered / ingested / skipped_unchanged / analyzed and a
retention line (freed, final_usage, and per-pass eviction counts). Sweeps are
idempotent — re-running skips unchanged files via the cursor.
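To illustrate how a cursor can make re-runs idempotent, here is a minimal sketch. It assumes the cursor remembers each file's size and modification time; the real fields kept by core/cursor.py are not described in this README, and the names Cursor, is_unchanged, mark, and sweep are hypothetical.

```python
import os
from dataclasses import dataclass, field


@dataclass
class Cursor:
    """Hypothetical cursor: remembers (size, mtime_ns) per ingested file."""
    seen: dict = field(default_factory=dict)

    def is_unchanged(self, path: str) -> bool:
        st = os.stat(path)
        return self.seen.get(path) == (st.st_size, st.st_mtime_ns)

    def mark(self, path: str) -> None:
        st = os.stat(path)
        self.seen[path] = (st.st_size, st.st_mtime_ns)


def sweep(paths, cursor):
    """Return counters named after the sweep report fields."""
    report = {"discovered": 0, "ingested": 0, "skipped_unchanged": 0}
    for path in paths:
        report["discovered"] += 1
        if cursor.is_unchanged(path):
            report["skipped_unchanged"] += 1
            continue
        # A real sweep would normalize, store, and digest the file here.
        cursor.mark(path)
        report["ingested"] += 1
    return report
```

Running sweep twice over the same untouched files ingests them once and counts them as skipped_unchanged the second time; appending to a file changes its size, so it is picked up again.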
Scheduling (cadence)
Retention is budget-based; the cadence in config.toml only decides how often
the sweep runs. Trigger it with the repo scheduler, e.g. daily:
# Claude Code: schedule a daily routine that runs the sweep
/schedule "daily session-memory sweep" -- python -m session_memory.ingest
or a cron entry / /loop on a timer. Push-capture (agent Stop/SessionEnd hooks)
can also enqueue a sweep; see design §7.
Detect candidate patterns
After ingesting, mine the digests for recurring problem/success patterns:
python -m session_memory.detect # ranked report, cross-flavor first
python -m session_memory.detect --json # machine-readable candidates
python -m session_memory.detect --min-frequency 3
Candidates are persisted to a Tier 2 patterns table and are the input to the
Curate phase (Phase 2). Patterns whose evidence spans more than one agent flavor
are flagged [CROSS-FLAVOR] — the highest-value reuse targets.
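The cross-flavor flag can be pictured with a small sketch. This is not the code in detect/cluster.py: the signal fields (key, flavor, session_id), the function name cluster_signals, and the default threshold of 2 are all assumptions made for illustration.

```python
from collections import defaultdict


def cluster_signals(signals, min_frequency=2):
    """Group signals by key and flag clusters seen in more than one flavor.

    Each signal is a dict with hypothetical fields: key, flavor, session_id.
    """
    groups = defaultdict(list)
    for sig in signals:
        groups[sig["key"]].append(sig)

    candidates = []
    for key, members in groups.items():
        if len(members) < min_frequency:
            continue  # too rare to count as a recurring pattern
        flavors = sorted({m["flavor"] for m in members})
        candidates.append({
            "key": key,
            "frequency": len(members),
            "sessions": len({m["session_id"] for m in members}),
            "flavors": flavors,
            "cross_flavor": len(flavors) > 1,
        })
    # Cross-flavor candidates first, then by descending frequency.
    candidates.sort(key=lambda c: (not c["cross_flavor"], -c["frequency"]))
    return candidates
```

A cluster with evidence from both a Claude and a Codex session sorts ahead of a more frequent cluster seen in only one flavor, which mirrors the "cross-flavor first" ordering of the report.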
Curate candidates into the Pattern Catalog
Review detect candidates into versioned Solution Patterns held in the
files-first catalog (session_memory/catalog/). The flow is detect → curate →
(Phase 3) distribute; curate refreshes candidates by running detect first.
python -m session_memory.curate # interactive review (a/r/d per candidate)
python -m session_memory.curate --auto-approve # batch: promote all that clear the evidence bar
python -m session_memory.curate --json # machine-readable result
- Promotion writes a SolutionPattern file (id = source candidate key, so re-promoting the same candidate dedups; content changes bump the semver and archive the prior version to <id>.history.jsonl).
- The evidence bar ([curate.gate]) sets two floors: a promote floor and a stricter distribution floor. A thin-but-real candidate lands provisional; one clearing the distribution floor lands approved + distribution_ready.
- A bloat guard flags duplicate / near-duplicate candidates so the catalog stays lean.
- Re-review is idempotent — a remembered decision is skipped unless the candidate's evidence changed; a prior reject is not re-surfaced.
- Each final promote/reject is recorded as a hub decision; if the hub is offline the decision is queued to [curate].decision_queue for later sync (the same after-the-fact pattern used in Phase 1).
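The two-floor evidence bar described in the list above can be sketched as follows. The Gate field names follow the [curate.gate] keys, but the default values, the candidate fields, the "below_bar" label, and the evaluate function are hypothetical, not the logic in curate/gating.py.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    # Illustrative values only; not the project's defaults.
    min_frequency: int = 2
    min_sessions: int = 2
    min_cost_impact: float = 0.0
    dist_require_cross_flavor: bool = True
    dist_min_frequency: int = 5
    dist_min_cost_impact: float = 1.0


def evaluate(candidate, gate):
    """Classify a candidate dict against the promote and distribution floors."""
    clears_promote = (
        candidate["frequency"] >= gate.min_frequency
        and candidate["sessions"] >= gate.min_sessions
        and candidate["cost_impact"] >= gate.min_cost_impact
    )
    if not clears_promote:
        return "below_bar"  # hypothetical label for "not promotable"

    clears_distribution = (
        (candidate["cross_flavor"] or not gate.dist_require_cross_flavor)
        and candidate["frequency"] >= gate.dist_min_frequency
        and candidate["cost_impact"] >= gate.dist_min_cost_impact
    )
    if clears_distribution:
        return "approved+distribution_ready"
    return "provisional"
```

Keeping the distribution check strictly after the promote check means a candidate can never be distribution-ready without also being promotable.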
Curate knobs ([curate] / [curate.gate] in config.toml)
| Key | Meaning |
|---|---|
| catalog_dir | committed Pattern Catalog dir (source of truth) |
| review_log / decision_queue | remembered decisions + pending hub decisions (gitignored) |
| min_frequency / min_sessions / min_cost_impact | floor to promote at all |
| dist_require_cross_flavor | require cross-flavor evidence to be distribution-eligible |
| dist_min_frequency / dist_min_cost_impact | stricter floor for distribution_ready |
Retention knobs ([retention] in config.toml)
| Key | Meaning |
|---|---|
| raw_soft_cap_bytes | begin evicting analyzed sessions above this (oldest first) |
| raw_hard_cap_bytes | absolute Tier 1 ceiling; overflow path may, as a last resort, evict un-analyzed sessions and report data_loss |
| raw_max_age_days | backstop: analyzed raw older than this is evictable regardless of space |
| distilled_cap_bytes | Tier 2 ceiling; alert only, never auto-dropped |
Invariant: a session's raw bytes are never dropped before its Tier 2 digest exists, except the explicitly-reported hard-cap overflow path.
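One way to read the retention knobs and the invariant together is as a three-pass sweep. The sketch below is an assumption about ordering (age backstop, then soft cap, then hard-cap overflow), not the code in core/retention.py; the session fields (id, size_bytes, age_days, analyzed) and the function name retention_sweep are hypothetical.

```python
def retention_sweep(sessions, soft_cap, hard_cap, max_age_days):
    """Return (evicted_ids, data_loss_ids, final_usage).

    sessions: list of dicts with hypothetical fields
    id, size_bytes, age_days, analyzed.
    """
    live = sorted(sessions, key=lambda s: s["age_days"], reverse=True)  # oldest first
    usage = sum(s["size_bytes"] for s in live)
    evicted, data_loss = [], []

    def drop(session, lost=False):
        nonlocal usage
        live.remove(session)
        usage -= session["size_bytes"]
        evicted.append(session["id"])
        if lost:
            data_loss.append(session["id"])

    # Pass 1: age backstop, analyzed sessions only.
    for s in list(live):
        if s["analyzed"] and s["age_days"] > max_age_days:
            drop(s)

    # Pass 2: above the soft cap, evict analyzed sessions oldest first.
    for s in list(live):
        if usage <= soft_cap:
            break
        if s["analyzed"]:
            drop(s)

    # Pass 3: last resort above the hard cap; un-analyzed raw may be lost,
    # and every such loss is reported rather than hidden.
    for s in list(live):
        if usage <= hard_cap:
            break
        drop(s, lost=not s["analyzed"])

    return evicted, data_loss, usage
```

Only the third pass can touch a session without a digest, and every session it drops that way is returned in data_loss_ids, which matches the "explicitly-reported" exception in the invariant.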
Tests
python -m pytest # schema, adapters, store, digest, retention, ingest, detect, curate
Status
- Phase 0 (AGENTIC-WP-0002): schema, store, digest, budget retention, Claude adapter, ingest sweep.
- Phase 1 (AGENTIC-WP-0003): Codex + Grok adapters, multi-file session merge, and the Detect pipeline (signals → clustering → cross-flavor candidate patterns).
- Phase 2 (AGENTIC-WP-0004): Curate — Solution Pattern schema, versioned files-first Pattern Catalog, discuss/approve/reject review with an evidence bar + bloat guard, and hub-decision audit trail.
- Next — Phase 3 (Distribute) / Phase 4 (Measure) follow per the PRD.