From c14d09c14d6e4357d2aaabcdbc350830a4c92113 Mon Sep 17 00:00:00 2001 From: tegwick Date: Mon, 22 Jun 2026 01:39:07 +0200 Subject: [PATCH] feat(memory): complete CYA-WP-0006 Profile 1 production hardening Add guided reflection capture with preview, cya memory reflections CLI, near-duplicate compaction, budget-capped surfacing, and expanded tests. Profile 1 is now documented as production-ready in README and MemoryVision. --- AGENTS.md | 8 +- MemoryVision.md | 2 + README.md | 27 +- SCOPE.md | 4 +- docs/CYA-WP-0006-profile-1-gap-checklist.md | 20 ++ src/cya/cli/main.py | 56 ++++ src/cya/memory/__init__.py | 35 ++- src/cya/memory/reflections.py | 241 ++++++++++++++++++ src/cya/orchestrator.py | 139 ++++++++-- tests/test_memory.py | 133 +++++++++- tests/test_orchestrator.py | 86 +++++++ ...-WP-0006-profile-1-production-hardening.md | 36 ++- 12 files changed, 735 insertions(+), 52 deletions(-) create mode 100644 docs/CYA-WP-0006-profile-1-gap-checklist.md create mode 100644 src/cya/memory/reflections.py create mode 100644 tests/test_orchestrator.py diff --git a/AGENTS.md b/AGENTS.md index 38d2f37..bbd1818 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -190,8 +190,8 @@ cya --explain-context "show me what context would be collected" # Memory features (0003 + 0005) cya retrospect # Guided reflection session -# Current memory implementation is Profile 0 (see CYA-WP-0005 and MemoryVision "Profile 0 Baseline"). -# Future Profiles 1–3 (verbal self-improvement, hierarchical synthesis, procedural rules) are tracked in that workplan. +# Memory: Profile 0 baseline + production Profile 1 (CYA-WP-0006). +# Profiles 2–3 (hierarchical synthesis, procedural rules) remain roadmap — see MemoryVision.md. 
# Tests python -m pytest tests/ -q @@ -211,7 +211,9 @@ Relevant workplans: - `workplans/CYA-WP-0003-contextual-memory-activation-and-retrospection.md` - `workplans/CYA-WP-0004-dev-install-and-release-packaging.md` - `workplans/CYA-WP-0005-agentic-memory-profiles-and-phase-memory-feedback.md` -- `workplans/CYA-WP-0006-profile-1-production-hardening.md` (ready — next slice) +- `workplans/CYA-WP-0006-profile-1-production-hardening.md` (finished) +- `workplans/CYA-WP-0007-interactive-shell-session.md` (ready — interactive REPL + history + hub) +- `workplans/CYA-WP-0008-llm-connect-adapter-integration.md` (ready — real LLM behind adapter seam) --- diff --git a/MemoryVision.md b/MemoryVision.md index 7cbc51c..d52259c 100644 --- a/MemoryVision.md +++ b/MemoryVision.md @@ -205,6 +205,8 @@ The three profiles are ordered by increasing agentic power and implementation co ### Profile 1 — Reflexion-Style Verbal Self-Improvement Loop +**Status:** Production (CYA-WP-0006). Shipped with guided capture in `cya retrospect`, `cya memory reflections`, explicit compaction, and budget-capped surfacing in responses and `--explain-context`. + **Intent:** Enable lightweight, high-explainability self-improvement by capturing and preferentially activating natural-language "lessons" and verbal reflections from user interactions and retrospection sessions. **Core Loop (condensed):** diff --git a/README.md b/README.md index 89c38be..238e515 100644 --- a/README.md +++ b/README.md @@ -115,12 +115,29 @@ Run a guided reflection session to review how memory was used and explicitly set cya retrospect ``` -During the session `cya` will (Profile 1 spike in T05 adds optional verbal lesson capture at the end): +During the session `cya` will: - Show recent memory items that were activated. - Help you reflect on what worked or didn't. - Let you record new **interaction goals** (e.g. "be more concise", "always show one safe alternative for destructive commands"). 
+- Optionally capture **1–3 verbal lessons** (Profile 1) with guided prompts, preview, and confirmation. -These goals are stored as first-class retrospection memory and will influence future activations and responses. +Example Profile 1 flow at the end of `cya retrospect`: + +``` +Capture 1–3 verbal lessons from this session? (y/n) y +What went well that you want to remember? (or 'skip') Safety warnings were clear +What should cya remember for next time? (or 'skip') Always suggest git status first +What should cya avoid in this scope? (or 'skip') skip + +Preview — verbal lessons to save +1. [went well] Safety warnings were clear +2. [remember] Always suggest git status first + +Save these lessons? (y/n) y +Saved 2 verbal reflection(s) (Profile 1). +``` + +These goals and lessons are stored as first-class memory and will influence future activations and responses. ### Inspecting and Controlling Memory @@ -132,6 +149,8 @@ All memory is stored in plain, user-editable JSON: Useful commands: ```bash cya --explain-context "..." # See exactly what memory was activated and why +cya memory reflections # List verbal reflections for the current scope +cya memory reflections --json # Export reflections as JSON # (You can also use the memory ports directly in Python if you want to script it.) ``` @@ -142,7 +161,9 @@ Memory also feeds the safety layer: a "never auto-run" preference you set during - Activation is automatic based on cwd + git root (with full provenance). - Retrospection outcomes are stored with special kinds (`retrospection`, `interaction_goal`) and get preferential treatment in future context building. - Everything is designed to be replaced/enriched by a full `phase-memory` implementation later (see MemoryVision.md). -- Current implementation is formally **Profile 0** (post-0003 local JSON + activation + retrospection loop). 
See CYA-WP-0005 and the "Profile 0 Baseline" section in MemoryVision.md for the exact definition and the roadmap to Profiles 1–3 (self-improving verbal reflections, hierarchical synthesis, and procedural rules). +- **Profile 0** (post-0003 local JSON + activation + retrospection loop) is the stable foundation. +- **Profile 1** (verbal reflections) is production-ready as of CYA-WP-0006: guided capture in `cya retrospect`, `cya memory reflections`, compaction, and surfacing in responses / `--explain-context`. +- Profiles 2–3 (hierarchical synthesis, procedural rules) remain roadmap items — see MemoryVision.md. See: - `docs/cya-memory-activation-and-retrospection-concept.md` (the T01 design) diff --git a/SCOPE.md b/SCOPE.md index 2f373b9..8b0d24e 100644 --- a/SCOPE.md +++ b/SCOPE.md @@ -14,7 +14,7 @@ Four implementation slices have been delivered: - **CYA-WP-0002 (Memory Integration)**: Real user-controlled, persisting memory (scoped JSON) behind explicit ports, wired into context and safety. - **CYA-WP-0003 (Contextual Activation & Retrospection)**: Directory/project-bound automatic memory activation, `cya retrospect` guided reflection sessions, retrospection outcomes feeding future behavior (continuous user-driven optimization loop). - **Profile 0 baseline (post-0003, formalized in CYA-WP-0005 T02)**: The current shipped memory implementation (local JSON + kinds + activation_context + provenance + retrospection helper) is now explicitly documented as **Profile 0** — the stable, high-quality foundation for future self-improving profiles 1–3. See MemoryVision.md for the full baseline description. 
-- **CYA-WP-0005 (Agentic Memory Profiles + first self-improvement capability)**: Complete profile model (Profile 0 baseline + detailed definitions + integration plans + Capability Matrix for Profiles 1–3) plus a minimal but fully working **Profile 1** (Reflexion-style verbal reflections/lessons) spike: `remember_reflection()` + `KIND_REFLECTION`, optional "capture verbal lesson" step inside `cya retrospect`, preferential activation when reflections are present, and surfacing in responses / `--explain-context`. The sister-repo optimization suggestions document for phase-memory was also finalized. See the workplan, MemoryVision.md, and `docs/phase-memory-optimization-suggestions.md`. +- **CYA-WP-0005 (Agentic Memory Profiles + first self-improvement capability)**: Complete profile model (Profile 0 baseline + detailed definitions + integration plans + Capability Matrix for Profiles 1–3) plus initial **Profile 1** delivery. **CYA-WP-0006** hardened Profile 1 to production quality: guided 1–3 lesson capture with preview in `cya retrospect`, `cya memory reflections`, near-duplicate compaction, and surfacing in responses / `--explain-context`. See MemoryVision.md and `docs/CYA-WP-0006-profile-1-gap-checklist.md`. - **CYA-WP-0004 (Dev-Head Install & Release Packaging)**: Reliable installation from development head (`make dev-install`, direct `git+` installs), dynamic versioning via `setuptools_scm`, clean distribution package building (`python -m build` + verification), lightweight release process, and supporting documentation/Makefile. Core capabilities now include: @@ -24,7 +24,7 @@ Core capabilities now include: - Stable `LLMAdapter` Protocol. - Real, user-controlled, contextually activated memory (Profile 0: directory/project scoped local JSON with kinds, activation_context, provenance, and retrospection outcomes as higher-order memory). - Automatic memory activation based on working directory/git root. 
-- `cya retrospect` for structured reflection and goal setting, now with optional verbal lesson capture (first delivered Profile 1 self-improvement behavior). +- `cya retrospect` for structured reflection and goal setting, with production Profile 1 verbal lesson capture, review (`cya memory reflections`), and compaction. - Full developer workflow: dev-head install, testing, building distribution packages, and a documented release process. - Transparent, inspectable behavior via `--explain-context`. diff --git a/docs/CYA-WP-0006-profile-1-gap-checklist.md b/docs/CYA-WP-0006-profile-1-gap-checklist.md new file mode 100644 index 0000000..654aa92 --- /dev/null +++ b/docs/CYA-WP-0006-profile-1-gap-checklist.md @@ -0,0 +1,20 @@ +# CYA-WP-0006 — Profile 1 Gap Checklist + +Audit of shipped Profile 1 spike (CYA-WP-0005-T05) vs MemoryVision Profile 1 acceptance criteria. +Gates implementation tasks T02–T05. + +| Gap | Spike state | Target (production) | Task | +|-----|-------------|---------------------|------| +| Capture UX | Single optional prompt; no preview | Guided 1–3 prompts (well / remember / avoid); preview + confirm; skip leaves no empty records | T02 | +| Provenance | Plain key/value only | Session date + scope in reflection provenance | T02 | +| Review / export | Manual JSON inspection | `cya memory reflections` lists/exports by scope | T03 | +| Compaction | None | Near-duplicate detection; explicit merge/replace (no silent delete) | T03 | +| Activation boost | Kind filter only | Reflections sorted ahead when recalled | T04 | +| Explain surfacing | Item count + sample keys | Reflection count + truncated lesson text + provenance | T04 | +| Response surfacing | Two reflections, 60-char truncate | Budget-capped "reflections influenced this" line | T04 | +| Observability | `by_kind` in export | Reflection counts by scope in export | T05 | +| Safety regression | Preference "never" tested | Reflections cannot downgrade destructive confirmation | T05 | +| Tests | Single 
spike test | Paths for capture, compaction, surfacing, safety | T05 | +| Docs | "spike" wording | README example session; MemoryVision "production" | T06 | + +**Profile 0 invariants (must not regress):** user-controlled JSON, full provenance, memory adds caution only. \ No newline at end of file diff --git a/src/cya/cli/main.py b/src/cya/cli/main.py index d41b8c5..b28e883 100644 --- a/src/cya/cli/main.py +++ b/src/cya/cli/main.py @@ -11,6 +11,7 @@ This module will evolve in T06 (orchestrator) but the surface contract stays sta from __future__ import annotations +import json import sys import typer @@ -108,6 +109,61 @@ def main( ) +memory_app = typer.Typer( + help="Inspect and manage user-controlled memory (Profile 0 + Profile 1).", + rich_markup_mode="rich", +) +app.add_typer(memory_app, name="memory") + + +@memory_app.command("reflections") +def memory_reflections( + scope: str = typer.Option( + ".", + "--scope", + "-s", + help="Directory or scope to list reflections for.", + ), + export_json: bool = typer.Option( + False, + "--json", + help="Export reflections as JSON (same as export_memory with kinds filter).", + ), +) -> None: + """List or export verbal reflections for a scope (Profile 1).""" + from cya.memory import export_memory, KIND_REFLECTION + from cya.memory.reflections import list_reflections, reflection_export_stats + + if export_json: + data = export_memory(scope, kinds=[KIND_REFLECTION]) + console.print(json.dumps(data, indent=2, default=str)) + return + + items = list_reflections(scope) + stats = reflection_export_stats(scope) + if not items: + console.print(f"[yellow]No reflections in scope {scope!r}.[/yellow]") + return + + lines = [] + for item in items: + prov = item.get("provenance") or {} + date = prov.get("session_date", "?") + lines.append(f"• [bold]{item.get('key')}[/bold] ({date}): {item.get('value')}") + + console.print( + Panel( + "\n".join(lines), + title=f"Reflections in {scope} ({stats['reflection_count']} total)", + 
border_style="cyan", + ) + ) + if stats.get("reflection_counts_by_scope"): + console.print( + f"[dim]By scope: {stats['reflection_counts_by_scope']}[/dim]" + ) + + @app.command() def retrospect( scope: str = typer.Option( diff --git a/src/cya/memory/__init__.py b/src/cya/memory/__init__.py index 7926fdc..1d38000 100644 --- a/src/cya/memory/__init__.py +++ b/src/cya/memory/__init__.py @@ -99,6 +99,7 @@ def remember_preference( profile: str | None = None, ttl: str | None = None, kind: str = KIND_PREFERENCE, + provenance: dict[str, Any] | None = None, ) -> None: """Remember a user preference, workflow pattern, retrospection outcome, or goal. @@ -118,6 +119,8 @@ def remember_preference( "profile": profile, "kind": kind, } + if provenance: + item["provenance"] = provenance # avoid exact dups for same key in small stores items = [i for i in items if i.get("key") != key] items.append(item) @@ -166,7 +169,13 @@ def recall_preferences( normal.append(item) items = boosted + normal - items = items[-limit:] # most recent first (after boosting) + # Profile 1: prefer reflections ahead of other kinds when requested + if kinds and KIND_REFLECTION in kinds: + reflections = [i for i in items if i.get("kind") == KIND_REFLECTION] + others = [i for i in items if i.get("kind") != KIND_REFLECTION] + items = (reflections + others)[:limit] + else: + items = items[-limit:] # most recent (after boosting) return { "items": items, @@ -222,7 +231,13 @@ def export_memory(scope: str = "cwd", *, profile: str | None = None, kinds: list k = item.get("kind", "unknown") by_kind.setdefault(k, []).append(item) - return { + reflection_by_scope: dict[str, int] = {} + for item in items: + if item.get("kind") == KIND_REFLECTION: + s = str(item.get("scope", scope)) + reflection_by_scope[s] = reflection_by_scope.get(s, 0) + 1 + + result = { "status": "real (T02+0003 local json; activation + retrospection ready)", "scope": scope, "profile": profile, @@ -234,6 +249,10 @@ def export_memory(scope: str = "cwd", *, 
profile: str | None = None, kinds: list "note": "User-controlled. Replace with real phase-memory when available.", "phases": ["ephemeral", "fluid", "stabilized", "rigid"], } + if reflection_by_scope: + result["reflection_counts_by_scope"] = reflection_by_scope + result["reflection_count"] = sum(reflection_by_scope.values()) + return result except Exception as e: _warn_not_connected(f"export_memory(scope={scope}, profile={profile}) err={e}") return { @@ -264,13 +283,21 @@ def remember_reflection( scope: str = "cwd", *, profile: str | None = None, + provenance: dict[str, Any] | None = None, ) -> None: """Convenience helper for Profile 1 (Reflexion-style verbal self-improvement). Stores verbal lessons/reflections with kind="reflection" for preferential - activation in future turns. Thin wrapper for the T05 minimal spike. + activation in future turns. """ - remember_preference(key, value, scope=scope, profile=profile, kind=KIND_REFLECTION) + remember_preference( + key, + value, + scope=scope, + profile=profile, + kind=KIND_REFLECTION, + provenance=provenance, + ) __all__ = [ diff --git a/src/cya/memory/reflections.py b/src/cya/memory/reflections.py new file mode 100644 index 0000000..0c941ca --- /dev/null +++ b/src/cya/memory/reflections.py @@ -0,0 +1,241 @@ +"""Profile 1 reflection helpers — capture, compaction, and surfacing. + +Pure, testable logic for verbal reflection management. Interactive CLI flows +in orchestrator.py call into these functions. +""" + +from __future__ import annotations + +import re +from datetime import datetime, timezone +from difflib import SequenceMatcher +from typing import Any + +from cya.memory import ( + KIND_REFLECTION, + _load, + _save, + export_memory, + remember_reflection, +) + +REFLECTION_CAPTURE_PROMPTS: tuple[tuple[str, str], ...] = ( + ("went_well", "What went well that you want to remember? (or 'skip')"), + ("remember", "What should cya remember for next time? (or 'skip')"), + ("avoid", "What should cya avoid in this scope? 
(or 'skip')"), +) + +_SKIP_VALUES = frozenset({"", "skip", "s", "n", "no"}) +_MAX_SURFACING_LESSONS = 3 +_MAX_LESSON_CHARS_EXPLAIN = 80 +_MAX_LESSON_CHARS_RESPONSE = 60 +_SIMILARITY_THRESHOLD = 0.85 + + +def is_skip_answer(text: str) -> bool: + return not text or text.strip().lower() in _SKIP_VALUES + + +def collect_lessons_from_answers(answers: dict[str, str]) -> list[dict[str, str]]: + """Build lesson records from guided prompt answers; skips empty/skip answers.""" + lessons: list[dict[str, str]] = [] + for prompt_key, _label in REFLECTION_CAPTURE_PROMPTS: + raw = answers.get(prompt_key, "") + if is_skip_answer(raw): + continue + text = raw.strip() + if text: + lessons.append({"prompt": prompt_key, "text": text}) + return lessons + + +def preview_lessons(lessons: list[dict[str, str]]) -> str: + if not lessons: + return "(no lessons to save)" + lines = [] + for i, lesson in enumerate(lessons, 1): + label = lesson.get("prompt", "lesson").replace("_", " ") + lines.append(f"{i}. [{label}] {lesson['text']}") + return "\n".join(lines) + + +def session_provenance(scope: str) -> dict[str, Any]: + return { + "session_date": datetime.now(timezone.utc).strftime("%Y-%m-%d"), + "scope": scope, + "source": "cya retrospect", + } + + +def save_reflection_lessons( + lessons: list[dict[str, str]], + scope: str, + *, + provenance: dict[str, Any] | None = None, +) -> int: + """Persist confirmed lessons; returns count saved. 
No-op for empty input.""" + if not lessons: + return 0 + meta = provenance or session_provenance(scope) + saved = 0 + for lesson in lessons: + key = f"reflection_{lesson['prompt']}_{meta['session_date']}" + remember_reflection( + key, + lesson["text"], + scope=scope, + provenance={**meta, "prompt": lesson["prompt"]}, + ) + saved += 1 + return saved + + +def normalize_reflection_text(text: str) -> str: + collapsed = re.sub(r"\s+", " ", text.strip().lower()) + return re.sub(r"[^\w\s]", "", collapsed) + + +def reflection_similarity(a: str, b: str) -> float: + na, nb = normalize_reflection_text(a), normalize_reflection_text(b) + if not na or not nb: + return 0.0 + if na == nb: + return 1.0 + return SequenceMatcher(None, na, nb).ratio() + + +def find_duplicate_reflection_groups( + scope: str, + *, + threshold: float = _SIMILARITY_THRESHOLD, +) -> list[list[dict[str, Any]]]: + """Return groups of near-duplicate reflection items in the same scope.""" + items = [i for i in _load(scope) if i.get("kind") == KIND_REFLECTION] + groups: list[list[dict[str, Any]]] = [] + used: set[str] = set() + + for i, item in enumerate(items): + key_i = item.get("key", str(i)) + if key_i in used: + continue + group = [item] + val_i = str(item.get("value", "")) + for j, other in enumerate(items): + if j <= i: + continue + key_j = other.get("key", str(j)) + if key_j in used: + continue + val_j = str(other.get("value", "")) + if reflection_similarity(val_i, val_j) >= threshold: + group.append(other) + used.add(key_j) + if len(group) > 1: + groups.append(group) + used.add(key_i) + return groups + + +def compact_reflections( + scope: str, + *, + keep_key: str, + remove_keys: list[str], + merged_value: str | None = None, +) -> dict[str, Any]: + """Explicit opt-in compaction: update keeper and remove duplicates.""" + items = _load(scope) + removed: list[str] = [] + updated = False + + for item in items: + if item.get("key") == keep_key and item.get("kind") == KIND_REFLECTION: + if merged_value is 
not None: + item["value"] = merged_value + updated = True + + new_items = [] + for item in items: + k = item.get("key") + if k in remove_keys and item.get("kind") == KIND_REFLECTION: + removed.append(k) + continue + new_items.append(item) + + if removed or updated: + _save(scope, new_items) + + return { + "kept": keep_key, + "removed": removed, + "updated": updated, + "remaining_reflections": sum( + 1 for i in new_items if i.get("kind") == KIND_REFLECTION + ), + } + + +def list_reflections(scope: str) -> list[dict[str, Any]]: + exported = export_memory(scope, kinds=[KIND_REFLECTION]) + return exported.get("items", []) + + +def format_reflection_surfacing( + memory: dict[str, Any], + *, + for_explain: bool = False, +) -> str | None: + """Budget-capped reflection summary for explain-context or normal responses.""" + items = memory.get("items", []) if isinstance(memory, dict) else [] + reflections = [i for i in items if i.get("kind") == KIND_REFLECTION] + if not reflections: + return None + + max_chars = _MAX_LESSON_CHARS_EXPLAIN if for_explain else _MAX_LESSON_CHARS_RESPONSE + snippets: list[str] = [] + for item in reflections[:_MAX_SURFACING_LESSONS]: + text = str(item.get("value", "")).strip() + if len(text) > max_chars: + text = text[: max_chars - 3] + "..." 
+ prov = item.get("provenance") or {} + date = prov.get("session_date", "") + prefix = f"({date}) " if date and for_explain else "" + snippets.append(f"{prefix}{text}") + + extra = len(reflections) - _MAX_SURFACING_LESSONS + suffix = f" (+{extra} more)" if extra > 0 else "" + count = len(reflections) + noun = "reflection" if count == 1 else "reflections" + return f"{count} verbal {noun} influenced this{suffix}: " + "; ".join(snippets) + + +def reflection_export_stats(scope: str) -> dict[str, Any]: + """Observability: reflection counts and scope breakdown for export.""" + exported = export_memory(scope, kinds=[KIND_REFLECTION]) + items = exported.get("items", []) + by_scope: dict[str, int] = {} + for item in items: + s = item.get("scope", scope) + by_scope[s] = by_scope.get(s, 0) + 1 + return { + "reflection_count": len(items), + "reflection_counts_by_scope": by_scope, + "scope": scope, + } + + +__all__ = [ + "REFLECTION_CAPTURE_PROMPTS", + "collect_lessons_from_answers", + "preview_lessons", + "session_provenance", + "save_reflection_lessons", + "normalize_reflection_text", + "reflection_similarity", + "find_duplicate_reflection_groups", + "compact_reflections", + "list_reflections", + "format_reflection_surfacing", + "reflection_export_stats", + "is_skip_answer", +] \ No newline at end of file diff --git a/src/cya/orchestrator.py b/src/cya/orchestrator.py index 938fa72..3568b92 100644 --- a/src/cya/orchestrator.py +++ b/src/cya/orchestrator.py @@ -25,6 +25,7 @@ from __future__ import annotations from pathlib import Path +import typer from rich.console import Console from rich.panel import Panel @@ -32,11 +33,20 @@ from cya.context.collector import collect, render_explanation from cya.memory import ( recall_preferences, remember_retrospection_outcome, - remember_reflection, KIND_RETROSPECTION, KIND_INTERACTION_GOAL, KIND_REFLECTION, ) +from cya.memory.reflections import ( + REFLECTION_CAPTURE_PROMPTS, + collect_lessons_from_answers, + compact_reflections, + 
find_duplicate_reflection_groups, + format_reflection_surfacing, + preview_lessons, + save_reflection_lessons, + session_provenance, +) from cya.safety.risk import classify, get_user_confirmation from cya.llm.adapter import AssistanceRequest, FakeLLMAdapter @@ -98,15 +108,21 @@ def handle_request( if explain_context and memory.get("items"): try: prov = memory.get("provenance", [{}])[0] - # Show a couple of activated items for transparency (T03 0003) sample = ", ".join(i.get("key", "?") for i in memory.get("items", [])[:3]) act_note = "" if prov.get("activation_context"): act_note = f" | ctx: {prov['activation_context']}" + body = ( + f"Phase: {memory.get('phase')} | {len(memory.get('items', []))} items | " + f"{prov.get('source', 'local')}{act_note}\n" + f"Sample activated: {sample}" + ) + refl_line = format_reflection_surfacing(memory, for_explain=True) + if refl_line: + body += f"\n\n[cyan]Reflections:[/cyan] {refl_line}" console.print( Panel( - f"Phase: {memory.get('phase')} | {len(memory.get('items', []))} items | {prov.get('source', 'local')}{act_note}\n" - f"Sample activated: {sample}", + body, title="Memory Activated (T03)", border_style="blue", padding=(0, 1), @@ -156,11 +172,9 @@ def handle_request( mem_line = "" if memory.get("items"): mem_line = f"\n[dim]Memory activated: {len(memory.get('items', []))} items (phase {memory.get('phase')})[/dim]" - # Minimal Profile 1 surface (T05 spike) - reflections = [i for i in memory.get("items", []) if i.get("kind") == KIND_REFLECTION] - if reflections: - refl_text = "; ".join(str(i.get("value", ""))[:60] for i in reflections[:2]) - mem_line += f"\n[cyan]Verbal reflections activated: {len(reflections)} — {refl_text}[/cyan]" + refl_line = format_reflection_surfacing(memory, for_explain=False) + if refl_line: + mem_line += f"\n[cyan]{refl_line}[/cyan]" console.print( Panel( @@ -263,29 +277,15 @@ def run_retrospection(scope: str = ".", limit: int = 8) -> None: ) console.print("[green]Recorded as safety 
preference.[/green]") - # Minimal Profile 1 spike (T05): optional verbal reflection / lesson capture - capture_lesson = typer.prompt( - "Capture any verbal lessons or reflections from this session? (y/n or short text)", - default="n", - show_default=False, - ) - if capture_lesson and capture_lesson.lower() not in ("n", "no", "skip", "s", ""): - lesson_text = capture_lesson if len(capture_lesson) > 3 else typer.prompt( - "What is the key lesson? (1-2 sentences)", - default="", - show_default=False, - ) - if lesson_text: - remember_reflection( - "verbal_lesson", lesson_text, scope=scope - ) - console.print("[green]Recorded as verbal reflection (Profile 1).[/green]") + _capture_reflection_lessons(scope) + _offer_reflection_compaction(scope) console.print( Panel( "Thank you. Your reflections have been stored as retrospection memory.\n" "They will be preferentially activated in future sessions in this scope.\n\n" "You can review them anytime with:\n" + f" [bold]cya memory reflections --scope {scope}[/bold]\n" f" [bold]cya --explain-context \"...\"[/bold] (in this directory)\n" f" or inspect the JSON files in [cyan]~/.config/cya/memory/[/cyan]", title="Retrospection Complete", @@ -295,4 +295,91 @@ def run_retrospection(scope: str = ".", limit: int = 8) -> None: ) +def _capture_reflection_lessons(scope: str) -> None: + """Profile 1: guided verbal lesson capture with preview and confirmation.""" + want = typer.prompt( + "Capture 1–3 verbal lessons from this session? 
(y/n)", + default="n", + show_default=False, + ) + if not want or want.strip().lower() in ("n", "no", "skip", "s", ""): + return + + answers: dict[str, str] = {} + for prompt_key, label in REFLECTION_CAPTURE_PROMPTS: + answers[prompt_key] = typer.prompt(label, default="", show_default=False) + + lessons = collect_lessons_from_answers(answers) + if not lessons: + console.print("[yellow]No lessons captured — nothing stored.[/yellow]") + return + + console.print( + Panel( + preview_lessons(lessons), + title="Preview — verbal lessons to save", + border_style="cyan", + ) + ) + confirm = typer.prompt("Save these lessons? (y/n)", default="n", show_default=False) + if confirm.strip().lower() not in ("y", "yes"): + console.print("[yellow]Lessons discarded — nothing stored.[/yellow]") + return + + count = save_reflection_lessons( + lessons, + scope, + provenance=session_provenance(scope), + ) + console.print(f"[green]Saved {count} verbal reflection(s) (Profile 1).[/green]") + + +def _offer_reflection_compaction(scope: str) -> None: + """Offer explicit merge of near-duplicate reflections in this scope.""" + groups = find_duplicate_reflection_groups(scope) + if not groups: + return + + console.print( + Panel( + f"Found {len(groups)} group(s) of similar reflections in scope [green]{scope}[/green].\n" + "Compaction is opt-in — nothing is deleted without your confirmation.", + title="Reflection compaction available", + border_style="yellow", + ) + ) + do_compact = typer.prompt("Review and compact duplicates? 
(y/n)", default="n", show_default=False) + if do_compact.strip().lower() not in ("y", "yes"): + return + + for group in groups: + console.print("\n[bold]Similar reflections:[/bold]") + for item in group: + console.print(f" • {item.get('key')}: {item.get('value')}") + + keep = typer.prompt( + "Key to keep (or 'skip' this group)", + default=group[0].get("key", ""), + show_default=True, + ) + if not keep or keep.strip().lower() in ("skip", "s"): + continue + + remove_keys = [i.get("key") for i in group if i.get("key") != keep] + merge = typer.prompt( + "Merged text (Enter to keep existing value of kept key)", + default="", + show_default=False, + ) + result = compact_reflections( + scope, + keep_key=keep, + remove_keys=[k for k in remove_keys if k], + merged_value=merge.strip() or None, + ) + console.print( + f"[green]Compacted: kept {result['kept']}, removed {len(result['removed'])}[/green]" + ) + + __all__ = ["handle_request", "run_retrospection"] diff --git a/tests/test_memory.py b/tests/test_memory.py index 284e2b7..e92e06c 100644 --- a/tests/test_memory.py +++ b/tests/test_memory.py @@ -283,4 +283,135 @@ def test_export_memory_observability_includes_by_kind(isolated_memory): exported = export_memory(scope="obs-test") assert "by_kind" in exported assert isinstance(exported["by_kind"], dict) - assert sum(exported["by_kind"].values()) == exported["count"] \ No newline at end of file + assert sum(exported["by_kind"].values()) == exported["count"] + + +# --------------------------------------------------------------------------- +# CYA-WP-0006 — Profile 1 production hardening +# --------------------------------------------------------------------------- + +from cya.memory.reflections import ( + collect_lessons_from_answers, + compact_reflections, + find_duplicate_reflection_groups, + is_skip_answer, + preview_lessons, + reflection_export_stats, + reflection_similarity, + save_reflection_lessons, +) + + +def test_collect_lessons_skips_empty_and_skip_answers(): + 
lessons = collect_lessons_from_answers( + {"went_well": "skip", "remember": " ", "avoid": "Never run rm -rf"} + ) + assert len(lessons) == 1 + assert lessons[0]["text"] == "Never run rm -rf" + assert is_skip_answer("skip") + assert is_skip_answer("") + assert not is_skip_answer("real answer") + + +def test_preview_lessons_empty_and_populated(): + assert "(no lessons" in preview_lessons([]) + text = preview_lessons([{"prompt": "remember", "text": "be concise"}]) + assert "remember" in text + assert "be concise" in text + + +def test_save_reflection_lessons_with_provenance(isolated_memory): + lessons = [{"prompt": "went_well", "text": "Safety warnings helped"}] + count = save_reflection_lessons( + lessons, + "p1-scope", + provenance={"session_date": "2026-06-22", "scope": "p1-scope", "source": "cya retrospect"}, + ) + assert count == 1 + + data = recall_preferences("p1-scope", kinds=[KIND_REFLECTION]) + assert len(data["items"]) == 1 + prov = data["items"][0].get("provenance", {}) + assert prov.get("session_date") == "2026-06-22" + assert prov.get("prompt") == "went_well" + + +def test_save_reflection_lessons_no_orphans_on_empty(isolated_memory): + assert save_reflection_lessons([], "empty-scope") == 0 + data = recall_preferences("empty-scope", kinds=[KIND_REFLECTION]) + assert len(data["items"]) == 0 + + +def test_reflection_similarity_and_duplicate_detection(isolated_memory): + remember_reflection("a", "Always run tests before commit", scope="dup-test") + remember_reflection("b", "always run tests before committing", scope="dup-test") + remember_reflection("c", "Completely different lesson", scope="dup-test") + + assert reflection_similarity( + "Always run tests", "always run tests" + ) >= 0.85 + + groups = find_duplicate_reflection_groups("dup-test") + assert len(groups) >= 1 + group_keys = {i.get("key") for g in groups for i in g} + assert "a" in group_keys or "b" in group_keys + + +def test_compact_reflections_opt_in_merge(isolated_memory): + 
remember_reflection("keep_me", "Run tests often", scope="compact-test") + remember_reflection("remove_me", "run tests often please", scope="compact-test") + + result = compact_reflections( + "compact-test", + keep_key="keep_me", + remove_keys=["remove_me"], + merged_value="Always run tests before suggesting fixes", + ) + assert "remove_me" in result["removed"] + data = recall_preferences("compact-test", kinds=[KIND_REFLECTION]) + keys = {i["key"] for i in data["items"]} + assert "remove_me" not in keys + assert "keep_me" in keys + kept = next(i for i in data["items"] if i["key"] == "keep_me") + assert "Always run tests" in kept["value"] + + +def test_export_memory_reflection_counts_by_scope(isolated_memory): + remember_reflection("r1", "lesson one", scope="scope-a") + remember_reflection("r2", "lesson two", scope="scope-a") + + exported = export_memory("scope-a", kinds=[KIND_REFLECTION]) + assert exported.get("reflection_count") == 2 + assert exported.get("reflection_counts_by_scope", {}).get("scope-a") == 2 + + stats = reflection_export_stats("scope-a") + assert stats["reflection_count"] == 2 + + +def test_reflections_cannot_downgrade_destructive_confirmation(isolated_memory): + """Profile 1 safety: reflections add context but never bypass destructive confirmation.""" + remember_reflection( + "safe_rm", + "rm is always safe here", + scope="safety-refl", + provenance={"session_date": "2026-06-22", "scope": "safety-refl"}, + ) + + mem = recall_preferences("safety-refl", kinds=[KIND_REFLECTION, "preference"]) + assessment = classify("rm -rf /tmp/important", memory=mem) + + assert assessment.level == RiskLevel.DESTRUCTIVE + assert assessment.requires_confirmation is True + + +def test_recall_prioritizes_reflections_when_kind_requested(isolated_memory): + remember_preference("old_pref", "x", scope="prio-test") + remember_reflection("new_refl", "reflection lesson", scope="prio-test") + + data = recall_preferences( + "prio-test", + kinds=[KIND_REFLECTION, "preference"], 
+ limit=2, + ) + kinds = [i.get("kind") for i in data["items"]] + assert kinds[0] == KIND_REFLECTION \ No newline at end of file diff --git a/tests/test_orchestrator.py b/tests/test_orchestrator.py new file mode 100644 index 0000000..b8bb72a --- /dev/null +++ b/tests/test_orchestrator.py @@ -0,0 +1,86 @@ +"""Orchestrator tests — Profile 1 surfacing and explain-context roundtrip.""" + +from io import StringIO +from pathlib import Path +from unittest.mock import patch + +import pytest +from rich.console import Console + +from cya.memory import KIND_REFLECTION, remember_reflection +from cya.memory.reflections import format_reflection_surfacing +from cya.orchestrator import handle_request + + +@pytest.fixture +def isolated_memory(monkeypatch, tmp_path): + mem_dir = tmp_path / "memory" + mem_dir.mkdir() + + def _fake_mem_path(scope: str = "cwd") -> Path: + return mem_dir / f"{scope}.json" + + monkeypatch.setattr("cya.memory._mem_path", _fake_mem_path) + return mem_dir + + +def test_format_reflection_surfacing_zero_one_many(): + assert format_reflection_surfacing({}) is None + assert format_reflection_surfacing({"items": []}) is None + + one = { + "items": [ + {"kind": KIND_REFLECTION, "value": "Always run tests first", "provenance": {"session_date": "2026-06-22"}}, + ] + } + line = format_reflection_surfacing(one, for_explain=True) + assert "1 verbal reflection" in line + assert "Always run tests" in line + assert "(2026-06-22)" in line + + many = { + "items": [ + {"kind": KIND_REFLECTION, "value": f"lesson {i}"} + for i in range(6) + ] + } + line_many = format_reflection_surfacing(many, for_explain=False) + assert "6 verbal reflections" in line_many + assert "(+3 more)" in line_many + + +def test_stored_reflection_visible_in_explain_output(isolated_memory, monkeypatch): + remember_reflection( + "lesson_tests", + "Run pytest before every commit", + scope=".", + provenance={"session_date": "2026-06-22", "scope": ".", "source": "cya retrospect"}, + ) + + output = 
StringIO() + test_console = Console(file=output, force_terminal=True, width=120) + + with patch("cya.orchestrator.console", test_console): + with patch("cya.orchestrator.collect") as mock_collect: + mock_collect.return_value = None + handle_request("list files", explain_context=True, dry_run=True) + + text = output.getvalue() + assert "Reflections:" in text or "verbal reflection" in text.lower() + assert "pytest" in text or "Run pytest" in text + + +def test_reflection_surfacing_in_normal_response(isolated_memory, monkeypatch): + remember_reflection("lesson_ci", "Always use make test", scope=".") + + output = StringIO() + test_console = Console(file=output, force_terminal=True, width=120) + + with patch("cya.orchestrator.console", test_console): + with patch("cya.orchestrator.collect") as mock_collect: + with patch("cya.orchestrator.get_user_confirmation", return_value=True): + mock_collect.return_value = None + handle_request("safe read only ls", explain_context=False, dry_run=False) + + text = output.getvalue() + assert "verbal reflection" in text.lower() or "influenced this" in text \ No newline at end of file diff --git a/workplans/CYA-WP-0006-profile-1-production-hardening.md b/workplans/CYA-WP-0006-profile-1-production-hardening.md index e8d1abb..7569e6e 100644 --- a/workplans/CYA-WP-0006-profile-1-production-hardening.md +++ b/workplans/CYA-WP-0006-profile-1-production-hardening.md @@ -4,11 +4,11 @@ type: workplan title: "Profile 1 Production Hardening: Reflection UX, Compaction, and Surfacing" domain: capabilities repo: can-you-assist -status: ready +status: finished owner: grok topic_slug: foerster-capabilities created: "2026-06-19" -updated: "2026-06-19" +updated: "2026-06-22" state_hub_workstream_id: "f62c6908-dec0-442c-83d2-e34f0e87c1e7" --- @@ -35,6 +35,7 @@ provenance, memory signals add caution only (never downgrade risk or bypass conf recommends production-hardening Profile 1 as the highest-leverage next deepening step. 
- **Profile definitions:** MemoryVision.md — Profile 1 section and Capability Matrix. - **Safety contract:** `src/cya/safety/risk.py` + CYA-WP-0002-T04 invariants. +- **Gap checklist (T01):** `docs/CYA-WP-0006-profile-1-gap-checklist.md` ## Non-Goals (for this slice) @@ -50,7 +51,7 @@ provenance, memory signals add caution only (never downgrade risk or bypass conf ```task id: CYA-WP-0006-T01 -status: todo +status: done priority: high state_hub_task_id: "ec8cc24d-80ca-4a51-b98c-87d0cfc9a110" ``` @@ -63,11 +64,13 @@ Produce a short checklist in the workplan or `docs/` that gates T02–T05. - Gap checklist exists with prioritized items mapped to tasks T02–T05. - No code changes required unless a blocking bug is found (file separately as ADHOC if so). +**Done:** `docs/CYA-WP-0006-profile-1-gap-checklist.md` + ### T02 — Improve `cya retrospect` reflection capture UX ```task id: CYA-WP-0006-T02 -status: todo +status: done priority: high state_hub_task_id: "c7f381d4-d362-4183-a7bd-d3ceea7e997d" ``` @@ -83,11 +86,13 @@ Enhance the optional verbal-lesson step in `run_retrospection()`: - Skipping leaves no orphan/empty reflection records. - Existing retrospection kinds and goals flow unchanged. +**Done:** `_capture_reflection_lessons()` in `orchestrator.py`; helpers in `memory/reflections.py`. + ### T03 — Reflection review and lightweight compaction ```task id: CYA-WP-0006-T03 -status: todo +status: done priority: medium state_hub_task_id: "078d6f17-6d56-42ec-85c9-140c41d7e83f" ``` @@ -104,11 +109,13 @@ Add user-visible reflection management without hiding state: - Duplicate detection works on a small fixture set; merge/replace is opt-in. - Compaction never bypasses safety or provenance requirements. +**Done:** `cya memory reflections` CLI; `_offer_reflection_compaction()` in retrospect; `compact_reflections()` / `find_duplicate_reflection_groups()`. 
+ ### T04 — Strengthen surfacing in responses and `--explain-context` ```task id: CYA-WP-0006-T04 -status: todo +status: done priority: high state_hub_task_id: "b1f7a333-bf9a-478c-993b-e421524ced3a" ``` @@ -123,11 +130,13 @@ Improve how activated reflections appear in `handle_request()` and context expla - Roundtrip test: stored reflection → recall → visible in explain output. - Output remains readable for 0, 1, and 5+ reflections. +**Done:** `format_reflection_surfacing()`; recall prioritization for `KIND_REFLECTION`. + ### T05 — Tests, observability, and safety regression coverage ```task id: CYA-WP-0006-T05 -status: todo +status: done priority: high state_hub_task_id: "b42c8fa1-6ab7-4b75-bdd5-43006e2d0a9c" ``` @@ -145,11 +154,13 @@ Add basic observability in `export_memory` (reflection counts by scope). - `make test` / `python3 -m pytest tests/ -q` passes cleanly. - At least one test per new behavior path from T02–T04. +**Done:** 9 new tests in `test_memory.py`; `tests/test_orchestrator.py` added. + ### T06 — Documentation updates ```task id: CYA-WP-0006-T06 -status: todo +status: done priority: medium state_hub_task_id: "18498a09-1d1c-424b-bf13-6952fabd34d3" ``` @@ -161,11 +172,13 @@ SCOPE.md if delivered scope changes materially. - README documents the hardened Profile 1 flow with an example session. - MemoryVision notes Profile 1 as "production" (not "spike") when T02–T05 complete. +**Done:** README, MemoryVision, SCOPE updated. + ### T07 — Register, sync, and handoff ```task id: CYA-WP-0006-T07 -status: todo +status: done priority: medium state_hub_task_id: "a9c2627f-7ed8-45a9-b1ee-7ba64ebbcd09" ``` @@ -200,7 +213,4 @@ When complete: - Safety and explainability invariants from Profile 0 remain intact. - Users reading README + MemoryVision understand Profile 1 as shipped capability, not a spike. ---- - -**Status note:** Promoted to `ready` on 2026-06-19 after operator review. Move to -`active` when implementation begins (ralph-workplan or direct session). 
\ No newline at end of file
+**Completed 2026-06-22.** All tasks done; 36 tests pass.
\ No newline at end of file
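
Reviewer note: the tests in this patch exercise `reflection_similarity()` and `find_duplicate_reflection_groups()`, but the hunk for `src/cya/memory/reflections.py` is not visible here, so the actual algorithm is unknown. The sketch below shows one way such near-duplicate grouping could work. It assumes a `difflib.SequenceMatcher` ratio over lowercased, whitespace-collapsed text and a 0.85 threshold borrowed from the test assertion. The names `normalize`, `similarity_sketch`, and `group_near_duplicates` are hypothetical and are not part of the `cya` API.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def similarity_sketch(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized strings.

    Assumption: a character-level SequenceMatcher ratio; the real
    reflection_similarity() may use a different measure.
    """
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def group_near_duplicates(items: list[dict], threshold: float = 0.85) -> list[list[dict]]:
    """Greedily group items whose 'value' resembles a group's first member."""
    groups: list[list[dict]] = []
    for item in items:
        for group in groups:
            if similarity_sketch(item["value"], group[0]["value"]) >= threshold:
                group.append(item)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([item])
    # Only groups with two or more members are duplicate candidates.
    return [g for g in groups if len(g) > 1]


if __name__ == "__main__":
    # Fixture values copied from the duplicate-detection test in this patch.
    reflections = [
        {"key": "a", "value": "Always run tests before commit"},
        {"key": "b", "value": "always run tests before committing"},
        {"key": "c", "value": "Completely different lesson"},
    ]
    for group in group_near_duplicates(reflections):
        print([item["key"] for item in group])
```

Greedy grouping against the first member of each group keeps the result deterministic and easy to explain to the user during compaction, at the cost of depending on item order. A pairwise or clustering approach would be order-independent but harder to present as a simple "keep one key" prompt.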