diff --git a/workplans/CYA-WP-0001-console-native-mvp.md b/workplans/CYA-WP-0001-console-native-mvp.md
new file mode 100644
index 0000000..c78ed79
--- /dev/null
+++ b/workplans/CYA-WP-0001-console-native-mvp.md
@@ -0,0 +1,295 @@
+---
+id: CYA-WP-0001
+type: workplan
+title: "Console-Native MVP: CLI Skeleton, Safe Assistance Flow, and Integration Boundaries"
+domain: capabilities
+repo: can-you-assist
+status: ready
+owner: grok
+topic_slug: foerster-capabilities
+created: "2026-05-25"
+updated: "2026-05-25"
+---
+
+# CYA-WP-0001: Console-Native MVP: CLI Skeleton, Safe Assistance Flow, and Integration Boundaries
+
+## Goal
+
+Deliver the first narrow, usable slice of `cya` (the can-you-assist console assistant) that proves the core loop:
+
+User expresses intent in natural language from a terminal → `cya` safely gathers minimal relevant local context → routes through a clean adapter boundary for `llm-connect` → returns an **explainable** suggestion, command, or answer → enforces explicit confirmation for any potentially destructive or broad action.
+
+This workplan establishes the CLI surface, context collector, safety layer, and the integration seams for the two sibling projects (`llm-connect` and `phase-memory`) without taking ownership of their implementations. It produces a minimal but *real* tool that can be invoked today for practical console work.
+
+## Current Evidence
+
+- High-quality intent, scope, and agent instruction documents exist and are consistent (INTENT.md, SCOPE.md, AGENTS.md, README.md, wiki/CyaSpeechModeExtension.md).
+- State Hub bootstrap workstream `repo-integration-can-you-assist` (id `0a1233fd-75ab-4726-8857-6c97de939069`) is active. Its Task T02 ("Write first workplan and initialise workplans/") is the direct driver for creating this file.
+- Sibling projects exist and are further along:
+  - `llm-connect` (real Python package with multi-provider adapters, config, tests).
+  - `phase-memory` (foundational workplans complete; local runtime, ports, and contracts exist).
+- The repo is still at the pure documentation seed stage (2 commits). No source, tests, or packaging yet.
+- `grok inspect` successfully discovers AGENTS.md and the project context.
+
+## Non-Goals (for this MVP slice)
+
+- No implementation of durable, user-controlled memory or complex preference adaptation (define explicit thin ports only; real behavior comes from `phase-memory` later).
+- No voice, speech, or phone-bridge mode (the detailed design in wiki/CyaSpeechModeExtension.md is explicitly future work).
+- No autonomous or background execution of generated commands, scripts, or file modifications.
+- Do not vendor, copy, or fork code from `llm-connect` or `phase-memory`.
+- No deep repository indexing, embeddings, vector search, or large-scale content analysis in the first slice.
+- No distribution, homebrew/pip packaging polish, or multi-platform installers.
+- No long-lived multi-turn conversational REPL or session state machine (one-shot + very lightweight session possible if it falls out naturally).
+- No assumption of any specific LLM provider or hosted service.
+
+## Success Criteria for This Slice
+
+- `cya "..."` (and `cya --help`) can be run from a fresh checkout after a simple editable install and produces useful, context-aware output for the canonical examples in INTENT.md.
+- Context collection is **transparent and bounded**: the user can always see exactly what local information would be sent to the model.
+- Any suggestion that the risk classifier marks above "safe" (destructive filesystem operations, force pushes, mass edits, network-affecting commands, etc.) triggers a clear preview + explicit terminal confirmation flow. Nothing executes without the user typing confirmation.
+- 100% of LLM interaction (even in tests) flows through a small, well-documented adapter boundary whose shape is compatible with the direction of `llm-connect`.
+- A focused test suite exists with strong coverage of the safety invariants and context-selection rules.
+- The workplan file is the source of truth. After the operator runs `make fix-consistency`, State Hub reflects the tasks and status.
+- A new contributor or the sibling teams can read the updated README + this workplan and immediately understand the integration points and non-goals.
+
+## Task Breakdown
+
+### T01 — Project scaffolding, language choice, and CLI entrypoint
+
+```task
+id: CYA-WP-0001-T01
+status: todo
+priority: high
+```
+
+Bootstrap the minimal runnable package and the primary user-facing command.
+
+- Confirm primary implementation language (strong current signal: Python, driven by .gitignore patterns and the concrete state of the two sibling projects). Record the decision and rationale in this workplan or a short ADR note.
+- Create the initial package layout and build configuration (pyproject.toml or equivalent) with:
+  - Proper package name (`can-you-assist`), version, and console script entry point `cya`.
+  - Clean separation: `cya/cli/`, `cya/context/`, `cya/safety/`, `cya/llm/` (boundary), `cya/memory/` (ports).
+- Implement the absolute minimum CLI surface:
+  - `cya --version`
+  - `cya --help` (rich, with examples)
+  - `cya "free-form natural language request here"` (one-shot mode)
+  - Basic flags for context control (`--file`, `--no-git`, `--explain-context`, `--dry-run`).
+- Support one-shot invocation, plus a very lightweight "session" mode only if it falls out naturally from the REPL-less design (the future voice bridge will need session tokens).
+
+**Acceptance criteria**:
+- After `pip install -e .` (or the equivalent for the chosen stack), `cya --help` and `cya --version` work with no external services.
+- The command accepts a positional request string and prints a structured, human-readable response even before any LLM integration exists (graceful fallback or explicit "no backend configured" message is acceptable in early T01).
+- Layout and naming make the future integration seams obvious.
+
+### T02 — Safe, transparent, and intentionally bounded local context collector
+
+```task
+id: CYA-WP-0001-T02
+status: todo
+priority: high
+```
+
+Implement the "Context Collector" responsibility described in INTENT.md and SCOPE.md.
+
+Collect *only* what is necessary for the current request and make the collection completely inspectable.
+
+Minimum sources for MVP (all opt-in or narrowly scoped):
+- Current working directory (top-level entries only; respect common ignore patterns such as `.git`, `node_modules`, `.venv`, `__pycache__`, etc.).
+- Git state summary (current branch, dirty status, recent commit subject, list of modified files, all obtained via read-only subprocess calls).
+- Explicitly provided files or globs via `--file` / positional arguments.
+- Data from stdin when the tool is used in a pipeline.
+- Tiny, high-signal environment facts only when clearly relevant (e.g., `$SHELL`, `$EDITOR`/`$VISUAL`).
+
+Hard constraints for this slice:
+- No unbounded recursive directory walks.
+- No secret scanning or credential harvesting.
+- No automatic scraping of shell history, `~/.config`, or other user data without an explicit future opt-in + memory layer.
+- Collection must produce a stable, serializable `ContextEnvelope` (or equivalent) that can be pretty-printed for the user and hashed/token-counted for the model.
+
+**Acceptance criteria**:
+- A `--show-context` / `--explain-context` (or debug) flag prints *exactly* the data that would be sent to the LLM, with clear provenance for each piece.
+- Tests prove that the collector refuses to traverse known dangerous or expensive locations.
+- The collector is a pure module with no side effects beyond read-only inspection.
+
+### T03 — Risk classification and mandatory confirmation layer
+
+```task
+id: CYA-WP-0001-T03
+status: todo
+priority: high
+```
+
+This is a core product behavior, not an afterthought.
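
One way to picture the behavior this task describes is a small rule-based classifier feeding a confirmation gate. The sketch below is only an illustration under stated assumptions: the label names follow the list in this task, while `Risk`, `classify`, `confirm`, the regex rules, and the injected `ask` callable are hypothetical names that do not exist in any `cya` code yet. The rules are deliberately tiny and would miss many real-world cases.

```python
import re
from enum import Enum
from typing import Callable


class Risk(Enum):
    SAFE = "safe"
    REVIEW = "review"
    DESTRUCTIVE = "destructive"
    NETWORK = "network-affecting"


# Hypothetical rule table, checked in order; first match wins.
_RULES = [
    (re.compile(r"\brm\b|\bfind\b.*-delete\b|\bsudo\b|--force\b"), Risk.DESTRUCTIVE),
    (re.compile(r"\b(curl|wget|ssh|scp)\b"), Risk.NETWORK),
    (re.compile(r"\b(mv|chmod|chown)\b"), Risk.REVIEW),
]


def classify(command: str) -> Risk:
    """Label a suggested command using the simple rules above."""
    for pattern, risk in _RULES:
        if pattern.search(command):
            return risk
    return Risk.SAFE


def confirm(command: str, risk: Risk, ask: Callable[[str], str] = input) -> bool:
    """Gate a suggestion: print a preview and require an explicit 'yes'.

    This function never runs the command; it only reports whether the
    user allowed it. `ask` is injectable so tests need no real terminal.
    """
    if risk is Risk.SAFE:
        return True
    print(f"[{risk.value}] preview:\n  {command}")
    return ask("Type 'yes' to proceed: ").strip().lower() == "yes"
```

Injecting `ask` keeps the gate testable without a live terminal, which matches the requirement that classifier and confirmation tests run without an LLM or user interaction.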
+
+Build a risk classifier (simple rules + optional LLM assistance for edge cases) that labels suggestions as one of:
+
+- safe
+- review (needs a second look)
+- destructive / mass-edit / privileged
+- network-affecting
+- other (with rationale)
+
+For anything above "safe":
+- Always emit a clear, copy-pasteable preview of the exact command(s) or action(s) being suggested.
+- Show a concise "what will be affected" summary derived from the context envelope.
+- Require an explicit confirmation step in the controlling terminal (e.g., `Run this? [y/N]` or `Type 'yes' to proceed`).
+- Record the confirmation decision in any future audit trail (even if the memory layer is not yet present).
+
+Never auto-execute anything in this slice, even "safe" suggestions, unless the user has explicitly asked for a "run" sub-mode (and even then only after preview).
+
+**Acceptance criteria**:
+- `cya "delete every log file older than 30 days in this tree"` produces a preview, classifies the action as destructive, and blocks until the user confirms in the terminal.
+- The confirmation channel is always the terminal that launched `cya` (important for the future voice bridge design).
+- The classifier and confirmation logic have dedicated tests that are part of the default test run (no live LLM required).
+
+### T04 — llm-connect adapter boundary (the integration seam)
+
+```task
+id: CYA-WP-0001-T04
+status: todo
+priority: high
+```
+
+Per SCOPE.md and INTENT.md, `can-you-assist` owns orchestration and the CLI experience; `llm-connect` owns provider access.
+
+Define a small, stable interface (protocol / abstract base / typed call) in this repository that all model interaction must go through:
+
+- Something like:
+  ```python
+  from __future__ import annotations  # request/response types are defined elsewhere
+  from typing import Protocol
+
+  class LLMAdapter(Protocol):
+      def complete(
+          self,
+          request: AssistanceRequest,  # contains framed intent + packed context + config hints
+      ) -> AssistanceResponse:  # contains suggestions, explanation, rationale, risks, raw model output summary
+          ...
+  ```
+- Provide a deterministic fake / mock implementation used by all unit and safety tests.
+- Provide a thin concrete adapter (or direct import) that will eventually delegate to the real `llm-connect` package.
+- Configuration surface (which backend, model, temperature, token budget, etc.) must be explicit and delegated to the `llm-connect` configuration model once the boundary is stable.
+
+**Acceptance criteria**:
+- No production code path talks to an LLM (or a mock) without going through this boundary.
+- The interface is documented with a short "Integration Guide for llm-connect" section or companion note.
+- Switching from the fake to a real (or stubbed) `llm-connect` client is a small, localized change.
+
+### T05 — Thin, explicit phase-memory ports and future hooks
+
+```task
+id: CYA-WP-0001-T05
+status: todo
+priority: medium
+```
+
+Prepare the ground for `phase-memory` without pulling a dependency or inventing hidden state.
+
+Define clear ports / extension points for the memory capabilities that INTENT.md says must remain under user control:
+
+- Remember a preference or workflow pattern.
+- Recall relevant history / preferences for the current cwd + task class.
+- Forget / reset (scoped).
+- Inspect / export current memory for this project or user.
+
+In the MVP these ports can be:
+- Pure no-ops that clearly log "phase-memory not yet connected".
+- Or a trivial local JSON file store under an opt-in, user-visible location (`~/.config/cya/` or per-project `.cya/memory.json`), explicitly labeled as a placeholder.
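
The no-op option above could look roughly like the following. This is a sketch under assumptions, not the `phase-memory` contract: `MemoryPort`, `NoOpMemory`, `suggest`, the `scope`/`key` parameters, and the `cya.memory` logger name are all hypothetical and chosen only to mirror the four capabilities listed in this task.

```python
import logging
from typing import Dict, Optional, Protocol

logger = logging.getLogger("cya.memory")  # hypothetical logger name


class MemoryPort(Protocol):
    """Hypothetical port: remember, recall, forget, export."""

    def remember(self, scope: str, key: str, value: str) -> None: ...
    def recall(self, scope: str, key: str) -> Optional[str]: ...
    def forget(self, scope: str) -> None: ...
    def export(self, scope: str) -> Dict[str, str]: ...


class NoOpMemory:
    """Placeholder that stores nothing and logs that phase-memory is absent."""

    def _note(self, operation: str) -> None:
        logger.info("phase-memory not yet connected (%s ignored)", operation)

    def remember(self, scope: str, key: str, value: str) -> None:
        self._note("remember")

    def recall(self, scope: str, key: str) -> Optional[str]:
        self._note("recall")
        return None

    def forget(self, scope: str) -> None:
        self._note("forget")

    def export(self, scope: str) -> Dict[str, str]:
        self._note("export")
        return {}


def suggest(request: str, memory: MemoryPort) -> str:
    """Call sites receive the port explicitly, so no global singleton is needed."""
    shell = memory.recall("user", "preferred_shell") or "sh"
    return f"[{shell}] {request}"
```

Because `suggest` depends only on the port, swapping `NoOpMemory` for a JSON-file placeholder or a later `phase-memory` client would not change the call site.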
+
+All memory interactions must be behind these ports. No global singletons, no implicit `~/.cache` magic, no opaque vendor memory.
+
+**Acceptance criteria**:
+- Code review can point to the exact files/functions that will be replaced or implemented by `phase-memory` integration.
+- The MVP still functions (gracefully) with memory completely disabled.
+- README and help text explain the intended memory story and how users will stay in control.
+
+### T06 — Assistance orchestrator and prompt/response handling
+
+```task
+id: CYA-WP-0001-T06
+status: todo
+priority: high
+```
+
+This task builds the piece that turns raw user intent + collected context into a well-formed request to the LLM adapter, and then turns the adapter response into terminal output the user can act on.
+
+Responsibilities:
+- Lightweight intent framing (command suggestion vs. explanation vs. summarization vs. plan vs. "help me understand this error").
+- Context packing with rough token awareness (so requests do not blindly overflow a model's context window).
+- Prompt construction (or structured request) that includes the safety charter, output schema expectations, and the collected context.
+- Post-processing: parse structured output, attach local provenance explanations, produce the final user-facing artifact (suggestion block, rationale, next-step hints).
+
+The orchestrator must be testable in isolation with a fake LLM adapter.
+
+**Acceptance criteria**:
+- The end-to-end flow (context → orchestrator → fake LLM → rendered output) works for at least the four primary use-case families listed in INTENT.md.
+- Every response the user sees includes a short "why this answer" section that references the context pieces actually used.
+
+### T07 — Test strategy, harness, and safety-focused suite
+
+```task
+id: CYA-WP-0001-T07
+status: todo
+priority: high
+```
+
+Choose and bootstrap a test framework appropriate for a console tool (pytest is the obvious default given the Python signal).
+
+Required test categories for the slice:
+- Context collector safety and bounding properties.
+- Risk classifier + confirmation flow (the "never auto-execute destructive actions" invariant must be tested).
+- Orchestrator behavior with deterministic fake LLM responses (including error and refusal paths).
+- CLI surface (help, version, argument parsing, stdin handling).
+- Adapter boundary contract tests (the fake satisfies the protocol; swapping implementations is safe).
+
+Prefer fast, hermetic, no-network tests as the default. Any test that touches a real model or external service must be explicitly opt-in (markers, env vars) and skipped in normal CI.
+
+**Acceptance criteria**:
+- The suite runs cleanly from a fresh checkout via `pytest` (or the chosen equivalent) or `make test`.
+- Safety invariants have explicit, readable test names and assertions.
+- Coverage or a manual checklist shows the high-risk paths are exercised.
+
+### T08 — Documentation, examples, and handoff to siblings / operator
+
+```task
+id: CYA-WP-0001-T08
+status: todo
+priority: medium
+```
+
+Make the MVP usable and the integration points obvious.
+
+- Substantially update README.md with real invocation examples, architecture overview (text diagram is fine), and "How to integrate with llm-connect / phase-memory" guidance.
+- Add or expand USAGE.md / docs/ with the canonical workflows.
+- In the workplan or a short companion note, explicitly list the extension points and any technical debt discovered during the slice (for later registration in State Hub).
+- Ensure the bootstrap workstream tasks that depend on this (SBOM, extension point registration) have clear handoff notes.
+
+**Acceptance criteria**:
+- A person who has never seen the repo before can clone it, install it, and successfully run 2–3 non-trivial `cya` requests using only the updated README.
+- The sibling project owners can read this workplan + the boundary documentation and know exactly where their packages will plug in.
+
+## Dependencies & Cross-Repo Coordination
+
+- **llm-connect**: Supplies the real multi-provider client, config chain, token counting, and structured response helpers. We define the consumption contract here.
+- **phase-memory**: Supplies the actual memory implementation behind the ports we define. We own the call sites and the "user stays in control" contract.
+- **State Hub (custodian domain)**: Owns work tracking, progress, decisions, and the `fix-consistency` sync that will import this workplan's tasks.
+- No other hard runtime dependencies for the core MVP loop.
+
+## Stack & Technology Choices (Decision Required Before `active`)
+
+This workplan is intentionally left in `ready` status until the following are explicitly reviewed and recorded:
+
+- Primary language / runtime (Python is the current leading candidate).
+- CLI framework (typer, click, or stdlib argparse + rich for presentation).
+- Terminal presentation library (rich, textual, or plain stdlib + colors).
+- Configuration format (TOML strongly preferred for alignment with llm-connect patterns).
+- Whether an initial `pyproject.toml` + src layout or a lighter single-file bootstrap is preferred for T01.
+
+Once the stack review is complete, change the workplan `status` to `active` and begin implementation (or spawn follow-on workplans if the slice needs to be split).
+
+## References & Links
+
+- Parent bootstrap: State Hub workstream `repo-integration-can-you-assist` (tasks T01–T04)
+- This repo: INTENT.md, SCOPE.md, AGENTS.md, README.md, wiki/CyaSpeechModeExtension.md
+- Sibling repos (for patterns and interfaces): llm-connect, phase-memory
+- State Hub consistency command (run after any workplan or task file change):
+  `cd ~/state-hub && make fix-consistency REPO=can-you-assist`
+
+---
+
+**Status note**: Created as the direct output of bootstrap Task T02. This file is the authoritative plan. Do not implement the tasks until the status is moved to `active` after stack review.