feat(workplan): add CYA-WP-0001 console-native-mvp

- Created first repo-backed workplan per AGENTS.md and bootstrap
  workstream "repo-integration-can-you-assist" Task T02.
- Workplan kept at status: ready (pending explicit stack choice
  review before moving to active).
- 8 focused tasks (T01-T08) covering: CLI skeleton + packaging,
  bounded context collector, mandatory safety/confirmation layer,
  llm-connect adapter boundary, phase-memory ports, orchestrator,
  test strategy, and documentation handoff.
- Follows exact workplan convention and sibling (phase-memory)
  style for Goal / Current Evidence / Non-Goals / task blocks.
- Prepares narrow MVP slice that proves the core console loop
  while respecting crisp boundaries with llm-connect and phase-memory.

Workstream: 0a1233fd-75ab-4726-8857-6c97de939069
Progress event: 15cdf940-db46-4f97-ae0b-e21ce2e57ce6
Ready for: make fix-consistency REPO=can-you-assist
2026-05-26 00:37:16 +02:00
parent 7b40a1ecab
commit 1bdd8e03d8

@@ -0,0 +1,295 @@
---
id: CYA-WP-0001
type: workplan
title: "Console-Native MVP: CLI Skeleton, Safe Assistance Flow, and Integration Boundaries"
domain: capabilities
repo: can-you-assist
status: ready
owner: grok
topic_slug: foerster-capabilities
created: "2026-05-25"
updated: "2026-05-25"
---
# CYA-WP-0001: Console-Native MVP — CLI Skeleton, Safe Assistance Flow, and Integration Boundaries
## Goal
Deliver the first narrow, usable slice of `cya` (the can-you-assist console assistant) that proves the core loop:
User expresses intent in natural language from a terminal → `cya` safely gathers minimal relevant local context → routes through a clean adapter boundary for `llm-connect` → returns an **explainable** suggestion, command, or answer → enforces explicit confirmation for any potentially destructive or broad action.
This workplan establishes the CLI surface, context collector, safety layer, and the integration seams for the two sibling projects (`llm-connect` and `phase-memory`) without taking ownership of their implementations. It produces a minimal but *real* tool that can be invoked today for practical console work.
## Current Evidence
- High-quality intent, scope, and agent instruction documents exist and are consistent (INTENT.md, SCOPE.md, AGENTS.md, README.md, wiki/CyaSpeechModeExtension.md).
- State Hub bootstrap workstream `repo-integration-can-you-assist` (id `0a1233fd-75ab-4726-8857-6c97de939069`) is active. Its Task T02 ("Write first workplan and initialise workplans/") is the direct driver for creating this file.
- Sibling projects exist and are further along:
  - `llm-connect` (real Python package with multi-provider adapters, config, tests).
  - `phase-memory` (foundational workplans complete; local runtime, ports, and contracts exist).
- The repo is still at the pure documentation seed stage (2 commits). No source, tests, or packaging yet.
- `grok inspect` successfully discovers AGENTS.md and the project context.
## Non-Goals (for this MVP slice)
- No implementation of durable, user-controlled memory or complex preference adaptation (define explicit thin ports only; real behavior comes from `phase-memory` later).
- No voice, speech, or phone-bridge mode (the detailed design in wiki/CyaSpeechModeExtension.md is explicitly future work).
- No autonomous or background execution of generated commands, scripts, or file modifications.
- Do not vendor, copy, or fork code from `llm-connect` or `phase-memory`.
- No deep repository indexing, embeddings, vector search, or large-scale content analysis in the first slice.
- No distribution, homebrew/pip packaging polish, or multi-platform installers.
- No long-lived multi-turn conversational REPL or session state machine (one-shot mode is the target; a very lightweight session is acceptable only if it falls out naturally).
- No assumption of any specific LLM provider or hosted service.
## Success Criteria for This Slice
- `cya "..."` (and `cya --help`) can be run from a fresh checkout after a simple editable install and produces useful, context-aware output for the canonical examples in INTENT.md.
- Context collection is **transparent and bounded**: the user can always see exactly what local information would be sent to the model.
- Any suggestion that the risk classifier marks above "safe" (destructive filesystem operations, force pushes, mass edits, network-affecting commands, etc.) triggers a clear preview + explicit terminal confirmation flow. Nothing executes without the user typing confirmation.
- 100% of LLM interaction (even in tests) flows through a small, well-documented adapter boundary whose shape is compatible with the direction of `llm-connect`.
- A focused test suite exists with strong coverage of the safety invariants and context-selection rules.
- The workplan file is the source of truth. After the operator runs `make fix-consistency`, State Hub reflects the tasks and status.
- A new contributor or the sibling teams can read the updated README + this workplan and immediately understand the integration points and non-goals.
## Task Breakdown
### T01 — Project scaffolding, language choice, and CLI entrypoint
```task
id: CYA-WP-0001-T01
status: todo
priority: high
```
Bootstrap the minimal runnable package and the primary user-facing command.
- Confirm primary implementation language (strong current signal: Python, driven by .gitignore patterns and the concrete state of the two sibling projects). Record the decision and rationale in this workplan or a short ADR note.
- Create the initial package layout and build configuration (pyproject.toml or equivalent) with:
  - Proper package name (`can-you-assist`), version, and console script entry point `cya`.
  - Clean separation: `cya/cli/`, `cya/context/`, `cya/safety/`, `cya/llm/` (boundary), `cya/memory/` (ports).
- Implement the absolute minimum CLI surface:
  - `cya --version`
  - `cya --help` (rich, with examples)
  - `cya "free-form natural language request here"` (one-shot mode)
  - Basic flags for context control (`--file`, `--no-git`, `--explain-context`, `--dry-run`).
- Support both one-shot invocation and a very lightweight "session" mode if it falls out naturally from the REPL-less design (future voice bridge will need session tokens).
**Acceptance criteria**:
- After `pip install -e .` (or the equivalent for the chosen stack), `cya --help` and `cya --version` work with no external services.
- The command accepts a positional request string and prints a structured, human-readable response even before any LLM integration exists (graceful fallback or explicit "no backend configured" message is acceptable in early T01).
- Layout and naming make the future integration seams obvious.
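The CLI surface listed above could be sketched as follows. This is a minimal illustration only: it assumes Python and stdlib `argparse`, neither of which is decided yet (see the stack review), and the `build_parser` / `main` names, the placeholder version string, and the output wording are hypothetical.

```python
from __future__ import annotations

import argparse

__version__ = "0.0.0"  # placeholder; a real build would read package metadata


def build_parser() -> argparse.ArgumentParser:
    """Build the minimal one-shot CLI surface named in T01."""
    parser = argparse.ArgumentParser(
        prog="cya",
        description="Console assistant: describe what you need in natural language.",
        epilog='Example: cya --file build.log "why did this build fail?"',
    )
    parser.add_argument("--version", action="version", version=f"cya {__version__}")
    parser.add_argument("request", nargs="?", help="free-form natural language request")
    parser.add_argument("--file", action="append", default=[], metavar="PATH",
                        help="explicitly include a file in the context (repeatable)")
    parser.add_argument("--no-git", action="store_true", help="skip the git state summary")
    parser.add_argument("--explain-context", action="store_true",
                        help="show the context that would be sent to the model")
    parser.add_argument("--dry-run", action="store_true",
                        help="only show suggestions; never run anything")
    return parser


def main(argv: list[str] | None = None) -> int:
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.request:
        parser.print_help()
        return 2
    # No LLM integration exists yet, so say so explicitly instead of failing.
    print(f"request: {args.request}")
    print(f"files:   {', '.join(args.file) or '(none)'}")
    print("backend: no backend configured")
    return 0
```

A console script entry point named `cya` would point at `main`; calling `main(["--no-git", "summarize this repo"])` exercises the fallback path without any external service.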
### T02 — Safe, transparent, and intentionally bounded local context collector
```task
id: CYA-WP-0001-T02
status: todo
priority: high
```
Implement the "Context Collector" responsibility described in INTENT.md and SCOPE.md.
Collect *only* what is necessary for the current request and make the collection completely inspectable.
Minimum sources for MVP (all opt-in or narrowly scoped):
- Current working directory (top-level entries only; respect common ignore patterns such as `.git`, `node_modules`, `.venv`, `__pycache__`, etc.).
- Git state summary (current branch, dirty status, recent commit subject, list of modified files — obtained via read-only subprocess calls).
- Explicitly provided files or globs via `--file` / positional arguments.
- Data from stdin when the tool is used in a pipeline.
- Tiny, high-signal environment facts only when clearly relevant (e.g., `$SHELL`, `$EDITOR`/`$VISUAL`).
Hard constraints for this slice:
- No unbounded recursive directory walks.
- No secret scanning or credential harvesting.
- No automatic scraping of shell history, `~/.config`, or other user data without an explicit future opt-in + memory layer.
- Collection must produce a stable, serializable `ContextEnvelope` (or equivalent) that can be pretty-printed for the user and hashed/token-counted for the model.
**Acceptance criteria**:
- A `--show-context` / `--explain-context` (or debug) flag prints *exactly* the data that would be sent to the LLM, with clear provenance for each piece.
- Tests prove that the collector refuses to traverse known dangerous or expensive locations.
- The collector is a pure module with no side effects beyond read-only inspection.
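One possible shape for the `ContextEnvelope` and a bounded directory listing is sketched below. Everything except the `ContextEnvelope` name is an assumption made for illustration: the `ContextItem` type, the `IGNORED` set, the `MAX_ENTRIES` cap, and the choice of SHA-256 over sorted JSON for the stable hash.

```python
from __future__ import annotations

import hashlib
import json
import os
from dataclasses import asdict, dataclass, field

IGNORED = {".git", "node_modules", ".venv", "__pycache__"}  # illustrative ignore set
MAX_ENTRIES = 50  # hypothetical cap on listed entries


@dataclass(frozen=True)
class ContextItem:
    source: str   # provenance label, e.g. "cwd-listing:/path" or "env:SHELL"
    content: str


@dataclass
class ContextEnvelope:
    items: list[ContextItem] = field(default_factory=list)

    def to_json(self) -> str:
        """Stable, pretty-printable form of exactly what would be sent."""
        return json.dumps([asdict(i) for i in self.items], sort_keys=True, indent=2)

    def digest(self) -> str:
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


def list_top_level(path: str = ".") -> ContextItem:
    """List top-level entries only: no recursion, ignored names skipped."""
    names = sorted(n for n in os.listdir(path) if n not in IGNORED)
    shown = names[:MAX_ENTRIES]
    if len(names) > MAX_ENTRIES:
        shown.append(f"... ({len(names) - MAX_ENTRIES} more entries omitted)")
    return ContextItem(source=f"cwd-listing:{os.path.abspath(path)}",
                       content="\n".join(shown))


def collect(path: str = ".") -> ContextEnvelope:
    """Read-only collection of a directory listing and one environment fact."""
    envelope = ContextEnvelope()
    envelope.items.append(list_top_level(path))
    shell = os.environ.get("SHELL")
    if shell:
        envelope.items.append(ContextItem(source="env:SHELL", content=shell))
    return envelope
```

Printing `collect().to_json()` is one way a `--explain-context` flag could show the user the exact payload, with each piece carrying its `source`.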
### T03 — Risk classification and mandatory confirmation layer
```task
id: CYA-WP-0001-T03
status: todo
priority: high
```
This is a core product behavior, not an afterthought.
Build a risk classifier (simple rules + optional LLM assistance for edge cases) that labels suggestions as one of:
- safe
- review (needs a second look)
- destructive / mass-edit / privileged
- network-affecting
- other (with rationale)
For anything above "safe":
- Always emit a clear, copy-pasteable preview of the exact command(s) or action(s) being suggested.
- Show a concise "what will be affected" summary derived from the context envelope.
- Require an explicit confirmation step in the controlling terminal (e.g., `Run this? [y/N]` or `Type 'yes' to proceed`).
- Record the confirmation decision in any future audit trail (even if the memory layer is not yet present).
Never auto-execute anything in this slice, even "safe" suggestions, unless the user has explicitly asked for a "run" sub-mode (and even then only after preview).
**Acceptance criteria**:
- `cya "delete every log file older than 30 days in this tree"` produces a preview, classifies the action as destructive, and blocks until the user confirms in the terminal.
- The confirmation channel is always the terminal that launched `cya` (important for the future voice bridge design).
- The classifier and confirmation logic have dedicated tests that are part of the default test run (no live LLM required).
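A rule-based classifier with a terminal confirmation gate might look roughly like this. The regular expressions are illustrative and far from exhaustive, the `Risk`, `classify`, and `confirm` names are hypothetical, and the sketch covers only a subset of the labels above. Nothing here executes a command; `confirm` only reports whether the user cleared the suggestion.

```python
from __future__ import annotations

import re
from enum import Enum
from typing import Callable


class Risk(Enum):
    SAFE = "safe"
    REVIEW = "review"
    DESTRUCTIVE = "destructive"
    NETWORK = "network-affecting"


# Illustrative rules only; the first matching rule wins.
RULES = [
    (re.compile(r"\brm\b|\bfind\b.*-delete\b|\bgit\s+push\b.*(--force|-f\b)|\bsudo\b"),
     Risk.DESTRUCTIVE),
    (re.compile(r"\b(curl|wget|ssh|scp)\b"), Risk.NETWORK),
    (re.compile(r"\b(mv|chmod|chown)\b"), Risk.REVIEW),
]


def classify(command: str) -> Risk:
    """Label a suggested command; anything unmatched is treated as safe."""
    for pattern, risk in RULES:
        if pattern.search(command):
            return risk
    return Risk.SAFE


def confirm(command: str, risk: Risk, ask: Callable[[str], str] = input) -> bool:
    """Preview the command; anything above 'safe' needs the user to type 'yes'."""
    print(f"[{risk.value}] suggested command:\n  {command}")
    if risk is Risk.SAFE:
        return True
    return ask("Type 'yes' to proceed: ").strip().lower() == "yes"
```

Passing `ask` as a parameter keeps the confirmation flow testable without a live terminal: a test can supply `lambda _: ""` and assert that the destructive path stays blocked.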
### T04 — llm-connect adapter boundary (the integration seam)
```task
id: CYA-WP-0001-T04
status: todo
priority: high
```
Per SCOPE.md and INTENT.md, `can-you-assist` owns orchestration and the CLI experience; `llm-connect` owns provider access.
Define a small, stable interface (protocol / abstract base / typed call) in this repository that all model interaction must go through:
- Something like:
```python
from __future__ import annotations  # AssistanceRequest / AssistanceResponse are not defined yet
from typing import Protocol

class LLMAdapter(Protocol):
    def complete(
        self,
        request: AssistanceRequest,  # framed intent + packed context + config hints
    ) -> AssistanceResponse:  # suggestions, explanation, rationale, risks, raw model output summary
        ...
```
- Provide a deterministic fake / mock implementation used by all unit and safety tests.
- Provide a thin concrete adapter (or direct import) that will eventually delegate to the real `llm-connect` package.
- Configuration surface (which backend, model, temperature, token budget, etc.) must be explicit and delegated to the `llm-connect` configuration model once the boundary is stable.
**Acceptance criteria**:
- No production code path talks to an LLM (or a mock) without going through this boundary.
- The interface is documented with a short "Integration Guide for llm-connect" section or companion note.
- Switching from the fake to a real (or stubbed) `llm-connect` client is a small, localized change.
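To make the boundary concrete, here is a hedged sketch of a deterministic fake behind the protocol. The field names on `AssistanceRequest` and `AssistanceResponse`, the `FakeLLMAdapter` class, and the `assist` helper are all assumptions; the real shapes would be settled with `llm-connect`.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class AssistanceRequest:
    intent: str          # framed user intent (hypothetical field)
    context: str = ""    # packed context (hypothetical field)


@dataclass(frozen=True)
class AssistanceResponse:
    suggestion: str
    explanation: str
    risks: tuple[str, ...] = ()


class LLMAdapter(Protocol):
    def complete(self, request: AssistanceRequest) -> AssistanceResponse: ...


@dataclass
class FakeLLMAdapter:
    """Deterministic stand-in: returns canned responses keyed by intent."""
    canned: dict[str, AssistanceResponse] = field(default_factory=dict)
    calls: list[AssistanceRequest] = field(default_factory=list)

    def complete(self, request: AssistanceRequest) -> AssistanceResponse:
        self.calls.append(request)  # recorded so tests can inspect what was sent
        default = AssistanceResponse(suggestion="",
                                     explanation="no canned response for this intent")
        return self.canned.get(request.intent, default)


def assist(adapter: LLMAdapter, intent: str, context: str = "") -> AssistanceResponse:
    """The only call site that touches a model, real or fake."""
    return adapter.complete(AssistanceRequest(intent=intent, context=context))
```

Because `assist` depends only on the protocol, replacing `FakeLLMAdapter` with a thin wrapper around the real client would change the object passed in, not the callers.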
### T05 — Thin, explicit phase-memory ports and future hooks
```task
id: CYA-WP-0001-T05
status: todo
priority: medium
```
Prepare the ground for `phase-memory` without pulling a dependency or inventing hidden state.
Define clear ports / extension points for the memory capabilities that INTENT.md says must remain under user control:
- Remember a preference or workflow pattern.
- Recall relevant history / preferences for the current cwd + task class.
- Forget / reset (scoped).
- Inspect / export current memory for this project or user.
In the MVP these ports can be:
- Pure no-ops that clearly log "phase-memory not yet connected".
- Or a trivial local JSON file store under an opt-in, user-visible location (`~/.config/cya/` or per-project `.cya/memory.json`), explicitly labeled as a placeholder.
All memory interactions must be behind these ports. No global singletons, no implicit `~/.cache` magic, no opaque vendor memory.
**Acceptance criteria**:
- Code review can point to the exact files/functions that will be replaced or implemented by `phase-memory` integration.
- The MVP still functions (gracefully) with memory completely disabled.
- README and help text explain the intended memory story and how users will stay in control.
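The four memory capabilities could be expressed as a port with a no-op default, as in the sketch below. The `MemoryPort` and `NoOpMemory` names, the method signatures, and the `scope` parameter are hypothetical; the actual contract belongs to the `phase-memory` integration.

```python
from __future__ import annotations

import logging
from typing import Protocol

log = logging.getLogger("cya.memory")  # hypothetical logger name


class MemoryPort(Protocol):
    """Extension point that a phase-memory integration would implement later."""

    def remember(self, scope: str, key: str, value: str) -> None: ...
    def recall(self, scope: str, key: str) -> str | None: ...
    def forget(self, scope: str) -> None: ...
    def export(self, scope: str) -> dict[str, str]: ...


class NoOpMemory:
    """Placeholder used while memory is disabled: stores nothing, says so."""

    def remember(self, scope: str, key: str, value: str) -> None:
        log.info("phase-memory not yet connected; ignoring remember(%s, %s)", scope, key)

    def recall(self, scope: str, key: str) -> str | None:
        log.info("phase-memory not yet connected; nothing to recall")
        return None

    def forget(self, scope: str) -> None:
        log.info("phase-memory not yet connected; nothing to forget")

    def export(self, scope: str) -> dict[str, str]:
        return {}
```

Passing a `MemoryPort` instance explicitly into the orchestrator, rather than importing a module-level object, is one way to honor the "no global singletons" rule.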
### T06 — Assistance orchestrator and prompt/response handling
```task
id: CYA-WP-0001-T06
status: todo
priority: high
```
Build the piece that turns raw user intent and collected context into a well-formed request to the LLM adapter, and then turns the adapter response into terminal output the user can act on.
Responsibilities:
- Lightweight intent framing (command suggestion vs. explanation vs. summarization vs. plan vs. "help me understand this error").
- Context packing with rough token awareness (so we don't blindly overflow models).
- Prompt construction (or structured request) that includes the safety charter, output schema expectations, and the collected context.
- Post-processing: parse structured output, attach local provenance explanations, produce the final user-facing artifact (suggestion block, rationale, next-step hints).
The orchestrator must be testable in isolation with a fake LLM adapter.
**Acceptance criteria**:
- The end-to-end flow (context → orchestrator → fake LLM → rendered output) works for at least the four primary use-case families listed in INTENT.md.
- Every response the user sees includes a short "why this answer" section that references the context pieces actually used.
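Context packing with rough token awareness can be illustrated with a greedy budget. The four-characters-per-token estimate is a crude assumption, not a real tokenizer, and the `estimate_tokens` / `pack_context` names are hypothetical; real token counting is expected to come from `llm-connect`.

```python
from __future__ import annotations


def estimate_tokens(text: str) -> int:
    """Very rough heuristic: about four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def pack_context(pieces: list[tuple[str, str]], budget: int) -> tuple[str, list[str]]:
    """Greedily keep (source, content) pieces, in priority order, within the budget.

    Returns the packed text and the list of sources actually included.
    """
    parts: list[str] = []
    used: list[str] = []
    remaining = budget
    for source, content in pieces:
        block = f"## {source}\n{content}"
        cost = estimate_tokens(block)
        if cost > remaining:
            continue  # this piece does not fit; a smaller later piece still might
        parts.append(block)
        used.append(source)
        remaining -= cost
    return "\n\n".join(parts), used
```

The returned list of sources is the natural input for the "why this answer" section, since it names only the context pieces that were actually packed.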
### T07 — Test strategy, harness, and safety-focused suite
```task
id: CYA-WP-0001-T07
status: todo
priority: high
```
Choose and bootstrap a test framework appropriate for a console tool (pytest is the obvious default given the Python signal).
Required test categories for the slice:
- Context collector safety and bounding properties.
- Risk classifier + confirmation flow (the "never auto-execute destructive actions" invariant must be tested).
- Orchestrator behavior with deterministic fake LLM responses (including error and refusal paths).
- CLI surface (help, version, argument parsing, stdin handling).
- Adapter boundary contract tests (the fake satisfies the protocol; swapping implementations is safe).
Prefer fast, hermetic, no-network tests as the default. Any test that touches a real model or external service must be explicitly opt-in (markers, env vars) and skipped in normal CI.
**Acceptance criteria**:
- The test suite runs cleanly from a fresh checkout with `pytest` (or the chosen equivalent) or `make test`.
- Safety invariants have explicit, readable test names and assertions.
- Coverage or a manual checklist shows the high-risk paths are exercised.
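The opt-in rule for live tests could be enforced with an environment variable, as sketched here using stdlib `unittest` classes (which pytest also collects). The `CYA_LIVE_TESTS` variable and the `execute_suggestion` stand-in are hypothetical and exist only to make the example self-contained.

```python
import os
import unittest

LIVE = os.environ.get("CYA_LIVE_TESTS") == "1"  # hypothetical opt-in variable


def execute_suggestion(command: str, confirmed: bool) -> bool:
    """Stand-in for the code under test: reports whether the command would run."""
    return confirmed


class SafetyInvariantTests(unittest.TestCase):
    def test_destructive_command_is_not_executed_without_confirmation(self):
        self.assertFalse(execute_suggestion("rm -rf build", confirmed=False))


@unittest.skipUnless(LIVE, "live model tests are opt-in: set CYA_LIVE_TESTS=1")
class LiveModelTests(unittest.TestCase):
    def test_real_backend_round_trip(self):
        self.skipTest("placeholder until a real backend is wired in")
```

With this layout the default run stays hermetic, and the test name itself states the invariant being protected.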
### T08 — Documentation, examples, and handoff to siblings / operator
```task
id: CYA-WP-0001-T08
status: todo
priority: medium
```
Make the MVP usable and the integration points obvious.
- Heavily update README.md with real invocation examples, architecture overview (text diagram is fine), and "How to integrate with llm-connect / phase-memory" guidance.
- Add or expand USAGE.md / docs/ with the canonical workflows.
- In the workplan or a short companion note, explicitly list the extension points and any technical debt discovered during the slice (for later registration in State Hub).
- Ensure the bootstrap workstream tasks that depend on this (SBOM, extension point registration) have clear handoff notes.
**Acceptance criteria**:
- A person who has never seen the repo before can clone it, install it, and successfully run 2-3 non-trivial `cya` requests after reading only the updated README.
- The sibling project owners can read this workplan + the boundary documentation and know exactly where their packages will plug in.
## Dependencies & Cross-Repo Coordination
- **llm-connect**: Supplies the real multi-provider client, config chain, token counting, and structured response helpers. We define the consumption contract here.
- **phase-memory**: Supplies the actual memory implementation behind the ports we define. We own the call sites and the "user stays in control" contract.
- **State Hub (custodian domain)**: Owns work tracking, progress, decisions, and the `fix-consistency` sync that will import this workplan's tasks.
- No other hard runtime dependencies for the core MVP loop.
## Stack & Technology Choices (Decision Required Before `active`)
This workplan is intentionally left in `ready` status until the following are explicitly reviewed and recorded:
- Primary language / runtime (Python is the current leading candidate).
- CLI framework (typer, click, or stdlib argparse + rich for presentation).
- Terminal presentation library (rich, textual, or plain stdlib + colors).
- Configuration format (TOML strongly preferred for alignment with llm-connect patterns).
- Whether an initial `pyproject.toml` + src layout or a lighter single-file bootstrap is preferred for T01.
Once the stack review is complete, change the workplan `status` to `active` and begin implementation (or spawn follow-on workplans if the slice needs to be split).
## References & Links
- Parent bootstrap: State Hub workstream `repo-integration-can-you-assist` (tasks T01-T04)
- This repo: INTENT.md, SCOPE.md, AGENTS.md, README.md, wiki/CyaSpeechModeExtension.md
- Sibling repos (for patterns and interfaces): llm-connect, phase-memory
- State Hub consistency command (run after any workplan or task file change):
`cd ~/state-hub && make fix-consistency REPO=can-you-assist`
---
**Status note**: Created as the direct output of bootstrap Task T02. This file is the authoritative plan. Do not implement the tasks until the status is moved to `active` after stack review.