diff --git a/INTENT.md b/INTENT.md new file mode 100644 index 0000000..6fc7599 --- /dev/null +++ b/INTENT.md @@ -0,0 +1,511 @@ +# INTENT + +## Project Name + +**can-you-assist** + +Short name / command name: + +**cya** + +--- + +## Purpose + +`can-you-assist` exists to provide a command-line assistant interface for getting practical work done in the console with the help of large language models. + +The project aims to make LLM-based assistance available directly where many technical users already work: in shells, repositories, local filesystems, notes, scripts, logs, and project directories. + +Instead of requiring users to remember every option, flag, subcommand, file format, or tool-specific convention, `cya` should allow users to express intent in natural language and receive useful, context-aware help for command-line tasks. + +The guiding idea is simple: + +> Let the user ask for help at the console, let available LLM infrastructure provide the reasoning, and keep the memory, history, preferences, and adaptation under the user’s control. + +--- + +## Primary Utility + +`cya` provides a command-line tool that helps users interact with their local working environment through LLM-supported assistance. + +It should support tasks such as: + +- understanding files, folders, repositories, notes, and project structures; +- generating, explaining, improving, or transforming shell commands; +- assisting with code repository inspection and maintenance; +- helping with file-based workflows; +- summarizing and navigating textual content; +- providing context-aware hints without requiring users to remember detailed command syntax; +- adapting to user preferences, workflows, aliases, conventions, and recurring patterns over time. + +The tool should act as a practical helper for console-based work, not as a replacement for the shell, editor, version control system, or specialized CLI tools. 
+ +--- + +## Core Motivation + +Modern command-line environments are powerful but cognitively expensive. + +Users often know what they want to achieve but do not remember: + +- the exact command; +- the right flags; +- the safest invocation; +- the expected file format; +- the best tool for the job; +- the local project conventions; +- the accumulated context from previous related work. + +LLM-based tools can help with this, but many existing systems centralize memory, history, adaptation, and workflow knowledge inside vendor-controlled applications. + +`cya` should provide a user-controlled alternative: + +- the user chooses or configures the available LLM backend; +- interaction history remains inspectable and portable; +- preferences and learned patterns are stored under user control; +- project-specific context can be managed locally; +- the tool can integrate with different LLM infrastructures instead of being locked to one provider. + +--- + +## Intended Users + +`cya` is primarily intended for users who work frequently in a terminal and want lightweight LLM assistance without leaving their local workflow. + +Important user groups include: + +- software developers; +- technical writers; +- repository maintainers; +- system operators; +- power users managing file-based notes and knowledge spaces; +- users working with many command-line tools but not wanting to memorize all details; +- users who want LLM assistance while keeping memory and adaptation portable and transparent. + +--- + +## Primary Use Cases + +### 1. Command Assistance + +The user describes what they want to do, and `cya` suggests an appropriate command or sequence of commands. + +Examples: + +```bash +cya "find all markdown files modified this week" +cya "show me the git branches that are already merged" +cya "compress this folder but exclude node_modules" +``` + +The goal is not merely command generation, but safe, explainable, context-aware assistance. + +--- + +### 2. 
Filesystem and Content Interaction + +`cya` should help users work with local files and directories. + +Examples: + +```bash +cya "summarize the notes in this folder" +cya "find files that look like project plans" +cya "extract todos from these markdown files" +cya "compare these two config files" +``` + +The tool should be especially useful for text-heavy local workspaces such as code repositories, markdown notes, documentation trees, logs, and exported knowledge bases. + +--- + +### 3. Code Repository Assistance + +`cya` should help users inspect, understand, and work with code repositories. + +Examples: + +```bash +cya "what does this repo do?" +cya "where is the CLI entry point?" +cya "find likely places where authentication is handled" +cya "summarize recent changes" +cya "suggest a safe first refactoring step" +``` + +The project should support repository-aware context discovery without assuming one fixed project structure or language ecosystem. + +--- + +### 4. Local Notes and Knowledge Work + +`cya` should help users interact with file-based notes, project journals, markdown documents, and knowledge collections. + +Examples: + +```bash +cya "collect open decisions from these notes" +cya "turn this rough note into a project outline" +cya "find related notes about deployment" +cya "summarize what I worked on in this directory" +``` + +This use case is important because many users maintain valuable context in local text files rather than in centralized SaaS tools. + +--- + +### 5. Personalized Console Helper + +`cya` should adapt to the user’s habits over time. + +It should be able to remember things such as: + +* preferred command styles; +* preferred editors; +* shell conventions; +* repository conventions; +* naming preferences; +* common project types; +* preferred explanation depth; +* safety preferences; +* recurring workflows. + +This adaptation should be implemented through user-controlled memory rather than opaque vendor-side personalization. 
+ +--- + +## Relationship to `llm-connect` + +`cya` should use **llm-connect** as the abstraction layer for accessing LLM capabilities. + +The purpose of this relationship is to keep `cya` independent from any single LLM provider, application, API, or runtime. + +`llm-connect` should provide the connectivity layer for: + +* selecting available LLM backends; +* sending prompts and context; +* receiving completions or structured responses; +* supporting different providers or local models; +* allowing users to configure their own LLM infrastructure. + +`cya` should not hard-code one privileged LLM vendor as its conceptual foundation. + +Instead: + +> `cya` asks for assistance; `llm-connect` determines how available LLM infrastructure is reached. + +--- + +## Relationship to `phase-memory` + +`cya` should use **phase-memory** to provide memory, history, preferences, and adaptation under user control. + +The purpose of this relationship is to separate assistant behavior from opaque application-managed memory. + +`phase-memory` should support: + +* storing interaction history; +* managing user preferences; +* capturing recurring workflows; +* maintaining project-specific memory; +* distinguishing global, local, temporary, and contextual memory; +* allowing users to inspect, modify, export, reset, or delete memory. + +This is a core design principle: + +> The user’s working memory should belong to the user, not to a vendor-owned assistant interface. + +`cya` should therefore treat memory as a local or user-controlled capability, not as an invisible feature hidden behind a hosted product. + +--- + +## Design Principles + +### Console-Native + +`cya` should feel natural in a terminal environment. + +It should work well with: + +* shell pipelines; +* current working directory context; +* local files; +* environment variables; +* git repositories; +* scripts; +* standard input and output; +* human-readable text; +* machine-readable output where useful. 
+ +--- + +### User-Controlled + +The user should remain in control of: + +* which LLM backend is used; +* what context is sent; +* what history is stored; +* what preferences are remembered; +* what project memory exists; +* when potentially destructive actions are executed. + +`cya` should assist, not take over. + +--- + +### Explainable and Inspectable + +When `cya` suggests commands or actions, it should be possible to inspect what it is doing. + +Important behaviors should be explainable: + +* why a command was suggested; +* what files are being read; +* what context is being used; +* what memory influenced the answer; +* what risks may exist; +* what assumptions were made. + +--- + +### Safe by Default + +`cya` should avoid dangerous automatic execution. + +Potentially destructive operations should require explicit user confirmation. + +Examples include: + +* deleting files; +* overwriting files; +* force-pushing git branches; +* changing permissions; +* running network-affecting commands; +* executing generated scripts; +* modifying many files at once. + +The tool should prefer preview, explanation, and confirmation over blind execution. + +--- + +### Backend-Agnostic + +`cya` should not be tied to one assistant product. + +It should be able to use different LLM infrastructures through `llm-connect`, including: + +* hosted LLM APIs; +* local models; +* CLI-based LLM tools; +* organization-provided endpoints; +* future agentic runtimes. + +The command-line interface should remain stable even if the underlying LLM infrastructure changes. + +--- + +### Memory-Aware but Not Memory-Dependent + +`cya` should become more useful through memory, but it should not require memory to function. + +A first-time user should be able to use it immediately. + +A long-time user should benefit from accumulated context, preferences, and workflow knowledge. + +--- + +## Conceptual Architecture + +At a conceptual level, `cya` consists of the following responsibilities: + +### 1. 
CLI Interface + +The command-line entry point used by the user. + +Responsible for: + +* parsing user input; +* identifying command modes; +* handling stdin/stdout; +* resolving current directory context; +* presenting results; +* asking for confirmation where needed. + +--- + +### 2. Context Collector + +Responsible for gathering relevant local context. + +Possible sources include: + +* current working directory; +* selected files; +* git status; +* repository metadata; +* configuration files; +* piped input; +* recent command history, where available and permitted; +* explicit user-provided context. + +The context collector should be conservative and transparent. + +--- + +### 3. Assistance Orchestrator + +Responsible for turning user intent and collected context into a structured assistance request. + +It decides: + +* what kind of help is being requested; +* what context is relevant; +* whether memory should be consulted; +* whether the answer should be a command, explanation, patch, summary, or plan; +* whether safety checks are needed. + +--- + +### 4. LLM Access Layer + +Provided through `llm-connect`. + +Responsible for: + +* communicating with the selected LLM backend; +* abstracting provider differences; +* supporting configuration of available models and endpoints. + +--- + +### 5. Memory Layer + +Provided through `phase-memory`. + +Responsible for: + +* user preferences; +* project-specific memory; +* interaction history; +* workflow adaptation; +* reusable patterns; +* inspectable assistant state. + +--- + +### 6. Safety and Confirmation Layer + +Responsible for identifying risky actions and enforcing confirmation flows. + +This layer should help prevent accidental damage caused by generated commands or misunderstood instructions. + +--- + +## Expected Repository Role + +This repository should contain the implementation and documentation for the `cya` command-line tool. 
+ +It should define: + +* the CLI behavior; +* configuration model; +* integration points with `llm-connect`; +* integration points with `phase-memory`; +* prompt and context-handling conventions; +* safety rules; +* user-facing documentation; +* examples and workflows; +* tests for core command behavior. + +--- + +## Out of Scope + +The project should not initially aim to become: + +* a full IDE; +* a full agentic coding environment; +* a vendor-specific LLM client; +* a replacement for Claude Code, Codex, shell tools, git, editors, or package managers; +* a centralized SaaS memory platform; +* an autonomous system that freely modifies the filesystem without confirmation; +* a general-purpose chatbot detached from console work. + +`cya` may overlap with existing LLM console tools, but its intended differentiator is user-controlled, console-native assistance with portable memory and backend-agnostic LLM access. + +--- + +## Initial Scope + +The initial version should focus on a small but useful command-line assistant loop. + +A reasonable first scope includes: + +* a `cya` command that accepts natural-language requests; +* current-directory awareness; +* optional file input; +* optional stdin input; +* basic command suggestion; +* basic explanation mode; +* safe preview-first behavior; +* integration with `llm-connect`; +* simple integration with `phase-memory` for preferences and history; +* configuration file support; +* clear documentation of what context is sent to the LLM. + +Example commands: + +```bash +cya "what does this repo do?" +cya "suggest a command to find large files" +cya explain "tar -czf archive.tar.gz ./docs" +cya files "summarize these markdown files" README.md docs/*.md +cat error.log | cya "what is going wrong here?" +``` + +--- + +## Long-Term Direction + +Over time, `cya` may evolve into a personalized command-line assistant substrate. 
+ +Possible future capabilities include: + +* project-specific assistant profiles; +* reusable workflow recipes; +* local command learning; +* shell integration; +* repository indexing; +* structured task execution plans; +* safe patch generation; +* memory inspection and editing commands; +* team-shared memory packs; +* domain-specific helper plugins; +* integration with agentic coding tools; +* support for multiple concurrent LLM backends; +* transparent context and cost reporting. + +The long-term ambition is not to create yet another closed assistant interface, but to provide a small, powerful, user-controlled bridge between local work and available LLM intelligence. + +--- + +## Success Criteria + +The project is successful if users can: + +* get useful command-line help without remembering tool details; +* work more effectively with local files, notes, and repositories; +* choose their own LLM backend; +* inspect and control what context is used; +* preserve their own history and preferences; +* adapt the assistant to their workflows over time; +* avoid vendor lock-in for assistant memory and personalization; +* safely use LLM assistance in real console workflows. + +--- + +## Mission Statement + +`can-you-assist` gives users a console-native, backend-agnostic LLM helper that understands local work, remembers under user control, and helps get things done without requiring command-line trivia to be kept in human memory. + diff --git a/README.md b/README.md index fcd7b8f..a0eae14 100644 --- a/README.md +++ b/README.md @@ -1,3 +1,3 @@ -# repo-seed - -A git repository template to bootstrap coulomb projects from. \ No newline at end of file +Cya provides a console-native, backend-agnostic LLM helper that understands local work, +remembers, is under user control, and helps get things done without requiring command-line +trivia to be kept in human memory. 
diff --git a/wiki/CyaSpeechModeExtension.md b/wiki/CyaSpeechModeExtension.md new file mode 100644 index 0000000..b78a88d --- /dev/null +++ b/wiki/CyaSpeechModeExtension.md @@ -0,0 +1,1018 @@ +# CYA Speech Mode Extension + +## Working Name + +**cya speech mode** + +Possible short names: + +- `cya talk` +- `cya voice` +- `cya listen` +- `cya pair` +- `cya phone` + +--- + +## Core Idea + +`cya` should support a speech interaction mode where the user can use a phone as the microphone, speech-recognition device, and conversational front-end for a `cya` helper session running on a terminal. + +This allows users to speak to `cya` even when the active terminal environment has no usable microphone, no audio stack, no GUI, or no convenient speech-recognition capability. + +The terminal remains the operational context. + +The phone becomes the voice interface. + +The LLM interaction remains bound to the currently activated `cya` session. + +--- + +## User Scenario + +A user is working in a terminal on a server, VM, SSH session, minimal Linux installation, or development machine. + +They want to ask: + +> “What does this error mean?” +> “Generate a command to find all Python files importing requests.” +> “Explain the current git status.” +> “Create a commit message for these changes.” +> “Summarize the files in this folder.” + +Instead of typing the request, they activate a `cya` helper session in the terminal and connect the phone app. + +The phone enters conversation mode and uses its microphone and speech recognition. + +The recognized text is sent to the active `cya` session. + +The terminal-side `cya` helper answers in the terminal, and optionally also sends a spoken or textual response back to the phone. + +--- + +## Basic Interaction Flow + +### 1. User activates CYA in terminal + +Example: + +```bash +cya talk +``` + +or: + +```bash +cya --voice +``` + +The CLI starts a local or remote helper session and displays a pairing instruction. 
+ +Example: + +```text +CYA voice session started. + +Pair with phone: + Open the CYA mobile app and scan this QR code. + +Session: + repo: /home/bernd/projects/example + mode: voice bridge + expires: 5 minutes +``` + +The terminal may show a QR code containing a short-lived pairing token. + +--- + +### 2. User opens the CYA phone app + +The app scans the QR code or enters a pairing code. + +The phone is now connected to the active `cya` session. + +The app shows: + +```text +Connected to: +example repo on tinker-base + +Listening... +``` + +--- + +### 3. User speaks into the phone + +The phone captures audio and performs speech recognition. + +The recognized text is sent to the active terminal-side `cya` helper. + +Example recognized utterance: + +```text +Please explain the current git status and suggest a commit message. +``` + +--- + +### 4. Terminal-side CYA processes the request + +The terminal-side `cya` helper has access to the local context: + +* current working directory; +* selected files; +* git status; +* shell environment; +* project memory; +* user preferences; +* configured LLM backend through `llm-connect`; +* memory through `phase-memory`. + +The phone does **not** need direct access to the filesystem. + +That is important. + +The phone provides speech input. + +The CLI session owns the local operational context. + +--- + +### 5. Response is returned + +The response appears in the terminal. + +Optionally, a concise response is also sent back to the phone. + +Example terminal response: + +```text +Current git status summary: + +- 3 modified files +- 1 new file +- no staged changes + +Suggested commit message: + +feat(cli): add initial voice session pairing flow + +Suggested next command: + +git add src/voice-session.ts README.md +git commit -m "feat(cli): add initial voice session pairing flow" +``` + +The phone may show: + +```text +I found three modified files and one new file. 
Suggested commit message: +feat(cli): add initial voice session pairing flow +``` + +Optionally, the phone can read that aloud. + +--- + +## Key Design Principle + +The phone should be a **voice bridge**, not the primary authority. + +The terminal session remains the source of truth for: + +* current working directory; +* filesystem access; +* repository context; +* execution permissions; +* local configuration; +* project memory; +* safety confirmations. + +The phone handles: + +* microphone input; +* speech recognition; +* maybe text-to-speech; +* session pairing; +* conversational convenience. + +This avoids turning the phone app into a remote filesystem client or full IDE. + +--- + +## Conceptual Architecture + +```text ++------------------+ +----------------------+ +------------------+ +| | speech | | text | | +| Phone App +----------> CYA Session Broker +----------> CYA CLI Session | +| | text | | result | | ++--------+---------+<----------+----------+-----------+<----------+--------+---------+ + | | | + | | | + v v v + Microphone / STT Pairing / Routing Local Context + Text-to-Speech Session Registry Filesystem + Conversation UI Authentication Git / Shell + llm-connect + phase-memory +``` + +--- + +## Main Components + +### 1. CYA CLI Voice Session + +The CLI needs a mode that creates an active voice-addressable session. + +Responsibilities: + +* create session ID; +* generate short-lived pairing token; +* register itself with a broker; +* expose current terminal context; +* receive transcribed user messages; +* process messages through normal `cya` assistance flow; +* return responses; +* enforce safety and confirmation rules. + +Possible command: + +```bash +cya talk +``` + +or: + +```bash +cya session start --voice +``` + +--- + +### 2. CYA Phone App + +The phone app provides the user-facing speech interface. 
+ +Responsibilities: + +* scan QR code or enter pairing code; +* establish secure connection to active session; +* capture microphone input; +* run speech recognition; +* show transcript before sending, depending on mode; +* send text requests to the active `cya` session; +* display responses; +* optionally read responses aloud; +* allow pause, mute, reconnect, and disconnect. + +The app does not need to know the details of the user’s filesystem. + +--- + +### 3. Session Broker + +A broker is needed to connect the phone to the CLI session. + +This could be implemented in different ways. + +#### Option A: Local Network Broker + +The CLI starts a small local WebSocket server. + +The phone connects directly over the LAN. + +Good for: + +* home networks; +* office networks; +* local development; +* no cloud dependency. + +Challenges: + +* NAT/firewall issues; +* phone and terminal must be on same network; +* HTTPS/TLS handling; +* service discovery. + +Example: + +```bash +cya talk --listen 192.168.1.42:47391 +``` + +QR code contains: + +```text +cya://pair?host=192.168.1.42&port=47391&token=... +``` + +--- + +#### Option B: Relay Broker + +A small relay service connects phone and CLI. + +Good for: + +* SSH sessions; +* cloud servers; +* NAT traversal; +* mobile networks; +* remote development. + +The relay does not need to see filesystem data if messages are end-to-end encrypted or if the relay only routes session traffic. + +QR code contains: + +```text +cya://pair?relay=https://relay.cya.example&session=...&token=... +``` + +Challenges: + +* requires hosted infrastructure; +* introduces trust and privacy questions; +* should be optional, not mandatory. + +--- + +#### Option C: User-Controlled Broker + +Advanced users can self-host the broker. + +This fits the philosophy of `cya`. + +Possible command: + +```bash +cya-broker serve +``` + +Then: + +```bash +cya talk --broker https://my-broker.example +``` + +This preserves the user-controlled infrastructure principle. 
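As a minimal sketch of how a phone client might read the two pairing URI shapes shown for Option A and Option B, the following TypeScript parses a `cya://pair?...` URI into either a local or a relay target. The type `PairingTarget`, the function `parsePairingUri`, and the validation rules are assumptions made for illustration; the source only shows the example URIs, not a schema.

```typescript
// Hypothetical result shapes; the source defines no schema for pairing URIs.
type PairingTarget =
  | { kind: "local"; host: string; port: number; token: string }
  | { kind: "relay"; relay: string; session: string; token: string };

// Parse a cya://pair?... URI using the standard URL class.
function parsePairingUri(uri: string): PairingTarget {
  const url = new URL(uri);
  if (url.protocol !== "cya:" || url.hostname !== "pair") {
    throw new Error("not a cya pairing URI");
  }

  const params = url.searchParams;
  const token = params.get("token");
  if (!token) {
    throw new Error("missing pairing token");
  }

  // Relay form (Option B): relay URL plus session identifier.
  const relay = params.get("relay");
  if (relay !== null) {
    const session = params.get("session");
    if (!session) {
      throw new Error("relay pairing requires a session");
    }
    return { kind: "relay", relay, session, token };
  }

  // Local form (Option A): host and port of the CLI's WebSocket server.
  const host = params.get("host");
  const port = Number(params.get("port"));
  if (!host || !Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("local pairing requires a host and a valid port");
  }
  return { kind: "local", host, port, token };
}
```

Calling `parsePairingUri("cya://pair?host=192.168.1.42&port=47391&token=abc123")` (with a made-up token) would yield a `local` target. Keeping both broker variants behind one parsed type is one way to let the rest of the phone client stay independent of which broker option is in use.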
+ +--- + +## Recommended Implementation Path + +A practical implementation path could have four stages. + +--- + +## Stage 1: Text Bridge Prototype + +Do not start with speech. + +Start with a phone-to-terminal text bridge. + +Goal: + +* CLI starts a session. +* Phone connects. +* User types text on phone. +* Text appears in the active `cya` session. +* CLI processes request and returns response. + +This proves: + +* pairing; +* routing; +* session ownership; +* broker model; +* authentication; +* terminal-side context handling. + +Example: + +```bash +cya talk +``` + +Phone app: + +```text +Connected. Type your request. +``` + +This avoids early complexity around speech APIs. + +--- + +## Stage 2: Speech-to-Text on Phone + +Add microphone and speech recognition. + +The phone converts speech to text locally or through platform speech recognition. + +Options: + +* iOS speech recognition; +* Android speech recognition; +* browser-based Web Speech API for a PWA; +* local speech model on device where feasible; +* cloud speech-to-text if user permits. + +The important architectural point: + +> `cya` receives text, not raw audio, unless explicitly configured otherwise. + +This keeps the CLI helper simple and avoids requiring audio handling on the terminal machine. + +--- + +## Stage 3: Conversation Mode + +Add continuous or semi-continuous voice interaction. + +Modes: + +### Push-to-Talk Mode + +Safest and easiest. + +User presses and holds a button, speaks, releases, and sends. + +Good default. + +### Confirm-Before-Send Mode + +The app shows recognized text first. + +User taps send. + +Useful when commands may be risky. + +### Continuous Conversation Mode + +The app listens continuously and sends utterances automatically. + +Useful but riskier. + +Should probably be opt-in. + +--- + +## Stage 4: Bidirectional Voice + +Add optional text-to-speech responses. + +The phone can speak concise answers aloud. 
+ +This should be configurable: + +```bash +cya talk --phone-speak concise +cya talk --phone-speak off +cya talk --phone-speak full +``` + +The terminal should still show the full response. + +The phone should preferably receive a summarized voice response to avoid reading long command explanations aloud. + +--- + +## Safety Model + +Speech mode needs stricter safety defaults than typed CLI use. + +Speech recognition can mishear commands. + +Therefore: + +### Never execute destructive commands directly from speech + +For example, if the user says: + +> Delete all generated files. + +`cya` should produce a preview and require explicit terminal confirmation. + +### Require confirmation for execution + +Possible pattern: + +```text +Suggested command: + +rm -rf build/ + +This may delete files. +Run it? [y/N] +``` + +The confirmation should happen in the terminal by default, not only on the phone. + +### Separate “ask”, “suggest”, and “run” + +Useful command modes: + +```bash +cya talk --suggest-only +cya talk --allow-run +cya talk --no-exec +``` + +Default should be: + +```bash +cya talk --suggest-only +``` + +### Show recognized transcript + +The phone should always make clear what it heard. + +For risky requests, it should require confirmation before sending. + +--- + +## Privacy Model + +Speech mode touches sensitive areas: + +* spoken input; +* local filesystem context; +* repository contents; +* personal memory; +* command history; +* notes. + +So the design should make privacy visible and controllable. + +The user should be able to see: + +* which phone is connected; +* which terminal session it is connected to; +* what context is being sent to the LLM; +* which LLM backend is used; +* what is stored in memory; +* whether speech recognition is local or cloud-based; +* whether a relay broker is involved. 
+ +Example session banner: + +```text +CYA voice session active + +Phone: + Bernd's iPhone + +Speech recognition: + on-device if available, platform fallback allowed + +LLM backend: + llm-connect: local-openrouter-default + +Memory: + phase-memory: user + project preferences + +Broker: + local websocket + +Execution: + suggest-only +``` + +--- + +## Pairing and Authentication + +Pairing should be short-lived and explicit. + +Possible pairing methods: + +### QR Code + +Best default. + +```bash +cya talk +``` + +Terminal displays QR code. + +Phone scans it. + +### Numeric Code + +Fallback for terminals that cannot show QR codes. + +```text +Pairing code: 482-119 +Expires in 5 minutes. +``` + +### Known Device Trust + +After initial pairing, user may mark a phone as trusted. + +Even then, each terminal voice session should be explicitly activated. + +Trusted device should not mean always-on access. + +--- + +## Session Ownership + +Each voice interaction should be bound to a specific active `cya` session. + +That session has: + +* session ID; +* current working directory; +* user identity; +* host identity; +* start time; +* allowed capabilities; +* selected LLM backend; +* selected memory scope; +* execution policy. + +This prevents accidental cross-talk where the phone sends a request to the wrong terminal or wrong repository. 
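One way to picture the session binding described above is a small registry that routes an incoming message only to the session it names. This is a hedged sketch: `VoiceSession`, `IncomingMessage`, and `SessionRegistry` are hypothetical names, the record carries only a subset of the listed properties, and the policy values simply mirror the `--suggest-only`, `--allow-run`, and `--no-exec` flags mentioned earlier.

```typescript
// Hypothetical session record; a subset of the properties listed above.
interface VoiceSession {
  sessionId: string;
  cwd: string;
  host: string;
  startedAt: number; // epoch milliseconds
  executionPolicy: "suggest-only" | "allow-run" | "no-exec";
}

// Hypothetical incoming message: just enough to demonstrate routing.
interface IncomingMessage {
  sessionId: string;
  transcript: string;
}

class SessionRegistry {
  private sessions = new Map<string, VoiceSession>();

  register(session: VoiceSession): void {
    this.sessions.set(session.sessionId, session);
  }

  stop(sessionId: string): void {
    this.sessions.delete(sessionId);
  }

  // Returns the session the message is bound to, or undefined if the
  // session ID is unknown or the session has been stopped.
  route(msg: IncomingMessage): VoiceSession | undefined {
    return this.sessions.get(msg.sessionId);
  }
}
```

Because a message with an unknown session ID resolves to `undefined`, the caller can drop it rather than guess a target, which is the cross-talk protection the paragraph above asks for.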
+ +Useful phone display: + +```text +Connected to: +tinker-base +/home/bernd/repos/can-you-assist + +Mode: +suggest-only +``` + +--- + +## Possible CLI Commands + +### Start voice session + +```bash +cya talk +``` + +### Start voice session for current repository + +```bash +cya talk --repo +``` + +### Start with local broker + +```bash +cya talk --broker local +``` + +### Start with relay broker + +```bash +cya talk --broker relay +``` + +### Start with strict safety + +```bash +cya talk --suggest-only +``` + +### Start with no memory + +```bash +cya talk --no-memory +``` + +### Start with project memory + +```bash +cya talk --memory project +``` + +### List active sessions + +```bash +cya sessions +``` + +### End current session + +```bash +cya session stop +``` + +--- + +## Example User Experience + +Terminal: + +```bash +cd ~/repos/can-you-assist +cya talk +``` + +Terminal output: + +```text +CYA voice session started for: + + ~/repos/can-you-assist + +Mode: + suggest-only + +Pair your phone: + scan QR code or enter code 913-442 + +Waiting for phone... +``` + +Phone: + +```text +Connected to can-you-assist on tinker-base. + +Tap and speak. +``` + +User speaks: + +> What would be a good initial module structure for this project? + +Terminal: + +```text +You asked: +"What would be a good initial module structure for this project?" + +Suggested initial structure: + +src/ + cli/ + main.ts + commands/ + session/ + voice-session.ts + pairing.ts + context/ + collector.ts + git.ts + filesystem.ts + llm/ + llm-connect-client.ts + memory/ + phase-memory-client.ts + safety/ + classifier.ts + confirmation.ts + +Rationale: +... +``` + +Phone: + +```text +I suggested a module structure with CLI, session, context, LLM, memory, and safety layers. +``` + +--- + +## Integration With `llm-connect` + +Speech mode should not bypass `llm-connect`. 
+ +The flow should be: + +```text +Phone speech +→ transcript +→ cya CLI session +→ llm-connect +→ selected LLM backend +→ cya response +→ terminal and/or phone +``` + +This keeps backend selection consistent with normal `cya` behavior. + +The phone app should not directly call the LLM unless specifically configured for a future lightweight mode. + +--- + +## Integration With `phase-memory` + +Speech mode should use `phase-memory` for: + +* preferred speech interaction style; +* trusted phones; +* known devices; +* user voice mode preferences; +* project-specific command conventions; +* common workflows; +* interaction summaries. + +Examples of remembered preferences: + +```text +User prefers speech responses to be concise. +User wants destructive commands previewed but never executed automatically. +User usually works with git commit messages in conventional commit style. +User prefers explanations before suggested shell commands. +``` + +Memory should remain inspectable and editable. + +Possible commands: + +```bash +cya memory show +cya memory edit +cya memory forget voice.devices +``` + +--- + +## Minimal Technical Design + +A minimal first implementation could use: + +### CLI Side + +* `cya talk` starts a WebSocket server or connects to a broker. +* It creates a session token. +* It renders a QR code. +* It listens for incoming text messages. +* It routes messages through the normal `cya` assistant pipeline. + +### Phone Side + +A Progressive Web App may be enough initially. + +The PWA can: + +* scan QR codes; +* use browser speech recognition where available; +* send text over WebSocket; +* show responses. + +This avoids building native iOS and Android apps immediately. + +Later, native apps can provide better: + +* speech recognition; +* background handling; +* push-to-talk UX; +* device trust; +* text-to-speech; +* secure local storage. + +### Broker Side + +For the first version, choose one: + +#### Local-only prototype + +Simpler, private, no cloud. 
+ +Good for proof of concept. + +#### Minimal relay prototype + +More useful for SSH and remote development. + +Better real-world fit. + +A good architectural compromise: + +* implement a broker interface; +* start with local broker; +* allow relay broker later; +* keep protocol stable. + +--- + +## Protocol Sketch + +Phone sends: + +```json +{ + "type": "user_message", + "session_id": "cya_sess_123", + "message_id": "msg_001", + "input_mode": "speech", + "transcript": "Explain the current git status and suggest a commit message.", + "confidence": 0.91 +} +``` + +CLI responds: + +```json +{ + "type": "assistant_response", + "session_id": "cya_sess_123", + "message_id": "msg_001", + "terminal_response": "...full response...", + "phone_response": "I found three modified files and one new file. Suggested commit message: ...", + "requires_confirmation": false +} +``` + +For risky actions: + +```json +{ + "type": "assistant_response", + "session_id": "cya_sess_123", + "message_id": "msg_002", + "terminal_response": "Suggested command: rm -rf build/", + "phone_response": "This may delete files. Please confirm in the terminal.", + "requires_confirmation": true, + "confirmation_channel": "terminal" +} +``` + +--- + +## Important Product Distinction + +This extension should not be framed as: + +> “CYA becomes a mobile assistant.” + +It should be framed as: + +> “The phone becomes a microphone and conversation surface for the active terminal helper.” + +That distinction protects the project scope. + +The center remains the console. + +--- + +## Updated INTENT Addition + +This could be added to the `INTENT.md` under long-term direction or primary use cases: + +```markdown +### Speech-Assisted Console Interaction + +`cya` should eventually support a speech interaction mode where a phone or other capable device can act as a microphone, speech-recognition frontend, and lightweight conversation surface for an active `cya` CLI session. 
+ +This enables voice interaction even when the terminal environment itself has no microphone, audio stack, graphical interface, or speech-recognition capability. + +In this mode, the phone does not become the primary execution environment. Instead, it connects to a currently activated `cya` helper session. The CLI session remains responsible for local filesystem context, repository context, memory scope, LLM backend selection, and safety confirmation. + +The phone provides convenient speech input and optional spoken output, while `cya` preserves its console-native, user-controlled architecture. +``` + +--- + +## My Recommended Direction + +I would treat this as a distinct but natural extension: + +```text +cya-core + console assistant + +cya-voice + speech bridge protocol and session mode + +cya-mobile + phone app or PWA + +cya-broker + optional pairing and relay service +``` + +The first implementation should probably be: + +```text +cya talk ++ local WebSocket session ++ QR pairing ++ phone PWA ++ push-to-talk speech recognition ++ suggest-only mode +``` + +That gives you the core magic without overbuilding the system. + +The deeper architectural insight is: + +> Speech mode should not make the terminal speak. +> It should let a speech-capable companion device speak *to the terminal’s active assistant context*. +
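The JSON messages in the Protocol Sketch section could be written down as types, which may make the terminal-confirmation rule easier to keep consistent. The following is a sketch under assumptions: the field names follow the JSON examples, but the TypeScript types, the `"text"` input mode (suggested by the Stage 1 text bridge), and the helper `buildResponse` are hypothetical.

```typescript
// Field names follow the Protocol Sketch JSON; the types themselves are assumptions.
interface UserMessage {
  type: "user_message";
  session_id: string;
  message_id: string;
  input_mode: "speech" | "text"; // "text" is assumed for the Stage 1 text bridge
  transcript: string;
  confidence?: number;
}

interface AssistantResponse {
  type: "assistant_response";
  session_id: string;
  message_id: string;
  terminal_response: string;
  phone_response: string;
  requires_confirmation: boolean;
  confirmation_channel?: "terminal";
}

// Hypothetical helper: build the reply to a request and, for risky actions,
// require confirmation in the terminal rather than on the phone.
function buildResponse(
  request: UserMessage,
  terminalText: string,
  phoneText: string,
  risky: boolean,
): AssistantResponse {
  const base: AssistantResponse = {
    type: "assistant_response",
    session_id: request.session_id,
    message_id: request.message_id,
    terminal_response: terminalText,
    phone_response: phoneText,
    requires_confirmation: risky,
  };
  return risky ? { ...base, confirmation_channel: "terminal" } : base;
}
```

Echoing `session_id` and `message_id` from the request keeps each reply tied to the session and utterance that produced it, and setting `confirmation_channel` only for risky replies matches the two response examples in the sketch.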