generated from coulomb/repo-seed
Implement CYA-WP-0008 llm-connect adapter integration.
Wire LLMConnectAdapter behind the existing LLMAdapter seam with config-driven selection, graceful degradation, --offline mode, and bounded session context. Add unit tests, integration docs, and update README/SCOPE/AGENTS.
@@ -185,6 +185,7 @@ make check-dist

 # Run the assistant
 cya "your natural language request here"
+cya --offline "..." # deterministic fake adapter (CI / no API keys)
 cya --help
 cya --explain-context "show me what context would be collected"

@@ -213,7 +214,7 @@ Relevant workplans:
 - `workplans/CYA-WP-0005-agentic-memory-profiles-and-phase-memory-feedback.md`
 - `workplans/CYA-WP-0006-profile-1-production-hardening.md` (finished)
 - `workplans/CYA-WP-0007-interactive-shell-session.md` (ready — interactive REPL + history + hub)
-- `workplans/CYA-WP-0008-llm-connect-adapter-integration.md` (ready — real LLM behind adapter seam)
+- `workplans/CYA-WP-0008-llm-connect-adapter-integration.md` (finished — real LLM behind adapter seam)

 ---

43
README.md
@@ -16,7 +16,7 @@ usable after `pip install -e .`:
 - `cya "your request in plain English"`
 - `cya --explain-context "..."` — shows exactly what local context would be sent
 - Automatic rule-based risk classification with mandatory confirmation for anything destructive, privileged, mass-edit, or network-affecting
-- All LLM interaction flows through a documented `LLMAdapter` seam (currently a deterministic fake; ready for real `llm-connect`)
+- All LLM interaction flows through a documented `LLMAdapter` seam (`FakeLLMAdapter` by default; real `llm-connect` when configured)

 ## Installation

@@ -58,6 +58,41 @@ git pull
 make dev-install
 ```

+## LLM backend (configured vs offline)
+
+By default `cya` uses a deterministic **offline** adapter (no API keys, no network).
+For real inference, configure llm-connect:
+
+```bash
+# 1. Install llm-connect (sibling checkout)
+pip install -e ~/llm-connect
+
+# 2. Route credentials — do not commit keys
+warden route find "OpenRouter API key" --json
+
+# 3. Export key and configure cya
+export OPENROUTER_API_KEY="..." # from OpenBao / operator path
+mkdir -p ~/.config/cya
+cat > ~/.config/cya/config.toml <<'EOF'
+[llm]
+adapter = "connect"
+backend = "openrouter"
+model = "anthropic/claude-sonnet-4"
+EOF
+
+cya "show me the recent git history for this repo"
+```
+
+Force offline mode anytime (tests, CI, air-gapped):
+
+```bash
+cya --offline "your request"
+# or: CYA_LLM_ADAPTER=fake cya "..."
+```
+
+See `docs/llm-connect-integration.md` for the full mapping, session context budget,
+and optional `.cya.toml` project overrides.
+
 ## Usage examples

 ```bash
@@ -190,9 +225,11 @@ decisions, and integration guide.
 ```bash
 # Recommended one-liner (see Installation section above)
 make dev-install
+pip install -e ~/llm-connect # optional, for live inference

-pytest tests/ -q
-cya "..." # manual verification
+pytest tests/ -q # offline mocks only; no API keys
+pytest -m llm_live # manual live check (requires OPENROUTER_API_KEY)
+cya --offline "..." # manual verification without network
 make version # show current dev version
 ```

6
SCOPE.md
@@ -21,7 +21,7 @@ Core capabilities now include:
 - Natural language request handling via clean Typer CLI.
 - Bounded, transparent local context collection.
 - Genuine rule-based (memory-aware) risk classification with mandatory confirmation.
-- Stable `LLMAdapter` Protocol.
+- Stable `LLMAdapter` Protocol with `LLMConnectAdapter` behind the same factory when configured.
 - Real, user-controlled, contextually activated memory (Profile 0: directory/project scoped local JSON with kinds, activation_context, provenance, and retrospection outcomes as higher-order memory).
 - Automatic memory activation based on working directory/git root.
 - `cya retrospect` for structured reflection and goal setting, with production Profile 1 verbal lesson capture, review (`cya memory reflections`), and compaction.
@@ -77,7 +77,7 @@ See the individual workplans for detailed scope per slice.
 ## Explicitly Out of Scope (Current and Near-Term)

 - Full deep integration with the complete `phase-memory` profile/planner/graph system (current implementation uses a deliberate, user-visible local JSON store with contextual activation; deeper integration is planned future work per MemoryVision.md).
-- Real `llm-connect` client implementation (only the stable `LLMAdapter` Protocol contract + FakeLLMAdapter exists).
+- Deep llm-connect features beyond basic `execute_prompt` delegation (adaptive routing, cost dashboards, structured output schemas).
 - Deep semantic repository understanding or large-scale content analysis.
 - Automatic command execution (even "safe" suggestions) — explicit user confirmation remains mandatory for anything non-safe.
 - Rich multi-turn conversational state beyond lightweight scoped memory + retrospection.
@@ -109,7 +109,7 @@ Sibling project owners (llm-connect, phase-memory, State Hub) can read the workp

 ---

-**This SCOPE document reflects the state after CYA-WP-0004 (Dev-Head Install & Release Packaging).**
+**This SCOPE document reflects the state after CYA-WP-0008 (llm-connect Adapter Integration).**

 It remains intentionally narrower than the long-term vision in INTENT.md and MemoryVision.md, but now incorporates significant advances in contextual memory activation, user-driven retrospection/optimization loops, and proper packaging & distribution capabilities.

9
docs/cya-config.example.toml
Normal file
@@ -0,0 +1,9 @@
# Example ~/.config/cya/config.toml — placeholders only; do not commit secrets.

[llm]
adapter = "connect"
backend = "openrouter"
model = "anthropic/claude-sonnet-4"
temperature = 0.3
max_tokens = 2000
api_key_env = "OPENROUTER_API_KEY"
100
docs/llm-connect-integration.md
Normal file
@@ -0,0 +1,100 @@
# llm-connect Integration (CYA-WP-0008)

## Mapping: cya ↔ llm-connect

| cya (`AssistanceRequest`) | llm-connect |
|---------------------------|-------------|
| `user_request` | User message body (after context framing) |
| `context` (envelope + memory + `session_turns`) | Serialized into the prompt via `cya.llm.prompt.build_assistance_prompt` |
| `hints` (`model`, `temperature`, `max_tokens`) | `RunConfig` fields for `execute_prompt` |
| `AssistanceResponse.suggestion` | `LLMResponse.content` |
| `AssistanceResponse.metadata` | `LLMResponse.model`, `usage`, `finish_reason` |

llm-connect owns provider clients (`create_adapter`), API key resolution, retries, and
token usage. `cya` never imports vendor SDKs directly.

## Configuration

User config: `~/.config/cya/config.toml`

```toml
[llm]
adapter = "connect" # "connect" | "fake" (default: fake when absent)
backend = "openrouter" # openrouter | openai | gemini | claude-code | mock
model = "anthropic/claude-sonnet-4"
temperature = 0.3
max_tokens = 2000
api_key_env = "OPENROUTER_API_KEY" # optional override
# system_prompt = "..." # optional; uses cya default when omitted
```

Optional project override: `.cya.toml` (same `[llm]` section; merged over user config).

Environment overrides:

| Variable | Purpose |
|----------|---------|
| `CYA_LLM_ADAPTER` | `connect` or `fake` |
| `CYA_LLM_BACKEND` / `CYA_LLM_PROVIDER` | Provider name |
| `CYA_LLM_MODEL` | Model id |

CLI: `cya --offline "..."` forces `FakeLLMAdapter`.

## Session context budget (multi-turn / `cya shell`)

Recent turns are passed in `AssistanceRequest.context["session_turns"]` as
`{"user": "...", "assistant": "..."}` records.

Bounds (see `cya.config`):

- **Max turns:** 10
- **Max characters:** 4000 (total across included turns)

Older or oversized history is dropped from the prompt automatically.

## Credential routing

Do **not** commit API keys. Before requesting secrets, route custody:

```bash
warden route find "OpenRouter API key" --json
warden route show <catalog-id> --json
```

Typical ownership:

| Need | Owner | ops-warden executes? |
|------|-------|----------------------|
| `OPENROUTER_API_KEY` | OpenBao (`railiance-platform`) | No — route only |
| `OPENAI_API_KEY` | OpenBao | No — route only |
| `GEMINI_API_KEY` | OpenBao | No — route only |

llm-connect resolves keys via `resolve_api_key()` (explicit arg → env var → project key file).

## Adapter selection

`cya.llm.factory.get_adapter()` is the single factory for one-shot and shell paths:

1. `--offline` or `CYA_LLM_ADAPTER=fake` → `FakeLLMAdapter`
2. `adapter = "connect"` in config/env → `LLMConnectAdapter` (graceful degrade on failure)
3. Otherwise → `FakeLLMAdapter` (current default)

## Installation

```bash
make dev-install
pip install -e ~/llm-connect # sibling checkout
```

Optional extra group (placeholder for packaging): `pip install -e ".[llm]"`.

## Tests

- Default CI: `make test` — mocks llm-connect; no network.
- Manual live check: `pytest -m llm_live` (requires configured API key).

## Known gaps

- Structured JSON output schema not enforced yet (free-form model text).
- `claude-code` backend does not require an API key; other backends do.
- Per-directory `.cya.toml` overrides user config but does not yet mirror llm-connect's full 7-layer resolution.
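The layering described in the integration doc (user config, then project `.cya.toml`, then environment variables) can be sketched as a plain dictionary merge. This is only an illustration under stated assumptions: the inputs are already-parsed TOML mappings, and `resolve_llm_section` is a hypothetical helper, not a function from `cya.config`. It also leaves out the commit's extra rule that a configured backend can switch the adapter to `connect`.

```python
from typing import Any, Mapping


def resolve_llm_section(
    user: Mapping[str, Any],
    project: Mapping[str, Any],
    env: Mapping[str, str],
) -> dict[str, Any]:
    """Hypothetical helper: merge [llm] tables, then apply env overrides."""
    # Later sources win: project keys replace user keys of the same name.
    merged: dict[str, Any] = {**user.get("llm", {}), **project.get("llm", {})}

    adapter = env.get("CYA_LLM_ADAPTER", "").strip().lower()
    if adapter in ("fake", "connect"):
        merged["adapter"] = adapter

    backend = env.get("CYA_LLM_BACKEND") or env.get("CYA_LLM_PROVIDER")
    if backend:
        merged["backend"] = backend.strip()

    model = env.get("CYA_LLM_MODEL")
    if model:
        merged["model"] = model.strip()

    # With nothing configured, fall back to the offline adapter.
    merged.setdefault("adapter", "fake")
    return merged


user_cfg = {"llm": {"adapter": "connect", "backend": "openrouter", "temperature": 0.3}}
project_cfg = {"llm": {"model": "anthropic/claude-sonnet-4", "temperature": 0.1}}
resolved = resolve_llm_section(user_cfg, project_cfg, {"CYA_LLM_ADAPTER": "fake"})
```

In this example the project file overrides `temperature`, while the environment variable forces the offline adapter regardless of what either file says.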
@@ -15,6 +15,7 @@ authors = [
 dependencies = [
     "typer[standard]>=0.12.0",
     "rich>=13.0.0",
+    "tomli>=2.0.0; python_version<'3.11'",
 ]

 [project.optional-dependencies]
@@ -26,6 +27,9 @@ dev = [
 test = [
     "pytest>=8.0",
 ]
+llm = [
+    # Install llm-connect from a sibling checkout, e.g. pip install -e ~/llm-connect
+]

 [project.scripts]
 cya = "cya.cli.main:run"
@@ -57,6 +61,7 @@ python_functions = ["test_*"]
 addopts = "-q --tb=short"
 markers = [
     "safety: core safety and risk classifier invariants (always run)",
+    "llm_live: live llm-connect inference (requires API key; manual runs only)",
 ]

 [tool.setuptools_scm]
@@ -66,6 +66,11 @@ def main(
         "-n",
         help="Preview mode — do not perform any actions (stub in T01).",
     ),
+    offline: bool = typer.Option(
+        False,
+        "--offline",
+        help="Use the deterministic FakeLLMAdapter (no llm-connect / no API keys).",
+    ),
     version: bool = typer.Option(
         None,
         "--version",
@@ -106,6 +111,7 @@ def main(
         request,
         explain_context=explain_context,
         dry_run=dry_run,
+        offline=offline,
     )


174
src/cya/config.py
Normal file
@@ -0,0 +1,174 @@
"""User configuration for cya (CYA-WP-0008-T03).

Reads ``~/.config/cya/config.toml`` and optional project ``.cya.toml``.
Environment variables override file values where noted.
"""

from __future__ import annotations

import os
import sys
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any

_USER_CONFIG = Path.home() / ".config" / "cya" / "config.toml"
_PROJECT_CONFIG_NAME = ".cya.toml"

# Session context bounds (CYA-WP-0008-T04) — documented in docs/llm-connect-integration.md
MAX_SESSION_TURNS = 10
MAX_SESSION_CHARS = 4000


def _load_toml(path: Path) -> dict[str, Any]:
    if not path.is_file():
        return {}
    if sys.version_info >= (3, 11):
        import tomllib

        return tomllib.loads(path.read_text(encoding="utf-8"))
    import tomli

    return tomli.loads(path.read_text(encoding="utf-8"))  # loads() takes str, not bytes


def _find_project_config(start: Path | None = None) -> Path | None:
    current = (start or Path.cwd()).resolve()
    for directory in [current, *current.parents]:
        candidate = directory / _PROJECT_CONFIG_NAME
        if candidate.is_file():
            return candidate
    return None


def _merge_llm_sections(*sources: dict[str, Any]) -> dict[str, Any]:
    merged: dict[str, Any] = {}
    for source in sources:
        section = source.get("llm")
        if isinstance(section, dict):
            merged.update(section)
    return merged


@dataclass
class LLMSettings:
    """Resolved LLM adapter settings."""

    adapter: str = "fake"  # "fake" | "connect"
    backend: str = "openrouter"
    model: str | None = None
    temperature: float = 0.3
    max_tokens: int = 2000
    api_key_env: str | None = None
    system_prompt: str | None = None
    configured: bool = False
    source: str = "default"

    def to_hints(self) -> dict[str, Any]:
        hints: dict[str, Any] = {
            "backend": self.backend,
            "temperature": self.temperature,
            "max_tokens": self.max_tokens,
        }
        if self.model:
            hints["model"] = self.model
        if self.api_key_env:
            hints["api_key_env"] = self.api_key_env
        return hints


def _coerce_float(value: Any, default: float) -> float:
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def _coerce_int(value: Any, default: int) -> int:
    try:
        return int(value)
    except (TypeError, ValueError):
        return default


def load_llm_settings(*, offline: bool = False) -> LLMSettings:
    """Resolve LLM settings from env, user config, and project config."""
    if offline:
        return LLMSettings(adapter="fake", configured=False, source="--offline")

    env_adapter = os.environ.get("CYA_LLM_ADAPTER", "").strip().lower()
    if env_adapter in ("fake", "connect"):
        base = LLMSettings(adapter=env_adapter, configured=env_adapter == "connect", source="CYA_LLM_ADAPTER")
    else:
        base = LLMSettings()

    user_data = _load_toml(_USER_CONFIG)
    project_path = _find_project_config()
    project_data = _load_toml(project_path) if project_path else {}
    merged = _merge_llm_sections(user_data, project_data)

    if merged:
        file_adapter = str(merged.get("adapter", "")).strip().lower()
        if file_adapter in ("fake", "connect") and not env_adapter:
            base.adapter = file_adapter
            base.configured = file_adapter == "connect"
            base.source = str(project_path or _USER_CONFIG)

        backend = merged.get("backend") or merged.get("provider")
        if backend:
            base.backend = str(backend)
            if not env_adapter and file_adapter != "fake":
                base.adapter = "connect"
                base.configured = True
                base.source = str(project_path or _USER_CONFIG)

        if merged.get("model"):
            base.model = str(merged["model"])
        base.temperature = _coerce_float(merged.get("temperature"), base.temperature)
        base.max_tokens = _coerce_int(merged.get("max_tokens"), base.max_tokens)
        if merged.get("api_key_env"):
            base.api_key_env = str(merged["api_key_env"])
        if merged.get("system_prompt"):
            base.system_prompt = str(merged["system_prompt"])

    env_backend = os.environ.get("CYA_LLM_BACKEND") or os.environ.get("CYA_LLM_PROVIDER")
    if env_backend:
        base.backend = env_backend.strip()
        if base.adapter != "fake":
            base.adapter = "connect"
            base.configured = True
            base.source = "CYA_LLM_BACKEND"

    env_model = os.environ.get("CYA_LLM_MODEL")
    if env_model:
        base.model = env_model.strip()
        if base.adapter != "fake":
            base.adapter = "connect"
            base.configured = True
            base.source = "CYA_LLM_MODEL"

    return base


def bound_session_turns(
    turns: list[dict[str, str]] | None,
    *,
    max_turns: int = MAX_SESSION_TURNS,
    max_chars: int = MAX_SESSION_CHARS,
) -> list[dict[str, str]]:
    """Trim session history to a bounded turn/character budget for the adapter."""
    if not turns:
        return []

    recent = turns[-max_turns:]
    bounded: list[dict[str, str]] = []
    used = 0
    for turn in reversed(recent):  # newest first, so older turns are the ones dropped
        user = turn.get("user", "")
        assistant = turn.get("assistant", "")
        chunk_len = len(user) + len(assistant)
        if used + chunk_len > max_chars and bounded:
            break
        bounded.append({"user": user, "assistant": assistant})
        used += chunk_len
    return bounded[::-1]  # restore chronological order
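The session budget can be illustrated in isolation. The sketch below assumes the intent stated in the integration doc ("Older or oversized history is dropped"): it keeps the newest turns that fit within both limits and always keeps at least the latest turn. `keep_recent_turns` is a hypothetical name used only for this illustration.

```python
def keep_recent_turns(
    turns: list[dict[str, str]],
    max_turns: int = 10,
    max_chars: int = 4000,
) -> list[dict[str, str]]:
    """Hypothetical helper: keep the newest turns that fit both limits."""
    kept: list[dict[str, str]] = []
    used = 0
    window = turns[-max_turns:] if max_turns > 0 else []
    # Walk from newest to oldest so the oldest turns are the first to go.
    for turn in reversed(window):
        size = len(turn.get("user", "")) + len(turn.get("assistant", ""))
        # Always keep at least the newest turn, even if it is oversized.
        if kept and used + size > max_chars:
            break
        kept.append(turn)
        used += size
    kept.reverse()  # back to chronological order
    return kept


# Five turns of 12 characters each; only the last two fit in 24 characters.
history = [{"user": f"q{i}", "assistant": "a" * 10} for i in range(5)]
trimmed = keep_recent_turns(history, max_turns=3, max_chars=24)
```

Counting characters rather than tokens keeps the bound cheap to compute and independent of any tokenizer, at the cost of being only a rough proxy for prompt size.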
@@ -17,11 +17,15 @@ from .adapter import (
     LLMAdapter,
     FakeLLMAdapter,
 )
+from .connect_adapter import LLMConnectAdapter
+from .factory import get_adapter

 __all__ = [
     "AssistanceRequest",
     "AssistanceResponse",
     "LLMAdapter",
     "FakeLLMAdapter",
+    "LLMConnectAdapter",
+    "get_adapter",
 ]

138
src/cya/llm/connect_adapter.py
Normal file
@@ -0,0 +1,138 @@
"""llm-connect-backed adapter (CYA-WP-0008-T02)."""

from __future__ import annotations

from typing import Any

from cya.config import LLMSettings
from cya.llm.adapter import AssistanceRequest, AssistanceResponse
from cya.llm.prompt import build_assistance_prompt

_PROVIDER_ENV_KEYS: dict[str, str] = {
    "openrouter": "OPENROUTER_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


class LLMConnectAdapter:
    """Delegates to llm-connect while satisfying cya's LLMAdapter protocol."""

    def __init__(self, settings: LLMSettings) -> None:
        self._settings = settings
        self._client: Any | None = None
        self._init_error: str | None = None
        self._ensure_client()

    def _ensure_client(self) -> None:
        try:
            from llm_connect import create_adapter
            from llm_connect.config import resolve_api_key
        except ImportError:
            self._init_error = (
                "llm-connect is not installed. Install with:\n"
                " pip install -e ~/llm-connect\n"
                "or: pip install -e \".[llm]\" after adding llm-connect to your environment."
            )
            return

        env_var = self._settings.api_key_env or _PROVIDER_ENV_KEYS.get(
            self._settings.backend, "OPENROUTER_API_KEY"
        )
        api_key = resolve_api_key(env_var=env_var)
        if self._settings.backend in ("openrouter", "openai", "gemini") and not api_key:
            self._init_error = (
                f"No API key found for backend {self._settings.backend!r} "
                f"(checked env {env_var!r}).\n"
                "Route credential custody via warden before requesting secrets:\n"
                " warden route find \"OpenRouter API key\" --json\n"
                "Then export the key into your environment — never commit it to the repo."
            )
            return

        try:
            self._client = create_adapter(
                provider=self._settings.backend,
                model=self._settings.model,
                api_key=api_key,
                system_prompt=self._settings.system_prompt,
            )
        except Exception as exc:  # noqa: BLE001 — surface config errors to the user
            self._init_error = f"Failed to initialize llm-connect adapter: {exc}"

    def complete(self, request: AssistanceRequest) -> AssistanceResponse:
        if self._init_error or self._client is None:
            return self._degraded_response(self._init_error or "llm-connect client unavailable.")

        try:
            from llm_connect.models import RunConfig
        except ImportError:
            return self._degraded_response(self._init_error or "llm-connect not installed.")

        system, user_prompt = build_assistance_prompt(
            request,
            system_prompt=self._settings.system_prompt,
        )
        hints = {**self._settings.to_hints(), **request.hints}
        run_config = RunConfig(
            model_name=hints.get("model") or self._settings.model or "anthropic/claude-sonnet-4",
            temperature=float(hints.get("temperature", self._settings.temperature)),
            max_tokens=int(hints.get("max_tokens", self._settings.max_tokens)),
        )

        # Re-create adapter when per-request system prompt differs (llm-connect stores it at init).
        client = self._client
        if system and not self._settings.system_prompt:
            from llm_connect import create_adapter
            from llm_connect.config import resolve_api_key

            env_var = self._settings.api_key_env or _PROVIDER_ENV_KEYS.get(
                self._settings.backend, "OPENROUTER_API_KEY"
            )
            api_key = resolve_api_key(env_var=env_var)
            client = create_adapter(
                provider=self._settings.backend,
                model=self._settings.model,
                api_key=api_key,
                system_prompt=system,
            )

        try:
            llm_response = client.execute_prompt(user_prompt, run_config)
        except Exception as exc:  # noqa: BLE001 — user-facing degrade path
            return self._degraded_response(
                f"llm-connect request failed: {exc}",
                partial_raw=str(exc),
            )

        content = (llm_response.content or "").strip()
        return AssistanceResponse(
            suggestion=content or "(empty model response)",
            explanation="Response generated via llm-connect.",
            rationale="Model inference using configured backend and bounded local context.",
            risks=[],
            raw_model_output=content,
            metadata={
                "adapter": "LLMConnectAdapter",
                "backend": self._settings.backend,
                "model": llm_response.model,
                "usage": llm_response.usage,
                "finish_reason": llm_response.finish_reason,
            },
        )

    @staticmethod
    def _degraded_response(message: str, *, partial_raw: str | None = None) -> AssistanceResponse:
        return AssistanceResponse(
            suggestion=(
                "cya could not reach a configured LLM backend.\n\n"
                f"{message}\n\n"
                "Continuing in offline mode: re-run with `--offline` or configure "
                "`~/.config/cya/config.toml` (see README)."
            ),
            explanation="Graceful degradation — no live inference was performed.",
            rationale="llm-connect unavailable or misconfigured.",
            risks=["No live model inference"],
            raw_model_output=partial_raw,
            metadata={"adapter": "LLMConnectAdapter", "degraded": True},
        )
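The degrade path in this adapter follows a common pattern: every backend failure is converted into an ordinary response that is flagged as degraded, so the caller never has to handle an exception. A minimal standalone sketch follows. `Reply`, `complete_or_degrade`, and `broken_backend` are hypothetical names for illustration and are not part of cya or llm-connect.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Reply:
    """Hypothetical stand-in for AssistanceResponse."""

    suggestion: str
    metadata: dict[str, Any] = field(default_factory=dict)


def complete_or_degrade(call_backend: Callable[[str], str], prompt: str) -> Reply:
    """Return the backend's answer, or a flagged fallback if the call fails."""
    try:
        text = call_backend(prompt).strip()
    except Exception as exc:  # broad on purpose: any failure becomes a reply
        return Reply(
            suggestion=f"Backend unavailable: {exc}. Re-run with --offline.",
            metadata={"degraded": True},
        )
    return Reply(suggestion=text or "(empty model response)", metadata={"degraded": False})


def broken_backend(prompt: str) -> str:
    """Simulates a provider outage."""
    raise ConnectionError("network down")


ok = complete_or_degrade(lambda p: " git status ", "show status")
failed = complete_or_degrade(broken_backend, "show status")
```

Callers can branch on `metadata["degraded"]` instead of wrapping each call in their own `try` block, which is what the tests in this commit check for.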
18
src/cya/llm/factory.py
Normal file
@@ -0,0 +1,18 @@
"""Adapter selection factory (CYA-WP-0008-T04)."""

from __future__ import annotations

from cya.config import LLMSettings, load_llm_settings
from cya.llm.adapter import FakeLLMAdapter, LLMAdapter
from cya.llm.connect_adapter import LLMConnectAdapter


def get_adapter(*, offline: bool = False, settings: LLMSettings | None = None) -> LLMAdapter:
    """Return the active LLMAdapter for one-shot and shell code paths."""
    resolved = settings or load_llm_settings(offline=offline)
    if resolved.adapter == "connect":
        return LLMConnectAdapter(resolved)
    return FakeLLMAdapter()


__all__ = ["get_adapter", "load_llm_settings"]
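The factory relies on structural typing: any class with a matching `complete` method satisfies the adapter seam, so callers never name a concrete class. The sketch below shows the same selection idea with `typing.Protocol`. All class and function names here are hypothetical, and the string-in, string-out signature is a simplification of the real request and response types.

```python
from typing import Protocol


class Adapter(Protocol):
    """Hypothetical, simplified version of the adapter seam."""

    def complete(self, request: str) -> str: ...


class EchoAdapter:
    """Hypothetical offline adapter: deterministic, no network."""

    def complete(self, request: str) -> str:
        return f"[offline] {request}"


class RemoteAdapter:
    """Hypothetical live adapter; a real one would call a provider."""

    def __init__(self, backend: str) -> None:
        self.backend = backend

    def complete(self, request: str) -> str:
        return f"[{self.backend}] {request}"


def pick_adapter(adapter: str = "fake", backend: str = "openrouter", offline: bool = False) -> Adapter:
    """Offline wins; otherwise only an explicit "connect" selects the live path."""
    if offline or adapter != "connect":
        return EchoAdapter()
    return RemoteAdapter(backend)


reply = pick_adapter("connect").complete("hi")
```

Defaulting to the offline adapter means a missing or misspelled setting can never trigger a network call by accident.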
73
src/cya/llm/prompt.py
Normal file
@@ -0,0 +1,73 @@
"""Prompt construction for llm-connect delegation (CYA-WP-0008)."""

from __future__ import annotations

import json
from typing import Any

from cya.llm.adapter import AssistanceRequest

_DEFAULT_SYSTEM = """You are cya, a console-native assistant for practical local work from the shell.

Help the user with command-line tasks: repository inspection, file workflows, command
suggestion, command explanation, and local context summarization.

Be concise and practical. When suggesting shell commands, explain risks briefly.
Do not claim to have executed anything — the user runs commands themselves.
Reference the provided context when it is relevant."""


def default_system_prompt() -> str:
    return _DEFAULT_SYSTEM


def build_assistance_prompt(request: AssistanceRequest, *, system_prompt: str | None = None) -> tuple[str, str]:
    """Return (system_prompt, user_prompt) for llm-connect execute_prompt."""
    system = system_prompt or default_system_prompt()
    parts: list[str] = []

    context = request.context or {}
    session_turns = context.get("session_turns")
    if session_turns:
        parts.append("## Recent conversation")
        for turn in session_turns:
            parts.append(f"User: {turn.get('user', '')}")
            parts.append(f"Assistant: {turn.get('assistant', '')}")

    envelope = {k: v for k, v in context.items() if k not in ("session_turns", "memory")}
    if envelope:
        parts.append("## Local context")
        parts.append(_summarize_context(envelope))

    memory = context.get("memory")
    if isinstance(memory, dict) and memory.get("items"):
        parts.append("## Activated memory")
        for item in memory["items"][:8]:
            parts.append(f"- [{item.get('kind', '?')}] {item.get('key', '?')}: {item.get('value', '')}")

    parts.append("## Current request")
    parts.append(request.user_request.strip())

    return system, "\n\n".join(parts)


def _summarize_context(envelope: dict[str, Any]) -> str:
    """Compact, JSON-safe context summary to stay within prompt budget."""
    summary: dict[str, Any] = {}
    if envelope.get("cwd"):
        summary["cwd"] = envelope["cwd"]
    if envelope.get("git"):
        git = envelope["git"]
        summary["git"] = {
            k: git[k]
            for k in ("branch", "status_short", "workdir", "is_repo")
            if k in git
        }
    if envelope.get("top_level"):
        names = [e.get("name") for e in envelope["top_level"][:30] if e.get("name")]
        summary["top_level"] = names
    if envelope.get("env"):
        summary["env"] = envelope["env"]
    if envelope.get("notes"):
        summary["notes"] = envelope["notes"][:5]
    return json.dumps(summary, indent=2, default=str)
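The prompt builder assembles labeled sections in a fixed order and joins them with blank lines. The sketch below is a simplified version that runs without cya installed. `assemble_prompt` is a hypothetical function, and unlike the committed code it groups each heading with its body and skips the memory section.

```python
import json
from typing import Any


def assemble_prompt(request: str, context: dict[str, Any]) -> str:
    """Hypothetical helper: join labeled sections with blank lines."""
    parts: list[str] = []

    turns = context.get("session_turns") or []
    if turns:
        lines = ["## Recent conversation"]
        for turn in turns:
            lines.append(f"User: {turn.get('user', '')}")
            lines.append(f"Assistant: {turn.get('assistant', '')}")
        parts.append("\n".join(lines))

    # Everything except the conversation history counts as local context.
    local = {k: v for k, v in context.items() if k != "session_turns"}
    if local:
        parts.append("## Local context\n" + json.dumps(local, indent=2, default=str))

    # The current request always comes last so it is closest to the answer.
    parts.append("## Current request\n" + request.strip())
    return "\n\n".join(parts)


prompt = assemble_prompt(
    "  follow up  ",
    {"cwd": "/tmp", "session_turns": [{"user": "first", "assistant": "reply"}]},
)
```

Passing `default=str` to `json.dumps` lets values such as `Path` objects serialize as strings instead of raising `TypeError`.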
@@ -48,7 +48,9 @@ from cya.memory.reflections import (
     session_provenance,
 )
 from cya.safety.risk import classify, get_user_confirmation
-from cya.llm.adapter import AssistanceRequest, FakeLLMAdapter
+from cya.config import bound_session_turns
+from cya.llm.adapter import AssistanceRequest
+from cya.llm.factory import get_adapter


 console = Console()
@@ -59,6 +61,8 @@ def handle_request(
     *,
     explain_context: bool = False,
     dry_run: bool = False,
+    offline: bool = False,
+    session_turns: list[dict[str, str]] | None = None,
 ) -> None:
     """Primary orchestrator entry point.

@@ -158,10 +162,12 @@ def handle_request(
         console.print("[green]--dry-run acknowledged.[/green] No side-effects.")
         return

-    # 3. Call through the single LLMAdapter boundary (T04)
-    adapter = FakeLLMAdapter()
+    # 3. Call through the single LLMAdapter boundary (T04 / CYA-WP-0008)
+    adapter = get_adapter(offline=offline)
     ctx = (envelope.to_dict() if envelope else {}) or {}
     ctx["memory"] = memory  # T03: memory now in context passed to LLM (for personalization + explain)
+    if session_turns:
+        ctx["session_turns"] = bound_session_turns(session_turns)
     llm_request = AssistanceRequest(
         user_request=user_request,
         context=ctx,
111
tests/test_llm_connect_adapter.py
Normal file
@@ -0,0 +1,111 @@
"""LLMConnectAdapter unit tests with mocked llm-connect (CYA-WP-0008)."""

from unittest.mock import MagicMock, patch

import pytest

from cya.config import LLMSettings
from cya.llm.adapter import AssistanceRequest
from cya.llm.connect_adapter import LLMConnectAdapter


def _mock_llm_response(content: str = "Try: git status"):
    resp = MagicMock()
    resp.content = content
    resp.model = "mock/model"
    resp.usage = {"total_tokens": 42}
    resp.finish_reason = "stop"
    return resp


@patch("llm_connect.create_adapter")
@patch("llm_connect.config.resolve_api_key", return_value="test-key")
def test_complete_delegates_to_llm_connect(mock_resolve, mock_create):
    client = MagicMock()
    client.execute_prompt.return_value = _mock_llm_response()
    mock_create.return_value = client

    settings = LLMSettings(adapter="connect", backend="mock", model="mock/model", configured=True)
    adapter = LLMConnectAdapter(settings)
    response = adapter.complete(
        AssistanceRequest(user_request="show git status", context={"cwd": "/tmp"})
    )

    assert "git status" in response.suggestion.lower()
    assert response.metadata.get("adapter") == "LLMConnectAdapter"
    assert response.metadata.get("degraded") is not True
    client.execute_prompt.assert_called_once()


def test_graceful_degrade_when_llm_connect_missing(monkeypatch):
    import builtins

    real_import = builtins.__import__

    def _import(name, *args, **kwargs):
        if name == "llm_connect" or name.startswith("llm_connect."):
            raise ImportError("no llm_connect")
        return real_import(name, *args, **kwargs)

    monkeypatch.setattr(builtins, "__import__", _import)
    settings = LLMSettings(adapter="connect", backend="openrouter", configured=True)
    adapter = LLMConnectAdapter(settings)
    response = adapter.complete(AssistanceRequest(user_request="hello"))

    assert response.metadata.get("degraded") is True
    assert "llm-connect" in response.suggestion


@patch("llm_connect.create_adapter")
@patch("llm_connect.config.resolve_api_key", return_value=None)
def test_graceful_degrade_when_api_key_missing(mock_resolve, mock_create):
    settings = LLMSettings(adapter="connect", backend="openrouter", configured=True)
    adapter = LLMConnectAdapter(settings)
    response = adapter.complete(AssistanceRequest(user_request="hello"))

    assert response.metadata.get("degraded") is True
    assert "API key" in response.suggestion
    mock_create.assert_not_called()


@patch("llm_connect.create_adapter")
@patch("llm_connect.config.resolve_api_key", return_value="test-key")
def test_session_turns_included_in_prompt(mock_resolve, mock_create):
    client = MagicMock()
    client.execute_prompt.return_value = _mock_llm_response("ok")
    mock_create.return_value = client

    settings = LLMSettings(adapter="connect", backend="mock", configured=True)
    adapter = LLMConnectAdapter(settings)
    adapter.complete(
        AssistanceRequest(
            user_request="follow up",
            context={
                "session_turns": [{"user": "first", "assistant": "reply"}],
            },
        )
    )

    _prompt_arg, _ = client.execute_prompt.call_args[0]
    assert "first" in _prompt_arg
    assert "follow up" in _prompt_arg


@pytest.mark.llm_live
def test_live_openrouter_smoke():
    """Manual verification only — skipped unless OPENROUTER_API_KEY is set."""
    import os

    if not os.environ.get("OPENROUTER_API_KEY"):
        pytest.skip("OPENROUTER_API_KEY not set")

    settings = LLMSettings(
        adapter="connect",
        backend="openrouter",
        model="anthropic/claude-sonnet-4",
        configured=True,
    )
    adapter = LLMConnectAdapter(settings)
    response = adapter.complete(AssistanceRequest(user_request="Reply with exactly: pong"))
    assert response.metadata.get("degraded") is not True
    assert response.suggestion
67
tests/test_llm_factory.py
Normal file
@@ -0,0 +1,67 @@
"""Adapter factory and config resolution (CYA-WP-0008)."""

import os
from pathlib import Path

import pytest

from cya.config import bound_session_turns, load_llm_settings
from cya.llm.adapter import FakeLLMAdapter
from cya.llm.connect_adapter import LLMConnectAdapter
from cya.llm.factory import get_adapter


def test_default_adapter_is_fake(monkeypatch):
    monkeypatch.delenv("CYA_LLM_ADAPTER", raising=False)
    monkeypatch.setattr("cya.config._USER_CONFIG", Path("/nonexistent/config.toml"))
    adapter = get_adapter()
    assert isinstance(adapter, FakeLLMAdapter)


def test_offline_forces_fake(monkeypatch, tmp_path):
    cfg = tmp_path / "config.toml"
    cfg.write_text('[llm]\nadapter = "connect"\nbackend = "openrouter"\n')
    monkeypatch.setattr("cya.config._USER_CONFIG", cfg)
    adapter = get_adapter(offline=True)
    assert isinstance(adapter, FakeLLMAdapter)


def test_connect_adapter_when_configured(monkeypatch, tmp_path):
    cfg = tmp_path / "config.toml"
    cfg.write_text('[llm]\nadapter = "connect"\nbackend = "mock"\n')
    monkeypatch.setattr("cya.config._USER_CONFIG", cfg)
    adapter = get_adapter()
    assert isinstance(adapter, LLMConnectAdapter)


def test_env_adapter_override(monkeypatch):
    monkeypatch.setenv("CYA_LLM_ADAPTER", "connect")
    monkeypatch.setenv("CYA_LLM_BACKEND", "mock")
    adapter = get_adapter()
    assert isinstance(adapter, LLMConnectAdapter)


def test_load_llm_settings_merges_project_config(monkeypatch, tmp_path):
    user_cfg = tmp_path / "user.toml"
    user_cfg.write_text('[llm]\nbackend = "openrouter"\nmodel = "from-user"\n')
    project_cfg = tmp_path / ".cya.toml"
    project_cfg.write_text('[llm]\nmodel = "from-project"\n')
    monkeypatch.setattr("cya.config._USER_CONFIG", user_cfg)
    monkeypatch.setattr("cya.config._find_project_config", lambda start=None: project_cfg)

    settings = load_llm_settings()
    assert settings.backend == "openrouter"
    assert settings.model == "from-project"
    assert settings.adapter == "connect"


def test_bound_session_turns_limits():
    turns = [
        {"user": "a" * 1000, "assistant": "b" * 1000},
        {"user": "c" * 1000, "assistant": "d" * 1000},
        {"user": "e", "assistant": "f"},
    ]
    bounded = bound_session_turns(turns, max_turns=10, max_chars=2500)
    assert len(bounded) >= 1
    total = sum(len(t["user"]) + len(t["assistant"]) for t in bounded)
    assert total <= 2500 or len(bounded) == 1
22
tests/test_llm_prompt.py
Normal file
@@ -0,0 +1,22 @@
"""Prompt builder tests."""

from cya.llm.adapter import AssistanceRequest
from cya.llm.prompt import build_assistance_prompt


def test_build_assistance_prompt_includes_context_and_request():
    system, user = build_assistance_prompt(
        AssistanceRequest(
            user_request="list files",
            context={
                "cwd": "/home/user/proj",
                "session_turns": [{"user": "hi", "assistant": "hello"}],
                "memory": {"items": [{"kind": "preference", "key": "style", "value": "concise"}]},
            },
        )
    )
    assert "cya" in system.lower()
    assert "list files" in user
    assert "/home/user/proj" in user
    assert "hi" in user
    assert "concise" in user
@@ -4,7 +4,7 @@ type: workplan
title: "llm-connect Adapter Integration for Production Assistance"
domain: capabilities
repo: can-you-assist
-status: ready
+status: finished
owner: grok
topic_slug: foerster-capabilities
created: "2026-06-22"
@@ -50,7 +50,7 @@ llm-connect. Credential routing via `warden route` before requesting secrets.

```task
id: CYA-WP-0008-T01
-status: todo
+status: done
priority: high
state_hub_task_id: "483d13bb-aabe-48ad-96c2-8df83de5f442"
```
@@ -62,11 +62,13 @@ Identify config surface (TOML keys, env vars). Note gaps requiring llm-connect c
- Short integration note in `docs/` or workplan appendix.
- Credential route catalog id(s) documented via `warden route find`.

+**Delivered:** `docs/llm-connect-integration.md`

### T02 — Implement `LLMConnectAdapter`

```task
id: CYA-WP-0008-T02
-status: todo
+status: done
priority: high
state_hub_task_id: "0fc17ad5-d90b-4ad1-b060-a1a2f9c25ea8"
```
@@ -80,11 +82,13 @@ New class in `src/cya/llm/` implementing `LLMAdapter`:
**Acceptance criteria:**
- Protocol-compliant; swap via config or env (`CYA_LLM_ADAPTER=connect|fake`).

+**Delivered:** `src/cya/llm/connect_adapter.py`, `src/cya/llm/factory.py`

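The adapter swap named in the T02 acceptance criteria can be pictured with a minimal sketch. Everything here is hypothetical and simplified for illustration: `select_adapter`, `FakeAdapter`, `ConnectAdapter`, and the precedence order (offline flag, then `CYA_LLM_ADAPTER`, then a configured value, then the fake default) are assumptions, not the contents of `src/cya/llm/factory.py`.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Protocol


@dataclass
class AssistanceResponse:
    suggestion: str


class LLMAdapter(Protocol):
    """The seam: anything with a compatible complete() can be swapped in."""

    def complete(self, user_request: str) -> AssistanceResponse: ...


class FakeAdapter:
    """Deterministic stand-in; never touches the network."""

    def complete(self, user_request: str) -> AssistanceResponse:
        return AssistanceResponse(suggestion=f"[fake] {user_request}")


class ConnectAdapter:
    """Placeholder for a real backend; a real one would call out to an LLM."""

    def complete(self, user_request: str) -> AssistanceResponse:
        return AssistanceResponse(suggestion=f"[connect] {user_request}")


def select_adapter(
    offline: bool = False,
    configured: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> LLMAdapter:
    """Pick an adapter; the precedence used here is an assumption."""
    env = os.environ if env is None else env
    if offline:
        # Offline mode always wins, regardless of config or environment.
        return FakeAdapter()
    name = env.get("CYA_LLM_ADAPTER") or configured or "fake"
    return ConnectAdapter() if name == "connect" else FakeAdapter()
```

Because callers only depend on the `complete()` shape, the one-shot path and a shell loop could share the same selection function without knowing which backend is active.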
### T03 — Configuration and developer ergonomics

```task
id: CYA-WP-0008-T03
-status: todo
+status: done
priority: medium
state_hub_task_id: "e8470a37-ecec-42f1-920b-ccd8b98b5512"
```
@@ -97,11 +101,13 @@ state_hub_task_id: "e8470a37-ecec-42f1-920b-ccd8b98b5512"
- Operator can configure adapter without editing source.
- No secrets committed; example config uses placeholders.

+**Delivered:** `src/cya/config.py`, `docs/cya-config.example.toml`, README section

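One way to read "configure adapter without editing source" is as layered settings, where later layers override earlier ones. The sketch below is an assumption-laden illustration, not the code in `src/cya/config.py`: `layer_llm_settings` is a hypothetical name, the inputs are already-parsed TOML tables, and the order (user config, then project config, then environment) is assumed. Only the project-over-user precedence is suggested by the tests in this commit; the `"fake"` default is this sketch's choice, and the real loader may infer the adapter differently.

```python
from typing import Any, Dict, Mapping

# Environment variable names taken from the tests in this commit.
ENV_KEYS = {"adapter": "CYA_LLM_ADAPTER", "backend": "CYA_LLM_BACKEND"}


def layer_llm_settings(
    user: Mapping[str, Any],
    project: Mapping[str, Any],
    env: Mapping[str, str],
) -> Dict[str, Any]:
    """Merge [llm] tables so that later layers win (assumed precedence)."""
    merged: Dict[str, Any] = {"adapter": "fake"}  # sketch default only
    merged.update(user.get("llm", {}))
    merged.update(project.get("llm", {}))
    for key, var in ENV_KEYS.items():
        if env.get(var):
            # A non-empty environment value overrides both config files.
            merged[key] = env[var]
    return merged


settings = layer_llm_settings(
    {"llm": {"backend": "openrouter", "model": "from-user"}},
    {"llm": {"model": "from-project"}},
    {},
)
print(settings["backend"], settings["model"])
```

Keeping secrets out of this merge entirely (only backend, model, and adapter names pass through) is consistent with the "no secrets committed" criterion above.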
### T04 — Orchestrator and shell integration

```task
id: CYA-WP-0008-T04
-status: todo
+status: done
priority: high
state_hub_task_id: "f2781963-fecf-4576-96de-bd745df271a0"
```
@@ -116,11 +122,13 @@ Wire `handle_request()` and CYA-WP-0007 shell turns to adapter selection:
- One-shot and shell paths use same adapter factory.
- Session context bounded (token/line budget documented).

+**Delivered:** `get_adapter()` wired in orchestrator; `session_turns` + `bound_session_turns()` ready for CYA-WP-0007 shell

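The bounded session context mentioned for T04 could work roughly as follows. This is a sketch under stated assumptions, not the delivered `bound_session_turns()`: the name `bound_turns`, the newest-first walk, and the rule that the latest turn is always kept even when it alone exceeds the budget are all hypothetical, chosen to be consistent with what `test_bound_session_turns_limits` asserts.

```python
from typing import Dict, List

Turn = Dict[str, str]


def bound_turns(turns: List[Turn], max_turns: int = 10, max_chars: int = 2500) -> List[Turn]:
    """Keep the most recent turns that fit both a count and a character budget."""
    # Guard max_turns == 0: turns[-0:] would return the whole list.
    recent = turns[-max_turns:] if max_turns > 0 else []
    kept: List[Turn] = []
    used = 0
    for turn in reversed(recent):  # walk newest to oldest
        size = len(turn.get("user", "")) + len(turn.get("assistant", ""))
        if kept and used + size > max_chars:
            break  # budget exhausted; older turns are dropped
        kept.append(turn)  # the newest turn is kept even if oversized
        used += size
    kept.reverse()  # restore chronological order
    return kept


history = [
    {"user": "a" * 1000, "assistant": "b" * 1000},
    {"user": "c" * 1000, "assistant": "d" * 1000},
    {"user": "e", "assistant": "f"},
]
print(len(bound_turns(history, max_turns=10, max_chars=2500)))
```

Counting characters rather than tokens keeps the sketch dependency-free; a token-based budget would need a tokenizer for the chosen backend.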
### T05 — Tests and offline CI strategy

```task
id: CYA-WP-0008-T05
-status: todo
+status: done
priority: high
state_hub_task_id: "32de980b-1a24-4159-9550-7c516570cae3"
```
@@ -133,11 +141,13 @@ state_hub_task_id: "32de980b-1a24-4159-9550-7c516570cae3"
- `make test` passes without network or API keys.
- Live test documented for operator manual verification.

+**Delivered:** `tests/test_llm_factory.py`, `tests/test_llm_connect_adapter.py`, `tests/test_llm_prompt.py`

### T06 — Documentation and SCOPE update

```task
id: CYA-WP-0008-T06
-status: todo
+status: done
priority: medium
state_hub_task_id: "2d152d4b-e4b2-4a94-8f85-d8f033e55d5f"
```
@@ -151,7 +161,7 @@ Update README, SCOPE.md (remove "only FakeLLMAdapter" where accurate), AGENTS.md

```task
id: CYA-WP-0008-T07
-status: todo
+status: done
priority: low
state_hub_task_id: "2fb42517-b2df-43d3-8195-f02d310107dc"
```
@@ -166,4 +176,4 @@ state_hub_task_id: "2fb42517-b2df-43d3-8195-f02d310107dc"

---

-**Status note:** `ready` on 2026-06-22. Can start T01–T03 in parallel with CYA-WP-0007 T02–T04.
+**Status note:** `finished` on 2026-06-22. Integration doc: `docs/llm-connect-integration.md`.