feat: WP-0001 foundation + WP-0002 core extensions

WP-0001 — Foundation & GAAF Baseline
- SCOPE.md, ARCHITECTURE-LAYERS.md, contracts/ tree
- .claude/rules/ stubs filled (architecture, stack, boundary)
- 57 tests (pytest), pyproject.toml with ruff+mypy, CI workflow

WP-0002 — Core Extensions (FR-4 + FR-3)
- FR-4: BudgetTracker (thread-safe) + LLMBudgetExceededError +
  optional RunConfig.budget_tracker + enforcement in all adapters
- FR-3: async_execute_prompt on LLMAdapter ABC (asyncio.to_thread
  fallback) + native asyncio.create_subprocess_exec in ClaudeCodeAdapter

81 tests passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-04-01 22:24:14 +00:00
parent 57b346bb8b
commit d71f4114d1
28 changed files with 1601 additions and 26 deletions

View File

@@ -0,0 +1,80 @@
# Contract: Configuration — TOML Config Chain
**Layer:** Configuration
**Version:** 0.1.0
**Last updated:** 2026-04-01
---
## resolve_llm()
`llm_connect.toml_config.resolve_llm(cli_provider, cli_model, app_name)`
Walks a 7-level priority chain to resolve provider and model independently.
Returns `ResolvedLLM(provider, model, provider_source, model_source)`.
### Priority chain (highest → lowest)
| Level | Source |
|-------|--------|
| 1 | CLI flags (`cli_provider`, `cli_model`) |
| 2 | Env var `{APP_NAME}_HELPER_MODEL` (model only) |
| 3 | User preference — `~/.config/{app_name}/config.toml` `[llm.preference]` |
| 4 | Directory preference — `.{app_name}.toml` `[llm.preference]` |
| 5 | Directory default — `.{app_name}.toml` `[llm.default]` |
| 6 | User default — `~/.config/{app_name}/config.toml` `[llm.default]` |
| 7 | Hardcoded fallback — `gemini / gemini-2.5-flash` |
### Invariants
- Always returns a fully-resolved `ResolvedLLM` (never raises, never returns None).
- Provider and model are resolved independently — a preference for model does
not imply a preference for provider.
- TOML parse errors are silently ignored (returns empty layer).
- `app_name` defaults to `"markitect"` for backward compatibility; consumers
should pass their own app name.
### Known issue
`toml_config.py` has `markitect`-specific defaults (`MARKITECT_HELPER_MODEL`,
`USER_CONFIG_DIR`). These are kept for backward compatibility but callers
outside markitect should always pass an explicit `app_name`.
---
## resolve_api_key()
`llm_connect.config.resolve_api_key(explicit, env_var, key_file_paths)`
Resolution order:
1. `explicit` argument
2. Environment variable `env_var`
3. First readable file in `key_file_paths` with non-empty content
Returns `None` if nothing is found. Never raises.
---
## find_project_root()
Walks up from CWD looking for `pyproject.toml`. Returns the containing directory
or `None`. Used by adapters to locate key files.
---
## LLMConfig
`llm_connect.config.LLMConfig`
Dataclass holding per-adapter configuration. Used directly by `OpenRouterAdapter`
and `ClaudeCodeAdapter`. Not required by the Core `LLMAdapter` ABC.
| Field | Default |
|-------|---------|
| `provider` | `"openrouter"` |
| `model` | `"anthropic/claude-sonnet-4"` |
| `api_key` | `None` |
| `api_base` | `"https://openrouter.ai/api/v1"` |
| `claude_cli_path` | `"claude"` |
| `timeout_seconds` | `300` |
| `max_retries` | `3` |

View File

@@ -0,0 +1,122 @@
# Contract: Core — LLMAdapter Interface
**Layer:** Core
**Version:** 0.1.0
**Status:** Draft (stabilises at v1.0.0)
**Last updated:** 2026-04-01
---
## LLMAdapter ABC
`llm_connect.adapter.LLMAdapter`
### Interface
```python
class LLMAdapter(ABC):
@abstractmethod
def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse: ...
@abstractmethod
def validate_config(self, config: RunConfig) -> bool: ...
```
**Planned addition (WP-0002 T07):**
```python
async def async_execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse:
# Default: runs execute_prompt in a thread executor
...
```
### Invariants
1. `execute_prompt` MUST return an `LLMResponse` with a non-empty `content` field on success.
2. `execute_prompt` MUST raise a subclass of `LLMError` on any failure — never a bare exception.
3. `validate_config` MUST be side-effect-free and return `bool` only.
4. `validate_config` returning `False` does not preclude calling `execute_prompt` — it is advisory.
5. Adapters MUST NOT mutate the `config` argument.
6. `execute_prompt` is allowed to be slow (network I/O) but MUST respect `config.timeout_seconds`.
### Failure modes
| Condition | Exception |
|-----------|-----------|
| Missing / invalid API key | `LLMConfigurationError` |
| HTTP 4xx (non-429) | `LLMAPIError` (with `.status_code`) |
| HTTP 429 | `LLMRateLimitError` |
| Request timeout | `LLMTimeoutError` |
| CLI subprocess failure | `LLMSubprocessError` (with `.return_code`, `.stderr`) |
| Token budget exceeded (WP-0002) | `LLMBudgetExceededError` |
### Compatibility rules
- Any code that accepts `LLMAdapter` MUST work with `MockLLMAdapter`.
- Adding new optional methods to the ABC is non-breaking (default implementations provided).
- Removing or changing the signature of `execute_prompt` or `validate_config` is a **breaking Core change** requiring a major version bump.
---
## RunConfig
`llm_connect.models.RunConfig`
### Fields and invariants
| Field | Type | Default | Invariant |
|-------|------|---------|-----------|
| `model_name` | `str` | `"gpt-4"` | Non-empty string; adapters MAY override |
| `temperature` | `float` | `0.7` | 0.0 ≤ temperature ≤ 2.0 |
| `max_tokens` | `int` | `2000` | > 0 |
| `model_params` | `dict` | `{}` | Provider-specific pass-through; no invariants |
| `max_depth` | `int` | `3` | ≥ 0 |
| `skip_if_exists` | `bool` | `True` | — |
| `timeout_seconds` | `int` | `300` | > 0 |
| `budget_tracker` | `BudgetTracker \| None` | `None` | Optional; added in WP-0002 |
Adapters MUST NOT mutate `RunConfig` fields.
---
## LLMResponse
`llm_connect.models.LLMResponse`
### Fields and invariants
| Field | Type | Invariant |
|-------|------|-----------|
| `content` | `str` | Non-empty on success; may be empty only if provider returned empty output |
| `model` | `str` | Non-empty; the model actually used (may differ from `RunConfig.model_name`) |
| `usage` | `dict` | Keys: `prompt_tokens`, `completion_tokens`, `total_tokens` (all int ≥ 0) |
| `finish_reason` | `str` | Provider-reported; `"stop"` is the normal value |
| `metadata` | `dict` | Arbitrary; always includes `"provider"` key |
---
## LLMError Hierarchy
```
LLMError
├── LLMConfigurationError bad key / unknown provider
├── LLMAPIError HTTP error (has .status_code, .response_body)
│ └── LLMRateLimitError 429
├── LLMTimeoutError request or subprocess timed out
├── LLMSubprocessError CLI failed (has .return_code, .stderr)
└── LLMBudgetExceededError token budget cap exceeded (WP-0002)
```
All exceptions carry optional `cause` (chained exception) and `context` (dict).
---
## Mock adapters
`MockLLMAdapter` and `ErrorLLMAdapter` are part of Core — they are test
primitives that any consumer may depend on without importing dev extras.
`MockLLMAdapter` invariants:
- Returns deterministic response without network I/O
- Increments `call_count` on each call
- Records `last_prompt` and `last_config`
- `reset()` clears all counters and recorded state

View File

@@ -0,0 +1,94 @@
# Contract: Functional — Provider Adapters
**Layer:** Functional
**Version:** 0.1.0
**Maturity:** Beta (all adapters)
**Last updated:** 2026-04-01
---
## Common adapter contract
All provider adapters implement `LLMAdapter` (see `contracts/core/llm-adapter.md`).
Additional shared guarantees:
- Constructors resolve API keys at instantiation and raise `LLMConfigurationError`
immediately if no key is found (fail-fast).
- HTTP-based adapters (`OpenAIAdapter`, `GeminiAdapter`, `OpenRouterAdapter`)
use `_http.post_json` and do not add runtime dependencies beyond stdlib.
- `metadata` in the returned `LLMResponse` always contains `"provider"` and
`"latency_seconds"` keys.
- HTTP adapters that retry (`OpenAIAdapter`, `OpenRouterAdapter`) use
exponential backoff: `sleep(2 ** attempt)` on 429 and 5xx.
---
## OpenAIAdapter
**Provider key:** `"openai"`
**Default model:** `gpt-4.1-mini`
**API:** `https://api.openai.com/v1/chat/completions`
**Auth:** `OPENAI_API_KEY` env var or `apikey-chatgpt.txt` in project root
**Retries:** 3 (exponential backoff on 429 and 5xx)
---
## GeminiAdapter
**Provider key:** `"gemini"`
**Default model:** `gemini-2.5-flash`
**API:** `https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent`
**Auth:** `GEMINI_API_KEY` env var or `apikey-geminifree.txt` in project root
**Retries:** 0 (no retry logic; rate-limit handling deferred)
**Note:** System prompt is simulated via a user/model turn pair (Gemini has no native system role).
---
## OpenRouterAdapter
**Provider key:** `"openrouter"`
**Default model:** `anthropic/claude-sonnet-4`
**API:** `https://openrouter.ai/api/v1/chat/completions` (configurable via `LLMConfig.api_base`)
**Auth:** `OPENROUTER_API_KEY` env var or `apikey-openrouter.txt` in project root
**Retries:** 3 (exponential backoff on 429 and 5xx)
**Note:** OpenRouter is an OpenAI-compatible endpoint; `RunConfig.model_params` are merged into the payload.
---
## ClaudeCodeAdapter
**Provider key:** `"claude-code"`
**Default model:** n/a (uses the CLI's configured default)
**Auth:** none (delegates to locally installed `claude` CLI)
**Subprocess:** `claude --print [--model M]` with prompt on stdin
**Token counts:** estimated via `_token_estimator` (not provider-reported)
**validate_config:** runs `claude --version`; returns `False` if CLI not found
---
## EmbeddingAdapter ABC
`llm_connect.embedding_adapter.EmbeddingAdapter`
```python
class EmbeddingAdapter(ABC):
@abstractmethod
def embed(self, texts: list[str]) -> list[list[float]]: ...
```
Invariant: returns a list of the same length as `texts`.
### OpenAICompatibleEmbeddingAdapter
Compatible with any OpenAI-format embedding endpoint (`/v1/embeddings`).
Default model: `text-embedding-3-small`.
---
## EmbeddingCache
`llm_connect.embedding_cache.EmbeddingCache`
Disk-backed cache keyed by text content (SHA-256 hash).
`get_or_compute(text, compute_fn)` returns cached vector or calls `compute_fn`.