generated from coulomb/repo-seed
feat: WP-0001 foundation + WP-0002 core extensions
WP-0001 — Foundation & GAAF Baseline - SCOPE.md, ARCHITECTURE-LAYERS.md, contracts/ tree - .claude/rules/ stubs filled (architecture, stack, boundary) - 57 tests (pytest), pyproject.toml with ruff+mypy, CI workflow WP-0002 — Core Extensions (FR-4 + FR-3) - FR-4: BudgetTracker (thread-safe) + LLMBudgetExceededError + optional RunConfig.budget_tracker + enforcement in all adapters - FR-3: async_execute_prompt on LLMAdapter ABC (asyncio.to_thread fallback) + native asyncio.create_subprocess_exec in ClaudeCodeAdapter 81 tests passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
80
contracts/config/toml-chain.md
Normal file
80
contracts/config/toml-chain.md
Normal file
@@ -0,0 +1,80 @@
|
||||
# Contract: Configuration — TOML Config Chain
|
||||
|
||||
**Layer:** Configuration
|
||||
**Version:** 0.1.0
|
||||
**Last updated:** 2026-04-01
|
||||
|
||||
---
|
||||
|
||||
## resolve_llm()
|
||||
|
||||
`llm_connect.toml_config.resolve_llm(cli_provider, cli_model, app_name)`
|
||||
|
||||
Walks a 7-level priority chain to resolve provider and model independently.
|
||||
Returns `ResolvedLLM(provider, model, provider_source, model_source)`.
|
||||
|
||||
### Priority chain (highest → lowest)
|
||||
|
||||
| Level | Source |
|
||||
|-------|--------|
|
||||
| 1 | CLI flags (`cli_provider`, `cli_model`) |
|
||||
| 2 | Env var `{APP_NAME}_HELPER_MODEL` (model only) |
|
||||
| 3 | User preference — `~/.config/{app_name}/config.toml` `[llm.preference]` |
|
||||
| 4 | Directory preference — `.{app_name}.toml` `[llm.preference]` |
|
||||
| 5 | Directory default — `.{app_name}.toml` `[llm.default]` |
|
||||
| 6 | User default — `~/.config/{app_name}/config.toml` `[llm.default]` |
|
||||
| 7 | Hardcoded fallback — `gemini / gemini-2.5-flash` |
|
||||
|
||||
### Invariants
|
||||
|
||||
- Always returns a fully-resolved `ResolvedLLM` (never raises, never returns None).
|
||||
- Provider and model are resolved independently — a preference for model does
|
||||
not imply a preference for provider.
|
||||
- TOML parse errors are silently ignored (returns empty layer).
|
||||
- `app_name` defaults to `"markitect"` for backward compatibility; consumers
|
||||
should pass their own app name.
|
||||
|
||||
### Known issue
|
||||
|
||||
`toml_config.py` has `markitect`-specific defaults (`MARKITECT_HELPER_MODEL`,
|
||||
`USER_CONFIG_DIR`). These are kept for backward compatibility but callers
|
||||
outside markitect should always pass an explicit `app_name`.
|
||||
|
||||
---
|
||||
|
||||
## resolve_api_key()
|
||||
|
||||
`llm_connect.config.resolve_api_key(explicit, env_var, key_file_paths)`
|
||||
|
||||
Resolution order:
|
||||
1. `explicit` argument
|
||||
2. Environment variable `env_var`
|
||||
3. First readable file in `key_file_paths` with non-empty content
|
||||
|
||||
Returns `None` if nothing is found. Never raises.
|
||||
|
||||
---
|
||||
|
||||
## find_project_root()
|
||||
|
||||
Walks up from CWD looking for `pyproject.toml`. Returns the containing directory
|
||||
or `None`. Used by adapters to locate key files.
|
||||
|
||||
---
|
||||
|
||||
## LLMConfig
|
||||
|
||||
`llm_connect.config.LLMConfig`
|
||||
|
||||
Dataclass holding per-adapter configuration. Used directly by `OpenRouterAdapter`
|
||||
and `ClaudeCodeAdapter`. Not required by the Core `LLMAdapter` ABC.
|
||||
|
||||
| Field | Default |
|
||||
|-------|---------|
|
||||
| `provider` | `"openrouter"` |
|
||||
| `model` | `"anthropic/claude-sonnet-4"` |
|
||||
| `api_key` | `None` |
|
||||
| `api_base` | `"https://openrouter.ai/api/v1"` |
|
||||
| `claude_cli_path` | `"claude"` |
|
||||
| `timeout_seconds` | `300` |
|
||||
| `max_retries` | `3` |
|
||||
122
contracts/core/llm-adapter.md
Normal file
122
contracts/core/llm-adapter.md
Normal file
@@ -0,0 +1,122 @@
|
||||
# Contract: Core — LLMAdapter Interface
|
||||
|
||||
**Layer:** Core
|
||||
**Version:** 0.1.0
|
||||
**Status:** Draft (stabilises at v1.0.0)
|
||||
**Last updated:** 2026-04-01
|
||||
|
||||
---
|
||||
|
||||
## LLMAdapter ABC
|
||||
|
||||
`llm_connect.adapter.LLMAdapter`
|
||||
|
||||
### Interface
|
||||
|
||||
```python
|
||||
class LLMAdapter(ABC):
|
||||
@abstractmethod
|
||||
def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse: ...
|
||||
|
||||
@abstractmethod
|
||||
def validate_config(self, config: RunConfig) -> bool: ...
|
||||
```
|
||||
|
||||
**Planned addition (WP-0002 T07):**
|
||||
```python
|
||||
async def async_execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse:
|
||||
# Default: runs execute_prompt in a thread executor
|
||||
...
|
||||
```
|
||||
|
||||
### Invariants
|
||||
|
||||
1. `execute_prompt` MUST return an `LLMResponse` with a non-empty `content` field on success.
|
||||
2. `execute_prompt` MUST raise a subclass of `LLMError` on any failure — never a bare exception.
|
||||
3. `validate_config` MUST be side-effect-free and return `bool` only.
|
||||
4. `validate_config` returning `False` does not preclude calling `execute_prompt` — it is advisory.
|
||||
5. Adapters MUST NOT mutate the `config` argument.
|
||||
6. `execute_prompt` is allowed to be slow (network I/O) but MUST respect `config.timeout_seconds`.
|
||||
|
||||
### Failure modes
|
||||
|
||||
| Condition | Exception |
|
||||
|-----------|-----------|
|
||||
| Missing / invalid API key | `LLMConfigurationError` |
|
||||
| HTTP 4xx (non-429) | `LLMAPIError` (with `.status_code`) |
|
||||
| HTTP 429 | `LLMRateLimitError` |
|
||||
| Request timeout | `LLMTimeoutError` |
|
||||
| CLI subprocess failure | `LLMSubprocessError` (with `.return_code`, `.stderr`) |
|
||||
| Token budget exceeded (WP-0002) | `LLMBudgetExceededError` |
|
||||
|
||||
### Compatibility rules
|
||||
|
||||
- Any code that accepts `LLMAdapter` MUST work with `MockLLMAdapter`.
|
||||
- Adding new optional methods to the ABC is non-breaking (default implementations provided).
|
||||
- Removing or changing the signature of `execute_prompt` or `validate_config` is a **breaking Core change** requiring a major version bump.
|
||||
|
||||
---
|
||||
|
||||
## RunConfig
|
||||
|
||||
`llm_connect.models.RunConfig`
|
||||
|
||||
### Fields and invariants
|
||||
|
||||
| Field | Type | Default | Invariant |
|
||||
|-------|------|---------|-----------|
|
||||
| `model_name` | `str` | `"gpt-4"` | Non-empty string; adapters MAY override |
|
||||
| `temperature` | `float` | `0.7` | 0.0 ≤ temperature ≤ 2.0 |
|
||||
| `max_tokens` | `int` | `2000` | > 0 |
|
||||
| `model_params` | `dict` | `{}` | Provider-specific pass-through; no invariants |
|
||||
| `max_depth` | `int` | `3` | ≥ 0 |
|
||||
| `skip_if_exists` | `bool` | `True` | — |
|
||||
| `timeout_seconds` | `int` | `300` | > 0 |
|
||||
| `budget_tracker` | `BudgetTracker \| None` | `None` | Optional; added in WP-0002 |
|
||||
|
||||
Adapters MUST NOT mutate `RunConfig` fields.
|
||||
|
||||
---
|
||||
|
||||
## LLMResponse
|
||||
|
||||
`llm_connect.models.LLMResponse`
|
||||
|
||||
### Fields and invariants
|
||||
|
||||
| Field | Type | Invariant |
|
||||
|-------|------|-----------|
|
||||
| `content` | `str` | Non-empty on success; may be empty only if provider returned empty output |
|
||||
| `model` | `str` | Non-empty; the model actually used (may differ from `RunConfig.model_name`) |
|
||||
| `usage` | `dict` | Keys: `prompt_tokens`, `completion_tokens`, `total_tokens` (all int ≥ 0) |
|
||||
| `finish_reason` | `str` | Provider-reported; `"stop"` is the normal value |
|
||||
| `metadata` | `dict` | Arbitrary; always includes `"provider"` key |
|
||||
|
||||
---
|
||||
|
||||
## LLMError Hierarchy
|
||||
|
||||
```
|
||||
LLMError
|
||||
├── LLMConfigurationError bad key / unknown provider
|
||||
├── LLMAPIError HTTP error (has .status_code, .response_body)
|
||||
│ └── LLMRateLimitError 429
|
||||
├── LLMTimeoutError request or subprocess timed out
|
||||
├── LLMSubprocessError CLI failed (has .return_code, .stderr)
|
||||
└── LLMBudgetExceededError token budget cap exceeded (WP-0002)
|
||||
```
|
||||
|
||||
All exceptions carry optional `cause` (chained exception) and `context` (dict).
|
||||
|
||||
---
|
||||
|
||||
## Mock adapters
|
||||
|
||||
`MockLLMAdapter` and `ErrorLLMAdapter` are part of Core — they are test
|
||||
primitives that any consumer may depend on without importing dev extras.
|
||||
|
||||
`MockLLMAdapter` invariants:
|
||||
- Returns deterministic response without network I/O
|
||||
- Increments `call_count` on each call
|
||||
- Records `last_prompt` and `last_config`
|
||||
- `reset()` clears all counters and recorded state
|
||||
94
contracts/functional/adapters.md
Normal file
94
contracts/functional/adapters.md
Normal file
@@ -0,0 +1,94 @@
|
||||
# Contract: Functional — Provider Adapters
|
||||
|
||||
**Layer:** Functional
|
||||
**Version:** 0.1.0
|
||||
**Maturity:** Beta (all adapters)
|
||||
**Last updated:** 2026-04-01
|
||||
|
||||
---
|
||||
|
||||
## Common adapter contract
|
||||
|
||||
All provider adapters implement `LLMAdapter` (see `contracts/core/llm-adapter.md`).
|
||||
|
||||
Additional shared guarantees:
|
||||
|
||||
- Constructors resolve API keys at instantiation and raise `LLMConfigurationError`
|
||||
immediately if no key is found (fail-fast).
|
||||
- HTTP-based adapters (`OpenAIAdapter`, `GeminiAdapter`, `OpenRouterAdapter`)
|
||||
use `_http.post_json` and do not add runtime dependencies beyond stdlib.
|
||||
- `metadata` in the returned `LLMResponse` always contains `"provider"` and
|
||||
`"latency_seconds"` keys.
|
||||
- HTTP adapters that retry (`OpenAIAdapter`, `OpenRouterAdapter`) use
|
||||
exponential backoff: `sleep(2 ** attempt)` on 429 and 5xx.
|
||||
|
||||
---
|
||||
|
||||
## OpenAIAdapter
|
||||
|
||||
**Provider key:** `"openai"`
|
||||
**Default model:** `gpt-4.1-mini`
|
||||
**API:** `https://api.openai.com/v1/chat/completions`
|
||||
**Auth:** `OPENAI_API_KEY` env var or `apikey-chatgpt.txt` in project root
|
||||
**Retries:** 3 (exponential backoff on 429 and 5xx)
|
||||
|
||||
---
|
||||
|
||||
## GeminiAdapter
|
||||
|
||||
**Provider key:** `"gemini"`
|
||||
**Default model:** `gemini-2.5-flash`
|
||||
**API:** `https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent`
|
||||
**Auth:** `GEMINI_API_KEY` env var or `apikey-geminifree.txt` in project root
|
||||
**Retries:** 0 (no retry logic; rate-limit handling deferred)
|
||||
**Note:** System prompt is simulated via a user/model turn pair (Gemini has no native system role).
|
||||
|
||||
---
|
||||
|
||||
## OpenRouterAdapter
|
||||
|
||||
**Provider key:** `"openrouter"`
|
||||
**Default model:** `anthropic/claude-sonnet-4`
|
||||
**API:** `https://openrouter.ai/api/v1/chat/completions` (configurable via `LLMConfig.api_base`)
|
||||
**Auth:** `OPENROUTER_API_KEY` env var or `apikey-openrouter.txt` in project root
|
||||
**Retries:** 3 (exponential backoff on 429 and 5xx)
|
||||
**Note:** OpenRouter is an OpenAI-compatible endpoint; `RunConfig.model_params` are merged into the payload.
|
||||
|
||||
---
|
||||
|
||||
## ClaudeCodeAdapter
|
||||
|
||||
**Provider key:** `"claude-code"`
|
||||
**Default model:** n/a (uses the CLI's configured default)
|
||||
**Auth:** none (delegates to locally installed `claude` CLI)
|
||||
**Subprocess:** `claude --print [--model M]` with prompt on stdin
|
||||
**Token counts:** estimated via `_token_estimator` (not provider-reported)
|
||||
**validate_config:** runs `claude --version`; returns `False` if CLI not found
|
||||
|
||||
---
|
||||
|
||||
## EmbeddingAdapter ABC
|
||||
|
||||
`llm_connect.embedding_adapter.EmbeddingAdapter`
|
||||
|
||||
```python
|
||||
class EmbeddingAdapter(ABC):
|
||||
@abstractmethod
|
||||
def embed(self, texts: list[str]) -> list[list[float]]: ...
|
||||
```
|
||||
|
||||
Invariant: returns a list of the same length as `texts`.
|
||||
|
||||
### OpenAICompatibleEmbeddingAdapter
|
||||
|
||||
Compatible with any OpenAI-format embedding endpoint (`/v1/embeddings`).
|
||||
Default model: `text-embedding-3-small`.
|
||||
|
||||
---
|
||||
|
||||
## EmbeddingCache
|
||||
|
||||
`llm_connect.embedding_cache.EmbeddingCache`
|
||||
|
||||
Disk-backed cache keyed by text content (SHA-256 hash).
|
||||
`get_or_compute(text, compute_fn)` returns cached vector or calls `compute_fn`.
|
||||
Reference in New Issue
Block a user