llm-connect

Author	SHA1	Message	Date
tegwick	c11c6afa3f	Implement-LLM-WP-0005-cost-model-estimators	2026-05-19 05:02:20 +02:00
tegwick	0054afe689	plan: WP-0005 — cost model and problem-class token estimators Drafted workplan to move two consumer-side concerns into llm-connect: - ModelRateRegistry: per-model USD-per-1k rates with provenance, a property of the base model, not the application. - ProblemClass token estimators: generic shapes (chunk-summarization, entity-extraction, relation-extraction, judge-eval, report-synthesis) with base dimensions + tunable params; consumer supplies the shape of its problem and gets a TokenEstimate before any call. Demand signal: the 2026-05-18 infospace-bench Lefevre Chapter-I smoke ran 32 calls / 28k tokens / 0.009 USD actual against a planned 8.40 USD — the 1000x variance was entirely consumer-side because there is no rate table in llm-connect to delegate to. Three new modules (rates.py, costs.py, problem_classes.py), eight tasks, registered as workstream 869196c5-551b-4eef-b8d8-cca6f770a9b0 under the custodian topic. A follow-on consumer workplan in infospace-bench will migrate plan_generation_summary to delegate once T01-T04 land here. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 04:30:52 +02:00
tegwick	a27945101c	Adaptive routing initial version	2026-05-18 11:38:12 +02:00
tegwick	c4ad4bb9f2	Add adaptive cost-quality routing primitives	2026-05-17 21:32:27 +02:00
tegwick	deade6ad76	plan: WP-0004 — adaptive cost-quality routing (todo) Draft the workplan that extends the static RoutingPolicy (WP-0003) with a quality observation ledger, a BaselineGrader (ClaudeCodeAdapter as the default oracle), an AdaptiveRoutingPolicy that picks the cheapest adapter clearing a per-task quality floor, and a sampled ShadowingAdapter for production observation collection. Scope is explicit: ship primitives only. Task-type taxonomy, quality thresholds, baseline choice, and re-grading cadence stay with the consumer. infospace-bench is the named first consumer; consumer wiring deferred until T01-T03 land. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 17:17:07 +02:00
Bernd Worsch	d51d6303e2	feat: WP-0003 — RoutingPolicy (FR-2) and HTTP serve mode (FR-1) FR-2 RoutingPolicy: - RoutingPolicy + RoutingRule dataclasses in llm_connect/routing.py - resolve(task_type, estimated_cost_per_1k=None) with cost-cap fallback - Exported from llm_connect.__init__; contract doc at contracts/functional/routing-policy.md - 11 tests covering rule match, cost-cap, fallback, unknown type, no-match FR-1 HTTP serve mode: - LLMServer in llm_connect/server.py (stdlib http.server, zero extra deps) - POST /execute + GET /health; CLI via python -m llm_connect.server - [server] optional-dep group added to pyproject.toml - Contract doc at contracts/functional/server.md - 9 tests: health, round-trip, 400/404/500 errors, config forwarding - Added "mock" provider to factory for CLI default All 101 tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-01 22:34:00 +00:00
Bernd Worsch	d71f4114d1	feat: WP-0001 foundation + WP-0002 core extensions WP-0001 — Foundation & GAAF Baseline - SCOPE.md, ARCHITECTURE-LAYERS.md, contracts/ tree - .claude/rules/ stubs filled (architecture, stack, boundary) - 57 tests (pytest), pyproject.toml with ruff+mypy, CI workflow WP-0002 — Core Extensions (FR-4 + FR-3) - FR-4: BudgetTracker (thread-safe) + LLMBudgetExceededError + optional RunConfig.budget_tracker + enforcement in all adapters - FR-3: async_execute_prompt on LLMAdapter ABC (asyncio.to_thread fallback) + native asyncio.create_subprocess_exec in ClaudeCodeAdapter 81 tests passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-01 22:24:14 +00:00

7 Commits
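
The WP-0005 plan commit (0054afe689) describes per-model USD-per-1k rates with provenance and problem-class estimators that return a TokenEstimate before any call. As a rough illustration of that idea only, the sketch below computes a pre-call cost estimate for a chunk-summarization shape. It is not the code in rates.py, costs.py, or problem_classes.py: the field names, the split into input and output rates, the prompt_overhead parameter, and the rate values are all assumptions made for this example, and the rates are placeholders rather than real pricing. For scale, the planned 8.40 USD against 0.009 USD actual quoted in that commit is a ratio of roughly 930.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    """Hypothetical rate record: USD per 1,000 tokens, plus where the number came from."""
    usd_per_1k_input: float
    usd_per_1k_output: float
    provenance: str


@dataclass(frozen=True)
class TokenEstimate:
    """Hypothetical estimate of total tokens summed over all planned calls."""
    input_tokens: int
    output_tokens: int


def chunk_summarization_estimate(
    n_chunks: int,
    tokens_per_chunk: int,
    summary_tokens: int,
    prompt_overhead: int = 200,  # assumed fixed instruction tokens per call
) -> TokenEstimate:
    """One call per chunk: each call reads the chunk plus overhead and writes a summary."""
    return TokenEstimate(
        input_tokens=n_chunks * (tokens_per_chunk + prompt_overhead),
        output_tokens=n_chunks * summary_tokens,
    )


def estimate_cost_usd(rate: ModelRate, estimate: TokenEstimate) -> float:
    """Convert a token estimate into USD using per-1k rates."""
    return (
        estimate.input_tokens / 1000 * rate.usd_per_1k_input
        + estimate.output_tokens / 1000 * rate.usd_per_1k_output
    )


if __name__ == "__main__":
    # Placeholder rates for illustration; not real pricing for any model.
    rate = ModelRate(usd_per_1k_input=0.001, usd_per_1k_output=0.002,
                     provenance="placeholder for this example")
    est = chunk_summarization_estimate(n_chunks=32, tokens_per_chunk=700,
                                       summary_tokens=100)
    print(est, round(estimate_cost_usd(rate, est), 4))
```

Keeping the rate record separate from the problem shape mirrors the split stated in the commit message: the rate is a property of the base model, while the consumer supplies only the dimensions of its problem.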