llm-connect

Author	SHA1	Message	Date
tegwick	24f4c09d42	Implement llm-connect ADHOC diagnostics	2026-06-03 11:56:21 +02:00
tegwick	79c899b694	Capture llm-connect lessons from CUST-WP-0045 canary as ADHOC-2026-06-02. The 2026-06-02 daily-triage canary debugging session uncovered five real bugs (commits `9de0f49`, `435da49`, `cd4551c`, `583ab57`, `1b01f0e`), mostly because llm-connect has no way to see what payload the adapter sent or what the provider returned. Capture the six structural improvements that would collapse the next diagnosis of this shape from half a day to minutes: T01 — LLM_CONNECT_DEBUG envelope mode for /execute responses; T02 — ThreadingHTTPServer drop-in replacement for stdlib HTTPServer; T03 — Per-call audit log + replay CLI (LLM_CONNECT_AUDIT_DIR); T04 — Apply param-translation contract to OpenAI and Gemini adapters; T05 — Provider-agnostic structured-output smoke test in CI; T06 — Document the model_params translation contract for adapter authors. All six registered in the State Hub under workstream adhoc-llmc-2026-06-02 (1c936c91-79c7-427d-ab37-9052e8a61cda). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-06-02 15:55:42 +02:00
tegwick	c11c6afa3f	Implement-LLM-WP-0005-cost-model-estimators	2026-05-19 05:02:20 +02:00
tegwick	0054afe689	plan: WP-0005 — cost model and problem-class token estimators. Drafted workplan to move two consumer-side concerns into llm-connect: - ModelRateRegistry: per-model USD-per-1k rates with provenance, a property of the base model, not the application. - ProblemClass token estimators: generic shapes (chunk-summarization, entity-extraction, relation-extraction, judge-eval, report-synthesis) with base dimensions + tunable params; consumer supplies the shape of its problem and gets a TokenEstimate before any call. Demand signal: the 2026-05-18 infospace-bench Lefevre Chapter-I smoke ran 32 calls / 28k tokens / 0.009 USD actual against a planned 8.40 USD — the 1000x variance was entirely consumer-side because there is no rate table in llm-connect to delegate to. Three new modules (rates.py, costs.py, problem_classes.py), eight tasks, registered as workstream 869196c5-551b-4eef-b8d8-cca6f770a9b0 under the custodian topic. A follow-on consumer workplan in infospace-bench will migrate plan_generation_summary to delegate once T01-T04 land here. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-19 04:30:52 +02:00
tegwick	a27945101c	Adaptive routing initial version	2026-05-18 11:38:12 +02:00
tegwick	c4ad4bb9f2	Add adaptive cost-quality routing primitives	2026-05-17 21:32:27 +02:00
tegwick	deade6ad76	plan: WP-0004 — adaptive cost-quality routing (todo). Draft the workplan that extends the static RoutingPolicy (WP-0003) with a quality observation ledger, a BaselineGrader (ClaudeCodeAdapter as the default oracle), an AdaptiveRoutingPolicy that picks the cheapest adapter clearing a per-task quality floor, and a sampled ShadowingAdapter for production observation collection. Scope is explicit: ship primitives only. Task-type taxonomy, quality thresholds, baseline choice, and re-grading cadence stay with the consumer. infospace-bench is the named first consumer; consumer wiring deferred until T01-T03 land. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-17 17:17:07 +02:00
Bernd Worsch	d51d6303e2	feat: WP-0003 — RoutingPolicy (FR-2) and HTTP serve mode (FR-1). FR-2 RoutingPolicy: - RoutingPolicy + RoutingRule dataclasses in llm_connect/routing.py - resolve(task_type, estimated_cost_per_1k=None) with cost-cap fallback - Exported from llm_connect.__init__; contract doc at contracts/functional/routing-policy.md - 11 tests covering rule match, cost-cap, fallback, unknown type, no-match. FR-1 HTTP serve mode: - LLMServer in llm_connect/server.py (stdlib http.server, zero extra deps) - POST /execute + GET /health; CLI via python -m llm_connect.server - [server] optional-dep group added to pyproject.toml - Contract doc at contracts/functional/server.md - 9 tests: health, round-trip, 400/404/500 errors, config forwarding - Added "mock" provider to factory for CLI default. All 101 tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-01 22:34:00 +00:00
Bernd Worsch	d71f4114d1	feat: WP-0001 foundation + WP-0002 core extensions WP-0001 — Foundation & GAAF Baseline - SCOPE.md, ARCHITECTURE-LAYERS.md, contracts/ tree - .claude/rules/ stubs filled (architecture, stack, boundary) - 57 tests (pytest), pyproject.toml with ruff+mypy, CI workflow WP-0002 — Core Extensions (FR-4 + FR-3) - FR-4: BudgetTracker (thread-safe) + LLMBudgetExceededError + optional RunConfig.budget_tracker + enforcement in all adapters - FR-3: async_execute_prompt on LLMAdapter ABC (asyncio.to_thread fallback) + native asyncio.create_subprocess_exec in ClaudeCodeAdapter 81 tests passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-01 22:24:14 +00:00
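
The FR-3 entry in d71f4114d1 mentions an async_execute_prompt on the adapter ABC that falls back to asyncio.to_thread. The pattern might look roughly like the sketch below. This is an illustration under assumptions, not the llm-connect source: the single `prompt: str` parameter, the class bodies, `EchoAdapter`, and `run_concurrently` are all hypothetical.

```python
import asyncio
from abc import ABC, abstractmethod


class LLMAdapter(ABC):
    """Hypothetical, simplified adapter base class (not the real llm-connect ABC)."""

    @abstractmethod
    def execute_prompt(self, prompt: str) -> str:
        """Blocking call that every adapter must implement."""

    async def async_execute_prompt(self, prompt: str) -> str:
        # Default fallback: run the blocking method in a worker thread so the
        # event loop stays responsive. Adapters with native async I/O can
        # override this method instead.
        return await asyncio.to_thread(self.execute_prompt, prompt)


class EchoAdapter(LLMAdapter):
    """Illustrative adapter that returns the prompt with a prefix."""

    def execute_prompt(self, prompt: str) -> str:
        return f"echo: {prompt}"


async def run_concurrently(adapter: LLMAdapter, prompts: list[str]) -> list[str]:
    # asyncio.gather returns results in the same order as its arguments.
    return await asyncio.gather(
        *(adapter.async_execute_prompt(p) for p in prompts)
    )


if __name__ == "__main__":
    print(asyncio.run(run_concurrently(EchoAdapter(), ["a", "b"])))
```

With this shape, an adapter that only implements the blocking method is still usable from async code, while one with a native async path (the commit names asyncio.create_subprocess_exec in ClaudeCodeAdapter) can override the default.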

9 Commits
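
Task T02 in 79c899b694 proposes ThreadingHTTPServer as a drop-in replacement for the stdlib HTTPServer. The sketch below shows what such a swap could look like for a GET /health endpoint. Only the two stdlib server classes are real; `HealthHandler`, `make_server`, `check_health`, and the response body are hypothetical and do not reflect the actual LLMServer in llm_connect/server.py.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Hypothetical handler; the real llm-connect handler is not shown in the log."""

    def do_GET(self):
        if self.path == "/health":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging to stderr for this example.
        pass


def make_server(port=0, threaded=True):
    # The swap is a one-line class change: both classes take the same
    # (address, handler) arguments. Port 0 lets the OS pick a free port.
    server_cls = ThreadingHTTPServer if threaded else HTTPServer
    return server_cls(("127.0.0.1", port), HealthHandler)


def check_health(server):
    # Serve in a background thread, issue one request, then shut down.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5
        )
        conn.request("GET", "/health")
        resp = conn.getresponse()
        result = (resp.status, json.loads(resp.read()))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(check_health(make_server()))
```

ThreadingHTTPServer handles each request in its own thread, so one slow provider call would not block a concurrent /health probe. Whether that matches the motivation behind T02 is an assumption here.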