coulomb/llm-connect

generated from coulomb/repo-seed

tegwick 0054afe689

CI / test (3.10) (push) Has been cancelled

CI / test (3.11) (push) Has been cancelled

CI / test (3.12) (push) Has been cancelled

plan: WP-0005 — cost model and problem-class token estimators

Drafted workplan to move two consumer-side concerns into llm-connect:

- ModelRateRegistry: per-model USD-per-1k-token rates with provenance;
  rates are a property of the base model, not the application.
- ProblemClass token estimators: generic shapes (chunk-summarization,
  entity-extraction, relation-extraction, judge-eval, report-synthesis)
  with base dimensions + tunable params; consumer supplies the shape
  of its problem and gets a TokenEstimate before any call.
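
A minimal sketch of how the two pieces above could fit together. Only the
names ModelRateRegistry, TokenEstimate, and the chunk-summarization shape
come from this commit message; every field, method, parameter, default,
and rate below is a hypothetical placeholder, not the workplan's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    """Hypothetical record: USD per 1k tokens for one model, plus provenance."""
    input_usd_per_1k: float
    output_usd_per_1k: float
    provenance: str


@dataclass(frozen=True)
class TokenEstimate:
    """Hypothetical shape of the estimate returned before any call is made."""
    calls: int
    input_tokens: int
    output_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


class ModelRateRegistry:
    """Maps a model name to its rate; the consumer never hard-codes prices."""

    def __init__(self) -> None:
        self._rates: dict[str, ModelRate] = {}

    def register(self, model: str, rate: ModelRate) -> None:
        self._rates[model] = rate

    def cost_usd(self, model: str, estimate: TokenEstimate) -> float:
        rate = self._rates[model]  # raises KeyError for an unregistered model
        return (
            estimate.input_tokens / 1000 * rate.input_usd_per_1k
            + estimate.output_tokens / 1000 * rate.output_usd_per_1k
        )


def estimate_chunk_summarization(
    n_chunks: int,
    tokens_per_chunk: int,
    prompt_overhead: int = 200,  # assumed tunable param
    summary_tokens: int = 150,   # assumed tunable param
) -> TokenEstimate:
    """One call per chunk: base dimensions in, TokenEstimate out."""
    return TokenEstimate(
        calls=n_chunks,
        input_tokens=n_chunks * (tokens_per_chunk + prompt_overhead),
        output_tokens=n_chunks * summary_tokens,
    )


# Usage with placeholder numbers (the rate is NOT a real price).
registry = ModelRateRegistry()
registry.register(
    "example-model",
    ModelRate(0.001, 0.002, provenance="placeholder for illustration"),
)
estimate = estimate_chunk_summarization(n_chunks=32, tokens_per_chunk=700)
planned_usd = registry.cost_usd("example-model", estimate)
```

Keeping the rate lookup and the token estimate separate means a consumer
supplies only the shape of its problem, while pricing stays with the model.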

Demand signal: the 2026-05-18 infospace-bench Lefevre Chapter-I smoke
test ran 32 calls / 28k tokens / 0.009 USD actual against a planned
8.40 USD. The roughly 1000x variance (8.40 / 0.009 is about 933x) was
entirely consumer-side because there is no rate table in llm-connect
to delegate to.
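
The figures quoted above can be checked with a few lines of arithmetic.
This sketch treats "28k tokens" as exactly 28,000, which is an assumption;
the variable names are illustrative only.

```python
# Figures taken from the commit message; 28k is assumed to mean 28,000.
planned_usd = 8.40
actual_usd = 0.009
calls = 32
tokens = 28_000

# Planned cost over actual cost: about 933, i.e. roughly three orders of magnitude.
variance_ratio = planned_usd / actual_usd

# Blended rate actually paid per 1k tokens: about 0.00032 USD.
actual_usd_per_1k = actual_usd / (tokens / 1000)

# Average tokens per call.
tokens_per_call = tokens / calls

print(f"variance: {variance_ratio:.0f}x")
print(f"actual blended rate: {actual_usd_per_1k:.5f} USD per 1k tokens")
print(f"tokens per call: {tokens_per_call:.0f}")
```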

Three new modules (rates.py, costs.py, problem_classes.py), eight
tasks, registered as workstream 869196c5-551b-4eef-b8d8-cca6f770a9b0
under the custodian topic. A follow-on consumer workplan in
infospace-bench will migrate plan_generation_summary to delegate once
T01-T04 land here.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-19 04:30:52 +02:00

llm-connect-WP-0001-foundation-gaaf-baseline.md
  Adaptive routing initial version (2026-05-18 11:38:12 +02:00)

llm-connect-WP-0002-core-extensions.md
  Adaptive routing initial version (2026-05-18 11:38:12 +02:00)

llm-connect-WP-0003-functional-extensions.md
  Adaptive routing initial version (2026-05-18 11:38:12 +02:00)

llm-connect-WP-0004-adaptive-cost-quality-routing.md
  Add adaptive cost-quality routing primitives (2026-05-17 21:32:27 +02:00)

llm-connect-WP-0005-cost-model-and-problem-class-estimators.md
  plan: WP-0005 — cost model and problem-class token estimators (2026-05-19 04:30:52 +02:00)