generated from coulomb/repo-seed
Implement-LLM-WP-0005-cost-model-estimators
25
contracts/functional/costs.md
Normal file
@@ -0,0 +1,25 @@
# Cost Estimates

`llm_connect.costs` converts token estimates or observed token counts into
USD estimates using `ModelRateRegistry`.

## Contract

```python
from llm_connect import estimate_cost

estimate = estimate_cost("openai/gpt-4o-mini", 28_000, 7_500)
```

For known models the result is:

- `cost_usd`: prompt plus completion estimate.
- `prompt_cost_usd`: prompt-token component.
- `completion_cost_usd`: completion-token component.
- `cost_source`: `rate_table:<model_id>`.

Unknown models return `CostEstimate(cost_usd=None, cost_source="unknown")`.
Missing rates are never silently treated as zero cost.

The module also exposes `CostModel(registry=...)` for callers that prefer to
carry a registry object and call `model.estimate_cost(...)`.

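The contract above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's implementation: the `RATES` mapping, its `(prompt, completion)` tuple layout, the argument order (model ID, prompt tokens, completion tokens), and the `None` defaults on the component fields are all hypothetical. Only the field names, the `rate_table:<model_id>` source string, and the unknown-model result come from the contract. The rate values are the ones shown for `openai/gpt-4o-mini` in `contracts/functional/rates.md`.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed rate table: USD per 1,000 tokens as (prompt, completion).
RATES = {"openai/gpt-4o-mini": (0.00015, 0.00060)}


@dataclass(frozen=True)
class CostEstimate:
    """Illustrative stand-in for the result type named in the contract."""
    cost_usd: Optional[float]
    cost_source: str
    prompt_cost_usd: Optional[float] = None
    completion_cost_usd: Optional[float] = None


def estimate_cost(model_id, prompt_tokens, completion_tokens, rates=RATES):
    rate = rates.get(model_id)
    if rate is None:
        # Unknown model: report None rather than a silent zero cost.
        return CostEstimate(cost_usd=None, cost_source="unknown")
    prompt_cost = prompt_tokens / 1000 * rate[0]
    completion_cost = completion_tokens / 1000 * rate[1]
    return CostEstimate(
        cost_usd=prompt_cost + completion_cost,
        cost_source=f"rate_table:{model_id}",
        prompt_cost_usd=prompt_cost,
        completion_cost_usd=completion_cost,
    )


class CostModel:
    """Carries a rate table so callers do not pass it on every call."""

    def __init__(self, registry):
        self.registry = registry

    def estimate_cost(self, model_id, prompt_tokens, completion_tokens):
        return estimate_cost(model_id, prompt_tokens, completion_tokens, self.registry)


known = estimate_cost("openai/gpt-4o-mini", 28_000, 7_500)
unknown = estimate_cost("example/unlisted-model", 1_000, 1_000)
```

With these rates, 28,000 prompt tokens come to about 0.0042 USD and 7,500 completion tokens to about 0.0045 USD, roughly 0.0087 USD in total. The unlisted model yields `cost_usd=None`, which a caller can distinguish from a genuinely free call.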
46
contracts/functional/problem-classes.md
Normal file
@@ -0,0 +1,46 @@
# Problem Classes

`llm_connect.problem_classes` provides generic token estimators for recurring
LLM workflow shapes.

## Contract

Every problem class exposes:

- `name`: stable registry key.
- `base_dimensions`: required dimension names supplied by consumers.
- `tunable_params`: parameters that can be overridden or fitted.
- `estimate(dimensions, params=None) -> TokenEstimate`.
- `fit(observations, min_observations=3) -> ProblemClass`.

`TokenEstimate` contains `prompt_tokens`, `completion_tokens`, and a
`confidence` score from `0` to `1`.

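One way to read the interface above is as a structural type. The sketch below is an assumption about shape, not the package's code: the container types (`Tuple`, `Mapping`, `Sequence`), the integer token counts, and the range check in `__post_init__` are illustrative choices; only the member names, the two method signatures, and the `0` to `1` confidence range come from the contract.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Protocol, Sequence, Tuple


@dataclass(frozen=True)
class TokenEstimate:
    prompt_tokens: int
    completion_tokens: int
    confidence: float  # documented range: 0 to 1

    def __post_init__(self):
        # Assumed validation; the contract only states the range.
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


class ProblemClass(Protocol):
    name: str                             # stable registry key
    base_dimensions: Tuple[str, ...]      # required dimension names
    tunable_params: Mapping[str, float]   # overridable or fitted parameters

    def estimate(
        self,
        dimensions: Mapping[str, float],
        params: Optional[Mapping[str, float]] = None,
    ) -> TokenEstimate: ...

    def fit(
        self,
        observations: Sequence[object],
        min_observations: int = 3,
    ) -> "ProblemClass": ...


example = TokenEstimate(prompt_tokens=1_400, completion_tokens=120, confidence=0.5)
```

Because `fit` returns a `ProblemClass` rather than mutating in place, a fitted estimator can be compared against the one it was derived from.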
## Built-Ins

| Name | Dimensions | Tunable params |
|---|---|---|
| `chunk-summarization` | `chunk_words`, `template_words` | `completion_ratio` |
| `entity-extraction` | `chunk_words`, `template_words`, `expected_entities` | `tokens_per_entity` |
| `relation-extraction` | `chunk_words`, `template_words`, `expected_relations` | `tokens_per_relation` |
| `judge-eval` | `artifact_words`, `template_words`, `n_criteria` | `tokens_per_criterion` |
| `report-synthesis` | `n_chunks`, `n_entities`, `n_relations`, `template_words` | `base_completion_tokens` |

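Since each built-in declares its required dimensions, a consumer can check its inputs before asking for an estimate. The helper below is hypothetical (`BUILT_IN_DIMENSIONS` and `missing_dimensions` are not part of the documented API); the dimension names are copied from the table above.

```python
# Required dimensions per built-in, copied from the Built-Ins table.
BUILT_IN_DIMENSIONS = {
    "chunk-summarization": ("chunk_words", "template_words"),
    "entity-extraction": ("chunk_words", "template_words", "expected_entities"),
    "relation-extraction": ("chunk_words", "template_words", "expected_relations"),
    "judge-eval": ("artifact_words", "template_words", "n_criteria"),
    "report-synthesis": ("n_chunks", "n_entities", "n_relations", "template_words"),
}


def missing_dimensions(name, dimensions):
    """Return the required dimension names that `dimensions` does not supply."""
    required = BUILT_IN_DIMENSIONS[name]  # raises KeyError for an unknown name
    return [dim for dim in required if dim not in dimensions]


# `template_words` is deliberately left out of this request.
missing = missing_dimensions("judge-eval", {"artifact_words": 1200, "n_criteria": 5})
```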
## Observations

`fit()` accepts either `Observation` objects or `QualityObservation` rows whose
`tags` include:

```python
{
    "problem_class": "entity-extraction",
    "dimensions": {
        "chunk_words": 900,
        "template_words": 200,
        "expected_entities": 4,
    },
}
```

When fewer than `min_observations` usable rows are present, fitting falls back
to the current parameters.

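The fallback rule can be illustrated with a small sketch for `tokens_per_entity`. Much of this is assumed: rows are modelled as plain dicts with a hypothetical `completion_tokens` key, "usable" is taken to mean a matching `problem_class` with a positive `expected_entities` and an observed completion count, and the fitting rule is a simple mean. The contract states only the `min_observations` threshold and the fallback to current parameters.

```python
from statistics import mean


def fit_tokens_per_entity(rows, current, min_observations=3):
    """Return a fitted tokens-per-entity value, or `current` if data is too thin."""
    usable = [
        row for row in rows
        if row.get("problem_class") == "entity-extraction"
        and row.get("dimensions", {}).get("expected_entities", 0) > 0
        and row.get("completion_tokens") is not None
    ]
    if len(usable) < min_observations:
        # Too few usable rows: keep the current parameter unchanged.
        return current
    # Assumed fitting rule: average observed completion tokens per entity.
    return mean(
        row["completion_tokens"] / row["dimensions"]["expected_entities"]
        for row in usable
    )


rows = [
    {"problem_class": "entity-extraction",
     "dimensions": {"expected_entities": 4}, "completion_tokens": 120},
    {"problem_class": "entity-extraction",
     "dimensions": {"expected_entities": 5}, "completion_tokens": 160},
    {"problem_class": "entity-extraction",
     "dimensions": {"expected_entities": 2}, "completion_tokens": 56},
]

fitted = fit_tokens_per_entity(rows, current=25.0)         # three usable rows
fallback = fit_tokens_per_entity(rows[:2], current=25.0)   # only two usable rows
```

Here the three rows give 30, 32, and 28 tokens per entity, so the fitted value is their mean, while the two-row call returns the current value untouched.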
30
contracts/functional/rates.md
Normal file
@@ -0,0 +1,30 @@
# Model Rate Registry

`llm_connect.rates` owns static model list prices used for planning and
post-hoc estimates.

## Contract

- `ModelRate` records `model_id`, prompt and completion rates in USD per
  1,000 tokens, `currency`, `source_url`, and `captured_at`.
- `ModelRateRegistry.default()` returns the bundled OpenRouter snapshot
  captured on `2026-05-17`.
- `ModelRateRegistry.from_yaml(path)` accepts the package/consumer override
  shape:

```yaml
schema_version: 1
currency: USD
source_url: https://openrouter.ai/models
captured_at: "2026-05-17"
rates:
  openai/gpt-4o-mini:
    prompt_per_1k: 0.00015
    completion_per_1k: 0.00060
```

- `merged_with(override)` returns a new registry where matching override
  entries replace default entries by `model_id`.

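The override semantics can be sketched with plain dictionaries keyed by `model_id`. This is an assumption-laden illustration rather than the registry's implementation: `rates_from_mapping` and `merged` are hypothetical helpers, the input is an already-parsed dict with the same shape as the YAML document (no YAML parser is involved), and the override prices and the `example/custom-model` entry are made-up values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRate:
    model_id: str
    prompt_per_1k: float       # USD per 1,000 prompt tokens
    completion_per_1k: float   # USD per 1,000 completion tokens
    currency: str
    source_url: str
    captured_at: str


def rates_from_mapping(doc):
    """Build {model_id: ModelRate} from a dict shaped like the YAML document."""
    return {
        model_id: ModelRate(
            model_id=model_id,
            prompt_per_1k=float(entry["prompt_per_1k"]),
            completion_per_1k=float(entry["completion_per_1k"]),
            currency=doc["currency"],
            source_url=doc["source_url"],
            captured_at=doc["captured_at"],
        )
        for model_id, entry in doc["rates"].items()
    }


def merged(default, override):
    # Builds a new dict; neither input is mutated. Override wins per model_id.
    return {**default, **override}


header = {"schema_version": 1, "currency": "USD",
          "source_url": "https://openrouter.ai/models", "captured_at": "2026-05-17"}
default = rates_from_mapping({**header, "rates": {
    "openai/gpt-4o-mini": {"prompt_per_1k": 0.00015, "completion_per_1k": 0.00060},
}})
# Hypothetical consumer override: illustrative numbers, not real prices.
override = rates_from_mapping({**header, "rates": {
    "openai/gpt-4o-mini": {"prompt_per_1k": 0.0002, "completion_per_1k": 0.0008},
    "example/custom-model": {"prompt_per_1k": 0.001, "completion_per_1k": 0.002},
}})

combined = merged(default, override)
```

Returning a new mapping keeps the bundled defaults intact, so a consumer can still compare its overrides against the original snapshot.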
Rates are a static snapshot. Consumers decide whether `captured_at` is fresh
enough for their workflow.

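A consumer-side freshness check might look like the following. The helper names and the `max_age_days` threshold are hypothetical; the contract only says that the decision belongs to the consumer and that `captured_at` is an ISO date string such as `"2026-05-17"`.

```python
from datetime import date


def snapshot_age_days(captured_at, today):
    """Days elapsed between the snapshot date (ISO string) and `today`."""
    return (today - date.fromisoformat(captured_at)).days


def is_fresh(captured_at, max_age_days, today):
    # The consumer picks max_age_days to suit its own workflow.
    return snapshot_age_days(captured_at, today) <= max_age_days


# Fixed reference date so the example is reproducible.
reference_day = date(2026, 6, 16)
age = snapshot_age_days("2026-05-17", reference_day)
fresh_for_monthly = is_fresh("2026-05-17", 30, reference_day)
fresh_for_biweekly = is_fresh("2026-05-17", 14, reference_day)
```

Passing `today` explicitly, rather than calling `date.today()` inside the helper, keeps the check deterministic in tests.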