generated from coulomb/repo-seed
3.4 KiB
3.4 KiB
infospace-bench Cost Estimator Migration
infospace-bench can replace its local rate table and coarse word-count
budget math with the primitives added in LLM-WP-0005.
Rate Table
- Drop
src/infospace_bench/model_rates.yamlafter the dependency is bumped. - Load
ModelRateRegistry.default()fromllm-connect. - Keep the workspace-level
model-rates.yamloverride and merge it withdefault().merged_with(ModelRateRegistry.from_yaml(path)). - Preserve
--cost-per-1kas an explicit blended-rate override. When supplied, it should win over the registry and reportcost_source="cost_per_1k_blended".
Plan Summary Sketch
from llm_connect import (
CostEstimate,
ModelRateRegistry,
ProblemClassRegistry,
estimate_cost,
)
def plan_generation_summary(...):
problem_classes = ProblemClassRegistry.default()
rates = ModelRateRegistry.default()
workspace_rates = _workspace_rate_path(root_path)
if workspace_rates.exists():
rates = rates.merged_with(ModelRateRegistry.from_yaml(workspace_rates))
total_prompt_tokens = 0
total_completion_tokens = 0
per_stage = []
for workflow_id in workflow_ids:
class_name, dimensions = _problem_class_for_workflow(
workflow_id,
selected_chunks=selected,
template_words=template_words,
entities_per_chunk=entities_per_chunk,
)
estimate = problem_classes.get(class_name).estimate(dimensions)
calls = _calls_for_workflow(workflow_id, selected, entities_per_chunk)
prompt_tokens = estimate.prompt_tokens * calls
completion_tokens = estimate.completion_tokens * calls
total_prompt_tokens += prompt_tokens
total_completion_tokens += completion_tokens
per_stage.append(
{
"workflow_id": workflow_id,
"problem_class": class_name,
"calls": calls,
"prompt_tokens_estimate": prompt_tokens,
"completion_tokens_estimate": completion_tokens,
"confidence": estimate.confidence,
}
)
if cost_per_1k_tokens > 0:
total_tokens = total_prompt_tokens + total_completion_tokens
cost = (total_tokens / 1000.0) * cost_per_1k_tokens
cost_source = "cost_per_1k_blended"
elif model:
cost_estimate = estimate_cost(
model,
total_prompt_tokens,
total_completion_tokens,
registry=rates,
)
cost = cost_estimate.cost_usd
cost_source = cost_estimate.cost_source
else:
cost = None
cost_source = None
return {
"per_workflow": per_stage,
"total_prompt_tokens_estimate": total_prompt_tokens,
"estimated_completion_tokens": total_completion_tokens,
"estimated_cost_usd": round(cost, 6) if cost is not None else None,
"cost_source": cost_source,
...
}
Workflow Mapping
Initial mapping can stay intentionally thin:
| infospace-bench workflow | llm-connect problem class |
|---|---|
summarize-source |
chunk-summarization |
| entity extraction workflows | entity-extraction |
| relation extraction workflows | relation-extraction |
generic-source-evaluations |
judge-eval |
| final report or rollup synthesis | report-synthesis |
The consumer still owns structure-specific dimensions such as selected chunk counts, profile template word counts, and expected entities per chunk.