infospace-bench

Files

tegwick 678508226a IB-WP-0019-T02: usage rollup from run records

Every completed generate run now aggregates per-call adapter usage from
the workflow-engine run records into output/budget/usage.yaml. Per-call
data is bucketed by (workflow_id, stage_id, provider, model) with
running totals for calls, prompt_tokens, completion_tokens,
total_tokens, and cost_usd_known (sum of adapter-reported cost when the
provider returns it; usually zero today). A run-level entry captures
run_index, started_at, completed_at, duration_seconds, the executing
plan snapshot_id (resolved from the latest plans.yaml entry), and the
workflow-level run_id / stage_count summaries.
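
The bucketing described above can be sketched roughly as follows. This
is a minimal illustration, not the code in this commit: the function
name rollup_usage, the per-call dict shape, and the cost_usd key are
assumptions, since the run-record schema is not shown here.

```python
def rollup_usage(calls):
    """Bucket per-call usage by (workflow_id, stage_id, provider, model).

    `calls` is a hypothetical list of dicts, one per adapter call.
    """
    buckets = {}
    for call in calls:
        key = (call["workflow_id"], call["stage_id"],
               call["provider"], call["model"])
        bucket = buckets.setdefault(key, {
            "calls": 0,
            "prompt_tokens": 0,
            "completion_tokens": 0,
            "total_tokens": 0,
            "cost_usd_known": 0.0,
        })
        prompt = call.get("prompt_tokens") or 0
        completion = call.get("completion_tokens") or 0
        bucket["calls"] += 1
        bucket["prompt_tokens"] += prompt
        bucket["completion_tokens"] += completion
        # Fall back to prompt + completion when no total is reported.
        bucket["total_tokens"] += call.get("total_tokens") or (prompt + completion)
        # Only adapter-reported cost is summed; a missing cost adds nothing.
        bucket["cost_usd_known"] += call.get("cost_usd") or 0.0
    return buckets
```

Keying on the full tuple keeps two models used within the same stage in
separate buckets, so their totals are never merged.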

cost_usd_estimated is left as None for this task; T03 wires the
rate-table resolver so the same bucket gets a model-priced fallback
when the adapter does not return cost directly.

Fixture-mode runs are recorded with provider='fixture', zero tokens,
and cost_status='unknown' rather than silently skipped, so the rollup
accurately reflects which stages ran.
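
As a rough sketch of that behaviour (not the commit's actual code): the
helper names, the 'fixture' flag on a stage, and the 'calls' list are
all hypothetical. Only provider='fixture', the zero token counts, and
cost_status='unknown' come from the description above.

```python
def fixture_usage_entry(workflow_id, stage_id):
    """Placeholder entry for a stage that ran in fixture mode."""
    return {
        "workflow_id": workflow_id,
        "stage_id": stage_id,
        "provider": "fixture",
        "prompt_tokens": 0,
        "completion_tokens": 0,
        "total_tokens": 0,
        "cost_status": "unknown",
    }


def usage_entries(stages):
    """Collect usage entries, keeping fixture stages instead of dropping them."""
    entries = []
    for stage in stages:
        if stage.get("fixture"):  # assumed flag marking fixture-mode stages
            entries.append(
                fixture_usage_entry(stage["workflow_id"], stage["stage_id"])
            )
        else:
            # Assumed shape: live stages carry their per-call usage records.
            entries.extend(stage.get("calls", []))
    return entries
```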

102 tests pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-17 19:46:40 +02:00

test_agentic_memory_profile.py

Agentic memory profile

2026-05-15 16:01:35 +02:00

test_archive.py

archive: include contracts/, schemas/; report skipped top-level dirs

2026-05-17 12:21:19 +02:00

test_budget_registry.py

IB-WP-0019-T02: usage rollup from run records

2026-05-17 19:46:40 +02:00

test_cli.py

IB-WP-0014: archive-list, restore, retention annotation, docs (T03-T05)