# Generic Source Generator

Date: 2026-05-14

## Purpose

`infospace-bench generate` turns a local article, ebook-like file, or folder of
knowledge sources into a manifest-backed infospace. It generalizes the
Wealth/VSM pilot into an explicit workflow path with deterministic fixture
support and an optional OpenRouter provider.

## Deterministic Run

Use fixture responses for repeatable tests and demos:

```bash
infospace-bench generate from-source ./examples/article.md \
  --workspace . \
  --slug article-space \
  --name "Article Space" \
  --profile general-knowledge \
  --fixture-responses ./examples/responses.yaml \
  --apply
```

The command creates normalized source chunks, installs the selected profile,
runs the declared workflows, writes entities, relations, evaluations, metrics,
history, and a generation report, then registers artifacts in
`artifacts/index.yaml`.

## Stepwise Workflow

```bash
infospace-bench generate init ./book.epub \
  --workspace . \
  --slug book-space \
  --name "Book Space" \
  --profile general-knowledge \
  --max-chunks 3

infospace-bench generate plan ./infospaces/book-space --stage all
infospace-bench generate run ./infospaces/book-space \
  --fixture-responses ./responses.yaml
infospace-bench generate status ./infospaces/book-space
```

`--max-chunks` caps early experiments and provider cost. `generate status`
shows chunk counts, generated artifact counts, evaluations, metrics, history,
and stale source/profile inputs.

### Live OpenRouter runs (handle with care)

A single-chapter live run is the only OpenRouter shape the test suite
covers today. Use `--chapter` (or `--from-chapter` / `--to-chapter`) on
`generate init` or `generate from-source` to scope what gets registered
before any provider calls happen:

```bash
export OPENROUTER_API_KEY=...

# Preview the cost first
infospace-bench generate plan ./infospaces/foo --chapter I --cost-per-1k 0.30

# Run only Chapter I against a cheap model
infospace-bench generate from-source ./LEFEVRE.epub \
  --workspace ./infospaces \
  --slug reminiscences-ch1 \
  --name "Reminiscences (Ch I)" \
  --profile trading-literature \
  --provider openrouter \
  --model openai/gpt-4o-mini \
  --chapter I \
  --apply
```

`output/budget/plans.yaml`, `usage.yaml`, and `summary.yaml` record what
was estimated, what was actually spent, and the plan-vs-actual delta.
`output/workflows/runs/*.yaml` carry the OpenRouter request_id, model,
token usage, retry count, and per-call duration; the same metadata
reaches the entity/relation/evaluation artifacts via
`provenance.provider_metadata`.

Before scaling to the full book:

- Inspect each chapter's outputs and `generation-summary.md`
- Multiply the per-chapter `total_provider_calls_estimate` and
  `estimated_cost_usd` by the chapter count and compare to your budget
- Decide on a final model and confirm the rate-table entry exists in
  `src/infospace_bench/model_rates.yaml` or your workspace override

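The multiplication in the second bullet can be sketched in Python. The field names `total_provider_calls_estimate` and `estimated_cost_usd` come from the plan summary; the function name, the sample numbers, and the assumption that every chapter costs about as much as the sampled one are illustrative only.

```python
def project_full_book(per_chapter: dict, chapter_count: int, budget_usd: float) -> dict:
    """Scale a single-chapter plan estimate linearly to the whole book."""
    calls = per_chapter["total_provider_calls_estimate"] * chapter_count
    cost = per_chapter["estimated_cost_usd"] * chapter_count
    return {
        "projected_calls": calls,
        "projected_cost_usd": round(cost, 2),
        "within_budget": cost <= budget_usd,
    }


# Hypothetical numbers for one chapter; substitute the values from your own plan.
one_chapter = {"total_provider_calls_estimate": 12, "estimated_cost_usd": 0.04}
projection = project_full_book(one_chapter, chapter_count=24, budget_usd=2.00)
print(projection)
```

A linear projection is only a first approximation: chapters differ in length, so treat the result as a rough bound and re-run `generate plan` over the full range before committing.
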
The optional live-smoke test in `tests/test_openrouter_live.py` is
skipped unless both `OPENROUTER_API_KEY` and
`INFOSPACE_BENCH_ENABLE_LIVE_OPENROUTER=1` are set. It runs a single
chapter through the same path and asserts the provider metadata
plumb-through.

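The double opt-in described above can be expressed as a small predicate. This is a sketch, not the code in `tests/test_openrouter_live.py`: the function name `live_openrouter_enabled` is hypothetical, and the real test may wire the check through a skip marker instead.

```python
import os
from typing import Mapping


def live_openrouter_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Return True only when both opt-in variables are present."""
    has_key = bool(env.get("OPENROUTER_API_KEY"))
    opted_in = env.get("INFOSPACE_BENCH_ENABLE_LIVE_OPENROUTER") == "1"
    return has_key and opted_in


# An API key alone is not enough; the explicit enable flag is also required.
print(live_openrouter_enabled({"OPENROUTER_API_KEY": "sk-example"}))
```

Requiring two variables means a developer who happens to have an API key exported does not trigger paid calls just by running the test suite.
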
### Live runs with `--provider routing`

When the routing CLI is what you want to exercise live, swap
`--provider openrouter --model ...` for the routing pair:

```bash
infospace-bench generate from-source ./LEFEVRE.epub \
  --workspace ./infospaces \
  --slug reminiscences-routed \
  --name "Reminiscences (Routed)" \
  --profile trading-literature \
  --provider routing \
  --routing-config ./examples/routing/trading-literature.yaml \
  --chapter I \
  --apply
```

`examples/routing/trading-literature.yaml` is a checked-in starting
config: cheap candidates for summary/evaluation, smart candidates for
entity/relation, a `claude_code` baseline rule for future shadow
sampling, and a workspace-relative `output/routing/quality.jsonl`
ledger so adaptive observations stay with the workspace.

`--quality-floor <float>` on the same command overrides the config's
`default_quality_floor` for a single invocation, which is useful for
tightening the bar for a specific run without editing the file. The
ledger fills up as the `AdaptiveRoutingPolicy` records each
observation; later runs against the same workspace get the benefit
without re-grading from scratch.

The parallel live-smoke test
(`test_provider_routing_one_chapter_live_smoke`) is also gated on
`INFOSPACE_BENCH_ENABLE_LIVE_OPENROUTER=1` + `OPENROUTER_API_KEY` and
asserts the per-stage adapter-choices report section names the routed
model.

### Budget and usage registry

Every `generate plan` invocation appends a compact snapshot to
`output/budget/plans.yaml` (deterministic 12-char `snapshot_id`, 50-entry
sliding retention). Every `generate run` invocation appends a usage
rollup to `output/budget/usage.yaml`, bucketed by `(workflow_id,
stage_id, provider, model)` with prompt and completion token counts,
known cost (when the adapter returned it), and estimated cost (when a
rate table entry matches the model).

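To make the bucketing concrete, here is a minimal sketch of a rollup keyed by `(workflow_id, stage_id, provider, model)`. The function name, the per-call record layout, and the sample values are assumptions for illustration; the actual layout of `usage.yaml` may differ.

```python
from collections import defaultdict


def rollup_usage(calls: list) -> dict:
    """Sum token counts per (workflow_id, stage_id, provider, model) bucket."""
    buckets = defaultdict(
        lambda: {"prompt_tokens": 0, "completion_tokens": 0, "calls": 0}
    )
    for call in calls:
        key = (call["workflow_id"], call["stage_id"], call["provider"], call["model"])
        bucket = buckets[key]
        bucket["prompt_tokens"] += call["prompt_tokens"]
        bucket["completion_tokens"] += call["completion_tokens"]
        bucket["calls"] += 1
    return dict(buckets)


# Hypothetical per-call records; identifiers are made up for the example.
calls = [
    {"workflow_id": "extract", "stage_id": "entity", "provider": "openrouter",
     "model": "openai/gpt-4o-mini", "prompt_tokens": 900, "completion_tokens": 300},
    {"workflow_id": "extract", "stage_id": "entity", "provider": "openrouter",
     "model": "openai/gpt-4o-mini", "prompt_tokens": 1100, "completion_tokens": 200},
    {"workflow_id": "extract", "stage_id": "relation", "provider": "openrouter",
     "model": "openai/gpt-4o-mini", "prompt_tokens": 700, "completion_tokens": 150},
]
usage = rollup_usage(calls)
```

Keying on the full tuple keeps costs separable when the same stage runs against different models, which matters once routing picks models per stage.
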
The default rate table is bundled at
`src/infospace_bench/model_rates.yaml` and covers a handful of common
OpenRouter models at list price (see the file for the captured-at
timestamp). A workspace can override or extend entries by placing
`model-rates.yaml` next to its `infospaces/` directory; the workspace
file is overlaid on top of the package default so partial overrides
are fine.

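The overlay behaviour can be pictured as a dictionary merge. The sketch below assumes a hypothetical schema (model ID mapped to `prompt_per_1k` / `completion_per_1k` rates) and assumes that "partial" means a workspace entry may change a single field of an existing model; check `model_rates.yaml` for the real keys and merge granularity.

```python
def overlay_rates(package_default: dict, workspace_override: dict) -> dict:
    """Overlay workspace rates on the package defaults without mutating either."""
    merged = {model: dict(rate) for model, rate in package_default.items()}
    for model, rate in workspace_override.items():
        # Existing models keep fields the workspace file does not mention.
        merged.setdefault(model, {}).update(rate)
    return merged


# Hypothetical rates in USD per 1k tokens; not real list prices.
defaults = {"openai/gpt-4o-mini": {"prompt_per_1k": 0.15, "completion_per_1k": 0.60}}
workspace = {
    "openai/gpt-4o-mini": {"completion_per_1k": 0.50},            # partial override
    "example/other-model": {"prompt_per_1k": 1.00, "completion_per_1k": 3.00},  # extension
}
rates = overlay_rates(defaults, workspace)
```
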
Cost resolution order on each run: adapter-returned `cost` first, then
the rate table, then `cost_status="unknown"` (recorded explicitly,
never silently zeroed). The plan-vs-actual variance summary lands in
follow-on task T04.

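The resolution order above can be sketched as a three-step fallback. Only the `"unknown"` status value is taken from this document; the function name, the `"known"` and `"estimated"` labels, and the rate field names are assumptions made for the example.

```python
def resolve_cost(adapter_cost, model: str, prompt_tokens: int,
                 completion_tokens: int, rates: dict) -> dict:
    """Pick a cost: adapter-returned first, then rate table, else unknown."""
    if adapter_cost is not None:
        return {"cost_usd": adapter_cost, "cost_status": "known"}
    rate = rates.get(model)
    if rate is not None:
        estimate = (prompt_tokens / 1000) * rate["prompt_per_1k"] \
            + (completion_tokens / 1000) * rate["completion_per_1k"]
        return {"cost_usd": round(estimate, 6), "cost_status": "estimated"}
    # No adapter cost and no matching rate: record the gap rather than 0.0.
    return {"cost_usd": None, "cost_status": "unknown"}


# Hypothetical rate table in USD per 1k tokens.
rate_table = {"openai/gpt-4o-mini": {"prompt_per_1k": 0.15, "completion_per_1k": 0.60}}
known = resolve_cost(0.0123, "openai/gpt-4o-mini", 2000, 1000, rate_table)
estimated = resolve_cost(None, "openai/gpt-4o-mini", 2000, 1000, rate_table)
unknown = resolve_cost(None, "example/unlisted-model", 2000, 1000, rate_table)
```

Returning `None` with an explicit status keeps an unpriced call from being mistaken for a free one when costs are summed later.
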
### Profiles

Two profiles ship today:

- `general-knowledge`: durable concepts, claims, methods, people,
  places, works, and objects across any source
- `trading-literature`: trading memoirs and market-structure texts;
  tunes entity categories (`trader`, `market`, `strategy`, `error`,
  `psychological_pattern`, `institution`, `instrument`,
  `evidence_bearing_claim`), relation types (`cause_effect`,
  `lesson_evidence`, `risk_mitigation`, `actor_venue`,
  `strategy_outcome`), and evaluation criteria (`groundedness`,
  `lesson_clarity`, `historical_context`, `overgeneralization_risk`)

Select via `--profile trading-literature` on `generate init` or
`generate from-source`. The generic profile remains the default.

### Scale-aware plan

`generate plan` returns a compact estimate by default: counts of selected
chunks, calls per workflow, prompt-word and token estimates, and a rough
USD cost when `--cost-per-1k` is supplied. Long corpora no longer dump
hundreds of full prompts unless `--full` is set.

```bash
infospace-bench generate plan ./infospaces/book-space \
  --from-chapter 1 --to-chapter 3 \
  --cost-per-1k 0.30 \
  --max-calls 50 \
  --cost-cap 2.00
```

Selection filters:

- `--chapter LABEL` (repeatable): match a chapter by roman/arabic label
  or numeric value (e.g. `--chapter I` or `--chapter 2`)
- `--from-chapter N` / `--to-chapter N`: numeric chapter range
- `--chunk ID` (repeatable): exact source chunk id (e.g.
  `chapter-01-part-002`)

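One way to picture how roman and arabic chapter labels could resolve to the same chapter is the sketch below. It is not the CLI's implementation: the function names are hypothetical, and treating `--chapter` and the numeric range as a union is an assumption.

```python
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def chapter_number(label: str) -> int:
    """Convert an arabic ('2') or roman ('II') chapter label to an integer."""
    text = label.strip().upper()
    if text.isdecimal():
        return int(text)
    if not text or any(ch not in ROMAN for ch in text):
        raise ValueError(f"unrecognized chapter label: {label!r}")
    total = 0
    for i, ch in enumerate(text):
        value = ROMAN[ch]
        following = ROMAN[text[i + 1]] if i + 1 < len(text) else 0
        # A smaller numeral before a larger one is subtracted (IV = 4).
        total += -value if value < following else value
    return total


def select_chapters(available, chapters=(), start=None, end=None):
    """Keep labels named explicitly or falling inside the numeric range."""
    wanted = {chapter_number(c) for c in chapters}
    has_range = start is not None or end is not None
    selected = []
    for label in available:
        n = chapter_number(label)
        low = start if start is not None else n
        high = end if end is not None else n
        if n in wanted or (has_range and low <= n <= high):
            selected.append(label)
    return selected


book = ["I", "II", "III", "IV"]
print(select_chapters(book, chapters=["2"]))
print(select_chapters(book, start=1, end=3))
```
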
Budget flags `--max-calls` and `--cost-cap` are reported as
`exceeds_max_calls` / `exceeds_cost_cap` booleans in the summary, so a
caller can fail fast before invoking `run`. Use `--full` to opt back into
the full per-workflow plan with prompts for deep inspection.

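A caller-side fail-fast check might look like the following sketch. The boolean names match the summary fields above, and `total_provider_calls_estimate` / `estimated_cost_usd` are the estimate fields named earlier in this document; the function names and the strict greater-than comparison are assumptions.

```python
def budget_flags(summary: dict, max_calls=None, cost_cap=None) -> dict:
    """Derive the two budget booleans from a plan estimate."""
    calls = summary["total_provider_calls_estimate"]
    cost = summary.get("estimated_cost_usd")
    return {
        "exceeds_max_calls": max_calls is not None and calls > max_calls,
        "exceeds_cost_cap": (
            cost_cap is not None and cost is not None and cost > cost_cap
        ),
    }


def should_abort(flags: dict) -> bool:
    """Fail fast if either limit is exceeded."""
    return flags["exceeds_max_calls"] or flags["exceeds_cost_cap"]


# Hypothetical plan estimate checked against --max-calls 50 --cost-cap 2.00.
flags = budget_flags(
    {"total_provider_calls_estimate": 72, "estimated_cost_usd": 1.10},
    max_calls=50,
    cost_cap=2.00,
)
print(flags, should_abort(flags))
```

Here the call count trips the limit while the cost does not, so a wrapper script would stop before `generate run` even though the spend looks affordable.
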
## OpenRouter

Live model calls are explicit:

```bash
export OPENROUTER_API_KEY=...

infospace-bench generate run ./infospaces/book-space \
  --provider openrouter \
  --model openai/gpt-4o-mini \
  --stage all
```

Choose the `--model` value from OpenRouter model IDs. The API key is read from
`OPENROUTER_API_KEY`; it is not written to `infospace.yaml`. Default tests never
make live provider calls.

## Resume

Use resume for interrupted or reviewed runs:

```bash
infospace-bench generate resume ./infospaces/book-space \
  --provider openrouter \
  --model openai/gpt-4o-mini
```

Unchanged completed runs are skipped. Use `--force` when you intentionally want
to rerun completed work. Stale status is reported when source artifact digests
or installed profile/template files change.

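The skip, force, and stale rules can be illustrated with a digest comparison. This is a sketch under assumptions: SHA-256 as the digest, a single combined input blob, and the `resume_action` name and its return labels are all hypothetical, and whether a stale run is re-executed automatically is not specified here.

```python
import hashlib


def digest(data: bytes) -> str:
    """Content digest of the inputs a run depends on (SHA-256 assumed)."""
    return hashlib.sha256(data).hexdigest()


def resume_action(recorded_digest, current_inputs: bytes,
                  completed: bool, force: bool = False) -> str:
    """Classify a run as 'run', 'skip', or 'stale' when resuming."""
    if force or not completed:
        return "run"
    if recorded_digest != digest(current_inputs):
        # Inputs changed since the run completed: report it as stale.
        return "stale"
    return "skip"


source = b"chapter one text"
recorded = digest(source)
print(resume_action(recorded, source, completed=True))
print(resume_action(recorded, b"chapter one text, edited", completed=True))
print(resume_action(recorded, source, completed=True, force=True))
```
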
## Review Path

After generation:

- inspect `artifacts/sources/` for normalized input chunks
- inspect `artifacts/entities/` and `artifacts/relations/` for generated claims
- inspect `output/evaluations/` for rubric output
- run `infospace-bench validate <root>` and `infospace-bench graph <root>`
- review `reports/generation-summary.md`

Move from the generic profile to a specialized profile when the source domain
needs stricter terminology, narrower extraction granularity, or a discipline
lens such as VSM.