IB-WP-0020-T04: example routing config + live routing smoke

examples/routing/trading-literature.yaml is the checked-in starting
config for a Lefevre-style run. It applies the IB-WP-0018 task-type
taxonomy: cheap candidates for summary + evaluation, smart candidates
for entity + relation extraction, and a separate baseline rule wiring
claude_code for a follow-on T05 ShadowingAdapter step.
Workspace-relative ledger_path keeps adaptive observations with the
workspace.

tests/test_routing_config.py gains a regression test that asserts the
shipped example parses cleanly, every stage in stage_to_task_type maps
to a declared task type, and the baseline candidate uses the
claude_code provider, so the example will not bit-rot silently.

tests/test_openrouter_live.py gains
test_provider_routing_one_chapter_live_smoke gated on the same
INFOSPACE_BENCH_ENABLE_LIVE_OPENROUTER + OPENROUTER_API_KEY opt-in as
the existing static smoke. It builds a one-candidate routing config,
runs a single chapter through --provider routing, and asserts the
per-stage adapter-choices report section names the routed model and
the routed artifacts carry adapter_id provenance.

docs/generic-source-generator.md gains a "Live runs with --provider
routing" subsection that walks through the one-command routed run,
explains the --quality-floor override, and points at the parallel live
smoke test.

174 tests pass, 2 skipped (both live smokes, correctly gated).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
IB-WP-0020-T03: routing CLI flags
2026-05-18 22:19:54 +02:00 · 2026-05-18 22:08:51 +02:00
11 changed files with 569 additions and 10 deletions
--- a/docs/generic-source-generator.md
+++ b/docs/generic-source-generator.md
@@ -94,6 +94,42 @@ skipped unless both `OPENROUTER_API_KEY` and
 chapter through the same path and asserts the provider metadata
 plumb-through.

+### Live runs with `--provider routing`
+
+When the routing CLI is what you want to exercise live, swap
+`--provider openrouter --model ...` for the routing pair:
+
+```bash
+infospace-bench generate from-source ./LEFEVRE.epub \
+  --workspace ./infospaces \
+  --slug reminiscences-routed \
+  --name "Reminiscences (Routed)" \
+  --profile trading-literature \
+  --provider routing \
+  --routing-config ./examples/routing/trading-literature.yaml \
+  --chapter I \
+  --apply
+```
+
+`examples/routing/trading-literature.yaml` is a checked-in starting
+config: cheap candidates for summary/evaluation, smart candidates for
+entity/relation, a `claude_code` baseline rule for future shadow
+sampling, and a workspace-relative `output/routing/quality.jsonl`
+ledger so adaptive observations stay with the workspace.
+
+`--quality-floor <float>` on the same command overrides the config's
+`default_quality_floor` for a single invocation — useful for
+tightening the bar for a specific run without editing the file. The
+ledger fills up as the `AdaptiveRoutingPolicy` records each
+observation; later runs against the same workspace get the benefit
+without re-grading from scratch.
+
+The parallel live-smoke test
+(`test_provider_routing_one_chapter_live_smoke`) is also gated on
+`INFOSPACE_BENCH_ENABLE_LIVE_OPENROUTER=1` + `OPENROUTER_API_KEY` and
+asserts the per-stage adapter-choices report section names the routed
+model.
+
 ### Budget and usage registry

 Every `generate plan` invocation appends a compact snapshot to
--- a/examples/routing/trading-literature.yaml
+++ b/examples/routing/trading-literature.yaml
@@ -0,0 +1,81 @@
+# Example routing config for a trading-literature Lefevre-style run.
+#
+# Captures the IB-WP-0018 task-type taxonomy from docs/routing-task-types.md:
+#   summarize-source  → cheap model (volume-heavy, recoverable downstream)
+#   extract-entities  → smart model (durable output; be strict)
+#   extract-relations → smart model (depends on entities)
+#   evaluate-entity   → judge model (different family from extraction)
+#   synthesize-report → smart model (volume-of-one, quality matters, cheap)
+#
+# Quality floors are the recommended starting points from
+# docs/routing-task-types.md. With a ledger configured, AdaptiveRoutingPolicy
+# will pick the cheapest *qualifying* adapter per task type as observations
+# accumulate; until then it falls back to the static prefer/fallback order.
+#
+# Refresh the model rates in src/infospace_bench/model_rates.yaml before any
+# full-book run — list prices drift, and the rough USD estimate in the budget
+# log depends on them.
+
+schema_version: 1
+
+# Workspace-relative ledger so QualityLedger observations from this workspace
+# stay with this workspace. Drop this line to run pure static routing.
+ledger_path: output/routing/quality.jsonl
+
+# Floors apply when --quality-floor is not passed at the call site. The CLI
+# flag wins, then the per-task quality_floor below, then this default.
+default_quality_floor: 0.80
+
+stage_to_task_type:
+  summarize-source: cheap
+  extract-entities: smart
+  extract-relations: smart
+  evaluate-entity: judge
+  synthesize-report: smart
+
+task_types:
+
+  cheap:
+    quality_floor: 0.70
+    candidates:
+      - id: openrouter:gpt-4o-mini
+        provider: openrouter
+        model: openai/gpt-4o-mini
+        api_key_env: OPENROUTER_API_KEY
+        max_cost_per_1k: 0.001
+      - id: openrouter:claude-3.5-haiku
+        provider: openrouter
+        model: anthropic/claude-3.5-haiku
+        api_key_env: OPENROUTER_API_KEY
+        max_cost_per_1k: 0.003
+
+  smart:
+    quality_floor: 0.85
+    candidates:
+      - id: openrouter:claude-3.5-haiku
+        provider: openrouter
+        model: anthropic/claude-3.5-haiku
+        api_key_env: OPENROUTER_API_KEY
+      - id: openrouter:claude-3.5-sonnet
+        provider: openrouter
+        model: anthropic/claude-3.5-sonnet
+        api_key_env: OPENROUTER_API_KEY
+
+  judge:
+    quality_floor: 0.80
+    candidates:
+      # Evaluation goes through a different family than extraction to limit
+      # self-preference bias.
+      - id: openrouter:gpt-4o-mini
+        provider: openrouter
+        model: openai/gpt-4o-mini
+        api_key_env: OPENROUTER_API_KEY
+
+  # Baseline is wired here so a follow-up T05 ShadowingAdapter step can
+  # reference `claude-code` as the grading oracle without editing the
+  # task_types stanza.
+  baseline:
+    candidates:
+      - id: claude-code
+        provider: claude_code
+        model: claude-opus-4-7
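
The comments in the example config state a precedence for quality floors: the `--quality-floor` CLI flag wins, then the per-task `quality_floor`, then `default_quality_floor`. A minimal sketch of that ordering follows. `resolve_quality_floor` is a hypothetical helper written for illustration, not a function in `infospace_bench`, and this diff does not show where the per-task floor is actually applied.

```python
from __future__ import annotations


def resolve_quality_floor(
    cli_floor: float | None,
    task_floor: float | None,
    default_floor: float | None,
) -> float | None:
    """Hypothetical helper: return the first floor that is explicitly set.

    Order follows the config comment: CLI flag, per-task floor, config default.
    `is not None` is used instead of truthiness so an explicit 0.0 still wins.
    """
    for floor in (cli_floor, task_floor, default_floor):
        if floor is not None:
            return floor
    return None


# Values taken from the example config: "cheap" has 0.70, the default is 0.80.
floor_with_flag = resolve_quality_floor(0.90, 0.70, 0.80)
floor_for_cheap = resolve_quality_floor(None, 0.70, 0.80)
floor_without_task = resolve_quality_floor(None, None, 0.80)
```

Checking `is not None` mirrors the `effective_floor` expression in `_adapter_for`, which also treats `None` (and not `0.0`) as "flag not passed".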
--- a/src/infospace_bench/budget.py
+++ b/src/infospace_bench/budget.py
@@ -29,7 +29,7 @@ _PACKAGE_RATES_PATH = Path(__file__).parent / "model_rates.yaml"
 HUB_URL_ENV = "INFOSPACE_BENCH_HUB_URL"
 HUB_DISABLE_ENV = "INFOSPACE_BENCH_DISABLE_HUB_TOKEN_EVENTS"
 DEFAULT_HUB_URL = "http://127.0.0.1:8000"
-TOKEN_EVENTS_PATH = "/state/token-events"
+TOKEN_EVENTS_PATH = "/token-events/"
 HUB_TIMEOUT_SECONDS = 3.0

 BUDGET_DIR = Path("output/budget")
--- a/src/infospace_bench/cli.py
+++ b/src/infospace_bench/cli.py
@@ -203,9 +203,11 @@ def build_parser() -> argparse.ArgumentParser:
    )
    generate_run.add_argument("root")
    generate_run.add_argument("--stage", default="all")
-    generate_run.add_argument("--provider", choices=["fixture", "openrouter"], default="fixture")
+    generate_run.add_argument("--provider", choices=["fixture", "openrouter", "routing"], default="fixture")
    generate_run.add_argument("--model", default="")
    generate_run.add_argument("--fixture-responses", default="")
+    generate_run.add_argument("--routing-config", default="", help="YAML routing config (required with --provider routing)")
+    generate_run.add_argument("--quality-floor", type=float, default=None, help="Override the config's default_quality_floor for this run")
    generate_run.add_argument("--resume", action="store_true")
    generate_run.add_argument("--force", action="store_true")

@@ -215,9 +217,11 @@ def build_parser() -> argparse.ArgumentParser:
    )
    generate_resume.add_argument("root")
    generate_resume.add_argument("--stage", default="all")
-    generate_resume.add_argument("--provider", choices=["fixture", "openrouter"], default="fixture")
+    generate_resume.add_argument("--provider", choices=["fixture", "openrouter", "routing"], default="fixture")
    generate_resume.add_argument("--model", default="")
    generate_resume.add_argument("--fixture-responses", default="")
+    generate_resume.add_argument("--routing-config", default="")
+    generate_resume.add_argument("--quality-floor", type=float, default=None)
    generate_resume.add_argument("--force", action="store_true")

    generate_status = generate_sub.add_parser(
@@ -236,9 +240,11 @@ def build_parser() -> argparse.ArgumentParser:
    generate_from_source.add_argument("--name", required=True)
    generate_from_source.add_argument("--profile", default="general-knowledge")
    generate_from_source.add_argument("--stage", default="all")
-    generate_from_source.add_argument("--provider", choices=["fixture", "openrouter"], default="fixture")
+    generate_from_source.add_argument("--provider", choices=["fixture", "openrouter", "routing"], default="fixture")
    generate_from_source.add_argument("--model", default="")
    generate_from_source.add_argument("--fixture-responses", default="")
+    generate_from_source.add_argument("--routing-config", default="", help="YAML routing config (required with --provider routing)")
+    generate_from_source.add_argument("--quality-floor", type=float, default=None)
    generate_from_source.add_argument("--max-chunks", type=int, default=0)
    generate_from_source.add_argument(
        "--chapter",
@@ -551,6 +557,8 @@ def main(argv: list[str] | None = None) -> int:
                        provider=args.provider,
                        model=args.model,
                        fixture_responses=args.fixture_responses or None,
+                        routing_config=args.routing_config or None,
+                        quality_floor=args.quality_floor,
                        resume=args.resume,
                        force=args.force,
                    ).to_dict()
@@ -563,6 +571,8 @@ def main(argv: list[str] | None = None) -> int:
                        provider=args.provider,
                        model=args.model,
                        fixture_responses=args.fixture_responses or None,
+                        routing_config=args.routing_config or None,
+                        quality_floor=args.quality_floor,
                        resume=True,
                        force=args.force,
                    ).to_dict()
@@ -589,6 +599,8 @@ def main(argv: list[str] | None = None) -> int:
                        provider=args.provider,
                        model=args.model,
                        fixture_responses=args.fixture_responses or None,
+                        routing_config=args.routing_config or None,
+                        quality_floor=args.quality_floor,
                    )
                    _write_json(result.to_dict())
                else:
--- a/src/infospace_bench/generator.py
+++ b/src/infospace_bench/generator.py
@@ -427,6 +427,8 @@ def run_generation(
    provider: str = "fixture",
    model: str = "",
    fixture_responses: str | Path | None = None,
+    routing_config: str | Path | None = None,
+    quality_floor: float | None = None,
    resume: bool = False,
    force: bool = False,
 ) -> GenerationRunResult:
@@ -449,7 +451,14 @@ def run_generation(
    started_wall = datetime.now(timezone.utc)
    monotonic_start = _monotonic()
    adapter = (
-        _adapter_for(provider, model=model, fixture_responses=fixture_responses)
+        _adapter_for(
+            provider,
+            model=model,
+            fixture_responses=fixture_responses,
+            routing_config=routing_config,
+            quality_floor=quality_floor,
+            workspace=_workspace_for(root_path),
+        )
        if workflow_ids
        else None
    )
@@ -551,14 +560,42 @@ def _adapter_for(
    *,
    model: str,
    fixture_responses: str | Path | None,
+    routing_config: str | Path | None = None,
+    quality_floor: float | None = None,
+    workspace: Path | None = None,
 ) -> AssistedGenerationAdapter:
    if fixture_responses:
        return FixtureAssistedGenerationAdapter.from_file(Path(fixture_responses))
    if provider == "openrouter":
        return OpenRouterAssistedGenerationAdapter(model=model)
+    if provider == "routing":
+        if not routing_config:
+            raise InfospaceError(
+                "missing_routing_config",
+                "--provider routing requires --routing-config <path>",
+                {"provider": provider},
+            )
+        from .routing import RoutingAssistedGenerationAdapter
+        from .routing_config import (
+            build_routing_policy_from_config,
+            load_routing_config,
+        )
+
+        config = load_routing_config(routing_config)
+        policy = build_routing_policy_from_config(config, workspace=workspace)
+        effective_floor = (
+            quality_floor
+            if quality_floor is not None
+            else config.default_quality_floor
+        )
+        return RoutingAssistedGenerationAdapter(
+            policy=policy,
+            stage_to_task_type=dict(config.stage_to_task_type),
+            quality_floor=effective_floor,
+        )
    raise InfospaceError(
        "missing_assisted_generation_adapter",
-        "Assisted generation requires --fixture-responses or --provider openrouter",
+        "Assisted generation requires --fixture-responses, --provider openrouter, or --provider routing",
        {"provider": provider},
    )

--- a/src/infospace_bench/routing.py
+++ b/src/infospace_bench/routing.py
@@ -112,7 +112,11 @@ def _identify_adapter(adapter: LLMAdapter) -> str:
    adapter_id = getattr(adapter, "adapter_id", "")
    if adapter_id:
        return str(adapter_id)
-    model = getattr(adapter, "model", "") or getattr(adapter, "model_name", "")
+    model = (
+        getattr(adapter, "model", "")
+        or getattr(adapter, "model_name", "")
+        or getattr(adapter, "_model", "")
+    )
    name = type(adapter).__name__
    if model:
        return f"{name}:{model}"
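
The effect of adding the `_model` fallback can be seen in a standalone copy of the lookup order. This is a sketch under assumptions: `identify_adapter` only mirrors the lines visible in this hunk, `PrivateModelAdapter` is a hypothetical stand-in for an adapter that stores its model on a private attribute, and the final `return name` is a guess at what the real function does when no model is found (the hunk ends before that point).

```python
def identify_adapter(adapter: object) -> str:
    """Illustrative copy of the attribute lookup order shown in the hunk."""
    adapter_id = getattr(adapter, "adapter_id", "")
    if adapter_id:
        return str(adapter_id)
    # Try the public attribute names first, then the private `_model`.
    model = (
        getattr(adapter, "model", "")
        or getattr(adapter, "model_name", "")
        or getattr(adapter, "_model", "")
    )
    name = type(adapter).__name__
    if model:
        return f"{name}:{model}"
    # Assumption: fall back to the bare class name (not visible in the diff).
    return name


class PrivateModelAdapter:
    """Hypothetical adapter that only exposes its model as `_model`."""

    def __init__(self) -> None:
        self._model = "openai/gpt-4o-mini"


label = identify_adapter(PrivateModelAdapter())
```

Without the third `getattr`, the same object would be labelled with its class name only, so the report and budget buckets could not name the routed model.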
--- a/tests/test_budget_registry.py
+++ b/tests/test_budget_registry.py
@@ -522,7 +522,7 @@ def test_emit_token_event_calls_poster_with_record_token_payload(tmp_path: Path)
    assert result["status"] == "emitted"
    assert len(calls) == 1
    url, payload, timeout = calls[0]
-    assert url == "http://hub.example/state/token-events"
+    assert url == "http://hub.example/token-events/"
    assert payload["tokens_in"] == 1200
    assert payload["tokens_out"] == 400
    assert payload["model"] == "openai/gpt-4o-mini"
--- a/tests/test_openrouter_live.py
+++ b/tests/test_openrouter_live.py
@@ -208,3 +208,87 @@ def test_openrouter_one_chapter_smoke(tmp_path: Path) -> None:
        and item.get("provenance", {}).get("provider_metadata", {}).get("request_id")
    ]
    assert generated_with_metadata, "generated artifacts should carry provider_metadata.request_id"
+
+
+_LIVE_ROUTING_REASON = (
+    "set INFOSPACE_BENCH_ENABLE_LIVE_OPENROUTER=1 and OPENROUTER_API_KEY to run "
+    "the optional one-chapter routing smoke against OpenRouter"
+)
+
+
+@pytest.mark.skipif(not (_LIVE_OPT_IN and _LIVE_API_KEY), reason=_LIVE_ROUTING_REASON)
+def test_provider_routing_one_chapter_live_smoke(tmp_path: Path) -> None:
+    """Live smoke: one chapter through --provider routing against OpenRouter.
+
+    Uses a minimal one-candidate-per-task-type routing config so the test
+    spends roughly the same as the static OpenRouter smoke. Asserts the run
+    completes, the routing bridge recorded adapter_id / task_type on
+    provider_metadata, and the per-stage adapter-choices report section
+    reflects routed choices.
+    """
+    book = _build_fixture_epub(tmp_path / "lefevre.epub")
+    model = os.environ.get("INFOSPACE_BENCH_LIVE_MODEL", "openai/gpt-4o-mini")
+
+    routing_config = tmp_path / "routing.yaml"
+    routing_config.write_text(
+        yaml.safe_dump(
+            {
+                "schema_version": 1,
+                "stage_to_task_type": {
+                    "summarize-source": "cheap",
+                    "extract-entities": "cheap",
+                    "extract-relations": "cheap",
+                    "evaluate-entity": "cheap",
+                    "synthesize-report": "cheap",
+                },
+                "task_types": {
+                    "cheap": {
+                        "candidates": [
+                            {
+                                "id": f"openrouter:{model}",
+                                "provider": "openrouter",
+                                "model": model,
+                                "api_key_env": "OPENROUTER_API_KEY",
+                            },
+                        ],
+                    },
+                },
+            },
+            sort_keys=False,
+        ),
+        encoding="utf-8",
+    )
+
+    infospace = init_generation_infospace(
+        tmp_path,
+        book,
+        "lefevre-live-routing",
+        name="Lefevre Live Routing",
+        profile="trading-literature",
+        chapter_filter=["I"],
+    )
+    plan_generation(infospace.root, cost_per_1k_tokens=0.5)
+    result = run_generation(
+        infospace.root,
+        provider="routing",
+        routing_config=routing_config,
+    )
+    status = status_generation(infospace.root)
+
+    assert result.status == "completed"
+    assert status["source_chunk_count"] == 1
+    assert status["entity_count"] >= 1
+
+    report = (infospace.root / "reports" / "generation-summary.md").read_text(encoding="utf-8")
+    assert "## Per-stage adapter choices" in report
+    assert model in report, "report should name the routed model"
+
+    # The routing bridge writes adapter_id + task_type onto provider_metadata.
+    index = yaml.safe_load((infospace.root / "artifacts" / "index.yaml").read_text(encoding="utf-8"))
+    routed_artifacts = [
+        item
+        for item in index["artifacts"]
+        if item["kind"] in {"entity", "relation", "generated"}
+        and (item.get("provenance") or {}).get("provider_metadata", {}).get("adapter_id")
+    ]
+    assert routed_artifacts, "routed artifacts must carry adapter_id provenance"
--- a/tests/test_routing_cli.py
+++ b/tests/test_routing_cli.py
@@ -0,0 +1,286 @@
+"""
+Tests for the routing CLI flags (IB-WP-0020-T03).
+
+Three levels:
+- _adapter_for("routing") unit checks — missing config, happy path
+- run_generation end-to-end through --provider routing with a stubbed
+  OpenRouterAdapter.execute_prompt so no network is required
+- CLI subprocess smoke that proves the new flags are wired
+"""
+
+from __future__ import annotations
+
+import json
+import os
+import subprocess
+import sys
+import zipfile
+from pathlib import Path
+
+import pytest
+import yaml
+
+from infospace_bench.errors import InfospaceError
+from infospace_bench.generator import (
+    _adapter_for,
+    init_generation_infospace,
+    run_generation,
+    status_generation,
+)
+from infospace_bench.routing import RoutingAssistedGenerationAdapter
+
+
+FIXTURE_ROOT = Path(__file__).parent / "fixtures" / "lefevre"
+
+
+def _build_fixture_epub(target: Path) -> Path:
+    sources = FIXTURE_ROOT / "sources"
+    layout: dict[str, str] = {
+        "mimetype": "application/epub+zip",
+        "META-INF/container.xml": (sources / "container.xml").read_text(encoding="utf-8"),
+    }
+    for source in sorted(sources.glob("*.xhtml")):
+        layout[f"OEBPS/{source.name}"] = source.read_text(encoding="utf-8")
+    layout["OEBPS/content.opf"] = (sources / "content.opf").read_text(encoding="utf-8")
+    with zipfile.ZipFile(target, "w") as archive:
+        for path_in_zip, contents in layout.items():
+            archive.writestr(path_in_zip, contents)
+    return target
+
+
+def _write_routing_config(path: Path, *, ledger_relpath: str | None = None) -> None:
+    """Minimal routing config that maps every fixture stage to one cheap candidate."""
+    data: dict = {
+        "schema_version": 1,
+        "stage_to_task_type": {
+            "summarize-source": "cheap",
+            "extract-entities": "cheap",
+            "extract-relations": "cheap",
+            "evaluate-entity": "cheap",
+            "synthesize-report": "cheap",
+        },
+        "task_types": {
+            "cheap": {
+                "candidates": [
+                    {
+                        "id": "openrouter:gpt-4o-mini",
+                        "provider": "openrouter",
+                        "model": "openai/gpt-4o-mini",
+                        "api_key_env": "OPENROUTER_API_KEY",
+                    },
+                ],
+            },
+        },
+    }
+    if ledger_relpath is not None:
+        data["ledger_path"] = ledger_relpath
+    path.write_text(yaml.safe_dump(data, sort_keys=False), encoding="utf-8")
+
+
+def test_adapter_for_routing_missing_config_raises() -> None:
+    with pytest.raises(InfospaceError) as exc_info:
+        _adapter_for("routing", model="", fixture_responses=None, routing_config=None)
+    assert exc_info.value.code == "missing_routing_config"
+
+
+def test_adapter_for_routing_returns_bridge(tmp_path: Path, monkeypatch) -> None:
+    monkeypatch.setenv("OPENROUTER_API_KEY", "sk-fake-test-key")
+    config_path = tmp_path / "routing.yaml"
+    _write_routing_config(config_path)
+
+    adapter = _adapter_for(
+        "routing",
+        model="",
+        fixture_responses=None,
+        routing_config=config_path,
+        workspace=tmp_path,
+    )
+
+    assert isinstance(adapter, RoutingAssistedGenerationAdapter)
+    assert adapter.stage_to_task_type["summarize-source"] == "cheap"
+
+
+_FIXTURE_RESPONSES = {
+    "summarize-source": "# Source Summary\n\nFixture summary content.\n",
+    "extract-entities": (
+        "# Stub Entity\n\n"
+        "## Category\n\nstrategy\n\n"
+        "## Definition\n\nA stub trading concept for the routing CLI smoke.\n"
+    ),
+    "extract-relations": (
+        "# Stub Entity Practices Tape Reading\n\n"
+        "## Subject\n\nStub Entity\n\n"
+        "## Predicate\n\npractices\n\n"
+        "## Object\n\nTape Reading\n\n"
+        "## Relation Type\n\nstrategy_outcome\n\n"
+        "## Evidence\n\nFixture evidence.\n"
+    ),
+    "evaluate-entity": (
+        "---\n"
+        "artifact_id: entity/stub-entity.md\n"
+        "evaluator: fixture\n"
+        "evaluated_at: '2026-05-18T00:00:00'\n"
+        "scores:\n"
+        "  - name: groundedness\n    value: 4.0\n    max_value: 5.0\n"
+        "  - name: lesson_clarity\n    value: 4.0\n    max_value: 5.0\n"
+        "  - name: historical_context\n    value: 4.0\n    max_value: 5.0\n"
+        "  - name: overgeneralization_risk\n    value: 4.0\n    max_value: 5.0\n"
+        "---\n\n"
+        "# Evaluation: entity/stub-entity.md\n"
+    ),
+    "synthesize-report": "# Routed Report\n\nFixture report.\n",
+}
+
+
+def _stub_openrouter_execute(self, prompt, config):
+    """Replacement for OpenRouterAdapter.execute_prompt that returns canned content.
+
+    Identifies the stage from the rendered template's H1 line (templates
+    start with ``# Extract Entities`` / ``# Extract Relations`` / ``# Evaluate
+    ...`` / ``# Synthesize ...``; anything else is treated as the
+    summarize-source stage).
+    """
+    from llm_connect.models import LLMResponse
+
+    first_line = prompt.lstrip().splitlines()[0] if prompt.strip() else ""
+    lower = first_line.lower()
+    if lower.startswith("# extract") and "entit" in lower:
+        content = _FIXTURE_RESPONSES["extract-entities"]
+    elif lower.startswith("# extract") and "relation" in lower:
+        content = _FIXTURE_RESPONSES["extract-relations"]
+    elif lower.startswith("# evaluate"):
+        content = _FIXTURE_RESPONSES["evaluate-entity"]
+    elif lower.startswith("# synthesize"):
+        content = _FIXTURE_RESPONSES["synthesize-report"]
+    else:
+        content = _FIXTURE_RESPONSES["summarize-source"]
+    return LLMResponse(
+        content=content,
+        model=getattr(self, "_model", "openai/gpt-4o-mini"),
+        usage={"prompt_tokens": len(prompt.split()), "completion_tokens": 40},
+        finish_reason="stop",
+        metadata={"request_id": "or-stub-1"},
+    )
+
+
+def test_run_generation_via_routing_provider_completes_end_to_end(
+    tmp_path: Path, monkeypatch
+) -> None:
+    monkeypatch.setenv("OPENROUTER_API_KEY", "sk-fake-test-key")
+    from llm_connect.openrouter import OpenRouterAdapter
+
+    monkeypatch.setattr(
+        OpenRouterAdapter, "execute_prompt", _stub_openrouter_execute, raising=True
+    )
+
+    book = _build_fixture_epub(tmp_path / "lefevre.epub")
+    config_path = tmp_path / "routing.yaml"
+    _write_routing_config(config_path)
+
+    infospace = init_generation_infospace(
+        tmp_path,
+        book,
+        "lefevre-routing-smoke",
+        name="Lefevre Routing Smoke",
+        profile="trading-literature",
+        chapter_filter=["I"],
+    )
+    result = run_generation(
+        infospace.root,
+        provider="routing",
+        routing_config=config_path,
+    )
+    status = status_generation(infospace.root)
+
+    assert result.status == "completed"
+    assert status["source_chunk_count"] == 1
+    assert status["entity_count"] >= 1
+    assert status["evaluation_count"] >= 1
+
+    report = (infospace.root / "reports" / "generation-summary.md").read_text(encoding="utf-8")
+    assert "## Per-stage adapter choices" in report
+    assert "openai/gpt-4o-mini" in report  # adapter_id ends with the model
+
+    # Budget usage rollup should bucket calls by the routed model.
+    import yaml as _yaml
+
+    usage = _yaml.safe_load((infospace.root / "output" / "budget" / "usage.yaml").read_text(encoding="utf-8"))
+    bucket_models = {b["model"] for b in usage["runs"][0]["per_bucket"]}
+    assert "openai/gpt-4o-mini" in bucket_models
+
+
+def test_from_source_cli_provider_routing(tmp_path: Path, monkeypatch) -> None:
+    book = _build_fixture_epub(tmp_path / "lefevre.epub")
+    config_path = tmp_path / "routing.yaml"
+    _write_routing_config(config_path)
+
+    env = os.environ.copy()
+    env["PYTHONPATH"] = os.pathsep.join(p for p in ("src", env.get("PYTHONPATH", "")) if p)
+
+    # Missing API key → fast fail from the loader, no subprocess crash.
+    env.pop("OPENROUTER_API_KEY", None)
+    bad = subprocess.run(
+        [
+            sys.executable,
+            "-m",
+            "infospace_bench",
+            "generate",
+            "from-source",
+            str(book),
+            "--workspace",
+            str(tmp_path),
+            "--slug",
+            "routing-cli-missing-key",
+            "--name",
+            "Routing CLI Missing Key",
+            "--profile",
+            "trading-literature",
+            "--provider",
+            "routing",
+            "--routing-config",
+            str(config_path),
+            "--chapter",
+            "I",
+            "--apply",
+        ],
+        check=False,
+        env=env,
+        text=True,
+        capture_output=True,
+    )
+    assert bad.returncode != 0
+    assert "missing_routing_api_key" in (bad.stdout + bad.stderr)
+
+
+def test_run_via_routing_resolves_workspace_relative_ledger(
+    tmp_path: Path, monkeypatch
+) -> None:
+    monkeypatch.setenv("OPENROUTER_API_KEY", "sk-fake-test-key")
+    from llm_connect.openrouter import OpenRouterAdapter
+
+    monkeypatch.setattr(
+        OpenRouterAdapter, "execute_prompt", _stub_openrouter_execute, raising=True
+    )
+
+    book = _build_fixture_epub(tmp_path / "lefevre.epub")
+    config_path = tmp_path / "routing.yaml"
+    _write_routing_config(config_path, ledger_relpath="output/routing/quality.jsonl")
+
+    infospace = init_generation_infospace(
+        tmp_path,
+        book,
+        "lefevre-routing-ledger",
+        name="Lefevre Routing Ledger",
+        profile="trading-literature",
+        chapter_filter=["I"],
+    )
+    run_generation(
+        infospace.root,
+        provider="routing",
+        routing_config=config_path,
+        quality_floor=0.7,
+    )
+
+    # ledger_path is relative to the workspace (tmp_path), not the infospace root.
+    ledger_path = tmp_path / "output" / "routing" / "quality.jsonl"
+    assert ledger_path.parent.is_dir(), "loader must create the ledger parent dir"
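
The ledger test above relies on two behaviours: a relative `ledger_path` is resolved against the workspace, and the loader creates the parent directory. A minimal sketch of that resolution follows. `resolve_ledger_path` is a hypothetical helper, not the loader's real function, and passing absolute paths through unchanged is an assumption.

```python
import tempfile
from pathlib import Path


def resolve_ledger_path(ledger_path: str, workspace: Path) -> Path:
    """Hypothetical helper: anchor a relative ledger path at the workspace."""
    candidate = Path(ledger_path)
    # Assumption: absolute paths are used as given.
    resolved = candidate if candidate.is_absolute() else workspace / candidate
    # Create the parent directory so the first append does not fail.
    resolved.parent.mkdir(parents=True, exist_ok=True)
    return resolved


with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    ledger = resolve_ledger_path("output/routing/quality.jsonl", workspace)
    parent_created = ledger.parent.is_dir()
    under_workspace = ledger == workspace / "output" / "routing" / "quality.jsonl"
```

The ledger file itself is not created here; like the test, the sketch only guarantees that its parent directory exists.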
--- a/tests/test_routing_config.py
+++ b/tests/test_routing_config.py
@@ -412,6 +412,25 @@ def test_build_routing_policy_claude_code_needs_no_api_key() -> None:
    assert isinstance(policy.rules[0].prefer, ClaudeCodeAdapter)


+def test_example_trading_literature_config_parses() -> None:
+    """Regression: the shipped example config must parse cleanly."""
+    from infospace_bench.routing_config import load_routing_config
+
+    example_path = Path(__file__).resolve().parent.parent / "examples" / "routing" / "trading-literature.yaml"
+
+    config = load_routing_config(example_path)
+
+    task_type_names = {task.task_type for task in config.task_types}
+    assert {"cheap", "smart", "judge", "baseline"} <= task_type_names
+    assert config.default_quality_floor == 0.80
+    # Each shipped stage maps to a task type the config actually declares.
+    for stage, task_type in config.stage_to_task_type.items():
+        assert task_type in task_type_names, f"stage {stage!r} maps to undeclared task type {task_type!r}"
+    # baseline is included so a T05 ShadowingAdapter wiring can reference it.
+    baseline = next(t for t in config.task_types if t.task_type == "baseline")
+    assert baseline.candidates[0].provider == "claude_code"
+
+
 def test_build_routing_policy_honours_custom_api_key_env() -> None:
    from infospace_bench.routing_config import build_routing_policy_from_config
    from llm_connect.openrouter import OpenRouterAdapter
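
The stage-mapping check in `test_example_trading_literature_config_parses` can also be expressed against a plain dict, for example to lint a routing file before a run. This is a sketch only: `undeclared_task_types` is a hypothetical helper, and it works on the raw YAML mapping shape shown in the example config instead of the object returned by `load_routing_config`.

```python
from __future__ import annotations


def undeclared_task_types(config: dict) -> dict[str, str]:
    """Hypothetical helper: return stages whose task type is not declared."""
    declared = set(config.get("task_types", {}))
    return {
        stage: task_type
        for stage, task_type in config.get("stage_to_task_type", {}).items()
        if task_type not in declared
    }


# Same shape as the example config, trimmed to the keys the check needs.
good_config = {
    "stage_to_task_type": {"summarize-source": "cheap", "evaluate-entity": "judge"},
    "task_types": {"cheap": {"candidates": []}, "judge": {"candidates": []}},
}
# Dropping the "judge" declaration leaves one stage pointing at nothing.
broken_config = {
    "stage_to_task_type": {"summarize-source": "cheap", "evaluate-entity": "judge"},
    "task_types": {"cheap": {"candidates": []}},
}

good_result = undeclared_task_types(good_config)
broken_result = undeclared_task_types(broken_config)
```

Returning the offending stage-to-type pairs makes it easy to build an error message like the one in the test's assertion.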
--- a/workplans/IB-WP-0020-provider-routing-cli.md
+++ b/workplans/IB-WP-0020-provider-routing-cli.md
@@ -117,7 +117,7 @@ state_hub_task_id: "5e38514b-ad6a-4d39-8716-f812f241d9fd"

 ```task
 id: IB-WP-0020-T03
-status: todo
+status: done
 priority: high
 state_hub_task_id: "fe5888e0-da33-413a-b026-71ed811b8c73"
 ```
@@ -138,7 +138,7 @@ state_hub_task_id: "fe5888e0-da33-413a-b026-71ed811b8c73"

 ```task
 id: IB-WP-0020-T04
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "69288131-f265-4db5-a4b0-b0c8a6f55dd8"
 ```