From a95322051fdeaf01a34517833183171b2c22d044 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Mon, 18 May 2026 13:50:26 +0200
Subject: [PATCH] IB-WP-0020: provider routing CLI workplan (todo)

Open a workplan that turns the IB-WP-0018 RoutingAssistedGenerationAdapter
bridge into a first-class CLI option. Adds --provider routing, a YAML
routing config schema, --quality-floor and --shadow-rate /
--shadow-baseline opt-in flags so a real multi-chapter Lefevre live run
can use adaptive cost-quality routing without writing any Python.

Workstream registered with state-hub
(172bb082-610a-477b-b5e0-26c9f4bdfd95) with five tasks:
- T01 routing config file schema (medium)
- T02 routing config loader (high)
- T03 --provider routing + --routing-config + --quality-floor CLI flags (high)
- T04 example config + optional live routing smoke test (medium)
- T05 --shadow-rate / --shadow-baseline opt-in flags (medium)

Depends on IB-WP-0018 (already done) and LLM-WP-0004 (already done in
~/llm-connect).

Co-Authored-By: Claude Opus 4.7
---
 workplans/IB-WP-0020-provider-routing-cli.md | 211 +++++++++++++++++++
 1 file changed, 211 insertions(+)
 create mode 100644 workplans/IB-WP-0020-provider-routing-cli.md

diff --git a/workplans/IB-WP-0020-provider-routing-cli.md b/workplans/IB-WP-0020-provider-routing-cli.md
new file mode 100644
index 0000000..609857b
--- /dev/null
+++ b/workplans/IB-WP-0020-provider-routing-cli.md
@@ -0,0 +1,211 @@
+---
+id: IB-WP-0020
+type: workplan
+title: "Provider Routing CLI Integration"
+domain: markitect
+repo: infospace-bench
+status: todo
+owner: markitect
+topic_slug: markitect
+created: "2026-05-18"
+updated: "2026-05-18"
+depends_on_workplans:
+  - IB-WP-0018
+  - LLM-WP-0004
+related_workplans:
+  - IB-WP-0016
+  - IB-WP-0019
+state_hub_workstream_slug: "ib-wp-0020-provider-routing-cli"
+state_hub_workstream_id: "172bb082-610a-477b-b5e0-26c9f4bdfd95"
+---
+
+# IB-WP-0020 — Provider Routing CLI Integration
+
+## Goal
+
+Expose `RoutingAssistedGenerationAdapter` (IB-WP-0018) as a first-class
+CLI option so a real multi-chapter or full-book run can use the
+adaptive router without writing any Python. Today `--provider` accepts
+`fixture` and `openrouter`; this workplan adds `routing`, plus a small
+config file that names the rules, the ledger, the quality floors, and
+the per-stage task-type overrides.
+
+The end state is a single command that does cost-aware adaptive
+routing across multiple OpenRouter models and writes back the
+per-stage adapter choices, the budget log, and (optionally) sampled
+shadow grades:
+
+```bash
+infospace-bench generate from-source ./LEFEVRE.epub \
+  --workspace ./infospaces \
+  --slug reminiscences-routed \
+  --name "Reminiscences (Routed)" \
+  --profile trading-literature \
+  --provider routing \
+  --routing-config ./routing.yaml \
+  --chapter I \
+  --apply
+```
+
+## Why this is a separate workplan
+
+`IB-WP-0018` shipped the bridge module and its programmatic API. CLI
+wiring needs its own config-file schema, its own loader, its own error
+surfaces, and its own end-to-end smoke test — and that is enough scope
+to justify a separate review surface rather than absorbing it into the
+already-closed IB-WP-0018.
+
+## Non-Goals
+
+- Owning the routing policy primitives (those live in
+  `llm-connect` LLM-WP-0004).
+- Replacing the static `openrouter` provider — that path stays usable
+  for callers who do not want the router.
+- Embedding model selection logic inside the CLI; the config file is
+  declarative and routing decisions stay with `AdaptiveRoutingPolicy`.
+
+## Tasks
+
+### T01 — Routing config file schema
+
+```task
+id: IB-WP-0020-T01
+status: todo
+priority: medium
+state_hub_task_id: "39597441-22ab-4dcf-b68d-b045823a9374"
+```
+
+- Define a small YAML schema for a routing config:
+  - `quality_floor: ` (global default)
+  - `ledger_path: ` (relative to workspace by default)
+  - `task_types`: map of task_type to a list of candidate adapters,
+    each with `id`, `provider` (`openrouter`, `claude_code`,
+    `openai`, …), `model`, `api_key_env`, optional `max_cost_per_1k`,
+    optional `quality_floor` override
+  - `stage_to_task_type`: optional override map
+- Document the schema in `docs/routing-config.md` with two annotated
+  examples (one OpenRouter-only, one ClaudeCode-as-baseline +
+  OpenRouter candidates).
+- Tests: schema parses; missing fields default cleanly; unknown
+  providers raise a focused error.
+
+### T02 — Routing config loader
+
+```task
+id: IB-WP-0020-T02
+status: todo
+priority: high
+state_hub_task_id: "5e38514b-ad6a-4d39-8716-f812f241d9fd"
+```
+
+- Add `src/infospace_bench/routing_config.py` (or extend
+  `routing.py`) with `load_routing_config(path, *, workspace)` that
+  returns a `RoutingPolicy` (or `AdaptiveRoutingPolicy` when the
+  config sets `quality_floor` or names a ledger) ready to hand to
+  `RoutingAssistedGenerationAdapter`.
+- Provider construction:
+  - `openrouter` → llm-connect `OpenRouterAdapter` with API key from
+    `api_key_env` (default `OPENROUTER_API_KEY`)
+  - `claude_code` → llm-connect `ClaudeCodeAdapter`
+  - others (openai, gemini) supported but explicitly documented as
+    untested for production use
+- Tests: builds a static policy from a minimal config; builds an
+  adaptive policy with a ledger; missing API key raises before any
+  network call.
+
+### T03 — `--provider routing` and `--routing-config` CLI flags
+
+```task
+id: IB-WP-0020-T03
+status: todo
+priority: high
+state_hub_task_id: "fe5888e0-da33-413a-b026-71ed811b8c73"
+```
+
+- Add `routing` to the `--provider` choices on `generate run`,
+  `generate resume`, and `generate from-source`.
+- Add `--routing-config ` (required when `--provider routing`).
+- Add `--quality-floor ` to override the config-level floor at
+  the call site (handy for tightening or loosening for a single run
+  without editing the file).
+- Wire the loader into `_adapter_for`/`run_generation` so a
+  `RoutingAssistedGenerationAdapter` is constructed and passed to the
+  workflow engine.
+- Tests: CLI smoke that builds a routing config pointing at mocked
+  adapter ids and confirms the run goes through the bridge.
+
+### T04 — Example config and live-smoke wiring
+
+```task
+id: IB-WP-0020-T04
+status: todo
+priority: medium
+state_hub_task_id: "69288131-f265-4db5-a4b0-b0c8a6f55dd8"
+```
+
+- Add `examples/routing/trading-literature.yaml` with a realistic
+  Lefevre-aimed config: cheap model for summaries, mid model for
+  entities/relations, ClaudeCode baseline behind a shadow sampler.
+- Update the optional live-OpenRouter smoke test
+  (`tests/test_openrouter_live.py`) with a parallel skipped test that
+  exercises `--provider routing` end-to-end when both
+  `OPENROUTER_API_KEY` and
+  `INFOSPACE_BENCH_ENABLE_LIVE_OPENROUTER=1` are set.
+- Document how to run the live routing smoke in
+  `docs/generic-source-generator.md`.
+
+### T05 — Shadow-mode opt-in flag
+
+```task
+id: IB-WP-0020-T05
+status: todo
+priority: medium
+state_hub_task_id: "02658420-056c-4d73-8055-e6a7ab51876b"
+```
+
+- Add `--shadow-rate ` and `--shadow-baseline ` flags so a
+  caller can enable `wrap_with_shadow_sampling()` for an entire run
+  without editing the config file. When set, the loader wraps each
+  candidate adapter in `ShadowingAdapter` with the named baseline and
+  the chosen rate.
+- Tests: monkeypatched baseline asserts the shadow path fires at
+  `shadow_rate=1.0` and skips at `shadow_rate=0.0`.
+
+## Acceptance
+
+- `infospace-bench generate from-source ... --provider routing
+  --routing-config ` succeeds against the deterministic Lefevre
+  fixture with a hand-crafted routing config and mocked adapters.
+- The generation report's `## Per-stage adapter choices` section
+  reflects the routed choices, and `output/budget/usage.yaml` buckets
+  reflect the actual model that ran each call.
+- The static `openrouter` and `fixture` provider paths remain
+  unchanged.
+- An optional live smoke test exists and is gated identically to the
+  IB-WP-0016 OpenRouter smoke.
+- Documentation explains the config shape, the API-key resolution, and
+  the difference between adaptive routing and shadow-mode sampling.
+
+## Risks and open questions
+
+- **Adapter constructor surface.** llm-connect's adapter constructors
+  vary slightly per provider; the loader needs to keep a small but
+  explicit allowlist of provider names rather than reflective magic.
+- **API key plumbing.** Today `openrouter` reads
+  `OPENROUTER_API_KEY` directly. The config will name the env var
+  explicitly to make multi-key setups workable; no key material
+  belongs in the config file itself.
+- **Schema versioning.** Bump `schema_version` from day one so the
+  loader can refuse mismatched configs once the shape stabilises.
+- **Shadow grader choice.** v1 will default the shadow grader to
+  `ExactMatchJudge` because it has no extra cost. `LLMJudge` and
+  `EmbeddingSimilarityJudge` configuration belongs in a follow-up.
+
+## Downstream effects
+
+- `infospace-bench routing ledger ` (already shipped via
+  IB-WP-0018) becomes the natural companion CLI for inspecting the
+  observations the routed runs accumulate.
+- A successful T03 + T04 lets us run a multi-chapter Lefevre live
+  build using the adaptive router and validate the IB-WP-0016
+  reviewer checklist on real output without single-model lock-in.
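
Note on T01/T02 (not part of the patch): the validation rules the workplan lists for the schema and the loader can be sketched in plain Python before any llm-connect adapter is constructed. This is a minimal sketch under stated assumptions, not the planned `load_routing_config`. The names `parse_routing_config`, `Candidate`, `RoutingConfigError`, `ALLOWED_PROVIDERS`, and `DEFAULT_API_KEY_ENV` are hypothetical, and so are the model names. The input is assumed to be the dict a YAML parser would return. The sketch also assumes that a per-candidate `quality_floor` wins over the global or CLI-supplied floor, and that providers without a named or default `api_key_env` need no key.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from typing import Any, Mapping

# Hypothetical allowlist; the workplan names these providers but not a final set.
ALLOWED_PROVIDERS = {"openrouter", "claude_code", "openai", "gemini"}

# Only the OpenRouter default is stated in the workplan.
DEFAULT_API_KEY_ENV = {"openrouter": "OPENROUTER_API_KEY"}


class RoutingConfigError(ValueError):
    """Raised for any problem found while validating a routing config."""


@dataclass(frozen=True)
class Candidate:
    id: str
    provider: str
    model: str
    api_key_env: str | None
    quality_floor: float | None
    max_cost_per_1k: float | None


def parse_routing_config(
    raw: Mapping[str, Any],
    *,
    quality_floor_override: float | None = None,
    env: Mapping[str, str] | None = None,
) -> dict[str, Any]:
    """Validate an already-parsed config dict; performs no network calls."""
    env = os.environ if env is None else env
    # A call-site floor (e.g. from --quality-floor) replaces the config-level one.
    floor = quality_floor_override
    if floor is None:
        floor = raw.get("quality_floor")

    task_types: dict[str, list[Candidate]] = {}
    for task_type, entries in (raw.get("task_types") or {}).items():
        candidates = []
        for entry in entries:
            for field in ("id", "provider", "model"):
                if field not in entry:
                    raise RoutingConfigError(f"{task_type}: candidate is missing {field!r}")
            provider = entry["provider"]
            if provider not in ALLOWED_PROVIDERS:
                raise RoutingConfigError(f"{task_type}: unknown provider {provider!r}")
            key_env = entry.get("api_key_env", DEFAULT_API_KEY_ENV.get(provider))
            # Fail early: the key must be present before any adapter is built.
            if key_env is not None and not env.get(key_env):
                raise RoutingConfigError(
                    f"{task_type}/{entry['id']}: environment variable {key_env} is not set"
                )
            candidates.append(
                Candidate(
                    id=entry["id"],
                    provider=provider,
                    model=entry["model"],
                    api_key_env=key_env,
                    quality_floor=entry.get("quality_floor", floor),
                    max_cost_per_1k=entry.get("max_cost_per_1k"),
                )
            )
        task_types[task_type] = candidates

    return {
        "quality_floor": floor,
        "ledger_path": raw.get("ledger_path"),
        "stage_to_task_type": dict(raw.get("stage_to_task_type") or {}),
        "task_types": task_types,
    }


# Placeholder config; the model names are illustrative only.
example = {
    "quality_floor": 0.7,
    "task_types": {
        "summary": [
            {"id": "cheap", "provider": "openrouter", "model": "example/cheap-model"},
            {"id": "baseline", "provider": "claude_code", "model": "example-baseline",
             "quality_floor": 0.9},
        ]
    },
}
config = parse_routing_config(example, env={"OPENROUTER_API_KEY": "placeholder"})
```

Passing `env` explicitly keeps the "missing API key raises before any network call" test free of monkeypatching. A real loader would presumably map each validated `Candidate` to the matching llm-connect adapter through the same explicit allowlist.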