---
id: IB-WP-0018
type: workplan
title: "Adaptive LLM Routing — infospace-bench Consumer Wiring"
domain: markitect
repo: infospace-bench
status: blocked
owner: markitect
topic_slug: markitect
created: "2026-05-17"
updated: "2026-05-17"
depends_on_workplans:
  - LLM-WP-0004
related_workplans:
  - IB-WP-0016
---

# IB-WP-0018 — Adaptive LLM Routing — infospace-bench Consumer Wiring

## Goal

Wire `infospace-bench` workflow stages to llm-connect's adaptive cost-quality
routing once `LLM-WP-0004` ships the primitives. The goal is to let an infospace
generation run pick the cheapest model that clears a per-stage quality bar (for
example, a small/cheap model for chunk summarisation and a larger model for
entity/relation extraction) without hardcoding any specific model in
`infospace-bench` itself.

This workplan is a stub until `LLM-WP-0004` tasks T01..T03 (ledger, grader,
adaptive policy) are done in `llm-connect`. The exact task list will be refined
once that API is stable.

## Status

Blocked on `LLM-WP-0004` T01..T03.

## Why this is a separate workplan

`IB-WP-0016` brings the Lefevre EPUB pipeline to a state where a
chapter-by-chapter live OpenRouter run is feasible. That work uses
`OpenRouterAssistedGenerationAdapter` directly. Replacing that direct adapter
with a task-typed adaptive route is a meaningful architectural shift that
deserves its own scope, baseline, and tests, rather than being absorbed into
`IB-WP-0016`.

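To make the shift from a direct adapter to a task-typed route concrete, here is a minimal sketch of what a router-aware wrapper might look like. It is illustrative only: the `generate(prompt)` method, the `RoutingPolicy` protocol, and the `EchoAdapter` and `FixedPolicy` stand-ins are all hypothetical, because the real `AssistedGenerationAdapter` protocol lives in `workflow.py` and the `AdaptiveRoutingPolicy` API is not stable until `LLM-WP-0004` lands.

```python
from dataclasses import dataclass
from typing import Protocol


class AssistedGenerationAdapter(Protocol):
    """Hypothetical stand-in for the protocol used by workflow.py."""

    def generate(self, prompt: str) -> str: ...


class RoutingPolicy(Protocol):
    """Hypothetical stand-in for AdaptiveRoutingPolicy (LLM-WP-0004)."""

    def resolve(self, task_type: str) -> AssistedGenerationAdapter: ...


@dataclass
class RouterAwareAdapter:
    """Exposes the routed adapter for one task type behind the same protocol."""

    policy: RoutingPolicy
    task_type: str

    def generate(self, prompt: str) -> str:
        # Resolve on every call so the policy is free to change its choice
        # between calls; the caller never names a concrete model.
        chosen = self.policy.resolve(self.task_type)
        return chosen.generate(prompt)


@dataclass
class EchoAdapter:
    """Deterministic fake adapter: no live calls, output tagged by name."""

    name: str

    def generate(self, prompt: str) -> str:
        return f"{self.name}:{prompt}"


@dataclass
class FixedPolicy:
    """Deterministic fake policy: a fixed task-type to adapter table."""

    table: dict[str, AssistedGenerationAdapter]

    def resolve(self, task_type: str) -> AssistedGenerationAdapter:
        return self.table[task_type]


policy = FixedPolicy({
    "summarize-source": EchoAdapter("small"),
    "extract-entities": EchoAdapter("large"),
})
summarizer = RouterAwareAdapter(policy, "summarize-source")
extractor = RouterAwareAdapter(policy, "extract-entities")

print(summarizer.generate("chapter 1"))  # small:chapter 1
print(extractor.generate("chapter 1"))   # large:chapter 1
```

Because the wrapper satisfies the same protocol as the static adapter, a workflow stage could receive either one without knowing which it got. That is the property the acceptance criteria below depend on.
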
## Provisional Tasks (refined when LLM-WP-0004 lands)

### T01 — Task-type taxonomy

- Name the generation stages as task types for routing (`summarize-source`,
  `extract-entities`, `extract-relations`, `evaluate-entity`,
  `synthesize-report`)
- Document quality expectations for each task type so a per-stage quality floor
  can be set

### T02 — Adapter swap

- Introduce a small router-aware adapter that wraps
  `AdaptiveRoutingPolicy.resolve(task_type)` and exposes the existing
  `AssistedGenerationAdapter` protocol used by `workflow.py`
- Keep `OpenRouterAssistedGenerationAdapter` available as the static baseline so
  deterministic test runs and fixture mode continue to work

### T03 — Baseline + shadow integration

- Use `ClaudeCodeAdapter` as the default baseline grader (subject to
  availability)
- Enable `ShadowingAdapter` for the first multi-chapter run so the quality
  ledger fills up while real generation proceeds

### T04 — Cost/quality reporting

- Surface per-stage chosen adapter, observed quality, and cumulative cost in
  `reports/generation-summary.md`
- Add a small CLI helper to print the ledger summary for an infospace

### T05 — Tests

- Fixture-backed test that routes through a deterministic adaptive policy with
  mocked observations
- Regression test that demonstrates the static path still works when the router
  is bypassed

## Acceptance

- An infospace generation run can be configured to use the adaptive router
  without any code change inside `workflow.py`
- A multi-chapter Lefevre run completes with per-stage adapter choices recorded
  in the generation summary
- The fixture-mode test suite continues to pass with no live calls
- The static `OpenRouterAssistedGenerationAdapter` path remains usable for
  callers that opt out of the router

## Non-Goals

- Authoring the routing primitives themselves (that is `LLM-WP-0004`'s job)
- Owning a task-type taxonomy beyond `infospace-bench` workflow stages
- Embedding cost or quality observations inside `infospace-bench` beyond what
  the llm-connect ledger already records
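
For orientation only, the T01 task types and the "cheapest model that clears a per-stage quality bar" rule from the Goal can be sketched as below. The real selection logic belongs to `llm-connect` (`LLM-WP-0004`), not to this repo. The `Candidate` type, the `cheapest_above_floor` helper, and every cost, quality, and floor number here are hypothetical placeholders, not measured values or a proposed API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class TaskType(str, Enum):
    """The workflow stages named in T01, as routing task types."""

    SUMMARIZE_SOURCE = "summarize-source"
    EXTRACT_ENTITIES = "extract-entities"
    EXTRACT_RELATIONS = "extract-relations"
    EVALUATE_ENTITY = "evaluate-entity"
    SYNTHESIZE_REPORT = "synthesize-report"


@dataclass(frozen=True)
class Candidate:
    """Hypothetical view of one adapter's ledger summary for a task type."""

    name: str
    cost: float     # relative cost, arbitrary units (placeholder)
    quality: float  # observed quality in [0, 1] (placeholder)


def cheapest_above_floor(
    candidates: Iterable[Candidate], floor: float
) -> Optional[Candidate]:
    """Return the lowest-cost candidate whose quality meets the floor."""
    eligible = [c for c in candidates if c.quality >= floor]
    # default=None covers the case where nothing clears the floor.
    return min(eligible, key=lambda c: c.cost, default=None)


# Placeholder observations and floors, for illustration only.
candidates = [
    Candidate("small-model", cost=1.0, quality=0.7),
    Candidate("large-model", cost=10.0, quality=0.9),
]
floors = {
    TaskType.SUMMARIZE_SOURCE: 0.6,
    TaskType.EXTRACT_ENTITIES: 0.8,
}

for task_type, floor in floors.items():
    choice = cheapest_above_floor(candidates, floor)
    print(task_type.value, "->", choice.name if choice else "no candidate")
```

With these placeholder numbers, summarisation resolves to the small model and entity extraction to the large one, which matches the example given in the Goal. A floor that no candidate clears returns `None`, leaving the fallback decision to the caller.
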