From cb37a7f408703999a89232ef1fd2ab701115ffd4 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Sun, 17 May 2026 17:26:36 +0200
Subject: [PATCH] IB-WP-0018: stub workplan for adaptive LLM routing consumer wiring

Blocked stub that names the dependency on llm-connect WP-0004 (adaptive
cost-quality routing). Activates once T01..T03 of that workplan land and
the QualityLedger / BaselineGrader / AdaptiveRoutingPolicy APIs are
stable.

Co-Authored-By: Claude Opus 4.7
---
 ...B-WP-0018-adaptive-llm-routing-consumer.md | 100 ++++++++++++++++++
 1 file changed, 100 insertions(+)
 create mode 100644 workplans/IB-WP-0018-adaptive-llm-routing-consumer.md

diff --git a/workplans/IB-WP-0018-adaptive-llm-routing-consumer.md b/workplans/IB-WP-0018-adaptive-llm-routing-consumer.md
new file mode 100644
index 0000000..f2799a2
--- /dev/null
+++ b/workplans/IB-WP-0018-adaptive-llm-routing-consumer.md
@@ -0,0 +1,100 @@
+---
+id: IB-WP-0018
+type: workplan
+title: "Adaptive LLM Routing — infospace-bench Consumer Wiring"
+domain: markitect
+repo: infospace-bench
+status: blocked
+owner: markitect
+topic_slug: markitect
+created: "2026-05-17"
+updated: "2026-05-17"
+depends_on_workplans:
+  - LLM-WP-0004
+related_workplans:
+  - IB-WP-0016
+---
+
+# IB-WP-0018 — Adaptive LLM Routing — infospace-bench Consumer Wiring
+
+## Goal
+
+Wire `infospace-bench` workflow stages to llm-connect's adaptive
+cost-quality routing once `LLM-WP-0004` ships the primitives. The goal
+is to let an infospace generation run pick the cheapest model that
+clears a per-stage quality bar — for example, a small/cheap model for
+chunk summarisation and a larger model for entity/relation extraction
+— without hardcoding any specific model in `infospace-bench` itself.
+
+This workplan is a stub until `LLM-WP-0004` tasks T01..T03 (ledger,
+grader, adaptive policy) are done in `llm-connect`. The exact task
+list will be refined once that API is stable.
+
+## Status
+
+Blocked on `LLM-WP-0004` T01..T03.
+
+## Why this is a separate workplan
+
+`IB-WP-0016` brings the Lefevre EPUB pipeline to a state where a
+chapter-by-chapter live OpenRouter run is feasible. That work uses
+`OpenRouterAssistedGenerationAdapter` directly. Replacing that direct
+adapter with a task-typed adaptive route is a meaningful architectural
+shift that deserves its own scope, baseline, and tests, rather than
+being absorbed into IB-WP-0016.
+
+## Provisional Tasks (refined when LLM-WP-0004 lands)
+
+### T01 — Task-type taxonomy
+
+- Name the generation stages as task types for routing
+  (`summarize-source`, `extract-entities`, `extract-relations`,
+  `evaluate-entity`, `synthesize-report`)
+- Document quality expectations for each task type so a per-stage
+  quality floor can be set
+
+### T02 — Adapter swap
+
+- Introduce a small router-aware adapter that wraps
+  `AdaptiveRoutingPolicy.resolve(task_type)` and exposes the existing
+  `AssistedGenerationAdapter` protocol used by `workflow.py`
+- Keep `OpenRouterAssistedGenerationAdapter` available as the static
+  baseline so deterministic test runs and fixture mode continue to work
+
+### T03 — Baseline + shadow integration
+
+- Use `ClaudeCodeAdapter` as the default baseline grader (subject to
+  availability)
+- Enable `ShadowingAdapter` for the first multi-chapter run so the
+  quality ledger fills up while real generation proceeds
+
+### T04 — Cost/quality reporting
+
+- Surface per-stage chosen adapter, observed quality, and cumulative
+  cost in `reports/generation-summary.md`
+- Add a small CLI helper to print the ledger summary for an infospace
+
+### T05 — Tests
+
+- Fixture-backed test that routes through a deterministic adaptive
+  policy with mocked observations
+- Regression test that demonstrates the static path still works when
+  the router is bypassed
+
+## Acceptance
+
+- An infospace generation run can be configured to use the adaptive
+  router without any code change inside `workflow.py`
+- A multi-chapter Lefevre run completes with per-stage adapter choices
+  recorded in the generation summary
+- The fixture-mode test suite continues to pass with no live calls
+- The static `OpenRouterAssistedGenerationAdapter` path remains usable
+  for callers that opt out of the router
+
+## Non-Goals
+
+- Authoring the routing primitives themselves (that is `LLM-WP-0004`'s
+  job)
+- Owning a task-type taxonomy beyond `infospace-bench` workflow stages
+- Embedding cost or quality observations inside `infospace-bench`
+  beyond what the llm-connect ledger already records
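
Note on T02: the adapter swap might look roughly like the sketch below. Only the names `AdaptiveRoutingPolicy.resolve(task_type)`, `AssistedGenerationAdapter`, and the task-type strings come from the workplan. Everything else is an assumption made for illustration, since the llm-connect API is not yet stable: the `generate(prompt)` method, the idea that `resolve` returns an adapter, and the `EchoAdapter`, `StaticPolicy`, and `RouterAwareAdapter` classes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Protocol


class AssistedGenerationAdapter(Protocol):
    """Hypothetical shape of the protocol consumed by workflow.py."""

    def generate(self, prompt: str) -> str: ...


class RoutingPolicy(Protocol):
    """Assumed shape of AdaptiveRoutingPolicy: task type in, adapter out."""

    def resolve(self, task_type: str) -> AssistedGenerationAdapter: ...


@dataclass
class EchoAdapter:
    """Stand-in adapter for illustration; makes no live calls."""

    model: str

    def generate(self, prompt: str) -> str:
        return f"[{self.model}] {prompt}"


@dataclass
class StaticPolicy:
    """Deterministic stand-in for an adaptive policy (fixture mode)."""

    routes: dict[str, AssistedGenerationAdapter]
    fallback: AssistedGenerationAdapter

    def resolve(self, task_type: str) -> AssistedGenerationAdapter:
        return self.routes.get(task_type, self.fallback)


@dataclass
class RouterAwareAdapter:
    """Binds a policy to one task type and exposes the adapter protocol."""

    policy: RoutingPolicy
    task_type: str

    def generate(self, prompt: str) -> str:
        # Resolve on every call so the policy may change its choice over time.
        return self.policy.resolve(self.task_type).generate(prompt)


policy = StaticPolicy(
    routes={"summarize-source": EchoAdapter("small-model")},
    fallback=EchoAdapter("large-model"),
)
summarizer = RouterAwareAdapter(policy, "summarize-source")
extractor = RouterAwareAdapter(policy, "extract-entities")
print(summarizer.generate("chapter 1"))
print(extractor.generate("chapter 1"))
```

Under these assumptions the workflow code only ever calls `generate`, so swapping a static adapter for a routed one would be a configuration change rather than an edit to `workflow.py`. That is the property the first acceptance criterion asks for.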