Add adaptive cost-quality routing primitives

2026-05-17 21:32:27 +02:00
parent bf86a03c5d
commit c4ad4bb9f2
17 changed files with 2480 additions and 25 deletions
--- a/docs/infospace-bench-adaptive-routing.md
+++ b/docs/infospace-bench-adaptive-routing.md
@@ -0,0 +1,83 @@
+# Infospace-Bench Adaptive Routing Guide
+
+This guide shows how a consumer such as `infospace-bench` can wire task-type
+stages into the adaptive cost-quality primitives from `llm-connect`.
+
+## Stage taxonomy
+
+The consumer owns task names and quality thresholds. A first pass for
+`infospace-bench` could use:
+
+| Stage | Task type | Suggested floor |
+|-------|-----------|-----------------|
+| Source chapter summary | `summarize-source` | `0.82` |
+| Entity extraction | `extract-entities` | `0.88` |
+| Relation extraction | `extract-relations` | `0.86` |
+| Entity evaluation | `evaluate-entity` | `0.90` |
+| Report synthesis | `synthesize-report` | `0.92` |
+
+These floors are starting points, not library defaults. Raise them for stages
+whose errors compound downstream.
+
+## Wiring sketch
+
+```python
+from llm_connect.grading import ExactMatchJudge, PairedGrader
+from llm_connect.quality import QualityLedger
+from llm_connect.routing import AdaptiveRoutingPolicy, RoutingRule
+from llm_connect.shadowing import ShadowingAdapter
+
+ledger = QualityLedger("quality-ledger.jsonl")
+grader = PairedGrader(ExactMatchJudge())
+
+baseline = claude_code_adapter  # adapters are assumed to be constructed elsewhere
+cheap = openrouter_cheap_adapter
+mid = openrouter_mid_adapter
+
+shadowed_cheap = ShadowingAdapter(
+    candidate_adapter=cheap,
+    baseline_adapter=baseline,
+    grader=grader,
+    ledger=ledger,
+    task_type="extract-relations",
+    adapter_id="openrouter-cheap",
+    baseline_adapter_id="claude-code",
+    shadow_rate=0.1,
+    tags={"prompt_fingerprint": prompt_fingerprint},
+)
+
+policy = AdaptiveRoutingPolicy(
+    rules=[
+        RoutingRule("extract-relations", prefer=baseline, fallback=mid),
+    ],
+    ledger=ledger,
+    adapters_by_id={
+        "openrouter-cheap": shadowed_cheap,
+        "openrouter-mid": mid,
+        "claude-code": baseline,
+    },
+    window_size=20,
+    min_observations=3,
+)
+
+adapter = policy.resolve("extract-relations", quality_floor=0.86)
+response = adapter.execute_prompt(prompt, run_config)
+```
+
+## Operating loop
+
+1. Start with static routing to the trusted baseline or mid-tier adapter.
+2. Wrap cheaper candidates with `ShadowingAdapter` at a conservative
+   `shadow_rate`, for example `0.05` to `0.1`.
+3. Record a prompt fingerprint or template version in `tags` so later prompt
+   changes do not mix incompatible observations.
+4. Increase `min_observations` for stages with high variance.
+5. Let `AdaptiveRoutingPolicy` select the cheapest adapter that clears each
+   stage floor.
+
+## Refresh rules
+
+When a provider model, prompt template, or parser contract changes, treat prior
+observations as a different regime. Write to a new ledger, prune old
+observations, or filter with a new `prompt_fingerprint` tag before trusting
+adaptive selection again.