llm-connect/contracts/functional/adaptive-routing-policy.md
tegwick c4ad4bb9f2
Add adaptive cost-quality routing primitives
2026-05-17 21:32:27 +02:00

Contract: AdaptiveRoutingPolicy

layer: Functional | maturity: Beta | module: llm_connect.routing | since: WP-0004

Purpose

Select the cheapest adapter whose observed mean quality for a task type clears a caller-supplied quality floor. The policy builds on RoutingPolicy: static rules remain the cold-start and failure fallback, while adaptive selection is used only when the ledger has enough qualifying observations.

Public surface

@dataclass
class AdaptiveRoutingPolicy(RoutingPolicy):
    ledger: Optional[QualityLedger] = None
    adapters_by_id: Mapping[str, LLMAdapter] = field(default_factory=dict)
    window_size: int = 20
    min_observations: int = 1
    max_age: Optional[timedelta] = None

    def resolve(
        self,
        task_type: str,
        estimated_cost_per_1k: Optional[float] = None,
        *,
        quality_floor: Optional[float] = None,
    ) -> LLMAdapter: ...

Candidate identity

Observations are keyed by (task_type, adapter_id). Callers should pass adapters_by_id so the policy can map ledger observations back to concrete LLMAdapter instances. If a static rule adapter is not present in adapters_by_id, the policy also checks common string attributes adapter_id, id, and name.
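
The identity lookup described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the helper name candidate_id, the identity-based reverse lookup, and the order in which the attributes are tried are all assumptions.

```python
from typing import Any, Mapping, Optional


def candidate_id(adapter: Any, adapters_by_id: Mapping[str, Any]) -> Optional[str]:
    """Return the ledger key for an adapter, or None if it cannot be identified.

    Hypothetical helper: the real policy may resolve identities differently.
    """
    # Prefer the explicit mapping, matching by object identity.
    for adapter_id, known in adapters_by_id.items():
        if known is adapter:
            return adapter_id
    # Fall back to common string attributes (attribute order is an assumption).
    for attr in ("adapter_id", "id", "name"):
        value = getattr(adapter, attr, None)
        if isinstance(value, str) and value:
            return value
    return None
```

Passing adapters_by_id explicitly avoids relying on the attribute fallback, which only works when the adapter class happens to expose one of those names.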

Invariants

  1. If quality_floor is None or ledger is None, resolution is exactly the same as RoutingPolicy.resolve().
  2. quality_floor must be between 0 and 1, inclusive.
  3. Each candidate is evaluated over the newest window_size observations for the requested task_type and adapter id.
  4. max_age, when provided, filters out observations older than that age.
  5. A candidate is considered only when it has at least min_observations after filtering.
  6. A candidate qualifies when its mean quality_score is greater than or equal to quality_floor.
  7. Among qualifying candidates, the policy chooses the one with the lowest mean observed cost_usd.
  8. If mean observed cost ties exactly, the policy prefers the matching static rule's explicit prefer adapter.
  9. If there are still ties, stable candidate order is used.
  10. If no candidate qualifies, resolution falls through to RoutingPolicy.resolve(task_type, estimated_cost_per_1k).
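
Invariants 3 to 9 can be read as a small selection procedure. The sketch below is one possible reading under stated assumptions: Observation and pick_adapter are hypothetical names, the ledger is modelled as a plain list, the window is taken before the age filter, and returning None stands in for "fall through to the static rules".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import fmean
from typing import Optional, Sequence


@dataclass(frozen=True)
class Observation:  # hypothetical stand-in for a ledger row
    adapter_id: str
    quality_score: float
    cost_usd: float
    observed_at: datetime


def pick_adapter(
    observations: Sequence[Observation],
    candidates: Sequence[str],
    quality_floor: float,
    *,
    now: datetime,
    window_size: int = 20,
    min_observations: int = 1,
    max_age: Optional[timedelta] = None,
    preferred: Optional[str] = None,
) -> Optional[str]:
    """Return the cheapest qualifying adapter id, or None if none qualifies."""
    best_id: Optional[str] = None
    best_cost: Optional[float] = None
    for adapter_id in candidates:  # iteration order is the final tie-break
        rows = sorted(
            (o for o in observations if o.adapter_id == adapter_id),
            key=lambda o: o.observed_at,
        )[-window_size:]  # newest window_size observations
        if max_age is not None:
            rows = [o for o in rows if now - o.observed_at <= max_age]
        if len(rows) < max(1, min_observations):
            continue  # not enough evidence for this candidate
        if fmean([o.quality_score for o in rows]) < quality_floor:
            continue  # mean quality does not clear the floor
        cost = fmean([o.cost_usd for o in rows])
        if (
            best_cost is None
            or cost < best_cost
            or (cost == best_cost and adapter_id == preferred)
        ):
            best_id, best_cost = adapter_id, cost
    return best_id
```

A caller would treat a None result as the signal to use the static RoutingPolicy rules, matching invariant 10.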

Sample-size and freshness trade-off

Small window_size values react quickly to model or prompt changes but can be noisy. Larger windows are more stable but may preserve stale behavior after a provider update or prompt template change. min_observations lets callers avoid acting on a single lucky sample, while max_age bounds how long old observations can influence routing. Callers that change prompts materially should also record a prompt fingerprint in observation tags and filter on it, so that samples from different prompt versions are not treated as comparable within the same ledger regime.
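
To make the trade-off concrete, the following sketch compares two window sizes on an invented score history in which quality drops after a provider update. The numbers and the windowed_mean helper are illustrative assumptions, not data from a real ledger.

```python
from statistics import fmean

# Hypothetical history, oldest first: 15 good samples, then 5 poor ones
# recorded after a provider update.
scores = [0.9] * 15 + [0.4] * 5


def windowed_mean(values, window_size):
    """Mean of the newest window_size values."""
    return fmean(values[-window_size:])


small_window = windowed_mean(scores, 5)   # sees only the post-update samples
large_window = windowed_mean(scores, 20)  # still dominated by older samples
```

With a quality floor of 0.7, the 5-sample window (mean about 0.40) would already disqualify the adapter, while the 20-sample window (mean about 0.775) would keep routing to it until more poor samples accumulate.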

Error contract

| Condition | Exception |
| --- | --- |
| quality_floor outside 0..1 | ValueError |
| window_size <= 0 | ValueError |
| min_observations <= 0 | ValueError |
| max_age < 0 | ValueError |
| No qualifying adaptive candidate and no static fallback | LookupError |
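
The argument checks in this contract could be expressed along these lines. The function name validate_policy_args and the error messages are assumptions; only the conditions and exception types come from the contract above.

```python
from datetime import timedelta
from typing import Optional


def validate_policy_args(
    quality_floor: Optional[float],
    window_size: int,
    min_observations: int,
    max_age: Optional[timedelta],
) -> None:
    """Raise ValueError for arguments the contract rejects (hypothetical helper)."""
    # None means "no adaptive selection", so only a supplied floor is range-checked.
    if quality_floor is not None and not 0.0 <= quality_floor <= 1.0:
        raise ValueError("quality_floor must be between 0 and 1, inclusive")
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    if min_observations <= 0:
        raise ValueError("min_observations must be positive")
    if max_age is not None and max_age < timedelta(0):
        raise ValueError("max_age must not be negative")
```

The LookupError case is not covered here because it depends on the ledger contents and static rules at resolution time rather than on argument values.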

Non-goals

The policy does not define a task taxonomy, set task quality floors, decide which baseline is authoritative, or perform billing-grade accounting. Those are consumer policy choices.