From d51d6303e28134c96b4a107a2738bb39f92b7501 Mon Sep 17 00:00:00 2001
From: Bernd Worsch
Date: Wed, 1 Apr 2026 22:34:00 +0000
Subject: [PATCH] =?UTF-8?q?feat:=20WP-0003=20=E2=80=94=20RoutingPolicy=20(?=
 =?UTF-8?q?FR-2)=20and=20HTTP=20serve=20mode=20(FR-1)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

FR-2 RoutingPolicy:
- RoutingPolicy + RoutingRule dataclasses in llm_connect/routing.py
- resolve(task_type, estimated_cost_per_1k=None) with cost-cap fallback
- Exported from llm_connect.__init__; contract doc at
  contracts/functional/routing-policy.md
- 11 tests covering rule match, cost-cap, fallback, unknown type, no-match

FR-1 HTTP serve mode:
- LLMServer in llm_connect/server.py (stdlib http.server, zero extra deps)
- POST /execute + GET /health; CLI via python -m llm_connect.server
- [server] optional-dep group added to pyproject.toml
- Contract doc at contracts/functional/server.md
- 9 tests: health, round-trip, 400/404/500 errors, config forwarding
- Added "mock" provider to factory for CLI default

All 101 tests pass.

Co-Authored-By: Claude Sonnet 4.6
---
 contracts/functional/routing-policy.md        |  53 ++++++
 contracts/functional/server.md                |  85 +++++++++
 llm_connect/__init__.py                       |   5 +
 llm_connect/factory.py                        |   3 +-
 llm_connect/routing.py                        |  89 ++++++++++
 llm_connect/server.py                         | 164 ++++++++++++++++++
 pyproject.toml                                |   2 +
 tests/test_factory.py                         |   2 +-
 tests/test_routing.py                         |  91 ++++++++++
 tests/test_server.py                          | 134 ++++++++++++++
 ...m-connect-WP-0003-functional-extensions.md |  24 +--
 11 files changed, 638 insertions(+), 14 deletions(-)
 create mode 100644 contracts/functional/routing-policy.md
 create mode 100644 contracts/functional/server.md
 create mode 100644 llm_connect/routing.py
 create mode 100644 llm_connect/server.py
 create mode 100644 tests/test_routing.py
 create mode 100644 tests/test_server.py

diff --git a/contracts/functional/routing-policy.md b/contracts/functional/routing-policy.md
new file mode 100644
index 0000000..fdcd648
--- /dev/null
+++ b/contracts/functional/routing-policy.md
@@ -0,0 +1,53 @@
+# Contract: RoutingPolicy
+
+**layer:** Functional
+**maturity:** Beta
+**module:** `llm_connect.routing`
+**since:** WP-0003
+
+## Purpose
+
+Route logical task types to concrete `LLMAdapter` instances based on a
+prioritised rule list, with optional per-rule cost-cap fallback.
+
+## Public surface
+
+```python
+@dataclass
+class RoutingRule:
+    task_type: str
+    prefer: LLMAdapter
+    max_cost_per_1k: Optional[float] = None  # USD per 1 000 tokens
+    fallback: Optional[LLMAdapter] = None
+
+@dataclass
+class RoutingPolicy:
+    rules: List[RoutingRule] = field(default_factory=list)
+    default: Optional[LLMAdapter] = None
+
+    def resolve(
+        self,
+        task_type: str,
+        estimated_cost_per_1k: Optional[float] = None,
+    ) -> LLMAdapter: ...
+```
+
+## Invariants
+
+1. Rules are evaluated in list order; the first rule whose `task_type` matches wins.
+2. When `estimated_cost_per_1k` is supplied and a matching rule has `max_cost_per_1k` set:
+   - If `estimated_cost_per_1k > max_cost_per_1k` **and** `fallback is not None` → return `fallback`.
+   - Otherwise → return `prefer` (no fallback configured or cost within cap).
+3. When no rule matches and `default is not None` → return `default`.
+4. When no rule matches and `default is None` → raise `LookupError`.
+5. `resolve()` never mutates policy state.
+
+## Error contract
+
+| Condition | Exception |
+|-----------|-----------|
+| No matching rule, no default | `LookupError` |
+
+## Known consumers
+
+- `inter-hub` (IHUB-WP-0012 Phase 11): uses `RoutingPolicy` to select federation adapters per task class.
diff --git a/contracts/functional/server.md b/contracts/functional/server.md
new file mode 100644
index 0000000..b60cf35
--- /dev/null
+++ b/contracts/functional/server.md
@@ -0,0 +1,85 @@
+# Contract: HTTP Serve Mode
+
+**layer:** Functional
+**maturity:** Beta
+**module:** `llm_connect.server`
+**since:** WP-0003
+
+## Purpose
+
+Expose any `LLMAdapter` as a lightweight HTTP service. Intended for
+local/inter-process use; not hardened for public internet exposure.
+
+## API endpoints
+
+### `GET /health`
+
+Liveness probe.
+
+**Response 200**
+
+```json
+{"status": "ok"}
+```
+
+---
+
+### `POST /execute`
+
+Execute a prompt through the configured adapter.
+
+**Request body** (JSON)
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `prompt` | string | yes | Prompt text |
+| `config` | object | no | `RunConfig` overrides (see below) |
+
+`config` sub-fields (all optional, defaults match `RunConfig` defaults):
+
+| Field | Type | Default |
+|-------|------|---------|
+| `model_name` | string | `"gpt-4"` |
+| `temperature` | float | `0.7` |
+| `max_tokens` | int | `2000` |
+| `timeout_seconds` | int | `300` |
+
+**Response 200** — `LLMResponse.to_dict()` shape
+
+```json
+{
+  "content": "...",
+  "model": "...",
+  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
+  "finish_reason": "stop",
+  "metadata": {}
+}
+```
+
+**Error responses**
+
+| HTTP | Condition |
+|------|-----------|
+| 400 | Missing `prompt` field or invalid JSON body |
+| 404 | Unknown path |
+| 500 | Adapter raised an exception |
+
+## Implementation notes
+
+- Uses Python stdlib `http.server` — **no additional runtime dependency**.
+- The `[server]` optional-dependency group is reserved for future migration
+  to `aiohttp`/`starlette` if native async serving is required.
+- `LLMServer(adapter, port=0)` binds to an OS-assigned free port; read back
+  via `server.port` after `start()`.
+
+## CLI
+
+```
+python -m llm_connect.server [--host HOST] [--port PORT] [--provider PROVIDER] [--model MODEL]
+```
+
+Default provider: `mock`. All registered providers from `create_adapter` are valid.
+
+## Known consumers
+
+- `inter-hub` (IHUB-WP-0012 Phase 11): drives federation calls over HTTP from non-Python services.
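Reviewer note (not part of the patch): the `POST /execute` contract in `server.md` could be exercised from a client roughly as sketched here. This is a minimal sketch under stated assumptions, not code from this commit: `build_execute_request` and `describe_status` are hypothetical helper names, and the base URL assumes the documented default host and port. The sketch only builds the request; nothing is sent.

```python
import json
import urllib.request


def build_execute_request(base_url: str, prompt: str, **config) -> urllib.request.Request:
    """Build (but do not send) a POST /execute request shaped like the contract.

    Hypothetical helper: keyword arguments become the optional `config` object
    (model_name, temperature, max_tokens, timeout_seconds per server.md).
    """
    body = {"prompt": prompt}
    if config:
        body["config"] = config
    return urllib.request.Request(
        base_url.rstrip("/") + "/execute",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def describe_status(status: int) -> str:
    """Map a status code to the condition listed in the error table of server.md."""
    return {
        200: "ok",
        400: "missing `prompt` field or invalid JSON body",
        404: "unknown path",
        500: "adapter raised an exception",
    }.get(status, "undocumented status")


# Assumed base URL: the CLI defaults documented in the patch (127.0.0.1:8080).
request = build_execute_request("http://127.0.0.1:8080", "say hello", max_tokens=100)
```

Against a running server, passing `request` to `urllib.request.urlopen` should return the `LLMResponse.to_dict()` shape on success; error statuses surface as `urllib.error.HTTPError`, whose `code` can be fed to `describe_status`.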
diff --git a/llm_connect/__init__.py b/llm_connect/__init__.py
index fc74c1f..f4502b2 100644
--- a/llm_connect/__init__.py
+++ b/llm_connect/__init__.py
@@ -33,6 +33,8 @@ from llm_connect.embedding_adapter import EmbeddingAdapter
 from llm_connect.embedding_openai import OpenAICompatibleEmbeddingAdapter
 from llm_connect.embedding_cache import EmbeddingCache
 from llm_connect.embedding_factory import create_embedding_adapter
+from llm_connect.routing import RoutingPolicy, RoutingRule
+from llm_connect.server import LLMServer
 from llm_connect.similarity import (
     cosine_similarity,
     similarity_matrix,
@@ -67,4 +69,7 @@ __all__ = [
     "cosine_similarity",
     "similarity_matrix",
     "find_similar_pairs",
+    "RoutingPolicy",
+    "RoutingRule",
+    "LLMServer",
 ]
diff --git a/llm_connect/factory.py b/llm_connect/factory.py
index cca9ae9..0df8146 100644
--- a/llm_connect/factory.py
+++ b/llm_connect/factory.py
@@ -13,6 +13,7 @@ _PROVIDERS: Dict[str, str] = {
     "claude-code": "llm_connect.claude_code.ClaudeCodeAdapter",
     "gemini": "llm_connect.gemini.GeminiAdapter",
     "openai": "llm_connect.openai.OpenAIAdapter",
+    "mock": "llm_connect.adapter.MockLLMAdapter",
 }
 
 
@@ -57,4 +58,4 @@ def create_adapter(
     elif provider == "claude-code":
         return cls(model=model, **kwargs)
     else:
-        return cls(**kwargs)  # pragma: no cover
+        return cls(**kwargs)
diff --git a/llm_connect/routing.py b/llm_connect/routing.py
new file mode 100644
index 0000000..8f39957
--- /dev/null
+++ b/llm_connect/routing.py
@@ -0,0 +1,89 @@
+"""
+RoutingPolicy — task-type-aware adapter selection (FR-2).
+
+Maps task types to preferred adapters with optional cost-cap fallback.
+"""
+
+from dataclasses import dataclass, field
+from typing import Optional, List
+
+from llm_connect.adapter import LLMAdapter
+
+
+@dataclass
+class RoutingRule:
+    """Single routing rule binding a task type to an adapter.
+
+    Attributes:
+        task_type: Logical task identifier (e.g. ``"triage"``, ``"summarise"``).
+        prefer: Adapter to use when this rule matches.
+        max_cost_per_1k: Optional cost ceiling (USD per 1 000 tokens). When the
+            caller supplies ``estimated_cost_per_1k`` to :meth:`RoutingPolicy.resolve`
+            and it exceeds this cap, *fallback* is returned instead of *prefer*.
+        fallback: Adapter to use when the cost cap is breached.
+    """
+
+    task_type: str
+    prefer: LLMAdapter
+    max_cost_per_1k: Optional[float] = None
+    fallback: Optional[LLMAdapter] = None
+
+
+@dataclass
+class RoutingPolicy:
+    """Route task types to LLM adapters.
+
+    Rules are evaluated in order; the first match wins. When no rule matches,
+    *default* is returned. If *default* is also absent, ``LookupError`` is raised.
+
+    Example::
+
+        policy = RoutingPolicy(
+            rules=[
+                RoutingRule("triage", prefer=fast_adapter, max_cost_per_1k=0.5, fallback=cheap_adapter),
+                RoutingRule("analysis", prefer=smart_adapter),
+            ],
+            default=cheap_adapter,
+        )
+        adapter = policy.resolve("triage")
+    """
+
+    rules: List[RoutingRule] = field(default_factory=list)
+    default: Optional[LLMAdapter] = None
+
+    def resolve(
+        self,
+        task_type: str,
+        estimated_cost_per_1k: Optional[float] = None,
+    ) -> LLMAdapter:
+        """Return the adapter for *task_type*.
+
+        Args:
+            task_type: Logical task identifier.
+            estimated_cost_per_1k: Caller-supplied cost estimate (USD / 1k tokens).
+                When provided and a matching rule has ``max_cost_per_1k`` set, the
+                rule's ``fallback`` is returned if the estimate exceeds the cap.
+
+        Returns:
+            The selected :class:`~llm_connect.adapter.LLMAdapter`.
+
+        Raises:
+            LookupError: No matching rule and no *default* configured.
+        """
+        for rule in self.rules:
+            if rule.task_type == task_type:
+                if (
+                    estimated_cost_per_1k is not None
+                    and rule.max_cost_per_1k is not None
+                    and estimated_cost_per_1k > rule.max_cost_per_1k
+                    and rule.fallback is not None
+                ):
+                    return rule.fallback
+                return rule.prefer
+
+        if self.default is not None:
+            return self.default
+
+        raise LookupError(
+            f"No routing rule for task_type={task_type!r} and no default configured"
+        )
diff --git a/llm_connect/server.py b/llm_connect/server.py
new file mode 100644
index 0000000..23525af
--- /dev/null
+++ b/llm_connect/server.py
@@ -0,0 +1,164 @@
+"""
+Minimal HTTP server for llm_connect — serve mode (FR-1).
+
+Exposes:
+    POST /execute — run a prompt through the configured adapter
+    GET /health — liveness probe
+
+Usage (programmatic)::
+
+    from llm_connect import MockLLMAdapter
+    from llm_connect.server import LLMServer
+
+    server = LLMServer(adapter=MockLLMAdapter(), port=8080)
+    server.start()  # background thread
+    # ...
+    server.stop()
+
+Usage (CLI)::
+
+    python -m llm_connect.server --port 8080 --provider openrouter --model anthropic/claude-sonnet-4
+"""
+
+import argparse
+import json
+import threading
+from http.server import BaseHTTPRequestHandler, HTTPServer
+from typing import Optional
+
+from llm_connect.adapter import LLMAdapter
+from llm_connect.models import RunConfig
+
+
+class _Handler(BaseHTTPRequestHandler):
+    """Request handler — adapter injected via server.adapter."""
+
+    def log_message(self, format, *args):  # suppress default access log
+        pass
+
+    # ── GET ────────────────────────────────────────────────────────
+
+    def do_GET(self):
+        if self.path == "/health":
+            self._respond(200, {"status": "ok"})
+        else:
+            self._respond(404, {"error": "not found"})
+
+    # ── POST ───────────────────────────────────────────────────────
+
+    def do_POST(self):
+        if self.path != "/execute":
+            self._respond(404, {"error": "not found"})
+            return
+
+        length = int(self.headers.get("Content-Length", 0))
+        raw = self.rfile.read(length)
+        try:
+            data = json.loads(raw)
+        except (json.JSONDecodeError, ValueError):
+            self._respond(400, {"error": "invalid JSON body"})
+            return
+
+        prompt = data.get("prompt")
+        if not prompt:
+            self._respond(400, {"error": "missing required field: 'prompt'"})
+            return
+
+        cfg = data.get("config", {})
+        config = RunConfig(
+            model_name=cfg.get("model_name", "gpt-4"),
+            temperature=float(cfg.get("temperature", 0.7)),
+            max_tokens=int(cfg.get("max_tokens", 2000)),
+            timeout_seconds=int(cfg.get("timeout_seconds", 300)),
+        )
+
+        try:
+            response = self.server.adapter.execute_prompt(prompt, config)  # type: ignore[attr-defined]
+            self._respond(200, response.to_dict())
+        except Exception as exc:
+            self._respond(500, {"error": str(exc)})
+
+    # ── helpers ────────────────────────────────────────────────────
+
+    def _respond(self, status: int, body: dict) -> None:
+        payload = json.dumps(body).encode()
+        self.send_response(status)
+        self.send_header("Content-Type", "application/json")
+        self.send_header("Content-Length", str(len(payload)))
+        self.end_headers()
+        self.wfile.write(payload)
+
+
+class LLMServer:
+    """HTTP server wrapping an :class:`~llm_connect.adapter.LLMAdapter`.
+
+    Args:
+        adapter: The adapter that handles ``POST /execute`` requests.
+        host: Bind address (default ``"127.0.0.1"``).
+        port: TCP port (default ``8080``; ``0`` picks a free port).
+    """
+
+    def __init__(
+        self,
+        adapter: LLMAdapter,
+        host: str = "127.0.0.1",
+        port: int = 8080,
+    ) -> None:
+        self._httpd = HTTPServer((host, port), _Handler)
+        self._httpd.adapter = adapter  # type: ignore[attr-defined]
+        self._thread: Optional[threading.Thread] = None
+
+    @property
+    def port(self) -> int:
+        """Actual bound port (useful when ``port=0`` was requested)."""
+        return self._httpd.server_address[1]
+
+    @property
+    def host(self) -> str:
+        return self._httpd.server_address[0]
+
+    def start(self) -> None:
+        """Start serving in a daemon background thread."""
+        self._thread = threading.Thread(target=self._httpd.serve_forever, daemon=True)
+        self._thread.start()
+
+    def stop(self) -> None:
+        """Shut down the server and join the background thread."""
+        self._httpd.shutdown()
+        if self._thread is not None:
+            self._thread.join()
+
+    def serve_forever(self) -> None:
+        """Block the calling thread until interrupted."""
+        self._httpd.serve_forever()
+
+
+# ── CLI entry point ────────────────────────────────────────────────────────────
+
+def _build_adapter(provider: str, model: Optional[str]) -> LLMAdapter:
+    from llm_connect.factory import create_adapter
+    return create_adapter(provider, model=model)
+
+
+def main(argv=None) -> None:
+    parser = argparse.ArgumentParser(
+        prog="python -m llm_connect.server",
+        description="Start llm_connect HTTP serve mode.",
+    )
+    parser.add_argument("--port", type=int, default=8080, help="TCP port (default: 8080)")
+    parser.add_argument("--host", default="127.0.0.1", help="Bind address (default: 127.0.0.1)")
+    parser.add_argument("--provider", default="mock", help="Provider name passed to create_adapter")
+    parser.add_argument("--model", default=None, help="Model name (optional)")
+    args = parser.parse_args(argv)
+
+    adapter = _build_adapter(args.provider, args.model)
+    server = LLMServer(adapter=adapter, host=args.host, port=args.port)
+    print(f"llm_connect server listening on http://{args.host}:{args.port}")
+    try:
+        server.serve_forever()
+    except KeyboardInterrupt:
+        print("\nShutting down.")
+
+
+if __name__ == "__main__":
+    main()
diff --git a/pyproject.toml b/pyproject.toml
index 224ef74..197b735 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -17,6 +17,8 @@ dev = [
     "ruff>=0.4",
     "mypy>=1.10",
 ]
+# serve mode uses stdlib http.server — no additional runtime dependency required
+server = []
 
 [tool.setuptools.packages.find]
 where = ["."]
diff --git a/tests/test_factory.py b/tests/test_factory.py
index af98a46..d34b543 100644
--- a/tests/test_factory.py
+++ b/tests/test_factory.py
@@ -66,7 +66,7 @@ class TestCreateAdapter:
         assert isinstance(adapter, ClaudeCodeAdapter)
 
     def test_all_known_providers_are_reachable(self):
-        known = {"openrouter", "openai", "gemini", "claude-code"}
+        known = {"openrouter", "openai", "gemini", "claude-code", "mock"}
         # Just verify each key is in the factory registry (no construction needed)
         from llm_connect.factory import _PROVIDERS
         assert known == set(_PROVIDERS.keys())
diff --git a/tests/test_routing.py b/tests/test_routing.py
new file mode 100644
index 0000000..f81903d
--- /dev/null
+++ b/tests/test_routing.py
@@ -0,0 +1,91 @@
+"""
+Tests for RoutingPolicy (FR-2).
+"""
+
+import pytest
+
+from llm_connect.routing import RoutingPolicy, RoutingRule
+from llm_connect.adapter import MockLLMAdapter
+
+
+class TestRoutingPolicy:
+    def _adapters(self, n: int = 3):
+        return [MockLLMAdapter(mock_response=f"resp-{i}") for i in range(n)]
+
+    def test_rule_match_returns_prefer(self):
+        prefer, *_ = self._adapters()
+        policy = RoutingPolicy(rules=[RoutingRule("triage", prefer=prefer)])
+        assert policy.resolve("triage") is prefer
+
+    def test_first_matching_rule_wins(self):
+        a, b = self._adapters(2)
+        policy = RoutingPolicy(rules=[
+            RoutingRule("triage", prefer=a),
+            RoutingRule("triage", prefer=b),
+        ])
+        assert policy.resolve("triage") is a
+
+    def test_cost_cap_within_limit_returns_prefer(self):
+        prefer, fallback = self._adapters(2)
+        policy = RoutingPolicy(rules=[
+            RoutingRule("triage", prefer=prefer, max_cost_per_1k=1.0, fallback=fallback)
+        ])
+        assert policy.resolve("triage", estimated_cost_per_1k=0.5) is prefer
+
+    def test_cost_cap_exceeded_returns_fallback(self):
+        prefer, fallback = self._adapters(2)
+        policy = RoutingPolicy(rules=[
+            RoutingRule("triage", prefer=prefer, max_cost_per_1k=1.0, fallback=fallback)
+        ])
+        assert policy.resolve("triage", estimated_cost_per_1k=2.0) is fallback
+
+    def test_cost_cap_exceeded_no_fallback_returns_prefer(self):
+        """When cost exceeds cap but no fallback is set, still return prefer."""
+        prefer, *_ = self._adapters()
+        policy = RoutingPolicy(rules=[
+            RoutingRule("triage", prefer=prefer, max_cost_per_1k=0.1)
+        ])
+        assert policy.resolve("triage", estimated_cost_per_1k=5.0) is prefer
+
+    def test_no_estimated_cost_ignores_cap(self):
+        prefer, fallback = self._adapters(2)
+        policy = RoutingPolicy(rules=[
+            RoutingRule("triage", prefer=prefer, max_cost_per_1k=0.01, fallback=fallback)
+        ])
+        # No cost estimate → cap not applied
+        assert policy.resolve("triage") is prefer
+
+    def test_unknown_task_type_returns_default(self):
+        prefer, default = self._adapters(2)
+        policy = RoutingPolicy(
+            rules=[RoutingRule("triage", prefer=prefer)],
+            default=default,
+        )
+        assert policy.resolve("unknown") is default
+
+    def test_no_match_no_default_raises_lookup_error(self):
+        prefer, *_ = self._adapters()
+        policy = RoutingPolicy(rules=[RoutingRule("triage", prefer=prefer)])
+        with pytest.raises(LookupError, match="unknown"):
+            policy.resolve("unknown")
+
+    def test_empty_rules_with_default_returns_default(self):
+        default, *_ = self._adapters()
+        policy = RoutingPolicy(default=default)
+        assert policy.resolve("anything") is default
+
+    def test_empty_policy_raises(self):
+        policy = RoutingPolicy()
+        with pytest.raises(LookupError):
+            policy.resolve("triage")
+
+    def test_multiple_task_types(self):
+        a, b, c = self._adapters(3)
+        policy = RoutingPolicy(rules=[
+            RoutingRule("fast", prefer=a),
+            RoutingRule("smart", prefer=b),
+            RoutingRule("cheap", prefer=c),
+        ])
+        assert policy.resolve("fast") is a
+        assert policy.resolve("smart") is b
+        assert policy.resolve("cheap") is c
diff --git a/tests/test_server.py b/tests/test_server.py
new file mode 100644
index 0000000..ca867e5
--- /dev/null
+++ b/tests/test_server.py
@@ -0,0 +1,134 @@
+"""
+Tests for LLMServer HTTP serve mode (FR-1).
+"""
+
+import json
+import urllib.error
+import urllib.request
+
+import pytest
+
+from llm_connect.adapter import MockLLMAdapter, ErrorLLMAdapter
+from llm_connect.models import RunConfig
+from llm_connect.server import LLMServer
+
+
+@pytest.fixture()
+def server():
+    """Start a server on a free port; stop after each test."""
+    s = LLMServer(adapter=MockLLMAdapter(mock_response="hello world"), port=0)
+    s.start()
+    yield s
+    s.stop()
+
+
+def _get(url: str) -> tuple[int, dict]:
+    try:
+        with urllib.request.urlopen(url) as resp:
+            return resp.status, json.loads(resp.read())
+    except urllib.error.HTTPError as exc:
+        return exc.code, json.loads(exc.read())
+
+
+def _post(url: str, body: dict) -> tuple[int, dict]:
+    payload = json.dumps(body).encode()
+    req = urllib.request.Request(
+        url,
+        data=payload,
+        headers={"Content-Type": "application/json"},
+        method="POST",
+    )
+    try:
+        with urllib.request.urlopen(req) as resp:
+            return resp.status, json.loads(resp.read())
+    except urllib.error.HTTPError as exc:
+        return exc.code, json.loads(exc.read())
+
+
+class TestHealth:
+    def test_health_returns_200(self, server):
+        status, body = _get(f"http://127.0.0.1:{server.port}/health")
+        assert status == 200
+        assert body["status"] == "ok"
+
+    def test_unknown_get_returns_404(self, server):
+        status, body = _get(f"http://127.0.0.1:{server.port}/nope")
+        assert status == 404
+
+
+class TestExecute:
+    def test_post_execute_round_trip(self, server):
+        status, body = _post(
+            f"http://127.0.0.1:{server.port}/execute",
+            {"prompt": "say hello"},
+        )
+        assert status == 200
+        assert body["content"] == "hello world"
+        assert body["finish_reason"] == "stop"
+
+    def test_response_includes_usage(self, server):
+        status, body = _post(
+            f"http://127.0.0.1:{server.port}/execute",
+            {"prompt": "count tokens"},
+        )
+        assert status == 200
+        assert "usage" in body
+        assert body["usage"]["total_tokens"] > 0
+
+    def test_missing_prompt_returns_400(self, server):
+        status, body = _post(
+            f"http://127.0.0.1:{server.port}/execute",
+            {"config": {}},
+        )
+        assert status == 400
+        assert "prompt" in body["error"]
+
+    def test_invalid_json_returns_400(self, server):
+        req = urllib.request.Request(
+            f"http://127.0.0.1:{server.port}/execute",
+            data=b"not json",
+            headers={"Content-Type": "application/json"},
+            method="POST",
+        )
+        try:
+            with urllib.request.urlopen(req) as resp:
+                status, body = resp.status, json.loads(resp.read())
+        except urllib.error.HTTPError as exc:
+            status, body = exc.code, json.loads(exc.read())
+        assert status == 400
+
+    def test_unknown_post_path_returns_404(self, server):
+        status, body = _post(
+            f"http://127.0.0.1:{server.port}/wrong",
+            {"prompt": "hi"},
+        )
+        assert status == 404
+
+    def test_adapter_error_returns_500(self):
+        s = LLMServer(adapter=ErrorLLMAdapter("boom"), port=0)
+        s.start()
+        try:
+            status, body = _post(
+                f"http://127.0.0.1:{s.port}/execute",
+                {"prompt": "hello"},
+            )
+            assert status == 500
+            assert "boom" in body["error"]
+        finally:
+            s.stop()
+
+    def test_config_fields_forwarded(self):
+        """Config fields in request body reach the adapter via RunConfig."""
+        adapter = MockLLMAdapter(mock_response="x")
+        s = LLMServer(adapter=adapter, port=0)
+        s.start()
+        try:
+            status, body = _post(
+                f"http://127.0.0.1:{s.port}/execute",
+                {"prompt": "hi", "config": {"model_name": "gpt-3.5-turbo", "max_tokens": 100}},
+            )
+            assert status == 200
+            assert adapter.last_config.model_name == "gpt-3.5-turbo"
+            assert adapter.last_config.max_tokens == 100
+        finally:
+            s.stop()
diff --git a/workplans/llm-connect-WP-0003-functional-extensions.md b/workplans/llm-connect-WP-0003-functional-extensions.md
index 34d1d4b..f5c840a 100644
--- a/workplans/llm-connect-WP-0003-functional-extensions.md
+++ b/workplans/llm-connect-WP-0003-functional-extensions.md
@@ -1,6 +1,6 @@
 # LLM-WP-0003 — Functional Extensions (FR-2 + FR-1)
 
-**status:** active
+**status:** done
 **owner:** llm-connect
 **repo:** llm-connect
 **created:** 2026-04-01
@@ -26,22 +26,22 @@ Both additions are Functional-layer under GAAF-2026:
 
 | ID | Title | Priority | Status |
 |-----|-------|----------|--------|
-| T01 | `RoutingPolicy` data model: `rules` list with `task_type`, `prefer`, `max_cost_per_1k`, `fallback` | high | todo |
-| T02 | `policy.resolve(task_type)` → returns configured `LLMAdapter` | high | todo |
-| T03 | Export from `llm_connect.__init__` and update `__all__` | medium | todo |
-| T04 | Functional contract doc for `RoutingPolicy` | medium | todo |
-| T05 | Tests: rule match, cost-cap fallback, unknown task_type fallback, no-match default | high | todo |
+| T01 | `RoutingPolicy` data model: `rules` list with `task_type`, `prefer`, `max_cost_per_1k`, `fallback` | high | done |
+| T02 | `policy.resolve(task_type)` → returns configured `LLMAdapter` | high | done |
+| T03 | Export from `llm_connect.__init__` and update `__all__` | medium | done |
+| T04 | Functional contract doc for `RoutingPolicy` | medium | done |
+| T05 | Tests: rule match, cost-cap fallback, unknown task_type fallback, no-match default | high | done |
 
 ### FR-1 — HTTP serve mode
 
 | ID | Title | Priority | Status |
 |-----|-------|----------|--------|
-| T06 | Design `/execute` JSON schema (request: provider, model, prompt, config; response: LLMResponse fields) | high | todo |
-| T07 | Implement `llm_connect/server.py` — minimal HTTP server, `POST /execute`, `GET /health` | high | todo |
-| T08 | `python -m llm_connect.server --port N --provider X --model Y` CLI entry point | high | todo |
-| T09 | Add `httpx` or `aiohttp` server dep under `[project.optional-dependencies] server` | medium | todo |
-| T10 | Functional contract doc (API schema — request/response shapes, error codes) | medium | todo |
-| T11 | Tests: spin up server in subprocess or via `TestClient`, POST round-trip (MockAdapter), error responses | high | todo |
+| T06 | Design `/execute` JSON schema (request: provider, model, prompt, config; response: LLMResponse fields) | high | done |
+| T07 | Implement `llm_connect/server.py` — minimal HTTP server, `POST /execute`, `GET /health` | high | done |
+| T08 | `python -m llm_connect.server --port N --provider X --model Y` CLI entry point | high | done |
+| T09 | Add `httpx` or `aiohttp` server dep under `[project.optional-dependencies] server` | medium | done |
+| T10 | Functional contract doc (API schema — request/response shapes, error codes) | medium | done |
+| T11 | Tests: spin up server in subprocess or via `TestClient`, POST round-trip (MockAdapter), error responses | high | done |
 
 ## Exit criteria