feat: WP-0003 — RoutingPolicy (FR-2) and HTTP serve mode (FR-1)

FR-2 RoutingPolicy:
- RoutingPolicy + RoutingRule dataclasses in llm_connect/routing.py
- resolve(task_type, estimated_cost_per_1k=None) with cost-cap fallback
- Exported from llm_connect.__init__; contract doc at
  contracts/functional/routing-policy.md
- 11 tests covering rule match, cost-cap, fallback, unknown type, no-match

FR-1 HTTP serve mode:
- LLMServer in llm_connect/server.py (stdlib http.server, zero extra deps)
- POST /execute + GET /health; CLI via python -m llm_connect.server
- [server] optional-dep group added to pyproject.toml
- Contract doc at contracts/functional/server.md
- 9 tests: health, round-trip, 400/404/500 errors, config forwarding
- Added "mock" provider to factory for CLI default

All 101 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 22:34:00 +00:00
parent f76a58d6e9
commit d51d6303e2
11 changed files with 638 additions and 14 deletions
--- a/contracts/functional/server.md
+++ b/contracts/functional/server.md
@@ -0,0 +1,85 @@
+# Contract: HTTP Serve Mode
+
+**layer:** Functional  
+**maturity:** Beta  
+**module:** `llm_connect.server`  
+**since:** WP-0003
+
+## Purpose
+
+Expose any `LLMAdapter` as a lightweight HTTP service.  Intended for
+local/inter-process use; not hardened for public internet exposure.
+
+## API endpoints
+
+### `GET /health`
+
+Liveness probe.
+
+**Response 200**
+
+```json
+{"status": "ok"}
+```
+
+---
+
+### `POST /execute`
+
+Execute a prompt through the configured adapter.
+
+**Request body** (JSON)
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `prompt` | string | yes | Prompt text |
+| `config` | object | no | `RunConfig` overrides (see below) |
+
+`config` sub-fields (all optional, defaults match `RunConfig` defaults):
+
+| Field | Type | Default |
+|-------|------|---------|
+| `model_name` | string | `"gpt-4"` |
+| `temperature` | float | `0.7` |
+| `max_tokens` | int | `2000` |
+| `timeout_seconds` | int | `300` |
+
+**Response 200** — `LLMResponse.to_dict()` shape
+
+```json
+{
+  "content": "...",
+  "model": "...",
+  "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
+  "finish_reason": "stop",
+  "metadata": {}
+}
+```
+
+**Error responses**
+
+| HTTP | Condition |
+|------|-----------|
+| 400 | Missing `prompt` field or invalid JSON body |
+| 404 | Unknown path |
+| 500 | Adapter raised an exception |
+
+## Implementation notes
+
+- Uses Python stdlib `http.server` — **no additional runtime dependency**.
+- The `[server]` optional-dependency group is reserved for future migration
+  to `aiohttp`/`starlette` if native async serving is required.
+- `LLMServer(adapter, port=0)` binds to an OS-assigned free port; read the
+  assigned port back from `server.port` after calling `start()`.
+
+## CLI
+
+```
+python -m llm_connect.server [--host HOST] [--port PORT] [--provider PROVIDER] [--model MODEL]
+```
+
+Default provider: `mock`.  Any provider that `create_adapter` recognizes is valid.
+
+## Known consumers
+
+- `inter-hub` (IHUB-WP-0012 Phase 11): drives federation calls over HTTP from non-Python services.
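
The endpoint contract in `contracts/functional/server.md` can be exercised with nothing but the standard library. The sketch below is not the `LLMServer` implementation from this commit: `StubHandler` and `call` are hypothetical names, the stub simply echoes the prompt instead of calling an adapter, and the JSON shape of error bodies is an assumption, since the contract only fixes the status codes. It binds to port 0 so the OS picks a free port, then issues `GET /health`, a valid `POST /execute`, a `POST /execute` without `prompt`, and a request to an unknown path.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in that mimics the contract; not the real LLMServer."""

    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/health":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})  # error body shape is assumed

    def do_POST(self):
        if self.path != "/execute":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            data = json.loads(self.rfile.read(length))
            prompt = data["prompt"]
        except (ValueError, KeyError, TypeError):
            # Invalid JSON, a non-object body, or a missing "prompt" field.
            self._send_json(400, {"error": "missing prompt or invalid JSON"})
            return
        config = data.get("config") or {}
        self._send_json(200, {
            "content": f"echo: {prompt}",  # a real adapter would run the prompt
            "model": config.get("model_name", "gpt-4"),
            "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
            "finish_reason": "stop",
            "metadata": {},
        })

    def log_message(self, format, *args):
        pass  # silence per-request logging


# Ignore any proxy settings so requests to 127.0.0.1 go direct.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def call(port, path, payload=None):
    """Return (status, parsed JSON body); sends POST when a payload is given."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(
        f"http://127.0.0.1:{port}{path}",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    try:
        with _opener.open(req, timeout=5) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        return err.code, json.loads(err.read())


# Port 0 asks the OS for a free port; read the real one back after binding.
server = HTTPServer(("127.0.0.1", 0), StubHandler)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

health = call(port, "/health")
ok = call(port, "/execute", {"prompt": "hi", "config": {"model_name": "mock-model"}})
bad = call(port, "/execute", {"config": {}})
missing = call(port, "/nope")

server.shutdown()
server.server_close()
```

Against the real service, the same `call` helper should work unchanged once `port` is taken from `server.port`, assuming the server is bound to the loopback interface.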