llm-connect/contracts/functional/server.md
Bernd Worsch d51d6303e2
feat: WP-0003 — RoutingPolicy (FR-2) and HTTP serve mode (FR-1)
FR-2 RoutingPolicy:
- RoutingPolicy + RoutingRule dataclasses in llm_connect/routing.py
- resolve(task_type, estimated_cost_per_1k=None) with cost-cap fallback
- Exported from llm_connect.__init__; contract doc at contracts/functional/routing-policy.md
- 11 tests covering rule match, cost-cap, fallback, unknown type, no-match

FR-1 HTTP serve mode:
- LLMServer in llm_connect/server.py (stdlib http.server, zero extra deps)
- POST /execute + GET /health; CLI via python -m llm_connect.server
- [server] optional-dep group added to pyproject.toml
- Contract doc at contracts/functional/server.md
- 9 tests: health, round-trip, 400/404/500 errors, config forwarding
- Added "mock" provider to factory for CLI default

All 101 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-01 22:34:00 +00:00


# Contract: HTTP Serve Mode
**layer:** Functional
**maturity:** Beta
**module:** `llm_connect.server`
**since:** WP-0003
## Purpose
Expose any `LLMAdapter` as a lightweight HTTP service. Intended for
local/inter-process use; not hardened for public internet exposure.
## API endpoints
### `GET /health`
Liveness probe.
**Response 200**
```json
{"status": "ok"}
```
---
### `POST /execute`
Execute a prompt through the configured adapter.
**Request body** (JSON)
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `prompt` | string | yes | Prompt text |
| `config` | object | no | `RunConfig` overrides (see below) |
`config` sub-fields (all optional, defaults match `RunConfig` defaults):
| Field | Type | Default |
|-------|------|---------|
| `model_name` | string | `"gpt-4"` |
| `temperature` | float | `0.7` |
| `max_tokens` | int | `2000` |
| `timeout_seconds` | int | `300` |
**Response 200** (`LLMResponse.to_dict()` shape)
```json
{
"content": "...",
"model": "...",
"usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
"finish_reason": "stop",
"metadata": {}
}
```
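A client call against this endpoint could look roughly like the sketch below, which uses only the standard library. The helper names `build_execute_request` and `execute`, the base URL, and the explicit `Content-Type` header are illustrative assumptions; the contract itself only fixes the path, the method, and the JSON body fields. `execute` needs a running server, so only the request construction is exercised here.

```python
import json
import urllib.request


def build_execute_request(base_url, prompt, config=None):
    """Build a POST /execute request; `config` holds optional RunConfig overrides."""
    payload = {"prompt": prompt}
    if config:
        payload["config"] = config
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/execute",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the contract does not say whether this header is required.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def execute(base_url, prompt, config=None, timeout=300):
    """Send the request and return the decoded response body as a dict."""
    request = build_execute_request(base_url, prompt, config)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical address; the real host and port depend on how the server was started.
request = build_execute_request(
    "http://127.0.0.1:8080",
    "Summarize this paragraph.",
    config={"temperature": 0.2, "max_tokens": 256},
)
```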
**Error responses**
| HTTP | Condition |
|------|-----------|
| 400 | Missing `prompt` field or invalid JSON body |
| 404 | Unknown path |
| 500 | Adapter raised an exception |
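
The status codes above can be exercised with a small stand-in server. This is a sketch built on stdlib `http.server`, not the actual `LLMServer` code: `echo_adapter`, `StubHandler`, `status_of`, and the `{"error": ...}` body shape are all hypothetical, since the contract specifies only the status codes.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def echo_adapter(prompt):
    """Hypothetical stand-in adapter; the prompt "boom" simulates a failure."""
    if prompt == "boom":
        raise RuntimeError("adapter failure")
    return {"content": prompt.upper(), "finish_reason": "stop"}


class StubHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/health":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})  # unknown path

    def do_POST(self):
        if self.path != "/execute":
            self._send_json(404, {"error": "not found"})  # unknown path
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            prompt = json.loads(self.rfile.read(length))["prompt"]
        except (ValueError, KeyError, TypeError):
            # Invalid JSON body, or a body without a "prompt" field.
            self._send_json(400, {"error": "bad request"})
            return
        try:
            result = echo_adapter(prompt)
        except Exception as exc:
            self._send_json(500, {"error": str(exc)})  # adapter raised
            return
        self._send_json(200, result)

    def log_message(self, format, *args):
        pass  # silence per-request logging for the example


def status_of(url, body=None):
    """Return the HTTP status for a GET (no body) or a POST (with body)."""
    method = "POST" if body is not None else "GET"
    request = urllib.request.Request(url, data=body, method=method)
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


server = HTTPServer(("127.0.0.1", 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = f"http://127.0.0.1:{server.server_address[1]}"

statuses = {
    "health": status_of(base + "/health"),
    "ok": status_of(base + "/execute", b'{"prompt": "hi"}'),
    "missing_prompt": status_of(base + "/execute", b"{}"),
    "invalid_json": status_of(base + "/execute", b"not json"),
    "unknown_path": status_of(base + "/nope"),
    "adapter_error": status_of(base + "/execute", b'{"prompt": "boom"}'),
}
server.shutdown()
server.server_close()
```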
## Implementation notes
- Uses Python stdlib `http.server`: **no additional runtime dependency**.
- The `[server]` optional-dependency group is reserved for future migration
to `aiohttp`/`starlette` if native async serving is required.
- `LLMServer(adapter, port=0)` binds to an OS-assigned free port; read back
via `server.port` after `start()`.
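
The `port=0` behaviour matches what stdlib servers do when asked to bind port 0. The sketch below shows that mechanism with a plain `HTTPServer`; it assumes `LLMServer` relies on the same mechanism and does not use `LLMServer` itself.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Port 0 asks the operating system to pick any free port at bind time.
server = HTTPServer(("127.0.0.1", 0), BaseHTTPRequestHandler)

# The port that was actually bound is available once the socket exists.
assigned_port = server.server_address[1]
print(f"listening on port {assigned_port}")

server.server_close()
```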
## CLI
```
python -m llm_connect.server [--host HOST] [--port PORT] [--provider PROVIDER] [--model MODEL]
```
Default provider: `mock`. Any provider registered with `create_adapter` is valid.
## Known consumers
- `inter-hub` (IHUB-WP-0012 Phase 11): drives federation calls over HTTP from non-Python services.