# Adapter `model_params` contract

`RunConfig.model_params` is a portability layer, not a blind provider payload escape hatch. Adapters must translate the shared keys they understand, pass through only provider-valid keys, and drop provider-specific keys that would make another provider reject the request.

## Shared structured output

Callers may request structured output with:

```python
RunConfig(
    model_params={
        "json_schema": {
            "type": "object",
            "properties": {
                "summary": {"type": "string"},
                "recommendations": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["summary", "recommendations"],
        }
    }
)
```

Adapters translate that key into the provider's native shape:

| Adapter | Translation |
|---|---|
| OpenAI | `response_format = {"type": "json_schema", "json_schema": ...}` |
| OpenRouter | Same OpenAI-compatible `response_format` wrapper |
| Gemini | `generationConfig.responseMimeType = "application/json"` and `generationConfig.responseSchema = ...` |
| Claude Code CLI | `--json-schema` plus `--output-format json`, then envelope unwrap |

OpenAI-compatible adapters default `json_schema.strict` to `False`. Strict mode requires schemas to meet provider-specific constraints such as `additionalProperties: false` on object nodes and complete `required` lists. Callers that need strict behavior can pass an explicit provider-native `response_format` in `model_params`.

## Pass-through keys

OpenAI and OpenRouter pass through known Chat Completions fields: `top_p`, `n`, `stream`, `stop`, `presence_penalty`, `frequency_penalty`, `logit_bias`, `user`, `seed`, `tools`, `tool_choice`, `response_format`, `logprobs`, `top_logprobs`, and `parallel_tool_calls`.

Gemini passes through valid `generateContent` top-level fields: `safetySettings`, `tools`, `toolConfig`, `systemInstruction`, and `cachedContent`.
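
As a rough illustration of the translation table and the pass-through lists above, the sketch below filters `model_params` through an allowlist and wraps `json_schema` in each provider's native shape. The helper names `openai_params` and `gemini_body_fields`, the schema name `"response"`, and the rule that an explicit `response_format` takes precedence over `json_schema` are assumptions made for this example, not llm-connect's actual implementation. Gemini generation config fields other than the schema are left out to keep the sketch short.

```python
from typing import Any

# Allowlists copied from the pass-through lists in this section.
OPENAI_PASSTHROUGH = {
    "top_p", "n", "stream", "stop", "presence_penalty", "frequency_penalty",
    "logit_bias", "user", "seed", "tools", "tool_choice", "response_format",
    "logprobs", "top_logprobs", "parallel_tool_calls",
}
GEMINI_TOP_LEVEL = {
    "safetySettings", "tools", "toolConfig", "systemInstruction", "cachedContent",
}


def openai_params(model_params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: extra Chat Completions fields for OpenAI/OpenRouter."""
    # Anything not on the allowlist (e.g. claude_cli_path) is dropped.
    out = {k: v for k, v in model_params.items() if k in OPENAI_PASSTHROUGH}
    schema = model_params.get("json_schema")
    # Assumption: an explicit provider-native response_format wins.
    if schema is not None and "response_format" not in out:
        out["response_format"] = {
            "type": "json_schema",
            "json_schema": {
                "name": "response",  # placeholder name, assumed for the sketch
                "schema": schema,
                "strict": False,     # documented default for these adapters
            },
        }
    return out


def gemini_body_fields(model_params: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: top-level generateContent fields for Gemini."""
    out = {k: v for k, v in model_params.items() if k in GEMINI_TOP_LEVEL}
    schema = model_params.get("json_schema")
    if schema is not None:
        out["generationConfig"] = {
            "responseMimeType": "application/json",
            "responseSchema": schema,
        }
    return out


params = {
    "json_schema": {"type": "object"},
    "top_p": 0.9,
    "tools": [],
    "claude_cli_path": "/usr/local/bin/claude",
}
openai_extra = openai_params(params)      # keeps top_p, tools; adds response_format
gemini_extra = gemini_body_fields(params)  # keeps tools; adds generationConfig
```

An allowlist is used here rather than a denylist so that a key nobody anticipated is never forwarded to a provider that might reject it.
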
Gemini also accepts generation config fields directly or via snake-case aliases: `candidateCount`, `candidate_count`, `stopSequences`, `stop_sequences`, `maxOutputTokens`, `max_output_tokens`, `temperature`, `topP`, `top_p`, `topK`, `top_k`, `responseMimeType`, `response_mime_type`, `responseSchema`, and `response_schema`.

## Dropped keys

Adapters must drop keys that are meaningful to another adapter or to llm-connect itself but invalid for the target provider. The current shared drop set includes: `reasoning_effort`, `max_depth`, `claude_cli_path`, and raw `json_schema` after translation.

Unknown keys are ignored by default. This keeps activity-specific configs from causing provider HTTP 400 errors when a caller switches providers.

## Diagnostics and replay

Server mode supports opt-in diagnostics for `/execute`:

```bash
# Start the server with diagnostics enabled
LLM_CONNECT_DEBUG=1 python -m llm_connect.server --provider openrouter

# From another shell, request a debug response
curl 'http://127.0.0.1:8080/execute?debug=1' -d '{"prompt":"hi"}'
```

Debug responses include a `debug` field with the redacted provider request, raw provider response body, and adapter transformations such as `merge_model_params` or `unwrap_cli_envelope`. Normal responses omit `debug`.

Set `LLM_CONNECT_AUDIT_DIR=/path/to/audit` to write one JSON audit record per `/execute` call. Audit records include the prompt, config, redacted provider request, provider response, parsed content, and latency.

Re-run parsing without another provider call with:

```bash
python -m llm_connect.replay /path/to/audit/record.json --json
```

## Server concurrency

`llm_connect.server.LLMServer` uses `ThreadingHTTPServer`. Adapter instances used in server mode must be safe to call concurrently. The bundled HTTP and subprocess adapters keep per-call state local; custom adapters should avoid mutating shared instance attributes during `execute_prompt` unless they use their own locks.
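
The concurrency guidance above might look like the following in a custom adapter. This is a minimal sketch: `CountingAdapter`, its `calls` counter, and the single-argument `execute_prompt` signature are hypothetical, and the provider call is replaced by a local echo. Per-call data lives in local variables, and the one shared attribute is only mutated while holding a lock.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CountingAdapter:
    """Hypothetical custom adapter; the real base class and signature may differ."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.calls = 0  # shared across request threads, guarded by _lock

    def execute_prompt(self, prompt: str) -> str:
        # Per-call state stays in locals and is never stored on self.
        request = {"prompt": prompt}
        response = f"echo: {request['prompt']}"  # stand-in for the provider call
        # The shared counter is updated only under the adapter's own lock.
        with self._lock:
            self.calls += 1
        return response


# Simulate a threaded server calling one adapter instance concurrently.
adapter = CountingAdapter()
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(adapter.execute_prompt, [f"p{i}" for i in range(100)]))
```

Storing `request` or `response` on `self` instead would let one thread overwrite another's data between the provider call and the return.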