79c899b694f5b07cb56cac43e74279eaed32e88d
The 2026-06-02 daily-triage canary debugging session uncovered five real bugs (commits 9de0f49, 435da49, cd4551c, 583ab57, 1b01f0e), mostly because llm-connect has no way to see what payload the adapter sent or what the provider returned. Capture the six structural improvements that would collapse the next diagnosis of this shape from half a day to minutes:
- T01: LLM_CONNECT_DEBUG envelope mode for /execute responses
- T02: ThreadingHTTPServer drop-in replacement for stdlib HTTPServer
- T03: Per-call audit log + replay CLI (LLM_CONNECT_AUDIT_DIR)
- T04: Apply param-translation contract to OpenAI and Gemini adapters
- T05: Provider-agnostic structured-output smoke test in CI
- T06: Document the model_params translation contract for adapter authors
All six registered in the State Hub under workstream adhoc-llmc-2026-06-02 (1c936c91-79c7-427d-ab37-9052e8a61cda).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
llm-connect
Pluggable LLM adapters for Python and the command line. Supports OpenRouter, Gemini, OpenAI, and the Claude Code CLI out of the box, with a clean abstract interface for adding your own.
Quick start
from llm_connect import create_adapter, RunConfig
adapter = create_adapter("gemini", model="gemini-2.5-flash")
config = RunConfig(temperature=0.7, max_tokens=1000)
response = adapter.execute_prompt("Summarise the value chain concept.", config)
print(response.content)
Installation
pip install -e /path/to/llm-connect # local editable install
# or, once published:
pip install llm-connect
Requires: Python 3.10+, toml
Providers
| Provider key | Class | Notes |
|---|---|---|
| "openrouter" | OpenRouterAdapter | OpenAI-compatible endpoint; supports all OpenRouter models |
| "gemini" | GeminiAdapter | Google Generative Language REST API; supports free tier |
from llm_connect import create_adapter
# OpenRouter
adapter = create_adapter("openrouter", model="anthropic/claude-sonnet-4")
# Gemini (uses GEMINI_API_KEY env var or apikey-geminifree.txt)
adapter = create_adapter("gemini", model="gemini-2.5-flash")
# OpenAI (uses OPENAI_API_KEY env var)
adapter = create_adapter("openai", model="gpt-4.1-mini")
# Claude Code CLI (uses locally installed claude binary)
adapter = create_adapter("claude-code")
API keys
Keys are resolved in this order (first found wins):
- Explicit api_key argument to the constructor
- Environment variable (e.g. OPENROUTER_API_KEY, GEMINI_API_KEY, OPENAI_API_KEY)
- Key file in the project root (e.g. apikey-openrouter.txt, apikey-geminifree.txt)
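The lookup order above can be sketched with the standard library alone. This is an illustration, not the library's code: `resolve_api_key` and its parameters are hypothetical names, and treating empty strings as "not set" is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(
    explicit_key: Optional[str],
    env_var: str,
    key_file: Path,
) -> Optional[str]:
    """Return the first key found: argument, then env var, then key file."""
    # 1. An explicit argument always wins.
    if explicit_key:
        return explicit_key
    # 2. Fall back to the environment variable, if set and non-empty.
    env_value = os.environ.get(env_var)
    if env_value:
        return env_value
    # 3. Finally, read the key file and strip the trailing newline.
    if key_file.is_file():
        file_value = key_file.read_text(encoding="utf-8").strip()
        if file_value:
            return file_value
    return None
```

Returning `None` instead of raising leaves it to the caller to decide whether a missing key is an error.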
Core types
RunConfig
Controls a single LLM call.
from llm_connect import RunConfig
config = RunConfig(
    model_name="gemini-2.5-flash",  # overrides adapter default
    temperature=0.3,
    max_tokens=2000,
    timeout_seconds=60,
)
| Field | Default | Description |
|---|---|---|
| model_name | "gpt-4" | Model identifier (adapter may override) |
| temperature | 0.7 | Sampling temperature |
| max_tokens | 2000 | Maximum output tokens |
| model_params | {} | Extra provider-specific parameters |
| max_depth | 3 | Max nesting depth for recursive calls |
| skip_if_exists | True | Skip if identical input hash already processed |
| timeout_seconds | 300 | Request timeout |
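To make the defaults concrete, the table can be pictured as a dataclass. `RunConfigSketch` is a hypothetical stand-in that only mirrors the documented fields, not the library's definition. The field types and the `top_p` key are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class RunConfigSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    model_name: str = "gpt-4"
    temperature: float = 0.7
    max_tokens: int = 2000
    # default_factory gives each instance its own dict instead of a shared one.
    model_params: Dict[str, Any] = field(default_factory=dict)
    max_depth: int = 3
    skip_if_exists: bool = True
    timeout_seconds: int = 300


# All defaults, as listed in the table.
defaults = RunConfigSketch()

# Override only what differs; "top_p" is an assumed provider-specific key.
tuned = RunConfigSketch(temperature=0.3, model_params={"top_p": 0.9})
```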
LLMResponse
Returned by every execute_prompt call.
response = adapter.execute_prompt(prompt, config)
print(response.content) # generated text
print(response.model) # model actually used
print(response.usage) # {"prompt_tokens": …, "completion_tokens": …, "total_tokens": …}
print(response.finish_reason) # "stop", "length", etc.
Writing your own adapter
from llm_connect import LLMAdapter, RunConfig, LLMResponse
class MyAdapter(LLMAdapter):
    def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse:
        # call your API here
        return LLMResponse(content="...", model="my-model")

    def validate_config(self, config: RunConfig) -> bool:
        return True
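For experimenting with the two-method interface without installing the library, a self-contained sketch might look like the following. `SketchConfig`, `SketchResponse`, and `SketchAdapter` are hypothetical stand-ins for `RunConfig`, `LLMResponse`, and `LLMAdapter`. The temperature range and the word-count cap are assumptions, not documented behaviour.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SketchConfig:
    # Stand-in for RunConfig; only the fields used below.
    temperature: float = 0.7
    max_tokens: int = 2000


@dataclass
class SketchResponse:
    # Stand-in for LLMResponse, using the attribute names shown earlier.
    content: str
    model: str
    usage: Dict[str, int] = field(default_factory=dict)
    finish_reason: Optional[str] = None


class SketchAdapter(ABC):
    # Stand-in for LLMAdapter: the two methods a subclass implements.
    @abstractmethod
    def execute_prompt(self, prompt: str, config: SketchConfig) -> SketchResponse: ...

    @abstractmethod
    def validate_config(self, config: SketchConfig) -> bool: ...


class EchoAdapter(SketchAdapter):
    """Toy adapter: echoes the prompt, capped at max_tokens words."""

    def validate_config(self, config: SketchConfig) -> bool:
        # Assumed sanity limits, not taken from the library.
        return 0.0 <= config.temperature <= 2.0 and config.max_tokens > 0

    def execute_prompt(self, prompt: str, config: SketchConfig) -> SketchResponse:
        if not self.validate_config(config):
            raise ValueError("invalid config")
        words = prompt.split()
        kept = words[: config.max_tokens]  # words stand in for tokens here
        return SketchResponse(
            content=" ".join(kept),
            model="echo",
            usage={
                "prompt_tokens": len(words),
                "completion_tokens": len(kept),
                "total_tokens": len(words) + len(kept),
            },
            finish_reason="length" if len(kept) < len(words) else "stop",
        )
```

Checking the config inside `execute_prompt` is one possible design. Whether the library calls `validate_config` itself is not stated above.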
TOML configuration chain
The resolve_llm() function walks a 7-level priority chain to pick a
provider and model. This is used by the llm-helper integration but is also
available standalone:
from llm_connect.toml_config import resolve_llm
resolved = resolve_llm(app_name="myapp")
print(resolved.provider, resolved.model, resolved.provider_source)
Priority order (highest first):
- CLI flags (cli_provider, cli_model arguments)
- Env var {APP_NAME}_HELPER_MODEL (model only)
- User preference: ~/.config/{app_name}/config.toml [llm.preference]
- Directory preference: .{app_name}.toml [llm.preference]
- Directory default: .{app_name}.toml [llm.default]
- User default: ~/.config/{app_name}/config.toml [llm.default]
- Hardcoded fallback: gemini / gemini-2.5-flash
Example config file (~/.config/myapp/config.toml):
[llm.default]
provider = "gemini"
model = "gemini-2.5-flash"
[llm.preference]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"
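The priority chain can be sketched as an ordered list of layers where the first layer that sets a value wins. This is a simplified illustration, not the implementation of `resolve_llm()`. The name `resolve_chain`, the layer labels, and the choice to resolve provider and model independently are assumptions. The layers are plain dicts here rather than parsed TOML files.

```python
from typing import Dict, List, Optional, Tuple

# Each layer is (source label, values found at that level).
Layer = Tuple[str, Dict[str, Optional[str]]]

FALLBACK = {"provider": "gemini", "model": "gemini-2.5-flash"}


def resolve_chain(layers: List[Layer]) -> Dict[str, str]:
    """Resolve provider and model separately; the first layer that sets a value wins."""
    resolved: Dict[str, str] = {}
    for key in ("provider", "model"):
        for source, values in layers:
            value = values.get(key)
            if value:
                resolved[key] = value
                resolved[f"{key}_source"] = source  # remember where it came from
                break
    return resolved


# Layers in priority order, highest first, mirroring the list above.
layers: List[Layer] = [
    ("cli", {"provider": None, "model": None}),
    ("env", {"model": None}),  # the env var can only set the model
    ("user_preference", {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}),
    ("directory_preference", {}),
    ("directory_default", {}),
    ("user_default", {"provider": "gemini", "model": "gemini-2.5-flash"}),
    ("fallback", FALLBACK),
]

result = resolve_chain(layers)
```

With the example config file shown earlier and no CLI flags or env var, the preference section is the first layer that sets anything.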
Embeddings
from llm_connect import create_embedding_adapter, EmbeddingCache
adapter = create_embedding_adapter("openai", model="text-embedding-3-small")
cache = EmbeddingCache(cache_dir=".embeddings")
# Get embedding (cached after first call)
vec = cache.get_or_compute("my text", lambda t: adapter.embed([t])[0])
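The get-or-compute pattern used above can be sketched as a small on-disk cache. `SketchEmbeddingCache` is a hypothetical stand-in. The JSON-per-text layout and the SHA-256 file naming are assumptions about one way to build such a cache, not a description of how `EmbeddingCache` stores its data.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List


class SketchEmbeddingCache:
    """Hypothetical on-disk cache: one JSON file per text, named by SHA-256."""

    def __init__(self, cache_dir: str) -> None:
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path_for(self, text: str) -> Path:
        # Hash the text so arbitrary input maps to a safe, fixed-length filename.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get_or_compute(
        self, text: str, compute: Callable[[str], List[float]]
    ) -> List[float]:
        path = self._path_for(text)
        if path.exists():
            # Cache hit: skip the (potentially paid) embedding call.
            return json.loads(path.read_text(encoding="utf-8"))
        vector = compute(text)
        path.write_text(json.dumps(vector), encoding="utf-8")
        return vector
```

Passing the compute step as a callable keeps the cache independent of any particular embedding provider.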
Exceptions
from llm_connect.exceptions import (
    LLMError,               # base
    LLMConfigurationError,  # bad key, unknown provider
    LLMAPIError,            # HTTP error from provider (has .status_code)
    LLMRateLimitError,      # 429
    LLMTimeoutError,        # request timed out
    LLMSubprocessError,     # claude CLI failed (has .return_code, .stderr)
)
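One common use of a dedicated rate-limit exception is retrying with exponential backoff. In the sketch below the exception classes are local stand-ins so it runs without the library. That `LLMRateLimitError` subclasses `LLMAPIError` is an assumption. `call_with_retry` and its parameters are hypothetical names.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


# Local stand-ins; real code would import these from llm_connect.exceptions.
class LLMError(Exception):
    """Stand-in base class."""


class LLMAPIError(LLMError):
    def __init__(self, message: str, status_code: Optional[int] = None) -> None:
        super().__init__(message)
        self.status_code = status_code


class LLMRateLimitError(LLMAPIError):
    """Stand-in; assumed here to subclass LLMAPIError."""


def call_with_retry(
    call: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry only on rate limits, doubling the delay after each failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except LLMRateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** (attempt - 1))
    # Only reached when the loop never ran.
    raise ValueError("max_attempts must be at least 1")
```

Other errors, such as a plain `LLMAPIError`, propagate immediately, since retrying a bad request or a bad key would not help. The `sleep` parameter exists so tests can avoid real delays.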
Testing
from llm_connect import MockLLMAdapter, RunConfig
mock = MockLLMAdapter(mock_response="Test response")
config = RunConfig()
response = mock.execute_prompt("any prompt", config)
assert response.content == "Test response"
assert mock.call_count == 1
Origin
Extracted from the markitect project.
The markitect.llm module remains a re-export shim pointing here.