llm-connect
Pluggable LLM adapters for Python. Supports OpenRouter, Gemini, OpenAI, and the Claude Code CLI out of the box, with a clean abstract interface for adding your own.
Quick start
from llm_connect import create_adapter, RunConfig
adapter = create_adapter("gemini", model="gemini-2.5-flash")
config = RunConfig(temperature=0.7, max_tokens=1000)
response = adapter.execute_prompt("Summarise the value chain concept.", config)
print(response.content)
Installation
pip install -e /path/to/llm-connect # local editable install
# or, once published:
pip install llm-connect
Requires: Python 3.10+, toml
Providers
| Provider key | Class | Notes |
|---|---|---|
| `"openrouter"` | `OpenRouterAdapter` | OpenAI-compatible endpoint; supports all OpenRouter models |
| `"gemini"` | `GeminiAdapter` | Google Generative Language REST API; supports free tier |
| `"openai"` | `OpenAIAdapter` | OpenAI chat completions endpoint |
| `"claude-code"` | `ClaudeCodeAdapter` | Shells out to the `claude --print` CLI; no API key needed |
from llm_connect import create_adapter
# OpenRouter
adapter = create_adapter("openrouter", model="anthropic/claude-sonnet-4")
# Gemini (uses GEMINI_API_KEY env var or apikey-geminifree.txt)
adapter = create_adapter("gemini", model="gemini-2.5-flash")
# OpenAI (uses OPENAI_API_KEY env var)
adapter = create_adapter("openai", model="gpt-4.1-mini")
# Claude Code CLI (uses locally installed claude binary)
adapter = create_adapter("claude-code")
API keys
Keys are resolved in this order (first found wins):
1. Explicit `api_key` argument to the constructor
2. Environment variable (e.g. `OPENROUTER_API_KEY`, `GEMINI_API_KEY`, `OPENAI_API_KEY`)
3. Key file in the project root (e.g. `apikey-openrouter.txt`, `apikey-geminifree.txt`)
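The lookup order above can be sketched with the standard library alone. This is an illustration, not llm-connect's actual code: `resolve_api_key` and its parameters are hypothetical names, and reading the key file as stripped plain text is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(
    explicit: Optional[str],
    env_var: str,
    key_file: Path,
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the first key found: explicit argument, env var, then key file."""
    # 1. An explicit argument always wins.
    if explicit:
        return explicit
    # 2. Fall back to the environment variable, if set and non-empty.
    from_env = environ.get(env_var, "").strip()
    if from_env:
        return from_env
    # 3. Finally, try a key file (assumed to hold the key as plain text).
    if key_file.is_file():
        from_file = key_file.read_text(encoding="utf-8").strip()
        if from_file:
            return from_file
    # Nothing found; a real adapter would likely raise a configuration error.
    return None
```

Passing `environ` as a parameter keeps the function easy to test without touching the real process environment.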
Core types
RunConfig
Controls a single LLM call.
from llm_connect import RunConfig
config = RunConfig(
    model_name="gemini-2.5-flash",  # overrides adapter default
    temperature=0.3,
    max_tokens=2000,
    timeout_seconds=60,
)
| Field | Default | Description |
|---|---|---|
| `model_name` | `"gpt-4"` | Model identifier (adapter may override) |
| `temperature` | `0.7` | Sampling temperature |
| `max_tokens` | `2000` | Maximum output tokens |
| `model_params` | `{}` | Extra provider-specific parameters |
| `max_depth` | `3` | Max nesting depth for recursive calls |
| `skip_if_exists` | `True` | Skip if identical input hash already processed |
| `timeout_seconds` | `300` | Request timeout |
LLMResponse
Returned by every execute_prompt call.
response = adapter.execute_prompt(prompt, config)
print(response.content) # generated text
print(response.model) # model actually used
print(response.usage) # {"prompt_tokens": …, "completion_tokens": …, "total_tokens": …}
print(response.finish_reason) # "stop", "length", etc.
Writing your own adapter
from llm_connect import LLMAdapter, RunConfig, LLMResponse

class MyAdapter(LLMAdapter):
    def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse:
        # call your API here
        return LLMResponse(content="...", model="my-model")

    def validate_config(self, config: RunConfig) -> bool:
        return True
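A slightly fuller sketch of the same pattern follows. So that it runs without the package installed, it defines stand-in `RunConfig`, `LLMResponse`, and `LLMAdapter` types that mirror only the fields documented in this README; with llm-connect available you would import the real ones instead. `CallableAdapter`, the `usage` and `finish_reason` defaults, and the validation ranges are all hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Callable


# Stand-ins mirroring the documented fields; not the library's definitions.
@dataclass
class RunConfig:
    model_name: str = "gpt-4"
    temperature: float = 0.7
    max_tokens: int = 2000


@dataclass
class LLMResponse:
    content: str
    model: str
    usage: dict = field(default_factory=dict)  # assumed default
    finish_reason: str = "stop"                # assumed default


class LLMAdapter(ABC):
    @abstractmethod
    def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse: ...

    @abstractmethod
    def validate_config(self, config: RunConfig) -> bool: ...


class CallableAdapter(LLMAdapter):
    """Hypothetical adapter that delegates generation to a plain function."""

    def __init__(self, generate: Callable[[str, RunConfig], str], model: str):
        self._generate = generate
        self._model = model

    def validate_config(self, config: RunConfig) -> bool:
        # Illustrative ranges only; a real provider defines its own limits.
        return config.max_tokens > 0 and 0.0 <= config.temperature <= 2.0

    def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse:
        if not self.validate_config(config):
            raise ValueError("invalid RunConfig")
        text = self._generate(prompt, config)
        return LLMResponse(content=text, model=self._model)


adapter = CallableAdapter(lambda prompt, cfg: prompt.upper(), model="shout-1")
response = adapter.execute_prompt("hello", RunConfig())
print(response.content, response.model)  # HELLO shout-1
```

Validating inside `execute_prompt` is one possible design choice; whether the real adapters do this is not stated here.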
TOML configuration chain
The resolve_llm() function walks a 7-level priority chain to pick a
provider and model. This is used by the llm-helper integration but is also
available standalone:
from llm_connect.toml_config import resolve_llm
resolved = resolve_llm(app_name="myapp")
print(resolved.provider, resolved.model, resolved.provider_source)
Priority order (highest first):
1. CLI flags (`cli_provider`, `cli_model` arguments)
2. Env var `{APP_NAME}_HELPER_MODEL` (model only)
3. User preference: `~/.config/{app_name}/config.toml`, `[llm.preference]`
4. Directory preference: `.{app_name}.toml`, `[llm.preference]`
5. Directory default: `.{app_name}.toml`, `[llm.default]`
6. User default: `~/.config/{app_name}/config.toml`, `[llm.default]`
7. Hardcoded fallback: `gemini` / `gemini-2.5-flash`
Example config file (~/.config/myapp/config.toml):
[llm.default]
provider = "gemini"
model = "gemini-2.5-flash"
[llm.preference]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"
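To make the chain concrete, here is a minimal sketch of how such a lookup might work, operating on already-parsed config dictionaries. It is not the implementation of `resolve_llm()`: `resolve_sketch` and the source labels are hypothetical, and it assumes that provider and model are resolved independently (each taking the first level that supplies it) and that the env var name is the upper-cased app name.

```python
from typing import Any, Dict, Mapping, Optional, Tuple


def _section(cfg: Mapping[str, Any], name: str) -> Mapping[str, Any]:
    """Return cfg["llm"][name], or an empty mapping when absent."""
    return cfg.get("llm", {}).get(name, {})


def resolve_sketch(
    app_name: str,
    cli_provider: Optional[str] = None,
    cli_model: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    user_cfg: Optional[Mapping[str, Any]] = None,
    dir_cfg: Optional[Mapping[str, Any]] = None,
) -> Dict[str, str]:
    env, user_cfg, dir_cfg = env or {}, user_cfg or {}, dir_cfg or {}
    # Assumption: "{APP_NAME}" means the upper-cased application name.
    env_model = env.get(f"{app_name.upper()}_HELPER_MODEL")

    # Highest priority first, matching the seven levels listed above.
    levels = [
        ("cli", {"provider": cli_provider, "model": cli_model}),
        ("env", {"model": env_model}),
        ("user preference", _section(user_cfg, "preference")),
        ("directory preference", _section(dir_cfg, "preference")),
        ("directory default", _section(dir_cfg, "default")),
        ("user default", _section(user_cfg, "default")),
        ("fallback", {"provider": "gemini", "model": "gemini-2.5-flash"}),
    ]

    def first(key: str) -> Tuple[str, str]:
        for source, values in levels:
            value = values.get(key)
            if value:
                return value, source
        raise LookupError(key)  # unreachable: the fallback sets both keys

    provider, provider_source = first("provider")
    model, model_source = first("model")
    return {
        "provider": provider,
        "provider_source": provider_source,
        "model": model,
        "model_source": model_source,
    }


# The example config file above, as it would look once parsed from TOML.
user_cfg = {
    "llm": {
        "default": {"provider": "gemini", "model": "gemini-2.5-flash"},
        "preference": {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"},
    }
}
print(resolve_sketch("myapp", user_cfg=user_cfg))
```

With only that user config present, the preference section wins over the default section, so the sketch selects `openrouter` with `anthropic/claude-sonnet-4`.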
Embeddings
from llm_connect import create_embedding_adapter, EmbeddingCache
adapter = create_embedding_adapter("openai", model="text-embedding-3-small")
cache = EmbeddingCache(cache_dir=".embeddings")
# Get embedding (cached after first call)
vec = cache.get_or_compute("my text", lambda t: adapter.embed([t])[0])
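The `get_or_compute` call above follows a compute-once, reuse-afterwards pattern. The sketch below shows one way a disk-backed cache with that behaviour could be built. `TinyEmbeddingCache` is a hypothetical stand-in, not llm-connect's `EmbeddingCache`; hashing the text with SHA-256 and storing one JSON file per entry are assumptions made for illustration.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, List


class TinyEmbeddingCache:
    """Hypothetical disk cache keyed by a hash of the input text."""

    def __init__(self, cache_dir: str):
        self._dir = Path(cache_dir)
        self._dir.mkdir(parents=True, exist_ok=True)

    def _path_for(self, text: str) -> Path:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        return self._dir / f"{digest}.json"

    def get_or_compute(
        self, text: str, compute: Callable[[str], List[float]]
    ) -> List[float]:
        path = self._path_for(text)
        if path.exists():
            # Cache hit: reuse the stored vector, skipping the compute call.
            return json.loads(path.read_text(encoding="utf-8"))
        vector = list(compute(text))
        path.write_text(json.dumps(vector), encoding="utf-8")
        return vector


calls = []


def fake_embed(text: str) -> List[float]:
    """Stand-in for a real embedding call; records each invocation."""
    calls.append(text)
    return [float(len(text)), 1.0]


with tempfile.TemporaryDirectory() as tmp:
    cache = TinyEmbeddingCache(tmp)
    first = cache.get_or_compute("my text", fake_embed)
    second = cache.get_or_compute("my text", fake_embed)

print(first == second, len(calls))  # True 1
```

The second lookup is served from disk, so the embedding function runs only once per distinct text.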
Exceptions
from llm_connect.exceptions import (
    LLMError,               # base
    LLMConfigurationError,  # bad key, unknown provider
    LLMAPIError,            # HTTP error from provider (has .status_code)
    LLMRateLimitError,      # 429
    LLMTimeoutError,        # request timed out
    LLMSubprocessError,     # claude CLI failed (has .return_code, .stderr)
)
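One common use of a hierarchy like this is retrying only the transient failures. The sketch below retries rate-limit and timeout errors with exponential backoff and lets everything else propagate. The exception classes are local stand-ins so the block runs without the package (their exact inheritance in llm-connect is not documented here), and `call_with_retries` is a hypothetical helper, not part of the library.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


# Stand-ins for the exceptions listed above; import the real ones in practice.
class LLMError(Exception): ...
class LLMRateLimitError(LLMError): ...
class LLMTimeoutError(LLMError): ...


def call_with_retries(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying transient errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except (LLMRateLimitError, LLMTimeoutError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... base_delay
    raise AssertionError("unreachable")


# Demo: fail twice with a rate-limit error, then succeed.
state = {"calls": 0}
delays = []


def flaky() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise LLMRateLimitError("429")
    return "ok"


# Record delays instead of actually sleeping.
result = call_with_retries(flaky, sleep=delays.append)
print(result, delays)  # ok [1.0, 2.0]
```

In real code the `call` argument would typically be a closure such as `lambda: adapter.execute_prompt(prompt, config)`.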
Testing
from llm_connect import MockLLMAdapter, RunConfig
mock = MockLLMAdapter(mock_response="Test response")
config = RunConfig()
response = mock.execute_prompt("any prompt", config)
assert response.content == "Test response"
assert mock.call_count == 1
Origin
Extracted from the markitect project.
The markitect.llm module remains a re-export shim pointing here.