# llm-connect

Pluggable LLM adapters for Python and the command line. Supports OpenRouter, Gemini, OpenAI, and the Claude Code CLI out of the box, with a clean abstract interface for adding your own.

## Quick start

```python
from llm_connect import create_adapter, RunConfig

adapter = create_adapter("gemini", model="gemini-2.5-flash")
config = RunConfig(temperature=0.7, max_tokens=1000)
response = adapter.execute_prompt("Summarise the value chain concept.", config)
print(response.content)
```

## Installation

```bash
pip install -e /path/to/llm-connect   # local editable install
# or, once published:
pip install llm-connect
```

**Requires:** Python 3.10+, `toml`

## Providers

| Provider key | Class | Notes |
|---|---|---|
| `"openrouter"` | `OpenRouterAdapter` | OpenAI-compatible endpoint; supports all OpenRouter models |
| `"gemini"` | `GeminiAdapter` | Google Generative Language REST API; supports free tier |

```python
from llm_connect import create_adapter

# OpenRouter
adapter = create_adapter("openrouter", model="anthropic/claude-sonnet-4")

# Gemini (uses GEMINI_API_KEY env var or apikey-geminifree.txt)
adapter = create_adapter("gemini", model="gemini-2.5-flash")

# OpenAI (uses OPENAI_API_KEY env var)
adapter = create_adapter("openai", model="gpt-4.1-mini")

# Claude Code CLI (uses locally installed claude binary)
adapter = create_adapter("claude-code")
```

## API keys

Keys are resolved in this order (first found wins):

1. Explicit `api_key` argument to the constructor
2. Environment variable (e.g. `OPENROUTER_API_KEY`, `GEMINI_API_KEY`, `OPENAI_API_KEY`)
3. Key file in the project root (e.g. `apikey-openrouter.txt`, `apikey-geminifree.txt`)

## Core types

### `RunConfig`

Controls a single LLM call.
```python
from llm_connect import RunConfig

config = RunConfig(
    model_name="gemini-2.5-flash",  # overrides adapter default
    temperature=0.3,
    max_tokens=2000,
    timeout_seconds=60,
)
```

| Field | Default | Description |
|---|---|---|
| `model_name` | `"gpt-4"` | Model identifier (adapter may override) |
| `temperature` | `0.7` | Sampling temperature |
| `max_tokens` | `2000` | Maximum output tokens |
| `model_params` | `{}` | Extra provider-specific parameters |
| `max_depth` | `3` | Max nesting depth for recursive calls |
| `skip_if_exists` | `True` | Skip if identical input hash already processed |
| `timeout_seconds` | `300` | Request timeout |

### `LLMResponse`

Returned by every `execute_prompt` call.

```python
response = adapter.execute_prompt(prompt, config)

print(response.content)        # generated text
print(response.model)          # model actually used
print(response.usage)          # {"prompt_tokens": …, "completion_tokens": …, "total_tokens": …}
print(response.finish_reason)  # "stop", "length", etc.
```

## Writing your own adapter

```python
from llm_connect import LLMAdapter, RunConfig, LLMResponse


class MyAdapter(LLMAdapter):
    def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse:
        # call your API here
        return LLMResponse(content="...", model="my-model")

    def validate_config(self, config: RunConfig) -> bool:
        return True
```

## TOML configuration chain

The `resolve_llm()` function walks a 7-level priority chain to pick a provider and model. This is used by the `llm-helper` integration but is also available standalone:

```python
from llm_connect.toml_config import resolve_llm

resolved = resolve_llm(app_name="myapp")
print(resolved.provider, resolved.model, resolved.provider_source)
```

Priority order (highest first):

1. CLI flags (`cli_provider`, `cli_model` arguments)
2. Env var `{APP_NAME}_HELPER_MODEL` (model only)
3. User preference: `~/.config/{app_name}/config.toml` `[llm.preference]`
4. Directory preference: `.{app_name}.toml` `[llm.preference]`
5. Directory default: `.{app_name}.toml` `[llm.default]`
6. User default: `~/.config/{app_name}/config.toml` `[llm.default]`
7. Hardcoded fallback: `gemini / gemini-2.5-flash`

Example config file (`~/.config/myapp/config.toml`):

```toml
[llm.default]
provider = "gemini"
model = "gemini-2.5-flash"

[llm.preference]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"
```

## Embeddings

```python
from llm_connect import create_embedding_adapter, EmbeddingCache

adapter = create_embedding_adapter("openai", model="text-embedding-3-small")
cache = EmbeddingCache(cache_dir=".embeddings")

# Get embedding (cached after first call)
vec = cache.get_or_compute("my text", lambda t: adapter.embed([t])[0])
```

## Exceptions

```python
from llm_connect.exceptions import (
    LLMError,               # base
    LLMConfigurationError,  # bad key, unknown provider
    LLMAPIError,            # HTTP error from provider (has .status_code)
    LLMRateLimitError,      # 429
    LLMTimeoutError,        # request timed out
    LLMSubprocessError,     # claude CLI failed (has .return_code, .stderr)
)
```

## Testing

```python
from llm_connect import MockLLMAdapter, RunConfig

mock = MockLLMAdapter(mock_response="Test response")
config = RunConfig()
response = mock.execute_prompt("any prompt", config)

assert response.content == "Test response"
assert mock.call_count == 1
```

## Origin

Extracted from the [markitect](https://github.com/worsch/markitect) project. The `markitect.llm` module remains a re-export shim pointing here.
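
## Appendix: priority chain sketch

The priority order listed under "TOML configuration chain" can be pictured as a "first layer that says something wins" lookup. The block below is a minimal sketch of that idea, not the code behind `resolve_llm()`. `Layer`, `Resolved`, `resolve_from_layers`, the layer labels, and the `model_source` field are hypothetical names introduced for illustration, and it assumes provider and model are picked independently (which would explain why an env var that sets only the model can coexist with a provider from a lower level).

```python
from typing import NamedTuple, Optional


class Layer(NamedTuple):
    """One level of the chain (hypothetical type); None means 'not set here'."""
    source: str
    provider: Optional[str] = None
    model: Optional[str] = None


class Resolved(NamedTuple):
    """Result of the lookup (hypothetical type, not the library's own)."""
    provider: str
    model: str
    provider_source: str
    model_source: str


def resolve_from_layers(layers: list[Layer]) -> Resolved:
    """Pick provider and model separately, scanning highest priority first."""
    provider_layer = next((layer for layer in layers if layer.provider is not None), None)
    model_layer = next((layer for layer in layers if layer.model is not None), None)
    if provider_layer is None or model_layer is None:
        raise ValueError("chain is missing a provider or a model")
    return Resolved(
        provider_layer.provider,
        model_layer.model,
        provider_layer.source,
        model_layer.source,
    )


# Layers in the documented order, highest priority first.
layers = [
    Layer("cli"),                          # no CLI flags given
    Layer("env", model="gemini-2.5-pro"),  # env var sets the model only
    Layer("user_preference"),              # not configured
    Layer("directory_preference"),         # not configured
    Layer("directory_default"),            # not configured
    Layer("user_default", provider="gemini", model="gemini-2.5-flash"),
    Layer("fallback", provider="gemini", model="gemini-2.5-flash"),
]

resolved = resolve_from_layers(layers)
print(resolved.provider, resolved.model, resolved.provider_source, resolved.model_source)
# prints: gemini gemini-2.5-pro user_default env
```

In this sketch the env var overrides the model while the provider still comes from the user default. Whether the real chain mixes levels this way is an assumption, so check `resolved.provider_source` when the outcome is surprising.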