
llm-connect

Pluggable LLM adapters for Python and the command line. Supports OpenRouter, Gemini, OpenAI, and the Claude Code CLI out of the box, with a clean abstract interface for adding your own.

Quick start

from llm_connect import create_adapter, RunConfig

adapter = create_adapter("gemini", model="gemini-2.5-flash")
config = RunConfig(temperature=0.7, max_tokens=1000)
response = adapter.execute_prompt("Summarise the value chain concept.", config)
print(response.content)

Installation

pip install -e /path/to/llm-connect     # local editable install
# or, once published:
pip install llm-connect

Requires: Python 3.10+, toml

Providers

| Provider key | Class | Notes |
| --- | --- | --- |
| "openrouter" | OpenRouterAdapter | OpenAI-compatible endpoint; supports all OpenRouter models |
| "gemini" | GeminiAdapter | Google Generative Language REST API; supports free tier |

from llm_connect import create_adapter

# OpenRouter
adapter = create_adapter("openrouter", model="anthropic/claude-sonnet-4")

# Gemini (uses GEMINI_API_KEY env var or apikey-geminifree.txt)
adapter = create_adapter("gemini", model="gemini-2.5-flash")

# OpenAI (uses OPENAI_API_KEY env var)
adapter = create_adapter("openai", model="gpt-4.1-mini")

# Claude Code CLI (uses locally installed claude binary)
adapter = create_adapter("claude-code")

API keys

Keys are resolved in this order (first found wins):

  1. Explicit api_key argument to the constructor
  2. Environment variable (e.g. OPENROUTER_API_KEY, GEMINI_API_KEY, OPENAI_API_KEY)
  3. Key file in the project root (e.g. apikey-openrouter.txt, apikey-geminifree.txt)
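
The resolution order above can be sketched roughly as follows. This is a minimal illustration, not the library's code: `resolve_api_key` and its parameters are hypothetical names, and the real adapters may differ in how they locate the project root or treat empty values.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(
    explicit_key: Optional[str],
    env_var: str,
    key_file: Path,
) -> Optional[str]:
    """Hypothetical helper: return the first key found, or None."""
    # 1. An explicit argument always wins.
    if explicit_key:
        return explicit_key
    # 2. Fall back to the environment variable.
    env_value = os.environ.get(env_var)
    if env_value:
        return env_value
    # 3. Finally, read the key file if it exists and is non-empty.
    if key_file.is_file():
        file_value = key_file.read_text(encoding="utf-8").strip()
        if file_value:
            return file_value
    return None
```

For instance, `resolve_api_key(None, "GEMINI_API_KEY", Path("apikey-geminifree.txt"))` would consult the environment first and only then look at the file.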

Core types

RunConfig

Controls a single LLM call.

from llm_connect import RunConfig

config = RunConfig(
    model_name="gemini-2.5-flash",  # overrides adapter default
    temperature=0.3,
    max_tokens=2000,
    timeout_seconds=60,
)

| Field | Default | Description |
| --- | --- | --- |
| model_name | "gpt-4" | Model identifier (adapter may override) |
| temperature | 0.7 | Sampling temperature |
| max_tokens | 2000 | Maximum output tokens |
| model_params | {} | Extra provider-specific parameters |
| max_depth | 3 | Max nesting depth for recursive calls |
| skip_if_exists | True | Skip if identical input hash already processed |
| timeout_seconds | 300 | Request timeout |

LLMResponse

Returned by every execute_prompt call.

response = adapter.execute_prompt(prompt, config)
print(response.content)       # generated text
print(response.model)         # model actually used
print(response.usage)         # {"prompt_tokens": …, "completion_tokens": …, "total_tokens": …}
print(response.finish_reason) # "stop", "length", etc.

Writing your own adapter

from llm_connect import LLMAdapter, RunConfig, LLMResponse

class MyAdapter(LLMAdapter):
    def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse:
        # call your API here
        return LLMResponse(content="...", model="my-model")

    def validate_config(self, config: RunConfig) -> bool:
        return True

TOML configuration chain

The resolve_llm() function walks a 7-level priority chain to pick a provider and model. This is used by the llm-helper integration but is also available standalone:

from llm_connect.toml_config import resolve_llm

resolved = resolve_llm(app_name="myapp")
print(resolved.provider, resolved.model, resolved.provider_source)

Priority order (highest first):

  1. CLI flags (cli_provider, cli_model arguments)
  2. Env var {APP_NAME}_HELPER_MODEL (model only)
  3. User preference — ~/.config/{app_name}/config.toml [llm.preference]
  4. Directory preference — .{app_name}.toml [llm.preference]
  5. Directory default — .{app_name}.toml [llm.default]
  6. User default — ~/.config/{app_name}/config.toml [llm.default]
  7. Hardcoded fallback — gemini / gemini-2.5-flash

Example config file (~/.config/myapp/config.toml):

[llm.default]
provider = "gemini"
model = "gemini-2.5-flash"

[llm.preference]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"
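
To make the priority order concrete, here is a small self-contained sketch of how such a chain could be walked. It is not the implementation of `resolve_llm()`: `pick_llm`, `_table`, and `FALLBACK` are hypothetical names, the config arguments are plain dicts standing in for already-parsed TOML files, and it assumes provider and model are resolved independently (which the model-only env var level suggests but the source does not confirm).

```python
from typing import Mapping, Optional, Tuple

# Level 7: hardcoded fallback, as listed in the priority order.
FALLBACK = ("gemini", "gemini-2.5-flash")


def _table(config: Optional[Mapping], name: str) -> dict:
    """Return config["llm"][name] as a dict, or {} when absent."""
    return dict((config or {}).get("llm", {}).get(name, {}))


def pick_llm(
    cli_provider: Optional[str] = None,
    cli_model: Optional[str] = None,
    env_model: Optional[str] = None,
    user_config: Optional[Mapping] = None,
    dir_config: Optional[Mapping] = None,
) -> Tuple[str, str]:
    """Hypothetical resolver: the first level that sets a value wins."""
    levels = [
        {"provider": cli_provider, "model": cli_model},   # 1. CLI flags
        {"model": env_model},                             # 2. env var (model only)
        _table(user_config, "preference"),                # 3. user preference
        _table(dir_config, "preference"),                 # 4. directory preference
        _table(dir_config, "default"),                    # 5. directory default
        _table(user_config, "default"),                   # 6. user default
        {"provider": FALLBACK[0], "model": FALLBACK[1]},  # 7. hardcoded fallback
    ]
    # The fallback level guarantees both searches find a value.
    provider = next(lv["provider"] for lv in levels if lv.get("provider"))
    model = next(lv["model"] for lv in levels if lv.get("model"))
    return provider, model
```

With the example config file above passed as `user_config`, this sketch would choose the `[llm.preference]` values over `[llm.default]`, and a model supplied through the env var level would override only the model.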

Embeddings

from llm_connect import create_embedding_adapter, EmbeddingCache

adapter = create_embedding_adapter("openai", model="text-embedding-3-small")
cache = EmbeddingCache(cache_dir=".embeddings")

# Get embedding (cached after first call)
vec = cache.get_or_compute("my text", lambda t: adapter.embed([t])[0])
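
The get-or-compute pattern used above can be illustrated with a tiny stand-in. This sketch is an assumption about the general idea, not the library's `EmbeddingCache`: `TinyEmbeddingCache`, its SHA-256 file naming, and the JSON storage format are all hypothetical, and `fake_embed` replaces a real embedding adapter so the example runs offline.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, List


class TinyEmbeddingCache:
    """Hypothetical stand-in; not the library's EmbeddingCache."""

    def __init__(self, cache_dir: str) -> None:
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get_or_compute(
        self, text: str, compute: Callable[[str], List[float]]
    ) -> List[float]:
        # Key the cache entry on a hash of the input text.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        path = self.cache_dir / f"{key}.json"
        if path.is_file():
            return json.loads(path.read_text(encoding="utf-8"))
        # Cache miss: compute once, then persist for later calls.
        vector = compute(text)
        path.write_text(json.dumps(vector), encoding="utf-8")
        return vector


calls: List[str] = []


def fake_embed(text: str) -> List[float]:
    """Offline stand-in for adapter.embed([text])[0]."""
    calls.append(text)
    return [float(len(text))]


with tempfile.TemporaryDirectory() as tmp:
    cache = TinyEmbeddingCache(tmp)
    first = cache.get_or_compute("my text", fake_embed)
    second = cache.get_or_compute("my text", fake_embed)  # served from disk
```

The second call never reaches `fake_embed`, which is the property that saves API calls when the same text is embedded repeatedly.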

Exceptions

from llm_connect.exceptions import (
    LLMError,               # base
    LLMConfigurationError,  # bad key, unknown provider
    LLMAPIError,            # HTTP error from provider (has .status_code)
    LLMRateLimitError,      # 429
    LLMTimeoutError,        # request timed out
    LLMSubprocessError,     # claude CLI failed (has .return_code, .stderr)
)
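
One way these exceptions might be used is to retry only on rate limits while letting every other error propagate. The sketch below is illustrative: `call_with_retry` is a hypothetical helper, not part of llm-connect, and the fallback classes exist only so the example runs without the package installed (they assume `LLMRateLimitError` derives from `LLMError`, as the "base" comment above suggests).

```python
import time
from typing import Callable, TypeVar

try:
    from llm_connect.exceptions import LLMError, LLMRateLimitError
except ImportError:
    # Stand-ins so the sketch runs without llm-connect installed.
    class LLMError(Exception):
        pass

    class LLMRateLimitError(LLMError):
        pass

T = TypeVar("T")


def call_with_retry(
    call: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: retry on rate limits with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except LLMRateLimitError:
            # Give up after the final attempt; other errors are not caught.
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")
```

A call site might look like `call_with_retry(lambda: adapter.execute_prompt(prompt, config))`; the injectable `sleep` parameter is there to keep tests fast.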

Testing

from llm_connect import MockLLMAdapter, RunConfig

mock = MockLLMAdapter(mock_response="Test response")
config = RunConfig()
response = mock.execute_prompt("any prompt", config)
assert response.content == "Test response"
assert mock.call_count == 1

Origin

Extracted from the markitect project. The markitect.llm module remains a re-export shim pointing here.
