coulomb/llm-connect

generated from coulomb/repo-seed

tegwick 79c899b694

Capture llm-connect lessons from CUST-WP-0045 canary as ADHOC-2026-06-02

The 2026-06-02 daily-triage canary debugging session uncovered five real
bugs (commits 9de0f49, 435da49, cd4551c, 583ab57, 1b01f0e), mostly because
llm-connect has no way to see what payload the adapter sent or what the
provider returned. Capture the six structural improvements that would
collapse the next diagnosis of this shape from half a day to minutes:

  T01 — LLM_CONNECT_DEBUG envelope mode for /execute responses
  T02 — ThreadingHTTPServer drop-in replacement for stdlib HTTPServer
  T03 — Per-call audit log + replay CLI (LLM_CONNECT_AUDIT_DIR)
  T04 — Apply param-translation contract to OpenAI and Gemini adapters
  T05 — Provider-agnostic structured-output smoke test in CI
  T06 — Document the model_params translation contract for adapter authors

All six registered in the State Hub under workstream
adhoc-llmc-2026-06-02 (1c936c91-79c7-427d-ab37-9052e8a61cda).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-06-02 15:55:42 +02:00

README.md

llm-connect

Pluggable LLM adapters for Python and the command line. Supports OpenRouter, Gemini, OpenAI, and the Claude Code CLI out of the box, with a clean abstract interface for adding your own.

Quick start

from llm_connect import create_adapter, RunConfig

adapter = create_adapter("gemini", model="gemini-2.5-flash")
config = RunConfig(temperature=0.7, max_tokens=1000)
response = adapter.execute_prompt("Summarise the value chain concept.", config)
print(response.content)

Installation

pip install -e /path/to/llm-connect     # local editable install
# or, once published:
pip install llm-connect

Requires: Python 3.10+, toml

Providers

Provider key	Class	Notes
`"openrouter"`	`OpenRouterAdapter`	OpenAI-compatible endpoint; supports all OpenRouter models
`"gemini"`	`GeminiAdapter`	Google Generative Language REST API; supports free tier

from llm_connect import create_adapter

# OpenRouter
adapter = create_adapter("openrouter", model="anthropic/claude-sonnet-4")

# Gemini (uses GEMINI_API_KEY env var or apikey-geminifree.txt)
adapter = create_adapter("gemini", model="gemini-2.5-flash")

# OpenAI (uses OPENAI_API_KEY env var)
adapter = create_adapter("openai", model="gpt-4.1-mini")

# Claude Code CLI (uses locally installed claude binary)
adapter = create_adapter("claude-code")

API keys

Keys are resolved in this order (first found wins):

1. Explicit `api_key` argument to the constructor
2. Environment variable (e.g. `OPENROUTER_API_KEY`, `GEMINI_API_KEY`, `OPENAI_API_KEY`)
3. Key file in the project root (e.g. `apikey-openrouter.txt`, `apikey-geminifree.txt`)
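
The resolution order above can be sketched as a small standalone function. This is an illustration only, not the library's code: `resolve_api_key` and its parameters are hypothetical names, and treating empty values as "not found" is an assumption.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(
    explicit_key: Optional[str],
    env_var: str,
    key_file: Path,
) -> Optional[str]:
    """Hypothetical helper: return the first key found, or None."""
    # 1. An explicit argument always wins.
    if explicit_key:
        return explicit_key
    # 2. Fall back to the environment variable, ignoring empty values.
    env_value = os.environ.get(env_var, "").strip()
    if env_value:
        return env_value
    # 3. Finally, read the key file if it exists and is non-empty.
    if key_file.is_file():
        file_value = key_file.read_text(encoding="utf-8").strip()
        if file_value:
            return file_value
    return None
```

For example, `resolve_api_key(None, "GEMINI_API_KEY", Path("apikey-geminifree.txt"))` would consult the environment first and only then the key file.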

Core types

`RunConfig`

Controls a single LLM call.

from llm_connect import RunConfig

config = RunConfig(
    model_name="gemini-2.5-flash",  # overrides adapter default
    temperature=0.3,
    max_tokens=2000,
    timeout_seconds=60,
)

Field	Default	Description
`model_name`	`"gpt-4"`	Model identifier (adapter may override)
`temperature`	`0.7`	Sampling temperature
`max_tokens`	`2000`	Maximum output tokens
`model_params`	`{}`	Extra provider-specific parameters
`max_depth`	`3`	Max nesting depth for recursive calls
`skip_if_exists`	`True`	Skip if identical input hash already processed
`timeout_seconds`	`300`	Request timeout

`LLMResponse`

Returned by every execute_prompt call.

response = adapter.execute_prompt(prompt, config)
print(response.content)       # generated text
print(response.model)         # model actually used
print(response.usage)         # {"prompt_tokens": …, "completion_tokens": …, "total_tokens": …}
print(response.finish_reason) # "stop", "length", etc.

Writing your own adapter

from llm_connect import LLMAdapter, RunConfig, LLMResponse

class MyAdapter(LLMAdapter):
    def execute_prompt(self, prompt: str, config: RunConfig) -> LLMResponse:
        # call your API here
        return LLMResponse(content="...", model="my-model")

    def validate_config(self, config: RunConfig) -> bool:
        return True

TOML configuration chain

The `resolve_llm()` function walks a seven-level priority chain to pick a provider and model. The llm-helper integration uses it, but it also works standalone:

from llm_connect.toml_config import resolve_llm

resolved = resolve_llm(app_name="myapp")
print(resolved.provider, resolved.model, resolved.provider_source)

Priority order (highest first):

1. CLI flags (`cli_provider`, `cli_model` arguments)
2. Env var `{APP_NAME}_HELPER_MODEL` (model only)
3. User preference: `~/.config/{app_name}/config.toml` `[llm.preference]`
4. Directory preference: `.{app_name}.toml` `[llm.preference]`
5. Directory default: `.{app_name}.toml` `[llm.default]`
6. User default: `~/.config/{app_name}/config.toml` `[llm.default]`
7. Hardcoded fallback: `gemini` / `gemini-2.5-flash`

Example config file (~/.config/myapp/config.toml):

[llm.default]
provider = "gemini"
model = "gemini-2.5-flash"

[llm.preference]
provider = "openrouter"
model = "anthropic/claude-sonnet-4"
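
To make the "highest priority first" rule concrete, here is a minimal sketch of a first-match lookup over ordered levels. It is not the implementation of `resolve_llm()`: `pick_first`, the level names, and the assumption that provider and model are resolved independently are all hypothetical. The level contents mirror the example config file, where only the user preference and user default are set.

```python
from typing import Mapping, Optional, Sequence, Tuple

Level = Tuple[str, Mapping[str, Optional[str]]]


def pick_first(levels: Sequence[Level], field: str, fallback: str) -> Tuple[str, str]:
    """Return (value, source_name) from the first level that sets `field`."""
    for source_name, values in levels:
        value = values.get(field)
        if value:
            return value, source_name
    # No level set the field, so use the hardcoded fallback.
    return fallback, "fallback"


# Levels in priority order (highest first); empty dicts mean "not configured".
levels = [
    ("cli", {}),
    ("env", {}),
    ("user_preference", {"provider": "openrouter", "model": "anthropic/claude-sonnet-4"}),
    ("directory_preference", {}),
    ("directory_default", {}),
    ("user_default", {"provider": "gemini", "model": "gemini-2.5-flash"}),
]

provider, provider_source = pick_first(levels, "provider", "gemini")
model, model_source = pick_first(levels, "model", "gemini-2.5-flash")
print(provider, model, provider_source)
```

With these inputs the preference section outranks the default section, so the sketch selects the OpenRouter entry rather than the Gemini default.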

Embeddings

from llm_connect import create_embedding_adapter, EmbeddingCache

adapter = create_embedding_adapter("openai", model="text-embedding-3-small")
cache = EmbeddingCache(cache_dir=".embeddings")

# Get embedding (cached after first call)
vec = cache.get_or_compute("my text", lambda t: adapter.embed([t])[0])

Exceptions

from llm_connect.exceptions import (
    LLMError,             # base
    LLMConfigurationError,# bad key, unknown provider
    LLMAPIError,          # HTTP error from provider (has .status_code)
    LLMRateLimitError,    # 429
    LLMTimeoutError,      # request timed out
    LLMSubprocessError,   # claude CLI failed (has .return_code, .stderr)
)
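
One way to use this hierarchy is to retry only on transient failures. The sketch below is a generic exponential-backoff helper, not part of llm-connect: `call_with_retry` and its parameters are hypothetical, and the exception types to retry on are passed in so the block stays self-contained.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: call fn, retrying on the given exception types."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == attempts - 1:
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

In practice this might be called as `call_with_retry(lambda: adapter.execute_prompt(prompt, config), retry_on=(LLMRateLimitError, LLMTimeoutError))`, leaving `LLMConfigurationError` to fail immediately since retrying would not help.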

Testing

from llm_connect import MockLLMAdapter, RunConfig

mock = MockLLMAdapter(mock_response="Test response")
config = RunConfig()
response = mock.execute_prompt("any prompt", config)
assert response.content == "Test response"
assert mock.call_count == 1

Origin

Extracted from the markitect project. The markitect.llm module remains a re-export shim pointing here.