coulomb/markitect-main

tegwick eaf4a955af

docs(roadmap): add workplan for extracting llm module as shared library

3-stage plan: decouple (RunConfig/LLMResponse move + app name
parameterization) → extract to standalone package → adopt in first
consumer. Registered as workstream in Custodian State Hub.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-24 21:51:54 +01:00

LLM Adapter Layer — Extract as Shared Library

Vision

The markitect.llm module is a clean, stdlib-only adapter layer for calling LLMs via OpenRouter, Gemini, OpenAI, and the Claude Code CLI. It implements a uniform interface, a 7-layer TOML config chain, embedding support with caching, and typed exceptions. It should be usable by all projects in the Bernd Worsch ecosystem without pulling in all of markitect.

This roadmap tracks extracting it into a standalone installable library.

Current State

The module lives at markitect/llm/ (~16 files, ~1500 LOC, stdlib-only) and provides:

4 text adapters: OpenRouter, Gemini, OpenAI, Claude Code CLI
2 embedding adapters: OpenAI-compatible (OpenAI + OpenRouter)
Embedding cache: JSON-backed, content-digest validated
Similarity utilities: pure-Python cosine similarity, matrix, pair-finding
7-layer TOML config chain: CLI > env > user/dir preference/default > hardcoded
Typed exceptions: LLMError hierarchy
HTTP wrapper: urllib-only, typed exception translation
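As an illustration of what "pure-Python" similarity utilities can look like without NumPy, here is a minimal sketch of cosine similarity and pair-finding. The function names `cosine_similarity` and `similar_pairs` and their signatures are assumptions for this example, not the module's confirmed API.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (stdlib only)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def similar_pairs(
    vectors: Dict[str, Sequence[float]], threshold: float
) -> List[Tuple[str, str, float]]:
    """Return all key pairs whose similarity is >= threshold, best first."""
    pairs = []
    for (key_a, vec_a), (key_b, vec_b) in combinations(vectors.items(), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((key_a, key_b, score))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)
```

The pairwise loop is quadratic in the number of vectors, which is usually acceptable for the small collections a stdlib-only helper is likely to handle.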

Two Coupling Issues Blocking Clean Extraction

| Issue | Location | Severity |
|---|---|---|
| `RunConfig` and `LLMResponse` are defined in `markitect.prompts.execution.models`, not in `markitect.llm` | `markitect/prompts/execution/models.py` | High: creates a cross-module import for all consumers |
| TOML config chain hardcodes `"markitect"` as app name (paths: `~/.config/markitect/`, env prefix `MARKITECT_`, files: `.markitect.toml`) | `markitect/llm/toml_config.py` | Medium: consumers either accept markitect config or can't use the chain |

Terminology

adapter: concrete implementation of LLMAdapter for a single provider
factory: create_adapter() / create_embedding_adapter() — provider-agnostic entry points
config chain: 7-layer resolution of provider + model (CLI → env → TOML → hardcoded)
standalone library: a Python package installable with pip install from a git URL or local path, without PyPI
consumer: any project that imports and uses the library (markitect itself, custodian, railiance, etc.)
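The config chain idea (the first layer that supplies a value wins) can be sketched as follows. This is a simplified illustration, not the actual `resolve_llm()` implementation: the function name `resolve_model`, the TOML key `"model"`, and the fallback value are hypothetical, and the TOML layers are passed in as already-parsed dictionaries.

```python
from typing import Mapping, Optional, Sequence

# Hypothetical last-resort default; the real hardcoded value is not given in this roadmap.
HARDCODED_MODEL = "example/default-model"


def resolve_model(
    cli_value: Optional[str],
    env: Mapping[str, str],
    toml_layers: Sequence[Mapping[str, str]],
    env_var: str = "MARKITECT_HELPER_MODEL",
) -> str:
    """Walk the layers from highest to lowest priority and return the first hit."""
    candidates = [cli_value, env.get(env_var)]
    # TOML layers are expected in priority order (highest first).
    candidates.extend(layer.get("model") for layer in toml_layers)
    for candidate in candidates:
        if candidate:
            return candidate
    return HARDCODED_MODEL
```

Passing `os.environ` as `env` gives the usual behaviour, while a plain dictionary keeps the resolution logic easy to test.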

Packaging Decision (Pending)

Before Stage 2 starts, one architectural decision must be resolved:

D1: Where does the extracted library live?

Option A — Standalone repo (~/bw-llm or similar):

Clean separation, versioned independently, installable via pip install git+file:///... or git URL

Adds a repo to maintain; changes require bumping version in dependents

Option B — Subfolder of markitect with own pyproject.toml (monorepo-lite):

Stays co-located with the main codebase that will use it most

Less friction for iteration; single git history

Slightly unorthodox but valid for personal infrastructure

Option C — Just pip install markitect in other projects:

Zero extraction work; reuse today

Pulls all of markitect (prompts, infospace, CLI, etc.) as transitive deps

Acceptable short-term if other projects are small

Stages

Stage 1 — Decouple (within markitect)

Prepare the module for extraction without changing its public API.

S1.1 — Move RunConfig + LLMResponse into markitect.llm

RunConfig and LLMResponse are currently in markitect.prompts.execution.models. The LLM adapters import from there, creating a hard dependency on the prompt system.

Work:

Move both dataclasses to markitect/llm/models.py
Update all imports in markitect.llm and markitect.prompts
Keep a re-export shim in markitect.prompts.execution.models for backward compatibility

Acceptance: markitect/llm/ has zero imports from markitect.prompts.*
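A rough sketch of how the move and the shim might look. The dataclass fields shown here are hypothetical placeholders, since this roadmap does not list the real fields; the shim itself lives in a separate file, so it appears as a comment.

```python
# markitect/llm/models.py (new home of the dataclasses)
# NOTE: the field names below are illustrative assumptions, not the real definitions.
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunConfig:
    model: str
    temperature: float = 0.0
    max_tokens: Optional[int] = None


@dataclass
class LLMResponse:
    text: str
    model: str


# markitect/prompts/execution/models.py (backward-compatibility shim)
# Existing imports from the old location keep working because the names
# are re-exported from the new module:
#
#     from markitect.llm.models import LLMResponse, RunConfig  # noqa: F401
#
#     __all__ = ["LLMResponse", "RunConfig"]
```

With this layout the dependency arrow points from `markitect.prompts` to `markitect.llm` only, which is what the acceptance criterion requires.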

S1.2 — Parameterize the TOML config chain

Replace the hardcoded "markitect" app name with a configurable app_name parameter.

Work:

Add app_name: str = "markitect" parameter to resolve_llm() and the config path helpers in toml_config.py
Derive config file path (~/.config/{app_name}/config.toml), env prefix ({APP_NAME}_HELPER_MODEL), and local config file (.{app_name}.toml) from it
All existing behaviour is preserved when app_name="markitect" (default)

Acceptance: A consumer can call resolve_llm(app_name="railiance") and get config from ~/.config/railiance/config.toml and RAILIANCE_HELPER_MODEL.
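The derivation described above could be sketched like this. `ConfigLocations` and `config_locations` are hypothetical helper names for illustration; the real helpers in `toml_config.py` may be structured differently, and the handling of hyphens in the env prefix is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class ConfigLocations:
    user_config: Path    # ~/.config/{app_name}/config.toml
    local_config: Path   # ./.{app_name}.toml
    env_model_var: str   # {APP_NAME}_HELPER_MODEL


def config_locations(
    app_name: str = "markitect",
    home: Optional[Path] = None,
    cwd: Optional[Path] = None,
) -> ConfigLocations:
    """Derive all app-specific config locations from a single app name."""
    home = home or Path.home()
    cwd = cwd or Path.cwd()
    # Assumption: hyphens are mapped to underscores so the env var name stays valid.
    env_prefix = app_name.upper().replace("-", "_")
    return ConfigLocations(
        user_config=home / ".config" / app_name / "config.toml",
        local_config=cwd / f".{app_name}.toml",
        env_model_var=f"{env_prefix}_HELPER_MODEL",
    )
```

Calling `config_locations()` with no arguments yields the current markitect locations, so existing callers would not need to change.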

S1.3 — Isolation tests

Write a test file that imports only from markitect.llm.* and verifies no accidental coupling remains.

Acceptance: pytest tests/test_llm_isolation.py passes; no import of markitect.prompts or markitect.infospace in the LLM module tree.
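One possible way to check for accidental coupling is a static scan of the import statements using the stdlib `ast` module. The sketch below is an assumption about how such a test could be built, not the planned contents of `tests/test_llm_isolation.py`; `find_forbidden_imports` is a hypothetical helper.

```python
import ast
from pathlib import Path
from typing import List, Sequence, Tuple

FORBIDDEN = ("markitect.prompts", "markitect.infospace")


def imported_modules(source: str) -> List[str]:
    """Collect absolute module names from import statements in the source."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped in this sketch.
            names.append(node.module)
    return names


def find_forbidden_imports(
    root: Path, forbidden: Sequence[str] = FORBIDDEN
) -> List[Tuple[str, str]]:
    """Return (file, module) pairs for every forbidden import under root."""
    hits = []
    for path in sorted(root.rglob("*.py")):
        for name in imported_modules(path.read_text(encoding="utf-8")):
            if any(name == f or name.startswith(f + ".") for f in forbidden):
                hits.append((str(path), name))
    return hits
```

A pytest test could then assert that `find_forbidden_imports(Path("markitect/llm"))` returns an empty list. A static scan does not catch imports performed dynamically (for example via `importlib`), so it complements rather than replaces an actual import of the package in a clean environment.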

Stage 2 — Extract

S2.1 — Resolve D1: packaging location

Record the decision and create the package scaffold.

Acceptance: D1 resolved, pyproject.toml for the library exists at the chosen location with name, version 0.1.0, and declared dependencies.

S2.2 — Create standalone package

Move (or symlink) the llm module into the new package structure. Wire up the pyproject.toml entry points. Verify pip install -e <path> works.

Files to carry over:

llm/
  __init__.py          # re-exports: create_adapter, create_embedding_adapter,
                       #   LLMAdapter, EmbeddingAdapter, LLMConfig, exceptions
  models.py            # RunConfig, LLMResponse (moved from S1.1)
  config.py            # load_config, resolve_api_key
  toml_config.py       # resolve_llm (parameterized from S1.2)
  factory.py           # create_adapter
  exceptions.py        # LLM exception hierarchy
  openrouter.py
  claude_code.py
  gemini.py
  openai.py
  embedding_adapter.py
  embedding_openai.py
  embedding_factory.py # create_embedding_adapter
  embedding_cache.py
  similarity.py
  _http.py
  _token_estimator.py

Acceptance: python -c "from bw_llm import create_adapter; print('ok')" works in a fresh venv with only the new package installed.

S2.3 — Update markitect to depend on extracted package

Replace markitect/llm/ with an import alias pointing to the new package, or add the package as a path dependency in markitect's pyproject.toml.

Acceptance: All markitect tests pass; markitect/llm/__init__.py is either removed or becomes a thin re-export of bw_llm.

S2.4 — Integration smoke test

Run the full markitect infospace pipeline (entity extraction + evaluation) end-to-end against a small fixture to confirm nothing broke.

Acceptance: markitect infospace evaluate --dry-run succeeds on a 3-entity fixture.

Stage 3 — Adopt in First Consumer

S3.1 — Integrate in one other project

Pick the first real consumer (likely the custodian state-hub, for LLM-assisted state summaries or decision rationale generation) and wire up the library.

Work:

Add bw-llm (or equivalent) as a dependency
Write a small usage example (e.g., llm_helper.py)
Confirm config chain works with the consumer's own app name

S3.2 — Usage guide

Write README.md for the library covering:

Installation (local path / git URL)
Supported providers and env vars
TOML config file locations and format
create_adapter() / create_embedding_adapter() quick-start
Error handling

Acceptance: Another developer (or agent) can follow the README to use the library in a new project without reading source code.

Stage Summary

| Stage | Description | Key Deliverable | Blocks |
|---|---|---|---|
| S1.1 | Move RunConfig/LLMResponse to llm | Zero cross-module deps | S2.2 |
| S1.2 | Parameterize app name | Configurable config chain | S2.2 |
| S1.3 | Isolation tests | Green test suite | S2.1 |
| S2.1 | Resolve packaging decision (D1) | pyproject.toml scaffold | S2.2 |
| S2.2 | Create standalone package | `pip install` works | S2.3 |
| S2.3 | Update markitect | markitect uses extracted lib | S2.4 |
| S2.4 | Integration smoke test | Full pipeline passes | S3.1 |
| S3.1 | First consumer integration | Library used in real project | S3.2 |
| S3.2 | Usage guide | README published | (none) |

Out of Scope

Publishing to PyPI (unnecessary for personal infrastructure; git/local installs suffice)
Adding new LLM providers (separate concern)
Porting the helper CLI to the library (the CLI is markitect-specific)
Async adapters (current sync interface is sufficient; can be added later)
