The adapter compared self._model to _DEFAULT_MODEL ("anthropic/claude-sonnet-4")
to decide whether to honour the constructor's model. When a caller passes
that exact value via --model, the comparison treats it as "not specified"
and falls through to RunConfig.model_name, which defaults to "gpt-4". So
every llm-connect call started with --provider openrouter --model
anthropic/claude-sonnet-4 actually landed on OpenAI's gpt-4 — and on
gpt-4 OpenAI's structured-output response_format requires a model with
schema support that gpt-4 lacks, returning 400. The CUST-WP-0045 canary
hit this for hours; the smoke probes that worked were the ones with no
json_schema, where gpt-4 returned fine.
Track _explicit_model separately so a constructor or LLMConfig that
matches the default is still treated as a real intent.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous strict=True default rejected the activity-core daily-triage
schema (and most real-world application schemas) because OpenAI strict
mode requires additionalProperties:false on every object and every
property in the required list. Application-supplied schemas typically
do not meet that bar — adding additionalProperties recursively at the
adapter would be surprising and may break callers that rely on extra
fields. Flipping strict to False keeps the schema as a soft constraint;
the model still produces structured output and the activity-core
canary's 400 from OpenRouter goes away.
Callers who need strict enforcement can pass response_format directly
via model_params, where the adapter's pass-through handling preserves
the strict flag they set.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The adapter previously did a blind payload.update(config.model_params).
For callers like activity-core that pass reasoning_effort, max_depth,
and json_schema (Claude / llm-connect-specific fields), those leaked
into the OpenAI Chat Completions request body and OpenRouter rejected
the whole call with HTTP 400. CUST-WP-0045 canary on 2026-06-02 hit
this — manual repro confirmed: same prompt with no model_params returns
a clean 10-recommendation WSJF report in 4.5s; with model_params
included, every call 400s.
Replace the merge with a whitelist + translation step:
- pass-through known OpenAI Chat Completions fields (top_p, stop, seed,
tools, response_format, etc.)
- translate json_schema into the proper response_format wrapper
({type:"json_schema", json_schema:{name,schema,strict}})
- drop documented non-OpenAI fields (reasoning_effort, max_depth) so
the payload stays valid
- silently drop unknown keys rather than risk another 400
The same pattern will need to apply to the OpenAI and Gemini adapters
when their callers start passing provider-specific keys — left as
follow-up rather than speculative refactoring.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The first CUST-WP-0045 canary retry after 9de0f49 still failed schema
validation with `Expecting value: line 1 column 1 (char 0)`. The original
allowlist returned envelope.result verbatim, which on longer prompts
carries the model's conversational preamble ("Triage report generated
and returned via structured output. Key signals: ..."), not the
schema-enforced JSON. The actual structured payload lives in a different
envelope field whose name varies across CLI versions.
Make the unwrap order-aware:
1. Scan envelope fields and return the first one whose value parses as
JSON (dict, list, or a string that loads cleanly). Skip well-known
metadata keys (type, usage, total_cost_usd, etc.) so telemetry can
never be mistaken for the model payload.
2. Fall back to the original text-field allowlist only when no field
carries JSON, so non-schema callers via this same code path still
see the model's prose.
3. Surface the raw envelope as last resort.
This is robust against unknown envelope shapes — as long as the schema-
enforced JSON appears somewhere in a non-metadata field, the adapter
will find it.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The Claude Code adapter previously passed --json-schema alone. On Claude
CLI 2.1.160 that combination still emits the model's conversational
preamble on stdout while the schema-enforced structured payload ships on
a sidecar channel the adapter cannot read. Result: callers requesting
structured output got prose that fails JSON parsing downstream — exactly
the failure mode the activity-core CUST-WP-0045 daily triage canary hit
on 2026-06-01 ("Triage report generated and returned via structured
output. Key signals:..." → json.loads error at column 1).
Fix: when --json-schema is set, also pass --output-format json. The CLI
then writes a JSON envelope on stdout. The adapter unwraps it by
probing a small allowlist of known text-bearing fields (result,
result_text, content, text, output). Unknown envelope shapes fall
through to raw stdout so the operator can introspect the structure and
extend the allowlist.
The unwrap path is only triggered when --json-schema was set, so non-
schema callers keep the existing raw-stdout behavior.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drafted workplan to move two consumer-side concerns into llm-connect:
- ModelRateRegistry: per-model USD-per-1k rates with provenance, a
property of the base model, not the application.
- ProblemClass token estimators: generic shapes (chunk-summarization,
entity-extraction, relation-extraction, judge-eval, report-synthesis)
with base dimensions + tunable params; consumer supplies the shape
of its problem and gets a TokenEstimate before any call.
Demand signal: the 2026-05-18 infospace-bench Lefevre Chapter-I smoke
ran 32 calls / 28k tokens / 0.009 USD actual against a planned 8.40
USD — the 1000x variance was entirely consumer-side because there is
no rate table in llm-connect to delegate to.
Three new modules (rates.py, costs.py, problem_classes.py), eight
tasks, registered as workstream 869196c5-551b-4eef-b8d8-cca6f770a9b0
under the custodian topic. A follow-on consumer workplan in
infospace-bench will migrate plan_generation_summary to delegate once
T01-T04 land here.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Draft the workplan that extends the static RoutingPolicy (WP-0003) with
a quality observation ledger, a BaselineGrader (ClaudeCodeAdapter as the
default oracle), an AdaptiveRoutingPolicy that picks the cheapest
adapter clearing a per-task quality floor, and a sampled
ShadowingAdapter for production observation collection.
Scope is explicit: ship primitives only. Task-type taxonomy, quality
thresholds, baseline choice, and re-grading cadence stay with the
consumer. infospace-bench is the named first consumer; consumer wiring
deferred until T01-T03 land.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Remove redundant async_execute_prompt overrides from OpenAI/Gemini/OpenRouter
adapters (identical to base class default — asyncio import also removed)
- Cache prompt.split() result in MockLLMAdapter to avoid double evaluation
- Promote deferred LLMBudgetExceededError imports to module level in
models.py and adapter.py (no circular dependency)
- Auto-populate context dict in LLMBudgetExceededError.__init__ so callers
need not pass redundant context= kwarg
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Registers llm-connect with the Custodian agent system:
- CLAUDE.md: thin @-import index pointing to modular rules
- .claude/rules/session-protocol.md: orient with get_domain_summary("custodian")
- .claude/rules/repo-identity.md: domain=custodian, slug=llm-connect
- .claude/rules/first-session.md, workplan-convention.md, stack-and-commands.md,
architecture.md, repo-boundary.md, agents.md, scope.md (stubs to fill in)
- session-protocol notes both local (:8000) and CoulombCore bridge (:18000) URLs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds uv lockfile and .venv. Only runtime dep is toml; pytest added as dev dep.
Service-level dependencies (OpenAI, Gemini, Anthropic, OpenRouter APIs) are
tracked separately via the state-hub capability/service dependency system.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Copy markitect.llm module into standalone llm_connect package.
All markitect.* imports replaced with llm_connect.* equivalents.
LLMError base class inlined (no markitect.exceptions dependency).
Verified: from llm_connect import create_adapter works.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>