Record daily triage schema canary blocker

2026-05-21 03:19:27 +02:00
parent ed6a13c8d7
commit a28deec772
3 changed files with 114 additions and 6 deletions


@@ -10,7 +10,7 @@ topic_slug: custodian
planning_priority: high
planning_order: 45
created: "2026-05-19"
updated: "2026-05-19"
updated: "2026-05-21"
state_hub_workstream_id: "d9d9a3ec-f736-4041-beac-bb92c7ad314e"
---
@@ -421,6 +421,81 @@ Verification:
`PYTHONPATH=. uv run pytest -q`:
173 passed
## Implementation Notes - 2026-05-21
T06 remains in progress; no cutover was performed and the Codex automation must
remain the fallback runner. The daily triage ActivityDefinition is still
`enabled: false`.
Real llm-connect canary attempt 1 reached the activity-core workflow but failed
before report persistence:
- workflow id:
`activity-6fca51fa-387a-4fd0-bc4e-d62c29eb859a:manual-d0317873-5e09-4849-a57a-6edff7fada2c`
- Temporal status: `COMPLETED`
- activity-core run id: `9b8486b5-0495-5d3f-8b7b-dc078a7c097b`
- worker evidence: llm-connect returned HTTP 200 twice, but activity-core
rejected the instruction output as invalid JSON
- persistence evidence: no working-memory note and no State Hub
`daily_triage` progress event were written
Diagnosis showed that server-mode llm-connect was resolving the older
`/usr/bin/claude` CLI instead of the working user install at
`/home/worsch/.local/bin/claude`. A direct llm-connect probe through the older
CLI returned the literal content `Execution error`, while the user install could
return raw JSON. Restarting llm-connect with the user CLI path made a small
probe return `{"ok": true}` through the HTTP boundary.
Real llm-connect canary attempt 2 used the working Claude CLI path but still did
not produce a persisted report:
- workflow id:
`activity-6fca51fa-387a-4fd0-bc4e-d62c29eb859a:manual-2de56ad6-0f82-48f0-8184-f357bd22f658`
- Temporal status: `COMPLETED`
- activity-core run id: `953a1f46-e57b-58e1-b4a2-2e41e804a972`
- worker evidence: first llm-connect call returned HTTP 200, then activity-core
retried because the output was not schema-valid JSON; the retry returned
HTTP 500
- persistence evidence: no working-memory note and no State Hub
`daily_triage` progress event were written
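The validate-then-retry behaviour described in the worker evidence can be sketched roughly as follows. This is a minimal illustration, not activity-core's actual executor: `parse_report`, `run_with_retry`, the `call_model` callable, and the `summary` key are hypothetical names, and the check only covers "is a JSON object with the required keys" rather than full JSON Schema validation.

```python
import json


def parse_report(raw, required_keys):
    """Return the parsed object, or None if `raw` is not a JSON object
    containing every key in `required_keys` (hypothetical helper)."""
    try:
        data = json.loads(raw)
    except (TypeError, ValueError):  # JSONDecodeError subclasses ValueError
        return None
    if not isinstance(data, dict):
        return None
    if any(key not in data for key in required_keys):
        return None
    return data


def run_with_retry(call_model, required_keys, max_attempts=2):
    """Call the model up to `max_attempts` times and return the first
    valid report, or None so the caller skips persistence entirely."""
    for _ in range(max_attempts):
        report = parse_report(call_model(), required_keys)
        if report is not None:
            return report
    return None


# A literal error string, as returned through the older CLI, is rejected.
print(parse_report("Execution error", ["summary"]))
```

Under this sketch, a run where every attempt yields invalid output returns `None`, which matches the observed outcome: no working-memory note and no progress event were written.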
The follow-up fix keeps the existing activity-core/llm-connect boundary:
- activity-core now loads an instruction's existing `output_schema` and forwards
that schema to llm-connect as `model_params.json_schema`
- llm-connect's Claude Code adapter now prefers
`LLM_CONNECT_CLAUDE_CLI_PATH`, `CLAUDE_CLI_PATH`, or the user-local
`/home/worsch/.local/bin/claude` before falling back to `claude`
- llm-connect's Claude Code adapter maps `model_params.json_schema` to the
native Claude CLI `--json-schema` option
- the Custodian ActivityDefinition now points at the domain-owned absolute
schema path `/home/worsch/the-custodian/schemas/daily-triage-report.json`
and requests JSON in the prompt only as a fallback
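The CLI lookup order and the schema mapping listed above might look something like the sketch below. It is an illustration under stated assumptions, not the adapter's real code: the function names are hypothetical, the environment variables are assumed to win without an existence check, and passing the schema to `--json-schema` as an inline JSON string is an assumption about the value format.

```python
import json
import os

# User-local install named in the notes above.
USER_LOCAL_CLAUDE = "/home/worsch/.local/bin/claude"


def resolve_claude_cli(env=None, user_local=USER_LOCAL_CLAUDE):
    """Pick the Claude CLI in the documented order of preference
    (hypothetical helper)."""
    env = os.environ if env is None else env
    for var in ("LLM_CONNECT_CLAUDE_CLI_PATH", "CLAUDE_CLI_PATH"):
        candidate = env.get(var)
        if candidate:  # assumption: an explicit override is trusted as-is
            return candidate
    if os.path.isfile(user_local) and os.access(user_local, os.X_OK):
        return user_local
    return "claude"  # bare name, resolved through PATH at spawn time


def schema_cli_args(model_params):
    """Translate `model_params.json_schema` into CLI arguments.
    Assumption: the option accepts the schema as an inline JSON string."""
    schema = model_params.get("json_schema")
    if schema is None:
        return []
    return ["--json-schema", json.dumps(schema)]


print(resolve_claude_cli(env={"CLAUDE_CLI_PATH": "/opt/claude"}))
print(schema_cli_args({"json_schema": {"type": "object"}}))
```

Putting the explicit overrides first means a server-mode process can be pinned to a known-good binary without relying on its `PATH`, which is the failure mode found in attempt 1.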
The patched schema probe could not be completed because the local Claude Code
session limit was reached; the CLI reported:
`You've hit your session limit · resets 3:40am (Europe/Berlin)`.
The next T06 step, once the limit resets or llm-connect routes this profile to
another approved provider, is to rerun the manual trigger with the patched
schema path and verify all three evidence surfaces before pausing Codex or
enabling the activity-core schedule.
Verification:
- activity-core focused executor tests:
`uv run pytest tests/rules/test_executor.py -q`:
22 passed
- llm-connect focused Claude Code/factory tests:
`PYTHONPATH=. uv run pytest tests/test_claude_code.py tests/test_factory.py -q`:
18 passed
- activity-core full suite:
`uv run pytest -q`:
115 passed, 1 skipped
- llm-connect full suite:
`PYTHONPATH=. uv run pytest -q`:
175 passed
## Acceptance Criteria
- The daily State Hub WSJF triage runs from activity-core, not Codex app cron.