state-hub/workplans/STATE-WP-0060-fix-consistency-cross-flavor.md

id: STATE-WP-0060
type: workplan
title: Fix cross-flavor make fix-consistency failures
domain: custodian
repo: state-hub
status: finished
owner: codex
topic_slug: custodian
created: 2026-06-07
updated: 2026-06-07
state_hub_workstream_id: 557ea19e-64bf-44d9-a288-e8ad692a3754

Fix Cross-Flavor make fix-consistency Failures

Origin: infrastructure-friction analysis from the helix_forge domain (agentic-resources/docs/ASSESSMENT-infra-friction.md, error mining AGENTIC-WP-0006).

Critical Review

This workplan conforms with INTENT.md: consistency tooling is one of State Hub's explicit responsibilities, and fix-consistency is the bridge that keeps files-first workplans synchronized into the live read model. Improving its caller-facing reliability is State Hub work.

The original draft was too broad in one place. A review of the repo on 2026-06-07 shows that the single-repo Make targets already convert the warning-only exit code 2 to shell success:

  • make check-consistency REPO=... runs scripts/consistency_check.py and then maps e == 2 to exit 0.
  • make fix-consistency REPO=... does the same.
  • The underlying CLI still exits 2 for warnings, which is useful for strict machine callers but can confuse agents or wrappers if they call the script directly, use an aggregate target, or hit a repo-specific unrecoverable case.
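The normalization in the bullets above can be sketched in Python as a thin wrapper. This is only an illustration: the real targets do the mapping in the Makefile recipe, and `run_normalized` and `WARNINGS_ONLY` are hypothetical names that do not come from the repo.

```python
import subprocess
import sys

WARNINGS_ONLY = 2  # strict checker code for "passed with warnings", per the text above


def run_normalized(cmd: list[str]) -> int:
    """Run cmd and return its exit code, with warnings-only mapped to 0."""
    # stdout/stderr are inherited, not captured, so warning text stays visible.
    completed = subprocess.run(cmd, check=False)
    code = completed.returncode
    return 0 if code == WARNINGS_ONLY else code


if __name__ == "__main__":
    # Stand-in for the checker: a child process that exits 2 (warnings only).
    stand_in = [sys.executable, "-c", "import sys; sys.exit(2)"]
    print(run_normalized(stand_in))  # prints 0
```

A caller that invokes the script directly, without a wrapper like this, still sees 2 and has to decide for itself whether warnings count as failure.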

Implementation should therefore diagnose the exact caller path before changing semantics. Do not blindly make all warnings disappear: warnings must remain visible in text/JSON output and in the structured result. Do not relax genuine failures just to make automation green.

make fix-consistency exited non-zero (make: *** [...: fix-consistency] Error 1) in 5 captured coding sessions across 4 repos. It is the only error category that spans both Claude and Grok; every other recurring error in the corpus is Claude-only, which makes this the one cross-flavor signal that is clearly not agent-specific.

fix-consistency is the ADR-001 reconciliation mechanism that syncs the files-first workplans with the hub read model. When it fails, workplan/hub drift goes unreconciled, so a flaky exit here undermines the whole coordination model. Note that a healthy single-repo Make run can print PASS (with warnings) and still exit 0, because the Make target normalizes warning exit code 2.

Root-Cause the Non-Zero Exit Path

id: STATE-WP-0060-T01
status: done
priority: high
state_hub_task_id: "011a49ad-13a5-46f7-849d-f7b1a0bca005"

Reproduce the failure across the repos/flavors that hit it. Identify the failure mode at the actual caller path — single-repo Make target, direct scripts/consistency_check.py, aggregate target, post-commit hook, remote path, or repo-specific reconciliation failure. Capture the exact command, output, and exit code.

Done when the diagnosis classifies each captured failure into one of:

  • warning-only direct CLI exit 2 misinterpreted by an agent/wrapper,
  • aggregate/remote target exit handling that differs from the single-repo Make target,
  • real unrecoverable reconciliation failure,
  • environment/setup issue such as missing uv, unreachable API, stale repo path, or missing write permission.
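The four diagnosis buckets above could be applied to captured sessions with a small triage helper. The sketch below is a heuristic under stated assumptions, not the method used in the repo: `Capture`, `FailureClass`, `classify`, and the marker strings are all hypothetical, and only the `uv: not found` marker is taken from this workplan's own notes. It expects the checker's own exit code rather than make's, because GNU make reports its own status 2 for any failed recipe.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class FailureClass(Enum):
    WARNING_EXIT_MISREAD = "warning-only direct CLI exit 2 misinterpreted"
    WRAPPER_EXIT_HANDLING = "aggregate/remote exit handling differs from single-repo target"
    RECONCILIATION_FAILURE = "real unrecoverable reconciliation failure"
    ENVIRONMENT = "environment/setup issue"


@dataclass
class Capture:
    command: str                      # exact command as captured
    checker_exit_code: Optional[int]  # checker's own code; None if it never ran
    output: str                       # combined stdout/stderr


# Assumed markers for setup problems; extend from real captures.
ENV_MARKERS = ("uv: not found", "connection refused", "permission denied")


def classify(capture: Capture) -> FailureClass:
    text = capture.output.lower()
    # Setup problems come first: the checker may not have run at all.
    if capture.checker_exit_code is None or any(m in text for m in ENV_MARKERS):
        return FailureClass.ENVIRONMENT
    if capture.checker_exit_code == 0:
        raise ValueError("exit code 0 is not a failure")
    if capture.checker_exit_code == 2:
        direct_cli = "consistency_check.py" in capture.command
        return (FailureClass.WARNING_EXIT_MISREAD if direct_cli
                else FailureClass.WRAPPER_EXIT_HANDLING)
    return FailureClass.RECONCILIATION_FAILURE
```

Anything the heuristic labels as a reconciliation failure still needs a human look at the output; the point is only to separate the cheap cases early.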

Also record whether the older Makefile:227 line number simply refers to a previous file version; the current fix-consistency target lives later in the Makefile.

Fix Exit Semantics / Failing Reconciliation Case

id: STATE-WP-0060-T02
status: done
priority: high
state_hub_task_id: "49388ab7-db45-4bdb-a89e-bb7f116afd47"

Fix the exact failing path without hiding drift.

Implementation notes:

  • Preserve explicit warning reporting in CLI text and JSON output.
  • Keep direct script exit semantics if callers rely on 2 to distinguish warnings, unless the diagnosis shows that contract is the root defect and the repo intentionally changes it.
  • Ensure all agent/operator-facing Make targets, hooks, and documented commands that say warnings are acceptable actually return shell success on warnings-only runs.
  • If a real reconciliation bug is found, fix the underlying case rather than weakening the exit code.
  • Add tests for the boundary: clean = success, warnings-only = agent-facing success with visible warnings, real failures = non-zero.
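The boundary in the last note can be written down as a small table-driven check. This is a sketch under assumptions: it uses the strict codes recorded elsewhere in this workplan (0 clean, 2 warnings only, 1 failure), and `strict_exit_code`, `wrapper_exit_code`, and their count parameters are hypothetical stand-ins, not the repo's actual functions or signatures.

```python
def strict_exit_code(failure_count: int, warning_count: int) -> int:
    """Direct-CLI contract: 0 clean, 2 warnings only, 1 any real failure."""
    if failure_count > 0:
        return 1
    if warning_count > 0:
        return 2
    return 0


def wrapper_exit_code(failure_count: int, warning_count: int) -> int:
    """Agent-facing contract: warnings-only counts as shell success."""
    code = strict_exit_code(failure_count, warning_count)
    return 0 if code == 2 else code


# (label, failures, warnings, expected strict code, expected wrapper code)
BOUNDARY_CASES = [
    ("clean", 0, 0, 0, 0),
    ("warnings-only", 0, 3, 2, 0),
    ("real failure", 1, 0, 1, 1),
    ("failure with warnings", 2, 5, 1, 1),
]

for label, failures, warnings_seen, strict, wrapper in BOUNDARY_CASES:
    assert strict_exit_code(failures, warnings_seen) == strict, label
    assert wrapper_exit_code(failures, warnings_seen) == wrapper, label
```

The "failure with warnings" row is the one worth keeping in any real test suite: it guards against a normalization that accidentally lets warnings mask a genuine failure.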

Verify Across Affected Repos + Document

id: STATE-WP-0060-T03
status: done
priority: medium
state_hub_task_id: "c9939dcb-37da-4073-a5f6-06f94fc7807e"

Verify fix-consistency now passes on the repos that previously failed (cross-flavor). Document the exit-code contract for callers (agents + operators) near the Make targets or consistency docs, including when to use direct CLI strictness versus Make wrappers.

Done when:

  • the affected repos/flavors no longer report warning-only runs as failed,
  • a genuine failure fixture still exits non-zero,
  • make fix-consistency REPO=state-hub succeeds after this repo's workplan updates.

After workplan updates, run from ~/state-hub:

make fix-consistency REPO=state-hub

Verification Notes

Completed 2026-06-07:

  • Reproduced the local non-zero path in a non-login WSL shell: wsl -d Ubuntu-24.04 --cd /home/worsch/state-hub make fix-consistency REPO=state-hub failed with /bin/sh: 1: uv: not found before the checker could apply warning/failure exit semantics. The same command worked through bash -lc, where uv is on PATH as /home/worsch/.local/bin/uv.
  • Confirmed the current Makefile line number is not 227: the single-repo fix-consistency target is now around line 240, so the older captured Makefile:227 reference is from a previous file version.
  • Kept direct scripts/consistency_check.py strictness: clean exits 0, warnings-only exits 2, and failures exit 1. Added consistency_exit_code() to make that contract explicit.
  • Hardened all Makefile uv invocations with UV ?= ... so non-login shells fall back to ~/.local/bin/uv. Agent/operator consistency Make targets still normalize warning-only 2 to shell success while preserving warning output.
  • Added tests for strict checker exit codes and actual Make targets using a fake uv executable: clean returns success, warning-only returns success for Make wrappers, and real failures remain non-zero.
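The uv fallback for non-login shells can be illustrated in Python. This is not the Makefile's `UV ?= ...` definition, whose value is not shown here; it is a sketch of the same idea (prefer uv on PATH, otherwise try ~/.local/bin/uv), and `resolve_uv` with its `path_env` and `home` parameters is a hypothetical helper.

```python
import os
import shutil
from pathlib import Path
from typing import Optional


def resolve_uv(path_env: Optional[str] = None,
               home: Optional[Path] = None) -> Optional[str]:
    """Return a usable uv path: PATH first, then ~/.local/bin/uv, else None."""
    # With path_env=None, shutil.which searches the current PATH.
    found = shutil.which("uv", path=path_env)
    if found:
        return found
    # Non-login shells often lack ~/.local/bin on PATH, so probe it directly.
    fallback = (home or Path.home()) / ".local" / "bin" / "uv"
    if fallback.is_file() and os.access(fallback, os.X_OK):
        return str(fallback)
    return None


if __name__ == "__main__":
    uv = resolve_uv()
    print(uv if uv else "uv not found on PATH or in ~/.local/bin")
```

Returning None instead of raising lets the caller print a setup hint before the checker runs, which keeps an environment problem from being mistaken for a consistency failure.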

Verification:

  • .venv/bin/python -m pytest tests/test_consistency_check.py -q -> 109 passed
  • wsl -d Ubuntu-24.04 --cd /home/worsch/state-hub make check-consistency REPO=state-hub -> PASS without a login shell
  • git diff --check -> clean