Makes the state hub an event publisher so activity-core can drive
maintenance automation declaratively via ActivityDefinitions, rather
than the hub creating tasks itself.
- api/events/: lazy JetStream publisher + EventEnvelope mirroring
activity-core's contract; no-op when NATS_URL unset, fire-and-forget
with logged failures so publishing never breaks an API request.
- Wired publishers on the five v1.0 lifecycle events:
org.statehub.repo.registered (POST /repos/)
org.statehub.workstream.completed (PATCH /workstreams/* on transition)
org.statehub.decision.resolved (POST /decisions/*/resolve)
org.statehub.domain.goal.activated (POST /domain-goals/*/activate)
org.statehub.task.stale (scripts/cleanup_stale_tasks.py)
- docs/nats-event-subjects.md: subject naming convention + catalog.
- docs/cron-migration.md: design stub for replacing custodian-sync
systemd timer and cleanup-stale cron with ActivityDefinitions
(depends on activity-core WP-0003).
- docs/activity-core-delegation.md: protocol, invariants, cutover plan.
- SCOPE.md: declares activity-core as downstream event consumer and
restates that the state hub stays a read model, not a task factory.
Workplan: workplans/CUST-WP-0040-state-hub-nats-activity-core-integration.md
242 tests pass.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Rule: trailing slash only on collection roots (/). Any route containing
a path parameter {…} uses no trailing slash. Applies across all routers,
scripts, Makefile, and tests. Fixes 307-redirect fragility on POST/PATCH
from naive clients (curl, Codex HTTP calls).
Also adds POST /repos/{slug}/sync — runs ADR-001 consistency check with
--fix via HTTP, so non-MCP agents (Codex) can self-service DB sync without
operator intervention.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously defaulted to CWD ("."), causing ingest to silently scan the
state-hub directory instead of the target repo when called without
--repo-path. Now queries GET /repos/{slug}/ for host_paths[hostname]
and exits with a clear error if neither flag nor hub lookup succeeds.
Also deleted the incorrect SBOM snapshot for repo-registry (420 entries
that were actually state-hub packages).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause of the 501-commit pile-up in inter-hub: fix_repo() created
git commits (brief updates, T03 writebacks) but never pushed them, so
the 15-minute timer accumulated local commits indefinitely. Once real
development landed on remote the repos diverged with no self-healing path.
Changes
-------
repo_sync.py (new module)
Extracts all git lifecycle primitives: pull_ff, push_ff,
count_remote_ahead (C-16 input), count_local_ahead (C-17/T04 input).
Module docstring documents the push-seal invariant and stable state.
consistency_check.py
- Imports primitives from repo_sync; thin _detect_behind_remote wrapper
preserves backward compat for existing callers and tests.
- C-17 backlog guard: if local has unpushed commits from a prior failed
push, retry before making more; skip all writes if push still fails.
- T04 push seal: unconditional push_ff() at end of every fix_repo() run.
- _report_needs_action: ahead_of_remote param so repos with unpushed
backlogs are not silently skipped as "clean" by fix_all_remote().
- Domain-slug fallback: brief no longer degrades to "(unknown)" when all
workplans are completed — falls back to any workstream for domain context.
- Service switched from --all --fix to --remote --all (pulls before
fixing, skips already-clean repos).
push-seal.md (new)
Capability documentation: the problem, the invariant, all three checks
(C-16/C-17/T04), stable-state description, API reference, and test map.
test_repo_sync.py (new, 32 tests)
Full coverage of all four primitives via real git repos (tmp_path).
Includes C-17 scenario, push-seal invariant, and four end-to-end
loop-stability tests.
test_consistency_check.py
Four new _report_needs_action cases for the ahead_of_remote parameter.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add PATCH /token-events/{id} endpoint to correct heuristic events
- Add `note` filter to GET /token-events/ list
- Add TokenEventPatch schema
- Add task_token_hook.py: PostToolUse hook that reads the Claude Code
session transcript, computes per-task token delta, and replaces the
heuristic token event with real measured counts (note="measured")
- Register hook in ~/.claude/settings.json on mcp__state-hub__update_task_status
Covers both interactive sessions and ralph-workplan loops
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The post-commit hook re-invokes fix-consistency, which commits writeback
changes, which re-triggers the hook — causing exponential process spawning.
Fix: pass GIT_CUSTODIAN_SYNC=1 in the env for all writeback git commits.
Update the post-commit hook (not tracked by git) to exit early when this
variable is set.
Also remove the --no-verify flag that was added as a failed attempt (it
only skips pre-commit/commit-msg, not post-commit hooks).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add git_fingerprint (root commit SHA-1) to managed_repos as a stable,
machine-independent identifier — identical across every clone regardless
of checkout path, remote URL, or SSH alias.
- Migration n1i2j3k4l5m6: adds git_fingerprint column + non-unique index
(non-unique to support repos that share ancestry via forks/splits)
- GET /repos/by-fingerprint?hash=<sha>[&remote_url=<url>]: lookup by
fingerprint; optional remote_url disambiguates shared-ancestry repos
- GET /repos/by-remote?url=<url>: fallback lookup by remote URL
- consistency_check.py --here [PATH]: auto-detects repo slug from any
local checkout via fingerprint (falls back to remote URL), then auto-
registers host_paths[hostname] so subsequent runs need no override
- --all now includes repos with host_paths[current_hostname], not just
those with local_path
- fix-consistency-here / check-consistency-here Makefile targets
- Fixed _api_get bug: httpx strips query strings when params={} is passed
- Backfilled fingerprints for 14 repos on this host
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds _write_custodian_brief() to consistency_check.py. After every fix_repo()
run, a .custodian-brief.md is written to the repo root with: domain, last-synced
timestamp, current repo goal, active workstreams with progress (done/total), and
the first 7 open tasks per workstream (blocked → in_progress → todo order) with
task IDs. The file is git-committed when content changes so remote workers (e.g.
CoulombCore) can pull it and orient without a live MCP connection.
Session protocol template and CLAUDE.md updated: read .custodian-brief.md first,
then call get_domain_summary() as an enhancement (skip if MCP unreachable).
This eliminates false "State hub is offline" alarms in subagents and remote workers.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds --remote CLI flag and fix_all_remote() function. When run without a
REPO argument, the target checks all registered repos and:
- Skips repos whose local path does not exist on this machine
- Skips repos that are already clean (no fixable issues, no FAILs, not
behind remote, only C-08 background noise allowed)
- For repos that need work: git pull --ff-only then fix_repo()
Prints a summary of CLEAN (skipped) and NOT ON THIS HOST (skipped) repos
before the detailed fix reports.
Simplifies the Makefile target from shell-level curl+git to a single
uv run call using --remote. Same flag handles both single-repo and all-repos.
Also adds _git_pull() helper and 13 new tests (71 total in consistency suite).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
_detect_behind_remote was comparing HEAD != @{u} which incorrectly
triggered C-16 when the local repo had unpushed commits. Fixed to use
git rev-list --count HEAD..@{u} which only counts commits the remote
has that local lacks. Adds test_returns_false_when_local_ahead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
T01 — No-regress rule (C-15): fix-consistency now detects when a DB task
status is ahead of the workplan file (e.g. marked done on CoulombCore)
and emits C-15 WARN instead of regressing the DB back to the stale file
value. STATUS_ORDER ranking: todo(0) < in_progress/blocked(1) < done/cancelled(2).
T02 — Pull gate (C-16): fix_repo runs git fetch + rev-parse at the start
of every --fix run. If the local repo is behind its remote tracking branch,
all write operations are skipped and C-16 WARN is emitted. Best-effort:
offline/no-remote silently skips the check.
T03 — DB→file writeback: C-15 fix path patches the status field in the
matching task block and git-commits the change with a standard message.
--no-writeback flag disables writeback while keeping T01/T02 active.
T04 — CLAUDE.md + session-protocol.template updated with new guidance,
C-15/C-16 semantics, and fix-consistency-remote recommendation.
T05 — Makefile: fix-consistency-remote pulls then fixes in one step.
16 new tests; 155 passed total.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the 14-criterion DoI checklist as a runnable gate with API,
MCP tools, CLI script, and dashboard integration.
Core components:
- api/doi_engine.py — async engine evaluating all 14 criteria (asyncio.to_thread
for non-blocking HTTP self-calls), shared by API and CLI
- api/schemas/doi.py — DoICriterion, DoIReport, DoISummaryEntry schemas
- api/routers/repos.py — GET /repos/{slug}/doi + GET /repos/doi/summary
- scripts/check_doi.py — CLI: make check-doi REPO=<slug> / check-doi-all
- mcp_server/server.py — check_repo_doi(), get_doi_summary() tools
Dashboard (repos.md):
- DoI tier badge per repo (None/Core/Standard/Full) colour-coded red→green
- Domain block shows lowest DoI tier across its repos
- DoI KPI card in summary row
- DoI filter in All Repos Table
- Link to Repository DoI policy page
Also fixes: TPSC snapshots 500 error (missing nested selectinload for
catalog_entry relationship in list_snapshots endpoint).
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Introduces a capability catalog (CUST-WP-0022) so domains can advertise what
they provide and agents can request capabilities from other domains with
auto-routing, lifecycle tracking, and task-unblocking on completion.
- New models: CapabilityCatalog, CapabilityRequest with full lifecycle
(requested → accepted → in_progress → ready_for_review → completed/rejected/withdrawn)
- Migration i6d7e8f9a0b1: capability_catalog + capability_requests tables
- Router /capability-catalog and /capability-requests with accept/status endpoints
- 7 new MCP tools: register_capability, list_capabilities, request_capability,
accept_capability_request, update_capability_request_status,
list_capability_requests, get_capability_request
- StateSummary gains open_capability_requests count
- Dashboard: capability-requests.md page + docs/capabilities.md + docs/scope.md
- SCOPE.md: three seed capabilities documented (MCP registration, state tracking, SBOM)
- scope.template: Provided Capabilities section with example block
- scripts/ingest_capabilities.py + make ingest-capabilities[/-all] targets
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Parses go.sum lockfiles for Go projects. Reads go.mod alongside to
mark direct vs indirect dependencies. Deduplicates by (module, version),
skipping go.mod hash lines.
Used to ingest key-cape (netkingdom domain): 23 Go modules.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- parse_task_blocks() now injects the nearest preceding ### heading
text as `title` — tasks no longer stored with bare IDs as their title
- C-11 fix skips creating tasks when workstream is completed/archived
(prevents duplicate task creation on repeated fix-consistency runs)
- C-12 is now fixable: auto-cancels open orphan DB tasks when the
backing workstream is finished (completed/archived)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause analysis: calling create_workstream() before writing the workplan
file creates a ghost workstream with repo_id=null. When fix-consistency later
runs on the file, it creates a second workstream and writes its ID into the
file — leaving the ghost permanently active and showing false partial progress
in the dashboard.
C-14: after checking file-backed workstreams, query active workstreams on the
same topic with repo_id=null. Flag any whose title matches a file-backed
workstream as a probable ghost duplicate.
CLAUDE.md: add explicit "workplan ↔ DB sync rule" prohibiting create_workstream()
for file-backed work. Write file first, then make fix-consistency.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
T01: copy agent-scope-analyst.md to the-custodian/agents/
T02: add scope.template, prepend @SCOPE.md to claude-md.template,
update register_project.sh to write SCOPE.md stub on new registration,
add scope-analyst row to TOOLS.md
T03: SCOPE.md for the-custodian itself
Workplan: CUST-WP-0017 registered in state-hub
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix /domains/{slug}/ 500: EP/TD queries now use domain_id FK (not string column)
- Remove dead cascade-slug code in rename_domain (FK handles it)
- MCP: list_kaizen_agents(category?) + get_kaizen_agent(name) via resolve_repo_path()
- TOOLS.md: Kaizen Agents section with discovery/load pattern
- agents.template: new project rule for consumer repos
- claude-md.template + register_project.sh: include agents.md in new-project scaffolding
- agents/: direct install of 6 curated agents for hub sessions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the monolithic project_claude_md.template with a directory of
7 focused rule files in scripts/project_rules/. register_project.sh now
generates .claude/rules/*.md + a thin CLAUDE.md index of @-imports,
matching the pattern established in ops-bridge.
Template files:
claude-md.template — 9-line @-import index
repo-identity.template — purpose, domain, slug, topic ID (machine-gen)
session-protocol.template — orient/inbox/workplans/brief/close (machine-gen)
first-session.template — bootstrap flow; delete once past FSP
workplan-convention.template— prefix, location; delegates to global CLAUDE.md
stack-and-commands.template — language/deps/commands (stub, manual)
architecture.template — design overview (stub, manual)
repo-boundary.template — what this repo does NOT own (stub, manual)
register_project.sh changes:
- Generates .claude/rules/ from templates with variable substitution
- Writes thin CLAUDE.md if none exists; appends suggestion comment if one does
- Step 7: auto-registers this machine's local path via POST /repos/{slug}/paths/
- project_claude_md.template deprecated to a redirect notice
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a JSONB column `host_paths` to managed_repos mapping
hostname → absolute local path. Fixes the consistency-checker
failure when the same repo lives at different paths on different
machines (e.g. /home/worsch/marki-docx on the workstation vs
/home/tegwick/marki-docx on custodiancore).
Changes:
- Migration g4b5c6d7e8f9: adds host_paths JSONB (default {})
- Model: host_paths Mapped[dict] column
- Schemas: host_paths in RepoRead; new RepoPathRegister schema
- Router: POST /repos/{slug}/paths/ — merges one host entry
- consistency_check.py: resolve_repo_path() prefers host_paths
[hostname] over local_path; --repo-path CLI override added
- MCP: update_repo_path(slug, path, host?) tool
- Makefile: register-path target; REPO_PATH passthrough on
check-consistency and fix-consistency targets
- TOOLS.md: documents update_repo_path
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Detects when all DB tasks are done/cancelled but the workstream status
is still 'active' — the pattern where a worker completes tasks via MCP
but forgets to call update_workstream_status(). Auto-fixable via --fix.
Also extends the C-04/C-05 fix path to handle C-13 (same PATCH logic).
Motivated by marki-docx WP-0001/WP-0002 visibility gap (2026-03-16).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause: workplan files use "done" (task vocabulary) but the DB workstream
API only accepts "completed". The PATCH was silently failing with 422.
Fixes:
- Add FILE_TO_DB_WORKSTREAM_STATUS map and normalise_workstream_status()
- Normalise file status before C-04 comparison: done↔completed is no longer
spurious drift
- Normalise file status before PATCHing: always send DB-valid "completed"
- _api_patch now returns {"_error": ...} instead of None on failure, so the
fix loop reports FAILED entries rather than silently dropping them
- 9 new tests in TestNormaliseWorkstreamStatus (42 total)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Migration c5d6e7f8a9b0: domain_goals and repo_goals tables, repo_goal_id FK on workstreams
- DomainGoal: one active per domain (partial unique index), status active/archived/superseded
- RepoGoal: integer priority, status active/paused/completed/archived, optional domain_goal_id link
- WorkstreamUpdate schema and router extended with repo_goal_id and repo_goal_id filter
- 6 new MCP goal tools: create_domain_goal, get_domain_goals, activate_domain_goal, create_repo_goal, get_repo_goals, update_repo_goal
- update_workstream MCP tool: patch any subset of workstream fields (title, description, owner, due_date, repo_goal_id, status)
- get_domain_summary extended with goal_guidance (needs_workplan, alignment_warnings) signals
- Dashboard goals.md page and docs/goals.md reference page
- CLAUDE.md template updated to act on goal_guidance signals at session start
- CUST-WP-0010 workplan for this feature
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Avoids ~12.9k token response in domain repo sessions; get_domain_summary
returns the same actionable data scoped to the domain at ~10% of the cost.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>