Adds first-class tracking for API and interface mutations across the
agent ecosystem. Breaking changes are documented, affected repos are
notified via inbox, and agents discover pending changes at session
start via the dispatch endpoint.
- Migration q4l5m6n7o8p9: interface_changes table
- Model/schema: InterfaceChange with draft→published→resolved lifecycle
- Router: POST/GET/PATCH /interface-changes/, /publish, /resolve actions
(auto-notify affected repo agents on publish; progress event on origin)
- Dispatch: GET /repos/{slug}/dispatch now returns pending_interface_changes
- MCP tools: register_interface_change, list_interface_changes,
publish_interface_change, resolve_interface_change
- Dashboard: /interface-changes page with type badges, planned calendar,
published cards, and draft table
- EP-CUST-ICR-001 registered: webhook subscriptions (deliberately deferred)
First record: trailing-slash normalisation (2026-04-26), published,
affecting repo-registry — visible in repo-registry dispatch immediately.
223 tests passing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rule: trailing slash only on collection roots (/). Any route containing
a path parameter {…} uses no trailing slash. Applies across all routers,
scripts, Makefile, and tests. Fixes 307-redirect fragility on POST/PATCH
from naive clients (curl, Codex HTTP calls).
Also adds POST /repos/{slug}/sync — runs ADR-001 consistency check with
--fix via HTTP, so non-MCP agents (Codex) can self-service DB sync without
operator intervention.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add PATCH /token-events/{id} endpoint to correct heuristic events
- Add `note` filter to GET /token-events/ list
- Add TokenEventPatch schema
- Add task_token_hook.py: PostToolUse hook that reads the Claude Code
session transcript, computes per-task token delta, and replaces the
heuristic token event with real measured counts (note="measured")
- Register hook in ~/.claude/settings.json on mcp__state-hub__update_task_status
Covers both interactive sessions and ralph-workplan loops
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
By Repo now resolves via the full chain rather than requiring repo_id
directly on the token event:
1. token_events.repo_id (direct)
2. → workstreams.repo_id (via workstream_id)
3. → task.workstream_id → workstreams.repo_id (via task_id)
Changes:
- Auto-populate repo_id on token events at creation time (both the
token_events router and the tasks router)
- New GET /token-events/by-repo/ endpoint with RepoTokenSummary schema;
returns tokens_in/out/total, event_count, by_model, by_note per repo
- Dashboard By Repo section uses /by-repo/ directly and shows repo_slug
instead of a truncated UUID
- Backfilled the three existing events (userbased) with repo_id via SQL
185 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Tier 1 (exact counts) now defaults to note="measured" instead of null,
signalling the counts were read from the Claude Code status bar.
Callers can pass note="userbased" when a human provided the numbers.
measured — agent read exact counts from the Claude Code status bar
userbased — counts provided by a human
workplan — prorated from workplan total across task count
heuristic — server fallback, 1000/500, no agent input
Added token_note field to TaskUpdate schema and exposed note param on
update_task_status and record_interactive_task MCP tools.
TOOLS.md documents the full taxonomy. 185 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Token events are now always created when update_task_status is called
with status="done", using the best available data:
Tier 1 (best): exact tokens_in + tokens_out passed by agent
Tier 2: workplan_tokens_in + workplan_tokens_out prorated
across workstream task count (note="workplan")
Tier 3 (fallback): heuristic 1000 in / 500 out (note="heuristic")
Non-done status changes never create a token event.
MCP tool updated with workplan_tokens_in/out params and tiered docs.
Ralph-workplan skill files updated with the three-tier guidance.
184 tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add git_fingerprint (root commit SHA-1) to managed_repos as a stable,
machine-independent identifier — identical across every clone regardless
of checkout path, remote URL, or SSH alias.
- Migration n1i2j3k4l5m6: adds git_fingerprint column + non-unique index
(non-unique to support repos that share ancestry via forks/splits)
- GET /repos/by-fingerprint?hash=<sha>[&remote_url=<url>]: lookup by
fingerprint; optional remote_url disambiguates shared-ancestry repos
- GET /repos/by-remote?url=<url>: fallback lookup by remote URL
- consistency_check.py --here [PATH]: auto-detects repo slug from any
local checkout via fingerprint (falls back to remote URL), then auto-
registers host_paths[hostname] so subsequent runs need no override
- --all now includes repos with host_paths[current_hostname], not just
those with local_path
- fix-consistency-here / check-consistency-here Makefile targets
- Fixed _api_get bug: httpx strips query strings when params={} is passed
- Backfilled fingerprints for 14 repos on this host
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a structured dispute mechanism when capability request routing is wrong:
- New `routing_disputed` status with four DB columns (dispute_reason, disputed_by,
dispute_suggested_domain, disputed_at) via Alembic migration m0h1i2j3k4l5
- POST /capability-requests/{id}/dispute — any party can flag misrouting with a reason
and optional suggested domain; notifies custodian + current fulfilling domain
- POST /capability-requests/{id}/reroute — custodian re-routes to correct domain via
catalog_entry_id or direct slug; appends audit trail to routing_note; resets to requested
- Two new MCP tools: dispute_capability_routing and reroute_capability_request
- Dashboard: amber disputed-banner at top of Summary, routing_disputed Kanban column,
dispute details (reason, suggested domain, raised-by) shown on disputed cards
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `routing_note` column (migration l9g0h1i2j3k4) to persist why a request was routed to a given domain
- Fix substring-match bug in `_route_capability`: use `\b` word-boundary regex so 'postgres' no longer matches inside 'postgresql'
- Include `title` in keyword scoring for better routing accuracy
- Return `routing_note` string from `_route_capability` and store it on the request
- Add `PATCH /capability-requests/{id}` endpoint + `CapabilityRequestPatch` schema to correct mutable metadata (catalog_entry_id, priority, blocking_task_id, fulfilling_workstream_id)
- Add `patch_capability_request` MCP tool wrapping the new endpoint
- Add 105 lines of routing tests (word-boundary, title-match, multi-entry scoring, broadcast fallback)
- Add `tunnels-up`, `tunnels-status`, `tunnels-check` Makefile targets for ops-bridge managed tunnels
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Root cause: C2/C9/C10 each made a full HTTP round-trip back to the API
(asyncio.to_thread → urllib → TCP → uvicorn → SQLAlchemy → DB) for every
repo. 16 repos × 3 calls = 48 self-calls at ~80-150ms each = ~6s total.
Fix: doi_engine.evaluate() accepts a prefetch dict. The summary endpoint
runs 3 bulk GROUP BY queries (domain status, TPSC snapshot counts, active
goal counts) and passes results directly — zero HTTP self-calls in summary
mode.
Result: /repos/doi/summary 6s → <1s (6× improvement on top of prior 13×).
Total improvement from original: 108s → <1s.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Two fixes:
1. skip_consistency=True in summary mode — omits C7/C13 subprocess calls
(consistency_check.py) which were the main bottleneck (32 spawns for 16 repos).
Full check still available per-repo via GET /repos/{slug}/doi.
2. asyncio.gather — all repos evaluated in parallel instead of sequentially.
Also: rename Repositories page title from "Repos" to "Repositories".
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Implements the 14-criterion DoI checklist as a runnable gate with API,
MCP tools, CLI script, and dashboard integration.
Core components:
- api/doi_engine.py — async engine evaluating all 14 criteria (asyncio.to_thread
for non-blocking HTTP self-calls), shared by API and CLI
- api/schemas/doi.py — DoICriterion, DoIReport, DoISummaryEntry schemas
- api/routers/repos.py — GET /repos/{slug}/doi + GET /repos/doi/summary
- scripts/check_doi.py — CLI: make check-doi REPO=<slug> / check-doi-all
- mcp_server/server.py — check_repo_doi(), get_doi_summary() tools
Dashboard (repos.md):
- DoI tier badge per repo (None/Core/Standard/Full) colour-coded red→green
- Domain block shows lowest DoI tier across its repos
- DoI KPI card in summary row
- DoI filter in All Repos Table
- Link to Repository DoI policy page
Also fixes: TPSC snapshots 500 error (missing nested selectinload for
catalog_entry relationship in list_snapshots endpoint).
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Introduces a capability catalog (CUST-WP-0022) so domains can advertise what
they provide and agents can request capabilities from other domains with
auto-routing, lifecycle tracking, and task-unblocking on completion.
- New models: CapabilityCatalog, CapabilityRequest with full lifecycle
(requested → accepted → in_progress → ready_for_review → completed/rejected/withdrawn)
- Migration i6d7e8f9a0b1: capability_catalog + capability_requests tables
- Router /capability-catalog and /capability-requests with accept/status endpoints
- 7 new MCP tools: register_capability, list_capabilities, request_capability,
accept_capability_request, update_capability_request_status,
list_capability_requests, get_capability_request
- StateSummary gains open_capability_requests count
- Dashboard: capability-requests.md page + docs/capabilities.md + docs/scope.md
- SCOPE.md: three seed capabilities documented (MCP registration, state tracking, SBOM)
- scope.template: Provided Capabilities section with example block
- scripts/ingest_capabilities.py + make ingest-capabilities[/-all] targets
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Fix /domains/{slug}/ 500: EP/TD queries now use domain_id FK (not string column)
- Remove dead cascade-slug code in rename_domain (FK handles it)
- MCP: list_kaizen_agents(category?) + get_kaizen_agent(name) via resolve_repo_path()
- TOOLS.md: Kaizen Agents section with discovery/load pattern
- agents.template: new project rule for consumer repos
- claude-md.template + register_project.sh: include agents.md in new-project scaffolding
- agents/: direct install of 6 curated agents for hub sessions
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a JSONB column `host_paths` to managed_repos mapping
hostname → absolute local path. Fixes the consistency-checker
failure when the same repo lives at different paths on different
machines (e.g. /home/worsch/marki-docx on the workstation vs
/home/tegwick/marki-docx on custodiancore).
Changes:
- Migration g4b5c6d7e8f9: adds host_paths JSONB (default {})
- Model: host_paths Mapped[dict] column
- Schemas: host_paths in RepoRead; new RepoPathRegister schema
- Router: POST /repos/{slug}/paths/ — merges one host entry
- consistency_check.py: resolve_repo_path() prefers host_paths
[hostname] over local_path; --repo-path CLI override added
- MCP: update_repo_path(slug, path, host?) tool
- Makefile: register-path target; REPO_PATH passthrough on
check-consistency and fix-consistency targets
- TOOLS.md: documents update_repo_path
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EP catalogue (all domains):
- EP-RAIL-001 ep_id patched (schema fix: add ep_id to EPUpdate)
- EP-RAIL-003 (git bare-repo mirrors) and EP-RAIL-004 (offsite secondary
backup) registered from railiance-cluster/docs/backup-restore.md
- EP-CUST-003..007 ep_ids assigned to existing custodian EPs
- EP-CUST-008 (State Hub API auth) and EP-CUST-009 (update_workstream MCP
tool) registered as new custodian extension points
TD catalogue (railiance — first 5 items):
- TD-RAIL-001: backup cron runs as root without audit trail (high/security)
- TD-RAIL-002: k3s kubeconfig world-readable mode 644 (medium/security)
- TD-RAIL-003: no Ansible role unit tests (medium/test)
- TD-RAIL-004: age key extracted via awk — fragile (medium/impl)
- TD-RAIL-005: etcd snapshot retention uncoordinated (low/impl)
Dashboard (T08 + T10):
- Extract API URL and POLL to src/components/config.js; all 15 pages
now import from the shared module (contributions/goals keep custom POLL)
- Shared .kpi-infobox, .filter-bar, .filter-search/.filter-owner CSS
moved to observablehq.config.js head <style> block; removed from 9 pages
- Build: 0 errors, 0 warnings
API (T09):
- progress.py: limit param now Query(100, le=1000) — prevents unbounded
list requests; closes TD-CUST-004 for the only endpoint that had limit
CUST-WP-0004 marked completed (all 10 tasks done).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EPUpdate was missing the ep_id field, making it impossible to assign a
human-readable ID to an existing EP via PATCH. The router already uses
model_dump(exclude_unset=True) + setattr so no router change needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Migration c5d6e7f8a9b0: domain_goals and repo_goals tables, repo_goal_id FK on workstreams
- DomainGoal: one active per domain (partial unique index), status active/archived/superseded
- RepoGoal: integer priority, status active/paused/completed/archived, optional domain_goal_id link
- WorkstreamUpdate schema and router extended with repo_goal_id and repo_goal_id filter
- 6 new MCP goal tools: create_domain_goal, get_domain_goals, activate_domain_goal, create_repo_goal, get_repo_goals, update_repo_goal
- update_workstream MCP tool: patch any subset of workstream fields (title, description, owner, due_date, repo_goal_id, status)
- get_domain_summary extended with goal_guidance (needs_workplan, alignment_warnings) signals
- Dashboard goals.md page and docs/goals.md reference page
- CLAUDE.md template updated to act on goal_guidance signals at session start
- CUST-WP-0010 workplan for this feature
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements CUST-WP-0007. Resolves inconsistencies I-1, I-2, I-5, I-6
identified in the GEMS audit (GenericEntityModellingSystem.md).
Pass 1 (e1f2a3b4c5d6): domain_id FK on extension_points and
technical_debt (replaces raw string column); repo_id FK on contributions.
Fixes domain-filtering bugs in EP/TD dashboard pages.
Pass 2 (f2a3b4c5d6e7): repo_id nullable FK on workstreams, aligning
the GEMS primary attachment with ADR-001 (repo > topic). Dashboard
pages updated to prefer repo->domain over topic->domain.
Pass 3 (a3b4c5d6e7f8): SBOMSnapshot container entity (GEMS Complex
between Repository and SBOMEntry). Ingest is now additive — each call
creates a new snapshot; history is retained. List/report endpoints
filter to latest snapshot per repo via _latest_snapshot_ids_subquery().
New endpoints: GET /sbom/snapshots/, GET /sbom/snapshots/{id}/.
Dashboard gains a Snapshot History section.
Also adds GEMS analysis artefacts: wiki/GEMS-StateHub-TypeRegistry.md,
wiki/GEMS-StateHub-SWOT.md, workplans/CUST-WP-0006 (analysis),
workplans/CUST-WP-0007 (migration, now completed).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
str.join() is synchronous and cannot consume a generator that uses await.
Build the blocker slugs list with an explicit async for loop instead.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the hardcoded 6-domain PostgreSQL ENUM with a first-class
`domains` DB table, and adds a `managed_repos` table for multi-repo
support per domain.
P1 — Domain as a DB entity:
- Migration b1c2d3e4f5a6: creates `domains` table, migrates topics.domain
ENUM column to domain_id FK, drops the domain ENUM type
- Domain ORM model (api/models/domain.py) + Pydantic schemas
- Domain API router: GET/POST /domains/, GET/PATCH /domains/{slug}/,
rename and archive endpoints with EP/TD cascade on rename
- Topic model updated: domain_id FK + @property domain_slug for
backwards-compatible JSON serialization (field renamed domain → domain_slug)
- TopicCreate/TopicRead updated; seed.py rewritten to use FK lookup
P2 — Multi-repo support:
- ManagedRepo ORM model (api/models/managed_repo.py) + schemas
- Repo API router: GET/POST /repos/, GET/PATCH /repos/{slug}/, archive
- Makefile: add-domain, rename-domain, add-repo, list-repos targets
- register_project.sh: verify domain via /domains/ API + POST /repos/
P3 — MCP tools & live validation:
- 6 new MCP tools: list_domains, create_domain, rename_domain,
archive_domain, list_domain_repos, register_repo
- EP/TD routers: replace hardcoded VALID_DOMAINS set with per-request
DB lookup — returns 422 with list of valid slugs on unknown domain
- State summary: adds domains: list[DomainSummary] (slug, name,
repo_count, active_workstream_count, ep_count, td_count)
- TOOLS.md updated with domain management section
P4 — Dashboard:
- New domains.md page with KPI row + domain cards + repo lists
- domains.json.py + repos.json.py data loaders
- Domains page added to observablehq.config.js nav
- workstreams.md, extensions.md, techdept.md: domain_slug fix +
dynamic domain list loaded from /domains/ API (no longer hardcoded)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
API:
- DecisionResolve schema (rationale, decided_by, write_log flag)
- POST /decisions/{id}/resolve — marks resolved, emits progress event,
appends entry to DECISIONS.md in the project's registered directory
(found via the topic's registration milestone event)
Dashboard:
- Replace Inputs.table for blocking decisions with full-text cards
- Each card shows title, full description (pre-wrap), rationale/context,
escalation warning if present
- Expandable "Resolve →" section with rationale textarea, decided-by
input, submit button that calls the resolve endpoint
- On success: collapses form, dims card, confirms log was written
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
API:
- WorkstreamWithTaskCounts schema extends WorkstreamRead with
tasks_total/todo/in_progress/blocked/done fields
- /state/summary now includes these counts in open_workstreams via
a single extra GROUP BY query (workstream_id, status)
Dashboard:
- Replace domain workstream-count bar with a horizontal stacked
progress bar per workstream (done/in-progress/blocked/todo)
- Workstreams with no tasks show "no tasks yet" annotation
- Workstreams with tasks show "X/N done" label after the bar
- Sorted by domain then title so domains group naturally
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CORS: add CORSMiddleware to FastAPI for localhost:3000 so browser fetch
works across ports without errors.
All four pages now use async generator cells that call the API directly
and re-yield every 15 s — no data loader cache, no manual cache clearing.
Each page shows a live status bar (● green/red · last updated time).
Offline state shows the `make api` hint inline.
index.md: add "Registered Projects" section — polls
/progress/?event_type=milestone&limit=500 and filters for
"Project registered with State Hub:" events; shows project name,
domain, path, and registration timestamp.
workstreams.md: fix broken domain column — now fetches /workstreams/
and /topics/ in parallel and joins on topic_id client-side.
Previously the domain column showed "unknown" for all rows because
WorkstreamRead schema doesn't include domain.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>