Covers installation, usage, workplan file format, task status lifecycle,
custodian naming conventions, COULOMBCORE usage, and manual cancellation.
Registered in Reference nav + reference.md index.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers local setup, remote (COULOMBCORE) one-liner registration,
ops-bridge tunnel config, bridge states, MCP transport modes, and
adding new remote hosts. Registered in Reference nav + reference.md index.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
server.py: MCP_TRANSPORT and MCP_PORT env vars select transport at startup
(default: stdio — no behaviour change for local use)
Makefile: `make mcp-http` starts SSE server on 127.0.0.1:8001
Remote registration (one-liner on COULOMBCORE after tunnel is up):
claude mcp add-json -s user state-hub \
'{"type":"sse","url":"http://127.0.0.1:18001/sse"}'
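A minimal sketch of how the startup selection might look, assuming MCP_TRANSPORT accepts `stdio` or `sse` and MCP_PORT defaults to 8001 (the port `make mcp-http` uses). The helper name `select_transport` and the validation behaviour are hypothetical and not taken from server.py.

```python
import os

# Assumed set of supported values; the commit only confirms stdio (default) and SSE.
VALID_TRANSPORTS = ("stdio", "sse")


def select_transport(env=None):
    """Return (transport, port) from MCP_TRANSPORT / MCP_PORT, defaulting to stdio."""
    env = os.environ if env is None else env
    transport = env.get("MCP_TRANSPORT", "stdio").strip().lower()
    if transport not in VALID_TRANSPORTS:
        raise ValueError(f"unsupported MCP_TRANSPORT: {transport!r}")
    # The port only matters for SSE; 8001 is an assumed default.
    port = int(env.get("MCP_PORT", "8001"))
    return transport, port
```

With neither variable set this returns `("stdio", 8001)`, which matches the "no behaviour change for local use" note.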
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Root cause: workplan files use "done" (task vocabulary) but the DB workstream
API only accepts "completed". The PATCH was silently failing with 422.
Fixes:
- Add FILE_TO_DB_WORKSTREAM_STATUS map and normalise_workstream_status()
- Normalise file status before C-04 comparison: done↔completed is no longer
reported as spurious drift
- Normalise file status before PATCHing: always send DB-valid "completed"
- _api_patch now returns {"_error": ...} instead of None on failure, so the
fix loop reports FAILED entries rather than silently dropping them
- 9 new tests in TestNormaliseWorkstreamStatus (42 total)
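The normalisation described above can be sketched as follows. The names FILE_TO_DB_WORKSTREAM_STATUS and normalise_workstream_status come from this commit; only the done to completed entry is confirmed, and the pass-through behaviour and the `statuses_drift` helper are assumptions for illustration.

```python
# Only the done -> completed entry is confirmed by the commit; other
# statuses are assumed to pass through unchanged.
FILE_TO_DB_WORKSTREAM_STATUS = {"done": "completed"}


def normalise_workstream_status(status):
    """Map a workplan-file status to the DB workstream vocabulary."""
    return FILE_TO_DB_WORKSTREAM_STATUS.get(status, status)


def statuses_drift(file_status, db_status):
    """Hypothetical C-04 style check: compare only after normalising the file side."""
    return normalise_workstream_status(file_status) != db_status
```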
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
33 offline tests covering: parse_frontmatter, parse_task_blocks,
get_tasks_from_workplan, ConsistencyReport severity filtering,
render_text output, and report_to_dict serialisation.
Closes the DoD automated-tests gap for the Consistency Engine workstream.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
PATCH /decisions/{id}/ is a blind field-setter with no decided_at logic.
POST /decisions/{id}/resolve is the correct endpoint — it auto-sets
decided_at and emits a decision_resolved progress event.
Fixes resolved decisions sorting last in the list because
decided_at was never populated.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previous fix applied the decided_at branch to all status groups,
causing open decisions without decided_at (e.g. COULOMBCORE decision)
to sort last behind any open decision that had decided_at set.
Now: decided_at desc only for resolved/superseded; open/escalated
use deadline asc → created_at desc.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Within resolved/superseded: most recently decided_at first.
Within open/escalated: soonest deadline first, then most recently
created_at (previously had no created_at fallback).
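A sketch of the ordering rules in Python (the dashboard itself is presumably JavaScript). It assumes decisions are dicts with `status`, `deadline`, `created_at` and `decided_at` as naive datetimes, that open/escalated decisions are listed before resolved/superseded ones, and that a missing deadline sorts after any real deadline. None of those details are confirmed by the commit.

```python
from datetime import datetime

CLOSED = {"resolved", "superseded"}


def sort_decisions(decisions):
    """Order open/escalated by deadline asc then created_at desc; closed by decided_at desc."""
    open_group = [d for d in decisions if d["status"] not in CLOSED]
    closed_group = [d for d in decisions if d["status"] in CLOSED]

    # Stable two-pass sort: apply the secondary key first, then the primary key.
    open_group.sort(key=lambda d: d["created_at"], reverse=True)
    open_group.sort(
        key=lambda d: (d.get("deadline") is None, d.get("deadline") or datetime.max)
    )

    # decided_at is only consulted for the closed group.
    closed_group.sort(key=lambda d: d.get("decided_at") or datetime.min, reverse=True)
    return open_group + closed_group
```

Keeping the `decided_at` key out of the open group is what stops an open decision without `decided_at` from sinking to the bottom.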
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EP catalogue (all domains):
- EP-RAIL-001 ep_id patched (schema fix: add ep_id to EPUpdate)
- EP-RAIL-003 (git bare-repo mirrors) and EP-RAIL-004 (offsite secondary
backup) registered from railiance-cluster/docs/backup-restore.md
- EP-CUST-003..007 ep_ids assigned to existing custodian EPs
- EP-CUST-008 (State Hub API auth) and EP-CUST-009 (update_workstream MCP
tool) registered as new custodian extension points
TD catalogue (railiance — first 5 items):
- TD-RAIL-001: backup cron runs as root without audit trail (high/security)
- TD-RAIL-002: k3s kubeconfig world-readable mode 644 (medium/security)
- TD-RAIL-003: no Ansible role unit tests (medium/test)
- TD-RAIL-004: age key extracted via awk — fragile (medium/impl)
- TD-RAIL-005: etcd snapshot retention uncoordinated (low/impl)
Dashboard (T08 + T10):
- Extract API URL and POLL to src/components/config.js; all 15 pages
now import from the shared module (contributions/goals keep custom POLL)
- Shared .kpi-infobox, .filter-bar, .filter-search/.filter-owner CSS
moved to observablehq.config.js head <style> block; removed from 9 pages
- Build: 0 errors, 0 warnings
API (T09):
- progress.py: limit param now Query(100, le=1000) — prevents unbounded
list requests; closes TD-CUST-004 for the only endpoint that had a limit param
CUST-WP-0004 marked completed (all 10 tasks done).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
EPUpdate was missing the ep_id field, making it impossible to assign a
human-readable ID to an existing EP via PATCH. The router already uses
model_dump(exclude_unset=True) + setattr so no router change needed.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The tunnel command belongs here — it opens a reverse SSH tunnel so that
a remote host can reach the local State Hub at 127.0.0.1:8000.
Usage: make tunnel HOST=user@hostname
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Migration c5d6e7f8a9b0: domain_goals and repo_goals tables, repo_goal_id FK on workstreams
- DomainGoal: one active per domain (partial unique index), status active/archived/superseded
- RepoGoal: integer priority, status active/paused/completed/archived, optional domain_goal_id link
- WorkstreamUpdate schema and router extended with the repo_goal_id field and a repo_goal_id filter
- 6 new MCP goal tools: create_domain_goal, get_domain_goals, activate_domain_goal, create_repo_goal, get_repo_goals, update_repo_goal
- update_workstream MCP tool: patch any subset of workstream fields (title, description, owner, due_date, repo_goal_id, status)
- get_domain_summary extended with goal_guidance (needs_workplan, alignment_warnings) signals
- Dashboard goals.md page and docs/goals.md reference page
- CLAUDE.md template updated to act on goal_guidance signals at session start
- CUST-WP-0010 workplan for this feature
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Domains are sorted top-to-bottom by the latest updated_at across their
workstreams (most recently active domain first). Within a domain,
workstreams are also ordered by updated_at desc. Replaces alphabetical sort.
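The ordering can be sketched in Python as below. It assumes each workstream is a dict with a `domain` key and an `updated_at` ISO-8601 string in a uniform format (so string comparison matches time order). The function name `order_domains` is hypothetical.

```python
from collections import defaultdict


def order_domains(workstreams):
    """Return [(domain, [workstreams])], most recently active domain first."""
    by_domain = defaultdict(list)
    for ws in workstreams:
        by_domain[ws["domain"]].append(ws)

    # Within a domain: newest updated_at first.
    for items in by_domain.values():
        items.sort(key=lambda ws: ws["updated_at"], reverse=True)

    # Across domains: after the inner sort, items[0] carries the domain's latest updated_at.
    return sorted(
        by_domain.items(), key=lambda kv: kv[1][0]["updated_at"], reverse=True
    )
```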
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Avoids ~12.9k token response in domain repo sessions; get_domain_summary
returns the same actionable data scoped to the domain at ~10% of the cost.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Move Interventions under Workstreams in the navigator
- Add action-confirm.js: shared modal component for actions requiring a
mandatory comment (survives live-poll re-renders, unlike inline DOM mutation)
- Wire action-confirm into Interventions (Mark done) and Decisions (Resolve)
- Fix Interventions completed section: fetch all tasks and filter client-side
so resolved interventions (needs_human=false) still appear under Completed
- Add docs/interventions.md help page with ? button on the h1
- Replace all hardcoded "Bernd" with "human" across dashboard src and docs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements CUST-WP-0007. Resolves inconsistencies I-1, I-2, I-5, I-6
identified in the GEMS audit (GenericEntityModellingSystem.md).
Pass 1 (e1f2a3b4c5d6): domain_id FK on extension_points and
technical_debt (replaces raw string column); repo_id FK on contributions.
Fixes domain-filtering bugs in EP/TD dashboard pages.
Pass 2 (f2a3b4c5d6e7): repo_id nullable FK on workstreams, aligning
the GEMS primary attachment with ADR-001 (repo > topic). Dashboard
pages updated to prefer repo->domain over topic->domain.
Pass 3 (a3b4c5d6e7f8): SBOMSnapshot container entity (GEMS Complex
between Repository and SBOMEntry). Ingest is now additive — each call
creates a new snapshot; history is retained. List/report endpoints
filter to latest snapshot per repo via _latest_snapshot_ids_subquery().
New endpoints: GET /sbom/snapshots/, GET /sbom/snapshots/{id}/.
Dashboard gains a Snapshot History section.
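To illustrate what "latest snapshot per repo" selects, here is an in-memory Python sketch. The real implementation is a SQL subquery (_latest_snapshot_ids_subquery()). The dict fields `id`, `repo_id`, `created_at` and `snapshot_id` are assumed names, not the actual column names.

```python
def latest_snapshot_ids(snapshots):
    """Return {repo_id: snapshot_id} for the newest snapshot of each repo."""
    latest = {}
    for snap in snapshots:
        current = latest.get(snap["repo_id"])
        if current is None or snap["created_at"] > current["created_at"]:
            latest[snap["repo_id"]] = snap
    return {repo_id: snap["id"] for repo_id, snap in latest.items()}


def entries_for_latest(entries, snapshots):
    """Keep only SBOM entries that belong to each repo's latest snapshot."""
    keep = set(latest_snapshot_ids(snapshots).values())
    return [e for e in entries if e["snapshot_id"] in keep]
```

Older snapshots stay in the input untouched, which mirrors the additive ingest: history is retained and only the read path filters.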
Also adds GEMS analysis artefacts: wiki/GEMS-StateHub-TypeRegistry.md,
wiki/GEMS-StateHub-SWOT.md, workplans/CUST-WP-0006 (analysis),
workplans/CUST-WP-0007 (migration, now completed).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
custodian_cli.py:
- new ingest-sbom subcommand: auto-detects repo slug from local_path
registration, runs ingest_sbom.py --scan from the repo root
- --dry-run flag passes through to the underlying script
- --slug override for repos where path lookup fails
repos.md:
- ? button on "⚠ not ingested" now opens /docs/sbom (not /docs/repos)
docs/sbom.md:
- Ingest commands section now leads with `custodian ingest-sbom` (repo-root)
- make ingest-sbom kept as low-level alternative
- Per-ecosystem and gap-type references updated to new command
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Each "⚠ not ingested" entry in the Coverage Map now shows a hoverable ?
button linking to /docs/repos (SBOM ingestion section).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
custodian_cli.py:
- register-project now writes CLAUDE.custodian.md (suggestion) instead
of overwriting CLAUDE.md; includes preamble with integration instructions
- registers repo via POST /repos/
- creates a "Repo Integration: {slug}" workstream in the domain's topic
with 4 onboarding tasks (integrate CLAUDE.md, first workplan, SBOM,
EPs/TDs); checks for existing workstream to be idempotent
- fixes {REPO_SLUG} template substitution (previously missing)
dashboard:
- repos.md: fetches workstreams; detects active repo-integration-* slugs;
adds "Integrating" KPI card; shows ⚙ integrating badge per repo in
coverage map and table; replaces "How to Ingest a Repo" with
"Onboard a New Repo" 4-step panel with doc help button
- docs/repo-integration.md (new): full collaboration model doc — custodian
as coach, repo agent as executor; journey, generated tasks, first session
protocol, ongoing relationship
- docs/repos.md: links to new repo-integration doc; updates "What is a
managed repo?" section; adds onboarding quick reference
- docs/reference.md: fix latent build error — code examples were in ```js
fences (executed by Observable Framework); changed to ```javascript (display only)
- observablehq.config.js: adds "Repo Integration" to Reference nav
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Instead of overwriting the target repo's CLAUDE.md, the registration
script now writes CLAUDE.custodian.md — a suggestion file with an
integration header. The repo's Claude agent integrates both files and
deletes the suggestion when done, preserving existing project conventions.
Also fix: `read` prompt now redirects from /dev/tty so the script
doesn't exit with code 1 when run non-interactively via make.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- scripts/cleanup_stale_tasks.py: daily script that cancels open tasks
in completed/archived workstreams; handles 307 redirects; emits a
cleanup progress event summarising results
- Makefile: add cleanup-stale target (also suitable for cron)
- ADR-001: append Workstream Closure Protocol section — mandatory closure
review before marking workstream completed, with task classification
table (done/cancelled/carry-forward) and Closure Review file format
- WP-0002 + WP-0005: append Closure Review sections documenting the
2026-03-02 cleanup run (26 stale DB rows cancelled — all were legacy
pre-ADR-001 DB-first records; file status was already done)
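The selection rule used by the cleanup script can be sketched roughly as follows. This is not the code from scripts/cleanup_stale_tasks.py: the field names and the set of "open" task statuses (todo, in_progress) are assumptions, and the redirect handling and progress event are left out.

```python
CLOSED_WORKSTREAM_STATUSES = {"completed", "archived"}
# Assumed definition of an open task.
OPEN_TASK_STATUSES = {"todo", "in_progress"}


def find_stale_tasks(workstreams, tasks):
    """Return open tasks whose workstream is already completed or archived."""
    closed_ids = {
        ws["id"] for ws in workstreams if ws["status"] in CLOSED_WORKSTREAM_STATUSES
    }
    return [
        t
        for t in tasks
        if t["workstream_id"] in closed_ids and t["status"] in OPEN_TASK_STATUSES
    ]
```

Each task returned would then be patched to a cancelled status by the caller.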
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- `escalated` filter now excludes decisions with status resolved or
superseded — a lingering escalation_note on a closed decision no
longer triggers the warning box or shows the amber note on the card
- Resolves D1 Vault backend appearing to re-surface an escalation alert
Also resolved ADR-001 decision (was made/open, now made/resolved);
overview blocking-decision count is now 0.
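A Python sketch of the corrected filter, assuming a decision counts as escalated when it carries a non-empty `escalation_note` (the dashboard's actual predicate may differ):

```python
CLOSED_DECISION_STATUSES = {"resolved", "superseded"}


def escalated_decisions(decisions):
    """Decisions that should trigger the escalation warning box."""
    return [
        d
        for d in decisions
        if d.get("escalation_note") and d["status"] not in CLOSED_DECISION_STATUSES
    ]
```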
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
get_state_summary() returns ~10k tokens — too expensive for routine domain
repo sessions that only need their own workstreams and decisions.
New get_domain_summary(domain_slug):
- 5 targeted API calls: topics (filter), workstreams (topic+status), decisions
(topic+pending), progress (topic, limit 5), repos (domain, slug+SBOM only)
- Returns: topic, active workstreams, blocking decisions, 5 recent events,
repo SBOM status — all scoped to one domain
- Estimated ~80-90% token reduction vs get_state_summary()
get_state_summary() preserved unchanged for cross-domain / custodian sessions.
Updated its docstring to note the large response and point to get_domain_summary.
Template updated: Step 1 now calls get_domain_summary("{DOMAIN}") instead of
get_state_summary() + get_next_steps(). TOOLS.md updated with usage guidance.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Observable Framework 1.13.3 supports collapsible: true on nav sections,
rendering them as <details> elements. Collapsed by default; auto-expands
when any page within the section is active.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Establishes the repo boundary rule and a formal vocabulary for classifying
work items by scope:
- Task: neutral state hub data entity
- Todo: a task scoped to the current session's repo/domain
- Internal todo: addressed within this repo by this agent
- Ecosystem todo: work for another registered repo → state hub task [repo:<slug>]
- Third-party todo: work for an upstream repo → contribution artifact (BR/FR/EP/UPR)
New dashboard doc: /docs/inter-repo-communication — defines the boundary rule,
the full terminology, ecosystem and third-party todo workflows, and a decision
table for classifying any piece of work found during a session.
Also:
- sbom.md: replace verbose inter-repo section with a 3-line summary + link
- observablehq.config.js: add "Inter-Repo Communication" to Reference nav
- project_claude_md.template: add "### Repo Boundary Rule" section; fix
Workplan Convention section (removing incorrect claim that the custodian
writes workplan files in other repos — that is the target repo's job)
Cross-repo: created state hub task [repo:railiance-bootstrap] for that repo's
agent to apply the boundary rule and workplan convention fix to its own CLAUDE.md
(task 78d43cb0, workstream 59155efb).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Glob with pattern 'workplans/*.md' from repo root fails silently.
Changed instruction to Glob(pattern="**/*.md", path="workplans/")
with Bash ls as fallback.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous template only defined a First Session Protocol (triggered when no
workstreams existed). When workstreams did exist, get_state_summary() was called
but no output was defined, causing registered-repo Claude sessions to produce
nothing useful.
New 3-step normal session protocol:
- Step 1: get_state_summary() + get_next_steps()
- Step 2: scan workplans/*.md for active tasks (todo/in_progress)
- Step 3: output orientation brief covering active workstreams, pending tasks
for this repo (from workplans/ + [repo:<slug>] state hub tasks), suggested
next action, and SBOM status
Also strengthens First Session Protocol, ADR-001 workplan convention section,
and SBOM ingest section (adds SCAN=1 REPO_PATH= flags).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
register-project now creates a topic automatically if the domain has
no active topic yet, instead of exiting with an error. This makes the
"create domain → register project" flow self-contained.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The custodian CLI had a static VALID_DOMAINS list used as argparse
choices= and for in-process domain validation, preventing any domain
added after v0.5 from being used. Now fetches active domains from the
API at runtime. Also fixes t.get("domain") → t.get("domain_slug")
in two topic lookup sites.
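One way to replace static argparse `choices=` with a runtime check is a custom `type` callable, sketched below. The `--domain` flag and the injected `fetch_active_domains` callable are hypothetical stand-ins; in the CLI the callable would query the API for active domains.

```python
import argparse


def build_parser(fetch_active_domains):
    """Build a parser whose --domain value is validated against live data."""

    def domain_type(value):
        valid = fetch_active_domains()  # looked up at parse time, not import time
        if value not in valid:
            raise argparse.ArgumentTypeError(
                f"unknown domain {value!r}; valid: {', '.join(sorted(valid))}"
            )
        return value

    parser = argparse.ArgumentParser(prog="custodian")
    parser.add_argument("--domain", type=domain_type, required=True)
    return parser
```

Because the lookup runs when arguments are parsed, a domain created after the CLI was installed is accepted without a code change.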
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The v0.5 migration renamed TopicRead.domain to domain_slug.
index.md, decisions.md and tasks.md still referenced the old field,
causing every workstream domain to fall back to "unknown". Also
updated tasks.md to load the domain filter list dynamically from
/domains/ instead of the hardcoded 6-slug array.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
str.join() is synchronous and cannot consume a generator that uses await.
Build the blocker slugs list with an explicit async for loop instead.
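A minimal sketch of the pattern, with a hypothetical async generator `iter_blocker_slugs` standing in for the real awaited lookups. Passing an async generator straight to `str.join()` fails because join expects a regular iterable, so the slugs are collected first.

```python
import asyncio


async def iter_blocker_slugs(blockers):
    """Hypothetical async source of slugs; sleep(0) stands in for real awaited I/O."""
    for blocker in blockers:
        await asyncio.sleep(0)
        yield blocker["slug"]


async def blocker_summary(blockers):
    # Collect with an explicit async for, then join the plain list.
    slugs = []
    async for slug in iter_blocker_slugs(blockers):
        slugs.append(slug)
    return ", ".join(slugs)
```

An async list comprehension (`[s async for s in ...]`) would work equally well; the explicit loop simply mirrors the fix described here.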
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the hardcoded 6-domain PostgreSQL ENUM with a first-class
`domains` DB table, and adds a `managed_repos` table for multi-repo
support per domain.
P1 — Domain as a DB entity:
- Migration b1c2d3e4f5a6: creates `domains` table, migrates topics.domain
ENUM column to domain_id FK, drops the domain ENUM type
- Domain ORM model (api/models/domain.py) + Pydantic schemas
- Domain API router: GET/POST /domains/, GET/PATCH /domains/{slug}/,
rename and archive endpoints with EP/TD cascade on rename
- Topic model updated: domain_id FK + @property domain_slug for
backwards-compatible JSON serialization (field renamed domain → domain_slug)
- TopicCreate/TopicRead updated; seed.py rewritten to use FK lookup
P2 — Multi-repo support:
- ManagedRepo ORM model (api/models/managed_repo.py) + schemas
- Repo API router: GET/POST /repos/, GET/PATCH /repos/{slug}/, archive
- Makefile: add-domain, rename-domain, add-repo, list-repos targets
- register_project.sh: verify domain via /domains/ API + POST /repos/
P3 — MCP tools & live validation:
- 6 new MCP tools: list_domains, create_domain, rename_domain,
archive_domain, list_domain_repos, register_repo
- EP/TD routers: replace hardcoded VALID_DOMAINS set with per-request
DB lookup — returns 422 with list of valid slugs on unknown domain
- State summary: adds domains: list[DomainSummary] (slug, name,
repo_count, active_workstream_count, ep_count, td_count)
- TOOLS.md updated with domain management section
P4 — Dashboard:
- New domains.md page with KPI row + domain cards + repo lists
- domains.json.py + repos.json.py data loaders
- Domains page added to observablehq.config.js nav
- workstreams.md, extensions.md, techdept.md: domain_slug fix +
dynamic domain list loaded from /domains/ API (no longer hardcoded)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scripts, Makefile target, and MCP tool for checking a repository
against ADR-001 (workplans as repo artefacts, state-hub as cache).
Checks performed:
File-side: workplans/ dir exists, valid YAML frontmatter (required
fields, type, status, id format), filename matches id, embedded
task blocks have id/status/priority.
State-hub cross-reference: state_hub_workstream_id references
resolve to real DB records; orphan detection flags active DB
workstreams with no backing workplan file.
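A rough sketch of the file-side checks on a single workplan, assuming the frontmatter has already been parsed into a dict. The required-field subset, the id pattern (modelled on ids such as CUST-WP-0007) and the "filename starts with id" rule are all assumptions. The validator's real rules are not spelled out here.

```python
import re
from pathlib import Path

REQUIRED_FIELDS = ("id", "type", "status")  # assumed subset of the required fields
ID_PATTERN = re.compile(r"^[A-Z]+-WP-\d{4}$")  # assumed id format


def check_workplan(path, frontmatter):
    """Return a list of problem strings for one workplan file (empty if clean)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not frontmatter.get(field):
            problems.append(f"missing required field: {field}")

    wp_id = frontmatter.get("id") or ""
    if wp_id and not ID_PATTERN.match(wp_id):
        problems.append(f"bad id format: {wp_id}")
    # Assumed rule: the file stem must begin with the workplan id.
    if wp_id and not Path(path).stem.startswith(wp_id):
        problems.append(f"filename does not match id: {Path(path).name}")
    return problems
```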
Usage:
make validate-adr REPO=<path> [DOMAIN=<slug>]
validate_repo_adr(repo_path, domain_slug?) # MCP tool
Running against the-custodian itself correctly surfaces the 4
pre-ADR-001 workstreams that still need workplan files written.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>