Commit Graph

164 Commits

Author SHA1 Message Date
b54ee8149c feat(dashboard): add tokens consumed per day chart to Progress page
Fetches /token-events/?limit=1000 in parallel with progress events and
renders a second area+line chart (amber) below the events-per-day chart,
aggregating tokens_in + tokens_out per calendar day over the same 30-day window.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 00:42:09 +02:00
f5c166e77e feat(dashboard): add repo filter, sort order, and max results controls to Token Cost page
Three reactive dropdowns below the Token Cost heading:
- Filter by repo: client-side filter via 3-level chain resolution
- Sort by: Tokens Total (default), Tokens In, Out, Event Count, Most Recent
- Show: 10/20/50/100/500 rows per table (default 20)

Applies uniformly to By Repo, By Workplan, and Top Tasks tables.
"Most Recent" derives last_event_at per group from the fetched events.
Truncated tables show a "Showing M of N" count below.

Completes CUST-WP-0030 T07–T09.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-30 00:02:17 +02:00
20ce332d55 feat(dashboard): entity list UX — REF column, name cells, detail pages (CUST-WP-0030)
- ref-cell.js: REF column component — click=copy deeplink, dblclick=open
- field-help.js: field registry + fieldRow helper with help-tip decoration;
  FK fields (task_id, workstream_id, repo_id) render as async-linked cells
  with entity-title bubble-help on hover
- GET /token-events/{id} endpoint + get-by-id tests
- GET /repos/by-id/{repo_id} UUID lookup endpoint
- Landing pages: /token-events/[id], /workstreams/[id], /repos/[slug], /tasks/[id]
- token-cost.md: REF + Name columns on all three tables; parallel fetch of
  workstreams/tasks for title resolution
- reference.md: entity detail page URL scheme documented

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 22:35:35 +02:00
bb965e4990 feat(token-tracking): repo aggregation via graph walk (task→workstream→repo)
By Repo now resolves via the full chain rather than requiring repo_id
directly on the token event:
  1. token_events.repo_id (direct)
  2. → workstreams.repo_id (via workstream_id)
  3. → task.workstream_id → workstreams.repo_id (via task_id)

Changes:
- Auto-populate repo_id on token events at creation time (both the
  token_events router and the tasks router)
- New GET /token-events/by-repo/ endpoint with RepoTokenSummary schema;
  returns tokens_in/out/total, event_count, by_model, by_note per repo
- Dashboard By Repo section uses /by-repo/ directly and shows repo_slug
  instead of a truncated UUID
- Backfilled the three existing events (userbased) with repo_id via SQL

185 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 19:05:23 +02:00
68943684e1 feat(token-tracking): introduce token note taxonomy (measured/userbased/workplan/heuristic)
Tier 1 (exact counts) now defaults to note="measured" instead of null,
signalling the counts were read from the Claude Code status bar.
Callers can pass note="userbased" when a human provided the numbers.

  measured  — agent read exact counts from the Claude Code status bar
  userbased — counts provided by a human
  workplan  — prorated from workplan total across task count
  heuristic — server fallback, 1000/500, no agent input

Added token_note field to TaskUpdate schema and exposed note param on
update_task_status and record_interactive_task MCP tools.
TOOLS.md documents the full taxonomy. 185 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 18:47:40 +02:00
edffc62775 feat(token-tracking): add record_interactive_task MCP tool
New tool for capturing ad-hoc work done outside formal workplans.
Finds or creates a persistent 'interactive-<repo>' workstream for the
repo, creates the task, marks it done, and records a token event using
the three-tier logic — all in a single call.

Seeded two example events on interactive-the-custodian:
  - Three-tier token recording on task done (8000/3500)
  - Add record_interactive_task MCP tool (4500/1800)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 18:36:51 +02:00
e247c20439 feat(token-tracking): three-tier token recording on task done
Token events are now always created when update_task_status is called
with status="done", using the best available data:

  Tier 1 (best): exact tokens_in + tokens_out passed by agent
  Tier 2:        workplan_tokens_in + workplan_tokens_out prorated
                 across workstream task count (note="workplan")
  Tier 3 (fallback): heuristic 1000 in / 500 out (note="heuristic")

Non-done status changes never create a token event.
MCP tool updated with workplan_tokens_in/out params and tiered docs.
Ralph-workplan skill files updated with the three-tier guidance.
184 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 18:28:18 +02:00
d65bc701da feat(token-tracking): record AI token consumption per task (CUST-WP-0029)
Introduces end-to-end token consumption tracking so agent work is
visible as a cost/effort metric alongside tasks and workplans.

- Migration o2j3k4l5m6n7: token_events table with FK indexes on
  task_id, workstream_id, repo_id, created_at
- ORM model, Pydantic schemas (TokenEventCreate, TokenEventRead with
  computed tokens_total, TokenSummary)
- Router: POST /token-events/, GET /token-events/ (7 filters),
  GET /token-events/summary/ (task|workstream|repo|commit|release scope)
- MCP tools: record_token_event, get_token_summary (formatted table)
- update_task_status enriched with optional tokens_in/tokens_out
  passthrough — one call creates status update + token event
- Dashboard token-cost.md page: by-repo bar, by-workplan table,
  by-model bar, top-10 tasks by tokens
- ralph-workplan skill updated with token reporting guidance and
  per-task heuristics for estimating counts
- Tests: test_token_events.py + test_token_passthrough.py (182 pass)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-29 17:46:46 +02:00
b0b6c98aec fix(consistency): prevent post-commit hook re-entrancy loop
The post-commit hook re-invokes fix-consistency, which commits writeback
changes, which re-triggers the hook — causing exponential process spawning.

Fix: pass GIT_CUSTODIAN_SYNC=1 in the env for all writeback git commits.
Update the post-commit hook (not tracked by git) to exit early when this
variable is set.

Also remove the --no-verify flag that was added as a failed attempt (it
only skips pre-commit/commit-msg, not post-commit hooks).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 23:55:07 +01:00
091695766b feat(repos): git-fingerprint-based machine-independent repo identity
Add git_fingerprint (root commit SHA-1) to managed_repos as a stable,
machine-independent identifier — identical across every clone regardless
of checkout path, remote URL, or SSH alias.

- Migration n1i2j3k4l5m6: adds git_fingerprint column + non-unique index
  (non-unique to support repos that share ancestry via forks/splits)
- GET /repos/by-fingerprint?hash=<sha>[&remote_url=<url>]: lookup by
  fingerprint; optional remote_url disambiguates shared-ancestry repos
- GET /repos/by-remote?url=<url>: fallback lookup by remote URL
- consistency_check.py --here [PATH]: auto-detects repo slug from any
  local checkout via fingerprint (falls back to remote URL), then auto-
  registers host_paths[hostname] so subsequent runs need no override
- --all now includes repos with host_paths[current_hostname], not just
  those with local_path
- fix-consistency-here / check-consistency-here Makefile targets
- Fixed _api_get bug: httpx strips query strings when params={} is passed
- Backfilled fingerprints for 14 repos on this host

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 23:55:06 +01:00
aa88a53db2 feat(mcp): add create_topic tool
POST /topics/ was already implemented in the REST API but had no MCP
wrapper, so agents couldn't create topics (e.g. inter_hub) via MCP.
Tool follows the same pattern as create_domain.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 01:32:52 +01:00
662db5c593 fix(state-hub): fix mcp-http startup crash and remove legacy tunnel targets
- Add `Optional` to typing imports in mcp_server/server.py — it was used
  in 13 annotations but never imported, crashing FastMCP v3 at startup
- Remove legacy tunnel/tunnel-daemon/tunnel-loop/tunnel-status/tunnel-stop
  targets from Makefile; ops-bridge (tunnels-up/status/check) supersedes them

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 22:32:51 +01:00
b19896a9a9 docs(dashboard): add technical reference page for Observable Framework dashboard
Documents the dashboard's architecture, framework choice rationale, data-fetching
strategies (static loaders + live polling), component library, page inventory,
and key features including the Workstream Health Index and entity modals.
Also registers the new page in the Reference nav and adds runbook section for
node overload / runaway agent process (INC-002) with hardening checklist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-27 00:09:18 +01:00
6018df03cf feat(brief): generate .custodian-brief.md per repo for offline worker orientation
Adds _write_custodian_brief() to consistency_check.py. After every fix_repo()
run, a .custodian-brief.md is written to the repo root with: domain, last-synced
timestamp, current repo goal, active workstreams with progress (done/total), and
the first 7 open tasks per workstream (blocked → in_progress → todo order) with
task IDs. The file is git-committed when content changes so remote workers (e.g.
CoulombCore) can pull it and orient without a live MCP connection.

Session protocol template and CLAUDE.md updated: read .custodian-brief.md first,
then call get_domain_summary() as an enhancement (skip if MCP unreachable).
This eliminates false "State hub is offline" alarms in subagents and remote workers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 17:48:36 +01:00
8ad371f753 feat(consistency): fix-consistency-remote works without REPO for all repos
Adds --remote CLI flag and fix_all_remote() function. When run without a
REPO argument, the target checks all registered repos and:
- Skips repos whose local path does not exist on this machine
- Skips repos that are already clean (no fixable issues, no FAILs, not
  behind remote, only C-08 background noise allowed)
- For repos that need work: git pull --ff-only then fix_repo()

Prints a summary of CLEAN (skipped) and NOT ON THIS HOST (skipped) repos
before the detailed fix reports.

Simplifies the Makefile target from shell-level curl+git to a single
uv run call using --remote. Same flag handles both single-repo and all-repos.

Also adds _git_pull() helper and 13 new tests (71 total in consistency suite).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 14:38:30 +01:00
86fd570533 fix(consistency): correct behind-remote detection to not trigger on local-ahead
_detect_behind_remote was comparing HEAD != @{u} which incorrectly
triggered C-16 when the local repo had unpushed commits. Fixed to use
git rev-list --count HEAD..@{u} which only counts commits the remote
has that local lacks. Adds test_returns_false_when_local_ahead.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 13:31:28 +01:00
26c920ce3e feat(consistency): distributed multi-machine safety (CUST-WP-0026)
T01 — No-regress rule (C-15): fix-consistency now detects when a DB task
status is ahead of the workplan file (e.g. marked done on CoulombCore)
and emits C-15 WARN instead of regressing the DB back to the stale file
value. STATUS_ORDER ranking: todo(0) < in_progress/blocked(1) < done/cancelled(2).

T02 — Pull gate (C-16): fix_repo runs git fetch + rev-parse at the start
of every --fix run. If the local repo is behind its remote tracking branch,
all write operations are skipped and C-16 WARN is emitted. Best-effort:
offline/no-remote silently skips the check.

T03 — DB→file writeback: C-15 fix path patches the status field in the
matching task block and git-commits the change with a standard message.
--no-writeback flag disables writeback while keeping T01/T02 active.

T04 — CLAUDE.md + session-protocol.template updated with new guidance,
C-15/C-16 semantics, and fix-consistency-remote recommendation.

T05 — Makefile: fix-consistency-remote pulls then fixes in one step.

16 new tests; 155 passed total.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 10:19:23 +01:00
41d239c166 ops: establish ops/ directory with Gitea runbook and INC-001 incident report
- Create ops/runbooks/gitea-coulombcore.md — recovery checklist for Gitea
  on COULOMBCORE, documents containerd StartError pattern and CPU budget issue
- Create ops/incidents/2026-03-25-gitea-pgpool-crashloop.md — INC-001 post-mortem
  for 13-day Gitea outage (PGPool CrashLoopBackOff + rolling update CPU deadlock)
- Create ops/README.md — index for runbooks and incidents
- state-hub/dashboard/src/docs/connecting.md: add railiance01 tunnel config
  (was previously unsaved)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 11:30:44 +01:00
efbbef76b0 feat(capability-requests): add routing dispute & reroute workflow (CUST-WP-0027)
Adds a structured dispute mechanism when capability request routing is wrong:
- New `routing_disputed` status with four DB columns (dispute_reason, disputed_by,
  dispute_suggested_domain, disputed_at) via Alembic migration m0h1i2j3k4l5
- POST /capability-requests/{id}/dispute — any party can flag misrouting with a reason
  and optional suggested domain; notifies custodian + current fulfilling domain
- POST /capability-requests/{id}/reroute — custodian re-routes to correct domain via
  catalog_entry_id or direct slug; appends audit trail to routing_note; resets to requested
- Two new MCP tools: dispute_capability_routing and reroute_capability_request
- Dashboard: amber disputed-banner at top of Summary, routing_disputed Kanban column,
  dispute details (reason, suggested domain, raised-by) shown on disputed cards

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 23:58:52 +01:00
e423ff7126 feat(dashboard): add Tools & Apps page with liveness probes
New page at /tools listing all connected applications grouped by
category: Local Services (State Hub API, KeePassXC, pgAdmin, ops-bridge),
Source Control (Gitea), Identity/Auth (KeyCape, Authelia, privacyIDEA,
LLDAP), and Dev Tooling (Claude Code, uv). Local services show live
green/red/grey status dots via no-cors fetch probes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 01:18:11 +01:00
0777e5b2f0 feat: add FOS/credential standards, big-picture guidance, and CUST-WP-0025 workplan
- canon/standards/credential-management_v0.1.md: single root-of-trust credential hierarchy standard
- canon/standards/federated-organization-standard_v1.0.md: FOS reference architecture (VSM-based)
- wiki/BigPictureGuidance.md: integration guidance for OAS + FOS orthogonal layers
- workplans/CUST-WP-0025-fos-hub-bootstrap.md: 4-phase plan (identity, hub-core extraction, ops-hub, fin-hub)
- state-hub/Makefile: treat exit 2 (warnings-only) as success in check-consistency targets

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 23:48:13 +01:00
634642cb52 feat(capability-requests): add routing_note, PATCH endpoint, word-boundary fix, and ops-bridge tunnel targets
- Add `routing_note` column (migration l9g0h1i2j3k4) to persist why a request was routed to a given domain
- Fix substring-match bug in `_route_capability`: use `\b` word-boundary regex so 'postgres' no longer matches inside 'postgresql'
- Include `title` in keyword scoring for better routing accuracy
- Return `routing_note` string from `_route_capability` and store it on the request
- Add `PATCH /capability-requests/{id}` endpoint + `CapabilityRequestPatch` schema to correct mutable metadata (catalog_entry_id, priority, blocking_task_id, fulfilling_workstream_id)
- Add `patch_capability_request` MCP tool wrapping the new endpoint
- Add 105 lines of routing tests (word-boundary, title-match, multi-entry scoring, broadcast fallback)
- Add `tunnels-up`, `tunnels-status`, `tunnels-check` Makefile targets for ops-bridge managed tunnels

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 03:47:54 +01:00
1e0ae37c89 docs: add State Hub reference page and restructure reference index
New page (docs/state-hub.md) covers:
- Why: the invisible state problem across repos and agents
- What: Derived Data Store, Read Model, Agent Orchestration Layer,
  Cross-Repo Observatory — and what it is NOT
- Derived Data Store principle (ADR-003): fingerprint cache, rebuild
  guarantee, force-refresh
- Repository Orchestrator: session protocol, cross-domain coordination
  via messages + capability routing, Kaizen agents
- Architecture diagram (ASCII), technology choices, data model overview
- Running the hub, design principles, related docs

reference.md: add Architecture & Design section grouping state-hub,
TPSC, GDPR maturity, SCOPE.md, capabilities, and goals docs.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 02:01:58 +01:00
d63e7310d5 perf(doi): fingerprint-based DB cache for DoI results
Adds doi_cache table (migration k8f9a0b1c2d3). Results are stored after
each evaluation and reused on subsequent requests when the fingerprint
matches. Fingerprint covers repo.updated_at, latest TPSC snapshot_at,
latest goal updated_at, and mtime of SCOPE.md / CLAUDE.md / tpsc.yaml.

Behaviour:
- Summary (warm cache, nothing changed): ~0.4s (was 0.9s)
- Summary (one repo stale): ~0.9s (only stale repos recomputed)
- Single repo (cache hit): ~0.2s (was 40s for full check)
- Single repo ?force_refresh=true: ~2s (full C7/C13 subprocess check)

Total journey: 108s (original) → 6s → <1s → 0.2s (cached single repo)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:47:19 +01:00
6ae5cb6bf7 perf(doi): eliminate HTTP self-calls in summary — 48 calls → 3 bulk DB queries
Root cause: C2/C9/C10 each made a full HTTP round-trip back to the API
(asyncio.to_thread → urllib → TCP → uvicorn → SQLAlchemy → DB) for every
repo. 16 repos × 3 calls = 48 self-calls at ~80-150ms each = ~6s total.

Fix: doi_engine.evaluate() accepts a prefetch dict. The summary endpoint
runs 3 bulk GROUP BY queries (domain status, TPSC snapshot counts, active
goal counts) and passes results directly — zero HTTP self-calls in summary
mode.

Result: /repos/doi/summary 6s → <1s (6× improvement on top of prior 13×).
Total improvement from original: 108s → <1s.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:37:40 +01:00
7dec3ac9ee perf(dashboard): lazy-load DoI tiers on Repositories page
Page now renders in ~200ms. DoI badges and KPI card show a spinner
while the background fetch resolves (~6s), then update reactively
via Observable Mutable pattern (doiData / doiLoading).

Fast path: repos, SBOM, domains, workstreams — immediate render.
Slow path: /repos/doi/summary — background, non-blocking.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:31:48 +01:00
66e3a9afe4 perf(doi): 13x speedup for /repos/doi/summary (108s → ~6s)
Two fixes:
1. skip_consistency=True in summary mode — omits C7/C13 subprocess calls
   (consistency_check.py) which were the main bottleneck (32 spawns for 16 repos).
   Full check still available per-repo via GET /repos/{slug}/doi.
2. asyncio.gather — all repos evaluated in parallel instead of sequentially.

Also: rename Repositories page title from "Repos" to "Repositories".

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:29:27 +01:00
f94ee008b5 feat(doi): Repository DoI automated gate and dashboard integration (CUST-WP-0024)
Implements the 14-criterion DoI checklist as a runnable gate with API,
MCP tools, CLI script, and dashboard integration.

Core components:
- api/doi_engine.py — async engine evaluating all 14 criteria (asyncio.to_thread
  for non-blocking HTTP self-calls), shared by API and CLI
- api/schemas/doi.py — DoICriterion, DoIReport, DoISummaryEntry schemas
- api/routers/repos.py — GET /repos/{slug}/doi + GET /repos/doi/summary
- scripts/check_doi.py — CLI: make check-doi REPO=<slug> / check-doi-all
- mcp_server/server.py — check_repo_doi(), get_doi_summary() tools

Dashboard (repos.md):
- DoI tier badge per repo (None/Core/Standard/Full) colour-coded red→green
- Domain block shows lowest DoI tier across its repos
- DoI KPI card in summary row
- DoI filter in All Repos Table
- Link to Repository DoI policy page

Also fixes: TPSC snapshots 500 error (missing nested selectinload for
catalog_entry relationship in list_snapshots endpoint).

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 01:08:18 +01:00
a7b26ef6de docs(policy): add heading to workstream-dod policy file
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 00:42:08 +01:00
19d2b3cd31 docs(policy): add Repository Definition of Integrated (DoI)
Three-tier checklist defining what 'fully integrated with the state-hub'
means for a repository:
- Core (Registered): registered, domain assigned, path resolves, remote URL
- Standard (Integrated): SCOPE.md, CLAUDE.md, workplan convention, SBOM, TPSC
- Full (Fully Integrated): repo goal, capabilities declared, agents template,
  clean consistency check, host paths registered

Exposed via /policy/repo-doi (editable in dashboard) and linked under Policies.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 00:35:54 +01:00
0f9266cd91 docs(tpsc): add GDPR Maturity Model reference page
Full reference for the 7-level CNIL/IAPP CMMI-aligned scale used in TPSC:
source frameworks, per-level descriptions, suitability guidance, key GDPR
concepts (DPA, SCCs, adequacy, BCRs, Art.9), assignment decision tree,
and authoritative references.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 00:19:07 +01:00
c7a893f068 feat(tpsc): Third-Party Services Catalog (CUST-WP-0023)
Introduces TPSC for tracking external service dependencies with GDPR
compliance maturity (CNIL/IAPP CMMI scale), pricing model, ToS, and
data retention information across all repos.

Primary data:
- canon/tpsc/{openai,anthropic,gemini,openrouter}-api.yaml — service definitions
- tpsc.yaml in each repo (llm-connect seeded with 4 services)

State-hub additions:
- Migration j7e8f9a0b1c2: tpsc_catalog + tpsc_snapshots + tpsc_entries
- api/models/tpsc.py, api/schemas/tpsc.py, api/routers/tpsc.py
- /tpsc/catalog/, /tpsc/ingest/, /tpsc/snapshots/, /tpsc/report/gdpr endpoints
- 4 MCP tools: register_service, list_services, ingest_tpsc_tool, get_gdpr_report
- scripts/ingest_tpsc.py + make ingest-tpsc[/-all] targets
- Dashboard: tpsc.md page + docs/tpsc.md

GDPR maturity scale: unknown | non_compliant | initial | developing | defined | managed | certified
Warnings triggered at: unknown, non_compliant, initial

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 00:15:26 +01:00
118c5628e9 fix(mcp): resolve repo paths with existence check before trusting hostname match
Stale host_paths entries (wrong username, old machine) were silently overriding
the correct local_path, causing FileNotFoundError on tools like list_kaizen_agents.

Extracts _resolve_repo_path(repo) helper that tries host_paths[hostname] first
but validates the path exists on disk before trusting it, then falls back to
local_path. Both candidates support ~ expansion. Applied to all 4 call sites:
_kaizen_agents_dir, validate_repo_adr, check_repo_consistency, ingest_sbom_tool.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:38:35 +01:00
f85c5e4d49 feat(capability-requests): add cross-domain capability catalog and request routing
Introduces a capability catalog (CUST-WP-0022) so domains can advertise what
they provide and agents can request capabilities from other domains with
auto-routing, lifecycle tracking, and task-unblocking on completion.

- New models: CapabilityCatalog, CapabilityRequest with full lifecycle
  (requested → accepted → in_progress → ready_for_review → completed/rejected/withdrawn)
- Migration i6d7e8f9a0b1: capability_catalog + capability_requests tables
- Router /capability-catalog and /capability-requests with accept/status endpoints
- 7 new MCP tools: register_capability, list_capabilities, request_capability,
  accept_capability_request, update_capability_request_status,
  list_capability_requests, get_capability_request
- StateSummary gains open_capability_requests count
- Dashboard: capability-requests.md page + docs/capabilities.md + docs/scope.md
- SCOPE.md: three seed capabilities documented (MCP registration, state tracking, SBOM)
- scope.template: Provided Capabilities section with example block
- scripts/ingest_capabilities.py + make ingest-capabilities[/-all] targets

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:07:50 +01:00
4642a53d6e fix(dashboard): enrich repo-sync page with live SBOM snapshot stats
repos.json.py now fetches /sbom/snapshots/ alongside /repos/ and
annotates each repo with sbom_snapshot_count, sbom_entry_count, and a
last_sbom_at fallback derived from actual snapshot data. This prevents
"LastSBOM=never" when the denormalized field is out of sync.

repo-sync.md gains SBOM KPI tiles (ingested vs no-SBOM), color-coded
SBOM age column (same green/orange/red scale as state sync), and an
entry count column showing packages from the latest snapshot.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 01:34:02 +01:00
6eeb715b3f feat(sbom): add go.sum parser to ingest_sbom.py
Parses go.sum lockfiles for Go projects. Reads go.mod alongside to
mark direct vs indirect dependencies. Deduplicates by (module, version),
skipping go.mod hash lines.

Used to ingest key-cape (netkingdom domain): 23 Go modules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 01:04:34 +01:00
e1d5ba7417 fix(dashboard): clear API-unreachable warning when API recovers
Always call display() for the warning element so Observable Framework
replaces it on each poll re-run. Previously the conditional display()
call left the warning rendered indefinitely once shown.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 00:51:11 +01:00
7c81f4fba0 docs(mcp): switch MCP transport stdio → SSE, update all references
MCP server is now a persistent SSE service on :8001 (make mcp-http),
independent of the Claude Code session. Re-registration is a single
claude mcp add-json command; no patch_mcp_cwd.py needed.

- Makefile: mcp-http is primary transport, add fuser restart + updated comment
- state-hub/README.md: stack table, MCP section, troubleshooting note updated
- CLAUDE.md (project): registration instructions rewritten for SSE

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-19 00:05:56 +01:00
a9ba6ffd42 refactor(makefile): rename backend → api, fold raw uvicorn target in
The old bare `api` target (uvicorn only) is subsumed into the new `api`
target (db + postgres-wait + migrate + fuser-restart + uvicorn). Updated
all doc references and cleaned up duplicate entries left by the rename.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 23:20:45 +01:00
2542ff283f fix(makefile): use fuser port-kill instead of pkill pattern for restart
pkill -f matched the shell subprocess's own argv (which contains the
pattern as a -c argument), causing make to receive SIGTERM and abort.
fuser -k 8000/tcp / 3000/tcp targets only the process bound to the
port — no self-kill risk.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 23:18:31 +01:00
4d634b5ac7 refactor(makefile): rename start → backend, add restart logic for api and dashboard
- `make backend` replaces `make start`; polls postgres with nc (up to 10s)
  instead of fixed sleep, kills any running uvicorn before starting fresh
- `make dashboard` kills any running observable preview before restarting
- Update all references in CLAUDE.md, README.md, SCOPE.md, state-hub/README.md,
  and dashboard/src/docs/live-data.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 23:16:44 +01:00
fb90293fc4 feat(mcp): add list_tasks(workstream_id) tool — resolves FR 7074fd47
Agents had no way to look up task UUIDs by workstream; they were stuck
unable to call update_task_status without already knowing the UUID.
list_tasks() wraps GET /tasks with workstream_id filter, returning
[{id, title, status, priority}] for all matching tasks.

FR raised by kaizen-agentic worker on COULOMBCORE while syncing
KAIZEN-WP-0002 task IDs. Marked merged in contributions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 23:01:22 +01:00
27eb6b14ad feat(CUST-WP-0021): multi-host repo path hardening — all 5 tasks complete
- T01 (done prior): registered host_paths for bnt-lap001 (14 repos) and
  COULOMBCORE (6 repos) via POST /repos/{slug}/paths/
- T02: validate_repo_adr now accepts repo_slug (not raw path); resolves
  local path via host_paths[hostname] → local_path; clear error for
  unregistered/missing paths
- T03: ingest_sbom_tool lockfile_path is now optional and relative to
  resolved repo root; absolute paths accepted with deprecation warning
- T04: check_repo_consistency pre-flight guard — fetches repo, resolves
  path, returns clear error before spawning subprocess if path missing
- T05: TOOLS.md — updated validate_repo_adr row (slug not path);
  added Multi-Host & Remote Agent Usage section documenting design
  boundary, remote agent workflow, and update_repo_path usage

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 22:53:25 +01:00
ec92c8e95e feat(tests): pytest-asyncio test suite — 119 tests across 3 modules
Infrastructure (T01):
- tests/conftest.py: sync schema setup (psycopg2), per-test table
  truncation, async ASGI client with get_session override
- pyproject.toml: [tool.pytest.ini_options] asyncio_mode=auto
- Makefile: make test target with TEST_DATABASE_URL

Core router tests (T02): 19 tests
- domains, topics, workstreams, tasks, decisions + state summary
- Caught real bug: topic router missing duplicate-slug 409 guard (fixed)

TD/EP/Contributions/SBOM tests (T03): 10 tests
- CRUD + status transitions + lifecycle guard + SBOM ingest

MCP smoke tests (T04): 12 tests
- get_state_summary, create_task, update_task_status,
  add_progress_event, flag_for_human HTTP shapes

CI gate (T05): make test documented in CLAUDE.md session protocol

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 12:00:06 +01:00
8da2aeab44 fix(consistency_check): heading titles + workstream-aware task guards
- parse_task_blocks() now injects the nearest preceding ### heading
  text as `title` — tasks no longer stored with bare IDs as their title
- C-11 fix skips creating tasks when workstream is completed/archived
  (prevents duplicate task creation on repeated fix-consistency runs)
- C-12 is now fixable: auto-cancels open orphan DB tasks when the
  backing workstream is finished (completed/archived)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 08:05:07 +01:00
9aa54f8133 feat(api): CUST-WP-0018 — API hardening & code quality
T01: Fix datetime.utcnow() → datetime.now(tz=timezone.utc) in MCP server
T02: Wrap _get/_post/_patch/_delete with try/except; return error dicts
T03: Log warnings when write_log skips missing project path
T04: Add priority + due_date_before filters to GET /tasks/
T05: Add owner + slug filters to GET /workstreams/
T06: Add offset param to GET /progress/ for proper pagination
T07: Low-severity bundle:
  - CORS origins from CORS_ORIGINS env var (TD-017)
  - seed.py upsert domains+topics on re-run (TD-011)
  - normalise filter bar CSS → filter-text-input everywhere (TD-016)
  - add 30.5 avg-days-per-month comment in decisions.md (TD-019)
  - TD-009, TD-018 already resolved by existing code

Closes CUST-WP-0018.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 02:17:04 +01:00
9c9d5db632 fix(mcp): accept JSON string for add_progress_event detail param
FastMCP validates dict | None strictly, rejecting a JSON string even if
parseable. Broaden to dict | str | None and coerce in the function body
so callers don't need to pre-parse the detail payload.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 02:11:35 +01:00
dbf297f8fe fix(dashboard): rename Repository → Repositories, Policy → Policies
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 02:08:54 +01:00
0a6560ba9d fix(dashboard): pin Overview as first nav entry
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 02:07:03 +01:00
8fa1409995 feat(dashboard): reorder nav — flat pages first (alpha), sections below (alpha), Reference last
Sub-pages within sections also sorted alphabetically.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-18 02:06:23 +01:00