Compare commits

2 Commits

Author SHA1 Message Date
2cd3099ebf feat(canon): add Interface Change Registry concept and workplan
Concept doc captures the design for coordinated API evolution in agent
ecosystems: InterfaceChange entity, draft→published→resolved lifecycle,
TPSC-derived dependency routing, inbox-based notifications, pre-change
coordination via planned_for, and deliberate deferral of webhooks.

CUST-WP-0033 workplan: 6 tasks (model, API, dispatch integration,
MCP tools, dashboard, webhook EP).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 15:13:11 +02:00
8dd15efde1 fix(api): normalize trailing slashes — no slash on param routes
Rule: trailing slash only on collection roots (/). Any route containing
a path parameter {…} uses no trailing slash. Applies across all routers,
scripts, Makefile, and tests. Fixes 307-redirect fragility on POST/PATCH
from naive clients (curl, Codex HTTP calls).

Also adds POST /repos/{slug}/sync — runs ADR-001 consistency check with
--fix via HTTP, so non-MCP agents (Codex) can self-service DB sync without
operator intervention.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 15:13:01 +02:00
15 changed files with 406 additions and 21 deletions

View File

@@ -0,0 +1,162 @@
---
id: CUST-CPT-CUST-2026-000002
type: concept
title: "Interface Change Registry — Coordinated API Evolution for Agent Ecosystems"
status: active
owners: ["Bernd", "Custodian"]
created: "2026-04-26"
updated: "2026-04-26"
scope:
domains: ["custodian"]
sensitivity: internal
tags: ["interface", "api-evolution", "agent-coordination", "ecosystem", "change-management"]
related_workplan: CUST-WP-0033
---
# Interface Change Registry — Coordinated API Evolution for Agent Ecosystems
## Problem
In a distributed ecosystem of closely coupled repos and services, APIs and interfaces
evolve continuously. Without coordination, breaking changes propagate silently: a
service updates its REST API, and dependent agents or services discover the breakage
only at runtime — typically as a 422, a 307 redirect, or a schema validation failure.
Human-operated systems address this with release notes and changelogs. Agent-operated
systems need something machine-readable and actionable: a record of what changed, who
depends on it, and a channel to trigger adaptation before the breakage hits.
The trailing-slash normalisation performed on 2026-04-26 is a concrete example: it
was a deliberate, coordinated breaking change. Without a registry, every consumer had
to be found by manual grep. With one, the change record names the affected repos,
their agents receive inbox notifications, and each can adapt autonomously.
## Core Abstraction: InterfaceChange
An `InterfaceChange` record describes a single, versioned mutation to a published
interface boundary. It carries:
| Field | Purpose |
|---|---|
| `repo_slug` | The repo that owns the interface |
| `interface_type` | What kind of interface: `rest_api`, `mcp_tool`, `cli`, `schema`, `capability` |
| `change_type` | Nature of change: `breaking`, `additive`, `deprecation`, `removal` |
| `title` | Short human-readable summary |
| `description` | Full description with before/after detail |
| `affected_paths` | Specific endpoints, tool names, CLI commands, or schema fields changed |
| `affected_repo_slugs` | Repos known to consume this interface |
| `status` | `draft` → `published` → `resolved` |
| `planned_for` | Optional date — set when change is announced before it ships |
| `published_at` | When the change became live |
| `resolved_at` | When all known dependents have adapted |
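
The field table above could be sketched as a plain Python dataclass. This is only an illustration under stated assumptions: the class and enum names are hypothetical, the example slugs are made up, and the actual model (planned in CUST-WP-0033 T01) may look different.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date, datetime
from enum import Enum


class InterfaceType(str, Enum):
    REST_API = "rest_api"
    MCP_TOOL = "mcp_tool"
    CLI = "cli"
    SCHEMA = "schema"
    CAPABILITY = "capability"


class ChangeType(str, Enum):
    BREAKING = "breaking"
    ADDITIVE = "additive"
    DEPRECATION = "deprecation"
    REMOVAL = "removal"


class Status(str, Enum):
    DRAFT = "draft"
    PUBLISHED = "published"
    RESOLVED = "resolved"


@dataclass
class InterfaceChange:
    """Hypothetical in-memory shape of one record; not the real ORM model."""
    repo_slug: str
    interface_type: InterfaceType
    change_type: ChangeType
    title: str
    description: str
    affected_paths: list[str] = field(default_factory=list)
    affected_repo_slugs: list[str] = field(default_factory=list)
    status: Status = Status.DRAFT          # new records start as drafts
    planned_for: date | None = None        # set only for pre-announced changes
    published_at: datetime | None = None
    resolved_at: datetime | None = None


# Example record modelled on the trailing-slash normalisation.
# "example-hub" and "example-consumer" are placeholder slugs.
example = InterfaceChange(
    repo_slug="example-hub",
    interface_type=InterfaceType.REST_API,
    change_type=ChangeType.BREAKING,
    title="No trailing slash on param routes",
    description="Before: GET /repos/{slug}/  After: GET /repos/{slug}",
    affected_paths=["/repos/{slug}", "/repos/{slug}/paths"],
    affected_repo_slugs=["example-consumer"],
)
```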
## Lifecycle
```
draft ──publish──▶ published ──resolve──▶ resolved
  │                    │
  │            auto-notify affected
  │            repo agents (inbox)
  └── edit freely; no notifications yet
```
**draft** — change is being documented but not yet live or announced. Safe to edit.
Can be used to pre-announce a planned breaking change before it is merged.
**published** — the change is live (or imminently scheduled). On publish, the hub
automatically sends an inbox message to the agent of each `affected_repo_slug`.
The message contains enough context for the agent to identify what needs updating.
**resolved** — all known dependents have adapted and confirmed. Can be closed by
the originating agent or by any affected agent once it has updated its side.
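
A minimal sketch of the lifecycle as a transition table, assuming records are plain dicts. The names `ALLOWED_TRANSITIONS`, `InvalidTransition`, and `transition` are hypothetical. The workplan maps an invalid transition to HTTP 409; here it is simply an exception.

```python
from datetime import datetime, timezone

# Each status maps to the set of statuses it may move to.
ALLOWED_TRANSITIONS = {
    "draft": {"published"},
    "published": {"resolved"},
    "resolved": set(),
}


class InvalidTransition(Exception):
    """Raised when a status change is not permitted by the lifecycle."""


def transition(record, new_status, now=None):
    """Return a copy of `record` moved to `new_status`, stamping timestamps."""
    current = record["status"]
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise InvalidTransition(f"{current} -> {new_status} is not allowed")
    now = now or datetime.now(timezone.utc)
    updated = dict(record, status=new_status)
    if new_status == "published":
        updated["published_at"] = now
    elif new_status == "resolved":
        updated["resolved_at"] = now
    return updated


draft = {"status": "draft", "title": "No trailing slash on param routes"}
published = transition(draft, "published")
resolved = transition(published, "resolved")
```

Returning a copy rather than mutating in place keeps the draft record available for comparison. A real implementation would also send the inbox messages as part of the publish step.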
## Dependency Routing
`affected_repo_slugs` can be populated in two ways:
1. **Explicit** — the author lists known consumers when creating the record.
2. **Derived** — the hub queries the TPSC graph: repos that have a TPSC snapshot
declaring a dependency on the originating repo's service are automatically
included as candidates. The author confirms or trims the list before publish.
This means the TPSC catalog (`tpsc.yaml` files) is the underlying dependency map
for routing interface change notifications. Keeping TPSC current is what makes
automatic routing accurate.
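
The two population modes can be combined as in the sketch below. The structure of a TPSC snapshot is not specified here, so the example assumes the dependency graph has already been reduced to a plain mapping from consumer slug to the set of repo slugs it depends on. The function name and all slugs are hypothetical.

```python
def derive_affected_repos(origin_slug, tpsc_dependencies, explicit=()):
    """Merge explicitly listed consumers with TPSC-derived candidates.

    tpsc_dependencies: assumed mapping {consumer_slug: {dependency_slug, ...}}.
    Returns a sorted list so the author can review and trim it before publish.
    """
    derived = {
        consumer
        for consumer, deps in tpsc_dependencies.items()
        if origin_slug in deps and consumer != origin_slug
    }
    return sorted(derived | set(explicit))


# Placeholder dependency data, for illustration only.
deps = {
    "repo-a": {"example-hub"},
    "repo-b": {"other-service"},
    "repo-c": {"example-hub", "other-service"},
}
candidates = derive_affected_repos("example-hub", deps, explicit=["repo-d"])
```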
## Pre-Change Coordination
Setting `planned_for` and announcing the change while it is still in `draft` status
(then moving it to `published` on merge) enables a coordination window:
```
day 0 — change drafted, planned_for set to day 7
day 0 — dependent agents receive notification, begin adapting
day 7 — change lands, status set to published
day 7+ — agents confirm adaptation, change resolved
```
This is the proactive adaptation model: the breaking change announcement travels
faster than the change itself, giving dependents time to prepare. At scale, this
enables the ecosystem to self-heal around planned migrations.
## Agent Session Integration
Two integration points keep agents aware of pending changes without requiring
active polling:
**Session start (pull):** The `/repos/{slug}/dispatch` endpoint includes a
`pending_interface_changes` field — published changes that affect this repo and
are not yet resolved. Agents reading dispatch at session start see what they need
to adapt to.
**Inbox notification (push):** On publish, the hub sends an inbox message to each
affected repo's agent. The message includes the change title, description, and
`affected_paths` so the agent can locate the relevant code without additional API
calls.
Together, these ensure no published breaking change is invisible to an agent
working in an affected repo.
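
The pull side could be as simple as the filter below, a sketch assuming each change is a dict with `status` and `affected_repo_slugs` keys. The function name is hypothetical and the real dispatch query would run against the database.

```python
def pending_interface_changes(changes, slug):
    """Return published (hence unresolved) changes that list `slug` as affected."""
    return [
        change
        for change in changes
        if change["status"] == "published"
        and slug in change.get("affected_repo_slugs", [])
    ]


# Placeholder records, for illustration only.
all_changes = [
    {"id": "a", "status": "published", "affected_repo_slugs": ["repo-x", "repo-y"]},
    {"id": "b", "status": "draft", "affected_repo_slugs": ["repo-x"]},
    {"id": "c", "status": "resolved", "affected_repo_slugs": ["repo-x"]},
    {"id": "d", "status": "published", "affected_repo_slugs": ["repo-z"]},
]
pending_for_x = pending_interface_changes(all_changes, "repo-x")
```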
## Relationship to Existing State-Hub Entities
| Entity | Relationship |
|---|---|
| `ManagedRepo` | InterfaceChange.repo_id FK; affected_repo_slugs reference repo slugs |
| `TPSC` / `TPSCSnapshot` | Source for derived affected_repo_slugs via service dependency graph |
| `Decision` | A planned breaking change is conceptually a pending decision; consider linking |
| `ProgressEvent` | Publishing a change auto-appends a progress event to the originating repo |
| `Message` | Publish action sends inbox messages to affected agents |
## Webhook Extension (Deferred)
Inbox messages cover agents that poll at session start. Services that need
real-time push — CI pipelines, external webhooks, non-Custodian agents — require
a separate subscription mechanism.
This is explicitly deferred. The design leaves room for it:
- `InterfaceChange` records are immutable once published (safe to deliver idempotently)
- `affected_repo_slugs` is the routing key; a subscription table maps slugs to URLs
- Delivery semantics: at-least-once with exponential backoff
A dedicated EP (`EP-CUST-ICR-001`) will track this when the inbox-first approach
proves insufficient.
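
For orientation, at-least-once delivery with exponential backoff might look like the sketch below. Nothing here is implemented or decided: the function names, the default delays, and the `send` callable are assumptions. Because a retry can deliver the same payload twice, receivers would need to treat deliveries idempotently, which the immutability of published records makes straightforward.

```python
from __future__ import annotations

import time


def backoff_delays(attempts: int, base: float = 1.0, factor: float = 2.0,
                   max_delay: float = 60.0) -> list[float]:
    """Waits between attempts: base, base*factor, ... capped at max_delay."""
    return [min(base * factor ** i, max_delay) for i in range(attempts)]


def deliver_with_retry(send, payload, delays, sleep=time.sleep):
    """Call `send(payload)` until it returns True or the delays run out.

    Makes len(delays) + 1 attempts. Returns the 1-based number of the
    successful attempt, or None if every attempt failed.
    """
    attempt = 1
    for delay in [*delays, None]:
        try:
            if send(payload):
                return attempt
        except OSError:
            pass  # treat network-style errors as a failed attempt
        if delay is None:
            return None
        sleep(delay)
        attempt += 1
```

Passing `sleep` as a parameter keeps the function testable without real waiting.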
## Long-term Vision
The Interface Change Registry is a first step toward a **self-healing ecosystem**:
1. A repo publishes a breaking change with `planned_for = T+7`.
2. The hub identifies all dependents via TPSC and notifies their agents.
3. Each dependent agent opens a task, locates the affected code, and creates an
adaptation PR before T+7.
4. At T+7, the change ships; all dependents are already adapted.
5. The originating agent marks the change resolved.
No human coordination is needed for routine interface evolution. Humans remain in
the loop for non-routine changes — architectural decisions, security-sensitive
migrations, or changes that require cross-domain agreement — via the Decision entity
and the existing escalation protocol.
At greater scale, the dependency graph enables **contract testing**: two repos can
register a formal interface contract, and the hub can detect when a proposed change
would violate it before the change is merged.

View File

@@ -127,7 +127,7 @@ rename-domain:
register-path:
@test -n "$(REPO)" || (echo "ERROR: REPO is required. Usage: make register-path REPO=<slug> PATH=<path>"; exit 1)
@test -n "$(PATH)" || (echo "ERROR: PATH is required. Usage: make register-path REPO=<slug> PATH=<path>"; exit 1)
-curl -sf -X POST "http://127.0.0.1:8000/repos/$(REPO)/paths/" \
+curl -sf -X POST "http://127.0.0.1:8000/repos/$(REPO)/paths" \
-H "Content-Type: application/json" \
-d "{\"host\": \"$$(hostname)\", \"path\": \"$(PATH)\"}" | python3 -m json.tool

View File

@@ -78,7 +78,7 @@ async def create_contribution(
return contrib
-@router.get("/{contribution_id}/", response_model=ContributionRead)
+@router.get("/{contribution_id}", response_model=ContributionRead)
async def get_contribution(
contribution_id: uuid.UUID,
session: AsyncSession = Depends(get_session),
@@ -118,7 +118,7 @@ async def patch_contribution_status(
return contrib
-@router.delete("/{contribution_id}/", status_code=status.HTTP_204_NO_CONTENT)
+@router.delete("/{contribution_id}", status_code=status.HTTP_204_NO_CONTENT)
async def withdraw_contribution(
contribution_id: uuid.UUID,
session: AsyncSession = Depends(get_session),

View File

@@ -45,7 +45,7 @@ async def create_domain(
return domain
-@router.get("/{slug}/", response_model=DomainDetail)
+@router.get("/{slug}", response_model=DomainDetail)
async def get_domain(
slug: str,
session: AsyncSession = Depends(get_session),

View File

@@ -1,11 +1,17 @@
import asyncio
import json
import socket
import subprocess
import sys
import uuid
from datetime import datetime, timezone
from pathlib import Path
from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy import case, func, select
from sqlalchemy.ext.asyncio import AsyncSession
from api.config import settings
from api.database import get_session
from api.doi_engine import compute_fingerprint, evaluate as _doi_evaluate
from api.models.doi_cache import DOICache
@@ -337,7 +343,7 @@ async def get_repo_by_id(
return repo
-@router.get("/{slug}/", response_model=RepoRead)
+@router.get("/{slug}", response_model=RepoRead)
async def get_repo(
slug: str,
session: AsyncSession = Depends(get_session),
@@ -345,7 +351,7 @@ async def get_repo(
return await _get_repo_by_slug(slug, session)
-@router.patch("/{slug}/", response_model=RepoRead)
+@router.patch("/{slug}", response_model=RepoRead)
async def update_repo(
slug: str,
body: RepoUpdate,
@@ -359,7 +365,7 @@ async def update_repo(
return repo
-@router.post("/{slug}/paths/", response_model=RepoRead)
+@router.post("/{slug}/paths", response_model=RepoRead)
async def register_host_path(
slug: str,
body: RepoPathRegister,
@@ -471,6 +477,53 @@ async def get_repo_dispatch(
)
@router.post("/{slug}/sync")
async def sync_repo_consistency(
    slug: str,
    fix: bool = True,
    session: AsyncSession = Depends(get_session),
) -> dict:
    """Run ADR-001 consistency check (and optional --fix) for a repo via HTTP.
    Intended for non-Claude-Code agents (e.g. Codex) that cannot use MCP tools
    but need to sync workplan file state to the state-hub DB after making changes.
    Returns the raw JSON output from consistency_check.py.
    Query param ?fix=false to run check-only without writing.
    """
    repo = await _get_repo_by_slug(slug, session)
    hostname = socket.gethostname()
    host_paths = repo.host_paths or {}
    repo_path = host_paths.get(hostname)
    if not repo_path or not Path(repo_path).exists():
        raise HTTPException(
            status_code=503,
            detail=(
                f"No accessible path for repo '{slug}' on host '{hostname}'. "
                f"Register with: POST /repos/{slug}/paths"
            ),
        )
    script = Path(__file__).parent.parent.parent / "scripts" / "consistency_check.py"
    cmd = [sys.executable, str(script), "--repo", slug, "--json",
           "--api-base", settings.api_base]
    if fix:
        cmd.append("--fix")
    result = await asyncio.to_thread(
        subprocess.run, cmd, capture_output=True, text=True
    )
    try:
        return json.loads(result.stdout)
    except Exception:
        raise HTTPException(
            status_code=500,
            detail=f"Consistency check failed: {result.stderr or result.stdout or '(no output)'}",
        )
async def _get_repo_by_slug(slug: str, session: AsyncSession) -> ManagedRepo:
result = await session.execute(select(ManagedRepo).where(ManagedRepo.slug == slug))
repo = result.scalar_one_or_none()

View File

@@ -112,7 +112,7 @@ async def list_snapshots(
return [SBOMSnapshotRead.model_validate(s) for s in result.scalars().all()]
-@router.get("/snapshots/{snapshot_id}/", response_model=SBOMSnapshotDetail)
+@router.get("/snapshots/{snapshot_id}", response_model=SBOMSnapshotDetail)
async def get_snapshot(
snapshot_id: uuid.UUID,
session: AsyncSession = Depends(get_session),
@@ -209,7 +209,7 @@ async def licence_report(
return LicenceReport(groups=licence_groups, copyleft_direct_count=copyleft_direct_count)
-@router.get("/{repo_slug}/", response_model=SBOMRepoView)
+@router.get("/{repo_slug}", response_model=SBOMRepoView)
async def get_repo_sbom(
repo_slug: str,
session: AsyncSession = Depends(get_session),

View File

@@ -110,7 +110,7 @@ async def defer_td(
# ── Notes ─────────────────────────────────────────────────────────────────────
-@router.get("/{td_id}/notes/", response_model=list[TDNoteRead])
+@router.get("/{td_id}/notes", response_model=list[TDNoteRead])
async def list_notes(
td_id: uuid.UUID,
session: AsyncSession = Depends(get_session),
@@ -124,7 +124,7 @@ async def list_notes(
return list(result.scalars().all())
-@router.post("/{td_id}/notes/", response_model=TDNoteRead, status_code=status.HTTP_201_CREATED)
+@router.post("/{td_id}/notes", response_model=TDNoteRead, status_code=status.HTTP_201_CREATED)
async def add_note(
td_id: uuid.UUID,
body: TDNoteCreate,

View File

@@ -37,7 +37,7 @@ PROMPT_FILE = SCRIPT_DIR.parent / "prompts" / "sbom-capture-agent.md"
def resolve_repo_path(repo_slug: str) -> Path | None:
"""Look up the registered path for a repo slug via the state-hub API."""
-url = f"{API_BASE}/repos/{repo_slug}/"
+url = f"{API_BASE}/repos/{repo_slug}"
try:
with urllib.request.urlopen(url, timeout=10) as resp:
data = json.loads(resp.read())

View File

@@ -69,7 +69,7 @@ def _print_report(report, use_color: bool = True) -> None:
async def check_repo(slug: str, as_json: bool) -> bool:
try:
-repo = _get(f"/repos/{slug}/")
+repo = _get(f"/repos/{slug}")
except Exception as e:
print(f"✗ Could not fetch repo '{slug}': {e}", file=sys.stderr)
return False

View File

@@ -1083,7 +1083,7 @@ def fix_repo(
hostname = socket.gethostname()
if (repo_record.get("host_paths") or {}).get(hostname) != repo_path:
result = _api_post(
-api_base, f"/repos/{repo_slug}/paths/",
+api_base, f"/repos/{repo_slug}/paths",
{"host": hostname, "path": repo_path},
)
if result and "_error" not in result:
@@ -1333,7 +1333,7 @@ def fix_repo(
from datetime import timezone as _tz
import datetime as _dt
now_iso = _dt.datetime.now(_tz.utc).isoformat()
-_api_patch(api_base, f"/repos/{repo_slug}/", {"last_state_synced_at": now_iso})
+_api_patch(api_base, f"/repos/{repo_slug}", {"last_state_synced_at": now_iso})
# Write the worker orientation brief (.custodian-brief.md)
if repo_path:

View File

@@ -509,7 +509,7 @@ def post_ingest(api_base: str, repo_slug: str, entries: list[dict]) -> dict:
def _resolve_repo_path_from_hub(api_base: str, repo_slug: str) -> Path | None:
"""Query the hub for this host's registered path for repo_slug."""
try:
-url = f"{api_base}/repos/{repo_slug}/"
+url = f"{api_base}/repos/{repo_slug}"
with urllib.request.urlopen(url) as resp:
data = json.loads(resp.read())
hostname = socket.gethostname()

View File

@@ -45,7 +45,7 @@ resolve_path() {
local slug="$1"
# Try the registered local_path first
local api_path
-api_path=$(curl -sf "${API_BASE}/repos/${slug}/" | python3 -c \
+api_path=$(curl -sf "${API_BASE}/repos/${slug}" | python3 -c \
"import json,sys; d=json.load(sys.stdin); print(d.get('local_path') or '')" 2>/dev/null || true)
if [[ -n "$api_path" && -d "$api_path" ]]; then
echo "$api_path"

View File

@@ -73,7 +73,7 @@ echo " API OK"
# ── Step 2: Verify domain exists ───────────────────────────────────────────────
echo "==> Verifying domain '$DOMAIN' ..."
-DOMAIN_JSON="$(curl -sf "$API_BASE/domains/$DOMAIN/" 2>/dev/null || echo 'NOT_FOUND')"
+DOMAIN_JSON="$(curl -sf "$API_BASE/domains/$DOMAIN" 2>/dev/null || echo 'NOT_FOUND')"
if [[ "$DOMAIN_JSON" == "NOT_FOUND" ]] || ! echo "$DOMAIN_JSON" | python3 -c "import json,sys; d=json.load(sys.stdin); sys.exit(0 if d.get('slug') else 1)" 2>/dev/null; then
echo "ERROR: Domain '$DOMAIN' not found in the State Hub."
echo " To create: make add-domain DOMAIN=$DOMAIN NAME=\"<display name>\""
@@ -252,7 +252,7 @@ fi
# ── Step 7: Register this machine's local path ────────────────────────────────
echo "==> Registering host path for $(hostname) ..."
-curl -sf -X POST "$API_BASE/repos/$REPO_SLUG/paths/" \
+curl -sf -X POST "$API_BASE/repos/$REPO_SLUG/paths" \
-H "Content-Type: application/json" \
-d "{\"host\": \"$(hostname)\", \"path\": \"$PROJECT_PATH\"}" > /dev/null \
&& echo " host_paths[$(hostname)] = $PROJECT_PATH"

View File

@@ -68,13 +68,13 @@ class TestTechnicalDebt:
})
td_id = r.json()["id"]
-r2 = await client.post(f"/technical-debt/{td_id}/notes/", json={
+r2 = await client.post(f"/technical-debt/{td_id}/notes", json={
"step": "analysis",
"content": "Root cause identified.",
})
assert r2.status_code == 201
-r3 = await client.get(f"/technical-debt/{td_id}/notes/")
+r3 = await client.get(f"/technical-debt/{td_id}/notes")
assert r3.status_code == 200
assert len(r3.json()) == 1

View File

@@ -0,0 +1,170 @@
---
id: CUST-WP-0033
type: workplan
title: "Interface Change Registry — Coordinated API Evolution"
domain: custodian
repo: the-custodian
status: active
owner: custodian
topic_slug: custodian
created: "2026-04-26"
updated: "2026-04-26"
concept: canon/projects/custodian/interface_change_registry_v0.1.md
state_hub_workstream_id: "420a3981-abf5-4a8e-a94b-455964f1a0e5"
---
# CUST-WP-0033 — Interface Change Registry
## Goal
Add a first-class `InterfaceChange` entity to the state-hub. Agents producing
breaking changes can document them; agents consuming interfaces discover pending
changes at session start via dispatch, and receive inbox notifications on publish.
Webhook delivery is deferred and registered as an extension point.
Reference concept: `canon/projects/custodian/interface_change_registry_v0.1.md`
## T01: Data model and migration
```task
id: CUST-WP-0033-T01
status: todo
priority: high
state_hub_task_id: "6bc77d3c-78e0-485a-a3bc-b5987c4ccc53"
```
New table `interface_changes`. Fields: `id` (UUID PK), `repo_id` (FK
managed_repos), `interface_type` (String: rest_api / mcp_tool / cli / schema /
capability), `change_type` (String: breaking / additive / deprecation / removal),
`title` (String), `description` (Text), `affected_paths` (JSONB list of strings),
`affected_repo_slugs` (JSONB list of slugs), `status` (String: draft / published /
resolved), `planned_for` (Date nullable), `published_at` (DateTime nullable),
`resolved_at` (DateTime nullable), `author` (String), `created_at`, `updated_at`.
Index on `(repo_id, status)` and `(status)` for dispatch queries.
Acceptance: Alembic migration runs cleanly, model importable, no regressions in
existing test suite.
## T02: API endpoints
```task
id: CUST-WP-0033-T02
status: todo
priority: high
state_hub_task_id: "7664551e-0871-4e82-a9ba-b59be515c47c"
```
Router at `/interface-changes/` (prefix). Endpoints:
- `POST /interface-changes/` — create (status always `draft` on creation)
- `GET /interface-changes/` — list; filter params: `repo_slug`, `status`,
`change_type`, `affected_repo` (returns changes that affect the given slug)
- `GET /interface-changes/{change_id}` — single record
- `PATCH /interface-changes/{change_id}` — update mutable fields (title,
description, affected_paths, affected_repo_slugs, planned_for); only valid in
`draft` status
- `POST /interface-changes/{change_id}/publish` — transition draft → published;
sets `published_at`; fires inbox messages to agents of all `affected_repo_slugs`;
appends a progress event on the originating repo
- `POST /interface-changes/{change_id}/resolve` — transition published → resolved;
sets `resolved_at`
Acceptance: all endpoints return correct status codes; publish transitions send
inbox messages; 409 on invalid status transitions; tests cover happy path and
invalid transitions.
## T03: Dispatch integration
```task
id: CUST-WP-0033-T03
status: todo
priority: medium
state_hub_task_id: "8f8403a2-4444-4196-9845-ea9c66b674eb"
```
Extend `GET /repos/{slug}/dispatch` to include a `pending_interface_changes` field:
published `InterfaceChange` records where `affected_repo_slugs` contains `slug` and
status is not `resolved`. Each entry: `id`, `title`, `change_type`,
`interface_type`, `repo_slug` (origin), `affected_paths`, `planned_for`,
`published_at`.
Extend `DispatchWorkstream` schema with the new field. Update `RepoDispatch` schema.
Update the `get_repo_dispatch` endpoint accordingly.
Acceptance: `GET /repos/repo-registry/dispatch` returns `pending_interface_changes`
list (empty or populated); no regression on existing dispatch tests.
## T04: MCP tools
```task
id: CUST-WP-0033-T04
status: todo
priority: medium
state_hub_task_id: "d9135829-954e-41de-af9f-607768916478"
```
Four tools in `mcp_server/server.py`:
- `register_interface_change(repo_slug, interface_type, change_type, title,
description, affected_paths=None, affected_repo_slugs=None, planned_for=None)`
— creates a draft record
- `list_interface_changes(repo_slug=None, status=None, change_type=None,
affected_repo=None)` — returns formatted summary
- `publish_interface_change(change_id)` — publishes and triggers notifications
- `resolve_interface_change(change_id)` — marks resolved
Acceptance: tools callable from Claude Code; publish tool returns confirmation
of how many inbox messages were sent.
## T05: Dashboard page
```task
id: CUST-WP-0033-T05
status: todo
priority: low
state_hub_task_id: "d2fcbe83-c5a7-400c-a53b-ff1950795814"
```
New page `dashboard/src/interface-changes.md`. Shows:
- Table of published/draft changes grouped by repo, sorted by `published_at` desc
- Change type badge (breaking = red, deprecation = amber, additive = green)
- Affected repos column with count
- Filter by repo slug and change_type
- A "planned" section for changes with future `planned_for` dates, sorted
chronologically — effectively a migration calendar
Add to `observablehq.config.js` nav.
Acceptance: page renders; data loads from `GET /interface-changes/?status=published`
and `GET /interface-changes/?status=draft`.
## T06: Register webhook extension point
```task
id: CUST-WP-0033-T06
status: todo
priority: low
state_hub_task_id: "47d7bea8-b5fb-4fc4-9ec5-9ed5e0cdef72"
```
Register EP-CUST-ICR-001 in the state-hub:
```
interface_type: future_capability
ep_type: architecture
title: Webhook subscriptions for interface change notifications
description: |
  Inbox messages cover polling agents. Real-time push to CI pipelines,
  external webhooks, and non-Custodian agents requires a subscription table
  (repo_slug → webhook_url) and delivery infrastructure (retry, dead-letter).
  Defer until inbox-first approach proves insufficient for ≥1 real case.
status: open
priority: low
```
No implementation. Documents the deliberate deferral and records the design
direction so it is not re-invented later.
Acceptance: EP registered, retrievable via `GET /extension-points/?domain=custodian`.