WP-0016 finished: interactive registry maintain with llm-connect automation

Closes the registry maintenance loop from inside each domain repo:
interactive prompting for judgment calls, full automation for safe and
high-confidence changes, both backed by the llm-connect HTTP bridge.

- New modules: maintain.py, maintain_llm.py, patches.py, interactive.py
- Schema: schemas/registry-patch.schema.json
- CLI: reuse-surface maintain; establish --scaffold --hook
- Sibling templates: Makefile fragment, pre-commit hook
- Deterministic signal collectors extended; validate cwd auto-detect
- Docs, gap priority 28, SCOPE update
- Tests: test_maintain.py, test_interactive.py (59 pytest total)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 04:00:39 +02:00
parent 1afa7e5ee5
commit b24ec507aa
22 changed files with 3604 additions and 39 deletions
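
The commit message splits changes into "safe" ones that can be applied automatically and judgment calls that need a prompt. A minimal sketch of that triage follows, assuming a reduced set of safe kinds and a hypothetical `triage` helper; the commit's own `filter_auto_patches` additionally applies evidence and promotion-delta gates, which are left out here.

```python
from typing import Any

# Assumed subset of the deterministic "safe" kinds; patches.py in this commit defines the full set.
SAFE_KINDS = frozenset({"vector_sync", "evidence_append", "artifact_append", "index_updated_bump"})
CONFIDENCE_ORDER = {"low": 0, "medium": 1, "high": 2}


def triage(
    patches: list[dict[str, Any]], minimum: str = "high"
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split patches into (auto-applicable, needs human review)."""
    auto: list[dict[str, Any]] = []
    review: list[dict[str, Any]] = []
    for patch in patches:
        safe = patch.get("kind") in SAFE_KINDS
        rank = CONFIDENCE_ORDER.get(patch.get("confidence", "low"), 0)
        confident = rank >= CONFIDENCE_ORDER[minimum]
        # Safe kinds bypass the confidence check; everything else must meet the minimum.
        (auto if safe or confident else review).append(patch)
    return auto, review


sample = [
    {"kind": "vector_sync", "confidence": "low"},
    {"kind": "maturity_promote", "confidence": "high"},
    {"kind": "maturity_promote", "confidence": "medium"},
]
auto, review = triage(sample)
print(len(auto), len(review))
```

Lowering `minimum` to `"medium"` would move the third sample patch into the automatic bucket, which mirrors the intent of the `--auto-confidence` flag.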


@@ -32,6 +32,9 @@ jobs:
reuse-surface catalog
reuse-surface graph --check --fail-on-warnings
- name: Registry maintain dry-run (informational)
run: reuse-surface maintain --all --auto --no-llm || true
- name: Registry stats (informational)
run: reuse-surface stats || true


@@ -68,6 +68,8 @@ The MVP registry foundation, CLI tooling (REUSE-WP-0003), federation stack
`--roster registry/federation/local-repo-roster.yaml --federation-ready`)
- **Draft or refresh entries** with `reuse-surface establish --discover` and
`reuse-surface update` (optional llm-connect backend)
- **Maintain registry interactively or automatically** with `reuse-surface maintain`
(TTY prompts, `--auto`, optional llm-connect, `--publish` chain)
- **Run the hub locally or in a container** with `reuse-surface serve`
- **Generate relation graphs** with `reuse-surface graph`
- **Explore relations interactively** at `docs/graph/index.html`


@@ -194,15 +194,17 @@ consumer telemetry.
See §4 and archived workplans `workplans/archived/`.
### Proposed next (priorities 25–27)
### Proposed next (priorities 25–28)
| Priority | Gap | Suggested outcome | Status |
|---|---|---|---|
| 25 | Gitea publish visibility | Raw URL HTTP 200 for all roster rows | **Closed** (WP-0015-T01) |
| 26 | Federated ID deduplication | Per-owner removal from reuse-surface index | **Closed** (WP-0015-T02) |
| 27 | Planning analytics + standardization | Gap report or standardization tracker | **Partial** — gap report shipped (T03); tracker deferred |
| 28 | Registry maintenance automation | Interactive `maintain` + `--auto` with llm-connect | **Closed** (WP-0016) |
**Workplan:** `workplans/REUSE-WP-0015-federation-polish-and-planning-analytics.md`
**Workplan:** `workplans/REUSE-WP-0016-interactive-registry-maintain.md` (priority 28);
`workplans/REUSE-WP-0015-federation-polish-and-planning-analytics.md` (25–27)
**Assessment:** `history/2026-06-16-intent-scope-assessment.md`
**Follow-up docs:**
@@ -232,4 +234,5 @@ See §4 and archived workplans `workplans/archived/`.
| 2026-06-17 | WP-0013 closed priority 24 |
| 2026-06-16 | WP-0014 closed priority 18; 60 workstation repos |
| 2026-06-16 | **SCOPE refresh + full INTENT success-criteria mapping**; priorities 25–27 proposed |
| 2026-06-16 | Assessment persisted; **REUSE-WP-0015** created for priorities 25–27 |
| 2026-06-16 | Assessment persisted; **REUSE-WP-0015** created for priorities 25–27 |
| 2026-06-16 | **REUSE-WP-0016** closed priority 28 (interactive `maintain`, `--auto`, templates) |


@@ -129,12 +129,27 @@ completes the establishment checklist below.
cd ../state-hub
reuse-surface establish --scaffold --domain helix_forge
# optional: LLM_CONNECT_URL=... reuse-surface establish --discover --dry-run
reuse-surface validate --root .
reuse-surface validate
git push origin main
reuse-surface establish --publish-check \
--raw-url https://gitea.coulomb.social/coulomb/state-hub/raw/main/registry/indexes/capabilities.yaml
```
### Ongoing maintenance (from sibling repo)
```bash
export LLM_CONNECT_URL=http://127.0.0.1:8088 # optional
reuse-surface maintain --all --from-git-since origin/main
reuse-surface maintain --all --auto --no-llm # CI / pre-commit
reuse-surface maintain --publish \
--raw-url https://gitea.coulomb.social/coulomb/state-hub/raw/main/registry/indexes/capabilities.yaml \
--all --auto --no-llm
```
Copy `templates/Makefile.registry.fragment` for `make registry-maintain` /
`make registry-check`. Optional pre-commit hook:
`reuse-surface establish --scaffold --hook`.
### Registration checklist
1. Merge capability index to the default branch.


@@ -50,6 +50,33 @@ Discover drafts start at low maturity with explicit auto-draft risks in
`known_reliability_risks`. Promote only with evidence per
`specs/CapabilityMaturityStandard.md`.
## Maintain session checklist (REUSE-WP-0016)
After code or doc changes in the owning repo:
```bash
reuse-surface maintain --all --from-git-since origin/main
reuse-surface validate
git add registry/ && git commit -m "registry: maintain session"
git push origin main
```
Automation (CI or pre-commit):
```bash
reuse-surface maintain --all --auto --no-llm
```
With llm-connect for maturity suggestions:
```bash
export LLM_CONNECT_URL=http://127.0.0.1:8088
reuse-surface maintain --all --from-git-since HEAD~5
```
Review every non-deterministic patch before merge; promotions require evidence
citations on disk per `specs/CapabilityMaturityStandard.md`.
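
The "evidence citations on disk" rule can be sketched as a small check. This is an illustration only: `citations_exist` is a hypothetical helper name, and the temporary directory stands in for a real repository checkout.

```python
import tempfile
from pathlib import Path


def citations_exist(repo_root: Path, citations: list[str]) -> bool:
    """Return True only when at least one citation is given and every cited path exists."""
    if not citations:
        return False
    return all((repo_root / rel).exists() for rel in citations)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)  # stand-in for the owning repo
    (root / "tests").mkdir()
    (root / "tests" / "test_example.py").write_text(
        "def test_ok():\n    assert True\n", encoding="utf-8"
    )

    ok = citations_exist(root, ["tests/test_example.py"])
    missing = citations_exist(root, ["tests/test_missing.py"])
    empty = citations_exist(root, [])
    print(ok, missing, empty)
```

A promotion whose citations are empty or point at files that are not in the checkout fails the check, so it would be left for manual review rather than applied.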
## Manual validation checklist
Use this checklist until an automated CLI validator exists.


@@ -1,7 +1,7 @@
# Composed federated capability index. Regenerate with:
# reuse-surface federation compose
version: 1
updated: '2026-06-16'
updated: '2026-06-18'
domain: helix_forge
collision_policy: warn
sources:
@@ -162,7 +162,7 @@ sources:
url: https://gitea.coulomb.social/coulomb/ops-hub/raw/main/registry/indexes/capabilities.yaml
cache: registry/federation/cache/ops-hub.yaml
- repo: ops-warden
count: 0
count: 1
url: https://gitea.coulomb.social/coulomb/ops-warden/raw/main/registry/indexes/capabilities.yaml
cache: registry/federation/cache/ops-warden.yaml
- repo: phase-memory
@@ -430,6 +430,29 @@ capabilities:
source_repo: reuse-surface
source_url: https://gitea.coulomb.social/coulomb/reuse-surface/raw/main/registry/indexes/capabilities.yaml
source_index: registry/federation/cache/reuse-surface.yaml
- id: capability.security.ssh-certificate-issuance
name: SSH Certificate Issuance
summary: Issue short-lived CA-signed SSH certificates for adm, agt, and atm actors
through a stable cert_command CLI interface; steward NetKingdom operational access
routing.
vector: D4 / A3 / C3 / R2
domain: helix_forge
status: draft
owner: ops-warden
path: registry/capabilities/capability.security.ssh-certificate-issuance.md
tags:
- ssh
- certificate
- ca
- ops-warden
- openbao
- security
consumption_modes:
- CLI
- cert_command subprocess
source_repo: ops-warden
source_url: https://gitea.coulomb.social/coulomb/ops-warden/raw/main/registry/indexes/capabilities.yaml
source_index: registry/federation/cache/ops-warden.yaml
- id: capability.statehub.progress-log
name: Work Progress Logging
summary: Record progress events, decisions, and session notes against workstreams


@@ -33,6 +33,7 @@ from reuse_surface.reports import (
from reuse_surface.establish import (
discover_capabilities,
format_publish_check_markdown,
install_registry_hook,
publish_check,
scaffold_next_steps,
scaffold_registry,
@@ -52,6 +53,11 @@ from reuse_surface.stats import (
format_stats_json,
format_stats_markdown,
)
from reuse_surface.maintain import (
format_maintain_json,
format_maintain_markdown,
run_maintain,
)
from reuse_surface.registry import (
ROOT,
capability_paths,
@@ -62,13 +68,26 @@ from reuse_surface.registry import (
parse_front_matter,
parse_vector,
registry_paths,
resolve_repo_root,
)
def _registry_root(args: argparse.Namespace) -> Path:
if getattr(args, "root", None):
return Path(args.root).resolve()
return ROOT
return resolve_repo_root(getattr(args, "root", None))
def _make_validate_fn(repo_root: Path) -> Any:
def _validate() -> tuple[int, list[str], list[str]]:
errors, warnings, _paths = _run_validate(repo_root, target=None, relations=False)
for warning in warnings:
print(f"warning: {warning}", file=sys.stderr)
for error in errors:
print(f"error: {error}", file=sys.stderr)
if errors:
return 1, errors, warnings
return 0, errors, warnings
return _validate
def _check_index_drift(
@@ -427,6 +446,9 @@ def cmd_establish(args: argparse.Namespace) -> int:
)
for path in created:
print(f"ok: wrote {path.relative_to(repo_root)}")
if args.hook:
hook = install_registry_hook(repo_root, force=args.force)
print(f"ok: wrote {hook.relative_to(repo_root)}")
print(scaffold_next_steps(repo_root))
return 0
if args.publish_check:
@@ -461,6 +483,38 @@ def cmd_establish(args: argparse.Namespace) -> int:
return 1
def cmd_maintain(args: argparse.Namespace) -> int:
repo_root = Path(args.path or ".").resolve()
try:
if not args.capability and not args.all:
print("error: specify --capability or --all", file=sys.stderr)
return 1
result = run_maintain(
repo_root,
capability_id=args.capability,
all_capabilities=args.all,
git_since=args.from_git_since,
llm_url=args.llm_url,
no_llm=args.no_llm,
auto=args.auto,
yes=args.yes,
auto_confidence=args.auto_confidence,
auto_max_delta=args.auto_max_delta,
publish=args.publish,
raw_url=args.raw_url,
output_format=args.format,
validate_fn=_make_validate_fn(repo_root),
)
if args.format == "json":
print(format_maintain_json(result))
else:
print(format_maintain_markdown(result), end="")
return result.exit_code
except ValueError as exc:
print(f"error: {exc}", file=sys.stderr)
return 1
def cmd_update(args: argparse.Namespace) -> int:
repo_root = Path(args.path or ".").resolve()
try:
@@ -782,6 +836,11 @@ def main(argv: list[str] | None = None) -> int:
establish.add_argument("--raw-url", help="raw Gitea index URL for publish-check")
establish.add_argument("--llm-url", help="llm-connect base URL (or LLM_CONNECT_URL)")
establish.add_argument("--context-max-files", type=int, default=12)
establish.add_argument(
"--hook",
action="store_true",
help="install registry pre-commit hook (with --scaffold)",
)
establish.set_defaults(func=cmd_establish)
update = subparsers.add_parser("update", help="refresh registry metadata from repo signals")
@@ -795,6 +854,28 @@ def main(argv: list[str] | None = None) -> int:
update.add_argument("--format", choices=["markdown", "json"], default="markdown")
update.set_defaults(func=cmd_update)
maintain = subparsers.add_parser(
"maintain", help="interactive or automated registry maintenance"
)
maintain.add_argument("--path", help="repo root (default: cwd)")
maintain.add_argument("--capability", help="single capability id")
maintain.add_argument("--all", action="store_true")
maintain.add_argument("--from-git-since", help="git ref for change detection")
maintain.add_argument("--llm-url", help="llm-connect base URL (or LLM_CONNECT_URL)")
maintain.add_argument("--no-llm", action="store_true")
maintain.add_argument("--auto", action="store_true", help="apply safe + gated LLM patches")
maintain.add_argument("--yes", action="store_true", help="non-TTY auto-apply equivalent")
maintain.add_argument(
"--auto-confidence",
choices=["low", "medium", "high"],
default="high",
)
maintain.add_argument("--auto-max-delta", type=int, default=1)
maintain.add_argument("--publish", action="store_true")
maintain.add_argument("--raw-url", help="raw Gitea index URL for --publish")
maintain.add_argument("--format", choices=["markdown", "json"], default="markdown")
maintain.set_defaults(func=cmd_maintain)
args = parser.parse_args(argv)
return args.func(args)


@@ -11,7 +11,9 @@ from typing import Any
import yaml
from reuse_surface.llm_bridge import request_registry_draft
from reuse_surface.registry import load_index_at, registry_paths
from reuse_surface.registry import ROOT, load_index_at, registry_paths
HOOK_TEMPLATE = ROOT / "templates" / "git-hook.pre-commit.registry"
SCAFFOLD_README = """# Capability Registry
@@ -80,6 +82,18 @@ def scaffold_registry(
return created
def install_registry_hook(repo_root: Path, *, force: bool = False) -> Path:
hook_path = repo_root / ".git" / "hooks" / "pre-commit"
if not (repo_root / ".git").is_dir():
raise ValueError(f"not a git repository: {repo_root}")
if hook_path.exists() and not force:
raise ValueError(f"hook already exists: {hook_path}; use --force to overwrite")
hook_path.parent.mkdir(parents=True, exist_ok=True)
hook_path.write_text(HOOK_TEMPLATE.read_text(encoding="utf-8"), encoding="utf-8")
hook_path.chmod(0o755)
return hook_path
def scaffold_next_steps(repo_root: Path) -> str:
return textwrap.dedent(
f"""

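The hook installer's refuse-unless-forced behaviour can be tried against a throwaway directory. In this sketch `install_hook` is a simplified stand-in for `install_registry_hook`, and `HOOK_BODY` is an assumed hook script; the actual template file is not shown in this commit view.

```python
import tempfile
from pathlib import Path

# Hypothetical hook content; the real template lives under templates/ and may differ.
HOOK_BODY = "#!/bin/sh\nreuse-surface maintain --all --auto --no-llm\n"


def install_hook(repo_root: Path, body: str, *, force: bool = False) -> Path:
    """Write a pre-commit hook, refusing to overwrite an existing one unless forced."""
    if not (repo_root / ".git").is_dir():
        raise ValueError(f"not a git repository: {repo_root}")
    hook_path = repo_root / ".git" / "hooks" / "pre-commit"
    if hook_path.exists() and not force:
        raise ValueError(f"hook already exists: {hook_path}")
    hook_path.parent.mkdir(parents=True, exist_ok=True)
    hook_path.write_text(body, encoding="utf-8")
    hook_path.chmod(0o755)  # make the hook executable
    return hook_path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / ".git").mkdir()  # stand-in for a real repository
    first = install_hook(root, HOOK_BODY)
    installed_text = first.read_text(encoding="utf-8")
    try:
        install_hook(root, HOOK_BODY)
        second_refused = False
    except ValueError:
        second_refused = True
    overwritten = install_hook(root, HOOK_BODY, force=True).exists()
    print(second_refused, overwritten)
```

One caveat worth keeping in mind: in linked worktrees and submodules `.git` is a file rather than a directory, so an `is_dir()` check like this one would report them as "not a git repository".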

@@ -0,0 +1,119 @@
from __future__ import annotations
import json
import os
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Any, Literal
from reuse_surface.patches import is_safe_patch
PromptAction = Literal["apply", "skip", "edit", "quit", "apply_all_safe"]
class NonInteractiveError(ValueError):
pass
def is_tty() -> bool:
return sys.stdin.isatty() and sys.stdout.isatty()
def format_patch_summary(patch: dict[str, Any]) -> str:
lines = [
f" capability: {patch['capability_id']}",
f" kind: {patch['kind']}",
f" confidence: {patch.get('confidence', 'n/a')}",
f" rationale: {patch.get('rationale', '')}",
]
for key in ("append", "value", "field_path", "dimension", "from_level", "to_level"):
if patch.get(key) is not None:
lines.append(f" {key}: {patch[key]}")
if patch.get("evidence_citations"):
lines.append(f" evidence: {', '.join(patch['evidence_citations'])}")
return "\n".join(lines)
def emit_event(event: str, payload: dict[str, Any]) -> None:
print(json.dumps({"event": event, **payload}, sort_keys=True))
def prompt_patch(patch: dict[str, Any]) -> PromptAction:
    print("\n--- Registry patch ---")
    print(format_patch_summary(patch))
    while True:
        raw = input("[a]pply [s]kip [e]dit [q]uit [A]pply all safe? ").strip()
        if raw == "":
            continue
        # Match the case-sensitive "A" shortcut before lowercasing; after
        # lower() it would be indistinguishable from "a" (apply).
        if raw == "A" or raw.lower() == "apply all safe":
            return "apply_all_safe"
        choice = raw.lower()
        if choice in {"a", "apply"}:
            return "apply"
        if choice in {"s", "skip"}:
            return "skip"
        if choice in {"e", "edit"}:
            return "edit"
        if choice in {"q", "quit"}:
            return "quit"
        print("Invalid choice.")
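def edit_patch(patch: dict[str, Any]) -> dict[str, Any]:
    import shlex

    import yaml

    # EDITOR may carry flags (for example "code --wait"), so split it like a shell would.
    editor_cmd = shlex.split(os.environ.get("EDITOR") or "nano")
    with tempfile.NamedTemporaryFile(
        "w", suffix=".yaml", delete=False, encoding="utf-8"
    ) as handle:
        yaml.safe_dump(patch, handle, sort_keys=False)
        temp_path = Path(handle.name)
    try:
        subprocess.run([*editor_cmd, str(temp_path)], check=False)
        edited = yaml.safe_load(temp_path.read_text(encoding="utf-8"))
    finally:
        # Remove the temp file even if the editor or the YAML parse fails.
        temp_path.unlink(missing_ok=True)
    # Fall back to the original patch if the edited file is not a mapping.
    if not isinstance(edited, dict):
        return patch
    return edited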
def prompt_batch(
    patches: list[dict[str, Any]],
    *,
    assume_yes: bool = False,
    auto_mode: bool = False,
    emit_json: bool = False,
) -> list[dict[str, Any]]:
    if auto_mode or assume_yes:
        return list(patches)
    if not is_tty():
        # Without a TTY nothing can be confirmed; optionally emit the
        # suggestions as JSON events, then fail either way.
        if emit_json:
            for patch in patches:
                emit_event("suggestion", {"patch": patch, "default": "skip"})
        raise NonInteractiveError(
            "non-interactive stdin; use --auto or --yes to apply patches"
        )
    selected: list[dict[str, Any]] = []
    index = 0
    while index < len(patches):
        patch = patches[index]
        action = prompt_patch(patch)
        if action == "apply_all_safe":
            # Take every remaining safe patch (including the current one) and stop prompting.
            selected.extend(p for p in patches[index:] if is_safe_patch(p))
            break
        if action == "quit":
            break
        if action == "skip":
            index += 1
            continue
        if action == "edit":
            patch = edit_patch(patch)
        selected.append(patch)
        index += 1
    return selected
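
The control flow of the prompt loop is easier to follow when the answers are scripted instead of typed. The sketch below replays a list of actions against a list of patches; `select_patches` and the reduced `SAFE_KINDS` set are assumptions for illustration, and the "edit" action is omitted.

```python
from typing import Any

# Assumed subset of safe kinds, for illustration only.
SAFE_KINDS = frozenset({"vector_sync", "evidence_append", "index_updated_bump"})


def select_patches(patches: list[dict[str, Any]], actions: list[str]) -> list[dict[str, Any]]:
    """Replay scripted actions against patches, mirroring the prompt loop's branches."""
    selected: list[dict[str, Any]] = []
    answers = iter(actions)
    for index, patch in enumerate(patches):
        action = next(answers, "quit")  # running out of answers ends the session
        if action == "apply_all_safe":
            # Keep every remaining safe patch, starting with the current one.
            selected.extend(p for p in patches[index:] if p["kind"] in SAFE_KINDS)
            break
        if action == "quit":
            break
        if action == "apply":
            selected.append(patch)
        # "skip" simply moves on to the next patch
    return selected


queue = [
    {"capability_id": "capability.a", "kind": "vector_sync"},
    {"capability_id": "capability.b", "kind": "maturity_promote"},
    {"capability_id": "capability.c", "kind": "evidence_append"},
]
after_bulk = select_patches(queue, ["skip", "apply_all_safe"])
after_quit = select_patches(queue, ["apply", "quit"])
print([p["capability_id"] for p in after_bulk], [p["capability_id"] for p in after_quit])
```

Note that "apply all safe" drops the non-safe promotion it was asked about, so a reviewer who wants that promotion has to apply it explicitly before bulk-accepting the rest.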

reuse_surface/maintain.py (new file, 214 lines)

@@ -0,0 +1,214 @@
from __future__ import annotations
import json
import os
from dataclasses import dataclass, field
from pathlib import Path
from typing import Any, Callable
from reuse_surface.establish import format_publish_check_markdown, publish_check
from reuse_surface.interactive import NonInteractiveError, prompt_batch
from reuse_surface.maintain_llm import request_maintain_patches
from reuse_surface.patches import (
apply_patches_atomic,
filter_auto_patches,
patches_from_suggestions,
)
from reuse_surface.registry import load_index_at, registry_paths
from reuse_surface.registry_update import collect_deterministic_suggestions
@dataclass
class MaintainResult:
selected_count: int = 0
applied: list[str] = field(default_factory=list)
skipped: int = 0
notes: list[str] = field(default_factory=list)
publish: dict[str, Any] | None = None
exit_code: int = 0
def collect_capability_ids(
repo_root: Path,
*,
capability_id: str | None,
all_capabilities: bool,
) -> list[str]:
index = load_index_at(registry_paths(repo_root)["index"])
ids = [row["id"] for row in index.get("capabilities", [])]
if capability_id:
if capability_id not in ids:
raise ValueError(f"capability not in index: {capability_id}")
return [capability_id]
if all_capabilities:
return ids
raise ValueError("specify --capability or --all")
def gather_patches(
repo_root: Path,
*,
capability_ids: list[str],
git_since: str | None,
llm_url: str | None,
no_llm: bool,
) -> tuple[list[dict[str, Any]], list[str]]:
patches: list[dict[str, Any]] = []
notes: list[str] = []
scope_id = capability_ids[0] if len(capability_ids) == 1 else None
suggestions = collect_deterministic_suggestions(
repo_root,
capability_id=scope_id,
git_since=git_since,
)
patches = patches_from_suggestions(suggestions)
if scope_id:
patches = [
patch
for patch in patches
if patch["capability_id"] == scope_id
or patch.get("kind") == "index_updated_bump"
]
if no_llm:
return patches, notes
try:
for cap_id in capability_ids:
payload = request_maintain_patches(
repo_root,
cap_id,
git_since=git_since,
llm_url=llm_url,
)
patches.extend(payload.get("patches", []))
notes.extend(payload.get("notes", []))
except ValueError as exc:
if "LLM backend not configured" in str(exc):
notes.append("LLM phase skipped: LLM_CONNECT_URL not set")
else:
notes.append(f"LLM phase skipped: {exc}")
return patches, notes
def run_maintain(
repo_root: Path,
*,
capability_id: str | None = None,
all_capabilities: bool = False,
git_since: str | None = None,
llm_url: str | None = None,
no_llm: bool = False,
auto: bool = False,
yes: bool = False,
auto_confidence: str = "high",
auto_max_delta: int = 1,
publish: bool = False,
raw_url: str | None = None,
output_format: str = "markdown",
validate_fn: Callable[[], tuple[int, list[str], list[str]]] | None = None,
) -> MaintainResult:
if publish and not raw_url:
raw_url = os.environ.get("REUSE_SURFACE_RAW_URL")
if publish and not raw_url:
raise ValueError("--publish requires --raw-url or REUSE_SURFACE_RAW_URL")
cap_ids = collect_capability_ids(
repo_root,
capability_id=capability_id,
all_capabilities=all_capabilities,
)
patches, notes = gather_patches(
repo_root,
capability_ids=cap_ids,
git_since=git_since,
llm_url=llm_url,
no_llm=no_llm,
)
result = MaintainResult(notes=notes)
if not patches:
result.exit_code = 0
if publish and raw_url:
result.publish = publish_check(repo_root, raw_url=raw_url)
if not result.publish["ok"]:
result.exit_code = 1
return result
if auto or yes:
selected = filter_auto_patches(
patches,
repo_root,
auto_confidence=auto_confidence,
auto_max_delta=auto_max_delta,
)
result.skipped = len(patches) - len(selected)
else:
try:
selected = prompt_batch(
patches,
assume_yes=yes,
auto_mode=False,
emit_json=output_format == "json",
)
result.skipped = len(patches) - len(selected)
except NonInteractiveError as exc:
raise ValueError(str(exc)) from exc
result.selected_count = len(selected)
if not selected:
result.exit_code = 2
return result
if validate_fn is None:
raise ValueError("validate_fn is required")
applied, code = apply_patches_atomic(repo_root, selected, validate=validate_fn)
result.applied = applied
result.exit_code = code
if code == 0 and publish and raw_url:
result.publish = publish_check(repo_root, raw_url=raw_url)
if not result.publish["ok"]:
result.exit_code = 1
return result
def format_maintain_markdown(result: MaintainResult) -> str:
lines = ["# Registry maintain session", ""]
lines.append(f"**Selected:** {result.selected_count} patch(es)")
lines.append(f"**Skipped:** {result.skipped}")
lines.append(f"**Exit:** {result.exit_code}")
if result.applied:
lines.append("")
lines.append("## Applied")
for item in result.applied:
lines.append(f"- {item}")
if result.notes:
lines.append("")
lines.append("## Notes")
for note in result.notes:
lines.append(f"- {note}")
if result.publish:
lines.append("")
lines.append(format_publish_check_markdown(result.publish).rstrip())
return "\n".join(lines) + "\n"
def format_maintain_json(result: MaintainResult) -> str:
return json.dumps(
{
"selected_count": result.selected_count,
"skipped": result.skipped,
"applied": result.applied,
"notes": result.notes,
"publish": result.publish,
"exit_code": result.exit_code,
},
indent=2,
sort_keys=True,
)
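
A caller such as a CI step may want to act on the JSON report rather than on the raw exit status. The following sketch assumes the report shape produced by `format_maintain_json` and reads exit code 2 as "patches were proposed but none were selected"; `summarize` is a hypothetical helper, and any other non-zero code is simply treated as a failure.

```python
import json


def summarize(report_json: str) -> str:
    """Turn a maintain JSON report into a one-line status message."""
    report = json.loads(report_json)
    code = report["exit_code"]
    if code == 0:
        return f"ok: {len(report['applied'])} change(s) applied"
    if code == 2:
        return f"no patches selected ({report['skipped']} skipped)"
    return f"failed with exit code {code}"


# Sample reports built by hand with the same keys as the real output.
nothing_selected = json.dumps(
    {"selected_count": 0, "skipped": 3, "applied": [], "notes": [], "publish": None, "exit_code": 2}
)
applied_one = json.dumps(
    {
        "selected_count": 1,
        "skipped": 0,
        "applied": ["index.updated bumped"],
        "notes": [],
        "publish": None,
        "exit_code": 0,
    }
)
print(summarize(nothing_selected))
print(summarize(applied_one))
```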


@@ -0,0 +1,160 @@
from __future__ import annotations
import json
import subprocess
import textwrap
from pathlib import Path
from typing import Any
import yaml
from jsonschema import Draft202012Validator
from reuse_surface.llm_bridge import execute_prompt, extract_json_object
from reuse_surface.registry import ROOT, load_index_at, parse_front_matter, registry_paths
PATCH_SCHEMA_PATH = ROOT / "schemas" / "registry-patch.schema.json"
MATURITY_SUMMARY = """
| Dimension | Levels | Question |
|---|---|---|
| discovery | D0–D7 | Planning/orientation reuse strength |
| availability | A0–A7 | Consumption mode and delivery artifacts |
| completeness | C0–C6 | Scope vs intent and expectations |
| reliability | R0–R6 | Consumer quality signals |
Promotion rules:
- Cite repo-relative evidence paths for every maturity_promote patch.
- Prefer single-step promotions (one level per dimension).
- Do not invent files; only cite paths visible in git diff or context.
"""
def load_patch_schema() -> dict[str, Any]:
return json.loads(PATCH_SCHEMA_PATH.read_text(encoding="utf-8"))
def _git_diff(repo_root: Path, git_since: str | None) -> str:
if not git_since:
return ""
proc = subprocess.run(
[
"git",
"-C",
str(repo_root),
"diff",
git_since,
"HEAD",
"--",
"registry/",
"reuse_surface/",
"tests/",
"docs/",
".gitea/",
"pyproject.toml",
],
capture_output=True,
text=True,
check=False,
)
return proc.stdout[:12000]
def build_maintain_prompt(
repo_root: Path,
capability_id: str,
*,
git_since: str | None = None,
context_files: list[str] | None = None,
) -> str:
paths = registry_paths(repo_root)
index = load_index_at(paths["index"])
row = next((item for item in index["capabilities"] if item["id"] == capability_id), None)
if not row:
raise ValueError(f"capability not in index: {capability_id}")
entry = parse_front_matter(repo_root / row["path"])
diff = _git_diff(repo_root, git_since)
context_chunks: list[str] = []
for rel in context_files or []:
path = repo_root / rel
if path.is_file():
context_chunks.append(f"### {rel}\n{path.read_text(encoding='utf-8')[:4000]}")
schema_hint = json.dumps(
{
"patches": [
{
"capability_id": capability_id,
"kind": "maturity_promote",
"confidence": "medium",
"rationale": "CI gate added",
"dimension": "reliability",
"from_level": "R2",
"to_level": "R3",
"evidence_citations": ["tests/test_example.py"],
"promotion_history_entry": {
"date": "2026-06-16",
"dimension": "reliability",
"from": "R2",
"to": "R3",
"rationale": "pytest coverage for consumption path",
},
}
],
"notes": ["optional human review items"],
},
indent=2,
)
return textwrap.dedent(
f"""
Propose structured registry maintenance patches for `{capability_id}`.
Return ONLY JSON matching this shape (no markdown fences):
{schema_hint}
Allowed patch kinds: vector_sync, evidence_append, artifact_append,
maturity_promote, consumer_feedback, relation_add, index_row_add,
index_updated_bump.
Maturity reference:
{MATURITY_SUMMARY}
Current entry YAML:
{yaml.safe_dump(entry, sort_keys=False)}
Git diff since {git_since or 'N/A'}:
{diff or '(none)'}
Context files:
{chr(10).join(context_chunks) if context_chunks else '(none)'}
"""
).strip()
def request_maintain_patches(
repo_root: Path,
capability_id: str,
*,
git_since: str | None = None,
context_files: list[str] | None = None,
llm_url: str | None = None,
) -> dict[str, Any]:
prompt = build_maintain_prompt(
repo_root,
capability_id,
git_since=git_since,
context_files=context_files,
)
content = execute_prompt(
prompt,
base_url=llm_url,
config={"temperature": 0.2, "max_tokens": 3000},
)
payload = extract_json_object(content)
validator = Draft202012Validator(load_patch_schema())
errors = sorted(validator.iter_errors(payload), key=lambda err: list(err.path))
if errors:
messages = "; ".join(error.message for error in errors[:3])
raise ValueError(f"patch schema validation failed: {messages}")
return payload
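
The payload returned by the model is validated against `schemas/registry-patch.schema.json`, which is not shown in this view. As a rough illustration of the kind of structural checks involved, the sketch below uses only the standard library; the required fields are assumptions inferred from the prompt's example shape, and `check_payload` is a hypothetical helper, not the schema itself.

```python
from typing import Any

# Patch kinds listed in the prompt text above.
ALLOWED_KINDS = frozenset(
    {
        "vector_sync", "evidence_append", "artifact_append", "maturity_promote",
        "consumer_feedback", "relation_add", "index_row_add", "index_updated_bump",
    }
)
CONFIDENCES = frozenset({"low", "medium", "high"})


def check_payload(payload: Any) -> list[str]:
    """Return a list of human-readable problems; an empty list means the shape looks plausible."""
    if not isinstance(payload, dict) or not isinstance(payload.get("patches"), list):
        return ["payload must be an object with a 'patches' list"]
    errors: list[str] = []
    for i, patch in enumerate(payload["patches"]):
        if not isinstance(patch, dict):
            errors.append(f"patches[{i}]: must be an object")
            continue
        if not patch.get("capability_id"):
            errors.append(f"patches[{i}]: missing capability_id")
        if patch.get("kind") not in ALLOWED_KINDS:
            errors.append(f"patches[{i}]: unknown kind {patch.get('kind')!r}")
        if patch.get("confidence") not in CONFIDENCES:
            errors.append(f"patches[{i}]: invalid confidence")
        # Assumed rule: promotions must cite evidence, as the prompt instructs the model.
        if patch.get("kind") == "maturity_promote" and not patch.get("evidence_citations"):
            errors.append(f"patches[{i}]: maturity_promote without evidence_citations")
    return errors


good = {
    "patches": [
        {
            "capability_id": "capability.example",
            "kind": "maturity_promote",
            "confidence": "medium",
            "evidence_citations": ["tests/test_example.py"],
        }
    ]
}
bad = {"patches": [{"capability_id": "capability.example", "kind": "rename", "confidence": "high"}]}
print(check_payload(good), check_payload(bad))
```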

reuse_surface/patches.py (new file, 391 lines)

@@ -0,0 +1,391 @@
from __future__ import annotations
import shutil
from datetime import date
from pathlib import Path
from typing import Any, Callable
import yaml
from reuse_surface.registry import (
LEVEL_ORDERS,
entry_vector,
load_index_at,
parse_front_matter,
registry_paths,
vectors_match,
)
SAFE_DETERMINISTIC_KINDS = frozenset(
{
"vector_sync",
"vector_drift",
"evidence_append",
"evidence_test",
"artifact_append",
"availability_artifact",
"index_updated_bump",
"index_row_add",
"evidence_workflow",
"evidence_documentation",
}
)
CONFIDENCE_ORDER = {"low": 0, "medium": 1, "high": 2}
DIMENSION_LEVEL_PREFIX = {
"discovery": "D",
"availability": "A",
"completeness": "C",
"reliability": "R",
}
def suggestion_to_patch(suggestion: dict[str, Any]) -> dict[str, Any] | None:
kind = suggestion.get("kind")
if kind == "missing_entry":
return None
patch_body = suggestion.get("apply_patch")
if not patch_body:
return None
cap_id = suggestion["capability_id"]
rationale = suggestion.get("detail", "deterministic signal")
if kind == "vector_drift":
return {
"capability_id": cap_id,
"kind": "vector_sync",
"confidence": "high",
"rationale": rationale,
"value": patch_body["value"],
}
if kind in {"evidence_test", "evidence_workflow", "evidence_documentation"}:
return {
"capability_id": cap_id,
"kind": "evidence_append",
"confidence": "high",
"rationale": rationale,
"field_path": patch_body["field"],
"append": patch_body["append"],
}
if kind == "availability_artifact":
return {
"capability_id": cap_id,
"kind": "artifact_append",
"confidence": "high",
"rationale": rationale,
"append": patch_body["append"],
}
if kind == "index_row_add":
return {
"capability_id": cap_id,
"kind": "index_row_add",
"confidence": "high",
"rationale": rationale,
"index_row": patch_body.get("index_row", {}),
}
if kind == "index_updated_stale":
return {
"capability_id": cap_id,
"kind": "index_updated_bump",
"confidence": "high",
"rationale": rationale,
"value": patch_body.get("value", date.today().isoformat()),
}
return None
def patches_from_suggestions(suggestions: list[dict[str, Any]]) -> list[dict[str, Any]]:
patches: list[dict[str, Any]] = []
for item in suggestions:
patch = suggestion_to_patch(item)
if patch:
patches.append(patch)
return patches
def is_safe_patch(patch: dict[str, Any]) -> bool:
return patch.get("kind") in SAFE_DETERMINISTIC_KINDS
def level_delta(dimension: str, from_level: str, to_level: str) -> int:
order = LEVEL_ORDERS[dimension]
return order.index(to_level) - order.index(from_level)
def evidence_gate(repo_root: Path, patch: dict[str, Any]) -> bool:
if patch.get("kind") != "maturity_promote":
return True
citations = patch.get("evidence_citations") or []
if not citations:
return False
return all((repo_root / path).exists() for path in citations)
def promotion_delta_gate(patch: dict[str, Any], max_delta: int) -> bool:
if patch.get("kind") != "maturity_promote":
return True
dimension = patch.get("dimension")
from_level = patch.get("from_level")
to_level = patch.get("to_level")
if not dimension or not from_level or not to_level:
return False
delta = level_delta(dimension, from_level, to_level)
return 0 < delta <= max_delta
def confidence_gate(patch: dict[str, Any], minimum: str) -> bool:
return CONFIDENCE_ORDER[patch.get("confidence", "low")] >= CONFIDENCE_ORDER[minimum]
def filter_auto_patches(
patches: list[dict[str, Any]],
repo_root: Path,
*,
auto_confidence: str = "high",
auto_max_delta: int = 1,
) -> list[dict[str, Any]]:
selected: list[dict[str, Any]] = []
for patch in patches:
if is_safe_patch(patch):
selected.append(patch)
continue
if not confidence_gate(patch, auto_confidence):
continue
if not evidence_gate(repo_root, patch):
continue
if not promotion_delta_gate(patch, auto_max_delta):
continue
selected.append(patch)
return selected
def _write_front_matter(path: Path, front_matter: dict[str, Any]) -> None:
text = path.read_text(encoding="utf-8")
marker_end = text.find("\n---", 4)
body = text[marker_end + 4 :] if marker_end != -1 else "\n"
path.write_text(
"---\n"
+ yaml.safe_dump(front_matter, sort_keys=False, allow_unicode=True)
+ "---"
+ body,
encoding="utf-8",
)
def _apply_maturity_promote(
front_matter: dict[str, Any],
patch: dict[str, Any],
) -> list[str]:
dimension = patch["dimension"]
to_level = patch["to_level"]
changed: list[str] = []
if dimension in {"discovery", "availability"}:
front_matter.setdefault("maturity", {}).setdefault(dimension, {})["current"] = to_level
if dimension == "availability":
front_matter.setdefault("availability", {})["current_level"] = to_level
changed.append(f"maturity.{dimension}.current -> {to_level}")
else:
key = "completeness" if dimension == "completeness" else "reliability"
front_matter.setdefault("external_evidence", {}).setdefault(key, {})["level"] = to_level
changed.append(f"external_evidence.{key}.level -> {to_level}")
entry = patch.get("promotion_history_entry")
if entry:
history = front_matter.setdefault("promotion_history", [])
history.append(entry)
changed.append("promotion_history +1")
return changed
def _apply_patch_to_state(
repo_root: Path,
patch: dict[str, Any],
index: dict[str, Any],
entry_cache: dict[str, dict[str, Any]],
entry_paths: dict[str, Path],
) -> list[str]:
cap_id = patch["capability_id"]
kind = patch["kind"]
changed: list[str] = []
index_by_id = {row["id"]: row for row in index.get("capabilities", [])}
if kind == "index_updated_bump":
index["updated"] = patch.get("value", date.today().isoformat())
return ["index.updated bumped"]
if kind == "index_row_add":
row = patch.get("index_row", {})
if cap_id not in index_by_id and row:
index.setdefault("capabilities", []).append(row)
changed.append(f"index row added for {cap_id}")
return changed
row = index_by_id.get(cap_id)
if not row:
return changed
if kind == "vector_sync":
row["vector"] = patch["value"]
changed.append(f"index vector for {cap_id}")
return changed
entry_path = repo_root / row["path"]
if cap_id not in entry_cache:
entry_cache[cap_id] = parse_front_matter(entry_path)
entry_paths[cap_id] = entry_path
front_matter = entry_cache[cap_id]
if kind == "evidence_append":
field = patch.get("field_path", "evidence.tests")
parts = field.split(".")
target = front_matter
for part in parts[:-1]:
target = target.setdefault(part, {})
items = target.setdefault(parts[-1], [])
append = patch["append"]
if append not in items:
items.append(append)
changed.append(f"{cap_id} {field} += {append}")
elif kind == "artifact_append":
artifacts = front_matter.setdefault("availability", {}).setdefault(
"current_artifacts", []
)
append = patch["append"]
if append not in artifacts:
artifacts.append(append)
changed.append(f"{cap_id} availability.current_artifacts += {append}")
elif kind == "consumer_feedback":
feedback = front_matter.setdefault("evidence", {}).setdefault(
"consumer_feedback", []
)
append = patch.get("append") or patch.get("value")
if append and append not in feedback:
feedback.append(str(append))
changed.append(f"{cap_id} consumer_feedback +1")
elif kind == "relation_add":
rel = patch.get("value") or {}
rel_type = rel.get("type", "related_to")
target_id = rel.get("target")
if target_id:
relations = front_matter.setdefault("relations", {}).setdefault(rel_type, [])
if target_id not in relations:
relations.append(target_id)
changed.append(f"{cap_id} relations.{rel_type} += {target_id}")
elif kind == "maturity_promote":
changed.extend(_apply_maturity_promote(front_matter, patch))
row["vector"] = entry_vector(front_matter)
return changed


def apply_patches(repo_root: Path, patches: list[dict[str, Any]]) -> list[str]:
    paths = registry_paths(repo_root)
    index = load_index_at(paths["index"])
    entry_cache: dict[str, dict[str, Any]] = {}
    entry_paths: dict[str, Path] = {}
    changed: list[str] = []
    for patch in patches:
        changed.extend(
            _apply_patch_to_state(repo_root, patch, index, entry_cache, entry_paths)
        )
    if changed:
        index["updated"] = date.today().isoformat()
        paths["index"].write_text(
            yaml.safe_dump(index, sort_keys=False, allow_unicode=True),
            encoding="utf-8",
        )
        for cap_id, front_matter in entry_cache.items():
            _write_front_matter(entry_paths[cap_id], front_matter)
    return changed


def _patch_to_suggestion(patch: dict[str, Any]) -> dict[str, Any] | None:
    kind = patch["kind"]
    cap_id = patch["capability_id"]
    if kind == "vector_sync":
        return {
            "capability_id": cap_id,
            "kind": "vector_drift",
            "apply_patch": {"field": "index.vector", "value": patch["value"]},
        }
    if kind == "evidence_append":
        field = patch.get("field_path", "evidence.tests")
        return {
            "capability_id": cap_id,
            "kind": "evidence_test",
            "apply_patch": {"field": field, "append": patch["append"]},
        }
    if kind == "artifact_append":
        return {
            "capability_id": cap_id,
            "kind": "availability_artifact",
            "apply_patch": {"field": "availability.current_artifacts", "append": patch["append"]},
        }
    if kind == "index_updated_bump":
        return {
            "capability_id": cap_id,
            "kind": "index_updated_stale",
            "apply_patch": {"field": "index.updated", "value": patch.get("value")},
        }
    return None


def apply_patches_atomic(
    repo_root: Path,
    patches: list[dict[str, Any]],
    *,
    validate: Callable[[], tuple[int, list[str], list[str]]],
) -> tuple[list[str], int]:
    if not patches:
        return [], 0
    # Start from a clean session directory holding backups of every file we may touch.
    session_dir = repo_root / ".reuse-surface-session"
    backup_dir = session_dir / "backup"
    if session_dir.exists():
        shutil.rmtree(session_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    paths = registry_paths(repo_root)
    touched: set[Path] = set()
    if paths["index"].exists():
        rel = paths["index"].relative_to(repo_root)
        dest = backup_dir / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(paths["index"], dest)
        touched.add(paths["index"])
    index = load_index_at(paths["index"]) if paths["index"].exists() else {}
    for row in index.get("capabilities", []):
        entry_path = repo_root / row["path"]
        if entry_path.exists():
            rel = entry_path.relative_to(repo_root)
            dest = backup_dir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(entry_path, dest)
            touched.add(entry_path)
    try:
        changed = apply_patches(repo_root, patches)
        code, errors, warnings = validate()
        if code != 0:
            # Validation failed: restore every backed-up file, then report the code.
            for path in touched:
                rel = path.relative_to(repo_root)
                backup = backup_dir / rel
                if backup.exists():
                    shutil.copy2(backup, path)
            shutil.rmtree(session_dir, ignore_errors=True)
            return changed, code
        shutil.rmtree(session_dir, ignore_errors=True)
        return changed, 0
    except Exception:
        # Apply or validate raised: restore backups before re-raising.
        for path in touched:
            rel = path.relative_to(repo_root)
            backup = backup_dir / rel
            if backup.exists():
                shutil.copy2(backup, path)
        shutil.rmtree(session_dir, ignore_errors=True)
        raise

@@ -87,4 +87,13 @@ def entry_vector(front_matter: dict[str, Any]) -> str:
def vectors_match(index_vector: str, front_matter: dict[str, Any]) -> bool:
    return index_vector.replace(" ", "") == entry_vector(front_matter).replace(" ", "")


def resolve_repo_root(explicit: str | Path | None = None) -> Path:
    if explicit:
        return Path(explicit).resolve()
    cwd = Path.cwd()
    if (cwd / "registry" / "indexes" / "capabilities.yaml").is_file():
        return cwd.resolve()
    return ROOT

@@ -3,6 +3,7 @@ from __future__ import annotations
import json
import subprocess
import textwrap
from datetime import date
from pathlib import Path
from typing import Any
@@ -17,6 +18,7 @@ from reuse_surface.registry import (
vectors_match,
)
# Safe to apply without interactive review (see patches.SAFE_DETERMINISTIC_KINDS).
SAFE_EVIDENCE_PREFIXES = ("tests/", ".gitea/workflows/")
@@ -52,6 +54,8 @@ def collect_deterministic_suggestions(
changed_files = git_changed_files(repo_root, git_since) if git_since else []
suggestions: list[dict[str, Any]] = []
suggestions.extend(_collect_index_orphans(repo_root, index, changed_files))
for row in rows:
entry_path = repo_root / row["path"]
if not entry_path.exists():
@@ -80,43 +84,167 @@ def collect_deterministic_suggestions(
}
)
evidence_tests = front_matter.get("evidence", {}).get("tests", [])
for changed in changed_files:
if changed.startswith("tests/") and changed not in evidence_tests:
suggestions.extend(
_collect_changed_file_suggestions(row["id"], front_matter, changed_files, repo_root)
)
return suggestions
def _collect_index_orphans(
repo_root: Path,
index: dict[str, Any],
changed_files: list[str],
) -> list[dict[str, Any]]:
suggestions: list[dict[str, Any]] = []
indexed_paths = {row["path"] for row in index.get("capabilities", [])}
cap_dir = registry_paths(repo_root)["capabilities"]
if not cap_dir.exists():
return suggestions
for entry_file in sorted(cap_dir.glob("*.md")):
if entry_file.name == ".gitkeep":
continue
rel = str(entry_file.relative_to(repo_root))
if rel in indexed_paths:
continue
try:
front_matter = parse_front_matter(entry_file)
except ValueError:
continue
cap_id = front_matter.get("id", entry_file.stem.replace("-", "."))
suggestions.append(
{
"capability_id": cap_id,
"kind": "index_row_add",
"detail": f"capability file not in index: {rel}",
"apply_patch": {
"field": "index.capabilities",
"index_row": {
"id": cap_id,
"name": front_matter.get("name", cap_id),
"summary": front_matter.get("summary", ""),
"vector": entry_vector(front_matter),
"domain": front_matter.get("domain", index.get("domain", "helix_forge")),
"status": front_matter.get("status", "draft"),
"owner": front_matter.get("owner", repo_root.name),
"path": rel,
"tags": front_matter.get("tags", []),
"consumption_modes": front_matter.get("availability", {}).get(
"consumption_modes", ["informational"]
),
},
},
}
)
index_updated = index.get("updated")
registry_touched = any(path.startswith("registry/") for path in changed_files)
if registry_touched and index_updated != date.today().isoformat():
first_id = (index.get("capabilities") or [{}])[0].get("id", "registry")  # also safe for an empty list
suggestions.append(
{
"capability_id": first_id,
"kind": "index_updated_stale",
"detail": "registry/ changed; bump index updated date",
"apply_patch": {"field": "index.updated", "value": date.today().isoformat()},
}
)
return suggestions
def _pyproject_script_artifacts(repo_root: Path) -> list[str]:
pyproject = repo_root / "pyproject.toml"
if not pyproject.exists():
return []
try:
import tomllib
data = tomllib.loads(pyproject.read_text(encoding="utf-8"))
except (OSError, ValueError):
return []
scripts = data.get("project", {}).get("scripts", {})
return [f"pyproject.toml:[project.scripts].{name}" for name in sorted(scripts)]
def _collect_changed_file_suggestions(
cap_id: str,
front_matter: dict[str, Any],
changed_files: list[str],
repo_root: Path,
) -> list[dict[str, Any]]:
suggestions: list[dict[str, Any]] = []
evidence = front_matter.setdefault("evidence", {})
evidence_tests = evidence.get("tests", [])
evidence_docs = evidence.get("documentation", [])
pkg_prefixes = tuple(
p.name + "/"
for p in repo_root.iterdir()
if p.is_dir() and (p / "__init__.py").exists()
)
for changed in changed_files:
if changed.startswith("tests/") and changed not in evidence_tests:
suggestions.append(
{
"capability_id": cap_id,
"kind": "evidence_test",
"detail": f"new test file not cited: {changed}",
"apply_patch": {"field": "evidence.tests", "append": changed},
}
)
if changed.startswith(".gitea/workflows/") and changed.endswith((".yml", ".yaml")):
field = "evidence.tests" if "test" in changed.lower() else "evidence.documentation"
existing = evidence_tests if field == "evidence.tests" else evidence_docs
if changed not in existing:
suggestions.append(
{
"capability_id": row["id"],
"kind": "evidence_test",
"detail": f"new test file not cited: {changed}",
"capability_id": cap_id,
"kind": "evidence_workflow",
"detail": f"workflow changed not cited: {changed}",
"apply_patch": {"field": field, "append": changed},
}
)
if changed.startswith("docs/") and changed not in evidence_docs:
suggestions.append(
{
"capability_id": cap_id,
"kind": "evidence_documentation",
"detail": f"doc changed not cited: {changed}",
"apply_patch": {"field": "evidence.documentation", "append": changed},
}
)
artifacts = front_matter.get("availability", {}).get("current_artifacts", [])
for changed in changed_files:
if changed.endswith(".py") and changed.startswith(pkg_prefixes):
if changed not in artifacts:
suggestions.append(
{
"capability_id": cap_id,
"kind": "availability_artifact",
"detail": f"changed module not cited: {changed}",
"apply_patch": {
"field": "evidence.tests",
"field": "availability.current_artifacts",
"append": changed,
},
}
)
artifacts = front_matter.get("availability", {}).get("current_artifacts", [])
for changed in changed_files:
if changed.endswith(".py") and changed.startswith(
tuple(
p.name + "/"
for p in repo_root.iterdir()
if p.is_dir() and (p / "__init__.py").exists()
)
):
if changed not in artifacts:
if changed == "pyproject.toml":
for script_ref in _pyproject_script_artifacts(repo_root):
if script_ref not in artifacts:
suggestions.append(
{
"capability_id": row["id"],
"capability_id": cap_id,
"kind": "availability_artifact",
"detail": f"changed module not cited: {changed}",
"detail": f"CLI script not cited: {script_ref}",
"apply_patch": {
"field": "availability.current_artifacts",
"append": changed,
"append": script_ref,
},
}
)
return suggestions
@@ -150,11 +278,12 @@ def apply_deterministic_suggestions(
entry_paths[cap_id] = entry_path
front_matter = entry_cache[cap_id]
if patch["field"] == "evidence.tests":
tests = front_matter.setdefault("evidence", {}).setdefault("tests", [])
if patch["append"] not in tests:
tests.append(patch["append"])
changed.append(f"{cap_id} evidence.tests += {patch['append']}")
if patch["field"] in {"evidence.tests", "evidence.documentation"}:
bucket = patch["field"].split(".")[1]
items = front_matter.setdefault("evidence", {}).setdefault(bucket, [])
if patch["append"] not in items:
items.append(patch["append"])
changed.append(f"{cap_id} {patch['field']} += {patch['append']}")
if patch["field"] == "availability.current_artifacts":
artifacts = front_matter.setdefault("availability", {}).setdefault(
"current_artifacts", []
@@ -165,7 +294,22 @@ def apply_deterministic_suggestions(
f"{cap_id} availability.current_artifacts += {patch['append']}"
)
for suggestion in suggestions:
patch = suggestion.get("apply_patch")
if not patch:
continue
if suggestion.get("kind") == "index_row_add":
cap_id = suggestion["capability_id"]
row = patch.get("index_row")
if row and cap_id not in index_by_id:
index.setdefault("capabilities", []).append(row)
changed.append(f"index row added for {cap_id}")
if suggestion.get("kind") == "index_updated_stale":
index["updated"] = patch.get("value", date.today().isoformat())
changed.append("index.updated bumped")
if changed:
index["updated"] = date.today().isoformat()
paths["index"].write_text(
yaml.safe_dump(index, sort_keys=False, allow_unicode=True),
encoding="utf-8",

@@ -0,0 +1,83 @@
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://reuse-surface.local/schemas/registry-patch.schema.json",
"title": "RegistryMaintainPatchSet",
"type": "object",
"additionalProperties": false,
"required": ["patches"],
"properties": {
"patches": {
"type": "array",
"items": {
"type": "object",
"additionalProperties": false,
"required": ["capability_id", "kind", "confidence", "rationale"],
"properties": {
"capability_id": {
"type": "string",
"pattern": "^capability\\.[a-z0-9]+(\\.[a-z0-9-]+)+$"
},
"kind": {
"type": "string",
"enum": [
"vector_sync",
"evidence_append",
"artifact_append",
"maturity_promote",
"consumer_feedback",
"relation_add",
"index_row_add",
"index_updated_bump"
]
},
"confidence": {
"type": "string",
"enum": ["low", "medium", "high"]
},
"rationale": {
"type": "string",
"minLength": 1
},
"field_path": {
"type": "string"
},
"value": {},
"append": {
"type": "string"
},
"dimension": {
"type": "string",
"enum": ["discovery", "availability", "completeness", "reliability"]
},
"from_level": {
"type": "string"
},
"to_level": {
"type": "string"
},
"promotion_history_entry": {
"type": "object",
"additionalProperties": true
},
"index_row": {
"type": "object",
"additionalProperties": true
},
"evidence_citations": {
"type": "array",
"items": {
"type": "string",
"minLength": 1
}
}
}
}
},
"notes": {
"type": "array",
"items": {
"type": "string"
}
}
}
}

@@ -0,0 +1,12 @@
# Append to sibling repo Makefile (adjust REPO slug if needed).
REGISTRY_RAW_URL ?= https://gitea.coulomb.social/coulomb/$(notdir $(CURDIR))/raw/main/registry/indexes/capabilities.yaml

.PHONY: registry-maintain registry-check

registry-maintain:
	reuse-surface maintain --all --from-git-since origin/main

registry-check:
	reuse-surface maintain --all --from-git-since origin/main --auto --no-llm
	reuse-surface validate --root .
	reuse-surface establish --publish-check --raw-url $(REGISTRY_RAW_URL)

@@ -0,0 +1,6 @@
#!/bin/sh
# Optional pre-commit hook: deterministic registry sync when registry/ changes.
if git diff --cached --name-only | grep -q '^registry/'; then
  reuse-surface maintain --all --auto --no-llm || exit 1
  git add registry/
fi

tests/test_interactive.py Normal file

@@ -0,0 +1,45 @@
from __future__ import annotations

import pytest

from reuse_surface.interactive import NonInteractiveError, format_patch_summary, prompt_batch


def test_format_patch_summary():
    text = format_patch_summary(
        {
            "capability_id": "capability.demo.sample",
            "kind": "vector_sync",
            "confidence": "high",
            "rationale": "drift",
            "value": "D2 / A0 / C0 / R0",
        }
    )
    assert "vector_sync" in text


def test_prompt_batch_non_tty_raises():
    with pytest.raises(NonInteractiveError):
        prompt_batch(
            [
                {
                    "capability_id": "capability.demo.sample",
                    "kind": "vector_sync",
                    "confidence": "high",
                    "rationale": "drift",
                    "value": "D2 / A0 / C0 / R0",
                }
            ]
        )


def test_prompt_batch_assume_yes():
    patch = {
        "capability_id": "capability.demo.sample",
        "kind": "vector_sync",
        "confidence": "high",
        "rationale": "drift",
        "value": "D2 / A0 / C0 / R0",
    }
    selected = prompt_batch([patch], assume_yes=True)
    assert selected == [patch]

tests/test_maintain.py Normal file

@@ -0,0 +1,188 @@
from __future__ import annotations

import json
from pathlib import Path
from unittest.mock import patch

import yaml

from reuse_surface.establish import scaffold_registry
from reuse_surface.maintain import run_maintain
from reuse_surface.maintain_llm import build_maintain_prompt, load_patch_schema
from reuse_surface.patches import (
    apply_patches,
    evidence_gate,
    filter_auto_patches,
    patches_from_suggestions,
    promotion_delta_gate,
    suggestion_to_patch,
)
from reuse_surface.registry import load_index_at, registry_paths
from reuse_surface.registry_update import collect_deterministic_suggestions


def _seed_repo(tmp_path: Path) -> str:
    scaffold_registry(tmp_path)
    cap_id = "capability.demo.sample"
    rel = "registry/capabilities/capability-demo-sample.md"
    front_matter = {
        "id": cap_id,
        "name": "Sample",
        "summary": "Sample",
        "owner": "demo",
        "status": "draft",
        "domain": "helix_forge",
        "tags": ["demo"],
        "maturity": {
            "discovery": {"current": "D2", "target": "D5", "confidence": "low"},
            "availability": {"current": "A0", "target": "A3", "confidence": "low"},
        },
        "external_evidence": {
            "completeness": {"level": "C0", "confidence": "low"},
            "reliability": {"level": "R0", "confidence": "low"},
        },
        "discovery": {"intent": "demo", "includes": [], "excludes": []},
        "availability": {
            "current_level": "A0",
            "target_level": "A3",
            "current_artifacts": [],
            "consumption_modes": ["informational"],
        },
        "relations": {"depends_on": [], "supports": [], "related_to": []},
        "evidence": {"documentation": [], "tests": []},
        "consumer_guidance": {
            "recommended_for": [],
            "not_recommended_for": [],
            "known_limitations": [],
        },
    }
    entry = tmp_path / rel
    entry.parent.mkdir(parents=True, exist_ok=True)
    entry.write_text("---\n" + yaml.safe_dump(front_matter, sort_keys=False) + "---\n")
    index_path = registry_paths(tmp_path)["index"]
    index = load_index_at(index_path)
    index["capabilities"] = [
        {
            "id": cap_id,
            "name": "Sample",
            "summary": "Sample",
            "vector": "D3 / A0 / C0 / R0",
            "domain": "helix_forge",
            "status": "draft",
            "owner": "demo",
            "path": rel,
            "tags": ["demo"],
            "consumption_modes": ["informational"],
        }
    ]
    index_path.write_text(yaml.safe_dump(index, sort_keys=False), encoding="utf-8")
    return cap_id


def test_patch_schema_loads():
    schema = load_patch_schema()
    assert "patches" in schema["properties"]


def test_build_maintain_prompt(tmp_path: Path):
    cap_id = _seed_repo(tmp_path)
    prompt = build_maintain_prompt(tmp_path, cap_id, git_since=None)
    assert cap_id in prompt
    assert "Return ONLY JSON" in prompt


def test_suggestion_to_patch_vector_sync():
    patch = suggestion_to_patch(
        {
            "capability_id": "capability.demo.sample",
            "kind": "vector_drift",
            "detail": "drift",
            "apply_patch": {"field": "index.vector", "value": "D2 / A0 / C0 / R0"},
        }
    )
    assert patch is not None
    assert patch["kind"] == "vector_sync"


def test_evidence_gate_requires_files(tmp_path: Path):
    evidence = (tmp_path / "tests" / "test_x.py")
    evidence.parent.mkdir(parents=True)
    evidence.write_text("def test_x(): pass\n")
    patch = {
        "kind": "maturity_promote",
        "evidence_citations": ["tests/test_x.py"],
    }
    assert evidence_gate(tmp_path, patch)
    patch["evidence_citations"] = ["tests/missing.py"]
    assert not evidence_gate(tmp_path, patch)


def test_promotion_delta_gate():
    patch = {
        "kind": "maturity_promote",
        "dimension": "availability",
        "from_level": "A2",
        "to_level": "A3",
    }
    assert promotion_delta_gate(patch, 1)
    patch["to_level"] = "A5"
    assert not promotion_delta_gate(patch, 1)


def test_apply_patches_vector_sync(tmp_path: Path):
    cap_id = _seed_repo(tmp_path)
    suggestions = collect_deterministic_suggestions(tmp_path, capability_id=cap_id)
    patches = patches_from_suggestions(suggestions)
    changed = apply_patches(tmp_path, patches)
    assert changed
    index = load_index_at(registry_paths(tmp_path)["index"])
    assert index["capabilities"][0]["vector"] == "D2 / A0 / C0 / R0"


def test_filter_auto_patches(tmp_path: Path):
    cap_id = _seed_repo(tmp_path)
    suggestions = collect_deterministic_suggestions(tmp_path, capability_id=cap_id)
    patches = patches_from_suggestions(suggestions)
    selected = filter_auto_patches(patches, tmp_path)
    assert selected


def test_run_maintain_auto_no_llm(tmp_path: Path):
    _seed_repo(tmp_path)

    def _validate() -> tuple[int, list[str], list[str]]:
        return 0, [], []

    result = run_maintain(
        tmp_path,
        all_capabilities=True,
        auto=True,
        no_llm=True,
        validate_fn=_validate,
    )
    assert result.exit_code == 0
    assert result.selected_count >= 1


def test_request_maintain_patches_mock(tmp_path: Path):
    cap_id = _seed_repo(tmp_path)
    payload = {
        "patches": [
            {
                "capability_id": cap_id,
                "kind": "consumer_feedback",
                "confidence": "medium",
                "rationale": "note",
                "append": "helpful",
            }
        ],
        "notes": [],
    }
    with patch(
        "reuse_surface.maintain_llm.execute_prompt",
        return_value=json.dumps(payload),
    ):
        from reuse_surface.maintain_llm import request_maintain_patches

        result = request_maintain_patches(tmp_path, cap_id, llm_url="http://example")
    assert len(result["patches"]) == 1

@@ -147,7 +147,7 @@ local index YAML. `--discover` drafts capabilities via llm-connect (optional).
Refresh registry metadata from repo drift signals.
```bash
reuse-surface update --capability capability.registry.register --dry-run
reuse-surface update --capability capability.registry.register
reuse-surface update --all --from-git-since HEAD~5 --apply
reuse-surface update --capability capability.registry.register --suggest-maturity
```
@@ -155,6 +155,30 @@ reuse-surface update --capability capability.registry.register --suggest-maturit
Deterministic patches (`vector_drift`, new `tests/` citations) apply with
`--apply`. LLM suggestions use `--suggest-maturity` and remain review-only.
### maintain
Interactive or automated registry maintenance (REUSE-WP-0016). Preferred entry
point for sibling repo operators.
```bash
export LLM_CONNECT_URL=http://127.0.0.1:8088 # optional
reuse-surface maintain --all --from-git-since origin/main
reuse-surface maintain --capability capability.registry.register
reuse-surface maintain --all --auto --no-llm
reuse-surface maintain --all --auto --from-git-since HEAD~3
reuse-surface maintain --publish --raw-url https://.../capabilities.yaml --all --auto --no-llm
```
| Mode | Flags | Behavior |
|---|---|---|
| Interactive (TTY) | (default) | Prompt per patch: apply / skip / edit / quit |
| Full automation | `--auto` or `--yes` | Safe deterministic + gated LLM patches |
| Deterministic only | `--auto --no-llm` | No llm-connect required |
| Publish chain | `--publish --raw-url` | maintain → validate → publish-check |
Templates: `templates/Makefile.registry.fragment`, `templates/git-hook.pre-commit.registry`.
Install hook: `reuse-surface establish --scaffold --hook`.
### report cohorts
Export capability cohorts for planning or implementation reuse decisions.
@@ -196,6 +220,7 @@ Stable IDs and maturity fields are preserved for agent consumption (UC-RS-019).
| Verify index publish URL | `reuse-surface establish --publish-check` |
| Draft capabilities (LLM) | `reuse-surface establish --discover` |
| Refresh entry metadata | `reuse-surface update` |
| Interactive registry maintain | `reuse-surface maintain` |
| Planning cohort export | `reuse-surface report cohorts` |
| Relation graph | `reuse-surface graph` |

uv.lock generated Normal file

File diff suppressed because it is too large

@@ -0,0 +1,377 @@
---
id: REUSE-WP-0016
type: workplan
title: "Interactive registry maintain with llm-connect automation"
domain: helix_forge
repo: reuse-surface
status: finished
owner: codex
topic_slug: helix-forge
created: "2026-06-16"
updated: "2026-06-17"
state_hub_workstream_id: "2a7565a4-2627-44ca-a856-6c3f18576f92"
---
# Interactive registry maintain with llm-connect automation
Follow-up to **REUSE-WP-0013** (`establish`, `update`, `stats`). Workstation
rollout (**REUSE-WP-0014**) gave every sibling repo a registry scaffold; operators
still maintain entries manually or run `reuse-surface update` as a **non-interactive
report**. LLM maturity hints (`--suggest-maturity`) dump JSON for human review
with no apply path.
This workplan closes the **registry maintenance loop** from inside each domain
repo: interactive prompting for judgment calls, full automation for safe and
high-confidence changes, both backed by the existing **llm-connect** HTTP bridge.
**Baseline vector:** `D5 / A4 / C5 / R3`
**Target vector:** `D5 / A4 / C5→C6 / R3` (tooling depth; reliability unchanged
until consumer telemetry program)
## Problem statement
| Pain | Today (WP-0013) | Target |
|---|---|---|
| Update registry after code changes | `update` prints suggestions; user must remember `--apply` | Guided session with per-change prompts |
| Maturity / evidence refresh | `--suggest-maturity` → JSON only | Structured LLM patches with review or auto-apply |
| Publish hygiene | Manual validate → commit → publish-check | `maintain` chains update → validate → optional publish-check |
| Agent vs human UX | Same stdout for both | TTY prompts for humans; JSON/event stream for agents |
| Sibling repo friction | `validate` defaults to install root | Auto-detect `registry/` in cwd |
## Design principles
1. **Deterministic first** — vector drift, missing index rows, and cited artifact
paths apply without LLM; same safe-apply list as WP-0013-T06, extended.
2. **Interactive by default in TTY**: `reuse-surface maintain` prompts before
any non-deterministic write; non-TTY requires `--yes` or `--auto`.
3. **Full automation is explicit**: `--auto` applies deterministic patches plus
LLM proposals that pass schema validation and evidence gates; never silent
promotion above configured ceilings (default: no auto D/A/C/R jumps > 1 level).
4. **LLM optional** — deterministic-only paths work without `LLM_CONNECT_URL`;
LLM steps skip gracefully with a clear message.
5. **Validate gate** — every write path ends with `reuse-surface validate --root
<repo>`; failed validation rolls back the session batch (atomic apply).
6. **Evidence-bound promotions** — auto-apply for maturity changes requires
cited repo paths (tests, workflows, docs) present on disk; align checks with
`specs/CapabilityMaturityStandard.md`.
7. **Boundary** — reuse-surface does not host models; llm-connect owns routing
and credentials (`POST {LLM_CONNECT_URL}/execute`).
## Proposed CLI surface
```bash
# Interactive maintain (default in TTY)
cd ~/state-hub
export LLM_CONNECT_URL=http://127.0.0.1:8088 # optional
reuse-surface maintain
reuse-surface maintain --from-git-since origin/main
reuse-surface maintain --capability capability.statehub.workstream-coordinate
# Full automation (CI, agents, pre-commit)
reuse-surface maintain --auto --from-git-since HEAD~1
reuse-surface maintain --auto --no-llm # deterministic only
# Non-interactive apply-all-safe (current update behavior, preserved)
reuse-surface update --all --from-git-since origin/main --apply
# Federation publish helper (chains maintain + validate + publish-check)
reuse-surface maintain --publish --raw-url https://gitea.../capabilities.yaml
```
### Interactive prompt flow (TTY)
```text
reuse-surface maintain
→ collect repo signals (git diff, index drift, roster stats)
→ deterministic suggestions (always listed first)
→ optional LLM patch proposals per capability (llm-connect)
→ for each pending change:
[a]pply [s]kip [e]dit in $EDITOR [q]uit [A]pply all safe
→ atomic write + validate
→ summary: files changed, remaining manual items, publish reminder
```
### Automation tiers (`--auto`)
| Tier | Applies without prompt |
|---|---|
| `safe` | Deterministic patches (vector drift, evidence path append) |
| `llm-metadata` | LLM `consumer_feedback`, `notes`, non-level field updates |
| `llm-promote` | Single-step maturity bumps with on-disk evidence citations |
Configure ceiling via `--auto-max-delta 1` (default) or `--auto-max-delta 0` to
disable promotions.
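
The tier table above could be expressed as a small classifier. This is a minimal sketch, not the project's implementation: `automation_tier` and `DETERMINISTIC_KINDS` are hypothetical names, and the assumption that every non-promotion LLM patch falls into `llm-metadata` is ours.

```python
from typing import Any

# Assumed deterministic kinds, taken from the schema's kind enum; the
# project's real safe-list may differ.
DETERMINISTIC_KINDS = (
    "vector_sync",
    "evidence_append",
    "artifact_append",
    "index_row_add",
    "index_updated_bump",
)


def automation_tier(patch: dict[str, Any], *, from_llm: bool) -> str:
    """Map a patch to one of the three --auto tiers (illustrative only)."""
    if patch["kind"] == "maturity_promote":
        return "llm-promote"
    if not from_llm and patch["kind"] in DETERMINISTIC_KINDS:
        return "safe"
    return "llm-metadata"
```

With this shape, `--auto-max-delta 0` would only need to drop the `llm-promote` tier; the other two tiers are unaffected by the ceiling.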
## Suggested execution order
```text
T01 registry-patch JSON schema + LLM prompt templates
→ T02 expand deterministic signal collectors
→ T03 interactive prompt module (TTY + non-TTY)
→ T04 maintain command (orchestrator)
→ T05 LLM patch apply path with evidence gates
→ T06 --auto mode + atomic batch apply/rollback
→ T07 validate cwd auto-detect; index.updated bump
→ T08 sibling integration (Makefile template, optional hook generator)
→ T09 docs, tests, gap-analysis priority 28
```
## Dependencies
| Dependency | Owner | Notes |
|---|---|---|
| llm-connect | llm-connect | `LLM_CONNECT_URL`; mocked in pytest |
| WP-0013 modules | reuse-surface | `registry_update.py`, `llm_bridge.py`, `establish.py` |
| Maturity standard | reuse-surface | Promotion evidence rules in prompts and gates |
| Sibling repo adoption | Domain owners | Run `maintain` in each checkout; optional CI step |
---
## Add Registry Patch Schema And LLM Templates
```task
id: REUSE-WP-0016-T01
status: done
priority: high
state_hub_task_id: "f5daf384-ca4e-42ec-8530-bf5d46155284"
```
Define `schemas/registry-patch.schema.json` for structured update proposals
(consumed by interactive and `--auto` paths):
- `patches[]`: `{ capability_id, kind, confidence, rationale, field_path, value |
append, promotion_history_entry }`
- `kinds`: `vector_sync`, `evidence_append`, `artifact_append`, `maturity_promote`,
`consumer_feedback`, `relation_add`, `index_row_add`
- `evidence_citations[]`: repo-relative paths supporting each patch
Add prompt builders in `reuse_surface/registry_update.py` (or
`reuse_surface/maintain_llm.py`):
- `build_maintain_prompt(repo_root, capability_id, git_since, context_files)`
- Schema-constrained JSON via `request_json_object` + validator
- Reuse maturity level definitions from `CapabilityMaturityStandard.md` in prompt
context (summary table, not full doc)
Pytest: fixture repo + mocked llm-connect returning valid/invalid patches.
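
To illustrate what "valid/invalid patches" means before any apply step, here is a hedged, standard-library-only pre-check. It mirrors only the required keys and the `kind` and `confidence` enums from `schemas/registry-patch.schema.json`; it is not a full JSON Schema validator, and `check_patch_set` is a hypothetical helper name.

```python
from typing import Any

# Values copied from the schema's "required" list and enums.
REQUIRED_KEYS = ("capability_id", "kind", "confidence", "rationale")
KINDS = (
    "vector_sync",
    "evidence_append",
    "artifact_append",
    "maturity_promote",
    "consumer_feedback",
    "relation_add",
    "index_row_add",
    "index_updated_bump",
)
CONFIDENCES = ("low", "medium", "high")


def check_patch_set(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    patches = payload.get("patches")
    if not isinstance(patches, list):
        return ["'patches' must be an array"]
    problems: list[str] = []
    for i, patch in enumerate(patches):
        if not isinstance(patch, dict):
            problems.append(f"patches[{i}]: must be an object")
            continue
        for key in REQUIRED_KEYS:
            if key not in patch:
                problems.append(f"patches[{i}]: missing '{key}'")
        if "kind" in patch and patch["kind"] not in KINDS:
            problems.append(f"patches[{i}]: unknown kind {patch['kind']!r}")
        if "confidence" in patch and patch["confidence"] not in CONFIDENCES:
            problems.append(f"patches[{i}]: unknown confidence {patch['confidence']!r}")
    return problems
```

A mocked llm-connect response can be run through a check like this to produce both the passing and the failing fixture cases.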
## Expand Deterministic Signal Collectors
```task
id: REUSE-WP-0016-T02
status: done
priority: high
state_hub_task_id: "55e6d943-6237-4332-9b01-2fa42aceff1f"
```
Extend `collect_deterministic_suggestions` in `registry_update.py`:
| Signal | Suggested field |
|---|---|
| `.gitea/workflows/*.yml` changed | `evidence.tests` or `evidence.documentation` |
| `docs/**` changed | `evidence.documentation` |
| `pyproject.toml` / `[project.scripts]` added | `availability.current_artifacts` |
| New `registry/capabilities/*.md` without index row | `index_row_add` patch |
| `index.updated` stale vs last git touch on `registry/` | bump `updated` date |
| Missing entry file for index row | `missing_entry` (blocking warning) |
Keep `--apply` safe-list explicit in code (document in module docstring). Add
regression tests in `tests/test_registry_update.py`.
## Implement Interactive Prompt Module
```task
id: REUSE-WP-0016-T03
status: done
priority: high
state_hub_task_id: "fe3a2e99-8c40-48a7-9d70-0e92b48146d2"
```
New module `reuse_surface/interactive.py`:
- Detect TTY (`sys.stdin.isatty()`)
- `prompt_patch(patch) -> Literal["apply","skip","edit","quit"]` with short
summary (kind, capability_id, rationale, field preview)
- `prompt_batch(patches) -> list[patch]` supporting **Apply all safe** for
deterministic kinds only
- Non-TTY: raise unless `assume_yes` / `auto_mode` set; emit JSON lines
(`{"event":"suggestion",...}`) for agent consumers
- Optional `$EDITOR` flow: write temp YAML snippet, re-parse on save
No llm-connect dependency. Pytest with stdin mocked via `io.StringIO`.
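
The TTY and non-TTY behavior described above might look roughly like the following. This is a sketch under assumptions: `select_patches` and `NonInteractiveSketchError` are hypothetical stand-ins for the module's real names, the `"patch"` key in the JSON event is our guess at the payload shape, and the `$EDITOR` and "Apply all safe" paths are omitted.

```python
from __future__ import annotations

import io
import json
import sys
from typing import Any, TextIO


class NonInteractiveSketchError(RuntimeError):
    """Raised when prompting is required but stdin is not a TTY."""


def select_patches(
    patches: list[dict[str, Any]],
    *,
    assume_yes: bool = False,
    stdin: TextIO | None = None,
    out: TextIO | None = None,
) -> list[dict[str, Any]]:
    stdin = stdin or sys.stdin
    out = out or sys.stdout
    if assume_yes:
        return list(patches)
    if not stdin.isatty():
        # Agents get one JSON line per suggestion, then an explicit failure.
        for patch in patches:
            out.write(json.dumps({"event": "suggestion", "patch": patch}) + "\n")
        raise NonInteractiveSketchError("stdin is not a TTY; pass assume_yes")
    selected: list[dict[str, Any]] = []
    for patch in patches:
        out.write(f"{patch['kind']} {patch['capability_id']}: {patch['rationale']}\n")
        out.write("[a]pply [s]kip [q]uit > ")
        out.flush()
        answer = stdin.readline().strip().lower()
        if answer == "q":
            break
        if answer == "a":
            selected.append(patch)
    return selected
```

Passing `stdin` and `out` explicitly keeps the function testable with `io.StringIO`, whose `isatty()` returns `False`.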
## Implement maintain Command
```task
id: REUSE-WP-0016-T04
status: done
priority: high
state_hub_task_id: "6e3a7b3d-1037-49ed-ad7c-341d21c333da"
```
Add `reuse-surface maintain` in `cli.py`. An alias `update --interactive` is an
option if fewer top-level verbs are preferred, but **`maintain`** is the default
user-facing entry point.
**Flags:**
| Flag | Purpose |
|---|---|
| `--path` | Repo root (default cwd) |
| `--capability` / `--all` | Scope |
| `--from-git-since` | Git ref for change detection |
| `--llm-url` | Override `LLM_CONNECT_URL` |
| `--no-llm` | Skip LLM phase |
| `--publish` | Run `establish --publish-check` after successful validate |
| `--raw-url` | Required when `--publish` |
| `--format json` | Machine-readable session result |
**Flow:**
1. Run T02 collectors
2. If LLM enabled: run T01 prompts per capability in scope
3. Merge deterministic + LLM into ordered patch list (deterministic first)
4. T03 interactive selection (unless `--auto` — T06)
5. T05 apply + T06 atomic validate
Preserve existing `update` command unchanged for scripting backward compatibility.
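
Step 3 of the flow (deterministic patches first, then LLM proposals) can be sketched as below. The ordering comes from the flow above; dropping LLM proposals that duplicate an earlier patch is our assumption, and `merge_patches` is a hypothetical name.

```python
import json
from typing import Any


def _patch_key(patch: dict[str, Any]) -> tuple[Any, ...]:
    """Identity of a change: which capability, what kind, which field and payload."""
    payload = patch.get("append", patch.get("value"))
    return (
        patch["capability_id"],
        patch["kind"],
        patch.get("field_path"),
        json.dumps(payload, sort_keys=True, default=str),
    )


def merge_patches(
    deterministic: list[dict[str, Any]],
    llm: list[dict[str, Any]],
) -> list[dict[str, Any]]:
    """Deterministic patches keep their order and come first; duplicates are dropped."""
    seen: set[tuple[Any, ...]] = set()
    merged: list[dict[str, Any]] = []
    for patch in [*deterministic, *llm]:
        key = _patch_key(patch)
        if key in seen:
            continue
        seen.add(key)
        merged.append(patch)
    return merged
```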
## LLM Patch Apply Path With Evidence Gates
```task
id: REUSE-WP-0016-T05
status: done
priority: medium
state_hub_task_id: "f0baa772-b7f0-4143-9fd9-9c96db17f532"
```
Implement `apply_patches(repo_root, patches) -> list[str]`:
- Reuse `apply_deterministic_suggestions` for overlapping kinds
- New writers: `promotion_history` append, `maturity.*.current` with vector
sync to index, `consumer_feedback` append, `relations.*` append (optional v1)
- **Evidence gate:** for `maturity_promote`, require every
`evidence_citations` path to exist under `repo_root`; reject patch if not
- **Level gate:** refuse promotion if delta > `--auto-max-delta` unless
interactive user confirms
- Bump `registry/indexes/capabilities.yaml` `updated` field on any write
Pytest: promote with/without evidence files; vector/index consistency after apply.
## Implement --auto Mode And Atomic Batch
```task
id: REUSE-WP-0016-T06
status: done
priority: medium
state_hub_task_id: "bd8f6243-24a3-44f8-9824-4cc2518ad8d9"
```
`maintain --auto`:
- Apply all `safe` deterministic patches
- Apply LLM patches with `confidence >= --auto-confidence` (default `high`) and
passing evidence gates
- `--auto-max-delta` (default `1`) caps promotion steps per dimension per session
- `--yes` on a non-TTY is equivalent to `--auto` with default thresholds
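The selection rules above could be sketched as a single filter. Patches are modeled as dicts here. The `low`/`medium` confidence levels and the `safe`, `evidence_ok`, `dimension`, and `delta` keys are assumptions for illustration; only `high` and the two thresholds come from this workplan.

```python
# Assumed ordering of confidence labels; only "high" is named in the workplan.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def auto_select(patches, auto_confidence="high", auto_max_delta=1):
    """Pick the patches that --auto would apply without prompting."""
    threshold = CONFIDENCE_RANK[auto_confidence]
    steps_used = {}  # (capability, dimension) -> promotion steps taken this session
    selected = []
    for patch in patches:
        if patch["source"] == "deterministic":
            if patch.get("safe", False):
                selected.append(patch)
            continue
        if CONFIDENCE_RANK[patch["confidence"]] < threshold:
            continue
        if not patch.get("evidence_ok", False):
            continue
        delta = patch.get("delta", 0)
        if delta:
            key = (patch["capability"], patch["dimension"])
            if steps_used.get(key, 0) + delta > auto_max_delta:
                continue  # would exceed the per-dimension cap for this session
            steps_used[key] = steps_used.get(key, 0) + delta
        selected.append(patch)
    return selected
```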
**Atomic batch:** write all entry/index changes to temp files under
`.reuse-surface-session/`; on validate success, rename into place; on failure,
discard and print validator errors.
Exit codes: `0` ok, `1` validation/schema failure, `2` partial skip (no writes).
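The stage, validate, rename sequence might be sketched like this. `apply_atomically` and its `validate` callback are hypothetical names, and the return values mirror only the `0` and `1` exit codes listed above.

```python
import os
import shutil
from pathlib import Path

SESSION_DIR = ".reuse-surface-session"


def apply_atomically(repo_root, new_contents, validate):
    """Stage files, validate, then rename into place.

    new_contents maps repo-relative paths to the new file text.
    validate is a callable taking the staging directory and returning a list
    of error strings (a stand-in for the real validator).
    """
    root = Path(repo_root)
    session = root / SESSION_DIR
    if session.exists():
        shutil.rmtree(session)  # leftovers from an interrupted session
    try:
        for rel, text in new_contents.items():
            staged = session / rel
            staged.parent.mkdir(parents=True, exist_ok=True)
            staged.write_text(text, encoding="utf-8")
        errors = validate(session)
        if errors:
            for error in errors:
                print(error)
            return 1  # nothing under the real registry paths was touched
        for rel in new_contents:
            target = root / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            os.replace(session / rel, target)  # atomic per file on one filesystem
        return 0
    finally:
        shutil.rmtree(session, ignore_errors=True)
```

Staging under the repo root keeps the temp files on the same filesystem as their targets, which is what lets `os.replace` act as an atomic rename. Each rename is atomic on its own; the batch as a whole is protected by validating before the first rename.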
## Validate Cwd Auto-Detect And Publish Helper
```task
id: REUSE-WP-0016-T07
status: done
priority: low
state_hub_task_id: "a61c0843-f44b-4e75-9043-7d042087e015"
```
- When `--root` / `--path` omitted and `./registry/indexes/capabilities.yaml`
exists, default repo root to cwd (validate, update, maintain, stats)
- `maintain --publish --raw-url` chains: maintain session → validate →
`establish.publish_check` → print pass/fail markdown
- Document raw URL convention in session summary when `REUSE_SURFACE_RAW_URL` set
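The cwd auto-detect rule could be sketched as below. `resolve_repo_root` is a hypothetical name, and returning `None` when no index is found is an assumption; the real commands may fall back to another default or raise an error instead.

```python
from pathlib import Path

INDEX_RELPATH = Path("registry/indexes/capabilities.yaml")


def resolve_repo_root(explicit=None, cwd=None):
    """Prefer an explicit --root/--path; otherwise use cwd if it holds a registry index."""
    if explicit is not None:
        return Path(explicit)
    here = Path(cwd) if cwd is not None else Path.cwd()
    if (here / INDEX_RELPATH).is_file():
        return here
    return None  # assumption: the caller decides how to handle "no registry here"
```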
## Sibling Integration Templates
```task
id: REUSE-WP-0016-T08
status: done
priority: low
state_hub_task_id: "ec2d58a3-c797-464b-9fb3-464f71360c9c"
```
Ship copy-paste artifacts (not installed into sibling repos automatically):
- `templates/Makefile.registry.fragment` — `registry-maintain`, `registry-check`
- `templates/git-hook.pre-commit.registry` — `maintain --auto --no-llm` when
`registry/` changed
- `establish --scaffold` addition: optional `--hook` writes `.git/hooks/pre-commit`
  and refuses to overwrite an existing hook unless `--force` is given
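The `--hook` behavior could be sketched as follows. `install_hook` and `HOOK_TEXT` are illustrative, not the shipped template; the hook body assumes that `git diff --cached --quiet` exits non-zero when staged changes touch `registry/`.

```python
import stat
from pathlib import Path

# Illustrative hook body, not the shipped template.
HOOK_TEXT = """#!/bin/sh
# Run maintain only when staged changes touch registry/.
git diff --cached --quiet -- registry/ || reuse-surface maintain --auto --no-llm
"""


def install_hook(repo_root, hook_text=HOOK_TEXT, force=False):
    """Write .git/hooks/pre-commit, refusing to overwrite unless force is set."""
    hook = Path(repo_root) / ".git" / "hooks" / "pre-commit"
    if hook.exists() and not force:
        raise FileExistsError(f"{hook} already exists; pass force=True to overwrite")
    hook.parent.mkdir(parents=True, exist_ok=True)
    hook.write_text(hook_text, encoding="utf-8")
    # Git only runs hooks that are executable.
    hook.chmod(hook.stat().st_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return hook
```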
Dogfood: run against `state-hub` checkout when available.
## Documentation, Tests, And Gap Note
```task
id: REUSE-WP-0016-T09
status: done
priority: low
state_hub_task_id: "85f8f549-7df9-493c-b43b-f1b67af3ee6c"
```
- `tools/README.md` — `maintain` command reference; interactive vs `--auto`
- `docs/RegistryFederation.md` — link maintain + publish to sibling onboarding
- `registry/README.md` — operator checklist after `maintain` session
- `docs/IntentScopeGapAnalysis.md` — add priority **28** (registry maintenance
automation); mark open
- `SCOPE.md` — extend "What Is Possible Now" when T04 ships
- CI: `maintain --auto --no-llm` on reuse-surface self-registry (informational
or gated); no live llm-connect in CI
- Pytest count increase; `reuse-surface validate` unchanged for default path
---
## Acceptance
- [x] `reuse-surface maintain` in TTY walks through suggestions with apply/skip/edit
- [x] `maintain --auto --no-llm` applies deterministic patches and validates atomically
- [x] LLM patches apply only with schema validation + evidence gates
- [x] `maintain --publish --raw-url` reports federation publish pass/fail
- [x] Non-TTY without `--auto`/`--yes` fails with clear message (no silent writes)
- [x] `validate` defaults to cwd when local `registry/` index exists
- [x] All new behavior documented; gap priority 28 recorded
## Completion notes (2026-06-17)
- Modules: `maintain.py`, `maintain_llm.py`, `patches.py`, `interactive.py`
- Schema: `schemas/registry-patch.schema.json`
- Templates: `templates/Makefile.registry.fragment`, `templates/git-hook.pre-commit.registry`
- CLI: `reuse-surface maintain`; `establish --scaffold --hook`
- Tests: `tests/test_maintain.py`, `tests/test_interactive.py` (59 pytest total)
## Out of scope
- Hub cache invalidation webhooks (gap priority from §3.1 — separate workplan)
- Auto `hub register` (still operator step with token)
- Embedding / ML overlap detection (keep `overlaps` heuristic)
- llm-connect hosting or provider configuration inside reuse-surface
- Fully unattended maturity promotion without evidence citations
## Dogfood target
From `~/state-hub` (or any roster repo with `publish_check: pass`):
```bash
export LLM_CONNECT_URL=http://127.0.0.1:8088
reuse-surface maintain --from-git-since origin/main
reuse-surface maintain --auto --from-git-since HEAD~3
reuse-surface maintain --publish \
--raw-url https://gitea.coulomb.social/coulomb/state-hub/raw/main/registry/indexes/capabilities.yaml
```
Success: registry files are updated, `validate --root .` passes, and publish-check returns HTTP 200.