generated from coulomb/repo-seed
Implement REUSE-WP-0013 registry establish, update, and stats
Some checks failed
ci / validate-registry (push) Has been cancelled
Add stats, establish (scaffold, publish-check, discover), and update CLI commands with an optional llm-connect bridge. Also add validate --root for sibling repos, pytest coverage, and documentation for sibling registry onboarding.
@@ -32,6 +32,9 @@ jobs:
|
||||
reuse-surface catalog
|
||||
reuse-surface graph --check --fail-on-warnings
|
||||
|
||||
- name: Registry stats (informational)
|
||||
run: reuse-surface stats || true
|
||||
|
||||
- name: Planning cohort report (informational)
|
||||
run: reuse-surface report cohorts --planning-min D4 || true
|
||||
|
||||
|
||||
9
SCOPE.md
@@ -60,6 +60,11 @@ The MVP registry foundation, CLI tooling (REUSE-WP-0003), federation stack
|
||||
against `https://reuse.coulomb.social`
|
||||
- **Sync local federation manifest from hub** with `reuse-surface hub sync`
|
||||
- **Export planning cohorts** with `reuse-surface report cohorts`
|
||||
- **Bootstrap a sibling registry** with `reuse-surface establish --scaffold`
|
||||
- **Verify index publish readiness** with `reuse-surface establish --publish-check`
|
||||
- **View registry stats** with `reuse-surface stats`
|
||||
- **Draft or refresh entries** with `reuse-surface establish --discover` and
|
||||
`reuse-surface update` (optional llm-connect backend)
|
||||
- **Run the hub locally or in a container** with `reuse-surface serve`
|
||||
- **Generate relation graphs** with `reuse-surface graph`
|
||||
- **Explore relations interactively** at `docs/graph/index.html`
|
||||
@@ -104,8 +109,8 @@ See `tools/README.md` for command reference.
|
||||
- **Federated index:** `registry/indexes/federated.yaml` (local compose).
|
||||
- **Relation graph:** `docs/graph/capability-graph.mmd`, `docs/graph/index.html`.
|
||||
- **Searchable catalog:** `docs/catalog/search.html`.
|
||||
- **Workplans:** REUSE-WP-0001 through REUSE-WP-0011 finished; WP-0011 archived;
|
||||
**REUSE-WP-0012** finished (federation scale + intent alignment).
|
||||
- **Workplans:** REUSE-WP-0001 through REUSE-WP-0012 finished/archived;
|
||||
**REUSE-WP-0013** finished (registry establish/update/stats).
|
||||
- **Assessment history:** `history/2026-06-15-intent-scope-assessment.md`.
|
||||
- **Self-assessed vector:** `D5 / A4 / C5 / R3` (see `docs/IntentScopeGapAnalysis.md`).
|
||||
|
||||
|
||||
@@ -3,7 +3,7 @@
|
||||
**Repository:** `reuse-surface`
|
||||
**Artifact:** `docs/IntentScopeGapAnalysis.md`
|
||||
**Status:** Living analysis
|
||||
**Updated:** 2026-06-16
|
||||
**Updated:** 2026-06-17
|
||||
**Purpose:** Record alignment, drift, and open gaps between declared intent and
|
||||
current delivered scope so future workplans can close them deliberately.
|
||||
|
||||
@@ -30,6 +30,8 @@ four maturity dimensions, and human/agent consumers.
|
||||
standardization tracker still manual.
|
||||
3. **Hub automation** — `hub sync` shipped; polling/webhooks still absent.
|
||||
4. **Managed platform posture** — A5 container documented; A6/Postgres deferred.
|
||||
5. **Registry bootstrap in sibling repos** — `establish`/`update`/`stats` shipped;
|
||||
sibling adoption still operator-driven.
|
||||
|
||||
**Current reuse-surface product vector (self-assessment):** `D5 / A4 / C5 / R3`
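The vector string packs the four maturity dimensions into one value. A minimal sketch of a strict parser follows, assuming single-digit levels and the `D / A / C / R` ordering used throughout this commit. The name `parse_vector_strict` is hypothetical; the repository ships its own `parse_vector` in `reuse_surface.registry`, whose behavior is not shown here and may differ.

```python
import re

# Assumption: each level is one letter plus one digit, in D/A/C/R order.
VECTOR = re.compile(r"^D(\d)\s*/\s*A(\d)\s*/\s*C(\d)\s*/\s*R(\d)$")


def parse_vector_strict(vector: str) -> dict[str, str]:
    """Split a maturity vector such as 'D5 / A4 / C5 / R3' into named levels.

    Hypothetical helper for illustration; raises ValueError on malformed input
    instead of failing later during tuple unpacking.
    """
    match = VECTOR.match(vector.strip())
    if match is None:
        raise ValueError(f"malformed maturity vector: {vector!r}")
    d, a, c, r = match.groups()
    return {
        "discovery": f"D{d}",
        "availability": f"A{a}",
        "completeness": f"C{c}",
        "reliability": f"R{r}",
    }


if __name__ == "__main__":
    print(parse_vector_strict("D5 / A4 / C5 / R3"))
```

Raising `ValueError` keeps the failure in the same exception family that the CLI commands in this commit already catch and report.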
|
||||
|
||||
@@ -197,8 +199,10 @@ archived workplans under `workplans/archived/`.
|
||||
| 21 | INTENT layout sync | Update INTENT.md tree and example entry shape | **Closed** (WP-0012) |
|
||||
| 22 | Hub hardening | Postgres option, backup, documented SLO (A5→A6 path) | **Closed** (doc; implementation deferred) |
|
||||
| 23 | External evidence program | Raise catalog R levels with consumer_feedback | **Closed** (checklist + 3 entries; telemetry deferred) |
|
||||
| 24 | Registry bootstrap tooling | `establish`, `update`, `stats` for sibling repos | **Closed** (WP-0013) |
|
||||
|
||||
**Workplan:** `REUSE-WP-0012` (finished). **Assessment snapshots:**
|
||||
**Workplan:** `REUSE-WP-0013` (finished). Prior: `REUSE-WP-0012` (finished).
|
||||
**Assessment snapshots:**
|
||||
`history/2026-06-15-intent-scope-assessment.md`,
|
||||
`history/2026-06-16-hub-registration-blocks.md`.
|
||||
|
||||
@@ -227,4 +231,5 @@ archived workplans under `workplans/archived/`.
|
||||
| 2026-06-15 | REUSE-WP-0011 closed priority 17; hub live at reuse.coulomb.social |
|
||||
| 2026-06-15 | Post-WP-0011 refresh: 20 capabilities, vector D5/A4/C4/R3, priorities 18–23 proposed |
|
||||
| 2026-06-15 | REUSE-WP-0012 proposed; assessment archived in `history/2026-06-15-intent-scope-assessment.md` |
|
||||
| 2026-06-16 | REUSE-WP-0012 closed priorities 19–23; priority 18 deferred on sibling index blocks; vector C5 |
|
||||
| 2026-06-16 | REUSE-WP-0012 closed priorities 19–23; priority 18 deferred on sibling index blocks; vector C5 |
|
||||
| 2026-06-17 | REUSE-WP-0013 closed priority 24; establish/update/stats + optional llm-connect assist |
|
||||
@@ -97,6 +97,18 @@ curl -fsS "<raw-url>" | head
|
||||
source) to an environment variable holding a Bearer token or full header value.
|
||||
The hub stores `auth_env` / `auth_header` names only — never secret values.
|
||||
|
||||
### Sibling onboarding (CLI)
|
||||
|
||||
```bash
|
||||
cd ../state-hub
|
||||
reuse-surface establish --scaffold --domain helix_forge
|
||||
# optional: LLM_CONNECT_URL=... reuse-surface establish --discover --dry-run
|
||||
reuse-surface validate --root .
|
||||
git push origin main
|
||||
reuse-surface establish --publish-check \
|
||||
--raw-url https://gitea.coulomb.social/coulomb/state-hub/raw/main/registry/indexes/capabilities.yaml
|
||||
```
|
||||
|
||||
### Registration checklist
|
||||
|
||||
1. Merge capability index to the default branch.
|
||||
|
||||
@@ -20,6 +20,9 @@ dev = [
|
||||
"httpx>=0.27",
|
||||
"pytest>=8.0",
|
||||
]
|
||||
llm = [
|
||||
"llm-connect",
|
||||
]
|
||||
|
||||
[project.scripts]
|
||||
reuse-surface = "reuse_surface.cli:main"
|
||||
|
||||
@@ -35,6 +35,21 @@ registry/
|
||||
|
||||
Missing evidence is acceptable in the MVP when it is explicit rather than hidden.
|
||||
|
||||
## LLM-assisted discover review checklist
|
||||
|
||||
When using `reuse-surface establish --discover` (llm-connect backend):
|
||||
|
||||
- [ ] Every proposed `id` follows `capability.<domain>.<name>` and is not a duplicate
|
||||
- [ ] `summary`, `discovery.intent`, and maturity vectors match repo reality
|
||||
- [ ] `owner` reflects the delivering repository or team
|
||||
- [ ] Relations are empty or manually added after human review
|
||||
- [ ] Run `reuse-surface validate --root <repo>` before merge
|
||||
- [ ] Run `reuse-surface establish --publish-check` after pushing to `main`
|
||||
|
||||
Discover drafts start at low maturity with explicit auto-draft risks in
|
||||
`known_reliability_risks`. Promote only with evidence per
|
||||
`specs/CapabilityMaturityStandard.md`.
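The first checklist item can be partly automated before human review. The sketch below is one possible pre-check, not part of the shipped CLI: `review_ids` is a hypothetical helper, and the pattern is the one quoted in the discover prompt rules elsewhere in this commit.

```python
import re

# ID pattern as quoted in the discover prompt rules of this commit.
CAPABILITY_ID = re.compile(r"^capability\.[a-z0-9]+(\.[a-z0-9-]+)+$")


def review_ids(proposed: list[str], existing: set[str]) -> list[str]:
    """Return human-readable problems for proposed capability ids.

    Hypothetical helper: flags ids that do not match the pattern and ids
    that repeat within the draft or collide with already indexed ids.
    """
    problems: list[str] = []
    seen: set[str] = set()
    for cap_id in proposed:
        if not CAPABILITY_ID.match(cap_id):
            problems.append(f"{cap_id}: does not match capability.<domain>.<name>")
        if cap_id in existing or cap_id in seen:
            problems.append(f"{cap_id}: duplicate id")
        seen.add(cap_id)
    return problems


if __name__ == "__main__":
    draft_ids = ["capability.statehub.sync", "Capability.StateHub.Sync"]
    for problem in review_ids(draft_ids, existing={"capability.statehub.sync"}):
        print(problem)
```

A check like this only covers naming and duplicates; the remaining checklist items still need a human reviewer.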
|
||||
|
||||
## Manual validation checklist
|
||||
|
||||
Use this checklist until an automated CLI validator exists.
|
||||
|
||||
@@ -26,21 +26,48 @@ from reuse_surface.reports import (
|
||||
format_cohort_markdown,
|
||||
select_cohort,
|
||||
)
|
||||
from reuse_surface.establish import (
|
||||
discover_capabilities,
|
||||
format_publish_check_markdown,
|
||||
publish_check,
|
||||
scaffold_next_steps,
|
||||
scaffold_registry,
|
||||
)
|
||||
from reuse_surface.registry_update import (
|
||||
apply_deterministic_suggestions,
|
||||
collect_deterministic_suggestions,
|
||||
format_suggestions_json,
|
||||
format_suggestions_markdown,
|
||||
suggest_llm_updates,
|
||||
)
|
||||
from reuse_surface.stats import collect_stats, format_stats_json, format_stats_markdown
|
||||
from reuse_surface.registry import (
|
||||
ROOT,
|
||||
capability_paths,
|
||||
level_at_least,
|
||||
load_index,
|
||||
load_index_at,
|
||||
load_schema,
|
||||
parse_front_matter,
|
||||
parse_vector,
|
||||
registry_paths,
|
||||
)
|
||||
|
||||
|
||||
def _check_index_drift(entry_paths: list[Path], index: dict[str, Any]) -> list[str]:
|
||||
def _registry_root(args: argparse.Namespace) -> Path:
|
||||
if getattr(args, "root", None):
|
||||
return Path(args.root).resolve()
|
||||
return ROOT
|
||||
|
||||
|
||||
def _check_index_drift(
|
||||
entry_paths: list[Path],
|
||||
index: dict[str, Any],
|
||||
repo_root: Path,
|
||||
) -> list[str]:
|
||||
warnings: list[str] = []
|
||||
indexed_paths = {item["path"] for item in index.get("capabilities", [])}
|
||||
file_paths = {str(path.relative_to(ROOT)) for path in entry_paths}
|
||||
file_paths = {str(path.relative_to(repo_root)) for path in entry_paths}
|
||||
for path in sorted(file_paths - indexed_paths):
|
||||
warnings.append(f"index drift: entry file not indexed: {path}")
|
||||
for path in sorted(indexed_paths - file_paths):
|
||||
@@ -48,11 +75,22 @@ def _check_index_drift(entry_paths: list[Path], index: dict[str, Any]) -> list[s
|
||||
return warnings
|
||||
|
||||
|
||||
def cmd_validate(args: argparse.Namespace) -> int:
|
||||
def _capability_paths_for(repo_root: Path, target: Path | None) -> list[Path]:
|
||||
if target is not None:
|
||||
return [target]
|
||||
cap_dir = registry_paths(repo_root)["capabilities"]
|
||||
return sorted(path for path in cap_dir.glob("*.md") if path.name != ".gitkeep")
|
||||
|
||||
|
||||
def _run_validate(
|
||||
repo_root: Path,
|
||||
*,
|
||||
target: Path | None,
|
||||
relations: bool,
|
||||
) -> tuple[list[str], list[str], list[Path]]:
|
||||
schema = load_schema()
|
||||
validator = Draft202012Validator(schema)
|
||||
target = Path(args.path) if args.path else None
|
||||
paths = capability_paths(target)
|
||||
paths = _capability_paths_for(repo_root, target)
|
||||
errors: list[str] = []
|
||||
warnings: list[str] = []
|
||||
|
||||
@@ -67,10 +105,23 @@ def cmd_validate(args: argparse.Namespace) -> int:
|
||||
errors.append(f"{path}: {location}: {error.message}")
|
||||
|
||||
if not target:
|
||||
index = load_index()
|
||||
warnings.extend(_check_index_drift(paths, index))
|
||||
if args.relations:
|
||||
index_path = registry_paths(repo_root)["index"]
|
||||
if index_path.exists():
|
||||
index = load_index_at(index_path)
|
||||
warnings.extend(_check_index_drift(paths, index, repo_root))
|
||||
if relations and repo_root == ROOT:
|
||||
warnings.extend(check_relations())
|
||||
return errors, warnings, paths
|
||||
|
||||
|
||||
def cmd_validate(args: argparse.Namespace) -> int:
|
||||
repo_root = _registry_root(args)
|
||||
target = Path(args.path) if args.path else None
|
||||
if target and not target.is_absolute():
|
||||
target = repo_root / target
|
||||
errors, warnings, paths = _run_validate(
|
||||
repo_root, target=target, relations=args.relations
|
||||
)
|
||||
|
||||
for warning in warnings:
|
||||
print(f"warning: {warning}", file=sys.stderr)
|
||||
@@ -329,6 +380,117 @@ def cmd_hub_sync(args: argparse.Namespace) -> int:
|
||||
return 0
|
||||
|
||||
|
||||
def cmd_stats(args: argparse.Namespace) -> int:
|
||||
repo_root = Path(args.path or ".").resolve()
|
||||
stats = collect_stats(
|
||||
repo_root,
|
||||
federation_ready=args.federation_ready,
|
||||
raw_url=args.raw_url,
|
||||
hub_url=getattr(args, "hub_url", None),
|
||||
)
|
||||
if args.format == "json":
|
||||
print(format_stats_json(stats))
|
||||
else:
|
||||
print(format_stats_markdown(stats), end="")
|
||||
return 0
|
||||
|
||||
|
||||
def cmd_establish(args: argparse.Namespace) -> int:
|
||||
repo_root = Path(args.path or ".").resolve()
|
||||
try:
|
||||
if args.scaffold:
|
||||
created = scaffold_registry(
|
||||
repo_root, domain=args.domain, force=args.force
|
||||
)
|
||||
for path in created:
|
||||
print(f"ok: wrote {path.relative_to(repo_root)}")
|
||||
print(scaffold_next_steps(repo_root))
|
||||
return 0
|
||||
if args.publish_check:
|
||||
result = publish_check(repo_root, raw_url=args.raw_url)
|
||||
print(format_publish_check_markdown(result), end="")
|
||||
return 0 if result["ok"] else 1
|
||||
if args.discover:
|
||||
result = discover_capabilities(
|
||||
repo_root,
|
||||
domain=args.domain,
|
||||
dry_run=not args.apply,
|
||||
apply=args.apply,
|
||||
llm_url=args.llm_url,
|
||||
context_max_files=args.context_max_files,
|
||||
)
|
||||
if result.get("dry_run"):
|
||||
print(yaml.safe_dump(result["draft"], sort_keys=False))
|
||||
return 0
|
||||
for path in result.get("written", []):
|
||||
print(f"ok: wrote {path}")
|
||||
validate_args = argparse.Namespace(
|
||||
path=None,
|
||||
root=str(repo_root),
|
||||
relations=False,
|
||||
fail_on_warnings=True,
|
||||
)
|
||||
return cmd_validate(validate_args)
|
||||
except ValueError as exc:
|
||||
print(f"error: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
print("error: specify --scaffold, --publish-check, or --discover", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
|
||||
def cmd_update(args: argparse.Namespace) -> int:
|
||||
repo_root = Path(args.path or ".").resolve()
|
||||
try:
|
||||
capability_id = None if args.all else args.capability
|
||||
if not args.all and not args.capability:
|
||||
print("error: specify --capability or --all", file=sys.stderr)
|
||||
return 1
|
||||
if args.suggest_maturity:
|
||||
cap_ids = [args.capability] if args.capability else []
|
||||
if args.all:
|
||||
index = load_index_at(registry_paths(repo_root)["index"])
|
||||
cap_ids = [row["id"] for row in index.get("capabilities", [])]
|
||||
payload = {
|
||||
"suggestions": [
|
||||
suggest_llm_updates(
|
||||
repo_root,
|
||||
cap_id,
|
||||
git_since=args.from_git_since,
|
||||
llm_url=args.llm_url,
|
||||
)
|
||||
for cap_id in cap_ids
|
||||
]
|
||||
}
|
||||
print(json.dumps(payload, indent=2, sort_keys=True))
|
||||
return 0
|
||||
|
||||
suggestions = collect_deterministic_suggestions(
|
||||
repo_root,
|
||||
capability_id=capability_id,
|
||||
git_since=args.from_git_since,
|
||||
)
|
||||
if args.apply:
|
||||
changed = apply_deterministic_suggestions(repo_root, suggestions)
|
||||
for line in changed:
|
||||
print(f"ok: {line}")
|
||||
validate_args = argparse.Namespace(
|
||||
path=None,
|
||||
root=str(repo_root),
|
||||
relations=False,
|
||||
fail_on_warnings=True,
|
||||
)
|
||||
return cmd_validate(validate_args)
|
||||
|
||||
if args.format == "json":
|
||||
print(format_suggestions_json(suggestions))
|
||||
else:
|
||||
print(format_suggestions_markdown(suggestions), end="")
|
||||
return 0
|
||||
except ValueError as exc:
|
||||
print(f"error: {exc}", file=sys.stderr)
|
||||
return 1
|
||||
|
||||
|
||||
def cmd_report_cohorts(args: argparse.Namespace) -> int:
|
||||
filters = cohort_filters_from_args(args)
|
||||
matches = select_cohort(filters)
|
||||
@@ -399,6 +561,10 @@ def main(argv: list[str] | None = None) -> int:
|
||||
action="store_true",
|
||||
help="exit non-zero when warnings are present",
|
||||
)
|
||||
validate.add_argument(
|
||||
"--root",
|
||||
help="registry repo root (default: reuse-surface install root)",
|
||||
)
|
||||
validate.set_defaults(func=cmd_validate)
|
||||
|
||||
federation = subparsers.add_parser(
|
||||
@@ -539,6 +705,41 @@ def main(argv: list[str] | None = None) -> int:
|
||||
)
|
||||
cohorts.set_defaults(func=cmd_report_cohorts)
|
||||
|
||||
stats = subparsers.add_parser("stats", help="registry maturity and federation stats")
|
||||
stats.add_argument("--path", help="repo root (default: cwd)")
|
||||
stats.add_argument("--federation-ready", action="store_true")
|
||||
stats.add_argument("--raw-url", help="probe federation raw index URL")
|
||||
stats.add_argument("--hub-url", help="hub base URL (or REUSE_SURFACE_URL)")
|
||||
stats.add_argument("--format", choices=["markdown", "json"], default="markdown")
|
||||
stats.set_defaults(func=cmd_stats)
|
||||
|
||||
establish = subparsers.add_parser(
|
||||
"establish", help="bootstrap or discover capability registry"
|
||||
)
|
||||
establish.add_argument("--path", help="target repo root (default: cwd)")
|
||||
establish.add_argument("--domain", default="helix_forge")
|
||||
establish.add_argument("--force", action="store_true")
|
||||
establish.add_argument("--scaffold", action="store_true")
|
||||
establish.add_argument("--publish-check", action="store_true")
|
||||
establish.add_argument("--discover", action="store_true")
|
||||
establish.add_argument("--dry-run", action="store_true", help="discover preview (default)")
|
||||
establish.add_argument("--apply", action="store_true", help="discover write + validate")
|
||||
establish.add_argument("--raw-url", help="raw Gitea index URL for publish-check")
|
||||
establish.add_argument("--llm-url", help="llm-connect base URL (or LLM_CONNECT_URL)")
|
||||
establish.add_argument("--context-max-files", type=int, default=12)
|
||||
establish.set_defaults(func=cmd_establish)
|
||||
|
||||
update = subparsers.add_parser("update", help="refresh registry metadata from repo signals")
|
||||
update.add_argument("--path", help="repo root (default: cwd)")
|
||||
update.add_argument("--capability", help="single capability id")
|
||||
update.add_argument("--all", action="store_true")
|
||||
update.add_argument("--from-git-since", help="git ref for change detection")
|
||||
update.add_argument("--apply", action="store_true")
|
||||
update.add_argument("--suggest-maturity", action="store_true")
|
||||
update.add_argument("--llm-url", help="llm-connect base URL (or LLM_CONNECT_URL)")
|
||||
update.add_argument("--format", choices=["markdown", "json"], default="markdown")
|
||||
update.set_defaults(func=cmd_update)
|
||||
|
||||
args = parser.parse_args(argv)
|
||||
return args.func(args)
|
||||
|
||||
|
||||
448
reuse_surface/establish.py
Normal file
@@ -0,0 +1,448 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import textwrap
|
||||
import urllib.error
|
||||
import urllib.request
|
||||
from datetime import date
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
import yaml
|
||||
|
||||
from reuse_surface.llm_bridge import request_registry_draft
|
||||
from reuse_surface.registry import load_index_at, registry_paths
|
||||
|
||||
SCAFFOLD_README = """# Capability Registry
|
||||
|
||||
Markdown-first capability index for federation and reuse planning.
|
||||
|
||||
## Authoring
|
||||
|
||||
1. Copy a capability entry template (see reuse-surface `templates/capability-entry.template.md`).
|
||||
2. Add the row to `indexes/capabilities.yaml`.
|
||||
3. Run `reuse-surface validate` from a checkout with the CLI installed.
|
||||
4. Merge to `main` and verify publish with `reuse-surface establish --publish-check`.
|
||||
|
||||
Federation contract: reuse-surface `docs/RegistryFederation.md`.
|
||||
"""
|
||||
|
||||
CONTEXT_FILES = (
|
||||
"INTENT.md",
|
||||
"SCOPE.md",
|
||||
"AGENTS.md",
|
||||
"README.md",
|
||||
"pyproject.toml",
|
||||
"Cargo.toml",
|
||||
"go.mod",
|
||||
)
|
||||
|
||||
|
||||
def scaffold_registry(
|
||||
repo_root: Path,
|
||||
*,
|
||||
domain: str = "helix_forge",
|
||||
force: bool = False,
|
||||
) -> list[Path]:
|
||||
paths = registry_paths(repo_root)
|
||||
created: list[Path] = []
|
||||
if paths["registry"].exists() and not force:
|
||||
raise ValueError(
|
||||
f"registry already exists at {paths['registry']}; use --force to overwrite"
|
||||
)
|
||||
|
||||
paths["registry"].mkdir(parents=True, exist_ok=True)
|
||||
paths["capabilities"].mkdir(parents=True, exist_ok=True)
|
||||
paths["index"].parent.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
readme = paths["registry"] / "README.md"
|
||||
if force or not readme.exists():
|
||||
readme.write_text(SCAFFOLD_README, encoding="utf-8")
|
||||
created.append(readme)
|
||||
|
||||
gitkeep = paths["capabilities"] / ".gitkeep"
|
||||
if force or not gitkeep.exists():
|
||||
gitkeep.write_text("", encoding="utf-8")
|
||||
created.append(gitkeep)
|
||||
|
||||
index_data = {
|
||||
"version": 1,
|
||||
"updated": date.today().isoformat(),
|
||||
"domain": domain,
|
||||
"capabilities": [],
|
||||
}
|
||||
if force or not paths["index"].exists():
|
||||
paths["index"].write_text(
|
||||
yaml.safe_dump(index_data, sort_keys=False, allow_unicode=True),
|
||||
encoding="utf-8",
|
||||
)
|
||||
created.append(paths["index"])
|
||||
return created
|
||||
|
||||
|
||||
def scaffold_next_steps(repo_root: Path) -> str:
|
||||
return textwrap.dedent(
|
||||
f"""
|
||||
Next steps:
|
||||
1. Add capability entries under {repo_root / 'registry/capabilities'}
|
||||
2. Update {repo_root / 'registry/indexes/capabilities.yaml'}
|
||||
3. reuse-surface validate
|
||||
4. git push origin main
|
||||
5. reuse-surface establish --publish-check --raw-url <gitea-raw-url>
|
||||
6. reuse-surface hub register --repo <slug> --url <raw-url>
|
||||
"""
|
||||
).strip()
|
||||
|
||||
|
||||
def publish_check(
|
||||
repo_root: Path,
|
||||
*,
|
||||
raw_url: str | None = None,
|
||||
) -> dict[str, Any]:
|
||||
paths = registry_paths(repo_root)
|
||||
result: dict[str, Any] = {
|
||||
"repo_root": str(repo_root),
|
||||
"checks": [],
|
||||
"ok": True,
|
||||
}
|
||||
|
||||
if paths["index"].exists():
|
||||
try:
|
||||
data = load_index_at(paths["index"])
|
||||
valid = isinstance(data, dict) and isinstance(data.get("capabilities"), list)
|
||||
result["checks"].append(
|
||||
{
|
||||
"name": "local_index_yaml",
|
||||
"ok": valid,
|
||||
"detail": f"{len(data.get('capabilities', []))} capabilities"
|
||||
if valid
|
||||
else "invalid structure",
|
||||
}
|
||||
)
|
||||
if not valid:
|
||||
result["ok"] = False
|
||||
except (OSError, yaml.YAMLError) as exc:
|
||||
result["checks"].append(
|
||||
{"name": "local_index_yaml", "ok": False, "detail": str(exc)}
|
||||
)
|
||||
result["ok"] = False
|
||||
else:
|
||||
result["checks"].append(
|
||||
{
|
||||
"name": "local_index_yaml",
|
||||
"ok": False,
|
||||
"detail": "registry/indexes/capabilities.yaml missing",
|
||||
}
|
||||
)
|
||||
result["ok"] = False
|
||||
|
||||
if raw_url:
|
||||
probe = _probe_raw_url(raw_url)
|
||||
result["checks"].append(
|
||||
{
|
||||
"name": "raw_url_probe",
|
||||
"ok": probe["ok"],
|
||||
"detail": f"HTTP {probe.get('status')} {probe.get('content_type', '')}".strip(),
|
||||
"url": raw_url,
|
||||
}
|
||||
)
|
||||
if probe["ok"]:
|
||||
body_probe = _fetch_yaml_snippet(raw_url)
|
||||
result["checks"].append(body_probe)
|
||||
if not body_probe.get("ok"):
|
||||
result["ok"] = False
|
||||
else:
|
||||
result["ok"] = False
|
||||
result["remediation"] = (
|
||||
"Merge registry/indexes/capabilities.yaml to main and confirm "
|
||||
"Gitea raw URL returns 200 YAML. See docs/RegistryFederation.md."
|
||||
)
|
||||
|
||||
return result
|
||||
|
||||
|
||||
def _probe_raw_url(url: str) -> dict[str, Any]:
|
||||
request = urllib.request.Request(
|
||||
url,
|
||||
method="HEAD",
|
||||
headers={"User-Agent": "reuse-surface/0.1"},
|
||||
)
|
||||
try:
|
||||
with urllib.request.urlopen(request, timeout=30) as response:
|
||||
return {
|
||||
"ok": response.status == 200,
|
||||
"status": response.status,
|
||||
"content_type": response.headers.get("Content-Type", ""),
|
||||
}
|
||||
except urllib.error.HTTPError as exc:
|
||||
return {
|
||||
"ok": False,
|
||||
"status": exc.code,
|
||||
"content_type": exc.headers.get("Content-Type", ""),
|
||||
}
|
||||
|
||||
|
||||
def _fetch_yaml_snippet(url: str) -> dict[str, Any]:
|
||||
request = urllib.request.Request(url, headers={"User-Agent": "reuse-surface/0.1"})
|
||||
try:
|
||||
with urllib.request.urlopen(request, timeout=30) as response:
|
||||
body = response.read().decode("utf-8")
|
||||
except urllib.error.HTTPError as exc:
|
||||
return {"name": "raw_url_body", "ok": False, "detail": f"HTTP {exc.code}"}
|
||||
except urllib.error.URLError as exc:
|
||||
return {"name": "raw_url_body", "ok": False, "detail": str(exc.reason)}
|
||||
try:
|
||||
data = yaml.safe_load(body)
|
||||
except yaml.YAMLError as exc:
|
||||
return {"name": "raw_url_body", "ok": False, "detail": str(exc)}
|
||||
ok = isinstance(data, dict) and "capabilities" in data
|
||||
return {
|
||||
"name": "raw_url_body",
|
||||
"ok": ok,
|
||||
"detail": "valid capabilities.yaml shape" if ok else "body is not valid index YAML",
|
||||
}
|
||||
|
||||
|
||||
def collect_context(repo_root: Path, *, max_files: int = 12) -> str:
|
||||
chunks: list[str] = []
|
||||
used = 0
|
||||
for name in CONTEXT_FILES:
|
||||
if used >= max_files:
|
||||
break
|
||||
path = repo_root / name
|
||||
if path.is_file():
|
||||
chunks.append(f"### {name}\n{path.read_text(encoding='utf-8')[:8000]}")
|
||||
used += 1
|
||||
pkg_dirs = sorted(
|
||||
[
|
||||
item
|
||||
for item in repo_root.iterdir()
|
||||
if item.is_dir()
|
||||
and not item.name.startswith(".")
|
||||
and item.name not in {"registry", "tests", "docs", "workplans", "node_modules"}
|
||||
]
|
||||
)
|
||||
for pkg in pkg_dirs[: max(0, max_files - used)]:
|
||||
init = pkg / "__init__.py"
|
||||
if init.exists():
|
||||
chunks.append(f"### {pkg.name}/__init__.py\n{init.read_text(encoding='utf-8')[:2000]}")
|
||||
return "\n\n".join(chunks)
|
||||
|
||||
|
||||
def build_discover_prompt(context: str, domain: str) -> str:
|
||||
schema_hint = json.dumps(
|
||||
{
|
||||
"domain": domain,
|
||||
"capabilities": [
|
||||
{
|
||||
"id": "capability.domain.name",
|
||||
"name": "Human Name",
|
||||
"summary": "One sentence.",
|
||||
"owner": "team",
|
||||
"vector": "D2 / A0 / C0 / R0",
|
||||
"tags": ["tag"],
|
||||
"consumption_modes": ["informational"],
|
||||
"discovery_intent": "What this enables.",
|
||||
"discovery_includes": ["included behavior"],
|
||||
"discovery_excludes": ["excluded behavior"],
|
||||
}
|
||||
],
|
||||
},
|
||||
indent=2,
|
||||
)
|
||||
return textwrap.dedent(
|
||||
f"""
|
||||
You are drafting a capability registry index for helix_forge reuse-surface.
|
||||
|
||||
Return ONLY a JSON object matching this shape (no markdown fences):
|
||||
{schema_hint}
|
||||
|
||||
Rules:
|
||||
- Propose 1-5 distinct capabilities grounded in the repository context.
|
||||
- Use IDs matching ^capability\\.[a-z0-9]+(\\.[a-z0-9-]+)+$
|
||||
- Default vector D2 / A0 / C0 / R0 unless strong delivery evidence exists.
|
||||
- domain: {domain}
|
||||
|
||||
Repository context:
|
||||
{context}
|
||||
"""
|
||||
).strip()
|
||||
|
||||
|
||||
def discover_capabilities(
|
||||
repo_root: Path,
|
||||
*,
|
||||
domain: str = "helix_forge",
|
||||
dry_run: bool = True,
|
||||
apply: bool = False,
|
||||
llm_url: str | None = None,
|
||||
context_max_files: int = 12,
|
||||
) -> dict[str, Any]:
|
||||
if apply and dry_run:
|
||||
raise ValueError("use either --dry-run or --apply, not both")
|
||||
if not apply and not dry_run:
|
||||
dry_run = True
|
||||
|
||||
context = collect_context(repo_root, max_files=context_max_files)
|
||||
if not context.strip():
|
||||
raise ValueError("no context files found for discovery")
|
||||
|
||||
prompt = build_discover_prompt(context, domain)
|
||||
draft = request_registry_draft(
|
||||
prompt,
|
||||
base_url=llm_url,
|
||||
config={"temperature": 0.2, "max_tokens": 4000},
|
||||
)
|
||||
|
||||
result: dict[str, Any] = {"draft": draft, "written": [], "dry_run": dry_run}
|
||||
if dry_run:
|
||||
return result
|
||||
|
||||
paths = registry_paths(repo_root)
|
||||
if not paths["index"].exists():
|
||||
scaffold_registry(repo_root, domain=domain, force=False)
|
||||
|
||||
index = load_index_at(paths["index"]) if paths["index"].exists() else {
|
||||
"version": 1,
|
||||
"domain": domain,
|
||||
"capabilities": [],
|
||||
}
|
||||
existing_ids = {row["id"] for row in index.get("capabilities", [])}
|
||||
|
||||
for item in draft.get("capabilities", []):
|
||||
cap_id = item["id"]
|
||||
if cap_id in existing_ids:
|
||||
continue
|
||||
filename = cap_id.replace(".", "-") + ".md"
|
||||
rel_path = f"registry/capabilities/{filename}"
|
||||
entry_path = repo_root / rel_path
|
||||
entry_body = _render_entry_from_draft(item, domain)
|
||||
entry_path.parent.mkdir(parents=True, exist_ok=True)
|
||||
entry_path.write_text(entry_body, encoding="utf-8")
|
||||
vector = item.get("vector", "D2 / A0 / C0 / R0")
|
||||
index.setdefault("capabilities", []).append(
|
||||
{
|
||||
"id": cap_id,
|
||||
"name": item["name"],
|
||||
"summary": item["summary"],
|
||||
"vector": vector,
|
||||
"domain": domain,
|
||||
"status": "draft",
|
||||
"owner": item.get("owner", repo_root.name),
|
||||
"path": rel_path,
|
||||
"tags": item.get("tags", []),
|
||||
"consumption_modes": item.get("consumption_modes", ["informational"]),
|
||||
}
|
||||
)
|
||||
result["written"].append(rel_path)
|
||||
|
||||
index["updated"] = date.today().isoformat()
|
||||
index["domain"] = draft.get("domain", domain)
|
||||
paths["index"].write_text(
|
||||
yaml.safe_dump(index, sort_keys=False, allow_unicode=True),
|
||||
encoding="utf-8",
|
||||
)
|
||||
result["written"].append(str(paths["index"].relative_to(repo_root)))
|
||||
return result
|
||||
|
||||
|
||||
def _render_entry_from_draft(item: dict[str, Any], domain: str) -> str:
|
||||
vector = item.get("vector", "D2 / A0 / C0 / R0")
|
||||
d, a, c, r = [part.strip() for part in vector.split("/")]
|
||||
front_matter = {
|
||||
"id": item["id"],
|
||||
"name": item["name"],
|
||||
"summary": item["summary"],
|
||||
"owner": item.get("owner", domain),
|
||||
"status": "draft",
|
||||
"domain": domain,
|
||||
"tags": item.get("tags") or ["draft"],
|
||||
"maturity": {
|
||||
"discovery": {
|
||||
"current": d,
|
||||
"target": "D5",
|
||||
"confidence": "low",
|
||||
"rationale": "Auto-drafted by reuse-surface establish --discover; review required.",
|
||||
},
|
||||
"availability": {
|
||||
"current": a,
|
||||
"target": "A3",
|
||||
"confidence": "low",
|
||||
"rationale": "Auto-drafted; confirm consumption modes and artifacts.",
|
||||
},
|
||||
},
|
||||
"external_evidence": {
|
||||
"completeness": {
|
||||
                "level": c,
                "confidence": "low",
                "basis": "scope_vs_intent_and_consumer_expectations",
                "satisfied_expectations": [],
                "broken_expectations": [],
                "out_of_scope_expectations": [],
            },
            "reliability": {
                "level": r,
                "confidence": "low",
                "basis": "consumer_quality_signals",
                "known_reliability_risks": ["auto-drafted entry without consumer evidence"],
            },
        },
        "discovery": {
            "intent": item.get("discovery_intent", item["summary"]),
            "includes": item.get("discovery_includes") or [],
            "excludes": item.get("discovery_excludes") or [],
            "assumptions": [],
            "use_cases": [],
            "research_memos": [],
        },
        "availability": {
            "current_level": a,
            "target_level": "A3",
            "current_artifacts": [],
            "target_artifacts": [],
            "consumption_modes": item.get("consumption_modes") or ["informational"],
        },
        "relations": {"depends_on": [], "supports": [], "related_to": []},
        "evidence": {
            "documentation": [],
            "tests": [],
            "consumer_feedback": [],
            "bug_reports": [],
            "incidents": [],
        },
        "consumer_guidance": {
            "recommended_for": ["planning reuse after human review"],
            "not_recommended_for": ["implementation reuse before validation"],
            "known_limitations": ["discover draft — verify maturity claims"],
        },
        "promotion_history": [],
    }
    markdown = (
        f"# {item['name']}\n\n"
        "Auto-drafted capability entry. Review maturity, evidence, and relations "
        "before promoting.\n"
    )
    return (
        "---\n"
        + yaml.safe_dump(front_matter, sort_keys=False, allow_unicode=True)
        + "---\n\n"
        + markdown
    )


def format_publish_check_markdown(result: dict[str, Any]) -> str:
    lines = ["# Federation publish check", ""]
    lines.append(f"**Repo:** `{result['repo_root']}`")
    lines.append(f"**Result:** {'PASS' if result['ok'] else 'FAIL'}")
    lines.append("")
    for check in result["checks"]:
        status = "ok" if check["ok"] else "FAIL"
        detail = check.get("detail", "")
        name = check["name"]
        lines.append(f"- **{name}**: {status} — {detail}")
        if check.get("url"):
            lines.append(f" `{check['url']}`")
    if result.get("remediation"):
        lines.append("")
        lines.append(f"**Remediation:** {result['remediation']}")
    return "\n".join(lines) + "\n"
102
reuse_surface/llm_bridge.py
Normal file
@@ -0,0 +1,102 @@
from __future__ import annotations

import json
import os
import re
import urllib.error
import urllib.request
from pathlib import Path
from typing import Any

from jsonschema import Draft202012Validator

from reuse_surface.registry import ROOT

DRAFT_SCHEMA_PATH = ROOT / "schemas" / "registry-draft.schema.json"


def llm_connect_url(explicit: str | None = None) -> str:
    base = (explicit or os.environ.get("LLM_CONNECT_URL", "")).rstrip("/")
    if not base:
        raise ValueError(
            "LLM backend not configured; set LLM_CONNECT_URL or pass --llm-url"
        )
    return base


def load_draft_schema() -> dict[str, Any]:
    return json.loads(DRAFT_SCHEMA_PATH.read_text(encoding="utf-8"))


def execute_prompt(
    prompt: str,
    *,
    base_url: str | None = None,
    config: dict[str, Any] | None = None,
) -> str:
    url = f"{llm_connect_url(base_url)}/execute"
    body: dict[str, Any] = {"prompt": prompt}
    if config:
        body["config"] = config
    data = json.dumps(body).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
            "User-Agent": "reuse-surface/0.1",
        },
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=120) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as exc:
        raw = exc.read().decode("utf-8")
        raise ValueError(f"llm-connect returned {exc.code}: {raw}") from exc
    content = payload.get("content")
    if not isinstance(content, str) or not content.strip():
        raise ValueError("llm-connect response missing content")
    return content


def extract_json_object(text: str) -> dict[str, Any]:
    stripped = text.strip()
    if stripped.startswith("```"):
        stripped = re.sub(r"^```(?:json)?\s*", "", stripped)
        stripped = re.sub(r"\s*```$", "", stripped)
    try:
        data = json.loads(stripped)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", stripped, re.DOTALL)
        if not match:
            raise ValueError("llm response did not contain JSON object") from None
        data = json.loads(match.group(0))
    if not isinstance(data, dict):
        raise ValueError("llm response JSON must be an object")
    return data


def request_registry_draft(
    prompt: str,
    *,
    base_url: str | None = None,
    config: dict[str, Any] | None = None,
) -> dict[str, Any]:
    draft = extract_json_object(execute_prompt(prompt, base_url=base_url, config=config))
    validator = Draft202012Validator(load_draft_schema())
    errors = sorted(validator.iter_errors(draft), key=lambda err: list(err.path))
    if errors:
        messages = "; ".join(error.message for error in errors[:3])
        raise ValueError(f"draft schema validation failed: {messages}")
    return draft


def request_json_object(
    prompt: str,
    *,
    base_url: str | None = None,
    config: dict[str, Any] | None = None,
) -> dict[str, Any]:
    return extract_json_object(execute_prompt(prompt, base_url=base_url, config=config))
@@ -60,4 +60,31 @@ def parse_vector(vector: str) -> dict[str, str]:
def level_at_least(dimension: str, current: str, minimum: str) -> bool:
    order = LEVEL_ORDERS[dimension]
    return order.index(current) >= order.index(minimum)


def registry_paths(repo_root: Path) -> dict[str, Path]:
    registry = repo_root / "registry"
    return {
        "registry": registry,
        "capabilities": registry / "capabilities",
        "index": registry / "indexes" / "capabilities.yaml",
        "sources": registry / "federation" / "sources.yaml",
    }


def load_index_at(path: Path) -> dict[str, Any]:
    with path.open(encoding="utf-8") as handle:
        return yaml.safe_load(handle)


def entry_vector(front_matter: dict[str, Any]) -> str:
    discovery = front_matter["maturity"]["discovery"]["current"]
    availability = front_matter["maturity"]["availability"]["current"]
    completeness = front_matter["external_evidence"]["completeness"]["level"]
    reliability = front_matter["external_evidence"]["reliability"]["level"]
    return f"{discovery} / {availability} / {completeness} / {reliability}"


def vectors_match(index_vector: str, front_matter: dict[str, Any]) -> bool:
    return index_vector.replace(" ", "") == entry_vector(front_matter).replace(" ", "")
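A short, standalone sketch of how the two vector helpers above behave. The function bodies are copied from the hunk so the snippet runs without the package; the sample front matter is invented for illustration and contains only the keys `entry_vector` reads.

```python
from typing import Any


def entry_vector(front_matter: dict[str, Any]) -> str:
    # Assemble "D / A / C / R" from the four maturity fields of an entry.
    discovery = front_matter["maturity"]["discovery"]["current"]
    availability = front_matter["maturity"]["availability"]["current"]
    completeness = front_matter["external_evidence"]["completeness"]["level"]
    reliability = front_matter["external_evidence"]["reliability"]["level"]
    return f"{discovery} / {availability} / {completeness} / {reliability}"


def vectors_match(index_vector: str, front_matter: dict[str, Any]) -> bool:
    # Spaces are ignored, so "D3/A0/C0/R0" and "D3 / A0 / C0 / R0" compare equal.
    return index_vector.replace(" ", "") == entry_vector(front_matter).replace(" ", "")


# Hypothetical entry front matter, reduced to the fields used above.
sample = {
    "maturity": {
        "discovery": {"current": "D3"},
        "availability": {"current": "A0"},
    },
    "external_evidence": {
        "completeness": {"level": "C0"},
        "reliability": {"level": "R0"},
    },
}

same_compact = vectors_match("D3/A0/C0/R0", sample)
drifted = not vectors_match("D2 / A0 / C0 / R0", sample)
```

Because the comparison strips spaces, a formatting difference alone is not reported as drift; only a change in one of the four levels is.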
273
reuse_surface/registry_update.py
Normal file
@@ -0,0 +1,273 @@
from __future__ import annotations

import json
import subprocess
import textwrap
from pathlib import Path
from typing import Any

import yaml

from reuse_surface.llm_bridge import request_json_object
from reuse_surface.registry import (
    entry_vector,
    load_index_at,
    parse_front_matter,
    registry_paths,
    vectors_match,
)

SAFE_EVIDENCE_PREFIXES = ("tests/", ".gitea/workflows/")


def git_changed_files(repo_root: Path, since_ref: str) -> list[str]:
    result = subprocess.run(
        ["git", "-C", str(repo_root), "diff", "--name-only", since_ref, "HEAD"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        raise ValueError(result.stderr.strip() or f"git diff failed for {since_ref}")
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def collect_deterministic_suggestions(
    repo_root: Path,
    *,
    capability_id: str | None = None,
    git_since: str | None = None,
) -> list[dict[str, Any]]:
    paths = registry_paths(repo_root)
    if not paths["index"].exists():
        raise ValueError("registry index missing; run establish --scaffold first")

    index = load_index_at(paths["index"])
    rows = index.get("capabilities", [])
    if capability_id:
        rows = [row for row in rows if row["id"] == capability_id]
        if not rows:
            raise ValueError(f"capability not in index: {capability_id}")

    changed_files = git_changed_files(repo_root, git_since) if git_since else []
    suggestions: list[dict[str, Any]] = []

    for row in rows:
        entry_path = repo_root / row["path"]
        if not entry_path.exists():
            suggestions.append(
                {
                    "capability_id": row["id"],
                    "kind": "missing_entry",
                    "detail": f"missing file {row['path']}",
                }
            )
            continue

        front_matter = parse_front_matter(entry_path)
        if not vectors_match(row["vector"], front_matter):
            suggestions.append(
                {
                    "capability_id": row["id"],
                    "kind": "vector_drift",
                    "detail": "index vector differs from entry front matter",
                    "index_vector": row["vector"],
                    "entry_vector": entry_vector(front_matter),
                    "apply_patch": {
                        "field": "index.vector",
                        "value": entry_vector(front_matter),
                    },
                }
            )

        evidence_tests = front_matter.get("evidence", {}).get("tests", [])
        for changed in changed_files:
            if changed.startswith("tests/") and changed not in evidence_tests:
                suggestions.append(
                    {
                        "capability_id": row["id"],
                        "kind": "evidence_test",
                        "detail": f"new test file not cited: {changed}",
                        "apply_patch": {
                            "field": "evidence.tests",
                            "append": changed,
                        },
                    }
                )

        artifacts = front_matter.get("availability", {}).get("current_artifacts", [])
        for changed in changed_files:
            if changed.endswith(".py") and changed.startswith(
                tuple(
                    p.name + "/"
                    for p in repo_root.iterdir()
                    if p.is_dir() and (p / "__init__.py").exists()
                )
            ):
                if changed not in artifacts:
                    suggestions.append(
                        {
                            "capability_id": row["id"],
                            "kind": "availability_artifact",
                            "detail": f"changed module not cited: {changed}",
                            "apply_patch": {
                                "field": "availability.current_artifacts",
                                "append": changed,
                            },
                        }
                    )

    return suggestions


def apply_deterministic_suggestions(
    repo_root: Path,
    suggestions: list[dict[str, Any]],
) -> list[str]:
    paths = registry_paths(repo_root)
    index = load_index_at(paths["index"])
    index_by_id = {row["id"]: row for row in index.get("capabilities", [])}
    changed: list[str] = []

    entry_cache: dict[str, dict[str, Any]] = {}
    entry_paths: dict[str, Path] = {}

    for suggestion in suggestions:
        patch = suggestion.get("apply_patch")
        if not patch:
            continue
        cap_id = suggestion["capability_id"]
        if patch["field"] == "index.vector" and cap_id in index_by_id:
            index_by_id[cap_id]["vector"] = patch["value"]
            changed.append(f"index vector for {cap_id}")

        row = index_by_id.get(cap_id)
        if not row:
            continue
        entry_path = repo_root / row["path"]
        if cap_id not in entry_cache:
            entry_cache[cap_id] = parse_front_matter(entry_path)
            entry_paths[cap_id] = entry_path

        front_matter = entry_cache[cap_id]
        if patch["field"] == "evidence.tests":
            tests = front_matter.setdefault("evidence", {}).setdefault("tests", [])
            if patch["append"] not in tests:
                tests.append(patch["append"])
                changed.append(f"{cap_id} evidence.tests += {patch['append']}")
        if patch["field"] == "availability.current_artifacts":
            artifacts = front_matter.setdefault("availability", {}).setdefault(
                "current_artifacts", []
            )
            if patch["append"] not in artifacts:
                artifacts.append(patch["append"])
                changed.append(
                    f"{cap_id} availability.current_artifacts += {patch['append']}"
                )

    if changed:
        paths["index"].write_text(
            yaml.safe_dump(index, sort_keys=False, allow_unicode=True),
            encoding="utf-8",
        )
        for cap_id, front_matter in entry_cache.items():
            _write_front_matter(entry_paths[cap_id], front_matter)
    return changed


def _write_front_matter(path: Path, front_matter: dict[str, Any]) -> None:
    text = path.read_text(encoding="utf-8")
    marker_end = text.find("\n---", 4)
    body = text[marker_end + 4 :] if marker_end != -1 else "\n"
    path.write_text(
        "---\n"
        + yaml.safe_dump(front_matter, sort_keys=False, allow_unicode=True)
        + "---"
        + body,
        encoding="utf-8",
    )


def build_update_prompt(
    repo_root: Path,
    capability_id: str,
    *,
    git_since: str | None = None,
) -> str:
    paths = registry_paths(repo_root)
    index = load_index_at(paths["index"])
    row = next((item for item in index["capabilities"] if item["id"] == capability_id), None)
    if not row:
        raise ValueError(f"capability not in index: {capability_id}")
    entry = parse_front_matter(repo_root / row["path"])
    diff = ""
    if git_since:
        proc = subprocess.run(
            [
                "git",
                "-C",
                str(repo_root),
                "diff",
                git_since,
                "HEAD",
                "--",
                "registry/",
                "reuse_surface/",
                "tests/",
            ],
            capture_output=True,
            text=True,
            check=False,
        )
        diff = proc.stdout[:12000]

    return textwrap.dedent(
        f"""
        Suggest registry entry updates for capability `{capability_id}`.

        Return ONLY JSON:
        {{
          "promotion_history": [
            {{"date": "YYYY-MM-DD", "dimension": "availability", "from": "A3", "to": "A4", "rationale": "..."}}
          ],
          "consumer_feedback": ["optional string notes"],
          "notes": ["human review items"]
        }}

        Current entry YAML:
        {yaml.safe_dump(entry, sort_keys=False)}

        Git diff since {git_since or 'N/A'}:
        {diff or '(none)'}
        """
    ).strip()


def suggest_llm_updates(
    repo_root: Path,
    capability_id: str,
    *,
    git_since: str | None = None,
    llm_url: str | None = None,
) -> dict[str, Any]:
    prompt = build_update_prompt(repo_root, capability_id, git_since=git_since)
    return request_json_object(
        prompt,
        base_url=llm_url,
        config={"temperature": 0.2, "max_tokens": 2000},
    )


def format_suggestions_markdown(suggestions: list[dict[str, Any]]) -> str:
    if not suggestions:
        return "# Registry update suggestions\n\n_No suggestions._\n"
    lines = ["# Registry update suggestions", ""]
    for item in suggestions:
        lines.append(f"- `{item['capability_id']}` **{item['kind']}**: {item['detail']}")
    lines.append("")
    lines.append(f"**{len(suggestions)}** suggestion(s). Use `--apply` to apply safe patches.")
    return "\n".join(lines) + "\n"


def format_suggestions_json(suggestions: list[dict[str, Any]]) -> str:
    return json.dumps({"count": len(suggestions), "suggestions": suggestions}, indent=2)
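The `apply_patch` payloads produced above carry a dotted `field` and an `append` value, and applying one twice should not duplicate the entry. A minimal sketch of that idempotent append follows, assuming the two-level `section.key` shape seen in `evidence.tests` and `availability.current_artifacts`. The helper `apply_append_patch` is hypothetical; the diff handles each field with its own branch instead.

```python
from typing import Any


def apply_append_patch(front_matter: dict[str, Any], patch: dict[str, str]) -> bool:
    """Append patch["append"] to the list at patch["field"]; return True if changed."""
    # Only two-level fields such as "evidence.tests" are handled in this sketch.
    section, key = patch["field"].split(".", 1)
    values = front_matter.setdefault(section, {}).setdefault(key, [])
    if patch["append"] in values:
        return False  # already cited, nothing to do
    values.append(patch["append"])
    return True


# Illustrative front matter and patch; the file names are taken from this commit.
entry: dict[str, Any] = {"evidence": {"tests": ["tests/test_stats.py"]}}
patch = {"field": "evidence.tests", "append": "tests/test_establish.py"}

first_run = apply_append_patch(entry, patch)
second_run = apply_append_patch(entry, patch)
```

Returning a changed flag mirrors how `apply_deterministic_suggestions` only records a change, and only rewrites files, when something was actually added.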
259
reuse_surface/stats.py
Normal file
@@ -0,0 +1,259 @@
from __future__ import annotations

import json
import urllib.error
import urllib.request
from collections import Counter
from pathlib import Path
from typing import Any

import yaml

from reuse_surface import hub_client
from reuse_surface.registry import (
    LEVEL_ORDERS,
    entry_vector,
    load_index_at,
    parse_front_matter,
    parse_vector,
    registry_paths,
    vectors_match,
)


def _histogram(values: list[str], order: list[str]) -> dict[str, int]:
    counts = Counter(values)
    return {level: counts.get(level, 0) for level in order if counts.get(level, 0)}


def _probe_url(url: str) -> dict[str, Any]:
    request = urllib.request.Request(
        url,
        method="HEAD",
        headers={"User-Agent": "reuse-surface/0.1"},
    )
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return {
                "url": url,
                "status": response.status,
                "content_type": response.headers.get("Content-Type", ""),
                "ok": response.status == 200,
            }
    except urllib.error.HTTPError as exc:
        return {
            "url": url,
            "status": exc.code,
            "content_type": exc.headers.get("Content-Type", ""),
            "ok": False,
        }
    except urllib.error.URLError as exc:
        return {"url": url, "status": None, "error": str(exc.reason), "ok": False}


def collect_stats(
    repo_root: Path,
    *,
    federation_ready: bool = False,
    raw_url: str | None = None,
    hub_url: str | None = None,
) -> dict[str, Any]:
    paths = registry_paths(repo_root)
    stats: dict[str, Any] = {
        "repo_root": str(repo_root),
        "registry_present": paths["registry"].exists(),
        "index_present": paths["index"].exists(),
        "sources_present": paths["sources"].exists(),
        "capability_count": 0,
        "histograms": {},
        "reliability": {"r0_r2": 0, "r3_plus": 0},
        "consumption_modes": {},
        "vector_drift": [],
        "federation": {},
        "hub": {},
    }

    if not paths["index"].exists():
        if federation_ready and raw_url:
            stats["federation"]["raw_url_probe"] = _probe_url(raw_url)
        if hub_url or _hub_configured():
            stats["hub"] = _hub_summary(hub_url)
        return stats

    index = load_index_at(paths["index"])
    capabilities = index.get("capabilities", [])
    stats["capability_count"] = len(capabilities)
    stats["domain"] = index.get("domain")

    discovery: list[str] = []
    availability: list[str] = []
    completeness: list[str] = []
    reliability: list[str] = []
    mode_counts: Counter[str] = Counter()

    for row in capabilities:
        vector = parse_vector(row["vector"])
        discovery.append(vector["discovery"])
        availability.append(vector["availability"])
        completeness.append(vector["completeness"])
        reliability.append(vector["reliability"])
        for mode in row.get("consumption_modes", []):
            mode_counts[mode] += 1

        entry_path = repo_root / row["path"]
        if entry_path.exists():
            try:
                front_matter = parse_front_matter(entry_path)
                if not vectors_match(row["vector"], front_matter):
                    stats["vector_drift"].append(
                        {
                            "id": row["id"],
                            "index_vector": row["vector"],
                            "entry_vector": entry_vector(front_matter),
                        }
                    )
            except ValueError:
                stats["vector_drift"].append(
                    {"id": row["id"], "error": "invalid entry front matter"}
                )

    stats["histograms"] = {
        "discovery": _histogram(discovery, LEVEL_ORDERS["discovery"]),
        "availability": _histogram(availability, LEVEL_ORDERS["availability"]),
        "completeness": _histogram(completeness, LEVEL_ORDERS["completeness"]),
        "reliability": _histogram(reliability, LEVEL_ORDERS["reliability"]),
    }
    stats["reliability"] = {
        "r0_r2": sum(1 for level in reliability if level in {"R0", "R1", "R2"}),
        "r3_plus": sum(1 for level in reliability if level_at_least_reliability(level, "R3")),
    }
    stats["consumption_modes"] = dict(sorted(mode_counts.items()))

    if federation_ready:
        probe_url = raw_url
        if not probe_url and paths["index"].exists():
            probe_url = _default_raw_url(repo_root)
        if probe_url:
            stats["federation"]["raw_url_probe"] = _probe_url(probe_url)
        stats["federation"]["index_valid_yaml"] = _index_yaml_valid(paths["index"])

    stats["hub"] = _hub_summary(hub_url)
    return stats


def level_at_least_reliability(current: str, minimum: str) -> bool:
    order = LEVEL_ORDERS["reliability"]
    return order.index(current) >= order.index(minimum)


def _hub_configured() -> bool:
    import os

    return bool(os.environ.get("REUSE_SURFACE_URL"))


def _hub_summary(hub_url: str | None) -> dict[str, Any]:
    try:
        status, payload = hub_client.hub_list(hub_url)
    except (ValueError, urllib.error.URLError, OSError):
        return {"configured": False}
    if status != 200:
        return {"configured": True, "status": status, "error": payload}
    repos = payload.get("repos", [])
    return {
        "configured": True,
        "registration_count": payload.get("count", len(repos)),
        "enabled_count": sum(1 for repo in repos if repo.get("enabled", True)),
    }


def _default_raw_url(repo_root: Path) -> str | None:
    return None


def _index_yaml_valid(index_path: Path) -> bool:
    try:
        data = load_index_at(index_path)
        return isinstance(data, dict) and "capabilities" in data
    except (OSError, yaml.YAMLError):
        return False


def format_stats_markdown(stats: dict[str, Any]) -> str:
    lines = ["# Registry stats", ""]
    lines.append(f"**Repo:** `{stats['repo_root']}`")
    lines.append(f"**Capabilities:** {stats['capability_count']}")
    if stats.get("domain"):
        lines.append(f"**Domain:** `{stats['domain']}`")
    lines.append("")

    lines.append("## Layout")
    lines.append(f"- registry present: `{stats['registry_present']}`")
    lines.append(f"- index present: `{stats['index_present']}`")
    lines.append(f"- federation sources present: `{stats['sources_present']}`")
    lines.append("")

    rel = stats["reliability"]
    lines.append("## Reliability bands (index vectors)")
    lines.append(f"- R0–R2: **{rel['r0_r2']}**")
    lines.append(f"- R3+: **{rel['r3_plus']}**")
    lines.append("")

    for dimension, histogram in stats.get("histograms", {}).items():
        if not histogram:
            continue
        lines.append(f"## {dimension.title()} histogram")
        for level, count in histogram.items():
            lines.append(f"- `{level}`: {count}")
        lines.append("")

    if stats.get("consumption_modes"):
        lines.append("## Consumption modes")
        for mode, count in stats["consumption_modes"].items():
            lines.append(f"- `{mode}`: {count}")
        lines.append("")

    drift = stats.get("vector_drift", [])
    lines.append(f"## Vector drift: **{len(drift)}**")
    for item in drift[:10]:
        if "error" in item:
            lines.append(f"- `{item['id']}`: {item['error']}")
        else:
            lines.append(
                f"- `{item['id']}`: index `{item['index_vector']}` "
                f"≠ entry `{item['entry_vector']}`"
            )
    if len(drift) > 10:
        lines.append(f"- … and {len(drift) - 10} more")
    lines.append("")

    federation = stats.get("federation", {})
    if federation:
        lines.append("## Federation readiness")
        if "index_valid_yaml" in federation:
            lines.append(f"- index valid YAML: `{federation['index_valid_yaml']}`")
        probe = federation.get("raw_url_probe")
        if probe:
            status = probe.get("status")
            ok = probe.get("ok")
            lines.append(f"- raw URL probe: status **{status}** ({'ok' if ok else 'fail'})")
            lines.append(f" `{probe.get('url', '')}`")
        lines.append("")

    hub = stats.get("hub", {})
    if hub.get("configured"):
        lines.append("## Hub")
        if "registration_count" in hub:
            lines.append(
                f"- registrations: **{hub['registration_count']}** "
                f"({hub.get('enabled_count', 0)} enabled)"
            )
        elif "error" in hub:
            lines.append(f"- hub error: {hub['error']}")
        lines.append("")

    return "\n".join(lines) + "\n"


def format_stats_json(stats: dict[str, Any]) -> str:
    return json.dumps(stats, indent=2, sort_keys=True)
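The histogram and reliability bands computed in `collect_stats` can be tried on a handful of levels. This is a minimal sketch under one assumption: `LEVEL_ORDERS["reliability"]` is not shown in this diff, so the order `R0` to `R6` is inferred from the `R[0-6]` pattern in the draft schema.

```python
from collections import Counter

# Assumed ordering; the real LEVEL_ORDERS table is defined outside this diff.
RELIABILITY_ORDER = [f"R{n}" for n in range(7)]


def histogram(values: list[str], order: list[str]) -> dict[str, int]:
    # Count each level, keep the canonical order, and drop levels with no hits.
    counts = Counter(values)
    return {level: counts.get(level, 0) for level in order if counts.get(level, 0)}


def at_least(current: str, minimum: str, order: list[str]) -> bool:
    # A level qualifies when its position in the order is at or past the minimum.
    return order.index(current) >= order.index(minimum)


# Invented sample of reliability levels taken from index vectors.
levels = ["R0", "R3", "R0", "R1", "R4"]

hist = histogram(levels, RELIABILITY_ORDER)
r0_r2 = sum(1 for level in levels if level in {"R0", "R1", "R2"})
r3_plus = sum(1 for level in levels if at_least(level, "R3", RELIABILITY_ORDER))
```

Iterating over the order rather than over the counter keeps the output sorted by maturity level regardless of the order in which capabilities appear in the index.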
69
schemas/registry-draft.schema.json
Normal file
@@ -0,0 +1,69 @@
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://reuse-surface.local/schemas/registry-draft.schema.json",
  "title": "RegistryDiscoveryDraft",
  "type": "object",
  "additionalProperties": false,
  "required": ["capabilities"],
  "properties": {
    "domain": {
      "type": "string"
    },
    "capabilities": {
      "type": "array",
      "items": {
        "type": "object",
        "additionalProperties": false,
        "required": ["id", "name", "summary"],
        "properties": {
          "id": {
            "type": "string",
            "pattern": "^capability\\.[a-z0-9]+(\\.[a-z0-9-]+)+$"
          },
          "name": {
            "type": "string",
            "minLength": 1
          },
          "summary": {
            "type": "string",
            "minLength": 1
          },
          "owner": {
            "type": "string"
          },
          "vector": {
            "type": "string",
            "pattern": "^D[0-7] / A[0-7] / C[0-6] / R[0-6]$"
          },
          "tags": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "consumption_modes": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "discovery_intent": {
            "type": "string"
          },
          "discovery_includes": {
            "type": "array",
            "items": {
              "type": "string"
            }
          },
          "discovery_excludes": {
            "type": "array",
            "items": {
              "type": "string"
            }
          }
        }
      }
    }
  }
}
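The two `pattern` constraints in the schema above can be checked in isolation with the standard library. The sketch below is not a replacement for the `jsonschema` validation done in `request_registry_draft`; it covers only the required fields and the `id` and `vector` patterns, and the helper name `check_capability` is hypothetical.

```python
import re
from typing import Any

# Patterns copied from schemas/registry-draft.schema.json.
ID_PATTERN = re.compile(r"^capability\.[a-z0-9]+(\.[a-z0-9-]+)+$")
VECTOR_PATTERN = re.compile(r"^D[0-7] / A[0-7] / C[0-6] / R[0-6]$")


def check_capability(item: dict[str, Any]) -> list[str]:
    """Return human-readable problems for one draft capability (partial check)."""
    problems = []
    for field in ("id", "name", "summary"):
        if not item.get(field):
            problems.append(f"missing {field}")
    if item.get("id") and not ID_PATTERN.search(item["id"]):
        problems.append("id does not match pattern")
    if "vector" in item and not VECTOR_PATTERN.search(item["vector"]):
        problems.append("vector does not match pattern")
    return problems


good = check_capability(
    {
        "id": "capability.demo.sample",
        "name": "Sample",
        "summary": "Demo capability",
        "vector": "D2 / A0 / C0 / R0",
    }
)
# Two-segment id and a compact vector: both are rejected by the patterns.
bad = check_capability(
    {"id": "capability.demo", "name": "Sample", "summary": "Demo", "vector": "D2/A0/C0/R0"}
)
```

Note that the schema requires the spaced `D2 / A0 / C0 / R0` form, whereas `vectors_match` in `registry.py` ignores spaces when comparing.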
77
tests/test_establish.py
Normal file
@@ -0,0 +1,77 @@
from __future__ import annotations

from pathlib import Path
from unittest.mock import patch

import yaml

from reuse_surface.establish import (
    discover_capabilities,
    publish_check,
    scaffold_registry,
)
from reuse_surface.registry import registry_paths


def test_scaffold_creates_layout(tmp_path: Path):
    created = scaffold_registry(tmp_path, domain="helix_forge")
    paths = registry_paths(tmp_path)
    assert paths["index"] in created
    data = yaml.safe_load(paths["index"].read_text(encoding="utf-8"))
    assert data["capabilities"] == []
    assert data["domain"] == "helix_forge"


def test_scaffold_refuses_existing_without_force(tmp_path: Path):
    scaffold_registry(tmp_path)
    try:
        scaffold_registry(tmp_path)
        raise AssertionError("expected ValueError")
    except ValueError as exc:
        assert "already exists" in str(exc)


def test_publish_check_local_index(tmp_path: Path):
    scaffold_registry(tmp_path)
    result = publish_check(tmp_path)
    assert result["ok"] is True
    assert any(check["name"] == "local_index_yaml" for check in result["checks"])


def test_publish_check_raw_url_fail(tmp_path: Path):
    with patch(
        "reuse_surface.establish._probe_raw_url",
        return_value={"ok": False, "status": 303, "content_type": "text/html"},
    ):
        result = publish_check(
            tmp_path,
            raw_url="https://example.com/capabilities.yaml",
        )
    assert result["ok"] is False
    assert result.get("remediation")


def test_discover_dry_run_mock_llm(tmp_path: Path):
    scaffold_registry(tmp_path)
    (tmp_path / "README.md").write_text("# Demo service\n", encoding="utf-8")
    draft = {
        "domain": "helix_forge",
        "capabilities": [
            {
                "id": "capability.demo.sample",
                "name": "Sample",
                "summary": "Sample capability.",
                "owner": "demo",
                "vector": "D2 / A0 / C0 / R0",
                "tags": ["demo"],
                "consumption_modes": ["informational"],
                "discovery_intent": "Enable demo planning.",
            }
        ],
    }
    with patch(
        "reuse_surface.establish.request_registry_draft",
        return_value=draft,
    ):
        result = discover_capabilities(tmp_path, dry_run=True, apply=False)
    assert result["draft"]["capabilities"][0]["id"] == "capability.demo.sample"
53
tests/test_llm_bridge.py
Normal file
@@ -0,0 +1,53 @@
from __future__ import annotations

import json
from unittest.mock import patch

import pytest

from reuse_surface.llm_bridge import (
    extract_json_object,
    llm_connect_url,
    request_registry_draft,
)


def test_extract_json_object_from_fenced_block():
    data = extract_json_object('```json\n{"capabilities": []}\n```')
    assert data == {"capabilities": []}


def test_llm_connect_url_missing_raises():
    with pytest.raises(ValueError, match="LLM_CONNECT_URL"):
        llm_connect_url(None)


def test_request_registry_draft_mock_http():
    payload = {
        "content": json.dumps(
            {
                "capabilities": [
                    {
                        "id": "capability.demo.sample",
                        "name": "Sample",
                        "summary": "Demo capability",
                    }
                ]
            }
        )
    }

    class FakeResponse:
        def __enter__(self):
            return self

        def __exit__(self, *args):
            return False

        def read(self):
            return json.dumps(payload).encode("utf-8")

    with patch.dict("os.environ", {"LLM_CONNECT_URL": "http://llm.test"}):
        with patch("urllib.request.urlopen", return_value=FakeResponse()):
            draft = request_registry_draft("test prompt")
    assert draft["capabilities"][0]["id"] == "capability.demo.sample"
87
tests/test_registry_update.py
Normal file
@@ -0,0 +1,87 @@
from __future__ import annotations

from pathlib import Path

import yaml

from reuse_surface.establish import scaffold_registry
from reuse_surface.registry import load_index_at, registry_paths
from reuse_surface.registry_update import (
    apply_deterministic_suggestions,
    collect_deterministic_suggestions,
)


def _write_minimal_entry(tmp_path: Path, cap_id: str, vector: str) -> str:
    rel = "registry/capabilities/capability-demo-sample.md"
    d, a, c, r = [part.strip() for part in vector.split("/")]
    front_matter = {
        "id": cap_id,
        "name": "Sample",
        "summary": "Sample",
        "owner": "demo",
        "status": "draft",
        "domain": "helix_forge",
        "tags": ["demo"],
        "maturity": {
            "discovery": {"current": d, "target": "D5", "confidence": "low"},
            "availability": {"current": a, "target": "A3", "confidence": "low"},
        },
        "external_evidence": {
            "completeness": {"level": c, "confidence": "low"},
            "reliability": {"level": r, "confidence": "low"},
        },
        "discovery": {"intent": "demo", "includes": [], "excludes": []},
        "availability": {
            "current_level": a,
            "target_level": "A3",
            "current_artifacts": [],
            "consumption_modes": ["informational"],
        },
        "relations": {"depends_on": [], "supports": [], "related_to": []},
        "evidence": {"documentation": [], "tests": []},
        "consumer_guidance": {
            "recommended_for": [],
            "not_recommended_for": [],
            "known_limitations": [],
        },
    }
    path = tmp_path / rel
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(
        "---\n"
        + yaml.safe_dump(front_matter, sort_keys=False)
        + "---\n",
        encoding="utf-8",
    )
    return rel


def test_vector_drift_suggestion(tmp_path: Path):
    scaffold_registry(tmp_path)
    cap_id = "capability.demo.sample"
    rel = _write_minimal_entry(tmp_path, cap_id, "D3 / A0 / C0 / R0")
    index_path = registry_paths(tmp_path)["index"]
    index = load_index_at(index_path)
    index["capabilities"] = [
        {
            "id": cap_id,
            "name": "Sample",
            "summary": "Sample",
            "vector": "D2 / A0 / C0 / R0",
            "domain": "helix_forge",
            "status": "draft",
            "owner": "demo",
            "path": rel,
            "tags": ["demo"],
            "consumption_modes": ["informational"],
        }
    ]
    index_path.write_text(yaml.safe_dump(index, sort_keys=False), encoding="utf-8")

    suggestions = collect_deterministic_suggestions(tmp_path, capability_id=cap_id)
    assert any(item["kind"] == "vector_drift" for item in suggestions)
    changed = apply_deterministic_suggestions(tmp_path, suggestions)
    assert changed
    updated = load_index_at(index_path)
    assert updated["capabilities"][0]["vector"] == "D3 / A0 / C0 / R0"
20
tests/test_stats.py
Normal file
@@ -0,0 +1,20 @@
from __future__ import annotations

from pathlib import Path

from reuse_surface.stats import collect_stats, format_stats_markdown


def test_collect_stats_on_repo_root():
    root = Path(__file__).resolve().parent.parent
    stats = collect_stats(root)
    assert stats["capability_count"] == 20
    assert stats["index_present"] is True
    assert "discovery" in stats["histograms"]


def test_format_stats_markdown_contains_count():
    root = Path(__file__).resolve().parent.parent
    text = format_stats_markdown(collect_stats(root))
    assert "Capabilities:" in text
    assert "20" in text
@@ -104,6 +104,45 @@ reuse-surface hub sync --dry-run

Run the service locally: `REUSE_SURFACE_TOKEN=dev-token reuse-surface serve`

### stats

Registry maturity aggregates and federation readiness.

```bash
reuse-surface stats
reuse-surface stats --format json
reuse-surface stats --federation-ready --raw-url https://.../capabilities.yaml
```
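
As a rough illustration of what a maturity aggregate can look like, the sketch below counts discovery levels from index vectors such as `D3 / A0 / C0 / R0`. The helper name `discovery_histogram` and the row shape are assumptions made for this example; it is not the actual `reuse_surface.stats` implementation.

```python
from collections import Counter


def discovery_histogram(rows: list[dict]) -> dict[str, int]:
    """Count capabilities per discovery level (hypothetical helper)."""
    levels = Counter()
    for row in rows:
        # Vectors look like "D3 / A0 / C0 / R0"; the first segment is
        # the discovery level. Rows without a vector are skipped.
        parts = [part.strip() for part in row.get("vector", "").split("/")]
        if parts and parts[0]:
            levels[parts[0]] += 1
    # Sort by level name so the output order is stable.
    return dict(sorted(levels.items()))
```

The same counting approach would extend to the availability, completeness, and reliability segments by picking a different index into `parts`.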

### establish

Bootstrap or discover a capability registry in the current or target repo.

```bash
reuse-surface establish --scaffold --domain helix_forge
reuse-surface establish --scaffold --path ../state-hub
reuse-surface establish --publish-check --raw-url https://.../capabilities.yaml
export LLM_CONNECT_URL=http://127.0.0.1:8088
reuse-surface establish --discover --dry-run
reuse-surface establish --discover --apply
```

`--scaffold` creates `registry/` layout. `--publish-check` probes raw URL and
local index YAML. `--discover` drafts capabilities via llm-connect (optional).
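
To make the raw URL probe more concrete, here is a minimal sketch of how a check could tell a direct `200` response apart from a `303` redirect using only the standard library. The names `NoRedirect`, `probe_status`, and `classify_status`, as well as the labels returned, are assumptions for illustration and may differ from what `establish --publish-check` does internally.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError with the 3xx code.
        return None


def probe_status(url: str, timeout: float = 10.0) -> int:
    """Return the first HTTP status for url without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def classify_status(status: int) -> str:
    """Map a status code to a coarse label (labels are hypothetical)."""
    if status == 200:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    return "error"
```

Keeping the classification separate from the network call means the decision logic can be tested with plain integers, while `probe_status` can be exercised against a mocked or local server.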

### update

Refresh registry metadata from repo drift signals.

```bash
reuse-surface update --capability capability.registry.register --dry-run
reuse-surface update --all --from-git-since HEAD~5 --apply
reuse-surface update --capability capability.registry.register --suggest-maturity
```

Deterministic patches (`vector_drift`, new `tests/` citations) apply with
`--apply`. LLM suggestions use `--suggest-maturity` and remain review-only.
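
A `vector_drift` check can be pictured as comparing the vector recorded in the index with the one implied by an entry's front matter. The sketch below assumes the field layout used by the `_write_minimal_entry` test fixture; the function names and the shape of the returned suggestion (beyond the `kind` key) are hypothetical and not taken from `registry_update.py`.

```python
from typing import Optional


def entry_vector(front_matter: dict) -> str:
    """Build "D / A / C / R" from an entry's front matter."""
    maturity = front_matter["maturity"]
    evidence = front_matter["external_evidence"]
    return " / ".join(
        [
            maturity["discovery"]["current"],
            maturity["availability"]["current"],
            evidence["completeness"]["level"],
            evidence["reliability"]["level"],
        ]
    )


def detect_vector_drift(front_matter: dict, index_row: dict) -> Optional[dict]:
    """Return a suggestion when the index vector lags the entry, else None."""
    expected = entry_vector(front_matter)
    if index_row.get("vector") == expected:
        return None
    return {
        "kind": "vector_drift",
        "id": index_row.get("id"),
        "from": index_row.get("vector"),  # value currently in the index
        "to": expected,                   # value implied by the entry
    }
```

Because the comparison is purely structural, a patch of this kind is deterministic, which is presumably why it can be applied with `--apply` while LLM suggestions stay review-only.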

### report cohorts

Export capability cohorts for planning or implementation reuse decisions.
@@ -140,6 +179,11 @@ Stable IDs and maturity fields are preserved for agent consumption (UC-RS-019).
| Publish catalog | `reuse-surface catalog` |
| Compose federation | `reuse-surface federation compose` |
| Sync federation manifest from hub | `reuse-surface hub sync` |
| Registry stats | `reuse-surface stats` |
| Bootstrap sibling registry | `reuse-surface establish --scaffold` |
| Verify index publish URL | `reuse-surface establish --publish-check` |
| Draft capabilities (LLM) | `reuse-surface establish --discover` |
| Refresh entry metadata | `reuse-surface update` |
| Planning cohort export | `reuse-surface report cohorts` |
| Relation graph | `reuse-surface graph` |

@@ -4,11 +4,11 @@ type: workplan
title: "Registry establish, update, and stats with optional llm-connect assist"
domain: helix_forge
repo: reuse-surface
-status: ready
+status: finished
owner: codex
topic_slug: helix-forge
created: "2026-06-16"
-updated: "2026-06-16"
+updated: "2026-06-17"
state_hub_workstream_id: "239a0077-8593-4dc7-918d-4c23895275f6"
---

@@ -91,7 +91,7 @@ reuse-surface update --from-git-since HEAD~5 --apply

```task
id: REUSE-WP-0013-T01
-status: todo
+status: done
priority: high
state_hub_task_id: "98e65330-bfc7-4282-b372-d35542b899ce"
```
@@ -112,7 +112,7 @@ Output: Markdown default, `--format json`. Pytest coverage. Document in

```task
id: REUSE-WP-0013-T02
-status: todo
+status: done
priority: high
state_hub_task_id: "b8fedd87-d0d3-41b4-9af8-e36d52bfe1c5"
```
@@ -131,7 +131,7 @@ No llm-connect dependency. Pytest with temp directory.

```task
id: REUSE-WP-0013-T03
-status: todo
+status: done
priority: medium
state_hub_task_id: "2924d685-709f-4e28-886f-b363cd9c40b4"
```
@@ -147,7 +147,7 @@ Federation publish helper for sibling repo operators:

```task
id: REUSE-WP-0013-T04
-status: todo
+status: done
priority: high
state_hub_task_id: "650ebee5-b34b-4ed8-891d-d93aacebadd7"
```
@@ -166,7 +166,7 @@ Thin client boundary:

```task
id: REUSE-WP-0013-T05
-status: todo
+status: done
priority: medium
state_hub_task_id: "b9154889-f538-4266-9918-b277f9a297be"
```
@@ -185,7 +185,7 @@ LLM-assisted bootstrap after `--scaffold` or on empty registry:

```task
id: REUSE-WP-0013-T06
-status: todo
+status: done
priority: medium
state_hub_task_id: "b79558da-54b2-4712-91d2-b298c7cf2c40"
```
@@ -210,7 +210,7 @@ Targets: single `--capability`, `--all`, `--from-git-since <ref>`.

```task
id: REUSE-WP-0013-T07
-status: todo
+status: done
priority: low
state_hub_task_id: "a55a2f26-004e-4c20-90cb-49bed64a1291"
```
@@ -227,13 +227,20 @@ state_hub_task_id: "a55a2f26-004e-4c20-90cb-49bed64a1291"

## Acceptance

-- [ ] `reuse-surface stats` reports maturity and federation-readiness aggregates
-- [ ] `establish --scaffold` creates valid empty registry layout without overwrite accidents
-- [ ] `establish --publish-check` detects 303 vs 200 raw URL outcomes
-- [ ] llm-connect bridge works with mocked HTTP; fails clearly when URL unset
-- [ ] `establish --discover --dry-run` produces schema-valid draft JSON from fixture context
-- [ ] `update --dry-run` reports deterministic drift on sample repo
-- [ ] All new commands documented; gap priority 24 recorded
+- [x] `reuse-surface stats` reports maturity and federation-readiness aggregates
+- [x] `establish --scaffold` creates valid empty registry layout without overwrite accidents
+- [x] `establish --publish-check` detects 303 vs 200 raw URL outcomes
+- [x] llm-connect bridge works with mocked HTTP; fails clearly when URL unset
+- [x] `establish --discover --dry-run` produces schema-valid draft JSON from fixture context
+- [x] `update --dry-run` reports deterministic drift on sample repo
+- [x] All new commands documented; gap priority 24 recorded
+
+## Completion notes (2026-06-17)
+
+- Modules: `stats.py`, `establish.py`, `registry_update.py`, `llm_bridge.py`
+- Schema: `schemas/registry-draft.schema.json`
+- `validate --root` for sibling repo validation after establish --apply
+- 43 pytest tests; optional `pip install -e ".[llm]"` extra

## Out of scope
