generated from coulomb/repo-seed
Complete REUSE-WP-0004: CI, overlap detection, and catalog generation
Some checks failed
ci / validate-registry (push) Has been cancelled
Add Gitea CI workflow for registry validation, reuse-surface overlaps and catalog commands, generated catalog artifacts, and documentation updates closing gap analysis priorities 9-11.
25
.gitea/workflows/ci.yml
Normal file
@@ -0,0 +1,25 @@
+name: ci
+
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+
+jobs:
+  validate-registry:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Check out source
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Install package
+        run: python -m pip install -e .
+
+      - name: Validate capability registry
+        run: reuse-surface validate
@@ -124,6 +124,10 @@ artifacts.
 # Registry validation (schema + index drift)
 .venv/bin/reuse-surface validate
 
+# Overlap and catalog generation
+.venv/bin/reuse-surface overlaps
+.venv/bin/reuse-surface catalog
+
 # Repository hygiene
 rg --files
 git diff --check
@@ -149,6 +153,9 @@ The generated instruction in older workplans says `make fix-consistency
 REPO=reuse-surface`; that is still valid when `uv` is installed and on PATH.
 On this workstation, the `.venv/bin/python` fallback has been verified.
 
+CI runs `reuse-surface validate` on push and pull requests via
+`.gitea/workflows/ci.yml`.
+
 ### Run
 
 There is no local service to run from this repository.
14
SCOPE.md
@@ -50,7 +50,9 @@ and agents can:
 `external_evidence.reliability`
 - **Validate entries automatically** with `reuse-surface validate`
 - **Export a machine-readable bundle** with `reuse-surface export`
-- **Avoid duplicates** by querying the index before creating new entries
+- **Detect overlap candidates** with `reuse-surface overlaps`
+- **Generate a human-readable catalog** with `reuse-surface catalog`
+- **Avoid duplicates** by querying the index and checking overlaps before adding entries
 
 Registry tooling availability is **A3** (CLI). The registry product itself is
 still documentation-first for authoring; consumption combines Markdown entries,
@@ -58,11 +60,10 @@ the index, and CLI automation.
 
 ## What Is Not Possible Yet
 
-- Generated human-readable catalog site
+- Interactive catalog site with live search beyond static HTML export
 - Capability graph visualization
-- Automated duplicate/overlap detection
 - Federation across repositories or organizations
-- CI integration or packaged releases beyond local `pip install -e .`
+- Packaged releases beyond local `pip install -e .` and Gitea CI validation
 
 See `tools/README.md` for command reference.
 
@@ -74,7 +75,9 @@ See `tools/README.md` for command reference.
 `pyproject.toml` and `reuse_surface/`.
 - `docs/CapabilityRegistryConcept.md` and `docs/IntentScopeGapAnalysis.md`
 document onboarding and intent-scope tracking.
-- Finished workplans: `REUSE-WP-0001`, `REUSE-WP-0002`, `REUSE-WP-0003`.
+- CI validates the registry on push/PR via `.gitea/workflows/ci.yml`.
+- Generated catalog: `docs/CapabilityCatalog.md` and `docs/catalog/index.html`.
+- Finished workplans: `REUSE-WP-0001` through `REUSE-WP-0004`.
 - **Self-assessed vector:** `D5 / A3 / C4 / R2` (see gap analysis).
 
 ## Repository Layout
@@ -105,6 +108,7 @@ reuse-surface/
 - Maturity standard: specs/CapabilityMaturityStandard.md
 - Registry index: registry/indexes/capabilities.yaml
 - Registry guidance: registry/README.md
+- Generated catalog: docs/CapabilityCatalog.md
 - CLI reference: tools/README.md
 - Agent instructions: AGENTS.md
 - Workplans: workplans/
78
docs/CapabilityCatalog.md
Normal file
@@ -0,0 +1,78 @@
+# Capability Catalog
+
+**Domain:** helix_forge
+**Updated:** 2026-06-15
+**Entries:** 6
+
+Generated by `reuse-surface catalog`. Do not edit manually.
+
+## helix_forge
+
+### Feature Availability Evaluation
+
+- **ID:** `capability.feature-control.evaluate`
+- **Vector:** D5 / A4 / C3 / R3
+- **Owner:** feature-control
+- **Path:** `registry/capabilities/capability.feature-control.evaluate.md`
+- **Summary:** Evaluate whether a feature is active, hidden, disabled, or unavailable for a subject in context.
+
+**Known limitations:**
+- bulk rule management is not yet covered
+- agent-specific simulation remains a known gap
+
+### Feature Rollout Control
+
+- **ID:** `capability.feature-control.rollout`
+- **Vector:** D4 / A2 / C2 / R1
+- **Owner:** feature-control
+- **Path:** `registry/capabilities/capability.feature-control.rollout.md`
+- **Summary:** Gradually expose features to subjects across tenants, domains, groups, or cohorts using rollout rules and staged availability.
+
+**Known limitations:**
+- distinguish carefully from capability.feature-control.evaluate
+
+### Identity Subject Resolution
+
+- **ID:** `capability.identity.subject-resolution`
+- **Vector:** D3 / A0 / C1 / R0
+- **Owner:** identity-canon
+- **Path:** `registry/capabilities/capability.identity.subject-resolution.md`
+- **Summary:** Resolve who or what is acting in a context by mapping principals, accounts, actors, and identifiers to a stable subject model.
+
+**Known limitations:**
+- resolver artifacts are not yet available
+
+### Identity Vocabulary Canonicalization
+
+- **ID:** `capability.identity.vocabulary-canonicalize`
+- **Vector:** D4 / A0 / C2 / R0
+- **Owner:** identity-canon
+- **Path:** `registry/capabilities/capability.identity.vocabulary-canonicalize.md`
+- **Summary:** Define and maintain an implementation-neutral vocabulary for identity-related concepts across overlapping domains.
+
+**Known limitations:**
+- source-note backfill is incomplete
+- mappings may remain candidate until evidence review completes
+
+### Capability Registration
+
+- **ID:** `capability.registry.register`
+- **Vector:** D3 / A3 / C2 / R2
+- **Owner:** reuse-surface
+- **Path:** `registry/capabilities/capability.registry.register.md`
+- **Summary:** Register a new capability so it becomes visible for planning and implementation reuse.
+
+**Known limitations:**
+- manual index updates are required after adding an entry
+- duplicate detection is guidance-only in the MVP
+
+### Workstream And Task Coordination
+
+- **ID:** `capability.statehub.workstream-coordinate`
+- **Vector:** D4 / A4 / C3 / R2
+- **Owner:** state-hub
+- **Path:** `registry/capabilities/capability.statehub.workstream-coordinate.md`
+- **Summary:** Track active workstreams, tasks, progress, and consistency across domain repositories through a local-first coordination service.
+
+**Known limitations:**
+- requires running State Hub locally or via tunnel
@@ -265,12 +265,18 @@ own evidence (e.g. feature-control at R3).
 
 ### Next recommended work
 
+| Priority | Gap | Outcome | Status |
+|---|---|---|---|
+| 9 | Catalog site | `reuse-surface catalog` → MD + HTML | Closed (WP-0004) |
+| 10 | Overlap detection | `reuse-surface overlaps` | Closed (WP-0004) |
+| 11 | CI validation | `.gitea/workflows/ci.yml` | Closed (WP-0004) |
+| 12 | Registry federation | Cross-repo capability index composition | Open |
+
 | Priority | Gap | Suggested outcome |
 |---|---|---|
-| 9 | Catalog site | Static browsable capability catalog (UC-RS-018) |
-| 10 | Overlap detection | CLI or report for duplicate/overlapping capabilities |
-| 11 | CI validation | Run `reuse-surface validate` in CI on registry changes |
-| 12 | Registry federation | Cross-repo capability index composition |
+| 13 | Interactive catalog | Searchable catalog UI beyond static HTML |
+| 14 | Graph visualization | Capability relation graphs |
+| 15 | Federation | Compose indexes across repositories |
 
 ---
 
@@ -289,4 +295,5 @@ own evidence (e.g. feature-control at R3).
 | Date | Change |
 |---|---|
 | 2026-06-15 | Initial analysis after REUSE-WP-0002 completion |
 | 2026-06-15 | REUSE-WP-0003 closed priority gaps 1–8; vector updated to D5/A3/C4/R2 |
+| 2026-06-15 | REUSE-WP-0004 closed priorities 9–11 (catalog, overlaps, CI) |
57
docs/catalog/index.html
Normal file
@@ -0,0 +1,57 @@
+<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="utf-8">
+<title>Capability Catalog — helix_forge</title>
+<style>
+body { font-family: system-ui, sans-serif; margin: 2rem; line-height: 1.5; }
+h1 { margin-bottom: 0.2rem; }
+.subtitle { color: #555; margin-bottom: 2rem; }
+section { margin-bottom: 2rem; }
+.card { border: 1px solid #ddd; border-radius: 8px; padding: 1rem; margin: 1rem 0; }
+.meta { color: #444; font-size: 0.95rem; }
+.path { font-size: 0.85rem; color: #666; }
+</style>
+</head>
+<body>
+<h1>Capability Catalog</h1>
+<p class="subtitle">Updated 2026-06-15 · 6 entries</p>
+<section><h2>helix_forge</h2>
+<article class="card">
+<h3>Feature Availability Evaluation</h3>
+<p class="meta"><code>capability.feature-control.evaluate</code> · D5 / A4 / C3 / R3</p>
+<p>Evaluate whether a feature is active, hidden, disabled, or unavailable for a subject in context.</p>
+<p class="path">registry/capabilities/capability.feature-control.evaluate.md</p>
+</article>
+<article class="card">
+<h3>Feature Rollout Control</h3>
+<p class="meta"><code>capability.feature-control.rollout</code> · D4 / A2 / C2 / R1</p>
+<p>Gradually expose features to subjects across tenants, domains, groups, or cohorts using rollout rules and staged availability.</p>
+<p class="path">registry/capabilities/capability.feature-control.rollout.md</p>
+</article>
+<article class="card">
+<h3>Identity Subject Resolution</h3>
+<p class="meta"><code>capability.identity.subject-resolution</code> · D3 / A0 / C1 / R0</p>
+<p>Resolve who or what is acting in a context by mapping principals, accounts, actors, and identifiers to a stable subject model.</p>
+<p class="path">registry/capabilities/capability.identity.subject-resolution.md</p>
+</article>
+<article class="card">
+<h3>Identity Vocabulary Canonicalization</h3>
+<p class="meta"><code>capability.identity.vocabulary-canonicalize</code> · D4 / A0 / C2 / R0</p>
+<p>Define and maintain an implementation-neutral vocabulary for identity-related concepts across overlapping domains.</p>
+<p class="path">registry/capabilities/capability.identity.vocabulary-canonicalize.md</p>
+</article>
+<article class="card">
+<h3>Capability Registration</h3>
+<p class="meta"><code>capability.registry.register</code> · D3 / A3 / C2 / R2</p>
+<p>Register a new capability so it becomes visible for planning and implementation reuse.</p>
+<p class="path">registry/capabilities/capability.registry.register.md</p>
+</article>
+<article class="card">
+<h3>Workstream And Task Coordination</h3>
+<p class="meta"><code>capability.statehub.workstream-coordinate</code> · D4 / A4 / C3 / R2</p>
+<p>Track active workstreams, tasks, progress, and consistency across domain repositories through a local-first coordination service.</p>
+<p class="path">registry/capabilities/capability.statehub.workstream-coordinate.md</p>
+</article></section>
+</body>
+</html>
@@ -117,6 +117,8 @@ Compare vectors side by side and read:
 
 ### UC-RS-015 — Detect duplicate or overlapping capabilities
 
+Run `reuse-surface overlaps` for automated candidate detection, then review:
+
 Check for overlap in:
 
 - similar `name` or `summary`
122
reuse_surface/catalog.py
Normal file
@@ -0,0 +1,122 @@
+from __future__ import annotations
+
+import html
+from collections import defaultdict
+from pathlib import Path
+from typing import Any
+
+ROOT = Path(__file__).resolve().parent.parent
+CATALOG_MD = ROOT / "docs" / "CapabilityCatalog.md"
+CATALOG_HTML_DIR = ROOT / "docs" / "catalog"
+CATALOG_HTML = CATALOG_HTML_DIR / "index.html"
+
+
+def _grouped_capabilities(
+    indexed_entries: list[tuple[dict[str, Any], dict[str, Any]]],
+) -> dict[str, list[tuple[dict[str, Any], dict[str, Any]]]]:
+    grouped: dict[str, list[tuple[dict[str, Any], dict[str, Any]]]] = defaultdict(
+        list
+    )
+    for index_item, entry in indexed_entries:
+        domain = index_item.get("domain", "unknown")
+        grouped[domain].append((index_item, entry))
+    return dict(sorted(grouped.items()))
+
+
+def render_markdown(
+    index: dict[str, Any],
+    indexed_entries: list[tuple[dict[str, Any], dict[str, Any]]],
+) -> str:
+    lines = [
+        "# Capability Catalog",
+        "",
+        f"**Domain:** {index.get('domain', 'unknown')} ",
+        f"**Updated:** {index.get('updated', 'unknown')} ",
+        f"**Entries:** {len(indexed_entries)}",
+        "",
+        "Generated by `reuse-surface catalog`. Do not edit manually.",
+        "",
+    ]
+    for domain, items in _grouped_capabilities(indexed_entries).items():
+        lines.extend([f"## {domain}", ""])
+        for index_item, entry in sorted(items, key=lambda pair: pair[0]["id"]):
+            lines.extend(
+                [
+                    f"### {index_item['name']}",
+                    "",
+                    f"- **ID:** `{index_item['id']}`",
+                    f"- **Vector:** {index_item['vector']}",
+                    f"- **Owner:** {index_item.get('owner', 'unknown')}",
+                    f"- **Path:** `{index_item['path']}`",
+                    f"- **Summary:** {index_item['summary']}",
+                    "",
+                ]
+            )
+            guidance = entry.get("consumer_guidance") or {}
+            limitations = guidance.get("known_limitations") or []
+            if limitations:
+                lines.append("**Known limitations:**")
+                lines.extend(f"- {item}" for item in limitations)
+                lines.append("")
+    return "\n".join(lines).rstrip() + "\n"
+
+
+def render_html(
+    index: dict[str, Any],
+    indexed_entries: list[tuple[dict[str, Any], dict[str, Any]]],
+) -> str:
+    sections: list[str] = []
+    for domain, items in _grouped_capabilities(indexed_entries).items():
+        cards: list[str] = []
+        for index_item, entry in sorted(items, key=lambda pair: pair[0]["id"]):
+            name = html.escape(index_item["name"])
+            summary = html.escape(index_item["summary"])
+            cap_id = html.escape(index_item["id"])
+            vector = html.escape(index_item["vector"])
+            path = html.escape(index_item["path"])
+            cards.append(
+                f"""<article class="card">
+<h3>{name}</h3>
+<p class="meta"><code>{cap_id}</code> · {vector}</p>
+<p>{summary}</p>
+<p class="path">{path}</p>
+</article>"""
+            )
+        sections.append(
+            f"<section><h2>{html.escape(domain)}</h2>\n" + "\n".join(cards) + "</section>"
+        )
+
+    body = "\n".join(sections)
+    title = html.escape(f"Capability Catalog — {index.get('domain', 'unknown')}")
+    return f"""<!DOCTYPE html>
+<html lang="en">
+<head>
+<meta charset="utf-8">
+<title>{title}</title>
+<style>
+body {{ font-family: system-ui, sans-serif; margin: 2rem; line-height: 1.5; }}
+h1 {{ margin-bottom: 0.2rem; }}
+.subtitle {{ color: #555; margin-bottom: 2rem; }}
+section {{ margin-bottom: 2rem; }}
+.card {{ border: 1px solid #ddd; border-radius: 8px; padding: 1rem; margin: 1rem 0; }}
+.meta {{ color: #444; font-size: 0.95rem; }}
+.path {{ font-size: 0.85rem; color: #666; }}
+</style>
+</head>
+<body>
+<h1>Capability Catalog</h1>
+<p class="subtitle">Updated {html.escape(str(index.get('updated', 'unknown')))} · {len(indexed_entries)} entries</p>
+{body}
+</body>
+</html>
+"""
+
+
+def write_catalog(
+    index: dict[str, Any],
+    indexed_entries: list[tuple[dict[str, Any], dict[str, Any]]],
+) -> tuple[Path, Path]:
+    CATALOG_HTML_DIR.mkdir(parents=True, exist_ok=True)
+    CATALOG_MD.write_text(render_markdown(index, indexed_entries), encoding="utf-8")
+    CATALOG_HTML.write_text(render_html(index, indexed_entries), encoding="utf-8")
+    return CATALOG_MD, CATALOG_HTML
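Review note: `render_html` passes every index field through `html.escape` before interpolating it into the page, so markup characters in a name or summary cannot break the generated catalog. A minimal sketch of that behavior follows; the `card` helper and the sample strings are hypothetical and not part of this commit.

```python
import html


def card(name: str, summary: str) -> str:
    """Hypothetical helper mirroring the escaping done in render_html."""
    return (
        f"<article><h3>{html.escape(name)}</h3>"
        f"<p>{html.escape(summary)}</p></article>"
    )


# Made-up values containing characters that are special in HTML.
rendered = card("Evaluate <feature>", "Active & visible for a subject")
print(rendered)
```

The Markdown renderer in the same file does no comparable escaping, which is probably acceptable for trusted registry content but worth keeping in mind.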
@@ -9,9 +9,9 @@ from typing import Any
 import yaml
 from jsonschema import Draft202012Validator
 
+from reuse_surface.catalog import write_catalog
+from reuse_surface.overlaps import find_overlaps
 from reuse_surface.registry import (
-    CAPABILITIES_DIR,
-    INDEX_PATH,
     ROOT,
     capability_paths,
     level_at_least,
@@ -115,6 +115,40 @@ def cmd_query(args: argparse.Namespace) -> int:
     return 0
 
 
+def _load_indexed_entries() -> list[tuple[dict[str, Any], dict[str, Any]]]:
+    index = load_index()
+    indexed_entries: list[tuple[dict[str, Any], dict[str, Any]]] = []
+    for item in index.get("capabilities", []):
+        path = ROOT / item["path"]
+        indexed_entries.append((item, parse_front_matter(path)))
+    return indexed_entries
+
+
+def cmd_overlaps(args: argparse.Namespace) -> int:
+    indexed_entries = _load_indexed_entries()
+    candidates = find_overlaps(indexed_entries, threshold=args.threshold)
+    if not candidates:
+        print("no overlap candidates")
+        return 0
+    for candidate in candidates:
+        reasons = "; ".join(candidate.reasons)
+        print(
+            f"{candidate.left_id} <> {candidate.right_id} "
+            f"score={candidate.score:.2f} {reasons}"
+        )
+    print(f"\n{len(candidates)} candidate{'s' if len(candidates) != 1 else ''}")
+    return 0
+
+
+def cmd_catalog(args: argparse.Namespace) -> int:
+    index = load_index()
+    indexed_entries = _load_indexed_entries()
+    md_path, html_path = write_catalog(index, indexed_entries)
+    print(f"ok: wrote {md_path.relative_to(ROOT)}")
+    print(f"ok: wrote {html_path.relative_to(ROOT)}")
+    return 0
+
+
 def cmd_export(args: argparse.Namespace) -> int:
     index = load_index()
     bundle: dict[str, Any] = {
@@ -184,6 +218,22 @@ def main(argv: list[str] | None = None) -> int:
     )
     export.set_defaults(func=cmd_export)
 
+    overlaps = subparsers.add_parser(
+        "overlaps", help="detect potential duplicate capabilities"
+    )
+    overlaps.add_argument(
+        "--threshold",
+        type=float,
+        default=0.28,
+        help="token similarity threshold (0-1)",
+    )
+    overlaps.set_defaults(func=cmd_overlaps)
+
+    catalog = subparsers.add_parser(
+        "catalog", help="generate human-readable capability catalog"
+    )
+    catalog.set_defaults(func=cmd_catalog)
+
     args = parser.parse_args(argv)
     return args.func(args)
 
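Review note: the new subcommands follow the dispatch pattern already used for `export`, where each subparser stores its handler with `set_defaults(func=...)` and `main` calls `args.func(args)`. The sketch below shows that pattern in isolation. It assumes a `build_parser` helper and a stand-in handler body, neither of which appears in this diff, since the real parser setup is only partly visible here.

```python
from __future__ import annotations

import argparse


def cmd_overlaps(args: argparse.Namespace) -> int:
    # Stand-in body: the real command loads the registry index first.
    print(f"threshold={args.threshold:.2f}")
    return 0


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper; the commit builds its parser inside main()."""
    parser = argparse.ArgumentParser(prog="reuse-surface")
    subparsers = parser.add_subparsers(dest="command", required=True)
    overlaps = subparsers.add_parser(
        "overlaps", help="detect potential duplicate capabilities"
    )
    overlaps.add_argument("--threshold", type=float, default=0.28)
    # The handler travels with the parsed namespace as args.func.
    overlaps.set_defaults(func=cmd_overlaps)
    return parser


def main(argv: list[str] | None = None) -> int:
    args = build_parser().parse_args(argv)
    return args.func(args)


exit_code = main(["overlaps", "--threshold", "0.35"])
```

Keeping the handler on the namespace means adding a command touches only its own subparser block, which matches how `overlaps` and `catalog` were added in this hunk.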
87
reuse_surface/overlaps.py
Normal file
@@ -0,0 +1,87 @@
+from __future__ import annotations
+
+import re
+from dataclasses import dataclass
+from typing import Any
+
+TOKEN_RE = re.compile(r"[a-z][a-z0-9-]{2,}")
+
+
+@dataclass
+class OverlapCandidate:
+    left_id: str
+    right_id: str
+    score: float
+    reasons: list[str]
+
+
+def _tokens(text: str) -> set[str]:
+    return set(TOKEN_RE.findall(text.lower()))
+
+
+def _entry_blob(entry: dict[str, Any], index_item: dict[str, Any]) -> str:
+    discovery = entry.get("discovery") or {}
+    parts = [
+        index_item.get("name", ""),
+        index_item.get("summary", ""),
+        entry.get("id", ""),
+        " ".join(index_item.get("tags", [])),
+        discovery.get("intent", ""),
+        " ".join(discovery.get("includes", [])),
+    ]
+    return " ".join(str(part) for part in parts if part)
+
+
+def _relation_overlap(left: dict[str, Any], right: dict[str, Any]) -> list[str]:
+    reasons: list[str] = []
+    left_id = left["id"]
+    right_id = right["id"]
+    relations = left.get("relations") or {}
+    for relation_type, targets in relations.items():
+        if not isinstance(targets, list):
+            continue
+        if right_id in targets:
+            reasons.append(f"relation:{relation_type}")
+    if left_id.split(".")[1] == right_id.split(".")[1]:
+        reasons.append("shared domain segment")
+    return reasons
+
+
+def find_overlaps(
+    indexed_entries: list[tuple[dict[str, Any], dict[str, Any]]],
+    *,
+    threshold: float = 0.28,
+) -> list[OverlapCandidate]:
+    candidates: list[OverlapCandidate] = []
+    blobs = [
+        (_entry_blob(entry, index_item), index_item["id"], entry)
+        for index_item, entry in indexed_entries
+    ]
+
+    for i, (left_blob, left_id, left_entry) in enumerate(blobs):
+        left_tokens = _tokens(left_blob)
+        for j in range(i + 1, len(blobs)):
+            right_blob, right_id, right_entry = blobs[j]
+            right_tokens = _tokens(right_blob)
+            if not left_tokens or not right_tokens:
+                continue
+            score = len(left_tokens & right_tokens) / len(left_tokens | right_tokens)
+            reasons: list[str] = []
+            if score >= threshold:
+                reasons.append(f"token similarity {score:.2f}")
+            shared_tags = set(left_entry.get("tags", [])) & set(
+                right_entry.get("tags", [])
+            )
+            if shared_tags:
+                reasons.append(f"shared tags: {', '.join(sorted(shared_tags))}")
+            reasons.extend(_relation_overlap(left_entry, right_entry))
+            if reasons and (score >= threshold or len(reasons) > 1):
+                candidates.append(
+                    OverlapCandidate(
+                        left_id=left_id,
+                        right_id=right_id,
+                        score=score,
+                        reasons=reasons,
+                    )
+                )
+    return sorted(candidates, key=lambda item: item.score, reverse=True)
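Review note: the score computed in `find_overlaps` is a Jaccard index over lower-cased tokens, i.e. shared tokens divided by all distinct tokens. The sketch below isolates that calculation so the default threshold of 0.28 is easier to reason about. It reuses the `TOKEN_RE` pattern from the diff; the `jaccard` helper and the two sample strings are illustrative assumptions, not code from this commit.

```python
import re

# Same pattern as TOKEN_RE in the diff: a lowercase letter followed by at
# least two more lowercase letters, digits, or hyphens (so 3+ characters).
TOKEN_RE = re.compile(r"[a-z][a-z0-9-]{2,}")


def tokens(text: str) -> set[str]:
    return set(TOKEN_RE.findall(text.lower()))


def jaccard(left: str, right: str) -> float:
    """Hypothetical helper: |A & B| / |A | B| over the two token sets."""
    a, b = tokens(left), tokens(right)
    if not a or not b:
        # The commit skips such pairs entirely; returning 0.0 is a
        # simplification for this sketch.
        return 0.0
    return len(a & b) / len(a | b)


# One shared token ("feature") out of five distinct tokens.
score = jaccard("evaluate feature availability", "feature rollout control")
print(f"{score:.2f}")
```

With one shared token out of five, this pair scores 0.20 and would not pass the default threshold on token similarity alone. In the committed code such a pair can still be reported when at least two other signals (shared tags, a relation, or a shared domain segment) are present.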
@@ -42,6 +42,25 @@ reuse-surface export
 reuse-surface export --format json
 ```
 
+### overlaps
+
+Detect potential duplicate or overlapping capabilities (UC-RS-015).
+
+```bash
+reuse-surface overlaps
+reuse-surface overlaps --threshold 0.35
+```
+
+### catalog
+
+Generate human-readable catalog artifacts (UC-RS-018).
+
+```bash
+reuse-surface catalog
+```
+
+Writes `docs/CapabilityCatalog.md` and `docs/catalog/index.html`.
+
 ## Export format
 
 The export bundle includes:
@@ -59,6 +78,8 @@ Stable IDs and maturity fields are preserved for agent consumption (UC-RS-019).
 | Discover capabilities | `reuse-surface query` or read the index |
 | Validate entry shape | `reuse-surface validate` |
 | Export for agents | `reuse-surface export --format json` |
+| Detect overlap | `reuse-surface overlaps` |
+| Publish catalog | `reuse-surface catalog` |
 
 ## Related use cases
 
66
workplans/REUSE-WP-0004-registry-hardening.md
Normal file
@@ -0,0 +1,66 @@
+---
+id: REUSE-WP-0004
+type: workplan
+title: "Registry hardening: CI, overlap detection, and catalog"
+domain: helix_forge
+repo: reuse-surface
+status: finished
+owner: codex
+topic_slug: helix-forge
+created: "2026-06-15"
+updated: "2026-06-15"
+---
+
+# Registry hardening: CI, overlap detection, and catalog
+
+Follow-up to `docs/IntentScopeGapAnalysis.md` section 8 next recommended work
+(priorities 9–11). Raise registry quality through automated CI validation, overlap
+reporting (UC-RS-015), and a generated human-readable catalog (UC-RS-018).
+
+## Add CI Validation Workflow
+
+```task
+id: REUSE-WP-0004-T01
+status: done
+priority: high
+```
+
+Add `.gitea/workflows/ci.yml` that runs on push and pull requests to `main`.
+Install the package and run `reuse-surface validate`. Document the workflow in
+`AGENTS.md`.
+
+## Add Overlap Detection Command
+
+```task
+id: REUSE-WP-0004-T02
+status: done
+priority: high
+```
+
+Add `reuse-surface overlaps` that flags potential duplicate or overlapping
+capabilities using summary/tags/includes similarity and relation signals.
+Document usage in `registry/README.md` and `tools/README.md`.
+
+## Add Catalog Generation Command
+
+```task
+id: REUSE-WP-0004-T03
+status: done
+priority: medium
+```
+
+Add `reuse-surface catalog` that generates `docs/CapabilityCatalog.md` and
+`docs/catalog/index.html` from the index and entry front matter. Group by domain
+and show maturity vectors.
+
+## Refresh Docs And Gap Analysis
+
+```task
+id: REUSE-WP-0004-T04
+status: done
+priority: medium
+```
+
+Update `SCOPE.md`, `tools/README.md`, and `docs/IntentScopeGapAnalysis.md` to
+reflect CI, overlaps, and catalog capabilities. Close gap analysis priorities
+9–11.