Transferred deep scope functionality from the custodian

2026-05-01 00:42:10 +02:00
parent b424dea01b
commit 2d9da98257
10 changed files with 1397 additions and 47 deletions


@@ -1,4 +1,4 @@
# repo-registry — Agent Instructions
# repo-scoping — Agent Instructions
## Repo Identity
@@ -8,7 +8,7 @@ scanners establish observed facts; LLM-assisted extractors propose interpreted
claims; humans or trusted agents approve registry truth.
**Domain:** capabilities
**Repo slug:** repo-registry
**Repo slug:** repo-scoping
**Topic ID:** `64418556-3206-457a-ba29-6884b5b12cf3`
**Workplan prefix:** `RREG-WP-`
@@ -33,7 +33,7 @@ curl -s "http://127.0.0.1:8000/workstreams/?topic_id=64418556-3206-457a-ba29-688
curl -s "http://127.0.0.1:8000/tasks/?status=todo" | python3 -m json.tool
# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=repo-registry&unread_only=true" \
curl -s "http://127.0.0.1:8000/messages/?to_agent=repo-scoping&unread_only=true" \
| python3 -m json.tool
```
@@ -79,7 +79,7 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
**Start:**
1. `ls workplans/` — note active workplans and their open tasks
2. Check inbox via `GET /messages/?to_agent=repo-registry&unread_only=true`
2. Check inbox via `GET /messages/?to_agent=repo-scoping&unread_only=true`
3. Check for human-flagged tasks: `GET /tasks/?needs_human=true`
**During work:**
@@ -92,7 +92,7 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>" \
3. If workplan files changed, sync them to the hub DB:
```bash
curl -s -X POST "http://127.0.0.1:8000/repos/repo-registry/sync" | python3 -m json.tool
curl -s -X POST "http://127.0.0.1:8000/repos/repo-scoping/sync" | python3 -m json.tool
```
This runs the ADR-001 consistency check with `--fix` and returns a JSON report.
@@ -116,7 +116,7 @@ id: RREG-WP-NNNN
type: workplan
title: "..."
domain: capabilities
repo: repo-registry
repo: repo-scoping
status: active | done
owner: codex
topic_slug: foerster-capabilities

197
SCOPE.md

@@ -1,48 +1,175 @@
---
domain: capabilities
repo: repo-registry
updated: "2026-04-26"
repo: repo-scoping
updated: "2026-04-30"
---
# repo-registry — Scope
# SCOPE
## Purpose
> This file helps you quickly understand what this repository is about,
> when it is relevant, and when it is not.
> It is curated from repo-scoping's approved characteristics and operating role.
Repository Ability Registry. Turns Git repositories into reviewable, source-linked
maps of `Ability → Capability → Feature → Evidence`.
---
## Core Design Principle
## One-liner
```
deterministic scanners → observed facts (file paths, languages, API routes, …)
LLM-assisted extractors → interpreted claims (ability names, descriptions, links)
human / agent review → approved registry truth
repo-scoping turns Git repositories into reviewable, source-linked scope maps and
maintains SCOPE.md files as the human and agent entry point to repository utility.
---
## Core Idea
repo-scoping models a repository as a hierarchy of characteristics:
`Scope -> Ability -> Capability -> Feature -> Evidence -> Observed Fact`.
Deterministic scanners establish observed facts from repository content. Optional
LLM-assisted extraction proposes interpreted candidates. Humans or trusted agents
approve the resulting characteristics before they become registry truth.
The primary output is a useful repository scope profile: what a repo is for, when
to use it, what capabilities it provides, and which facts or lower-level
characteristics support those claims.
---
## In Scope
- Register repositories and keep metadata, analysis runs, facts, candidates, and
approved characteristics together.
- Analyze repositories with deterministic scanners and optional LLM-assisted
candidate extraction.
- Review, edit, approve, reject, merge, and relink candidate abilities,
capabilities, features, and evidence.
- Search and compare approved repository characteristics.
- Generate, diff, validate, and write SCOPE.md files from approved
characteristics.
- Support the Custodian State Hub by acting as the provider for scope generation
and update capabilities.
---
## Out of Scope
- Owning the Custodian State Hub, its database, or cross-domain governance rules.
- Making unreviewed truth claims canonical without approval.
- Replacing human product judgment for curator-owned scope sections.
- Continuous Git hosting automation, deployment infrastructure, or access-control
policy beyond repository ingestion needs.
- Full static code understanding across every language and framework.
---
## Relevant When
- You need to understand what a repository is useful for without reading the whole
codebase first.
- You want a source-linked map from high-level repository scope down to observed
implementation facts.
- You need to review generated candidate abilities, capabilities, features, and
evidence before approving them.
- You need to create or refresh a SCOPE.md for a registered repository.
- You need to compare repositories by approved characteristics or find capability
gaps across a domain.
---
## Not Relevant When
- You only need raw Git hosting, CI, deployment, or issue tracking.
- You need a fully autonomous ontology without human review.
- The repository has not been registered or analyzed and no approved
characteristics exist yet.
- The needed decision is curator-owned product positioning rather than
source-observable repository behavior.
---
## Current State
- Status: active and evolving.
- Implementation: FastAPI service with SQLite development storage, deterministic
Git scanning, candidate graph review workflow, search, comparison, and SCOPE.md
generation endpoints.
- LLM assistance: optional; deterministic non-LLM behavior remains a first-class
path for continued optimization.
- UI: available for repository registration, analysis runs, candidate review, and
characteristic navigation.
- Integration: registered in the Custodian State Hub as `repo-scoping`.
---
## How It Fits
- Upstream coordination: the Custodian State Hub owns workstream/task state,
managed repository records, and capability routing.
- Downstream consumers: Custodian agents and humans use repo-scoping to inspect,
refine, and refresh repository utility profiles.
- Often used with: `llm-connect` for optional LLM-assisted extraction and
`the-custodian` for state, routing, and domain coordination.
---
## Terminology
- Preferred terms: scope, ability, capability, feature, evidence, observed fact,
characteristic, candidate, approved characteristic, SCOPE.md.
- Also known as: repository scoping service, repository ability registry.
- Potentially confusing terms: evidence is not just a raw fact; it is support for
a characteristic and may reference facts or lower-level characteristics.
Candidates are proposed claims awaiting review; approved characteristics are
canonical registry truth.
---
## Related / Overlapping Repositories
- `the-custodian` - coordination layer, State Hub, workplans, and capability
catalog.
- `llm-connect` - optional provider abstraction for LLM-assisted extraction.
- `markitect` / `markitect-project` - content and documentation platform with
related scope-document needs.
---
## Getting Oriented
- Start with: `README.md`, `AGENTS.md`, and this `SCOPE.md`.
- Key files / directories: `src/repo_registry/web_api/app.py`,
`src/repo_registry/core/service.py`, `src/repo_registry/scope/`,
`src/repo_registry/candidate_graph/`, `src/repo_registry/repo_scanning/`,
`docs/scope-md-spec.md`, and `workplans/`.
- Entry points: `uvicorn repo_registry.web_api.app:app --reload`, the `/ui`
routes, and the `/repos/{repo_slug}/scope*` API endpoints.
---
## Provided Capabilities
```capability
type: api
title: scope.generate
description: >
Generates a SCOPE.md from scratch for a registered repo using its approved
characteristics profile (abilities, capabilities, features, facts).
keywords: [scope, scope-md, generation, repository-utility]
```
Approved entries are always explicit, reviewable, and source-linked. The system
never publishes unapproved claims as canonical truth.
```capability
type: api
title: scope.update
description: >
Diffs an existing SCOPE.md against the current characteristics profile
and returns or writes an updated version.
keywords: [scope, scope-md, update, diff, staleness]
```
## In Scope (v0.1)
---
- Repository registration by Git URL
- Deterministic repository scan (file tree, languages, frameworks, API/CLI surface)
- Candidate extraction for abilities, capabilities, features, and evidence
- Human review workflow: edit, approve, reject, merge, relink
- Natural-language and semantic search over approved registry entries
- REST API for repositories, ability maps, capabilities, and search
## Notes
## Out of Scope (v0.1)
- Continuous GitHub App integration
- Full static code understanding (AST/type analysis)
- Advanced ontology enforcement
- Distributed indexing
- Benchmark execution
- Marketplace or commercial features
- Complex access control
- Automated truth claims without review
## Domain Context
Part of the **capabilities** domain — systematic modeling of abilities, capabilities,
and features across the Custodian ecosystem. First registered repo in this domain.
- The local checkout path is still `/home/worsch/repo-registry`; the canonical
State Hub slug and Git remote are now `repo-scoping`.
- Ecosystem-wide SCOPE.md refresh is blocked until Custodian C5b/C5c checks are
active and more managed repos have approved characteristics in repo-scoping.

292
docs/scope-md-spec.md Normal file

@@ -0,0 +1,292 @@
# SCOPE.md Reference Specification
`SCOPE.md` is the human- and agent-facing boundary definition for a repository.
It answers, quickly and concretely, what the repository is for, when it is useful,
where it fits, and what capabilities it can provide.
Repo-registry is the source of truth for generating and validating `SCOPE.md`
because its approved characteristic model already captures the same structure:
```text
Scope -> Ability -> Capability -> Feature -> Evidence -> Observed Fact
```
This specification supersedes the Custodian dashboard reference at
`state-hub/dashboard/src/docs/scope.md`. The scaffold template remains at
`state-hub/scripts/project_rules/scope.template`; this document defines how
repo-registry should generate, validate, and update that file.
Related model docs:
- `docs/characteristic-evidence-model.md`
- `docs/classification-strategy.md`
## Purpose
`SCOPE.md` is not a README, architecture document, or marketing page. It is a
short orientation artifact for deciding whether a repo is relevant before reading
its code in depth.
It should answer:
- What is this repository for?
- Should I care about it right now?
- When is it relevant to my work?
- Where does it fit in the ecosystem?
- Is it mature enough to trust or reuse?
- Does it overlap with something else?
- What capabilities can it provide to other domains?
## Canonical Template
The historical Custodian reference calls this an "11-section template". The
current `scope.template` contains twelve functional sections plus an optional
`Notes` tail. Repo-registry should preserve the current template headings for
compatibility and treat `Notes` as curator-owned free text.
Generated files must contain these sections, in this order:
| Section | Source in repo-registry | Generation ownership |
|---------|--------------------------|----------------------|
| `## One-liner` | Scope name plus scope description | generated, curator-reviewed |
| `## Core Idea` | Scope description and top approved abilities | generated, curator-reviewed |
| `## In Scope` | Approved abilities and high-confidence capabilities | generated, curator-reviewed |
| `## Out of Scope` | Abilities or expectation gaps classified as exclusions | curator-owned unless explicitly modeled |
| `## Relevant When` | Approved features with `primary_class: business-usecase` or `attributes` including use-case labels | generated, curator-reviewed |
| `## Not Relevant When` | Negative use-case expectation gaps or curator exclusions | curator-owned unless explicitly modeled |
| `## Current State` | Observed facts aggregated by scanner: status, language, framework, tests, routes, docs, manifests | generated |
| `## How It Fits` | Evidence/support references to other characteristics or repos; dependency facts | generated, curator-reviewed |
| `## Terminology` | Domain term facts, names, aliases, and classification labels | generated, curator-reviewed |
| `## Related / Overlapping Repositories` | Cross-repo support references and comparison/discovery data | generated when known, curator-reviewed |
| `## Getting Oriented` | Source refs, content chunks, key files, entry points, docs, tests | generated |
| `## Provided Capabilities` | Approved capability characteristics rendered as machine-readable `capability` blocks | generated, file-origin truth |
| `## Notes` | Human-maintained remarks that do not fit the structured sections | curator-owned |
When a generated section has insufficient data, emit a short stub plus:
```markdown
<!-- needs curator input -->
```
This makes gaps visible without pretending the scanner knows more than it does.
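A minimal sketch of the stub rule above, assuming a hypothetical `render_section` helper (not part of this commit). Only the marker text is taken from the specification.

```python
NEEDS_INPUT = "<!-- needs curator input -->"  # marker text from this specification


def render_section(heading: str, bullets: list[str]) -> str:
    """Render one section; fall back to a marked stub when no data is available.

    `render_section` is a hypothetical helper used only for illustration.
    """
    body = [f"- {bullet}" for bullet in bullets if bullet.strip()]
    if not body:
        # No usable data: emit the stub so the gap stays visible to curators.
        body = [f"- {NEEDS_INPUT}"]
    return "\n".join([f"## {heading}", "", *body])


print(render_section("Out of Scope", []))
```

The stub keeps the heading in place, so heading-presence checks can still pass while the marker flags the missing content.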
## Section Mapping Details
### One-liner
Use the approved repository `Scope` as the root characteristic. Prefer a single
sentence from the scope description. If no curated sentence exists, use:
```text
<scope name> defines and maintains the repository scope for <repository name>.
```
### Core Idea
Summarize the root `Scope` and the most important approved `Ability` entries.
Use ability descriptions where available. Avoid listing every capability here;
the goal is orientation, not completeness.
### In Scope
Render approved abilities as top-level bullets. Include the most important
capabilities as nested wording inside the bullet, but avoid deep nesting in the
generated Markdown.
Suggested form:
```markdown
- <Ability name> — <ability description>. Includes <capability A>, <capability B>.
```
### Out of Scope
This section is primarily curator-owned. Repo-registry may seed it from
classification expectation gaps whose `expected_type` is one of:
- `classification-granularity`
- `classification-support`
- `out-of-scope`
Generated text must be conservative and marked for review until an approved
negative/exclusion model exists.
### Relevant When
Use approved features that represent real usage scenarios. Strong signals:
- `primary_class == "business-usecase"`
- `attributes` contains `usecase`, `workflow`, `review`, `generation`,
`analysis`, `integration`, or another domain-specific use-case label
If no business-usecase features exist, seed from high-confidence abilities and
capabilities with a curator-input marker.
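The signals listed above could be checked roughly as follows. This is a sketch, assuming features are plain dicts with `primary_class` and `attributes` keys. The label set mirrors the list in this section; the `_is_usecase_feature` helper added in this commit checks a narrower set.

```python
# Labels named in this section; treat the exact set as an assumption.
USECASE_LABELS = {"usecase", "workflow", "review", "generation", "analysis", "integration"}


def is_usecase_feature(feature: dict) -> bool:
    """Return True when a feature looks like a real usage scenario."""
    primary_class = str(feature.get("primary_class") or "").lower()
    if primary_class == "business-usecase":
        return True
    attributes = {str(item).lower() for item in (feature.get("attributes") or [])}
    return bool(attributes & USECASE_LABELS)


print(is_usecase_feature({"primary_class": "interface", "attributes": ["workflow"]}))
```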
### Not Relevant When
This section is curator-owned unless explicit negative use-case facts or
expectation gaps exist. Do not infer broad exclusions from missing features.
### Current State
Aggregate observed facts. Good generated indicators include:
- Status: derive from repository status and analysis run state.
- Implementation: derive from source files, package manifests, tests, and route
or CLI facts.
- Stability: conservative default `evolving` unless curated.
- Usage: conservative default `internal` or `unknown` unless facts indicate
production usage.
Include compact bullets for detected languages, frameworks, tests, manifests,
docs, interfaces, provider facts, and scanner gaps.
### How It Fits
Use support/evidence relationships and source refs:
- Upstream dependencies: package, service, provider, and integration facts.
- Downstream consumers: cross-repo support references when available.
- Often used with: related repo links and common provider/framework facts.
Evidence is support for a characteristic, not the same thing as a fact. Prefer
evidence links that point downward in abstraction, as described in
`docs/characteristic-evidence-model.md`.
### Terminology
Generate from:
- scope, ability, capability, and feature names
- `primary_class` and `attributes`
- scanner facts for providers, frameworks, commands, APIs, and domain terms
- aliases or expectation gaps when present
Mark ambiguous or overlapping terms for curator review.
### Related / Overlapping Repositories
Generate only when there is cross-repo evidence, comparison data, or explicit
curator input. Do not invent related repositories from name similarity alone.
### Getting Oriented
Use source references and observed facts to name good entry points:
- Start with: README, docs, API route files, CLI files, core service modules
- Key files / directories: source paths with high fact/support density
- Entry points: API routes, CLI commands, package manifests, tests
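One possible reading of "high fact/support density" is a simple count of observed facts per path. The sketch below assumes facts are dicts with an optional `path` key; `rank_paths` is a hypothetical helper, not part of the generator in this commit.

```python
from collections import Counter


def rank_paths(facts: list[dict], limit: int = 8) -> list[str]:
    """Order source paths by how many observed facts reference them."""
    counts = Counter(fact["path"] for fact in facts if fact.get("path"))
    # Sort by descending count, then by path so ties are stable.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [path for path, _ in ranked[:limit]]


facts = [{"path": "a.py"}, {"path": "b.py"}, {"path": "b.py"}, {"kind": "language"}]
print(rank_paths(facts))
```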
### Provided Capabilities
Render approved `Capability` characteristics as fenced `capability` blocks. This
section is parsed by the Custodian capability catalog and remains file-origin
truth under ADR-001.
Block format:
````markdown
```capability
type: api
title: scope.generate
description: >
Generates a SCOPE.md from approved repository characteristics.
keywords: [scope, scope-md, generation]
```
````
Fields:
| Field | Required | Source |
|-------|----------|--------|
| `type` | yes | capability `primary_class`, normalized to catalog categories |
| `title` | yes | capability name or curated capability key |
| `description` | no | capability description |
| `keywords` | no | capability attributes plus relevant feature classes |
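As a rough illustration of the required-field rule in the table above, the sketch below extracts the top-level keys of each `capability` block and reports which required keys are absent. It assumes blocks are simple key/value text; the names `BLOCK_RE`, `KEY_RE`, and `missing_fields` are hypothetical.

```python
import re

FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block
BLOCK_RE = re.compile(FENCE + r"capability\n(.*?)\n" + FENCE, re.DOTALL)
KEY_RE = re.compile(r"^([A-Za-z_][\w-]*):", re.MULTILINE)
REQUIRED = {"type", "title"}  # fields marked "yes" in the table


def missing_fields(section_text: str) -> list[set[str]]:
    """Return, for each capability block, the set of required keys it lacks."""
    missing = []
    for block in BLOCK_RE.finditer(section_text):
        # Only unindented "key:" lines count; folded description text is indented.
        keys = set(KEY_RE.findall(block.group(1)))
        missing.append(REQUIRED - keys)
    return missing


sample = FENCE + "capability\ntype: api\n" + FENCE
print(missing_fields(sample))
```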
Allowed catalog categories remain compatible with the existing Custodian ingest:
- `infrastructure`
- `api`
- `data`
- `security`
- `documentation`
- `other`
If a capability's `primary_class` is not one of these categories, map it to
`api`, `data`, `documentation`, or `other` conservatively and preserve the
original class as a keyword.
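The conservative mapping could look like the sketch below. The fallback table reuses classes handled by `_capability_type` in this commit (`interface`, `integration`, `storage`); anything beyond that, including the `normalize_type` name, is an assumption for illustration.

```python
CATALOG_CATEGORIES = {"infrastructure", "api", "data", "security", "documentation", "other"}
# Illustrative fallback table; extend only with classes the catalog owner has agreed on.
FALLBACKS = {"interface": "api", "integration": "api", "storage": "data"}


def normalize_type(primary_class: str, keywords: list[str]) -> tuple[str, list[str]]:
    """Map a primary_class to a catalog category, keeping the original as a keyword."""
    normalized = primary_class.strip().lower()
    if normalized in CATALOG_CATEGORIES:
        return normalized, list(keywords)
    kept = list(keywords)
    if normalized and normalized not in kept:
        kept.append(normalized)  # preserve the original class for search and routing
    return FALLBACKS.get(normalized, "other"), kept


print(normalize_type("Storage", ["sqlite"]))
```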
### Notes
`Notes` is optional and curator-owned. Generators should preserve existing notes
when updating a file and should not overwrite this section unless explicitly
requested.
## Generation Ownership
Repo-registry-generated sections:
- One-liner
- Core Idea
- In Scope
- Relevant When
- Current State
- How It Fits
- Terminology
- Related / Overlapping Repositories
- Getting Oriented
- Provided Capabilities
Curator-owned or curator-reviewed sections:
- Out of Scope
- Not Relevant When
- Notes
- Any generated section containing `<!-- needs curator input -->`
The generator may write stubs for curator-owned sections, but the updater must
preserve existing curator text unless the caller explicitly asks for a full
rewrite.
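A minimal sketch of the preservation rule, assuming both the existing file and the generated output have already been split into `heading -> text` dicts. `merge_sections` and its `full_rewrite` flag are hypothetical names, not an API defined by this commit.

```python
from __future__ import annotations

CURATOR_OWNED = {"Out of Scope", "Not Relevant When", "Notes"}


def merge_sections(
    existing: dict[str, str],
    generated: dict[str, str],
    full_rewrite: bool = False,
) -> dict[str, str]:
    """Prefer existing curator text unless a full rewrite was requested."""
    merged: dict[str, str] = {}
    for heading, new_text in generated.items():
        old_text = existing.get(heading)
        keep_old = (
            heading in CURATOR_OWNED
            and old_text is not None
            and bool(old_text.strip())
            and not full_rewrite
        )
        merged[heading] = old_text if keep_old else new_text
    return merged


print(merge_sections({"Notes": "- Keep me."}, {"Notes": "- stub"}))
```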
## Validation Rules
The validator should mirror the Custodian DOI C5 checks:
- C5a: `SCOPE.md` exists at the repository root.
- C5b: required headings are present in canonical order.
- C5c: `## Provided Capabilities` contains parseable `capability` blocks, or an
explicit empty-state note when the repo provides no routable capabilities.
Additional repo-registry validation:
- Generated sections with missing data must include `<!-- needs curator input -->`.
- Capability blocks must parse as key/value metadata.
- Capability block titles should be stable enough for routing.
- Curator-owned sections should be preserved by diff/update flows.
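The heading checks (C5b) can be sketched as below. The heading list matches the canonical template in this document; `check_headings` is a hypothetical helper, and it does not skip `##` lines that happen to sit inside fenced code blocks.

```python
import re

REQUIRED_HEADINGS = [
    "One-liner", "Core Idea", "In Scope", "Out of Scope", "Relevant When",
    "Not Relevant When", "Current State", "How It Fits", "Terminology",
    "Related / Overlapping Repositories", "Getting Oriented",
    "Provided Capabilities", "Notes",
]


def check_headings(content: str) -> tuple[list[str], bool]:
    """Return (missing headings, whether known headings are in canonical order)."""
    found = [m.group(1).strip() for m in re.finditer(r"^##\s+(.+?)\s*$", content, re.MULTILINE)]
    missing = [heading for heading in REQUIRED_HEADINGS if heading not in found]
    # Compare only headings the template knows, so extra headings do not fail the order check.
    known = [heading for heading in found if heading in REQUIRED_HEADINGS]
    expected = [heading for heading in REQUIRED_HEADINGS if heading in known]
    return missing, known == expected


print(check_headings("## Core Idea\n\n## One-liner\n"))
```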
## Update Semantics
The validator/differ compares the existing file to freshly generated content by
section. A section is:
- `ok` when normalized existing text matches generated content.
- `stale` when the section exists but differs materially.
- `missing` when the heading is absent.
Normalization should ignore repeated whitespace and harmless Markdown wrapping,
but must not ignore changed capability block metadata.
Generated updates should be section-aware. Do not rewrite the whole file when a
smaller section update is enough.
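The three states can be sketched as a small classifier. This assumes whitespace collapsing is the only normalization applied, which keeps changed capability metadata visible; `section_status` is an illustrative name.

```python
from __future__ import annotations

import re


def normalize(text: str) -> str:
    """Collapse runs of whitespace so re-wrapped Markdown compares equal."""
    return re.sub(r"\s+", " ", text).strip()


def section_status(current: str | None, generated: str) -> str:
    """Classify one section as 'ok', 'stale', or 'missing'."""
    if current is None:
        return "missing"
    return "ok" if normalize(current) == normalize(generated) else "stale"


print(section_status("- Register\n  repositories.", "- Register repositories."))
```

Because only whitespace is normalized, a change such as `type: api` to `type: data` still reports `stale`.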
## Agent Guidance
Agents should treat `SCOPE.md` as a decision aid:
- Read it before deep code exploration.
- Prefer it over README for scope boundaries.
- Use `AGENTS.md` for operating instructions and repo-specific workflow.
- Use generated diffs to spot stale scope claims.
- Record expectation gaps when generated scope, classes, or capabilities do not
match human judgment.


@@ -0,0 +1,4 @@
from repo_registry.scope.generator import ScopeGenerator
from repo_registry.scope.validator import ScopeValidator
__all__ = ["ScopeGenerator", "ScopeValidator"]


@@ -0,0 +1,323 @@
from __future__ import annotations
import re
from dataclasses import asdict
from repo_registry.core.service import RegistryService
from repo_registry.storage.sqlite import NotFoundError
SCOPE_SECTIONS = [
"One-liner",
"Core Idea",
"In Scope",
"Out of Scope",
"Relevant When",
"Not Relevant When",
"Current State",
"How It Fits",
"Terminology",
"Related / Overlapping Repositories",
"Getting Oriented",
"Provided Capabilities",
"Notes",
]
NEEDS_INPUT = "<!-- needs curator input -->"
class ScopeGenerator:
"""Render SCOPE.md from approved repository characteristics."""
def __init__(self, service: RegistryService) -> None:
self.service = service
def generate(self, repo_slug: str) -> str:
repository = self._repository_by_slug(repo_slug)
ability_map = asdict(self.service.ability_map(repository.id))
facts = [asdict(fact) for fact in self.service.list_observed_facts(repository.id)]
sections = {
"One-liner": self._one_liner(ability_map),
"Core Idea": self._core_idea(ability_map),
"In Scope": self._in_scope(ability_map),
"Out of Scope": self._curator_stub(),
"Relevant When": self._relevant_when(ability_map),
"Not Relevant When": self._curator_stub(),
"Current State": self._current_state(repository.status, facts),
"How It Fits": self._how_it_fits(ability_map),
"Terminology": self._terminology(ability_map, facts),
"Related / Overlapping Repositories": self._curator_stub(),
"Getting Oriented": self._getting_oriented(ability_map, facts),
"Provided Capabilities": self._provided_capabilities(ability_map),
"Notes": self._curator_stub(),
}
lines = [
"# SCOPE",
"",
"> This file helps you quickly understand what this repository is about,",
"> when it is relevant, and when it is not.",
"> It was generated from approved repo-registry characteristics.",
"",
"---",
"",
]
for section in SCOPE_SECTIONS:
lines.extend([f"## {section}", "", sections[section].rstrip(), "", "---", ""])
return "\n".join(lines).rstrip() + "\n"
def _repository_by_slug(self, repo_slug: str):
wanted = self._slug(repo_slug)
for repository in self.service.list_repositories():
candidates = {
self._slug(repository.name),
self._slug(repository.url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")),
}
if wanted in candidates:
return repository
raise NotFoundError(f"repository slug {repo_slug!r} was not found")
def _one_liner(self, ability_map: dict) -> str:
scope = ability_map["scope"]
description = self._sentence(scope.get("description", ""))
if description:
return description
return f"{scope['name']} defines the repository scope for {ability_map['repository']['name']}."
def _core_idea(self, ability_map: dict) -> str:
scope = ability_map["scope"]
abilities = ability_map.get("abilities", [])
lines = [scope.get("description") or self._one_liner(ability_map)]
if abilities:
lines.append("")
lines.append("Approved abilities:")
lines.extend(
f"- {ability['name']} - {ability.get('description') or 'Approved repository ability.'}"
for ability in abilities[:5]
)
else:
lines.extend(["", NEEDS_INPUT])
return "\n".join(lines)
def _in_scope(self, ability_map: dict) -> str:
abilities = ability_map.get("abilities", [])
if not abilities:
return self._curator_stub()
lines = []
for ability in abilities:
capabilities = ", ".join(
capability["name"] for capability in ability.get("capabilities", [])[:4]
)
suffix = f" Includes {capabilities}." if capabilities else ""
lines.append(
f"- {ability['name']} - {ability.get('description') or 'Approved ability.'}{suffix}"
)
return "\n".join(lines)
def _relevant_when(self, ability_map: dict) -> str:
features = [
feature
for feature in self._features(ability_map)
if self._is_usecase_feature(feature)
]
if not features:
features = self._features(ability_map)[:5]
if not features:
return self._curator_stub()
lines = [
f"- You need {feature['name']} ({feature.get('primary_class') or feature.get('type', 'feature')})."
for feature in features
]
if not any(self._is_usecase_feature(feature) for feature in features):
lines.append(NEEDS_INPUT)
return "\n".join(lines)
def _current_state(self, status: str, facts: list[dict]) -> str:
kinds = self._facts_by_kind(facts)
languages = self._fact_names(kinds.get("language", []))
frameworks = self._fact_names(kinds.get("framework", []))
tests = kinds.get("test", [])
interfaces = kinds.get("interface", [])
manifests = kinds.get("manifest", [])
implementation = "substantial" if interfaces or manifests else "partial"
if not facts:
implementation = "unknown"
lines = [
f"- Status: {status}",
f"- Implementation: {implementation}",
"- Stability: evolving",
"- Usage: internal",
f"- Languages: {', '.join(languages) if languages else 'unknown'}",
f"- Frameworks: {', '.join(frameworks) if frameworks else 'none detected'}",
f"- Tests observed: {len(tests)}",
f"- Interfaces observed: {len(interfaces)}",
f"- Manifests observed: {len(manifests)}",
]
if not facts:
lines.append(NEEDS_INPUT)
return "\n".join(lines)
def _how_it_fits(self, ability_map: dict) -> str:
evidence = [
item
for capability in self._capabilities(ability_map)
for item in capability.get("evidence", [])
]
if not evidence:
return "\n".join(
[
"- Upstream dependencies: " + NEEDS_INPUT,
"- Downstream consumers: " + NEEDS_INPUT,
"- Often used with: " + NEEDS_INPUT,
]
)
refs = ", ".join(
sorted({item.get("reference", "") for item in evidence if item.get("reference")})[:8]
)
return "\n".join(
[
f"- Supported by evidence references: {refs or 'available evidence'}",
"- Upstream dependencies: " + NEEDS_INPUT,
"- Downstream consumers: " + NEEDS_INPUT,
"- Often used with: " + NEEDS_INPUT,
]
)
def _terminology(self, ability_map: dict, facts: list[dict]) -> str:
terms = set()
for item in [ability_map["scope"], *ability_map.get("abilities", [])]:
terms.add(item.get("name", ""))
terms.add(item.get("primary_class", ""))
terms.update(item.get("attributes", []))
for capability in self._capabilities(ability_map):
terms.add(capability.get("name", ""))
terms.add(capability.get("primary_class", ""))
terms.update(capability.get("attributes", []))
for fact in facts:
if fact.get("kind") in {"framework", "llm_provider", "provider_registry"}:
terms.add(fact.get("name", ""))
visible = [term for term in sorted(terms) if term]
if not visible:
return self._curator_stub()
return "\n".join(
[
"- Preferred terms: " + ", ".join(visible[:12]),
"- Also known as: " + NEEDS_INPUT,
"- Potentially confusing terms: " + NEEDS_INPUT,
]
)
def _getting_oriented(self, ability_map: dict, facts: list[dict]) -> str:
paths = self._source_paths(ability_map, facts)
if not paths:
return self._curator_stub()
return "\n".join(
[
f"- Start with: {paths[0]}",
f"- Key files / directories: {', '.join(paths[:8])}",
f"- Entry points: {', '.join(paths[:5])}",
]
)
def _provided_capabilities(self, ability_map: dict) -> str:
capabilities = self._capabilities(ability_map)
if not capabilities:
return f"<!-- No approved capabilities yet. -->\n{NEEDS_INPUT}"
blocks = []
for capability in capabilities:
keywords = self._keywords_for_capability(capability)
blocks.append(
"\n".join(
[
"```capability",
f"type: {self._capability_type(capability.get('primary_class', 'other'))}",
f"title: {capability['name']}",
"description: >",
f" {capability.get('description') or 'Approved repository capability.'}",
f"keywords: [{', '.join(keywords)}]",
"```",
]
)
)
return "\n\n".join(blocks)
def _capabilities(self, ability_map: dict) -> list[dict]:
return [
capability
for ability in ability_map.get("abilities", [])
for capability in ability.get("capabilities", [])
]
def _features(self, ability_map: dict) -> list[dict]:
return [
feature
for capability in self._capabilities(ability_map)
for feature in capability.get("features", [])
]
def _is_usecase_feature(self, feature: dict) -> bool:
labels = {str(feature.get("primary_class", "")).lower()}
labels.update(str(item).lower() for item in feature.get("attributes", []))
return bool(labels & {"business-usecase", "usecase", "workflow", "review"})
def _keywords_for_capability(self, capability: dict) -> list[str]:
keywords = [capability.get("primary_class", "")]
keywords.extend(capability.get("attributes", []))
for feature in capability.get("features", []):
keywords.append(feature.get("primary_class", ""))
keywords.extend(feature.get("attributes", []))
return [self._keyword(item) for item in self._unique(keywords)[:8] if item]
def _capability_type(self, primary_class: str) -> str:
normalized = primary_class.lower()
if normalized in {"api", "infrastructure", "data", "security", "documentation"}:
return normalized
if normalized in {"interface", "integration", "llm-integration"}:
return "api"
if normalized in {"storage", "repository-structure"}:
return "data"
return "other"
def _facts_by_kind(self, facts: list[dict]) -> dict[str, list[dict]]:
grouped: dict[str, list[dict]] = {}
for fact in facts:
grouped.setdefault(fact.get("kind", ""), []).append(fact)
return grouped
def _fact_names(self, facts: list[dict]) -> list[str]:
return self._unique([fact.get("name", "") for fact in facts])
def _source_paths(self, ability_map: dict, facts: list[dict]) -> list[str]:
paths = [fact.get("path", "") for fact in facts if fact.get("path")]
for feature in self._features(ability_map):
paths.append(feature.get("location", ""))
for source_ref in feature.get("source_refs", []):
paths.append(source_ref.get("path", ""))
return self._unique(paths)
def _curator_stub(self) -> str:
return f"- {NEEDS_INPUT}"
def _sentence(self, text: str) -> str:
cleaned = re.sub(r"\s+", " ", text.strip())
if not cleaned:
return ""
return re.split(r"(?<=[.!?])\s+", cleaned, maxsplit=1)[0]
def _slug(self, value: str) -> str:
return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")
def _keyword(self, value: str) -> str:
return self._slug(value) or "other"
def _unique(self, values: list[str]) -> list[str]:
result: list[str] = []
seen: set[str] = set()
for value in values:
item = str(value).strip()
key = item.lower()
if not item or key in seen:
continue
seen.add(key)
result.append(item)
return result


@@ -0,0 +1,184 @@
from __future__ import annotations
import re
from dataclasses import dataclass
from pathlib import Path
from repo_registry.scope.generator import SCOPE_SECTIONS, ScopeGenerator
@dataclass(frozen=True)
class ScopeDiffSection:
section: str
status: str
current_text: str | None
proposed_text: str | None
@dataclass(frozen=True)
class ScopeDiff:
sections: list[ScopeDiffSection]
@property
def needs_update(self) -> bool:
return any(section.status != "ok" for section in self.sections)
@dataclass(frozen=True)
class ScopeValidationIssue:
check: str
severity: str
message: str
@dataclass(frozen=True)
class ValidationResult:
issues: list[ScopeValidationIssue]
@property
def ok(self) -> bool:
return not any(issue.severity == "error" for issue in self.issues)
class ScopeValidator:
    """Validate and diff SCOPE.md files."""

    def __init__(self, generator: ScopeGenerator | None = None) -> None:
        self.generator = generator

    def diff(self, repo_slug: str, existing_path: Path) -> ScopeDiff:
        if self.generator is None:
            raise ValueError("ScopeValidator.diff requires a ScopeGenerator")
        current = existing_path.read_text(encoding="utf-8") if existing_path.exists() else ""
        proposed = self.generator.generate(repo_slug)
        current_sections = self._parse_sections(current)
        proposed_sections = self._parse_sections(proposed)
        sections: list[ScopeDiffSection] = []
        for section in SCOPE_SECTIONS:
            current_text = current_sections.get(section)
            proposed_text = proposed_sections.get(section, "")
            if current_text is None:
                status = "missing"
            elif self._normalize(current_text) == self._normalize(proposed_text):
                status = "ok"
            else:
                status = "stale"
            sections.append(
                ScopeDiffSection(
                    section=section,
                    status=status,
                    current_text=current_text,
                    proposed_text=proposed_text,
                )
            )
        return ScopeDiff(sections=sections)

    def validate(self, path: Path) -> ValidationResult:
        issues: list[ScopeValidationIssue] = []
        if not path.exists():
            return ValidationResult(
                issues=[
                    ScopeValidationIssue(
                        check="C5a",
                        severity="error",
                        message="SCOPE.md is missing.",
                    )
                ]
            )
        content = path.read_text(encoding="utf-8")
        sections = self._parse_sections(content)
        missing = [section for section in SCOPE_SECTIONS if section not in sections]
        if missing:
            severity = "warn" if missing == ["Provided Capabilities"] else "error"
            issues.append(
                ScopeValidationIssue(
                    check="C5b",
                    severity=severity,
                    message=f"Missing SCOPE.md section(s): {', '.join(missing)}.",
                )
            )
        ordered = self._heading_order(content)
        expected_order = [section for section in SCOPE_SECTIONS if section in sections]
        if ordered[: len(expected_order)] != expected_order:
            issues.append(
                ScopeValidationIssue(
                    check="C5b",
                    severity="warn",
                    message="SCOPE.md sections are not in canonical order.",
                )
            )
        capabilities = sections.get("Provided Capabilities")
        if capabilities is None:
            issues.append(
                ScopeValidationIssue(
                    check="C5c",
                    severity="warn",
                    message="Provided Capabilities section is missing.",
                )
            )
        elif "```capability" in capabilities:
            for index, block in enumerate(self._capability_blocks(capabilities), start=1):
                keys = self._capability_keys(block)
                missing_keys = {"type", "title"} - keys
                if missing_keys:
                    issues.append(
                        ScopeValidationIssue(
                            check="C5c",
                            severity="warn",
                            message=(
                                f"Capability block {index} is missing required field(s): "
                                f"{', '.join(sorted(missing_keys))}."
                            ),
                        )
                    )
        elif "No approved capabilities yet" not in capabilities:
            issues.append(
                ScopeValidationIssue(
                    check="C5c",
                    severity="warn",
                    message=(
                        "Provided Capabilities has no capability blocks or explicit "
                        "empty-state note."
                    ),
                )
            )
        return ValidationResult(issues=issues)

    def _parse_sections(self, content: str) -> dict[str, str]:
        matches = list(re.finditer(r"^##\s+(.+?)\s*$", content, re.MULTILINE))
        sections: dict[str, str] = {}
        for index, match in enumerate(matches):
            title = match.group(1).strip()
            start = match.end()
            end = matches[index + 1].start() if index + 1 < len(matches) else len(content)
            body = content[start:end]
            body = re.sub(r"\n---\s*$", "", body.strip())
            sections[title] = body.strip()
        return sections

    def _heading_order(self, content: str) -> list[str]:
        return [
            match.group(1).strip()
            for match in re.finditer(r"^##\s+(.+?)\s*$", content, re.MULTILINE)
            if match.group(1).strip() in SCOPE_SECTIONS
        ]

    def _normalize(self, value: str | None) -> str:
        if value is None:
            return ""
        without_comments = re.sub(r"<!--.*?-->", "", value, flags=re.DOTALL)
        without_markdown = re.sub(r"[`*_>#-]+", " ", without_comments)
        return re.sub(r"\s+", " ", without_markdown).strip().lower()

    def _capability_blocks(self, content: str) -> list[str]:
        return re.findall(
            r"```capability\s*(.*?)```",
            content,
            flags=re.DOTALL | re.IGNORECASE,
        )

    def _capability_keys(self, block: str) -> set[str]:
        return {
            match.group(1)
            for match in re.finditer(r"^([A-Za-z_][A-Za-z0-9_-]*):", block, re.MULTILINE)
        }

@@ -1,8 +1,11 @@
from __future__ import annotations
import logging
import json
from dataclasses import asdict
from pathlib import Path
from urllib.error import HTTPError, URLError
from urllib.request import urlopen
from fastapi import Depends, FastAPI, HTTPException, Query
from fastapi.responses import PlainTextResponse
@@ -13,6 +16,7 @@ from repo_registry.core.service import RegistryService
from repo_registry.llm_extraction import LLMCandidateExtractor, create_llm_connect_adapter
from repo_registry.repo_ingestion.git import GitIngestionService
from repo_registry.semantic import HashingEmbeddingProvider
from repo_registry.scope import ScopeGenerator, ScopeValidator
from repo_registry.storage.sqlite import NotFoundError, RegistryStore
from repo_registry.web_api.schemas import (
    AbilityCreate,
@@ -58,6 +62,12 @@ from repo_registry.web_api.schemas import (
)
def slugify(value: str) -> str:
    import re

    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_prefix="REPO_REGISTRY_")
@@ -67,6 +77,7 @@ class Settings(BaseSettings):
    llm_provider: str | None = Field(default=None)
    llm_model: str | None = Field(default=None)
    embedding_provider: str | None = Field(default=None)
    state_hub_base_url: str = Field(default="http://127.0.0.1:8000")
    log_level: str = Field(default="INFO")
@@ -111,6 +122,7 @@ OPENAPI_TAGS = [
    {"name": "analysis", "description": "Repository scans and extracted review inputs."},
    {"name": "review", "description": "Candidate graph approval and correction workflow."},
    {"name": "registry", "description": "Approved ability maps and manual registry CRUD."},
    {"name": "scope", "description": "SCOPE.md generation, diffing, and writing."},
    {"name": "search", "description": "Agent-facing discovery endpoints."},
    {"name": "discovery", "description": "Comparison, gap analysis, and export helpers."},
]
@@ -1120,6 +1132,144 @@ def export_repository_registry_entry(
    return PlainTextResponse(content, media_type="application/x-yaml")


@app.get(
    "/repos/{repo_slug}/scope",
    tags=["scope"],
    response_class=PlainTextResponse,
    responses={
        200: {
            "content": {"text/markdown": {}},
            "description": "Generated SCOPE.md preview from approved characteristics.",
        }
    },
)
def generate_repository_scope(
    repo_slug: str,
    service: RegistryService = Depends(get_service),
) -> PlainTextResponse:
    try:
        ensure_scope_generation_ready(service, repo_slug)
        content = ScopeGenerator(service).generate(repo_slug)
    except NotFoundError as exc:
        raise HTTPException(status_code=404, detail=str(exc)) from exc
    return PlainTextResponse(content, media_type="text/markdown")


@app.get(
    "/repos/{repo_slug}/scope/diff",
    tags=["scope"],
)
def diff_repository_scope(
    repo_slug: str,
    service: RegistryService = Depends(get_service),
    settings: Settings = Depends(get_settings),
) -> dict[str, object]:
    try:
        repository = ensure_scope_generation_ready(service, repo_slug)
        scope_path = scope_file_path(service, repository, repo_slug, settings)
        diff = ScopeValidator(ScopeGenerator(service)).diff(repo_slug, scope_path)
    except NotFoundError as exc:
        raise HTTPException(status_code=404, detail=str(exc)) from exc
    except ValueError as exc:
        raise HTTPException(status_code=409, detail=str(exc)) from exc
    return {
        "sections": [asdict(section) for section in diff.sections],
        "needs_update": diff.needs_update,
    }


@app.post(
    "/repos/{repo_slug}/scope/write",
    tags=["scope"],
)
def write_repository_scope(
    repo_slug: str,
    service: RegistryService = Depends(get_service),
    settings: Settings = Depends(get_settings),
) -> dict[str, object]:
    try:
        repository = ensure_scope_generation_ready(service, repo_slug)
        scope_path = scope_file_path(service, repository, repo_slug, settings)
        content = ScopeGenerator(service).generate(repo_slug)
    except NotFoundError as exc:
        raise HTTPException(status_code=404, detail=str(exc)) from exc
    except ValueError as exc:
        raise HTTPException(status_code=409, detail=str(exc)) from exc
    scope_path.write_text(content, encoding="utf-8")
    return {"written": True, "path": str(scope_path)}


def ensure_scope_generation_ready(
    service: RegistryService,
    repo_slug: str,
):
    repository = repository_by_slug(service, repo_slug)
    ability_map = service.ability_map(repository.id)
    if not ability_map.abilities:
        raise NotFoundError(
            f"repository {repo_slug!r} has no approved characteristics"
        )
    return repository


def repository_by_slug(service: RegistryService, repo_slug: str):
    wanted = slugify(repo_slug)
    for repository in service.list_repositories():
        candidates = {
            slugify(repository.name),
            slugify(repository.url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")),
        }
        if wanted in candidates:
            return repository
    raise NotFoundError(f"repository slug {repo_slug!r} was not found")


def scope_file_path(
    service: RegistryService,
    repository,
    repo_slug: str,
    settings: Settings,
) -> Path:
    state_hub_path = state_hub_scope_file_path(repo_slug, settings)
    if state_hub_path is not None:
        return state_hub_path
    source_path = Path(repository.url)
    if source_path.exists() and source_path.is_dir():
        return source_path / "SCOPE.md"
    checkout = service.ingestion.cached_checkout(repository.url)
    if checkout is not None and checkout.source_path.exists():
        return checkout.source_path / "SCOPE.md"
    raise ValueError(
        "repository has no known local checkout path on this host"
    )


def state_hub_scope_file_path(repo_slug: str, settings: Settings) -> Path | None:
    base_url = settings.state_hub_base_url.rstrip("/")
    if not base_url:
        return None
    try:
        with urlopen(f"{base_url}/repos/{repo_slug}/", timeout=2) as response:
            repo = json.loads(response.read().decode("utf-8"))
    except HTTPError as exc:
        if exc.code == 404:
            return None
        raise ValueError("state hub repository path lookup failed") from exc
    except (URLError, TimeoutError, OSError, json.JSONDecodeError):
        return None
    local_path = repo.get("local_path")
    if not local_path:
        raise ValueError(
            f"state hub repo {repo_slug!r} has no local path on this host"
        )
    path = Path(local_path)
    if path.exists() and path.is_dir():
        return path / "SCOPE.md"
    raise ValueError(
        f"state hub local path for repo {repo_slug!r} is not available: {path}"
    )


@app.get(
    "/repository-comparisons",
    tags=["discovery"],

@@ -0,0 +1,138 @@
from repo_registry.core.service import RegistryService
from repo_registry.repo_ingestion.git import GitIngestionService
from repo_registry.scope.generator import SCOPE_SECTIONS, ScopeGenerator
from repo_registry.scope.validator import ScopeValidator
from repo_registry.storage.sqlite import RegistryStore


def make_service(tmp_path):
    store = RegistryStore(tmp_path / "registry.sqlite3")
    store.initialize()
    return RegistryService(store, ingestion=GitIngestionService(tmp_path / "checkouts"))


def test_scope_generator_renders_canonical_sections_and_capability_blocks(tmp_path):
    service = make_service(tmp_path)
    repository = service.register_repository(
        name="Repo Registry",
        url="https://example.test/coulomb/repo-registry.git",
        description="Generates repository scope files from approved characteristics.",
    )
    service.update_scope(
        repository.id,
        name="Repo Scoping",
        description="Generates and validates SCOPE.md files for registered repositories.",
        confidence=0.95,
    )
    ability_id = service.add_ability(
        repository.id,
        name="Maintain Repository Scope",
        description="Keeps repository utility and boundaries understandable.",
        primary_class="repository-intelligence",
        attributes=["scope", "capability-mapping"],
    )
    capability_id = service.add_capability(
        repository.id,
        ability_id,
        name="Generate SCOPE.md",
        description="Renders SCOPE.md from approved repository characteristics.",
        primary_class="api",
        attributes=["scope", "generation"],
    )
    service.add_feature(
        repository.id,
        capability_id,
        name="Preview generated SCOPE.md",
        type="business-usecase",
        primary_class="business-usecase",
        attributes=["scope", "preview"],
        location="src/repo_registry/scope/generator.py",
    )
    content = ScopeGenerator(service).generate("repo-registry")
    assert content.startswith("# SCOPE\n")
    for section in SCOPE_SECTIONS:
        assert f"## {section}" in content
    assert "Generates and validates SCOPE.md files" in content
    assert "Maintain Repository Scope" in content
    assert "Preview generated SCOPE.md" in content
    assert "src/repo_registry/scope/generator.py" in content
    assert "```capability" in content
    assert "type: api" in content
    assert "title: Generate SCOPE.md" in content
    assert "keywords: [api, scope, generation, business-usecase, preview]" in content


def test_scope_generator_marks_missing_curator_owned_sections(tmp_path):
    service = make_service(tmp_path)
    service.register_repository(
        name="Sparse Repo",
        url="https://example.test/sparse.git",
        description="Sparse repo.",
    )
    content = ScopeGenerator(service).generate("sparse")
    assert "## Out of Scope" in content
    assert "<!-- needs curator input -->" in content
    assert "<!-- No approved capabilities yet. -->" in content


def test_scope_validator_validates_generated_scope_and_diffs_sections(tmp_path):
    service = make_service(tmp_path)
    repository = service.register_repository(
        name="Validator Repo",
        url="https://example.test/validator-repo.git",
        description="Validates generated scope files.",
    )
    ability_id = service.add_ability(repository.id, name="Validate Scope Files")
    service.add_capability(
        repository.id,
        ability_id,
        name="Diff SCOPE.md",
        description="Compares generated and existing scope sections.",
        primary_class="api",
        attributes=["scope", "diff"],
    )
    generator = ScopeGenerator(service)
    validator = ScopeValidator(generator)
    path = tmp_path / "SCOPE.md"
    path.write_text(generator.generate("validator-repo"), encoding="utf-8")
    validation = validator.validate(path)
    diff = validator.diff("validator-repo", path)
    assert validation.ok
    assert validation.issues == []
    assert not diff.needs_update
    assert {section.status for section in diff.sections} == {"ok"}
    path.write_text(
        path.read_text(encoding="utf-8").replace("## Core Idea", "## Core Thought"),
        encoding="utf-8",
    )
    diff = validator.diff("validator-repo", path)
    assert diff.needs_update
    assert next(section for section in diff.sections if section.section == "Core Idea").status == "missing"


def test_scope_validator_warns_when_provided_capabilities_section_is_missing(tmp_path):
    path = tmp_path / "SCOPE.md"
    path.write_text(
        "\n\n".join(
            f"## {section}\n\nplaceholder"
            for section in SCOPE_SECTIONS
            if section != "Provided Capabilities"
        ),
        encoding="utf-8",
    )
    result = ScopeValidator().validate(path)
    assert any(
        issue.check == "C5c"
        and issue.severity == "warn"
        and "Provided Capabilities" in issue.message
        for issue in result.issues
    )

@@ -18,6 +18,7 @@ def test_openapi_groups_agent_facing_endpoints():
"analysis",
"review",
"registry",
"scope",
"search",
"discovery",
}
@@ -252,6 +253,15 @@ def test_openapi_contract_snapshot_for_stable_agent_paths():
"/repos/{repository_id}/export": {
"get": {"tags": ["discovery"], "success_schema": "application/x-yaml"}
},
"/repos/{repo_slug}/scope": {
"get": {"tags": ["scope"], "success_schema": None}
},
"/repos/{repo_slug}/scope/diff": {
"get": {"tags": ["scope"], "success_schema": "object"}
},
"/repos/{repo_slug}/scope/write": {
"post": {"tags": ["scope"], "success_schema": "object"}
},
"/repos/{repository_id}/expectation-gaps": {
"get": {"tags": ["review"], "success_schema": "list[ExpectationGapResponse]"},
"post": {"tags": ["review"], "success_schema": "ExpectationGapResponse"},
@@ -455,6 +465,100 @@ def test_api_manual_registry_loop(tmp_path):
        app.dependency_overrides.clear()


def test_api_generates_diffs_and_writes_scope_md(tmp_path):
    source = tmp_path / "scope-repo"
    source.mkdir()

    def override_settings():
        return Settings(
            database_path=str(tmp_path / "scope-api.sqlite3"),
            checkout_root=str(tmp_path / "checkouts"),
        )

    app.dependency_overrides[get_settings] = override_settings
    client = TestClient(app)
    try:
        repository = client.post(
            "/repos",
            json={
                "name": "Scope Repo",
                "url": str(source),
                "description": "Generates SCOPE.md through the API.",
            },
        ).json()
        ability_id = client.post(
            f"/repos/{repository['id']}/abilities",
            json={
                "name": "Maintain Repository Scope",
                "description": "Keeps repository utility understandable.",
            },
        ).json()["id"]
        client.post(
            f"/repos/{repository['id']}/capabilities",
            json={
                "ability_id": ability_id,
                "name": "Generate SCOPE.md",
                "description": "Renders SCOPE.md from approved characteristics.",
                "primary_class": "api",
                "attributes": ["scope", "generation"],
            },
        )
        preview = client.get("/repos/scope-repo/scope")
        assert preview.status_code == 200
        assert preview.headers["content-type"].startswith("text/markdown")
        assert "# SCOPE" in preview.text
        assert "title: Generate SCOPE.md" in preview.text
        diff = client.get("/repos/scope-repo/scope/diff")
        assert diff.status_code == 200
        assert diff.json()["needs_update"] is True
        assert {section["status"] for section in diff.json()["sections"]} == {"missing"}
        write = client.post("/repos/scope-repo/scope/write")
        assert write.status_code == 200
        assert write.json() == {"written": True, "path": str(source / "SCOPE.md")}
        assert (source / "SCOPE.md").read_text(encoding="utf-8").startswith("# SCOPE")
        current = client.get("/repos/scope-repo/scope/diff")
        assert current.status_code == 200
        assert current.json()["needs_update"] is False
        assert {section["status"] for section in current.json()["sections"]} == {"ok"}
        empty = client.post(
            "/repos",
            json={
                "name": "Empty Scope",
                "url": "https://example.test/empty-scope.git",
                "description": "No approved characteristics yet.",
            },
        ).json()
        assert client.get("/repos/empty-scope/scope").status_code == 404
        remote = client.post(
            "/repos",
            json={
                "name": "Remote Scope",
                "url": "https://example.test/remote-scope.git",
                "description": "Has no known local checkout path.",
            },
        ).json()
        remote_ability = client.post(
            f"/repos/{remote['id']}/abilities",
            json={"name": "Remote Scope Generation"},
        ).json()["id"]
        client.post(
            f"/repos/{remote['id']}/capabilities",
            json={
                "ability_id": remote_ability,
                "name": "Generate Remote SCOPE.md",
            },
        )
        assert client.post("/repos/remote-scope/scope/write").status_code == 409
    finally:
        app.dependency_overrides.clear()


def test_api_compare_gap_and_export_use_cases(tmp_path):
    def override_settings():
        return Settings(

@@ -4,7 +4,7 @@ type: workplan
title: "SCOPE.md Generation Feature"
domain: capabilities
repo: repo-registry
status: todo
status: done
owner: codex
topic_slug: foerster-capabilities
created: "2026-04-30"
@@ -37,7 +37,7 @@ Unblocks: RREG-WP-0006
```task
id: RREG-WP-0005-T01
status: todo
status: done
priority: high
state_hub_task_id: "83154aae-dd06-4329-8df6-3906b2bf0f14"
```
@@ -73,11 +73,17 @@ Acceptance: `docs/scope-md-spec.md` exists, covers all 11 sections with
explicit characteristic-to-section mappings, and is consistent with the
existing template at `state-hub/scripts/project_rules/scope.template`.
Implementation note 2026-04-30: `docs/scope-md-spec.md` now owns the reference
specification. It maps the current Custodian template headings to the
Scope/Ability/Capability/Feature/Evidence/Facts model, documents generated vs.
curator-owned sections, preserves the existing capability block format, and
cross-references the characteristic/evidence and classification strategy docs.
## T02: Build SCOPE.md generator
```task
id: RREG-WP-0005-T02
status: todo
status: done
priority: high
state_hub_task_id: "39feb7ea-72ca-4d99-8094-b006df605dbe"
```
@@ -109,11 +115,17 @@ valid SCOPE.md; all 11 sections are present; the `## Provided Capabilities`
section contains parseable capability blocks; the output passes the C5b/C5c
checks defined in CUST-WP-0034-T01.
Implementation note 2026-04-30: `repo_registry.scope.ScopeGenerator` now renders
SCOPE.md from approved repository scope, abilities, capabilities, features, facts,
support evidence, and classification metadata. It preserves the current template
headings, emits curator-input stubs for missing data, and renders approved
capabilities as parseable `capability` blocks.
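
As a rough illustration of the `capability` block format mentioned in the note above, the following minimal sketch renders one block from a plain dataclass. `CapabilitySketch` and `render_capability_block` are hypothetical names invented for this example and are not the `ScopeGenerator` API; only the `type`, `title`, and `keywords` field names are taken from the test expectations in this commit.

```python
from dataclasses import dataclass, field

# Built at runtime so this example can itself sit inside a Markdown fence.
FENCE = "`" * 3


@dataclass(frozen=True)
class CapabilitySketch:
    """Hypothetical stand-in for an approved capability record."""

    type: str
    title: str
    keywords: list[str] = field(default_factory=list)


def render_capability_block(capability: CapabilitySketch) -> str:
    """Render one capability as a fenced `capability` block (illustrative only)."""
    lines = [
        f"{FENCE}capability",
        f"type: {capability.type}",
        f"title: {capability.title}",
        f"keywords: [{', '.join(capability.keywords)}]",
        FENCE,
    ]
    return "\n".join(lines)


block = render_capability_block(
    CapabilitySketch(
        type="api",
        title="Generate SCOPE.md",
        keywords=["api", "scope", "generation"],
    )
)
print(block)
```

A block rendered this way carries the `type` and `title` keys that the validator in this commit treats as required, so it should not trigger the C5c missing-field warning.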
## T03: Build SCOPE.md validator and differ
```task
id: RREG-WP-0005-T03
status: todo
status: done
priority: high
state_hub_task_id: "0c9c1347-368a-4657-a039-ae143a6500bd"
```
@@ -137,11 +149,15 @@ Acceptance: `ScopeValidator.diff("repo-registry", Path("SCOPE.md"))` returns
a diff with at least some `ok` sections and surfaces any real gaps; the
validator catches a missing `## Provided Capabilities` section as a `warn`.
Implementation note 2026-04-30: `repo_registry.scope.ScopeValidator` now validates
C5a/C5b/C5c-style SCOPE.md structure, parses capability blocks, and produces
section-aware diffs against freshly generated content.
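
The section-aware diff described in the note above can be sketched in isolation roughly as follows. This is a simplified illustration, not the `ScopeValidator` code: `parse_sections` and `section_status` are hypothetical helper names, and the comparison normalizes whitespace only, whereas the validator in this commit also strips HTML comments and Markdown punctuation before comparing.

```python
import re

HEADING = re.compile(r"^##\s+(.+?)\s*$", re.MULTILINE)


def parse_sections(content: str) -> dict[str, str]:
    """Split Markdown into {heading: body} using level-2 headings."""
    matches = list(HEADING.finditer(content))
    sections: dict[str, str] = {}
    for index, match in enumerate(matches):
        end = matches[index + 1].start() if index + 1 < len(matches) else len(content)
        sections[match.group(1).strip()] = content[match.end():end].strip()
    return sections


def section_status(current: str, proposed: str, expected: list[str]) -> dict[str, str]:
    """Classify each expected section as "ok", "stale", or "missing"."""
    current_sections = parse_sections(current)
    proposed_sections = parse_sections(proposed)
    statuses: dict[str, str] = {}
    for name in expected:
        if name not in current_sections:
            statuses[name] = "missing"
        # Whitespace-insensitive comparison (a simplification for this sketch).
        elif current_sections[name].split() == proposed_sections.get(name, "").split():
            statuses[name] = "ok"
        else:
            statuses[name] = "stale"
    return statuses


existing = "## Core Idea\n\nOld text.\n\n## Out of Scope\n\nNothing yet.\n"
generated = (
    "## Core Idea\n\nNew text.\n\n"
    "## Out of Scope\n\nNothing   yet.\n\n"
    "## Provided Capabilities\n\nNone.\n"
)
statuses = section_status(
    existing, generated, ["Core Idea", "Out of Scope", "Provided Capabilities"]
)
print(statuses)
```

In this example the reworded section is reported as `stale`, the section that differs only in spacing as `ok`, and the section absent from the existing file as `missing`.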
## T04: API endpoints
```task
id: RREG-WP-0005-T04
status: todo
status: done
priority: high
state_hub_task_id: "a2d1937b-f9e2-480e-8e28-1c12837e1b23"
```
@@ -164,11 +180,17 @@ Acceptance: `GET /repos/repo-registry/scope` returns valid Markdown; `GET
/repos/repo-registry/scope/diff` returns a diff JSON; a `POST` write succeeds
and the written file passes the validator.
Implementation note 2026-04-30: Added `/repos/{repo_slug}/scope`,
`/repos/{repo_slug}/scope/diff`, and `/repos/{repo_slug}/scope/write` API
endpoints. The endpoints resolve registered repositories by slug, require
approved characteristics, use a local repository path or cached checkout for
diff/write operations, and return 409 when no local path is available.
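
The path fallback described in the note above (try a local repository path, then a cached checkout, otherwise report a conflict) can be sketched without any web framework as shown below. `resolve_scope_path` and `NoLocalPathError` are hypothetical names for this illustration; the endpoint code in this commit additionally consults the State Hub first and raises `ValueError`, which the API layer maps to HTTP 409.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


class NoLocalPathError(ValueError):
    """Hypothetical error that a web layer could translate into HTTP 409."""


def resolve_scope_path(candidates: list[Path | None]) -> Path:
    """Return SCOPE.md inside the first candidate that is an existing directory."""
    for candidate in candidates:
        if candidate is not None and candidate.is_dir():
            return candidate / "SCOPE.md"
    raise NoLocalPathError("repository has no known local checkout path on this host")


with tempfile.TemporaryDirectory() as tmp:
    checkout = Path(tmp) / "cached-checkout"
    checkout.mkdir()

    # The registered source path does not exist, so the cached checkout wins.
    resolved = resolve_scope_path([Path(tmp) / "missing-source", checkout])
    resolved_name = resolved.name
    resolved_parent_name = resolved.parent.name

    # With no usable candidate, the caller gets an error instead of a path.
    try:
        resolve_scope_path([None, Path(tmp) / "also-missing"])
        conflict = False
    except NoLocalPathError:
        conflict = True
```

Keeping the candidates in an ordered list makes the precedence explicit, so adding another source for the path would only mean adding another entry at the appropriate position.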
## T05: Register capabilities in custodian
```task
id: RREG-WP-0005-T05
status: todo
status: done
priority: medium
state_hub_task_id: "e1bd4a4f-3d9a-4384-a254-ef75bd9905b9"
```
@@ -200,3 +222,9 @@ CUST-WP-0034-T03 will then resolve correctly.
Acceptance: `list_capabilities()` in the state-hub MCP returns `scope.generate`
and `scope.update` with `provider_repo: repo-registry`; `request_capability`
with either key resolves without routing error.
Implementation note 2026-04-30: Added `scope.generate` and `scope.update`
capability blocks to `SCOPE.md`, then ingested them with the State Hub
capability ingestion script using `/home/worsch/repo-registry` as the explicit
repo path. The State Hub catalog created `api/scope.generate` and
`api/scope.update` entries for `repo-registry`.