diff --git a/AGENTS.md b/AGENTS.md
index 2575833..39b26cb 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,4 +1,4 @@
-# repo-registry — Agent Instructions
+# repo-scoping — Agent Instructions
 ## Repo Identity
@@ -8,7 +8,7 @@ scanners establish observed facts; LLM-assisted extractors propose interpreted
 claims; humans or trusted agents approve registry truth.
 **Domain:** capabilities
-**Repo slug:** repo-registry
+**Repo slug:** repo-scoping
 **Topic ID:** `64418556-3206-457a-ba29-6884b5b12cf3`
 **Workplan prefix:** `RREG-WP-`
@@ -33,7 +33,7 @@ curl -s "http://127.0.0.1:8000/workstreams/?topic_id=64418556-3206-457a-ba29-688
 curl -s "http://127.0.0.1:8000/tasks/?status=todo" | python3 -m json.tool
 # Check inbox
-curl -s "http://127.0.0.1:8000/messages/?to_agent=repo-registry&unread_only=true" \
+curl -s "http://127.0.0.1:8000/messages/?to_agent=repo-scoping&unread_only=true" \
   | python3 -m json.tool
 ```
@@ -79,7 +79,7 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/" \
 **Start:**
 1. `ls workplans/` — note active workplans and their open tasks
-2. Check inbox via `GET /messages/?to_agent=repo-registry&unread_only=true`
+2. Check inbox via `GET /messages/?to_agent=repo-scoping&unread_only=true`
 3. Check for human-flagged tasks: `GET /tasks/?needs_human=true`
 **During work:**
@@ -92,7 +92,7 @@ curl -s -X PATCH "http://127.0.0.1:8000/tasks/" \
 3. If workplan files changed, sync them to the hub DB:
 ```bash
-curl -s -X POST "http://127.0.0.1:8000/repos/repo-registry/sync" | python3 -m json.tool
+curl -s -X POST "http://127.0.0.1:8000/repos/repo-scoping/sync" | python3 -m json.tool
 ```
 This runs the ADR-001 consistency check with `--fix` and returns a JSON report.
@@ -116,7 +116,7 @@ id: RREG-WP-NNNN
 type: workplan
 title: "..."
 domain: capabilities
-repo: repo-registry
+repo: repo-scoping
 status: active | done
 owner: codex
 topic_slug: foerster-capabilities
diff --git a/SCOPE.md b/SCOPE.md
index d10dd10..86e8261 100644
--- a/SCOPE.md
+++ b/SCOPE.md
@@ -1,48 +1,175 @@
 ---
 domain: capabilities
-repo: repo-registry
-updated: "2026-04-26"
+repo: repo-scoping
+updated: "2026-04-30"
 ---
-# repo-registry — Scope
+# SCOPE
-## Purpose
+> This file helps you quickly understand what this repository is about,
+> when it is relevant, and when it is not.
+> It is curated from repo-scoping's approved characteristics and operating role.
-Repository Ability Registry. Turns Git repositories into reviewable, source-linked
-maps of `Ability → Capability → Feature → Evidence`.
+---
-## Core Design Principle
+## One-liner
-```
-deterministic scanners → observed facts (file paths, languages, API routes, …)
-LLM-assisted extractors → interpreted claims (ability names, descriptions, links)
-human / agent review → approved registry truth
+repo-scoping turns Git repositories into reviewable, source-linked scope maps and
+maintains SCOPE.md files as the human and agent entry point to repository utility.
+
+---
+
+## Core Idea
+
+repo-scoping models a repository as a hierarchy of characteristics:
+`Scope -> Ability -> Capability -> Feature -> Evidence -> Observed Fact`.
+
+Deterministic scanners establish observed facts from repository content. Optional
+LLM-assisted extraction proposes interpreted candidates. Humans or trusted agents
+approve the resulting characteristics before they become registry truth.
+
+The primary output is a useful repository scope profile: what a repo is for, when
+to use it, what capabilities it provides, and which facts or lower-level
+characteristics support those claims.
+
+---
+
+## In Scope
+
+- Register repositories and keep metadata, analysis runs, facts, candidates, and
+  approved characteristics together.
+- Analyze repositories with deterministic scanners and optional LLM-assisted
+  candidate extraction.
+- Review, edit, approve, reject, merge, and relink candidate abilities,
+  capabilities, features, and evidence.
+- Search and compare approved repository characteristics.
+- Generate, diff, validate, and write SCOPE.md files from approved
+  characteristics.
+- Support the Custodian State Hub by acting as the provider for scope generation
+  and update capabilities.
+
+---
+
+## Out of Scope
+
+- Owning the Custodian State Hub, its database, or cross-domain governance rules.
+- Making unreviewed truth claims canonical without approval.
+- Replacing human product judgment for curator-owned scope sections.
+- Continuous Git hosting automation, deployment infrastructure, or access-control
+  policy beyond repository ingestion needs.
+- Full static code understanding across every language and framework.
+
+---
+
+## Relevant When
+
+- You need to understand what a repository is useful for without reading the whole
+  codebase first.
+- You want a source-linked map from high-level repository scope down to observed
+  implementation facts.
+- You need to review generated candidate abilities, capabilities, features, and
+  evidence before approving them.
+- You need to create or refresh a SCOPE.md for a registered repository.
+- You need to compare repositories by approved characteristics or find capability
+  gaps across a domain.
+
+---
+
+## Not Relevant When
+
+- You only need raw Git hosting, CI, deployment, or issue tracking.
+- You need a fully autonomous ontology without human review.
+- The repository has not been registered or analyzed and no approved
+  characteristics exist yet.
+- The needed decision is curator-owned product positioning rather than
+  source-observable repository behavior.
+
+---
+
+## Current State
+
+- Status: active and evolving.
+- Implementation: FastAPI service with SQLite development storage, deterministic
+  Git scanning, candidate graph review workflow, search, comparison, and SCOPE.md
+  generation endpoints.
+- LLM assistance: optional; deterministic non-LLM behavior remains a first-class
+  path for continued optimization.
+- UI: available for repository registration, analysis runs, candidate review, and
+  characteristic navigation.
+- Integration: registered in the Custodian State Hub as `repo-scoping`.
+
+---
+
+## How It Fits
+
+- Upstream coordination: the Custodian State Hub owns workstream/task state,
+  managed repository records, and capability routing.
+- Downstream consumers: Custodian agents and humans use repo-scoping to inspect,
+  refine, and refresh repository utility profiles.
+- Often used with: `llm-connect` for optional LLM-assisted extraction and
+  `the-custodian` for state, routing, and domain coordination.
+
+---
+
+## Terminology
+
+- Preferred terms: scope, ability, capability, feature, evidence, observed fact,
+  characteristic, candidate, approved characteristic, SCOPE.md.
+- Also known as: repository scoping service, repository ability registry.
+- Potentially confusing terms: evidence is not just a raw fact; it is support for
+  a characteristic and may reference facts or lower-level characteristics.
+  Candidates are proposed claims awaiting review; approved characteristics are
+  canonical registry truth.
+
+---
+
+## Related / Overlapping Repositories
+
+- `the-custodian` - coordination layer, State Hub, workplans, and capability
+  catalog.
+- `llm-connect` - optional provider abstraction for LLM-assisted extraction.
+- `markitect` / `markitect-project` - content and documentation platform with
+  related scope-document needs.
+
+---
+
+## Getting Oriented
+
+- Start with: `README.md`, `AGENTS.md`, and this `SCOPE.md`.
+- Key files / directories: `src/repo_registry/web_api/app.py`,
+  `src/repo_registry/core/service.py`, `src/repo_registry/scope/`,
+  `src/repo_registry/candidate_graph/`, `src/repo_registry/repo_scanning/`,
+  `docs/scope-md-spec.md`, and `workplans/`.
+- Entry points: `uvicorn repo_registry.web_api.app:app --reload`, the `/ui`
+  routes, and the `/repos/{repo_slug}/scope*` API endpoints.
+
+---
+
+## Provided Capabilities
+
+```capability
+type: api
+title: scope.generate
+description: >
+  Generates a SCOPE.md from scratch for a registered repo using its approved
+  characteristics profile (abilities, capabilities, features, facts).
+keywords: [scope, scope-md, generation, repository-utility]
 ```
-Approved entries are always explicit, reviewable, and source-linked. The system
-never publishes unapproved claims as canonical truth.
+```capability
+type: api
+title: scope.update
+description: >
+  Diffs an existing SCOPE.md against the current characteristics profile
+  and returns or writes an updated version.
+keywords: [scope, scope-md, update, diff, staleness]
+```
-## In Scope (v0.1)
+---
-- Repository registration by Git URL
-- Deterministic repository scan (file tree, languages, frameworks, API/CLI surface)
-- Candidate extraction for abilities, capabilities, features, and evidence
-- Human review workflow: edit, approve, reject, merge, relink
-- Natural-language and semantic search over approved registry entries
-- REST API for repositories, ability maps, capabilities, and search
+## Notes
-## Out of Scope (v0.1)
-
-- Continuous GitHub App integration
-- Full static code understanding (AST/type analysis)
-- Advanced ontology enforcement
-- Distributed indexing
-- Benchmark execution
-- Marketplace or commercial features
-- Complex access control
-- Automated truth claims without review
-
-## Domain Context
-
-Part of the **capabilities** domain — systematic modeling of abilities, capabilities,
-and features across the Custodian ecosystem. First registered repo in this domain.
+- The local checkout path is still `/home/worsch/repo-registry`; the canonical
+  State Hub slug and Git remote are now `repo-scoping`.
+- Ecosystem-wide SCOPE.md refresh is blocked until Custodian C5b/C5c checks are
+  active and more managed repos have approved characteristics in repo-scoping.
diff --git a/docs/scope-md-spec.md b/docs/scope-md-spec.md
new file mode 100644
index 0000000..77a5322
--- /dev/null
+++ b/docs/scope-md-spec.md
@@ -0,0 +1,292 @@
+# SCOPE.md Reference Specification
+
+`SCOPE.md` is the human- and agent-facing boundary definition for a repository.
+It answers, quickly and concretely, what the repository is for, when it is useful,
+where it fits, and what capabilities it can provide.
+
+Repo-registry is the source of truth for generating and validating `SCOPE.md`
+because its approved characteristic model already captures the same structure:
+
+```text
+Scope -> Ability -> Capability -> Feature -> Evidence -> Observed Fact
+```
+
+This specification supersedes the Custodian dashboard reference at
+`state-hub/dashboard/src/docs/scope.md`. The scaffold template remains at
+`state-hub/scripts/project_rules/scope.template`; this document defines how
+repo-registry should generate, validate, and update that file.
+
+Related model docs:
+
+- `docs/characteristic-evidence-model.md`
+- `docs/classification-strategy.md`
+
+## Purpose
+
+`SCOPE.md` is not a README, architecture document, or marketing page. It is a
+short orientation artifact for deciding whether a repo is relevant before reading
+its code in depth.
+
+It should answer:
+
+- What is this repository for?
+- Should I care about it right now?
+- When is it relevant to my work?
+- Where does it fit in the ecosystem?
+- Is it mature enough to trust or reuse?
+- Does it overlap with something else?
+- What capabilities can it provide to other domains?
+
+## Canonical Template
+
+The historical Custodian reference calls this an "11-section template". The
+current `scope.template` contains twelve functional sections plus an optional
+`Notes` tail. Repo-registry should preserve the current template headings for
+compatibility and treat `Notes` as curator-owned free text.
+
+Generated files must contain these sections, in this order:
+
+| Section | Source in repo-registry | Generation ownership |
+|---------|--------------------------|----------------------|
+| `## One-liner` | Scope name plus scope description | generated, curator-reviewed |
+| `## Core Idea` | Scope description and top approved abilities | generated, curator-reviewed |
+| `## In Scope` | Approved abilities and high-confidence capabilities | generated, curator-reviewed |
+| `## Out of Scope` | Abilities or expectation gaps classified as exclusions | curator-owned unless explicitly modeled |
+| `## Relevant When` | Approved features with `primary_class: business-usecase` or `attributes` including use-case labels | generated, curator-reviewed |
+| `## Not Relevant When` | Negative use-case expectation gaps or curator exclusions | curator-owned unless explicitly modeled |
+| `## Current State` | Observed facts aggregated by scanner: status, language, framework, tests, routes, docs, manifests | generated |
+| `## How It Fits` | Evidence/support references to other characteristics or repos; dependency facts | generated, curator-reviewed |
+| `## Terminology` | Domain term facts, names, aliases, and classification labels | generated, curator-reviewed |
+| `## Related / Overlapping Repositories` | Cross-repo support references and comparison/discovery data | generated when known, curator-reviewed |
+| `## Getting Oriented` | Source refs, content chunks, key files, entry points, docs, tests | generated |
+| `## Provided Capabilities` | Approved capability characteristics rendered as machine-readable `capability` blocks | generated, file-origin truth |
+| `## Notes` | Human-maintained remarks that do not fit the structured sections | curator-owned |
+
+When a generated section has insufficient data, emit a short stub plus:
+
+```markdown
+
+```
+
+This makes gaps visible without pretending the scanner knows more than it does.
+
+## Section Mapping Details
+
+### One-liner
+
+Use the approved repository `Scope` as the root characteristic. Prefer a single
+sentence from the scope description. If no curated sentence exists, use:
+
+```text
+ defines and maintains the repository scope for .
+```
+
+### Core Idea
+
+Summarize the root `Scope` and the most important approved `Ability` entries.
+Use ability descriptions where available. Avoid listing every capability here;
+the goal is orientation, not completeness.
+
+### In Scope
+
+Render approved abilities as top-level bullets. Include the most important
+capabilities as nested wording inside the bullet, but avoid deep nesting in the
+generated Markdown.
+
+Suggested form:
+
+```markdown
+- . Includes , .
+```
+
+### Out of Scope
+
+This section is primarily curator-owned. Repo-registry may seed it from
+classification expectation gaps whose `expected_type` is one of:
+
+- `classification-granularity`
+- `classification-support`
+- `out-of-scope`
+
+Generated text must be conservative and marked for review unless there is an
+approved negative/exclusion model in the future.
+
+### Relevant When
+
+Use approved features that represent real usage scenarios. Strong signals:
+
+- `primary_class == "business-usecase"`
+- `attributes` contains `usecase`, `workflow`, `review`, `generation`,
+  `analysis`, `integration`, or another domain-specific use-case label
+
+If no business-usecase features exist, seed from high-confidence abilities and
+capabilities with a curator-input marker.
+
+### Not Relevant When
+
+This section is curator-owned unless explicit negative use-case facts or
+expectation gaps exist. Do not infer broad exclusions from missing features.
+
+### Current State
+
+Aggregate observed facts. Good generated indicators include:
+
+- Status: derive from repository status and analysis run state.
+- Implementation: derive from source files, package manifests, tests, and route
+  or CLI facts.
+- Stability: conservative default `evolving` unless curated.
+- Usage: conservative default `internal` or `unknown` unless facts indicate
+  production usage.
+
+Include compact bullets for detected languages, frameworks, tests, manifests,
+docs, interfaces, provider facts, and scanner gaps.
+
+### How It Fits
+
+Use support/evidence relationships and source refs:
+
+- Upstream dependencies: package, service, provider, and integration facts.
+- Downstream consumers: cross-repo support references when available.
+- Often used with: related repo links and common provider/framework facts.
+
+Evidence is support for a characteristic, not the same thing as a fact. Prefer
+evidence links that point downward in abstraction, as described in
+`docs/characteristic-evidence-model.md`.
+
+### Terminology
+
+Generate from:
+
+- scope, ability, capability, and feature names
+- `primary_class` and `attributes`
+- scanner facts for providers, frameworks, commands, APIs, and domain terms
+- aliases or expectation gaps when present
+
+Mark ambiguous or overlapping terms for curator review.
+
+### Related / Overlapping Repositories
+
+Generate only when there is cross-repo evidence, comparison data, or explicit
+curator input. Do not invent related repositories from name similarity alone.
+
+### Getting Oriented
+
+Use source references and observed facts to name good entry points:
+
+- Start with: README, docs, API route files, CLI files, core service modules
+- Key files / directories: source paths with high fact/support density
+- Entry points: API routes, CLI commands, package manifests, tests
+
+### Provided Capabilities
+
+Render approved `Capability` characteristics as fenced `capability` blocks. This
+section is parsed by the Custodian capability catalog and remains file-origin
+truth under ADR-001.
+
+Block format:
+
+````markdown
+```capability
+type: api
+title: scope.generate
+description: >
+  Generates a SCOPE.md from approved repository characteristics.
+keywords: [scope, scope-md, generation]
+```
+````
+
+Fields:
+
+| Field | Required | Source |
+|-------|----------|--------|
+| `type` | yes | capability `primary_class`, normalized to catalog categories |
+| `title` | yes | capability name or curated capability key |
+| `description` | no | capability description |
+| `keywords` | no | capability attributes plus relevant feature classes |
+
+Allowed catalog categories remain compatible with the existing Custodian ingest:
+
+- `infrastructure`
+- `api`
+- `data`
+- `security`
+- `documentation`
+- `other`
+
+If a capability's `primary_class` is not one of these categories, map it to
+`api`, `data`, `documentation`, or `other` conservatively and preserve the
+original class as a keyword.
+
+### Notes
+
+`Notes` is optional and curator-owned. Generators should preserve existing notes
+when updating a file and should not overwrite this section unless explicitly
+requested.
+
+## Generation Ownership
+
+Repo-registry-generated sections:
+
+- One-liner
+- Core Idea
+- In Scope
+- Relevant When
+- Current State
+- How It Fits
+- Terminology
+- Related / Overlapping Repositories
+- Getting Oriented
+- Provided Capabilities
+
+Curator-owned or curator-reviewed sections:
+
+- Out of Scope
+- Not Relevant When
+- Notes
+- Any generated section containing ``
+
+The generator may write stubs for curator-owned sections, but the updater must
+preserve existing curator text unless the caller explicitly asks for a full
+rewrite.
+
+## Validation Rules
+
+The validator should mirror the Custodian DOI C5 checks:
+
+- C5a: `SCOPE.md` exists at the repository root.
+- C5b: required headings are present in canonical order.
+- C5c: `## Provided Capabilities` contains parseable `capability` blocks, or an
+  explicit empty-state note when the repo provides no routable capabilities.
+
+Additional repo-registry validation:
+
+- Generated sections with missing data must include ``.
+- Capability blocks must parse as key/value metadata.
+- Capability block titles should be stable enough for routing.
+- Curator-owned sections should be preserved by diff/update flows.
+
+## Update Semantics
+
+The validator/differ compares the existing file to freshly generated content by
+section. A section is:
+
+- `ok` when normalized existing text matches generated content.
+- `stale` when the section exists but differs materially.
+- `missing` when the heading is absent.
+
+Normalization should ignore repeated whitespace and harmless Markdown wrapping,
+but must not ignore changed capability block metadata.
+
+Generated updates should be section-aware. Do not rewrite the whole file when a
+smaller section update is enough.
+
+## Agent Guidance
+
+Agents should treat `SCOPE.md` as a decision aid:
+
+- Read it before deep code exploration.
+- Prefer it over README for scope boundaries.
+- Use `AGENTS.md` for operating instructions and repo-specific workflow.
+- Use generated diffs to spot stale scope claims.
+- Record expectation gaps when generated scope, classes, or capabilities do not
+match human judgment.
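The `ok` / `stale` / `missing` classification described under Update Semantics can be sketched in isolation. This is a simplified, standalone illustration under stated assumptions, not the `ScopeValidator` code from this change: `split_sections`, `normalize`, and `section_status` are hypothetical names, and normalization here only collapses whitespace so that changed metadata values still count as a difference.

```python
import re


def split_sections(markdown: str) -> dict[str, str]:
    """Map each '## Heading' to its body text (hypothetical helper)."""
    matches = list(re.finditer(r"^##\s+(.+?)\s*$", markdown, re.MULTILINE))
    sections: dict[str, str] = {}
    for index, match in enumerate(matches):
        # A section body runs until the next '## ' heading or the end of the file.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(markdown)
        sections[match.group(1).strip()] = markdown[match.end():end].strip()
    return sections


def normalize(text: str) -> str:
    """Collapse runs of whitespace only; wording changes remain visible."""
    return re.sub(r"\s+", " ", text).strip()


def section_status(existing: str, generated: str, headings: list[str]) -> dict[str, str]:
    """Classify each heading as ok, stale, or missing (assumed semantics)."""
    current = split_sections(existing)
    proposed = split_sections(generated)
    result: dict[str, str] = {}
    for heading in headings:
        if heading not in current:
            result[heading] = "missing"
        elif normalize(current[heading]) == normalize(proposed.get(heading, "")):
            result[heading] = "ok"
        else:
            result[heading] = "stale"
    return result


# Illustrative inputs: extra whitespace is ignored, a changed bullet is stale,
# and a heading absent from the existing file is missing.
EXISTING = "## One-liner\n\nA   scoping tool.\n\n## In Scope\n\n- old bullet\n"
GENERATED = "## One-liner\n\nA scoping tool.\n\n## In Scope\n\n- new bullet\n\n## Notes\n\n- n/a\n"
STATUS = section_status(EXISTING, GENERATED, ["One-liner", "In Scope", "Notes"])
print(STATUS)
```

Keeping the comparison per section is what allows an updater to rewrite only the stale or missing parts and leave curator-owned text alone.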
diff --git a/src/repo_registry/scope/__init__.py b/src/repo_registry/scope/__init__.py
new file mode 100644
index 0000000..c35ac17
--- /dev/null
+++ b/src/repo_registry/scope/__init__.py
@@ -0,0 +1,4 @@
+from repo_registry.scope.generator import ScopeGenerator
+from repo_registry.scope.validator import ScopeValidator
+
+__all__ = ["ScopeGenerator", "ScopeValidator"]
diff --git a/src/repo_registry/scope/generator.py b/src/repo_registry/scope/generator.py
new file mode 100644
index 0000000..f4b9445
--- /dev/null
+++ b/src/repo_registry/scope/generator.py
@@ -0,0 +1,323 @@
+from __future__ import annotations
+
+import re
+from dataclasses import asdict
+
+from repo_registry.core.service import RegistryService
+from repo_registry.storage.sqlite import NotFoundError
+
+
+SCOPE_SECTIONS = [
+    "One-liner",
+    "Core Idea",
+    "In Scope",
+    "Out of Scope",
+    "Relevant When",
+    "Not Relevant When",
+    "Current State",
+    "How It Fits",
+    "Terminology",
+    "Related / Overlapping Repositories",
+    "Getting Oriented",
+    "Provided Capabilities",
+    "Notes",
+]
+
+
+NEEDS_INPUT = ""
+
+
+class ScopeGenerator:
+    """Render SCOPE.md from approved repository characteristics."""
+
+    def __init__(self, service: RegistryService) -> None:
+        self.service = service
+
+    def generate(self, repo_slug: str) -> str:
+        repository = self._repository_by_slug(repo_slug)
+        ability_map = asdict(self.service.ability_map(repository.id))
+        facts = [asdict(fact) for fact in self.service.list_observed_facts(repository.id)]
+        sections = {
+            "One-liner": self._one_liner(ability_map),
+            "Core Idea": self._core_idea(ability_map),
+            "In Scope": self._in_scope(ability_map),
+            "Out of Scope": self._curator_stub(),
+            "Relevant When": self._relevant_when(ability_map),
+            "Not Relevant When": self._curator_stub(),
+            "Current State": self._current_state(repository.status, facts),
+            "How It Fits": self._how_it_fits(ability_map),
+            "Terminology": self._terminology(ability_map, facts),
+            "Related / Overlapping Repositories": self._curator_stub(),
+            "Getting Oriented": self._getting_oriented(ability_map, facts),
+            "Provided Capabilities": self._provided_capabilities(ability_map),
+            "Notes": self._curator_stub(),
+        }
+        lines = [
+            "# SCOPE",
+            "",
+            "> This file helps you quickly understand what this repository is about,",
+            "> when it is relevant, and when it is not.",
+            "> It was generated from approved repo-registry characteristics.",
+            "",
+            "---",
+            "",
+        ]
+        for section in SCOPE_SECTIONS:
+            lines.extend([f"## {section}", "", sections[section].rstrip(), "", "---", ""])
+        return "\n".join(lines).rstrip() + "\n"
+
+    def _repository_by_slug(self, repo_slug: str):
+        wanted = self._slug(repo_slug)
+        for repository in self.service.list_repositories():
+            candidates = {
+                self._slug(repository.name),
+                self._slug(repository.url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")),
+            }
+            if wanted in candidates:
+                return repository
+        raise NotFoundError(f"repository slug {repo_slug!r} was not found")
+
+    def _one_liner(self, ability_map: dict) -> str:
+        scope = ability_map["scope"]
+        description = self._sentence(scope.get("description", ""))
+        if description:
+            return description
+        return f"{scope['name']} defines the repository scope for {ability_map['repository']['name']}."
+
+    def _core_idea(self, ability_map: dict) -> str:
+        scope = ability_map["scope"]
+        abilities = ability_map.get("abilities", [])
+        lines = [scope.get("description") or self._one_liner(ability_map)]
+        if abilities:
+            lines.append("")
+            lines.append("Approved abilities:")
+            lines.extend(
+                f"- {ability['name']} — {ability.get('description') or 'Approved repository ability.'}"
+                for ability in abilities[:5]
+            )
+        else:
+            lines.extend(["", NEEDS_INPUT])
+        return "\n".join(lines)
+
+    def _in_scope(self, ability_map: dict) -> str:
+        abilities = ability_map.get("abilities", [])
+        if not abilities:
+            return self._curator_stub()
+        lines = []
+        for ability in abilities:
+            capabilities = ", ".join(
+                capability["name"] for capability in ability.get("capabilities", [])[:4]
+            )
+            suffix = f" Includes {capabilities}." if capabilities else ""
+            lines.append(
+                f"- {ability['name']} — {ability.get('description') or 'Approved ability.'}{suffix}"
+            )
+        return "\n".join(lines)
+
+    def _relevant_when(self, ability_map: dict) -> str:
+        features = [
+            feature
+            for feature in self._features(ability_map)
+            if self._is_usecase_feature(feature)
+        ]
+        if not features:
+            features = self._features(ability_map)[:5]
+        if not features:
+            return self._curator_stub()
+        lines = [
+            f"- You need {feature['name']} ({feature.get('primary_class') or feature.get('type', 'feature')})."
+            for feature in features
+        ]
+        if not any(self._is_usecase_feature(feature) for feature in features):
+            lines.append(NEEDS_INPUT)
+        return "\n".join(lines)
+
+    def _current_state(self, status: str, facts: list[dict]) -> str:
+        kinds = self._facts_by_kind(facts)
+        languages = self._fact_names(kinds.get("language", []))
+        frameworks = self._fact_names(kinds.get("framework", []))
+        tests = kinds.get("test", [])
+        interfaces = kinds.get("interface", [])
+        manifests = kinds.get("manifest", [])
+        implementation = "substantial" if interfaces or manifests else "partial"
+        if not facts:
+            implementation = "unknown"
+        lines = [
+            f"- Status: {status}",
+            f"- Implementation: {implementation}",
+            "- Stability: evolving",
+            "- Usage: internal",
+            f"- Languages: {', '.join(languages) if languages else 'unknown'}",
+            f"- Frameworks: {', '.join(frameworks) if frameworks else 'none detected'}",
+            f"- Tests observed: {len(tests)}",
+            f"- Interfaces observed: {len(interfaces)}",
+            f"- Manifests observed: {len(manifests)}",
+        ]
+        if not facts:
+            lines.append(NEEDS_INPUT)
+        return "\n".join(lines)
+
+    def _how_it_fits(self, ability_map: dict) -> str:
+        evidence = [
+            item
+            for capability in self._capabilities(ability_map)
+            for item in capability.get("evidence", [])
+        ]
+        if not evidence:
+            return "\n".join(
+                [
+                    "- Upstream dependencies: " + NEEDS_INPUT,
+                    "- Downstream consumers: " + NEEDS_INPUT,
+                    "- Often used with: " + NEEDS_INPUT,
+                ]
+            )
+        refs = ", ".join(
+            sorted({item.get("reference", "") for item in evidence if item.get("reference")})[:8]
+        )
+        return "\n".join(
+            [
+                f"- Supported by evidence references: {refs or 'available evidence'}",
+                "- Upstream dependencies: " + NEEDS_INPUT,
+                "- Downstream consumers: " + NEEDS_INPUT,
+                "- Often used with: " + NEEDS_INPUT,
+            ]
+        )
+
+    def _terminology(self, ability_map: dict, facts: list[dict]) -> str:
+        terms = set()
+        for item in [ability_map["scope"], *ability_map.get("abilities", [])]:
+            terms.add(item.get("name", ""))
+            terms.add(item.get("primary_class", ""))
+            terms.update(item.get("attributes", []))
+        for capability in self._capabilities(ability_map):
+            terms.add(capability.get("name", ""))
+            terms.add(capability.get("primary_class", ""))
+            terms.update(capability.get("attributes", []))
+        for fact in facts:
+            if fact.get("kind") in {"framework", "llm_provider", "provider_registry"}:
+                terms.add(fact.get("name", ""))
+        visible = [term for term in sorted(terms) if term]
+        if not visible:
+            return self._curator_stub()
+        return "\n".join(
+            [
+                "- Preferred terms: " + ", ".join(visible[:12]),
+                "- Also known as: " + NEEDS_INPUT,
+                "- Potentially confusing terms: " + NEEDS_INPUT,
+            ]
+        )
+
+    def _getting_oriented(self, ability_map: dict, facts: list[dict]) -> str:
+        paths = self._source_paths(ability_map, facts)
+        if not paths:
+            return self._curator_stub()
+        return "\n".join(
+            [
+                f"- Start with: {paths[0]}",
+                f"- Key files / directories: {', '.join(paths[:8])}",
+                f"- Entry points: {', '.join(paths[:5])}",
+            ]
+        )
+
+    def _provided_capabilities(self, ability_map: dict) -> str:
+        capabilities = self._capabilities(ability_map)
+        if not capabilities:
+            return f"\n{NEEDS_INPUT}"
+        blocks = []
+        for capability in capabilities:
+            keywords = self._keywords_for_capability(capability)
+            blocks.append(
+                "\n".join(
+                    [
+                        "```capability",
+                        f"type: {self._capability_type(capability.get('primary_class', 'other'))}",
+                        f"title: {capability['name']}",
+                        "description: >",
+                        f" {capability.get('description') or 'Approved repository capability.'}",
+                        f"keywords: [{', '.join(keywords)}]",
+                        "```",
+                    ]
+                )
+            )
+        return "\n\n".join(blocks)
+
+    def _capabilities(self, ability_map: dict) -> list[dict]:
+        return [
+            capability
+            for ability in ability_map.get("abilities", [])
+            for capability in ability.get("capabilities", [])
+        ]
+
+    def _features(self, ability_map: dict) -> list[dict]:
+        return [
+            feature
+            for capability in self._capabilities(ability_map)
+            for feature in capability.get("features", [])
+        ]
+
+    def _is_usecase_feature(self, feature: dict) -> bool:
+        labels = {str(feature.get("primary_class", "")).lower()}
+        labels.update(str(item).lower() for item in feature.get("attributes", []))
+        return bool(labels & {"business-usecase", "usecase", "workflow", "review"})
+
+    def _keywords_for_capability(self, capability: dict) -> list[str]:
+        keywords = [capability.get("primary_class", "")]
+        keywords.extend(capability.get("attributes", []))
+        for feature in capability.get("features", []):
+            keywords.append(feature.get("primary_class", ""))
+            keywords.extend(feature.get("attributes", []))
+        return [self._keyword(item) for item in self._unique(keywords)[:8] if item]
+
+    def _capability_type(self, primary_class: str) -> str:
+        normalized = primary_class.lower()
+        if normalized in {"api", "infrastructure", "data", "security", "documentation"}:
+            return normalized
+        if normalized in {"interface", "integration", "llm-integration"}:
+            return "api"
+        if normalized in {"storage", "repository-structure"}:
+            return "data"
+        return "other"
+
+    def _facts_by_kind(self, facts: list[dict]) -> dict[str, list[dict]]:
+        grouped: dict[str, list[dict]] = {}
+        for fact in facts:
+            grouped.setdefault(fact.get("kind", ""), []).append(fact)
+        return grouped
+
+    def _fact_names(self, facts: list[dict]) -> list[str]:
+        return self._unique([fact.get("name", "") for fact in facts])
+
+    def _source_paths(self, ability_map: dict, facts: list[dict]) -> list[str]:
+        paths = [fact.get("path", "") for fact in facts if fact.get("path")]
+        for feature in self._features(ability_map):
+            paths.append(feature.get("location", ""))
+            for source_ref in feature.get("source_refs", []):
+                paths.append(source_ref.get("path", ""))
+        return self._unique(paths)
+
+    def _curator_stub(self) -> str:
+        return f"- {NEEDS_INPUT}"
+
+    def _sentence(self, text: str) -> str:
+        cleaned = re.sub(r"\s+", " ", text.strip())
+        if not cleaned:
+            return ""
+        return re.split(r"(?<=[.!?])\s+", cleaned, maxsplit=1)[0]
+
+    def _slug(self, value: str) -> str:
+        return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")
+
+    def _keyword(self, value: str) -> str:
+        return self._slug(value) or "other"
+
+    def _unique(self, values: list[str]) -> list[str]:
+        result: list[str] = []
+        seen: set[str] = set()
+        for value in values:
+            item = str(value).strip()
+            key = item.lower()
+            if not item or key in seen:
+                continue
+            seen.add(key)
+            result.append(item)
+        return result
diff --git a/src/repo_registry/scope/validator.py b/src/repo_registry/scope/validator.py
new file mode 100644
index 0000000..5024576
--- /dev/null
+++ b/src/repo_registry/scope/validator.py
@@ -0,0 +1,184 @@
+from __future__ import annotations
+
+import re
+from dataclasses import dataclass
+from pathlib import Path
+
+from repo_registry.scope.generator import SCOPE_SECTIONS, ScopeGenerator
+
+
+@dataclass(frozen=True)
+class ScopeDiffSection:
+    section: str
+    status: str
+    current_text: str | None
+    proposed_text: str | None
+
+
+@dataclass(frozen=True)
+class ScopeDiff:
+    sections: list[ScopeDiffSection]
+
+    @property
+    def needs_update(self) -> bool:
+        return any(section.status != "ok" for section in self.sections)
+
+
+@dataclass(frozen=True)
+class ScopeValidationIssue:
+    check: str
+    severity: str
+    message: str
+
+
+@dataclass(frozen=True)
+class ValidationResult:
+    issues: list[ScopeValidationIssue]
+
+    @property
+    def ok(self) -> bool:
+        return not any(issue.severity == "error" for issue in self.issues)
+
+
+class ScopeValidator:
+    """Validate and diff SCOPE.md files."""
+
+    def __init__(self, generator: ScopeGenerator | None = None) -> None:
+        self.generator = generator
+
+    def diff(self, repo_slug: str, existing_path: Path) -> ScopeDiff:
+        if self.generator is None:
+            raise ValueError("ScopeValidator.diff requires a ScopeGenerator")
+        current = existing_path.read_text(encoding="utf-8") if existing_path.exists() else ""
+        proposed = self.generator.generate(repo_slug)
+        current_sections = self._parse_sections(current)
+        proposed_sections = self._parse_sections(proposed)
+        sections: list[ScopeDiffSection] = []
+        for section in SCOPE_SECTIONS:
+            current_text = current_sections.get(section)
+            proposed_text = proposed_sections.get(section, "")
+            if current_text is None:
+                status = "missing"
+            elif self._normalize(current_text) == self._normalize(proposed_text):
+                status = "ok"
+            else:
+                status = "stale"
+            sections.append(
+                ScopeDiffSection(
+                    section=section,
+                    status=status,
+                    current_text=current_text,
+                    proposed_text=proposed_text,
+                )
+            )
+        return ScopeDiff(sections=sections)
+
+    def validate(self, path: Path) -> ValidationResult:
+        issues: list[ScopeValidationIssue] = []
+        if not path.exists():
+            return ValidationResult(
+                issues=[
+                    ScopeValidationIssue(
+                        check="C5a",
+                        severity="error",
+                        message="SCOPE.md is missing.",
+                    )
+                ]
+            )
+        content = path.read_text(encoding="utf-8")
+        sections = self._parse_sections(content)
+        missing = [section for section in SCOPE_SECTIONS if section not in sections]
+        if missing:
+            severity = "warn" if missing == ["Provided Capabilities"] else "error"
+            issues.append(
+                ScopeValidationIssue(
+                    check="C5b",
+                    severity=severity,
+                    message=f"Missing SCOPE.md section(s): {', '.join(missing)}.",
+                )
+            )
+        ordered = self._heading_order(content)
+        expected_order = [section for section in SCOPE_SECTIONS if section in sections]
+        if ordered[: len(expected_order)] != expected_order:
+            issues.append(
+                ScopeValidationIssue(
+                    check="C5b",
+                    severity="warn",
+                    message="SCOPE.md sections are not in canonical order.",
+                )
+            )
+        capabilities = sections.get("Provided Capabilities")
+        if capabilities is None:
+            issues.append(
+                ScopeValidationIssue(
+                    check="C5c",
+                    severity="warn",
+                    message="Provided Capabilities section is missing.",
+                )
+            )
+        elif "```capability" in capabilities:
+            for index, block in enumerate(self._capability_blocks(capabilities), start=1):
+                keys = self._capability_keys(block)
+                missing_keys = {"type", "title"} - keys
+                if missing_keys:
+                    issues.append(
+                        ScopeValidationIssue(
+                            check="C5c",
+                            severity="warn",
+                            message=(
+                                f"Capability block {index} is missing required field(s): "
+                                f"{', '.join(sorted(missing_keys))}."
+                            ),
+                        )
+                    )
+        elif "No approved capabilities yet" not in capabilities:
+            issues.append(
+                ScopeValidationIssue(
+                    check="C5c",
+                    severity="warn",
+                    message=(
+                        "Provided Capabilities has no capability blocks or explicit "
+                        "empty-state note."
+                    ),
+                )
+            )
+        return ValidationResult(issues=issues)
+
+    def _parse_sections(self, content: str) -> dict[str, str]:
+        matches = list(re.finditer(r"^##\s+(.+?)\s*$", content, re.MULTILINE))
+        sections: dict[str, str] = {}
+        for index, match in enumerate(matches):
+            title = match.group(1).strip()
+            start = match.end()
+            end = matches[index + 1].start() if index + 1 < len(matches) else len(content)
+            body = content[start:end]
+            body = re.sub(r"\n---\s*$", "", body.strip())
+            sections[title] = body.strip()
+        return sections
+
+    def _heading_order(self, content: str) -> list[str]:
+        return [
+            match.group(1).strip()
+            for match in re.finditer(r"^##\s+(.+?)\s*$", content, re.MULTILINE)
+            if match.group(1).strip() in SCOPE_SECTIONS
+        ]
+
+    def _normalize(self, value: str | None) -> str:
+        if value is None:
+            return ""
+        without_comments = re.sub(r"<!--.*?-->", "", value, flags=re.DOTALL)
+        without_markdown = re.sub(r"[`*_>#-]+", " ", without_comments)
+        return re.sub(r"\s+", " ", without_markdown).strip().lower()
+
+    def _capability_blocks(self, content: str) -> list[str]:
+        return re.findall(
+            r"```capability\s*(.*?)```",
+            content,
+            flags=re.DOTALL | re.IGNORECASE,
+        )
+
+    def _capability_keys(self, block: str) -> set[str]:
+        return {
+            match.group(1)
+            for match in re.finditer(r"^([A-Za-z_][A-Za-z0-9_-]*):", block, re.MULTILINE)
+        }
diff --git a/src/repo_registry/web_api/app.py b/src/repo_registry/web_api/app.py
index 08b3120..ac609e9 100644
--- a/src/repo_registry/web_api/app.py
+++
b/src/repo_registry/web_api/app.py @@ -1,8 +1,11 @@ from __future__ import annotations import logging +import json from dataclasses import asdict from pathlib import Path +from urllib.error import HTTPError, URLError +from urllib.request import urlopen from fastapi import Depends, FastAPI, HTTPException, Query from fastapi.responses import PlainTextResponse @@ -13,6 +16,7 @@ from repo_registry.core.service import RegistryService from repo_registry.llm_extraction import LLMCandidateExtractor, create_llm_connect_adapter from repo_registry.repo_ingestion.git import GitIngestionService from repo_registry.semantic import HashingEmbeddingProvider +from repo_registry.scope import ScopeGenerator, ScopeValidator from repo_registry.storage.sqlite import NotFoundError, RegistryStore from repo_registry.web_api.schemas import ( AbilityCreate, @@ -58,6 +62,12 @@ from repo_registry.web_api.schemas import ( ) +def slugify(value: str) -> str: + import re + + return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-") + + class Settings(BaseSettings): model_config = SettingsConfigDict(env_prefix="REPO_REGISTRY_") @@ -67,6 +77,7 @@ class Settings(BaseSettings): llm_provider: str | None = Field(default=None) llm_model: str | None = Field(default=None) embedding_provider: str | None = Field(default=None) + state_hub_base_url: str = Field(default="http://127.0.0.1:8000") log_level: str = Field(default="INFO") @@ -111,6 +122,7 @@ OPENAPI_TAGS = [ {"name": "analysis", "description": "Repository scans and extracted review inputs."}, {"name": "review", "description": "Candidate graph approval and correction workflow."}, {"name": "registry", "description": "Approved ability maps and manual registry CRUD."}, + {"name": "scope", "description": "SCOPE.md generation, diffing, and writing."}, {"name": "search", "description": "Agent-facing discovery endpoints."}, {"name": "discovery", "description": "Comparison, gap analysis, and export helpers."}, ] @@ -1120,6 +1132,144 @@ def 
export_repository_registry_entry( return PlainTextResponse(content, media_type="application/x-yaml") +@app.get( + "/repos/{repo_slug}/scope", + tags=["scope"], + response_class=PlainTextResponse, + responses={ + 200: { + "content": {"text/markdown": {}}, + "description": "Generated SCOPE.md preview from approved characteristics.", + } + }, +) +def generate_repository_scope( + repo_slug: str, + service: RegistryService = Depends(get_service), +) -> PlainTextResponse: + try: + ensure_scope_generation_ready(service, repo_slug) + content = ScopeGenerator(service).generate(repo_slug) + except NotFoundError as exc: + raise HTTPException(status_code=404, detail=str(exc)) from exc + return PlainTextResponse(content, media_type="text/markdown") + + +@app.get( + "/repos/{repo_slug}/scope/diff", + tags=["scope"], +) +def diff_repository_scope( + repo_slug: str, + service: RegistryService = Depends(get_service), + settings: Settings = Depends(get_settings), +) -> dict[str, object]: + try: + repository = ensure_scope_generation_ready(service, repo_slug) + scope_path = scope_file_path(service, repository, repo_slug, settings) + diff = ScopeValidator(ScopeGenerator(service)).diff(repo_slug, scope_path) + except NotFoundError as exc: + raise HTTPException(status_code=404, detail=str(exc)) from exc + except ValueError as exc: + raise HTTPException(status_code=409, detail=str(exc)) from exc + return { + "sections": [asdict(section) for section in diff.sections], + "needs_update": diff.needs_update, + } + + +@app.post( + "/repos/{repo_slug}/scope/write", + tags=["scope"], +) +def write_repository_scope( + repo_slug: str, + service: RegistryService = Depends(get_service), + settings: Settings = Depends(get_settings), +) -> dict[str, object]: + try: + repository = ensure_scope_generation_ready(service, repo_slug) + scope_path = scope_file_path(service, repository, repo_slug, settings) + content = ScopeGenerator(service).generate(repo_slug) + except NotFoundError as exc: + raise 
HTTPException(status_code=404, detail=str(exc)) from exc + except ValueError as exc: + raise HTTPException(status_code=409, detail=str(exc)) from exc + scope_path.write_text(content, encoding="utf-8") + return {"written": True, "path": str(scope_path)} + + +def ensure_scope_generation_ready( + service: RegistryService, + repo_slug: str, +): + repository = repository_by_slug(service, repo_slug) + ability_map = service.ability_map(repository.id) + if not ability_map.abilities: + raise NotFoundError( + f"repository {repo_slug!r} has no approved characteristics" + ) + return repository + + +def repository_by_slug(service: RegistryService, repo_slug: str): + wanted = slugify(repo_slug) + for repository in service.list_repositories(): + candidates = { + slugify(repository.name), + slugify(repository.url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")), + } + if wanted in candidates: + return repository + raise NotFoundError(f"repository slug {repo_slug!r} was not found") + + +def scope_file_path( + service: RegistryService, + repository, + repo_slug: str, + settings: Settings, +) -> Path: + state_hub_path = state_hub_scope_file_path(repo_slug, settings) + if state_hub_path is not None: + return state_hub_path + source_path = Path(repository.url) + if source_path.exists() and source_path.is_dir(): + return source_path / "SCOPE.md" + checkout = service.ingestion.cached_checkout(repository.url) + if checkout is not None and checkout.source_path.exists(): + return checkout.source_path / "SCOPE.md" + raise ValueError( + "repository has no known local checkout path on this host" + ) + + +def state_hub_scope_file_path(repo_slug: str, settings: Settings) -> Path | None: + base_url = settings.state_hub_base_url.rstrip("/") + if not base_url: + return None + try: + with urlopen(f"{base_url}/repos/{repo_slug}/", timeout=2) as response: + repo = json.loads(response.read().decode("utf-8")) + except HTTPError as exc: + if exc.code == 404: + return None + raise ValueError("state 
hub repository path lookup failed") from exc + except (URLError, TimeoutError, OSError, json.JSONDecodeError): + return None + local_path = repo.get("local_path") + if not local_path: + raise ValueError( + f"state hub repo {repo_slug!r} has no local path on this host" + ) + path = Path(local_path) + if path.exists() and path.is_dir(): + return path / "SCOPE.md" + raise ValueError( + f"state hub local path for repo {repo_slug!r} is not available: {path}" + ) + + @app.get( "/repository-comparisons", tags=["discovery"], diff --git a/tests/test_scope_generator.py b/tests/test_scope_generator.py new file mode 100644 index 0000000..2986e0a --- /dev/null +++ b/tests/test_scope_generator.py @@ -0,0 +1,138 @@ +from repo_registry.core.service import RegistryService +from repo_registry.repo_ingestion.git import GitIngestionService +from repo_registry.scope.generator import SCOPE_SECTIONS, ScopeGenerator +from repo_registry.scope.validator import ScopeValidator +from repo_registry.storage.sqlite import RegistryStore + + +def make_service(tmp_path): + store = RegistryStore(tmp_path / "registry.sqlite3") + store.initialize() + return RegistryService(store, ingestion=GitIngestionService(tmp_path / "checkouts")) + + +def test_scope_generator_renders_canonical_sections_and_capability_blocks(tmp_path): + service = make_service(tmp_path) + repository = service.register_repository( + name="Repo Registry", + url="https://example.test/coulomb/repo-registry.git", + description="Generates repository scope files from approved characteristics.", + ) + service.update_scope( + repository.id, + name="Repo Scoping", + description="Generates and validates SCOPE.md files for registered repositories.", + confidence=0.95, + ) + ability_id = service.add_ability( + repository.id, + name="Maintain Repository Scope", + description="Keeps repository utility and boundaries understandable.", + primary_class="repository-intelligence", + attributes=["scope", "capability-mapping"], + ) + capability_id = 
service.add_capability( + repository.id, + ability_id, + name="Generate SCOPE.md", + description="Renders SCOPE.md from approved repository characteristics.", + primary_class="api", + attributes=["scope", "generation"], + ) + service.add_feature( + repository.id, + capability_id, + name="Preview generated SCOPE.md", + type="business-usecase", + primary_class="business-usecase", + attributes=["scope", "preview"], + location="src/repo_registry/scope/generator.py", + ) + + content = ScopeGenerator(service).generate("repo-registry") + + assert content.startswith("# SCOPE\n") + for section in SCOPE_SECTIONS: + assert f"## {section}" in content + assert "Generates and validates SCOPE.md files" in content + assert "Maintain Repository Scope" in content + assert "Preview generated SCOPE.md" in content + assert "src/repo_registry/scope/generator.py" in content + assert "```capability" in content + assert "type: api" in content + assert "title: Generate SCOPE.md" in content + assert "keywords: [api, scope, generation, business-usecase, preview]" in content + + +def test_scope_generator_marks_missing_curator_owned_sections(tmp_path): + service = make_service(tmp_path) + service.register_repository( + name="Sparse Repo", + url="https://example.test/sparse.git", + description="Sparse repo.", + ) + + content = ScopeGenerator(service).generate("sparse") + + assert "## Out of Scope" in content + assert "" in content + assert "" in content + + +def test_scope_validator_validates_generated_scope_and_diffs_sections(tmp_path): + service = make_service(tmp_path) + repository = service.register_repository( + name="Validator Repo", + url="https://example.test/validator-repo.git", + description="Validates generated scope files.", + ) + ability_id = service.add_ability(repository.id, name="Validate Scope Files") + service.add_capability( + repository.id, + ability_id, + name="Diff SCOPE.md", + description="Compares generated and existing scope sections.", + primary_class="api", + 
attributes=["scope", "diff"], + ) + generator = ScopeGenerator(service) + validator = ScopeValidator(generator) + path = tmp_path / "SCOPE.md" + path.write_text(generator.generate("validator-repo"), encoding="utf-8") + + validation = validator.validate(path) + diff = validator.diff("validator-repo", path) + + assert validation.ok + assert validation.issues == [] + assert not diff.needs_update + assert {section.status for section in diff.sections} == {"ok"} + + path.write_text( + path.read_text(encoding="utf-8").replace("## Core Idea", "## Core Thought"), + encoding="utf-8", + ) + diff = validator.diff("validator-repo", path) + assert diff.needs_update + assert next(section for section in diff.sections if section.section == "Core Idea").status == "missing" + + +def test_scope_validator_warns_when_provided_capabilities_section_is_missing(tmp_path): + path = tmp_path / "SCOPE.md" + path.write_text( + "\n\n".join( + f"## {section}\n\nplaceholder" + for section in SCOPE_SECTIONS + if section != "Provided Capabilities" + ), + encoding="utf-8", + ) + + result = ScopeValidator().validate(path) + + assert any( + issue.check == "C5c" + and issue.severity == "warn" + and "Provided Capabilities" in issue.message + for issue in result.issues + ) diff --git a/tests/test_web_api.py b/tests/test_web_api.py index 60c3427..6e59e6c 100644 --- a/tests/test_web_api.py +++ b/tests/test_web_api.py @@ -18,6 +18,7 @@ def test_openapi_groups_agent_facing_endpoints(): "analysis", "review", "registry", + "scope", "search", "discovery", } @@ -252,6 +253,15 @@ def test_openapi_contract_snapshot_for_stable_agent_paths(): "/repos/{repository_id}/export": { "get": {"tags": ["discovery"], "success_schema": "application/x-yaml"} }, + "/repos/{repo_slug}/scope": { + "get": {"tags": ["scope"], "success_schema": None} + }, + "/repos/{repo_slug}/scope/diff": { + "get": {"tags": ["scope"], "success_schema": "object"} + }, + "/repos/{repo_slug}/scope/write": { + "post": {"tags": ["scope"], 
"success_schema": "object"} + }, "/repos/{repository_id}/expectation-gaps": { "get": {"tags": ["review"], "success_schema": "list[ExpectationGapResponse]"}, "post": {"tags": ["review"], "success_schema": "ExpectationGapResponse"}, @@ -455,6 +465,100 @@ def test_api_manual_registry_loop(tmp_path): app.dependency_overrides.clear() +def test_api_generates_diffs_and_writes_scope_md(tmp_path): + source = tmp_path / "scope-repo" + source.mkdir() + + def override_settings(): + return Settings( + database_path=str(tmp_path / "scope-api.sqlite3"), + checkout_root=str(tmp_path / "checkouts"), + ) + + app.dependency_overrides[get_settings] = override_settings + client = TestClient(app) + try: + repository = client.post( + "/repos", + json={ + "name": "Scope Repo", + "url": str(source), + "description": "Generates SCOPE.md through the API.", + }, + ).json() + ability_id = client.post( + f"/repos/{repository['id']}/abilities", + json={ + "name": "Maintain Repository Scope", + "description": "Keeps repository utility understandable.", + }, + ).json()["id"] + client.post( + f"/repos/{repository['id']}/capabilities", + json={ + "ability_id": ability_id, + "name": "Generate SCOPE.md", + "description": "Renders SCOPE.md from approved characteristics.", + "primary_class": "api", + "attributes": ["scope", "generation"], + }, + ) + + preview = client.get("/repos/scope-repo/scope") + assert preview.status_code == 200 + assert preview.headers["content-type"].startswith("text/markdown") + assert "# SCOPE" in preview.text + assert "title: Generate SCOPE.md" in preview.text + + diff = client.get("/repos/scope-repo/scope/diff") + assert diff.status_code == 200 + assert diff.json()["needs_update"] is True + assert {section["status"] for section in diff.json()["sections"]} == {"missing"} + + write = client.post("/repos/scope-repo/scope/write") + assert write.status_code == 200 + assert write.json() == {"written": True, "path": str(source / "SCOPE.md")} + assert (source / 
"SCOPE.md").read_text(encoding="utf-8").startswith("# SCOPE") + + current = client.get("/repos/scope-repo/scope/diff") + assert current.status_code == 200 + assert current.json()["needs_update"] is False + assert {section["status"] for section in current.json()["sections"]} == {"ok"} + + empty = client.post( + "/repos", + json={ + "name": "Empty Scope", + "url": "https://example.test/empty-scope.git", + "description": "No approved characteristics yet.", + }, + ).json() + assert client.get("/repos/empty-scope/scope").status_code == 404 + + remote = client.post( + "/repos", + json={ + "name": "Remote Scope", + "url": "https://example.test/remote-scope.git", + "description": "Has no known local checkout path.", + }, + ).json() + remote_ability = client.post( + f"/repos/{remote['id']}/abilities", + json={"name": "Remote Scope Generation"}, + ).json()["id"] + client.post( + f"/repos/{remote['id']}/capabilities", + json={ + "ability_id": remote_ability, + "name": "Generate Remote SCOPE.md", + }, + ) + assert client.post("/repos/remote-scope/scope/write").status_code == 409 + finally: + app.dependency_overrides.clear() + + def test_api_compare_gap_and_export_use_cases(tmp_path): def override_settings(): return Settings( diff --git a/workplans/RREG-WP-0005-scope-md-generation-feature.md b/workplans/RREG-WP-0005-scope-md-generation-feature.md index e6804ee..5d2051f 100644 --- a/workplans/RREG-WP-0005-scope-md-generation-feature.md +++ b/workplans/RREG-WP-0005-scope-md-generation-feature.md @@ -4,7 +4,7 @@ type: workplan title: "SCOPE.md Generation Feature" domain: capabilities repo: repo-registry -status: todo +status: done owner: codex topic_slug: foerster-capabilities created: "2026-04-30" @@ -37,7 +37,7 @@ Unblocks: RREG-WP-0006 ```task id: RREG-WP-0005-T01 -status: todo +status: done priority: high state_hub_task_id: "83154aae-dd06-4329-8df6-3906b2bf0f14" ``` @@ -73,11 +73,17 @@ Acceptance: `docs/scope-md-spec.md` exists, covers all 11 sections with explicit 
characteristic-to-section mappings, and is consistent with the existing template at `state-hub/scripts/project_rules/scope.template`. +Implementation note 2026-04-30: `docs/scope-md-spec.md` now owns the reference +specification. It maps the current Custodian template headings to the +Scope/Ability/Capability/Feature/Evidence/Facts model, documents generated vs. +curator-owned sections, preserves the existing capability block format, and +cross-references the characteristic/evidence and classification strategy docs. + ## T02: Build SCOPE.md generator ```task id: RREG-WP-0005-T02 -status: todo +status: done priority: high state_hub_task_id: "39feb7ea-72ca-4d99-8094-b006df605dbe" ``` @@ -109,11 +115,17 @@ valid SCOPE.md; all 11 sections are present; the `## Provided Capabilities` section contains parseable capability blocks; the output passes the C5b/C5c checks defined in CUST-WP-0034-T01. +Implementation note 2026-04-30: `repo_registry.scope.ScopeGenerator` now renders +SCOPE.md from approved repository scope, abilities, capabilities, features, facts, +support evidence, and classification metadata. It preserves the current template +headings, emits curator-input stubs for missing data, and renders approved +capabilities as parseable `capability` blocks. + ## T03: Build SCOPE.md validator and differ ```task id: RREG-WP-0005-T03 -status: todo +status: done priority: high state_hub_task_id: "0c9c1347-368a-4657-a039-ae143a6500bd" ``` @@ -137,11 +149,15 @@ Acceptance: `ScopeValidator.diff("repo-registry", Path("SCOPE.md"))` returns a diff with at least some `ok` sections and surfaces any real gaps; the validator catches a missing `## Provided Capabilities` section as a `warn`. +Implementation note 2026-04-30: `repo_registry.scope.ScopeValidator` now validates +C5a/C5b/C5c-style SCOPE.md structure, parses capability blocks, and produces +section-aware diffs against freshly generated content. 
+ ## T04: API endpoints ```task id: RREG-WP-0005-T04 -status: todo +status: done priority: high state_hub_task_id: "a2d1937b-f9e2-480e-8e28-1c12837e1b23" ``` @@ -164,11 +180,17 @@ Acceptance: `GET /repos/repo-registry/scope` returns valid Markdown; `GET /repos/repo-registry/scope/diff` returns a diff JSON; a `POST` write succeeds and the written file passes the validator. +Implementation note 2026-04-30: Added `/repos/{repo_slug}/scope`, +`/repos/{repo_slug}/scope/diff`, and `/repos/{repo_slug}/scope/write` API +endpoints. The endpoints resolve registered repositories by slug, require +approved characteristics, use a local repository path or cached checkout for +diff/write operations, and return 409 when no local path is available. + ## T05: Register capabilities in custodian ```task id: RREG-WP-0005-T05 -status: todo +status: done priority: medium state_hub_task_id: "e1bd4a4f-3d9a-4384-a254-ef75bd9905b9" ``` @@ -200,3 +222,9 @@ CUST-WP-0034-T03 will then resolve correctly. Acceptance: `list_capabilities()` in the state-hub MCP returns `scope.generate` and `scope.update` with `provider_repo: repo-registry`; `request_capability` with either key resolves without routing error. + +Implementation note 2026-04-30: Added `scope.generate` and `scope.update` +capability blocks to `SCOPE.md`, then ingested them with the State Hub +capability ingestion script using `/home/worsch/repo-registry` as the explicit +repo path. The State Hub catalog created `api/scope.generate` and +`api/scope.update` entries for `repo-registry`.
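The slug resolution mentioned in the T04 implementation note can be illustrated with a small standalone sketch. It mirrors the `slugify` helper and the candidate set built in `repository_by_slug` in this diff; the names `slug_candidates` and `matches` are hypothetical helpers introduced only for illustration and are not part of the codebase.

```python
import re


def slugify(value: str) -> str:
    # Same expression as the slugify helper added to web_api/app.py:
    # lowercase, collapse non-alphanumeric runs to "-", trim edge dashes.
    return re.sub(r"[^a-z0-9]+", "-", value.lower()).strip("-")


def slug_candidates(name: str, url: str) -> set[str]:
    # Hypothetical helper: a repository is addressable by the slug of its
    # registered name or by the slug of its URL basename without ".git".
    basename = url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")
    return {slugify(name), slugify(basename)}


def matches(repo_slug: str, name: str, url: str) -> bool:
    # Hypothetical helper: the requested slug is normalized before comparison,
    # so "Scope Repo" and "scope-repo" resolve to the same repository.
    return slugify(repo_slug) in slug_candidates(name, url)


if __name__ == "__main__":
    print(slug_candidates("Scope Repo", "/tmp/scope-repo"))
    print(matches("validator-repo", "Validator Repo",
                  "https://example.test/validator-repo.git"))
```

Because both the name and the URL basename are candidates, a repository registered as "Repo Registry" with a `repo-registry.git` URL answers to a single slug, while a repository whose name and URL differ would answer to two. This sketch assumes Python 3.9 or later for `str.removesuffix`.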