Repository Ability Registry
The Repository Ability Registry maps each repository from what it is useful for down to where that behavior is implemented:
Ability -> Capability -> Feature -> Evidence -> Code location
The first implementation slice is a Python registry core, a FastAPI HTTP API, and a small curator UI. Repository registration imports basic metadata from the repository itself; analysis then builds observed facts and candidate entries for review.
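As an illustration of the Ability -> Capability -> Feature -> Evidence -> Code location chain, here is a minimal sketch using nested dataclasses. The class and field names are assumptions made for this example and are not taken from the registry's actual models.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    summary: str
    code_location: str  # hypothetical format, e.g. "src/router/classify.py:42"


@dataclass
class Feature:
    name: str
    evidence: list[Evidence] = field(default_factory=list)


@dataclass
class Capability:
    name: str
    features: list[Feature] = field(default_factory=list)


@dataclass
class Ability:
    name: str
    capabilities: list[Capability] = field(default_factory=list)


def code_locations(ability: Ability) -> list[str]:
    """Walk the chain from an ability down to every linked code location."""
    return [
        evidence.code_location
        for capability in ability.capabilities
        for feature in capability.features
        for evidence in feature.evidence
    ]
```

Walking the chain top-down like this is what lets a reader start from "what is this repository good for" and end at a file to open.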
Local Development
Create an environment and install dependencies:
python3 -m venv .venv
. .venv/bin/activate
python -m pip install -e ".[dev]"
Run tests:
pytest
Run the API:
uvicorn repo_registry.web_api.app:app --reload
The API creates a local SQLite database at var/repo-registry.sqlite3 by default.
First API Loop
curl -X POST http://127.0.0.1:8000/repos \
-H 'content-type: application/json' \
-d '{"url":"https://example.com/mail-router.git"}'
The registry imports the name and description from pyproject.toml, package.json, or the README where possible. Then add abilities, capabilities, features, and evidence under that repository and inspect the result:
curl http://127.0.0.1:8000/repos/1/ability-map
curl 'http://127.0.0.1:8000/search?q=classify'
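The same loop can be driven from Python with only the standard library. This is a sketch that assumes the API is running at the default uvicorn address shown above; the helper names are made up for this example, and the response shape is not assumed, so results are only decoded as JSON.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # default uvicorn address used in this README


def register_request(repo_url: str) -> urllib.request.Request:
    """Build the POST /repos request without sending it."""
    body = json.dumps({"url": repo_url}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/repos",
        data=body,
        headers={"content-type": "application/json"},
        method="POST",
    )


def search_url(query: str) -> str:
    """Build the GET /search URL with a properly encoded query string."""
    return f"{BASE_URL}/search?" + urllib.parse.urlencode({"q": query})


def send_json(request) -> object:
    """Send a request (or plain URL) and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def first_loop() -> None:
    """Register a repository, then read its ability map and run a search."""
    print(send_json(register_request("https://example.com/mail-router.git")))
    print(send_json(f"{BASE_URL}/repos/1/ability-map"))
    print(send_json(search_url("classify")))
```

Calling first_loop() requires the API to be running; the request builders can be used and tested without a server.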
Deterministic Analysis
For local development, repository URLs may be local filesystem paths. Git URLs, including file:// URLs, are cloned into var/checkouts before scanning. Trigger a deterministic scan:
curl -X POST http://127.0.0.1:8000/repos/1/analysis-runs \
-H 'content-type: application/json' \
-d '{}'
Or override the scan source path explicitly:
curl -X POST http://127.0.0.1:8000/repos/1/analysis-runs \
-H 'content-type: application/json' \
-d '{"source_path":"/path/to/repository"}'
Inspect analysis runs and recorded facts:
curl http://127.0.0.1:8000/repos/1/analysis-runs
curl http://127.0.0.1:8000/repos/1/analysis-runs/1
curl http://127.0.0.1:8000/repos/1/observed-facts
The deterministic scanner records observed facts only: languages, documentation files, examples, tests, package manifests, configuration files, framework hints, and likely API/CLI interfaces.
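To make "observed facts only" concrete, here is a small sketch of a deterministic scan that records languages and package manifests from file names alone. It is not the registry's scanner: the fact dictionaries, the kind labels, and the suffix table are assumptions for illustration.

```python
from pathlib import Path

# Hypothetical lookup tables; a real scanner would cover far more cases.
LANGUAGE_BY_SUFFIX = {".py": "Python", ".js": "JavaScript", ".ts": "TypeScript"}
MANIFEST_NAMES = {"pyproject.toml", "package.json"}


def scan(root) -> list[dict]:
    """Return observed facts for files under root, in a stable order."""
    root = Path(root)
    facts = []
    # Sorting makes the output independent of filesystem iteration order.
    for path in sorted(root.rglob("*")):
        relative = path.relative_to(root)
        if not path.is_file() or ".git" in relative.parts:
            continue
        rel = relative.as_posix()
        if path.name in MANIFEST_NAMES:
            facts.append({"kind": "package_manifest", "path": rel})
        language = LANGUAGE_BY_SUFFIX.get(path.suffix)
        if language is not None:
            facts.append({"kind": "language", "value": language, "path": rel})
    return facts
```

Each fact states only what was seen and where, with no interpretation of what the repository can do. That interpretation is left to the candidate graph and its review.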
Each completed analysis run also creates a conservative candidate graph for review:
curl http://127.0.0.1:8000/repos/1/analysis-runs/1/candidate-graph
Candidate entries are source-linked review seeds. They are not canonical registry truth until a review workflow approves them.
Candidate, approved, and search responses include numeric confidence values plus low, medium, or high confidence labels for quick triage.
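A label of this kind is typically derived from the numeric value by fixed thresholds. The sketch below assumes scores in the range 0 to 1 and cut-offs of 0.5 and 0.8; both the range and the thresholds are assumptions, not the registry's documented values.

```python
def confidence_label(score: float, medium_at: float = 0.5, high_at: float = 0.8) -> str:
    """Map a numeric confidence to 'low', 'medium', or 'high'.

    The default thresholds are illustrative assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be between 0 and 1, got {score}")
    if score >= high_at:
        return "high"
    if score >= medium_at:
        return "medium"
    return "low"
```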
Approve a candidate graph into the canonical registry:
curl -X POST http://127.0.0.1:8000/repos/1/analysis-runs/1/candidate-graph/approve \
-H 'content-type: application/json' \
-d '{"notes":"Approved first review package"}'
Approval copies candidate abilities, capabilities, features, and evidence into the approved registry tables, marks candidates approved, and moves the repository status to indexed.
Review Workflow
Candidate graphs are meant to be corrected before publication. The API supports:
- edit candidate abilities and capabilities with PATCH
- reject candidate abilities, capabilities, features, and evidence
- relink capabilities under another ability
- relink features or evidence under another capability
- merge duplicate abilities, capabilities, features, or evidence
Examples are available in the generated OpenAPI docs at /docs.
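The relink and merge operations listed above can be pictured with a small in-memory sketch. The Candidate record, its fields, and the "merged" status are hypothetical; ids are assumed to be unique within the list passed to merge.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    id: int
    name: str
    parent_id: Optional[int] = None   # e.g. a capability's ability
    status: str = "candidate"         # hypothetical status values
    merged_into: Optional[int] = None


def relink(node: Candidate, new_parent: Candidate) -> None:
    """Move a node under a different parent."""
    node.parent_id = new_parent.id


def merge(duplicate: Candidate, target: Candidate, nodes: list[Candidate]) -> None:
    """Fold a duplicate into target: re-parent its children, then retire it."""
    if duplicate.id == target.id:
        raise ValueError("cannot merge a candidate into itself")
    for node in nodes:
        if node.parent_id == duplicate.id:
            relink(node, target)
    duplicate.status = "merged"
    duplicate.merged_into = target.id
```

Re-parenting children before retiring the duplicate keeps every capability, feature, or evidence entry reachable after the merge.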
Optional LLM Extraction
The llm_extraction module is designed to work with the sibling llm-connect
project without making it a hard dependency. To enable provider-backed
extraction locally:
python -m pip install -e ../llm-connect
The integration accepts any llm-connect style adapter with
execute_prompt(prompt, config) and parses strict JSON candidate drafts from
model responses. Parsed drafts can be mapped into reviewable candidate graph
entries while preserving source paths where they match observed facts or
content chunks. Tests use fake adapters, so the default test suite does not call
external providers.
Application code can inject an LLMCandidateExtractor into RegistryService.
When an extractor is present and returns candidates, analysis stores those
reviewable candidates; when it returns no candidates, the deterministic
heuristic generator remains the fallback.
If extraction fails, the failure is recorded as a review decision and analysis
continues with deterministic candidates.
Successful LLM candidate generation is also recorded as a review decision so
curators can see whether a graph came from deterministic heuristics or an LLM
draft.
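The fallback behavior described above can be summarized in a short sketch. It models the extractor and the deterministic generator as plain callables and review decisions as dictionaries; the function name and the decision kinds are assumptions, not the RegistryService API.

```python
from typing import Callable, Optional

Candidates = list[dict]
Generator = Callable[[list[dict]], Candidates]


def build_candidates(
    facts: list[dict],
    deterministic: Generator,
    extractor: Optional[Generator] = None,
) -> tuple[Candidates, list[dict]]:
    """Return (candidates, review_decisions) following the fallback rules."""
    decisions: list[dict] = []
    if extractor is not None:
        try:
            drafted = extractor(facts)
        except Exception as exc:
            # Extraction failure is recorded; analysis continues below.
            decisions.append({"kind": "llm_extraction_failed", "detail": str(exc)})
        else:
            if drafted:
                # Record provenance so curators can tell where the graph came from.
                decisions.append(
                    {"kind": "llm_candidates_generated", "detail": f"{len(drafted)} candidates"}
                )
                return drafted, decisions
    # No extractor, an empty draft, or a failure: use the deterministic heuristics.
    return deterministic(facts), decisions
```

In this sketch an empty LLM draft falls back silently, while a failure leaves a recorded decision, matching the three cases described above.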
The FastAPI settings object also accepts llm_provider and llm_model. By
default llm_provider is unset, so analysis is fully offline and deterministic.
Environment variables use the REPO_REGISTRY_ prefix:
REPO_REGISTRY_LLM_PROVIDER=gemini
REPO_REGISTRY_LLM_MODEL=gemini-2.5-flash
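For illustration, the prefix convention can be read with the standard library alone. This sketch is not the project's settings class: LLMSettings, load_llm_settings, and llm_enabled are hypothetical names, and only the REPO_REGISTRY_ prefix and the two field names come from the text above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "REPO_REGISTRY_"


@dataclass(frozen=True)
class LLMSettings:
    llm_provider: Optional[str] = None
    llm_model: Optional[str] = None

    @property
    def llm_enabled(self) -> bool:
        """With no provider set, analysis stays offline and deterministic."""
        return self.llm_provider is not None


def load_llm_settings(environ: Mapping[str, str] = os.environ) -> LLMSettings:
    """Read prefixed variables; empty strings are treated as unset."""
    return LLMSettings(
        llm_provider=environ.get(PREFIX + "LLM_PROVIDER") or None,
        llm_model=environ.get(PREFIX + "LLM_MODEL") or None,
    )
```

Passing the environment as a mapping keeps the loader easy to test without touching the real process environment.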
Agent-Facing Endpoints
The v0.1 API covers the main registration, analysis, review, search, and inspection loop:
GET /repos
POST /repos
GET /repos/{id}
PATCH /repos/{id}
DELETE /repos/{id}
POST /repos/{id}/analysis-runs
GET /repos/{id}/analysis-runs
GET /repos/{id}/analysis-runs/{run_id}
GET /repos/{id}/analysis-runs/{run_id}/candidate-graph
POST /repos/{id}/analysis-runs/{run_id}/candidate-graph/approve
GET /repos/{id}/ability-map
PATCH /repos/{id}/abilities/{ability_id}
DELETE /repos/{id}/abilities/{ability_id}
PATCH /repos/{id}/capabilities/{capability_id}
DELETE /repos/{id}/capabilities/{capability_id}
PATCH /repos/{id}/features/{feature_id}
DELETE /repos/{id}/features/{feature_id}
PATCH /repos/{id}/evidence/{evidence_id}
DELETE /repos/{id}/evidence/{evidence_id}
GET /abilities
GET /capabilities
GET /search?q=...