chore(custodian): onboard repo-registry to Custodian State Hub

- Register under capabilities domain (renamed from foerster_capabilities)
- Replace prose workplans with ADR-001 format (RREG-WP-0001 done, RREG-WP-0002 active)
- Add AGENTS.md for Codex agent state-hub integration via HTTP API
- Add SCOPE.md with domain context and v0.1 scope boundaries

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 13:06:35 +02:00
parent 249641728b
commit 4e17c9fea9
6 changed files with 440 additions and 663 deletions

169
AGENTS.md Normal file

@@ -0,0 +1,169 @@
# repo-registry — Agent Instructions
## Repo Identity
**Purpose:** Repository Ability Registry — turns Git repositories into reviewable,
source-linked maps of `Ability → Capability → Feature → Evidence`. Deterministic
scanners establish observed facts; LLM-assisted extractors propose interpreted
claims; humans or trusted agents approve registry truth.
**Domain:** capabilities
**Repo slug:** repo-registry
**Topic ID:** `64418556-3206-457a-ba29-6884b5b12cf3`
**Workplan prefix:** `RREG-WP-`
---
## State Hub Integration
The Custodian State Hub tracks work across all domains. It runs at
`http://127.0.0.1:8000` (local) or `http://127.0.0.1:18000` when accessed from
a remote machine via tunnel.
Interact via HTTP — there is no MCP integration for Codex agents.
### Orient at session start
```bash
# Domain workstreams
curl -s "http://127.0.0.1:8000/workstreams/?topic_id=64418556-3206-457a-ba29-6884b5b12cf3&status=active" \
| python3 -m json.tool
# Open tasks for this repo (once workstreams are registered)
curl -s "http://127.0.0.1:8000/tasks/?status=todo" | python3 -m json.tool
# Check inbox
curl -s "http://127.0.0.1:8000/messages/?to_agent=repo-registry&unread_only=true" \
| python3 -m json.tool
```
Also read `workplans/` directly — the files are the source of truth:
```bash
ls workplans/
grep -h "^status:" workplans/RREG-WP-*.md
```
### Log progress (required at session close)
```bash
curl -s -X POST http://127.0.0.1:8000/progress/ \
-H "Content-Type: application/json" \
-d '{
"summary": "describe what was done",
"event_type": "note",
"author": "codex"
}'
```
Include `"workstream_id": "<uuid>"` and `"task_id": "<uuid>"` when known.
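For agents that prefer Python over `curl`, the same call can be sketched with the standard library. This is a minimal illustration, not part of the State Hub itself: the helper names `build_progress_request` and `send` are hypothetical, and the payload fields are only those shown in the `curl` example above.

```python
import json
import urllib.request

HUB_URL = "http://127.0.0.1:8000"  # local State Hub address given in this file


def build_progress_request(summary, workstream_id=None, task_id=None, author="codex"):
    """Build (but do not send) a POST /progress/ request. Hypothetical helper."""
    payload = {"summary": summary, "event_type": "note", "author": author}
    # Optional identifiers are included only when known.
    if workstream_id:
        payload["workstream_id"] = workstream_id
    if task_id:
        payload["task_id"] = task_id
    return urllib.request.Request(
        f"{HUB_URL}/progress/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

At session close, `send(build_progress_request("describe what was done"))` would perform the call, assuming the hub is reachable at the local address.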
### Mark a message read
```bash
curl -s -X PATCH "http://127.0.0.1:8000/messages/<message_id>/read" \
-H "Content-Type: application/json" -d '{}'
```
### Update task status (after workstreams are synced)
```bash
curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>/" \
-H "Content-Type: application/json" \
-d '{"status": "in_progress"}'
```
---
## Session Protocol
**Start:**
1. `ls workplans/` — note active workplans and their open tasks
2. Check inbox via `GET /messages/?to_agent=repo-registry&unread_only=true`
3. Check for human-flagged tasks: `GET /tasks/?needs_human=true`
**During work:**
- Update task status in the workplan file as tasks progress
- For significant decisions, record them: `POST /decisions/`
**Close:**
1. Update task statuses in workplan files to match progress
2. Call `POST /progress/` with a summary of what was done
3. If workplan files changed, note that `fix-consistency` should be run from
the custodian machine: `cd ~/the-custodian/state-hub && make fix-consistency REPO=repo-registry`
---
## Workplan Convention (ADR-001)
Work items originate as files in this repo, not in the hub. The hub is a
read/cache/index layer.
**File location:** `workplans/RREG-WP-NNNN-<slug>.md`
**Frontmatter:**
```yaml
---
id: RREG-WP-NNNN
type: workplan
title: "..."
domain: capabilities
repo: repo-registry
status: active | done
owner: codex
topic_slug: foerster-capabilities
created: "YYYY-MM-DD"
updated: "YYYY-MM-DD"
state_hub_workstream_id: "<uuid>" # populated by fix-consistency
---
```
**Task blocks** (one per `##` section):
```markdown
## Task Title
\`\`\`task
id: RREG-WP-NNNN-T01
status: todo | in_progress | done | blocked
priority: high | medium | low
\`\`\`
Task description.
```
**Status values:** `todo` → `in_progress` → `done` (or `blocked`)
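A rough sketch of how an agent might read task blocks out of a workplan file follows. It assumes each task is a `##` heading directly followed by a `task` fence of `key: value` lines, as in the convention above; `parse_tasks` is a hypothetical helper, not an existing tool in this repo.

```python
import re

FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fenced block

# A "## Title" heading, optional blank lines, then a task fence with key: value lines.
TASK_RE = re.compile(
    r"^## (?P<title>[^\n]+)\n+" + FENCE + r"task\n(?P<body>.*?)\n" + FENCE,
    re.MULTILINE | re.DOTALL,
)


def parse_tasks(markdown):
    """Return one dict per task block: the heading title plus its fields."""
    tasks = []
    for match in TASK_RE.finditer(markdown):
        fields = {"title": match.group("title").strip()}
        for line in match.group("body").splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines that are not key: value pairs
                fields[key.strip()] = value.strip().strip('"')
        tasks.append(fields)
    return tasks


SAMPLE = "\n".join([
    "## Scaffold and Foundation",
    FENCE + "task",
    "id: RREG-WP-0001-T01",
    "status: done",
    "priority: high",
    FENCE,
    "App skeleton and test harness.",
])
```

Calling `parse_tasks(SAMPLE)` yields a single dict with the title, `id`, `status`, and `priority` of the sample task. Sections without a task fence, such as a `## Goal` heading followed by prose, are skipped.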
---
## Stack and Commands
**Runtime:** Python 3.x, FastAPI, SQLite (dev) / PostgreSQL (prod)
**Package manager:** pip / uv
```bash
# Install
pip install -e ".[dev]"
# Run dev server
uvicorn src.repo_registry.app:app --reload --port 8001
# Run tests
pytest tests/
pytest tests/ -k "e2e"
# Check API health
curl http://127.0.0.1:8001/health
```
---
## Repo Boundary
This repo owns: repository ingestion, deterministic scanning, LLM-assisted candidate
extraction, review/approval workflow, registry query and search.
It does NOT own: the Custodian State Hub, other domain repos, deployment infrastructure.
Coordination with other domains goes through the State Hub message inbox.

48
SCOPE.md Normal file

@@ -0,0 +1,48 @@
---
domain: capabilities
repo: repo-registry
updated: "2026-04-26"
---
# repo-registry — Scope
## Purpose
Repository Ability Registry. Turns Git repositories into reviewable, source-linked
maps of `Ability → Capability → Feature → Evidence`.
## Core Design Principle
```
deterministic scanners → observed facts (file paths, languages, API routes, …)
LLM-assisted extractors → interpreted claims (ability names, descriptions, links)
human / agent review → approved registry truth
```
Approved entries are always explicit, reviewable, and source-linked. The system
never publishes unapproved claims as canonical truth.
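The principle can be illustrated with a small sketch. All names here (`SourceRef`, `Candidate`, `approve`, `canonical_entries`) are hypothetical and not taken from the codebase; the rule that approval requires at least one source reference is an assumption derived from the "source-linked" requirement above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class SourceRef:
    """Pointer to the file and line range that supports a claim."""
    path: str
    line_start: int
    line_end: int


@dataclass
class Candidate:
    """An interpreted claim that stays non-canonical until reviewed."""
    kind: str  # "ability", "capability", "feature", or "evidence"
    name: str
    sources: List[SourceRef] = field(default_factory=list)
    approved: bool = False
    reviewer: Optional[str] = None


def approve(candidate, reviewer):
    """Mark a candidate as approved; refuse claims with no source link (assumed rule)."""
    if not candidate.sources:
        raise ValueError("cannot approve a claim without source references")
    candidate.approved = True
    candidate.reviewer = reviewer
    return candidate


def canonical_entries(candidates):
    """Only approved entries count as registry truth."""
    return [c for c in candidates if c.approved]
```

Keeping approval as an explicit step on the candidate, rather than a side effect of extraction, is one way to ensure that nothing reaches the canonical view without a reviewer.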
## In Scope (v0.1)
- Repository registration by Git URL
- Deterministic repository scan (file tree, languages, frameworks, API/CLI surface)
- Candidate extraction for abilities, capabilities, features, and evidence
- Human review workflow: edit, approve, reject, merge, relink
- Natural-language and semantic search over approved registry entries
- REST API for repositories, ability maps, capabilities, and search
## Out of Scope (v0.1)
- Continuous GitHub App integration
- Full static code understanding (AST/type analysis)
- Advanced ontology enforcement
- Distributed indexing
- Benchmark execution
- Marketplace or commercial features
- Complex access control
- Automated truth claims without review
## Domain Context
Part of the **capabilities** domain — systematic modeling of abilities, capabilities,
and features across the Custodian ecosystem. First registered repo in this domain.


@@ -1,478 +0,0 @@
# Repository Ability Registry Implementation Workplan
## MVP Closure
Status: closed as MVP complete on 2026-04-26.
The v0.1 implementation now covers the core product loop:
```text
Register repository
Analyze repository
Generate source-linked candidate map
Review and approve candidates
Publish approved profile
Search, inspect, compare, gap-check, and export registry entries
```
The full test suite passed at closure (`63 passed`).
Remaining work is no longer considered part of this MVP workplan. Production
hardening items have moved to `workplans/ProductionHardeningWorkplan.md`,
including first-class analysis-run diffs, semantic/vector search, broader
fixture coverage, and richer UI surfaces for discovery workflows.
## 1. Documentation Review Summary
The wiki defines a coherent v0.1 product: a registry that turns Git repositories into reviewable, source-linked maps of:
```text
Ability -> Capability -> Feature -> Evidence -> Code location
```
The strongest architectural principle across the docs is:
```text
deterministic scanners establish observed facts
LLM-assisted extractors propose interpreted claims
humans or trusted agents approve registry truth
```
This should remain the core design constraint for implementation. The system should be conservative, explainable, reviewable, and source-linked rather than attempting fully automatic code understanding.
## 2. MVP Scope
The first version should implement the core journey documented in the PRD, FRS, architecture sketch, and use-case catalog:
```text
Register repository
Analyze repository
Generate candidate ability/capability/feature/evidence map
Review and approve candidates
Publish registry profile
Search and inspect repositories
```
In scope for v0.1:
- Repository registration by Git URL
- Repository metadata and snapshot tracking
- Deterministic repository scan
- Candidate extraction for abilities, capabilities, features, and evidence
- Human review actions: edit, approve, reject, merge, relink
- Inspectable ability map
- Natural-language search over approved registry entries
- API access for repositories, ability maps, capabilities, and search
Out of scope for v0.1:
- Continuous GitHub app integration
- Full static code understanding
- Advanced ontology enforcement
- Distributed indexing
- Benchmark execution
- Marketplace features
- Complex access control
- Automated truth claims without review
## 3. Recommended Technical Baseline
Use a pragmatic stack that keeps the analyzer and registry easy to evolve:
- Backend: Python FastAPI
- Database: PostgreSQL
- Semantic search: pgvector inside PostgreSQL
- Worker: simple background jobs first; graduate to RQ or Celery when needed
- Git access: subprocess git or GitPython
- Frontend: React/Next.js or server-rendered FastAPI templates for earliest prototype
- LLM extraction: provider-abstracted interface
- Local artifact storage: filesystem under an application data directory
For the first implementation pass, prefer a modular monolith over distributed services. Keep clean module boundaries internally, but avoid operational complexity until the product loop is proven.
## 4. Core Domain Model
Implement these entities first:
- Repository
- RepositorySnapshot
- AnalysisRun
- ObservedFact
- CandidateAbility
- CandidateCapability
- CandidateFeature
- CandidateEvidence
- ApprovedAbility
- ApprovedCapability
- ApprovedFeature
- ApprovedEvidence
- SourceReference
- ReviewDecision
The model should preserve a clear distinction between observed facts and interpreted claims.
Observed facts include things like:
- File paths
- Documentation files
- Test files
- Package manifests
- API routes
- CLI commands
- Public modules/functions
- Detected languages/frameworks
Interpreted claims include:
- Ability names and descriptions
- Capability names and descriptions
- Feature-to-capability links
- Evidence-to-capability links
- Confidence scores
## 5. Suggested Module Boundaries
Use the architecture sketch's boundaries as implementation modules:
- `repo_ingestion`: validate Git URLs, clone/fetch repos, resolve branch/commit
- `repo_scanning`: deterministic file tree, language, docs, tests, examples, API/CLI detection
- `content_indexing`: text extraction, chunking, source references, embeddings
- `llm_extraction`: prompt orchestration and structured candidate generation
- `candidate_graph`: build and validate ability/capability/feature/evidence relationships
- `review_workflow`: edit, approve, reject, merge, relink, publish
- `registry_query`: search, filters, profile retrieval, ability-map assembly
- `web_api`: HTTP endpoints and request/response schemas
- `web_ui`: registration, analysis, review, profile, and search screens
## 6. Milestones
### Milestone 0: Project Foundation
Goal: establish the application skeleton and development path.
Deliverables:
- Backend app skeleton
- Database migration setup
- Configuration system
- Local development instructions
- Basic test harness
- Health endpoint
Acceptance criteria:
- App starts locally
- Tests run locally
- Database migrations apply cleanly
### Milestone 1: Manual Registry
Goal: prove the core data model and inspection experience before automation.
Deliverables:
- Repository CRUD
- Manual ability/capability/feature/evidence CRUD
- Ability map endpoint
- Basic repository profile UI
Acceptance criteria:
- A user can create a repository profile by hand
- The UI displays `Ability -> Capability -> Feature -> Evidence`
- API returns the same map as structured JSON
### Milestone 2: Git Ingestion and Deterministic Scanner
Goal: establish trustworthy observed facts from repository contents.
Deliverables:
- Git URL validation
- Clone/fetch and checkout
- Snapshot record with branch and commit hash
- File tree scan
- README/docs/examples/tests/package manifest detection
- Basic language/framework/interface detection
- Analysis run status tracking
Acceptance criteria:
- A public Git repository can be registered and analyzed
- The system records a snapshot and deterministic scan summary
- Analysis failures are visible without corrupting prior data
### Milestone 3: Reviewable Candidate Graph
Goal: generate candidate registry entries from deterministic facts and extracted content.
Deliverables:
- Content extraction from README, docs, examples, tests, package metadata, and selected source files
- Source references with file paths and line ranges where possible
- Candidate ability generation
- Candidate capability generation
- Candidate feature generation
- Candidate evidence detection
- Confidence scoring using the documented additive factors
- Candidate graph endpoint and UI
Acceptance criteria:
- Analysis produces candidates with source references and confidence
- Candidates distinguish observed facts from interpreted claims
- Candidate output is explainable enough for curator review
### Milestone 4: Review and Approval Workflow
Goal: turn candidates into canonical registry entries.
Deliverables:
- Approve/reject candidate entries
- Edit names, descriptions, confidence, and relationships
- Merge duplicate abilities/capabilities/features
- Relink capabilities, features, and evidence
- Publish approved repository profile
- Persist review decisions
Acceptance criteria:
- A curator can correct and approve an analysis result
- Only approved entries appear in canonical search/profile views
- Repository status changes from analyzed to indexed/published
### Milestone 5: Search and Inspection
Goal: make the registry useful for discovery.
Deliverables:
- Text search over repositories, abilities, capabilities, and descriptions
- Semantic search with pgvector
- Search filters for language, framework, and ability/capability presence
- Search UI
- Repository profile drill-down UI
- Code/evidence links from features and capabilities
Acceptance criteria:
- A user can search by need using natural language
- Results show repository, matching ability/capability, confidence, and evidence level
- A user can drill from a search result into the ability map and code/evidence references
### Milestone 6: API Completeness for Agents
Goal: support programmatic consumers cleanly.
Deliverables:
- `GET /repos`
- `POST /repos`
- `GET /repos/{id}`
- `POST /repos/{id}/analysis-runs`
- `GET /repos/{id}/analysis-runs/{run_id}`
- `GET /repos/{id}/ability-map`
- `GET /abilities`
- `GET /capabilities`
- `GET /search?q=...`
- OpenAPI examples
Acceptance criteria:
- API covers repository registration, analysis, search, and inspection
- Responses are stable enough for agent/tooling integration
- OpenAPI docs describe all MVP endpoints
## 6.1 Implemented Status Checkpoint
Status date: 2026-04-26
Current implementation baseline:
- Milestone 0: implemented. FastAPI app, SQLite migrations, settings, health endpoint, README development flow, and pytest harness are in place.
- Milestone 1: implemented. Repository CRUD, manual ability/capability/feature/evidence CRUD, ability-map API, and server-rendered repository profile UI are in place.
- Milestone 2: implemented for local paths and Git URLs. Registration can import metadata, analysis records snapshots and observed facts, and failures are captured on analysis runs.
- Milestone 3: implemented for deterministic extraction plus optional LLM-assisted extraction. Analysis stores content chunks, source-linked candidates, candidate evidence, confidence scores, and confidence labels.
- Milestone 4: implemented. Candidate approval, reject, edit, relink, merge, review decisions, and indexed repository publication are supported through API and UI paths.
- Milestone 5: partially implemented. Text search, filters, search UI, ability-map drill-down, and evidence/source context are implemented. pgvector-backed semantic search remains future work.
- Milestone 6: implemented for the MVP and review workflow. Agent-facing endpoints have typed OpenAPI response schemas, examples, tags, and docs smoke coverage.
Use case coverage status:
| ID | Use Case | Implementation Status | E2E Coverage Status |
| --- | --- | --- | --- |
| UC-01 | Register Git Repository | Implemented through API and UI. | Covered by API and UI registration loops. |
| UC-02 | Import Repository Metadata | Implemented from repository files when name/description are omitted. | Covered by API and service metadata tests. |
| UC-03 | Analyze Repository Structure | Implemented by deterministic scanner and analysis runs. | Covered by API, service, scanner, and UI analysis loops. |
| UC-04 | Extract Candidate Abilities | Implemented by deterministic generator and optional LLM mapper. | Covered by API/service analysis loops and LLM extraction tests. |
| UC-05 | Extract Candidate Capabilities | Implemented by deterministic generator and optional LLM mapper. | Covered by API/service analysis loops and LLM extraction tests. |
| UC-06 | Extract Candidate Features | Implemented with detected interfaces, languages, frameworks, docs, tests, and manifests. | Covered by API/service analysis loops plus source-linked fixture e2e assertions. |
| UC-07 | Link Features to Code Locations | Implemented through feature locations and source references. | Covered by service approval tests and API e2e assertions for source paths/lines. |
| UC-08 | Attach Evidence to Capabilities | Implemented for candidate and approved evidence. | Covered by API/UI review, manual registry tests, and source-linked approved evidence e2e assertions. |
| UC-09 | Review and Approve Analysis | Implemented through approve, edit, reject, relink, merge, and review decisions. | Covered by API/service/UI review tests. |
| UC-10 | Search Repositories by Need | Implemented with text search and structured filters. | Covered by API/service/UI search tests. Semantic search remains future work. |
| UC-11 | Inspect Repository Ability Map | Implemented through API and UI profile drill-down. | Covered by API/service/UI ability-map tests. |
| UC-12 | Compare Repositories | Implemented as a read-only API comparison over approved ability maps. | Covered by API e2e comparison test. |
| UC-13 | Detect Capability Gaps | Implemented as a read-only API gap report over desired capabilities and approved maps. | Covered by API e2e gap-analysis test. |
| UC-14 | Expose Registry via API | Implemented for MVP plus review workflow. | Covered by API contract, OpenAPI, and docs smoke tests. |
| UC-15 | Update Registry After Repo Change | Partially implemented by rerunning analysis; no explicit diff/change-review workflow yet. | Covered for rerun behavior by API e2e: second analysis records new candidates without corrupting approved profile. |
| UC-16 | Export Registry Entry | Implemented as YAML export for approved registry entries. | Covered by API e2e export test. |
Immediate production-readiness test focus:
1. If UC-15 becomes a production priority, add an explicit diff/change-review model instead of relying only on rerun analysis.
2. Broaden fixture coverage over time for README-only, Python CLI, FastAPI, JavaScript/TypeScript, tests/examples, and weak-doc repositories.
3. Add richer UI affordances for comparison, gap analysis, and export if these discovery endpoints become curator-facing workflows.
## 7. Initial Database Shape
Start with tables for:
- `repositories`
- `repository_snapshots`
- `analysis_runs`
- `observed_facts`
- `source_references`
- `candidate_abilities`
- `candidate_capabilities`
- `candidate_features`
- `candidate_evidence`
- `candidate_links`
- `approved_abilities`
- `approved_capabilities`
- `approved_features`
- `approved_evidence`
- `approved_links`
- `review_decisions`
- `content_chunks`
- `embeddings`
Use status fields consistently:
```text
registered
ingesting
analyzing
analysis_failed
analyzed
reviewing
indexed
```
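One way to keep these status values consistent is a single enum shared by models and API schemas. The sketch below is illustrative only: the status strings come from the list above, but the `ALLOWED` transition table is an assumption, since the workplan does not specify which transitions are legal.

```python
from enum import Enum


class RepoStatus(str, Enum):
    REGISTERED = "registered"
    INGESTING = "ingesting"
    ANALYZING = "analyzing"
    ANALYSIS_FAILED = "analysis_failed"
    ANALYZED = "analyzed"
    REVIEWING = "reviewing"
    INDEXED = "indexed"


# Hypothetical transition table; re-analysis restarts from ingestion.
ALLOWED = {
    RepoStatus.REGISTERED: {RepoStatus.INGESTING},
    RepoStatus.INGESTING: {RepoStatus.ANALYZING, RepoStatus.ANALYSIS_FAILED},
    RepoStatus.ANALYZING: {RepoStatus.ANALYZED, RepoStatus.ANALYSIS_FAILED},
    RepoStatus.ANALYSIS_FAILED: {RepoStatus.INGESTING},
    RepoStatus.ANALYZED: {RepoStatus.REVIEWING, RepoStatus.INGESTING},
    RepoStatus.REVIEWING: {RepoStatus.INDEXED, RepoStatus.INGESTING},
    RepoStatus.INDEXED: {RepoStatus.INGESTING},
}


def can_transition(current, target):
    """Return True if moving from current to target is in the assumed table."""
    return target in ALLOWED[current]
```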
## 8. Analyzer v0.1 Strategy
The first analyzer should be intentionally modest.
Deterministic scan:
- Identify repo root metadata files
- Identify docs, examples, tests, package manifests, API specs, config files
- Detect languages from extensions and package files
- Detect common frameworks from manifests
- Detect likely API/CLI features using simple framework-specific scanners
Content extraction:
- README and docs first
- Examples and tests second
- Selected source files only when they expose interfaces
- Preserve path and line references
LLM extraction:
- Use separate prompts for abilities, capabilities, features, and evidence
- Request structured JSON
- Require source references for each candidate
- Reject or mark speculative any candidate without supporting sources
Confidence scoring:
- Start from the documented additive model
- Normalize to `0.0-1.0`
- Store both numeric confidence and label
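An additive score with clamping and a label could look roughly like the sketch below. The factor names, weights, and label thresholds are placeholders chosen for illustration; the documented additive factors are not reproduced in this workplan.

```python
# Placeholder weights; the real additive factors live in the project docs.
HYPOTHETICAL_WEIGHTS = {
    "documented_in_readme": 0.3,
    "covered_by_tests": 0.3,
    "shown_in_examples": 0.2,
    "exposed_interface": 0.2,
}


def score_confidence(factors):
    """Sum the weights of the present factors, clamp to 0.0-1.0, and attach a label."""
    raw = sum(HYPOTHETICAL_WEIGHTS.get(name, 0.0) for name in set(factors))
    value = round(min(1.0, max(0.0, raw)), 2)
    if value >= 0.75:  # assumed threshold
        label = "high"
    elif value >= 0.4:  # assumed threshold
        label = "medium"
    else:
        label = "low"
    return value, label
```

Returning both the number and the label matches the requirement to store the two together, so the label never has to be recomputed from a threshold that may later change.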
## 9. UI Workplan
Build application screens in this order:
1. Repository list
2. Repository registration
3. Repository detail and analysis status
4. Deterministic scan summary
5. Candidate review tree
6. Published repository profile
7. Search
The UI should feel like an operational tool rather than a marketing site: dense, clear, review-focused, and optimized for repeated curator work.
## 10. Testing Strategy
Add tests around the highest-risk boundaries:
- Database migrations and model relationships
- Git URL validation
- Scanner output for fixture repositories
- Candidate graph validation
- Review workflow transitions
- Search result ranking and filtering
- API contract tests for MVP endpoints
Create small fixture repositories for:
- README-only repository
- Python CLI repository
- FastAPI repository
- JavaScript/TypeScript package
- Repository with tests and examples
- Repository with weak or misleading docs
## 11. Key Risks and Mitigations
Extraction quality risk:
- Require source references.
- Keep candidates reviewable.
- Separate observed facts from interpreted claims.
Over-complex ontology risk:
- Keep v0.1 schema minimal.
- Avoid enforcing deep taxonomy too early.
Search quality risk:
- Combine relational filters, full-text search, and vector search.
- Show why a result matched.
Operational complexity risk:
- Start as a modular monolith.
- Use simple jobs before adding worker infrastructure.
Trust risk:
- Never publish unapproved claims as canonical truth.
- Preserve analysis run history and review decisions.
## 12. Immediate Next Actions
Recommended next implementation sequence:
1. Scaffold the FastAPI application, database migrations, and test harness.
2. Implement the core schema for repositories, snapshots, analysis runs, observed facts, candidates, and approved entries.
3. Add manual registry CRUD and ability-map API.
4. Build a minimal repository list/profile UI.
5. Add Git ingestion and deterministic scanning.
6. Add candidate graph generation and review workflow.
The first meaningful demo should be:
```text
Create a repository
Add or generate an ability map
Approve it
Search for a capability
Open the repository profile
Drill down to feature and evidence locations
```


@@ -1,185 +0,0 @@
# Repository Ability Registry Production Hardening Workplan
Status: open
Created: 2026-04-26
This workplan starts after the v0.1 MVP closure. The MVP proves the core loop:
```text
Register -> Analyze -> Review -> Approve -> Search/Inspect
```
Production hardening should improve trust, search quality, update safety, and
operational readiness without weakening the core rule:
```text
observed facts are deterministic
interpreted claims are reviewable
approved registry truth is explicit
```
## 1. Priorities
### P0: Update Safety and Change Review
Goal: make repeated analysis after repository changes first-class.
Current state:
- A repository can be analyzed repeatedly.
- Each run records a snapshot, observed facts, chunks, and candidates.
- Existing approved profiles are not corrupted by later runs.
- There is no explicit analysis-run diff or change-review workflow.
Deliverables:
- Analysis-run diff model for facts, chunks, candidates, and approved entries.
- API endpoint to compare two analysis runs for a repository.
- Review view for changed, added, removed, or weakened claims.
- Review decisions that record change acceptance/rejection.
- Tests proving approved profiles remain stable until changes are approved.
Candidate endpoints:
```text
GET /repos/{id}/analysis-runs/{base_run_id}/diff/{target_run_id}
POST /repos/{id}/analysis-runs/{target_run_id}/changes/approve
```
Acceptance criteria:
- A user can see what changed between two analysis runs.
- New claims are not published automatically.
- Removed or weakened evidence is visible before approval.
- Review decisions preserve the reason for accepting or rejecting changes.
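The core of a read-only diff can be sketched with set operations. This assumes each analysis run is reduced to a mapping from a stable claim key to its confidence; that representation and the `diff_runs` name are hypothetical, not the planned data model.

```python
def diff_runs(base, target):
    """Compare two runs given as {claim_key: confidence} mappings."""
    base_keys, target_keys = set(base), set(target)
    shared = base_keys & target_keys
    return {
        "added": sorted(target_keys - base_keys),
        "removed": sorted(base_keys - target_keys),
        # A claim is "weakened" when it survives but with lower confidence.
        "weakened": sorted(k for k in shared if target[k] < base[k]),
    }
```

For example, comparing `{"GET /repos": 0.9, "GET /search": 0.8}` against `{"GET /repos": 0.6, "POST /repos": 0.7}` reports `POST /repos` as added, `GET /search` as removed, and `GET /repos` as weakened. Nothing in the diff is published; it only feeds the review step.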
### P1: Search Quality and Semantic Retrieval
Goal: improve discovery beyond simple text matching while keeping results
explainable.
Current state:
- Text search covers repositories, abilities, capabilities, features, and evidence.
- Filters exist for status, language, framework, ability, and capability.
- Result explanations include matched field and context.
- Semantic/vector search is not implemented.
Deliverables:
- Embedding abstraction for approved registry entries and content chunks.
- Local/offline fake embedding provider for tests.
- Optional pgvector/PostgreSQL backend path, without breaking SQLite dev mode.
- Hybrid ranking that combines text match, filters, confidence, and vector score.
- Search response fields that explain text and semantic match reasons.
Acceptance criteria:
- Existing text search behavior remains stable.
- Semantic search can be enabled optionally.
- Search results remain source-linked and explainable.
- Tests cover ranking, filters, and fallback when embeddings are unavailable.
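A hybrid ranking with a fallback for missing embeddings might be sketched as follows. The weights and the `hybrid_score` function are illustrative assumptions; all inputs are taken to be already normalized to the range 0.0 to 1.0.

```python
def hybrid_score(text_score, confidence, vector_score=None, weights=(0.5, 0.2, 0.3)):
    """Combine text match, confidence, and optional vector score into one ranking value."""
    w_text, w_conf, w_vec = weights
    reasons = []
    if text_score > 0:
        reasons.append("text match")
    if vector_score is None:
        # Embeddings unavailable: renormalize over the remaining two weights.
        score = (w_text * text_score + w_conf * confidence) / (w_text + w_conf)
    else:
        score = w_text * text_score + w_conf * confidence + w_vec * vector_score
        if vector_score > 0:
            reasons.append("semantic match")
    return {"score": score, "reasons": reasons}
```

Returning the reasons alongside the score keeps each result explainable, and the `None` branch preserves text-only ranking when semantic search is disabled.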
### P1: Discovery UI
Goal: make comparison, gap analysis, and export usable from the curator UI.
Current state:
- API endpoints exist for repository comparison, capability-gap reports, and YAML export.
- There are no dedicated UI workflows for these discovery helpers.
Deliverables:
- Repository comparison screen.
- Capability-gap input and report screen.
- Export action from repository profile.
- Clear empty states for repositories without approved profiles.
Acceptance criteria:
- A user can compare at least two approved repositories from the UI.
- A user can enter desired capabilities and inspect missing/weak/duplicate results.
- A user can export a registry entry without using raw API calls.
### P1: Fixture Breadth and Regression Confidence
Goal: broaden real-world coverage across repository styles.
Current state:
- Tests cover manual registry, FastAPI-like routes, docs/tests/examples, UI loops,
LLM fallback, and source-linked approval.
- Fixture helper coverage exists for README-only, Python CLI, and misleading-docs
repositories, but not every fixture style has full e2e coverage.
Deliverables:
- E2E tests for README-only repositories.
- E2E tests for Python CLI repositories.
- E2E tests for JavaScript/TypeScript packages.
- E2E tests for repositories with weak or misleading documentation.
- Negative tests for unsupported or empty repositories.
Acceptance criteria:
- Candidate extraction stays conservative when docs are weak.
- CLI and frontend/package repositories produce useful facts and candidates.
- Misleading docs do not become approved truth without review.
### P2: Operational Readiness
Goal: make the service easier to run and diagnose outside local development.
Deliverables:
- Structured logging around ingestion, analysis, LLM extraction, and review actions.
- Configuration documentation for database, checkout root, and LLM provider settings.
- Basic metrics or health details for database and checkout root reachability.
- Backup/restore guidance for SQLite MVP deployments.
- Migration strategy notes for a future PostgreSQL deployment.
Acceptance criteria:
- Operators can diagnose failed analyses from logs and API state.
- Configuration is documented in one place.
- Database migration and backup expectations are clear.
### P2: API Contract Stability
Goal: make agent/tooling integration safer as the API grows.
Deliverables:
- Versioned API path or explicit compatibility policy.
- Golden OpenAPI snapshot test or schema-diff check.
- More response examples for discovery and change-review endpoints.
- Error response schema for common 400/404 cases.
Acceptance criteria:
- Breaking API changes are deliberate and visible in tests.
- Agent-facing endpoints have stable request/response models.
## 2. First Implementation Sequence
Recommended next sequence:
1. Add analysis-run diff data structures and a read-only diff endpoint.
2. Add e2e tests for rerun diff behavior with added/removed routes and evidence.
3. Add UI display for analysis-run diffs.
4. Add approval workflow for selected changes.
5. Broaden fixture e2e coverage for README-only, CLI, JS/TS, and weak-doc repos.
6. Add discovery UI for compare, gaps, and export.
7. Prototype optional semantic search behind a feature/config boundary.
## 3. Definition of Done
This hardening plan can be closed when:
- Re-analysis changes are explicit, reviewable, and tested end to end.
- Search supports an optional semantic mode while preserving explainability.
- Discovery workflows are available in both API and UI.
- Fixture coverage represents the major repository shapes in the use case catalog.
- API contract changes are guarded by tests.
- Operational docs cover running, configuring, diagnosing, and backing up the service.


@@ -0,0 +1,115 @@
---
id: RREG-WP-0001
type: workplan
title: "Repository Ability Registry — MVP Implementation"
domain: capabilities
repo: repo-registry
status: done
owner: codex
topic_slug: foerster-capabilities
created: "2026-04-26"
updated: "2026-04-26"
state_hub_workstream_id: "acee5529-43c0-4519-94bc-e59e61719af1"
---
# RREG-WP-0001 — MVP Implementation
## Goal
Prove the core registry loop: Register → Analyze → Review → Approve → Search/Inspect.
Core design constraint: deterministic scanners establish observed facts; LLM-assisted
extractors propose interpreted claims; humans or trusted agents approve registry truth.
## Scaffold and Foundation
```task
id: RREG-WP-0001-T01
status: done
priority: high
state_hub_task_id: "d26e70c9-e3a9-48f9-951c-c6210208aef4"
```
App skeleton, database migration setup, configuration system, health endpoint, basic
test harness, local development instructions. Acceptance: app starts, tests run,
migrations apply cleanly.
## Manual Registry
```task
id: RREG-WP-0001-T02
status: done
priority: high
state_hub_task_id: "e9c1cb9b-be21-4862-9443-a3e6c390882e"
```
Repository CRUD, manual ability/capability/feature/evidence CRUD, ability-map endpoint,
basic repository profile UI. Acceptance: user can create a profile by hand; UI and API
both display `Ability → Capability → Feature → Evidence`.
## Git Ingestion and Deterministic Scanner
```task
id: RREG-WP-0001-T03
status: done
priority: high
state_hub_task_id: "4a9fd9f0-d968-4778-93d4-609090c13c62"
```
Git URL validation, clone/fetch, snapshot with branch+commit hash, file tree scan,
README/docs/examples/tests/manifest detection, language/framework detection, analysis
run status tracking. Acceptance: public repo can be registered and analyzed; failures
visible without corrupting prior data.
## Reviewable Candidate Graph
```task
id: RREG-WP-0001-T04
status: done
priority: high
state_hub_task_id: "9aa84fb5-998b-480e-adf8-3c1b0faa395a"
```
Content extraction, source references, candidate ability/capability/feature/evidence
generation, confidence scoring, candidate graph endpoint and UI. Acceptance: candidates
have source references and confidence; output is explainable for curator review.
## Review and Approval Workflow
```task
id: RREG-WP-0001-T05
status: done
priority: high
state_hub_task_id: "0b5cff5d-4bc8-48ca-aa41-2f1d0c774bed"
```
Approve/reject/edit/merge/relink candidates, publish approved profile, persist review
decisions. Acceptance: curator can correct and approve an analysis result; only approved
entries appear in canonical views.
## Search and Inspection
```task
id: RREG-WP-0001-T06
status: done
priority: high
state_hub_task_id: "13e506eb-017a-4710-b20b-ae10ac8df9d4"
```
Text search over repos/abilities/capabilities/descriptions, search filters, search UI,
repository profile drill-down, code/evidence links. Acceptance: user can search by need;
results show repo, ability/capability, confidence, evidence level.
## API Completeness for Agents
```task
id: RREG-WP-0001-T07
status: done
priority: high
state_hub_task_id: "cb94bb9c-756f-446a-8a59-54aa5a51eff0"
```
Full MVP REST API: GET/POST /repos, GET /repos/{id}, POST/GET /repos/{id}/analysis-runs,
GET /repos/{id}/ability-map, GET /abilities, GET /capabilities, GET /search, OpenAPI
examples. Acceptance: API covers registration, analysis, search, inspection; stable
enough for agent/tooling integration.

@@ -0,0 +1,108 @@
---
id: RREG-WP-0002
type: workplan
title: "Repository Ability Registry — Production Hardening"
domain: capabilities
repo: repo-registry
status: active
owner: codex
topic_slug: foerster-capabilities
created: "2026-04-26"
updated: "2026-04-26"
state_hub_workstream_id: "4218d2bb-33e8-4f98-94ba-38b4ac21502d"
---
# RREG-WP-0002 — Production Hardening
## Goal
Improve trust, search quality, update safety, and operational readiness after MVP
closure. Core invariant remains: observed facts are deterministic; interpreted claims
are reviewable; approved registry truth is explicit.
## P0: Update Safety and Change Review
```task
id: RREG-WP-0002-T01
status: todo
priority: high
state_hub_task_id: "a27142a6-c160-4453-ab59-50a7db92f9c4"
```
Analysis-run diff model for facts/chunks/candidates/approved entries. API endpoint to
compare two runs. UI review view for changed/added/removed/weakened claims. Review
decisions that record change acceptance/rejection. Tests proving approved profiles
remain stable until changes are approved.
Proposed endpoints:
- `GET /repos/{id}/analysis-runs/{base}/diff/{target}`
- `POST /repos/{id}/analysis-runs/{target}/changes/approve`
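A rough sketch of how the run diff could classify claims as added, removed, changed, or weakened. The `Claim` shape, the stable `key` field, and the `eps` threshold are hypothetical; the real diff model is still to be designed under this task.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    key: str           # hypothetical stable identity, e.g. "capability:parse-config"
    text: str
    confidence: float


def diff_runs(base: list[Claim], target: list[Claim], eps: float = 0.05) -> dict[str, list[str]]:
    """Classify claim keys by how they differ between a base and a target run."""
    base_by_key = {c.key: c for c in base}
    target_by_key = {c.key: c for c in target}
    result: dict[str, list[str]] = {"added": [], "removed": [], "changed": [], "weakened": []}

    result["added"] = sorted(target_by_key.keys() - base_by_key.keys())
    result["removed"] = sorted(base_by_key.keys() - target_by_key.keys())

    for key in sorted(base_by_key.keys() & target_by_key.keys()):
        old, new = base_by_key[key], target_by_key[key]
        if new.text != old.text:
            result["changed"].append(key)
        elif old.confidence - new.confidence > eps:
            # Same wording, but the target run is less confident.
            result["weakened"].append(key)
    return result
```

Under this sketch the approved profile would keep serving the base run until a review decision accepts the entries in the diff.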
## P1: Search Quality and Semantic Retrieval
```task
id: RREG-WP-0002-T02
status: todo
priority: medium
state_hub_task_id: "0e7cce78-13ab-4aa2-8d25-ae50ff8ccd74"
```
Embedding abstraction for approved entries and content chunks. Local/offline fake
provider for tests. Optional pgvector path without breaking SQLite dev mode. Hybrid
ranking combining text match, filters, confidence, and vector score. Existing text
search behavior must remain stable.
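One possible shape for the hybrid ranking, shown as a sketch. The weights, the entry fields (`text`, `vector`, `confidence`), and the term-overlap text score are all assumptions; filters are assumed to be applied before ranking.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def text_score(query: str, text: str) -> float:
    """Fraction of query terms that appear in the text (stand-in for the real text match)."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    return len(terms & set(text.lower().split())) / len(terms)


def hybrid_score(query, query_vec, entry, w_text=0.5, w_vec=0.3, w_conf=0.2) -> float:
    """Weighted sum of text match, vector similarity, and stored confidence."""
    return (
        w_text * text_score(query, entry["text"])
        + w_vec * max(0.0, cosine(query_vec, entry["vector"]))
        + w_conf * entry["confidence"]
    )


def rank(query, query_vec, entries, **weights):
    """Return entries ordered by descending hybrid score."""
    return sorted(entries, key=lambda e: hybrid_score(query, query_vec, e, **weights), reverse=True)
```

Setting `w_vec=0` and `w_conf=0` reduces the score to the text match alone, which is one way to keep the existing text search behavior available as a regression baseline.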
## P1: Discovery UI
```task
id: RREG-WP-0002-T03
status: todo
priority: medium
state_hub_task_id: "aee945eb-ea25-49f7-b755-4ec451c1d05a"
```
Repository comparison screen. Capability-gap input and report screen. Export action
from repository profile. Clear empty states for repos without approved profiles.
Acceptance: user can compare approved repos, enter desired capabilities, and export
without raw API calls.
## P1: Fixture Breadth and Regression Confidence
```task
id: RREG-WP-0002-T04
status: todo
priority: medium
state_hub_task_id: "d1df6453-3bca-4524-85d1-9b3f3f275b45"
```
E2E tests for: README-only repos, Python CLI repos, JavaScript/TypeScript packages,
repos with weak or misleading docs, negative cases for unsupported/empty repos.
Acceptance: conservative candidate extraction when docs are weak; misleading docs
do not become approved truth without review.
## P2: Operational Readiness
```task
id: RREG-WP-0002-T05
status: todo
priority: low
state_hub_task_id: "44b10491-f1f2-4e2e-9a8e-e8bd59cbf892"
```
Structured logging around ingestion, analysis, LLM extraction, and review actions.
Configuration documentation. Basic health details for DB and checkout root.
Backup/restore guidance for SQLite deployments. Migration strategy notes for PostgreSQL.
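A sketch of structured logging using only the standard library. The logger name, the event names, and the `context` key passed through `extra` are illustrative assumptions, not existing project conventions.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}} are merged into the payload.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def get_logger(name: str) -> logging.Logger:
    """Return a logger that emits JSON lines to stderr, configured once."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
        logger.propagate = False
    return logger


log = get_logger("repo_registry.analysis")  # hypothetical logger name
log.info("analysis_run_started", extra={"context": {"repo_id": 1, "run_id": 42}})
```

Keeping the event name short and the identifiers in separate fields makes it straightforward to filter ingestion, analysis, LLM extraction, and review actions by repo or run.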
## P2: API Contract Stability
```task
id: RREG-WP-0002-T06
status: todo
priority: low
state_hub_task_id: "271a4fc4-d966-40ef-bc6f-a5fd1c445a16"
```
Versioned API path or explicit compatibility policy. Golden OpenAPI snapshot test or
schema-diff check. More response examples for discovery and change-review endpoints.
Error response schema for common 400/404 cases. Acceptance: breaking changes are
deliberate and visible in tests; agent-facing endpoints have stable models.
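A sketch of the schema-diff check, assuming the golden snapshot and the current schema are both available as parsed OpenAPI dictionaries. It only detects removed paths and operations; the function name and the scope of what counts as breaking are assumptions.

```python
def breaking_changes(golden: dict, current: dict) -> list[str]:
    """List paths and operations present in the golden schema but missing from the current one."""
    problems: list[str] = []
    golden_paths = golden.get("paths", {})
    current_paths = current.get("paths", {})
    for path, operations in sorted(golden_paths.items()):
        if path not in current_paths:
            problems.append(f"removed path: {path}")
            continue
        for method in sorted(operations):
            if method not in current_paths[path]:
                problems.append(f"removed operation: {method.upper()} {path}")
    return problems
```

A test could load the committed snapshot, generate the current schema from the app, and assert that `breaking_changes` returns an empty list, so that any removal has to be accepted by updating the snapshot deliberately.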