chore(custodian): onboard repo-registry to Custodian State Hub

- Register under capabilities domain (foerster_capabilities renamed)
- Replace prose workplans with ADR-001 format (RREG-WP-0001 done, RREG-WP-0002 active)
- Add AGENTS.md for Codex agent state-hub integration via HTTP API
- Add SCOPE.md with domain context and v0.1 scope boundaries

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 13:06:35 +02:00
parent 249641728b
commit 4e17c9fea9
6 changed files with 440 additions and 663 deletions
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -0,0 +1,169 @@
 # repo-registry — Agent Instructions
 ## Repo Identity
 **Purpose:** Repository Ability Registry — turns Git repositories into reviewable,
 source-linked maps of `Ability → Capability → Feature → Evidence`. Deterministic
 scanners establish observed facts; LLM-assisted extractors propose interpreted
 claims; humans or trusted agents approve registry truth.
 **Domain:** capabilities  
 **Repo slug:** repo-registry  
 **Topic ID:** `64418556-3206-457a-ba29-6884b5b12cf3`  
 **Workplan prefix:** `RREG-WP-`
 ---
 ## State Hub Integration
 The Custodian State Hub tracks work across all domains. It runs at
 `http://127.0.0.1:8000` (local) or `http://127.0.0.1:18000` when accessed from
 a remote machine via tunnel.
 Interact via HTTP — there is no MCP integration for Codex agents.
 ### Orient at session start
 ```bash
 # Domain workstreams
 curl -s "http://127.0.0.1:8000/workstreams/?topic_id=64418556-3206-457a-ba29-6884b5b12cf3&status=active" \
  | python3 -m json.tool
 # Open tasks for this repo (once workstreams are registered)
 curl -s "http://127.0.0.1:8000/tasks/?status=todo" | python3 -m json.tool
 # Check inbox
 curl -s "http://127.0.0.1:8000/messages/?to_agent=repo-registry&unread_only=true" \
  | python3 -m json.tool
 ```
 Also read `workplans/` directly — the files are the source of truth:
 ```bash
 ls workplans/
 grep -h "^status:" workplans/RREG-WP-*.md
 ```
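The same check can be sketched in Python for agents that prefer structured output over `grep`. This is a minimal sketch, assuming the frontmatter is a block of simple `key: value` lines between two `---` markers, as in the workplan convention in this file. The helper names `read_frontmatter` and `workplan_statuses` are hypothetical, and the parsing is deliberately naive (no YAML library, no nested values).

```python
from pathlib import Path


def read_frontmatter(text: str) -> dict[str, str]:
    """Return the key/value pairs between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; the body is ignored
        key, sep, value = line.partition(":")
        if sep:
            # Naive cleanup: drop a trailing comment and surrounding quotes
            value = value.split("#", 1)[0].strip().strip('"')
            fields[key.strip()] = value
    return fields


def workplan_statuses(directory: str = "workplans") -> dict[str, str]:
    """Map each RREG-WP-*.md file name to its frontmatter status."""
    result = {}
    for path in sorted(Path(directory).glob("RREG-WP-*.md")):
        fields = read_frontmatter(path.read_text(encoding="utf-8"))
        result[path.name] = fields.get("status", "unknown")
    return result
```

Calling `workplan_statuses()` from the repo root returns one entry per workplan file; files without frontmatter are reported as `unknown` rather than raising.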
 ### Log progress (required at session close)
 ```bash
 curl -s -X POST http://127.0.0.1:8000/progress/ \
  -H "Content-Type: application/json" \
  -d '{
    "summary": "describe what was done",
    "event_type": "note",
    "author": "codex"
  }'
 ```
 Include `"workstream_id": "<uuid>"` and `"task_id": "<uuid>"` when known.
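For agents that build the request in Python rather than shelling out to `curl`, the payload rules above can be sketched as follows. This is an illustrative sketch only: the helper names are hypothetical, and the field names are taken from the `curl` example in this section, not from a published hub schema.

```python
import json
import urllib.request

HUB_URL = "http://127.0.0.1:8000"  # local hub address used in this document


def build_progress_payload(summary, workstream_id=None, task_id=None,
                           event_type="note", author="codex"):
    """Assemble the JSON body, adding the optional IDs only when known."""
    payload = {"summary": summary, "event_type": event_type, "author": author}
    if workstream_id:
        payload["workstream_id"] = workstream_id
    if task_id:
        payload["task_id"] = task_id
    return payload


def build_progress_request(payload, base_url=HUB_URL):
    """Build (but do not send) the POST /progress/ request."""
    return urllib.request.Request(
        f"{base_url}/progress/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result of `build_progress_request(...)` to `urllib.request.urlopen` sends it; keeping the build step separate makes the payload easy to inspect before the hub is reachable.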
 ### Mark a message read
 ```bash
 curl -s -X PATCH "http://127.0.0.1:8000/messages/<message_id>/read" \
  -H "Content-Type: application/json" -d '{}'
 ```
 ### Update task status (after workstreams are synced)
 ```bash
 curl -s -X PATCH "http://127.0.0.1:8000/tasks/<task_id>/" \
  -H "Content-Type: application/json" \
  -d '{"status": "in_progress"}'
 ```
 ---
 ## Session Protocol
 **Start:**
 1. `ls workplans/` — note active workplans and their open tasks
 2. Check inbox via `GET /messages/?to_agent=repo-registry&unread_only=true`
 3. Check for human-flagged tasks: `GET /tasks/?needs_human=true`
 **During work:**
 - Update task status in the workplan file as tasks progress
 - For significant decisions, record them: `POST /decisions/`
 **Close:**
 1. Update task statuses in workplan files to match progress
 2. Call `POST /progress/` with a summary of what was done
 3. If workplan files changed, note that `fix-consistency` should be run from
   the custodian machine: `cd ~/the-custodian/state-hub && make fix-consistency REPO=repo-registry`
 ---
 ## Workplan Convention (ADR-001)
 Work items originate as files in this repo, not in the hub. The hub is a
 read/cache/index layer.
 **File location:** `workplans/RREG-WP-NNNN-<slug>.md`
 **Frontmatter:**
 ```yaml
 ---
 id: RREG-WP-NNNN
 type: workplan
 title: "..."
 domain: capabilities
 repo: repo-registry
 status: active | done
 owner: codex
 topic_slug: foerster-capabilities
 created: "YYYY-MM-DD"
 updated: "YYYY-MM-DD"
 state_hub_workstream_id: "<uuid>"   # populated by fix-consistency
 ---
 ```
 **Task blocks** (one per `##` section):
 ```markdown
 ## Task Title
 \`\`\`task
 id: RREG-WP-NNNN-T01
 status: todo | in_progress | done | blocked
 priority: high | medium | low
 \`\`\`
 Task description.
 ```
 **Status values:** `todo` → `in_progress` → `done` (or `blocked`)
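A minimal sketch of reading task blocks out of a workplan file, assuming each block is a fenced `task` section holding simple `key: value` lines as shown above. The function name `parse_task_blocks` and the strict status check are illustrative and are not defined by ADR-001.

```python
import re

FENCE = "`" * 3  # built indirectly so this example nests cleanly in Markdown
TASK_BLOCK = re.compile(
    rf"^{FENCE}task[ \t]*\n(.*?)^{FENCE}[ \t]*$", re.DOTALL | re.MULTILINE
)
ALLOWED_STATUS = {"todo", "in_progress", "done", "blocked"}


def parse_task_blocks(markdown: str) -> list[dict[str, str]]:
    """Return one dict per task block, rejecting unknown status values."""
    tasks = []
    for match in TASK_BLOCK.finditer(markdown):
        fields = {}
        for line in match.group(1).splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip().strip('"')
        if fields.get("status") not in ALLOWED_STATUS:
            raise ValueError(f"unexpected status in task {fields.get('id')!r}")
        tasks.append(fields)
    return tasks
```

Failing loudly on an unknown status is a design choice for this sketch: a typo such as `in-progress` is caught when the file is read instead of silently drifting from the hub's view.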
 ---
 ## Stack and Commands
 **Runtime:** Python 3.x, FastAPI, SQLite (dev) / PostgreSQL (prod)  
 **Package manager:** pip / uv
 ```bash
 # Install
 pip install -e ".[dev]"
 # Run dev server on port 8001 (the State Hub occupies 8000; the health check below expects 8001)
 uvicorn src.repo_registry.app:app --reload --port 8001
 # Run tests
 pytest tests/
 pytest tests/ -k "e2e"
 # Check API health
 curl http://127.0.0.1:8001/health
 ```
 ---
 ## Repo Boundary
 This repo owns: repository ingestion, deterministic scanning, LLM-assisted candidate
 extraction, review/approval workflow, registry query and search.
 It does NOT own: the Custodian State Hub, other domain repos, deployment infrastructure.
 Coordination with other domains goes through the State Hub message inbox.
--- a/SCOPE.md
+++ b/SCOPE.md
@@ -0,0 +1,48 @@
 ---
 domain: capabilities
 repo: repo-registry
 updated: "2026-04-26"
 ---
 # repo-registry — Scope
 ## Purpose
 Repository Ability Registry. Turns Git repositories into reviewable, source-linked
 maps of `Ability → Capability → Feature → Evidence`.
 ## Core Design Principle
 ```
 deterministic scanners  →  observed facts  (file paths, languages, API routes, …)
 LLM-assisted extractors →  interpreted claims  (ability names, descriptions, links)
 human / agent review    →  approved registry truth
 ```
 Approved entries are always explicit, reviewable, and source-linked. The system
 never publishes unapproved claims as canonical truth.
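The approval rule can be illustrated with a small sketch. The `Claim` type and the `approve` and `canonical_view` helpers are hypothetical names chosen for illustration; they are not the repository's actual models. The sketch only shows the invariant that nothing reaches the canonical view without an explicit review step and a source link.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Claim:
    name: str
    source_paths: tuple[str, ...]      # evidence locations backing the claim
    approved_by: Optional[str] = None  # set only by an explicit review step


def approve(claim: Claim, reviewer: str) -> Claim:
    """Return an approved copy; refuse claims that are not source-linked."""
    if not claim.source_paths:
        raise ValueError("claims without source references cannot be approved")
    if not reviewer:
        raise ValueError("approval needs a named reviewer")
    return replace(claim, approved_by=reviewer)


def canonical_view(claims: list[Claim]) -> list[Claim]:
    """Only approved claims are published as registry truth."""
    return [c for c in claims if c.approved_by is not None]
```

Because `Claim` is frozen, approval produces a new object and the original candidate stays unchanged, which keeps the review history easy to reason about.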
 ## In Scope (v0.1)
 - Repository registration by Git URL
 - Deterministic repository scan (file tree, languages, frameworks, API/CLI surface)
 - Candidate extraction for abilities, capabilities, features, and evidence
 - Human review workflow: edit, approve, reject, merge, relink
 - Natural-language and semantic search over approved registry entries
 - REST API for repositories, ability maps, capabilities, and search
 ## Out of Scope (v0.1)
 - Continuous GitHub App integration
 - Full static code understanding (AST/type analysis)
 - Advanced ontology enforcement
 - Distributed indexing
 - Benchmark execution
 - Marketplace or commercial features
 - Complex access control
 - Automated truth claims without review
 ## Domain Context
 Part of the **capabilities** domain — systematic modeling of abilities, capabilities,
 and features across the Custodian ecosystem. First registered repo in this domain.
--- a/workplans/ImplementationWorkplan.md
+++ b/workplans/ImplementationWorkplan.md
@@ -1,478 +0,0 @@
 # Repository Ability Registry Implementation Workplan
 ## MVP Closure
 Status: closed as MVP complete on 2026-04-26.
 The v0.1 implementation now covers the core product loop:
 ```text
 Register repository
 Analyze repository
 Generate source-linked candidate map
 Review and approve candidates
 Publish approved profile
 Search, inspect, compare, gap-check, and export registry entries
 ```
 The full test suite passed at closure (`63 passed`).
 Remaining work is no longer considered part of this MVP workplan. Production
 hardening items have moved to `workplans/ProductionHardeningWorkplan.md`,
 including first-class analysis-run diffs, semantic/vector search, broader
 fixture coverage, and richer UI surfaces for discovery workflows.
 ## 1. Documentation Review Summary
 The wiki defines a coherent v0.1 product: a registry that turns Git repositories into reviewable, source-linked maps of:
 ```text
 Ability -> Capability -> Feature -> Evidence -> Code location
 ```
 The strongest architectural principle across the docs is:
 ```text
 deterministic scanners establish observed facts
 LLM-assisted extractors propose interpreted claims
 humans or trusted agents approve registry truth
 ```
 This should remain the core design constraint for implementation. The system should be conservative, explainable, reviewable, and source-linked rather than attempting fully automatic code understanding.
 ## 2. MVP Scope
 The first version should implement the core journey documented in the PRD, FRS, architecture sketch, and use-case catalog:
 ```text
 Register repository
 Analyze repository
 Generate candidate ability/capability/feature/evidence map
 Review and approve candidates
 Publish registry profile
 Search and inspect repositories
 ```
 In scope for v0.1:
 - Repository registration by Git URL
 - Repository metadata and snapshot tracking
 - Deterministic repository scan
 - Candidate extraction for abilities, capabilities, features, and evidence
 - Human review actions: edit, approve, reject, merge, relink
 - Inspectable ability map
 - Natural-language search over approved registry entries
 - API access for repositories, ability maps, capabilities, and search
 Out of scope for v0.1:
 - Continuous GitHub app integration
 - Full static code understanding
 - Advanced ontology enforcement
 - Distributed indexing
 - Benchmark execution
 - Marketplace features
 - Complex access control
 - Automated truth claims without review
 ## 3. Recommended Technical Baseline
 Use a pragmatic stack that keeps the analyzer and registry easy to evolve:
 - Backend: Python FastAPI
 - Database: PostgreSQL
 - Semantic search: pgvector inside PostgreSQL
 - Worker: simple background jobs first; graduate to RQ or Celery when needed
 - Git access: subprocess git or GitPython
 - Frontend: React/Next.js or server-rendered FastAPI templates for earliest prototype
 - LLM extraction: provider-abstracted interface
 - Local artifact storage: filesystem under an application data directory
 For the first implementation pass, prefer a modular monolith over distributed services. Keep clean module boundaries internally, but avoid operational complexity until the product loop is proven.
 ## 4. Core Domain Model
 Implement these entities first:
 - Repository
 - RepositorySnapshot
 - AnalysisRun
 - ObservedFact
 - CandidateAbility
 - CandidateCapability
 - CandidateFeature
 - CandidateEvidence
 - ApprovedAbility
 - ApprovedCapability
 - ApprovedFeature
 - ApprovedEvidence
 - SourceReference
 - ReviewDecision
 The model should preserve a clear distinction between observed facts and interpreted claims.
 Observed facts include things like:
 - File paths
 - Documentation files
 - Test files
 - Package manifests
 - API routes
 - CLI commands
 - Public modules/functions
 - Detected languages/frameworks
 Interpreted claims include:
 - Ability names and descriptions
 - Capability names and descriptions
 - Feature-to-capability links
 - Evidence-to-capability links
 - Confidence scores
 ## 5. Suggested Module Boundaries
 Use the architecture sketch's boundaries as implementation modules:
 - `repo_ingestion`: validate Git URLs, clone/fetch repos, resolve branch/commit
 - `repo_scanning`: deterministic file tree, language, docs, tests, examples, API/CLI detection
 - `content_indexing`: text extraction, chunking, source references, embeddings
 - `llm_extraction`: prompt orchestration and structured candidate generation
 - `candidate_graph`: build and validate ability/capability/feature/evidence relationships
 - `review_workflow`: edit, approve, reject, merge, relink, publish
 - `registry_query`: search, filters, profile retrieval, ability-map assembly
 - `web_api`: HTTP endpoints and request/response schemas
 - `web_ui`: registration, analysis, review, profile, and search screens
 ## 6. Milestones
 ### Milestone 0: Project Foundation
 Goal: establish the application skeleton and development path.
 Deliverables:
 - Backend app skeleton
 - Database migration setup
 - Configuration system
 - Local development instructions
 - Basic test harness
 - Health endpoint
 Acceptance criteria:
 - App starts locally
 - Tests run locally
 - Database migrations apply cleanly
 ### Milestone 1: Manual Registry
 Goal: prove the core data model and inspection experience before automation.
 Deliverables:
 - Repository CRUD
 - Manual ability/capability/feature/evidence CRUD
 - Ability map endpoint
 - Basic repository profile UI
 Acceptance criteria:
 - A user can create a repository profile by hand
 - The UI displays `Ability -> Capability -> Feature -> Evidence`
 - API returns the same map as structured JSON
 ### Milestone 2: Git Ingestion and Deterministic Scanner
 Goal: establish trustworthy observed facts from repository contents.
 Deliverables:
 - Git URL validation
 - Clone/fetch and checkout
 - Snapshot record with branch and commit hash
 - File tree scan
 - README/docs/examples/tests/package manifest detection
 - Basic language/framework/interface detection
 - Analysis run status tracking
 Acceptance criteria:
 - A public Git repository can be registered and analyzed
 - The system records a snapshot and deterministic scan summary
 - Analysis failures are visible without corrupting prior data
 ### Milestone 3: Reviewable Candidate Graph
 Goal: generate candidate registry entries from deterministic facts and extracted content.
 Deliverables:
 - Content extraction from README, docs, examples, tests, package metadata, and selected source files
 - Source references with file paths and line ranges where possible
 - Candidate ability generation
 - Candidate capability generation
 - Candidate feature generation
 - Candidate evidence detection
 - Confidence scoring using the documented additive factors
 - Candidate graph endpoint and UI
 Acceptance criteria:
 - Analysis produces candidates with source references and confidence
 - Candidates distinguish observed facts from interpreted claims
 - Candidate output is explainable enough for curator review
 ### Milestone 4: Review and Approval Workflow
 Goal: turn candidates into canonical registry entries.
 Deliverables:
 - Approve/reject candidate entries
 - Edit names, descriptions, confidence, and relationships
 - Merge duplicate abilities/capabilities/features
 - Relink capabilities, features, and evidence
 - Publish approved repository profile
 - Persist review decisions
 Acceptance criteria:
 - A curator can correct and approve an analysis result
 - Only approved entries appear in canonical search/profile views
 - Repository status changes from analyzed to indexed/published
 ### Milestone 5: Search and Inspection
 Goal: make the registry useful for discovery.
 Deliverables:
 - Text search over repositories, abilities, capabilities, and descriptions
 - Semantic search with pgvector
 - Search filters for language, framework, and ability/capability presence
 - Search UI
 - Repository profile drill-down UI
 - Code/evidence links from features and capabilities
 Acceptance criteria:
 - A user can search by need using natural language
 - Results show repository, matching ability/capability, confidence, and evidence level
 - A user can drill from a search result into the ability map and code/evidence references
 ### Milestone 6: API Completeness for Agents
 Goal: support programmatic consumers cleanly.
 Deliverables:
 - `GET /repos`
 - `POST /repos`
 - `GET /repos/{id}`
 - `POST /repos/{id}/analysis-runs`
 - `GET /repos/{id}/analysis-runs/{run_id}`
 - `GET /repos/{id}/ability-map`
 - `GET /abilities`
 - `GET /capabilities`
 - `GET /search?q=...`
 - OpenAPI examples
 Acceptance criteria:
 - API covers repository registration, analysis, search, and inspection
 - Responses are stable enough for agent/tooling integration
 - OpenAPI docs describe all MVP endpoints
 ## 6.1 Implemented Status Checkpoint
 Status date: 2026-04-26
 Current implementation baseline:
 - Milestone 0: implemented. FastAPI app, SQLite migrations, settings, health endpoint, README development flow, and pytest harness are in place.
 - Milestone 1: implemented. Repository CRUD, manual ability/capability/feature/evidence CRUD, ability-map API, and server-rendered repository profile UI are in place.
 - Milestone 2: implemented for local paths and Git URLs. Registration can import metadata, analysis records snapshots and observed facts, and failures are captured on analysis runs.
 - Milestone 3: implemented for deterministic extraction plus optional LLM-assisted extraction. Analysis stores content chunks, source-linked candidates, candidate evidence, confidence scores, and confidence labels.
 - Milestone 4: implemented. Candidate approval, rejection, editing, relinking, merging, review decisions, and indexed repository publication are supported through API and UI paths.
 - Milestone 5: partially implemented. Text search, filters, search UI, ability-map drill-down, and evidence/source context are implemented. pgvector-backed semantic search remains future work.
 - Milestone 6: implemented for the MVP and review workflow. Agent-facing endpoints have typed OpenAPI response schemas, examples, tags, and docs smoke coverage.
 Use case coverage status:
 | ID | Use Case | Implementation Status | E2E Coverage Status |
 | --- | --- | --- | --- |
 | UC-01 | Register Git Repository | Implemented through API and UI. | Covered by API and UI registration loops. |
 | UC-02 | Import Repository Metadata | Implemented from repository files when name/description are omitted. | Covered by API and service metadata tests. |
 | UC-03 | Analyze Repository Structure | Implemented by deterministic scanner and analysis runs. | Covered by API, service, scanner, and UI analysis loops. |
 | UC-04 | Extract Candidate Abilities | Implemented by deterministic generator and optional LLM mapper. | Covered by API/service analysis loops and LLM extraction tests. |
 | UC-05 | Extract Candidate Capabilities | Implemented by deterministic generator and optional LLM mapper. | Covered by API/service analysis loops and LLM extraction tests. |
 | UC-06 | Extract Candidate Features | Implemented with detected interfaces, languages, frameworks, docs, tests, and manifests. | Covered by API/service analysis loops plus source-linked fixture e2e assertions. |
 | UC-07 | Link Features to Code Locations | Implemented through feature locations and source references. | Covered by service approval tests and API e2e assertions for source paths/lines. |
 | UC-08 | Attach Evidence to Capabilities | Implemented for candidate and approved evidence. | Covered by API/UI review, manual registry tests, and source-linked approved evidence e2e assertions. |
 | UC-09 | Review and Approve Analysis | Implemented through approve, edit, reject, relink, merge, and review decisions. | Covered by API/service/UI review tests. |
 | UC-10 | Search Repositories by Need | Implemented with text search and structured filters. | Covered by API/service/UI search tests. Semantic search remains future work. |
 | UC-11 | Inspect Repository Ability Map | Implemented through API and UI profile drill-down. | Covered by API/service/UI ability-map tests. |
 | UC-12 | Compare Repositories | Implemented as a read-only API comparison over approved ability maps. | Covered by API e2e comparison test. |
 | UC-13 | Detect Capability Gaps | Implemented as a read-only API gap report over desired capabilities and approved maps. | Covered by API e2e gap-analysis test. |
 | UC-14 | Expose Registry via API | Implemented for MVP plus review workflow. | Covered by API contract, OpenAPI, and docs smoke tests. |
 | UC-15 | Update Registry After Repo Change | Partially implemented by rerunning analysis; no explicit diff/change-review workflow yet. | Covered for rerun behavior by API e2e: second analysis records new candidates without corrupting approved profile. |
 | UC-16 | Export Registry Entry | Implemented as YAML export for approved registry entries. | Covered by API e2e export test. |
 Immediate production-readiness test focus:
 1. If UC-15 becomes a production priority, add an explicit diff/change-review model instead of relying only on rerun analysis.
 2. Broaden fixture coverage over time for README-only, Python CLI, FastAPI, JavaScript/TypeScript, tests/examples, and weak-doc repositories.
 3. Add richer UI affordances for comparison, gap analysis, and export if these discovery endpoints become curator-facing workflows.
 ## 7. Initial Database Shape
 Start with tables for:
 - `repositories`
 - `repository_snapshots`
 - `analysis_runs`
 - `observed_facts`
 - `source_references`
 - `candidate_abilities`
 - `candidate_capabilities`
 - `candidate_features`
 - `candidate_evidence`
 - `candidate_links`
 - `approved_abilities`
 - `approved_capabilities`
 - `approved_features`
 - `approved_evidence`
 - `approved_links`
 - `review_decisions`
 - `content_chunks`
 - `embeddings`
 Use status fields consistently:
 ```text
 registered
 ingesting
 analyzing
 analysis_failed
 analyzed
 reviewing
 indexed
 ```
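One way to keep these values consistent across modules is a single enumeration. This is a sketch under assumptions: the class name `RegistryStatus` and the `parse_status` helper are hypothetical, and no transition rules are implied because the workplan does not define them.

```python
from enum import Enum


class RegistryStatus(str, Enum):
    REGISTERED = "registered"
    INGESTING = "ingesting"
    ANALYZING = "analyzing"
    ANALYSIS_FAILED = "analysis_failed"
    ANALYZED = "analyzed"
    REVIEWING = "reviewing"
    INDEXED = "indexed"


def parse_status(raw: str) -> RegistryStatus:
    """Normalize a stored string, rejecting values outside the list above."""
    try:
        return RegistryStatus(raw.strip().lower())
    except ValueError:
        raise ValueError(f"unknown status: {raw!r}") from None
```

Subclassing `str` lets the members serialize as plain strings in JSON responses and database columns.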
 ## 8. Analyzer v0.1 Strategy
 The first analyzer should be intentionally modest.
 Deterministic scan:
 - Identify repo root metadata files
 - Identify docs, examples, tests, package manifests, API specs, config files
 - Detect languages from extensions and package files
 - Detect common frameworks from manifests
 - Detect likely API/CLI features using simple framework-specific scanners
 Content extraction:
 - README and docs first
 - Examples and tests second
 - Selected source files only when they expose interfaces
 - Preserve path and line references
 LLM extraction:
 - Use separate prompts for abilities, capabilities, features, and evidence
 - Request structured JSON
 - Require source references for each candidate
 - Reject or mark speculative any candidate without supporting sources
 Confidence scoring:
 - Start from the documented additive model
 - Normalize to `0.0-1.0`
 - Store both numeric confidence and label
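A rough sketch of the scoring steps above. The documented additive factors are not reproduced in this workplan, so the factor names, weights, and label thresholds below are placeholders chosen purely for illustration.

```python
# Hypothetical factors and integer weights; replace with the documented model.
FACTOR_WEIGHTS = {
    "documented_in_readme": 3,
    "has_tests": 3,
    "has_examples": 2,
    "interface_detected": 2,
}


def confidence(factors: set[str]) -> tuple[float, str]:
    """Sum the weights of present factors, normalize to 0.0-1.0, add a label."""
    raw = sum(FACTOR_WEIGHTS.get(name, 0) for name in factors)
    total = sum(FACTOR_WEIGHTS.values())
    score = round(raw / total, 2)
    # Assumed thresholds; the label is stored alongside the numeric score.
    if score >= 0.7:
        label = "high"
    elif score >= 0.4:
        label = "medium"
    else:
        label = "low"
    return score, label
```

Unknown factor names contribute nothing, so a scanner can report extra signals without changing scores until a weight is assigned to them.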
 ## 9. UI Workplan
 Build application screens in this order:
 1. Repository list
 2. Repository registration
 3. Repository detail and analysis status
 4. Deterministic scan summary
 5. Candidate review tree
 6. Published repository profile
 7. Search
 The UI should feel like an operational tool rather than a marketing site: dense, clear, review-focused, and optimized for repeated curator work.
 ## 10. Testing Strategy
 Add tests around the highest-risk boundaries:
 - Database migrations and model relationships
 - Git URL validation
 - Scanner output for fixture repositories
 - Candidate graph validation
 - Review workflow transitions
 - Search result ranking and filtering
 - API contract tests for MVP endpoints
 Create small fixture repositories for:
 - README-only repository
 - Python CLI repository
 - FastAPI repository
 - JavaScript/TypeScript package
 - Repository with tests and examples
 - Repository with weak or misleading docs
 ## 11. Key Risks and Mitigations
 Extraction quality risk:
 - Require source references.
 - Keep candidates reviewable.
 - Separate observed facts from interpreted claims.
 Over-complex ontology risk:
 - Keep v0.1 schema minimal.
 - Avoid enforcing deep taxonomy too early.
 Search quality risk:
 - Combine relational filters, full-text search, and vector search.
 - Show why a result matched.
 Operational complexity risk:
 - Start as a modular monolith.
 - Use simple jobs before adding worker infrastructure.
 Trust risk:
 - Never publish unapproved claims as canonical truth.
 - Preserve analysis run history and review decisions.
 ## 12. Immediate Next Actions
 Recommended next implementation sequence:
 1. Scaffold the FastAPI application, database migrations, and test harness.
 2. Implement the core schema for repositories, snapshots, analysis runs, observed facts, candidates, and approved entries.
 3. Add manual registry CRUD and ability-map API.
 4. Build a minimal repository list/profile UI.
 5. Add Git ingestion and deterministic scanning.
 6. Add candidate graph generation and review workflow.
 The first meaningful demo should be:
 ```text
 Create a repository
 Add or generate an ability map
 Approve it
 Search for a capability
 Open the repository profile
 Drill down to feature and evidence locations
 ```
--- a/workplans/ProductionHardeningWorkplan.md
+++ b/workplans/ProductionHardeningWorkplan.md
@@ -1,185 +0,0 @@
 # Repository Ability Registry Production Hardening Workplan
 Status: open
 Created: 2026-04-26
 This workplan starts after the v0.1 MVP closure. The MVP proves the core loop:
 ```text
 Register -> Analyze -> Review -> Approve -> Search/Inspect
 ```
 Production hardening should improve trust, search quality, update safety, and
 operational readiness without weakening the core rule:
 ```text
 observed facts are deterministic
 interpreted claims are reviewable
 approved registry truth is explicit
 ```
 ## 1. Priorities
 ### P0: Update Safety and Change Review
 Goal: make repeated analysis after repository changes first-class.
 Current state:
 - A repository can be analyzed repeatedly.
 - Each run records a snapshot, observed facts, chunks, and candidates.
 - Existing approved profiles are not corrupted by later runs.
 - There is no explicit analysis-run diff or change-review workflow.
 Deliverables:
 - Analysis-run diff model for facts, chunks, candidates, and approved entries.
 - API endpoint to compare two analysis runs for a repository.
 - Review view for changed, added, removed, or weakened claims.
 - Review decisions that record change acceptance/rejection.
 - Tests proving approved profiles remain stable until changes are approved.
 Candidate endpoints:
 ```text
 GET  /repos/{id}/analysis-runs/{base_run_id}/diff/{target_run_id}
 POST /repos/{id}/analysis-runs/{target_run_id}/changes/approve
 ```
 Acceptance criteria:
 - A user can see what changed between two analysis runs.
 - New claims are not published automatically.
 - Removed or weakened evidence is visible before approval.
 - Review decisions preserve the reason for accepting or rejecting changes.
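The core of such a diff can be sketched with set operations. This sketch assumes each run is summarized as a mapping from a stable claim key to its confidence; the `RunDiff` type, the `diff_runs` function, and that input shape are hypothetical and do not describe an existing model in the codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunDiff:
    added: frozenset[str]     # claims present only in the target run
    removed: frozenset[str]   # claims present only in the base run
    weakened: frozenset[str]  # claims in both runs with lower confidence


def diff_runs(base: dict[str, float], target: dict[str, float]) -> RunDiff:
    """Compare two runs given as {claim_key: confidence} mappings."""
    shared = base.keys() & target.keys()
    return RunDiff(
        added=frozenset(target.keys() - base.keys()),
        removed=frozenset(base.keys() - target.keys()),
        weakened=frozenset(k for k in shared if target[k] < base[k]),
    )
```

The diff is read-only: it reports changes for review and publishes nothing, which matches the rule that new claims are never published automatically.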
 ### P1: Search Quality and Semantic Retrieval
 Goal: improve discovery beyond simple text matching while keeping results
 explainable.
 Current state:
 - Text search covers repositories, abilities, capabilities, features, and evidence.
 - Filters exist for status, language, framework, ability, and capability.
 - Result explanations include matched field and context.
 - Semantic/vector search is not implemented.
 Deliverables:
 - Embedding abstraction for approved registry entries and content chunks.
 - Local/offline fake embedding provider for tests.
 - Optional pgvector/PostgreSQL backend path, without breaking SQLite dev mode.
 - Hybrid ranking that combines text match, filters, confidence, and vector score.
 - Search response fields that explain text and semantic match reasons.
 Acceptance criteria:
 - Existing text search behavior remains stable.
 - Semantic search can be enabled optionally.
 - Search results remain source-linked and explainable.
 - Tests cover ranking, filters, and fallback when embeddings are unavailable.
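A sketch of how hybrid ranking with a fallback might look. The weights, the function name `hybrid_score`, and the reason labels are assumptions for illustration; the workplan does not fix a formula. When no vector score is available, the remaining weights are renormalized so text-only behavior stays well defined.

```python
def hybrid_score(text_score, confidence, vector_score=None,
                 w_text=0.5, w_conf=0.2, w_vec=0.3):
    """Return (score, reasons); all input scores are expected in 0.0-1.0."""
    parts = [(w_text, text_score, "text"), (w_conf, confidence, "confidence")]
    if vector_score is not None:
        # Embeddings available: include the semantic component
        parts.append((w_vec, vector_score, "semantic"))
    total_weight = sum(w for w, _, _ in parts)
    score = sum(w * s for w, s, _ in parts) / total_weight
    # Keep the contributing signals so the result stays explainable
    reasons = [name for _, s, name in parts if s > 0]
    return round(score, 3), reasons
```

Returning the list of contributing signals alongside the score gives the search response something concrete to surface as a match explanation.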
 ### P1: Discovery UI
 Goal: make comparison, gap analysis, and export usable from the curator UI.
 Current state:
 - API endpoints exist for repository comparison, capability-gap reports, and YAML export.
 - There are no dedicated UI workflows for these discovery helpers.
 Deliverables:
 - Repository comparison screen.
 - Capability-gap input and report screen.
 - Export action from repository profile.
 - Clear empty states for repositories without approved profiles.
 Acceptance criteria:
 - A user can compare at least two approved repositories from the UI.
 - A user can enter desired capabilities and inspect missing/weak/duplicate results.
 - A user can export a registry entry without using raw API calls.
 ### P1: Fixture Breadth and Regression Confidence
 Goal: broaden real-world coverage across repository styles.
 Current state:
 - Tests cover manual registry, FastAPI-like routes, docs/tests/examples, UI loops,
  LLM fallback, and source-linked approval.
 - Fixture helper coverage exists for README-only, Python CLI, and misleading-docs
  repositories, but not every fixture style has full e2e coverage.
 Deliverables:
 - E2E tests for README-only repositories.
 - E2E tests for Python CLI repositories.
 - E2E tests for JavaScript/TypeScript packages.
 - E2E tests for repositories with weak or misleading documentation.
 - Negative tests for unsupported or empty repositories.
 Acceptance criteria:
 - Candidate extraction stays conservative when docs are weak.
 - CLI and frontend/package repositories produce useful facts and candidates.
 - Misleading docs do not become approved truth without review.
 ### P2: Operational Readiness
 Goal: make the service easier to run and diagnose outside local development.
 Deliverables:
 - Structured logging around ingestion, analysis, LLM extraction, and review actions.
 - Configuration documentation for database, checkout root, and LLM provider settings.
 - Basic metrics or health details for database and checkout root reachability.
 - Backup/restore guidance for SQLite MVP deployments.
 - Migration strategy notes for a future PostgreSQL deployment.
 Acceptance criteria:
 - Operators can diagnose failed analyses from logs and API state.
 - Configuration is documented in one place.
 - Database migration and backup expectations are clear.
 ### P2: API Contract Stability
 Goal: make agent/tooling integration safer as the API grows.
 Deliverables:
 - Versioned API path or explicit compatibility policy.
 - Golden OpenAPI snapshot test or schema-diff check.
 - More response examples for discovery and change-review endpoints.
 - Error response schema for common 400/404 cases.
 Acceptance criteria:
 - Breaking API changes are deliberate and visible in tests.
 - Agent-facing endpoints have stable request/response models.
 ## 2. First Implementation Sequence
 Recommended next sequence:
 1. Add analysis-run diff data structures and a read-only diff endpoint.
 2. Add e2e tests for rerun diff behavior with added/removed routes and evidence.
 3. Add UI display for analysis-run diffs.
 4. Add approval workflow for selected changes.
 5. Broaden fixture e2e coverage for README-only, CLI, JS/TS, and weak-doc repos.
 6. Add discovery UI for compare, gaps, and export.
 7. Prototype optional semantic search behind a feature/config boundary.
 ## 3. Definition of Done
 This hardening plan can be closed when:
 - Re-analysis changes are explicit, reviewable, and tested end to end.
 - Search supports an optional semantic mode while preserving explainability.
 - Discovery workflows are available in both API and UI.
 - Fixture coverage represents the major repository shapes in the use case catalog.
 - API contract changes are guarded by tests.
 - Operational docs cover running, configuring, diagnosing, and backing up the service.
--- a/workplans/RREG-WP-0001-mvp-implementation.md
+++ b/workplans/RREG-WP-0001-mvp-implementation.md
@@ -0,0 +1,115 @@
 ---
 id: RREG-WP-0001
 type: workplan
 title: "Repository Ability Registry — MVP Implementation"
 domain: capabilities
 repo: repo-registry
 status: done
 owner: codex
 topic_slug: foerster-capabilities
 created: "2026-04-26"
 updated: "2026-04-26"
 state_hub_workstream_id: "acee5529-43c0-4519-94bc-e59e61719af1"
 ---
 # RREG-WP-0001 — MVP Implementation
 ## Goal
 Prove the core registry loop: Register → Analyze → Review → Approve → Search/Inspect.
 Core design constraint: deterministic scanners establish observed facts; LLM-assisted
 extractors propose interpreted claims; humans or trusted agents approve registry truth.
 ## Scaffold and Foundation
 ```task
 id: RREG-WP-0001-T01
 status: done
 priority: high
 state_hub_task_id: "d26e70c9-e3a9-48f9-951c-c6210208aef4"
 ```
 App skeleton, database migration setup, configuration system, health endpoint, basic
 test harness, local development instructions. Acceptance: app starts, tests run,
 migrations apply cleanly.
 ## Manual Registry
 ```task
 id: RREG-WP-0001-T02
 status: done
 priority: high
 state_hub_task_id: "e9c1cb9b-be21-4862-9443-a3e6c390882e"
 ```
 Repository CRUD, manual ability/capability/feature/evidence CRUD, ability-map endpoint,
 basic repository profile UI. Acceptance: user can create a profile by hand; UI and API
 both display `Ability → Capability → Feature → Evidence`.
 ## Git Ingestion and Deterministic Scanner
 ```task
 id: RREG-WP-0001-T03
 status: done
 priority: high
 state_hub_task_id: "4a9fd9f0-d968-4778-93d4-609090c13c62"
 ```
 Git URL validation, clone/fetch, snapshot with branch+commit hash, file tree scan,
 README/docs/examples/tests/manifest detection, language/framework detection, analysis
 run status tracking. Acceptance: a public repo can be registered and analyzed; failures
 are visible without corrupting prior data.
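The acceptance criterion above (failures stay visible, prior data stays intact) can be sketched as follows. This is an assumed in-memory model, not the project's implementation: `RepoRecord`, `AnalysisRun`, `analyze`, and the status names are hypothetical, and `scan` stands in for the clone/fetch and file-tree scan.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class AnalysisRun:
    run_id: int
    status: str = "pending"       # assumed states: pending, running, succeeded, failed
    commit: Optional[str] = None  # commit hash captured by the snapshot
    error: Optional[str] = None   # failure reason, kept so it can be displayed


@dataclass
class RepoRecord:
    url: str
    runs: list[AnalysisRun] = field(default_factory=list)
    current_run_id: Optional[int] = None  # last run that succeeded


def analyze(repo: RepoRecord, scan: Callable[[str], str]) -> AnalysisRun:
    """Run `scan` and record the outcome; only a success moves `current_run_id`."""
    run = AnalysisRun(run_id=len(repo.runs) + 1, status="running")
    repo.runs.append(run)
    try:
        run.commit = scan(repo.url)
        run.status = "succeeded"
        repo.current_run_id = run.run_id
    except Exception as exc:
        # The failure is recorded on the run; earlier results are untouched.
        run.status = "failed"
        run.error = str(exc)
    return run
```

Because a failed run never updates `current_run_id`, readers of the record keep seeing the last good snapshot while the failed run remains listed in `runs`.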
 ## Reviewable Candidate Graph
 ```task
 id: RREG-WP-0001-T04
 status: done
 priority: high
 state_hub_task_id: "9aa84fb5-998b-480e-adf8-3c1b0faa395a"
 ```
 Content extraction, source references, candidate ability/capability/feature/evidence
 generation, confidence scoring, candidate graph endpoint and UI. Acceptance: candidates
 have source references and confidence; output is explainable for curator review.
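One possible shape for a reviewable candidate is sketched below, assuming a confidence score in the range 0.0 to 1.0 and line-level source references. The names `SourceRef`, `Candidate`, `is_reviewable`, and `explain` are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceRef:
    path: str        # file inside the analyzed snapshot
    start_line: int
    end_line: int


@dataclass
class Candidate:
    kind: str          # "ability", "capability", "feature", or "evidence"
    name: str
    confidence: float  # assumed range: 0.0 to 1.0
    sources: list[SourceRef] = field(default_factory=list)

    def is_reviewable(self) -> bool:
        """True when the candidate cites a source and has an in-range score."""
        return bool(self.sources) and 0.0 <= self.confidence <= 1.0

    def explain(self) -> str:
        """One-line summary a curator can check against the cited files."""
        refs = ", ".join(f"{s.path}:{s.start_line}-{s.end_line}" for s in self.sources)
        return f"{self.kind} '{self.name}' (confidence {self.confidence:.2f}) from {refs}"
```

Rejecting candidates that fail `is_reviewable()` before they reach the review UI is one way to enforce the explainability requirement.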
 ## Review and Approval Workflow
 ```task
 id: RREG-WP-0001-T05
 status: done
 priority: high
 state_hub_task_id: "0b5cff5d-4bc8-48ca-aa41-2f1d0c774bed"
 ```
 Approve/reject/edit/merge/relink candidates, publish approved profile, persist review
 decisions. Acceptance: curator can correct and approve an analysis result; only approved
 entries appear in canonical views.
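The rule that only approved entries reach canonical views can be sketched as a state field plus a filter. The state names and the `apply_decision` helper are assumptions for this example, and merge/relink are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    name: str
    review_state: str = "candidate"  # assumed states: candidate, approved, rejected


def apply_decision(entry: Entry, decision: str, new_name: Optional[str] = None) -> Entry:
    """Apply one curator decision; an edit renames but does not approve."""
    if decision == "approve":
        entry.review_state = "approved"
    elif decision == "reject":
        entry.review_state = "rejected"
    elif decision == "edit":
        if not new_name:
            raise ValueError("edit requires new_name")
        entry.name = new_name
    else:
        raise ValueError(f"unknown decision: {decision}")
    return entry


def canonical_view(entries: list[Entry]) -> list[Entry]:
    """Return only the entries a curator has approved."""
    return [e for e in entries if e.review_state == "approved"]
```

Keeping edit separate from approve means a corrected candidate still needs an explicit approval before it becomes visible.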
 ## Search and Inspection
 ```task
 id: RREG-WP-0001-T06
 status: done
 priority: high
 state_hub_task_id: "13e506eb-017a-4710-b20b-ae10ac8df9d4"
 ```
 Text search over repos/abilities/capabilities/descriptions, search filters, search UI,
 repository profile drill-down, code/evidence links. Acceptance: user can search by need;
 results show repo, ability/capability, confidence, evidence level.
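A rough sketch of text search with a confidence filter follows. The entry keys (`repo`, `name`, `description`, `confidence`) and the all-terms matching rule are assumptions; the actual search may rank and filter differently.

```python
def search(entries: list[dict], query: str, min_confidence: float = 0.0) -> list[dict]:
    """Return entries whose name or description contains every query term."""
    terms = query.lower().split()
    hits = []
    for entry in entries:
        haystack = f"{entry['name']} {entry['description']}".lower()
        if all(term in haystack for term in terms) and entry["confidence"] >= min_confidence:
            hits.append(entry)
    # Highest confidence first; sorted() is stable, so ties keep input order.
    return sorted(hits, key=lambda e: e["confidence"], reverse=True)
```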
 ## API Completeness for Agents
 ```task
 id: RREG-WP-0001-T07
 status: done
 priority: high
 state_hub_task_id: "cb94bb9c-756f-446a-8a59-54aa5a51eff0"
 ```
 Full MVP REST API: GET/POST /repos, GET /repos/{id}, POST/GET /repos/{id}/analysis-runs,
 GET /repos/{id}/ability-map, GET /abilities, GET /capabilities, GET /search, OpenAPI
 examples. Acceptance: API covers registration, analysis, search, inspection; stable
 enough for agent/tooling integration.
--- a/workplans/RREG-WP-0002-production-hardening.md
+++ b/workplans/RREG-WP-0002-production-hardening.md
@@ -0,0 +1,108 @@
 ---
 id: RREG-WP-0002
 type: workplan
 title: "Repository Ability Registry — Production Hardening"
 domain: capabilities
 repo: repo-registry
 status: active
 owner: codex
 topic_slug: foerster-capabilities
 created: "2026-04-26"
 updated: "2026-04-26"
 state_hub_workstream_id: "4218d2bb-33e8-4f98-94ba-38b4ac21502d"
 ---
 # RREG-WP-0002 — Production Hardening
 ## Goal
 Improve trust, search quality, update safety, and operational readiness after MVP
 closure. Core invariant remains: observed facts are deterministic; interpreted claims
 are reviewable; approved registry truth is explicit.
 ## P0: Update Safety and Change Review
 ```task
 id: RREG-WP-0002-T01
 status: todo
 priority: high
 state_hub_task_id: "a27142a6-c160-4453-ab59-50a7db92f9c4"
 ```
 Analysis-run diff model for facts/chunks/candidates/approved entries. API endpoint to
 compare two runs. UI review view for changed/added/removed/weakened claims. Review
 decisions that record change acceptance/rejection. Tests proving approved profiles
 remain stable until changes are approved.
 Candidate endpoints:
 - `GET /repos/{id}/analysis-runs/{base}/diff/{target}`
 - `POST /repos/{id}/analysis-runs/{target}/changes/approve`
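To make the diff model concrete, here is a minimal sketch that compares two runs keyed by a stable claim identifier. The `Claim` record and the category definitions are assumptions: "changed" means the claim text differs, and "weakened" means the same text with lower confidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    text: str
    confidence: float


def diff_runs(base: dict[str, Claim], target: dict[str, Claim]) -> dict[str, list[str]]:
    """Classify claim ids as added, removed, changed, or weakened."""
    shared = base.keys() & target.keys()
    return {
        "added": sorted(target.keys() - base.keys()),
        "removed": sorted(base.keys() - target.keys()),
        "changed": sorted(k for k in shared if base[k].text != target[k].text),
        "weakened": sorted(
            k
            for k in shared
            if base[k].text == target[k].text
            and target[k].confidence < base[k].confidence
        ),
    }
```

A diff like this only reports differences. The approved profile would stay as it is until a reviewer accepts the listed changes, in line with the stability requirement above.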
 ## P1: Search Quality and Semantic Retrieval
 ```task
 id: RREG-WP-0002-T02
 status: todo
 priority: medium
 state_hub_task_id: "0e7cce78-13ab-4aa2-8d25-ae50ff8ccd74"
 ```
 Embedding abstraction for approved entries and content chunks. Local/offline fake
 provider for tests. Optional pgvector path without breaking SQLite dev mode. Hybrid
 ranking combining text match, filters, confidence, and vector score. Existing text
 search behavior must remain stable.
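A local fake provider and a hybrid score might look like the sketch below. The hashing scheme, vector size, and weights are placeholders chosen for illustration, not values from the project; the fake embeddings are deterministic but carry no real semantics.

```python
import hashlib
import math


class FakeEmbeddingProvider:
    """Deterministic, offline stand-in for a real embedding model (tests only)."""

    def __init__(self, dims: int = 8) -> None:
        self.dims = dims

    def embed(self, text: str) -> list[float]:
        # Hash each token into a bucket so equal texts give equal vectors.
        vec = [0.0] * self.dims
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vec[digest[0] % self.dims] += 1.0
        norm = math.sqrt(sum(v * v for v in vec))
        return [v / norm for v in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product; equals cosine similarity because embed() returns unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def hybrid_score(
    text_match: float,
    confidence: float,
    vector_score: float,
    weights: tuple[float, float, float] = (0.5, 0.2, 0.3),  # placeholder weights
) -> float:
    """Weighted sum of text match, approval confidence, and vector similarity."""
    w_text, w_conf, w_vec = weights
    return w_text * text_match + w_conf * confidence + w_vec * vector_score
```

Returning the three inputs alongside the final score would keep each ranked result explainable.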
 ## P1: Discovery UI
 ```task
 id: RREG-WP-0002-T03
 status: todo
 priority: medium
 state_hub_task_id: "aee945eb-ea25-49f7-b755-4ec451c1d05a"
 ```
 Repository comparison screen. Capability-gap input and report screen. Export action
 from repository profile. Clear empty states for repos without approved profiles.
 Acceptance: user can compare approved repos, enter desired capabilities, and export
 without raw API calls.
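The capability-gap report could reduce to a comparison like the one below, assuming simple case-insensitive name matching. The function name and the report keys are hypothetical.

```python
def capability_gap(desired: list[str], approved: list[str]) -> dict[str, list[str]]:
    """Split desired capabilities into those covered by approved entries and those missing."""
    have = {name.strip().lower() for name in approved}
    report: dict[str, list[str]] = {"covered": [], "missing": []}
    for name in desired:
        key = "covered" if name.strip().lower() in have else "missing"
        report[key].append(name)
    return report
```

Only approved capability names are passed in, so unreviewed candidates cannot make a gap look closed.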
 ## P1: Fixture Breadth and Regression Confidence
 ```task
 id: RREG-WP-0002-T04
 status: todo
 priority: medium
 state_hub_task_id: "d1df6453-3bca-4524-85d1-9b3f3f275b45"
 ```
 E2E tests for: README-only repos, Python CLI repos, JavaScript/TypeScript packages,
 repos with weak or misleading docs, negative cases for unsupported/empty repos.
 Acceptance: conservative candidate extraction when docs are weak; misleading docs
 do not become approved truth without review.
 ## P2: Operational Readiness
 ```task
 id: RREG-WP-0002-T05
 status: todo
 priority: low
 state_hub_task_id: "44b10491-f1f2-4e2e-9a8e-e8bd59cbf892"
 ```
 Structured logging around ingestion, analysis, LLM extraction, and review actions.
 Configuration documentation. Basic health details for DB and checkout root.
 Backup/restore guidance for SQLite deployments. Migration strategy notes for PostgreSQL.
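For the structured logging item, one option is a JSON formatter built on the standard `logging` module. The field names and the `context` attribute are assumptions made for this sketch.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # `context` is a hypothetical extra attribute carrying repo/run identifiers.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)
```

A caller would attach the formatter to a handler and pass identifiers through `extra`, for example `logger.info("clone finished", extra={"context": {"run_id": 7}})`.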
 ## P2: API Contract Stability
 ```task
 id: RREG-WP-0002-T06
 status: todo
 priority: low
 state_hub_task_id: "271a4fc4-d966-40ef-bc6f-a5fd1c445a16"
 ```
 Versioned API path or explicit compatibility policy. Golden OpenAPI snapshot test or
 schema-diff check. More response examples for discovery and change-review endpoints.
 Error response schema for common 400/404 cases. Acceptance: breaking changes are
 deliberate and visible in tests; agent-facing endpoints have stable models.
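A schema-diff check could start as small as the sketch below, which flags paths and operations present in a stored golden OpenAPI document but absent from the current one. It assumes both documents are already loaded as dictionaries and deliberately ignores request and response schemas; `breaking_changes` is a hypothetical helper.

```python
def breaking_changes(golden: dict, current: dict) -> list[str]:
    """List paths and operations that exist in `golden` but not in `current`."""
    problems = []
    golden_paths = golden.get("paths", {})
    current_paths = current.get("paths", {})
    for path, methods in sorted(golden_paths.items()):
        if path not in current_paths:
            problems.append(f"removed path: {path}")
            continue
        for method in sorted(methods):
            if method not in current_paths[path]:
                problems.append(f"removed operation: {method.upper()} {path}")
    return problems
```

A test would assert that the result is empty, so a removal fails the suite until the golden snapshot is updated on purpose. Newly added paths are not reported because they do not break existing clients.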