repo-scoping/workplans/ProductionHardeningWorkplan.md

# Repository Ability Registry Production Hardening Workplan

Status: open
Created: 2026-04-26

This workplan starts after the v0.1 MVP closure. The MVP proves the core loop:

```text
Register -> Analyze -> Review -> Approve -> Search/Inspect
```

Production hardening should improve trust, search quality, update safety, and
operational readiness without weakening the core rule:

```text
observed facts are deterministic
interpreted claims are reviewable
approved registry truth is explicit
```

## 1. Priorities

### P0: Update Safety and Change Review

Goal: make re-analysis after repository changes a first-class, reviewable workflow.

Current state:

- A repository can be analyzed repeatedly.
- Each run records a snapshot, observed facts, chunks, and candidates.
- Existing approved profiles are not corrupted by later runs.
- There is no explicit analysis-run diff or change-review workflow.

Deliverables:

- Analysis-run diff model for facts, chunks, candidates, and approved entries.
- API endpoint to compare two analysis runs for a repository.
- Review view for changed, added, removed, or weakened claims.
- Review decisions that record whether each change was accepted or rejected, and why.
- Tests proving approved profiles remain stable until changes are approved.

Candidate endpoints:

```text
GET  /repos/{id}/analysis-runs/{base_run_id}/diff/{target_run_id}
POST /repos/{id}/analysis-runs/{target_run_id}/changes/approve
```
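
As a rough illustration of the read-only diff, the sketch below compares two runs keyed by a stable claim identifier. The `Claim` and `RunDiff` shapes, the field names, and the rule that "weakened" means lower confidence or lost evidence are assumptions made for this example, not the existing data model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Claim:
    """Hypothetical shape of a candidate claim within one analysis run."""
    key: str                   # stable identifier, e.g. "capability:http-api"
    confidence: float          # 0.0 to 1.0
    evidence: frozenset[str]   # source references backing the claim


@dataclass
class RunDiff:
    added: list[str] = field(default_factory=list)
    removed: list[str] = field(default_factory=list)
    weakened: list[str] = field(default_factory=list)
    changed: list[str] = field(default_factory=list)


def diff_runs(base: dict[str, Claim], target: dict[str, Claim]) -> RunDiff:
    """Compare two runs by claim key. Read-only: nothing is published."""
    result = RunDiff()
    for key in sorted(base.keys() | target.keys()):
        old, new = base.get(key), target.get(key)
        if old is None:
            result.added.append(key)
        elif new is None:
            result.removed.append(key)
        elif new.confidence < old.confidence or not old.evidence <= new.evidence:
            # Assumption: lower confidence or lost evidence counts as weakened.
            result.weakened.append(key)
        elif new != old:
            result.changed.append(key)
    return result
```

Keeping the diff a pure function of two runs means the `GET` endpoint can stay side-effect free, with approval handled separately by the `POST` endpoint.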

Acceptance criteria:

- A user can see what changed between two analysis runs.
- New claims are not published automatically.
- Removed or weakened evidence is visible before approval.
- Review decisions preserve the reason for accepting or rejecting changes.

### P1: Search Quality and Semantic Retrieval

Goal: improve discovery beyond simple text matching while keeping results
explainable.

Current state:

- Text search covers repositories, abilities, capabilities, features, and evidence.
- Filters exist for status, language, framework, ability, and capability.
- Result explanations include matched field and context.
- Semantic/vector search is not implemented.

Deliverables:

- Embedding abstraction for approved registry entries and content chunks.
- Local/offline fake embedding provider for tests.
- Optional pgvector/PostgreSQL backend path, without breaking SQLite dev mode.
- Hybrid ranking that combines text match, filters, confidence, and vector score.
- Search response fields that explain text and semantic match reasons.
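
One possible shape for the embedding abstraction and the offline fake provider is sketched below. The `EmbeddingProvider` protocol, the `FakeEmbeddingProvider` class, and the token-hashing scheme are hypothetical; the only intent is deterministic vectors with no network access, not meaningful semantics.

```python
import hashlib
import math
from typing import Protocol


class EmbeddingProvider(Protocol):
    """Hypothetical interface that real and fake providers would share."""

    def embed(self, text: str) -> list[float]: ...


class FakeEmbeddingProvider:
    """Deterministic, offline embeddings for tests: hashes tokens into buckets."""

    def __init__(self, dimensions: int = 16) -> None:
        self.dimensions = dimensions

    def embed(self, text: str) -> list[float]:
        vector = [0.0] * self.dimensions
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vector[digest[0] % self.dimensions] += 1.0
        norm = math.sqrt(sum(v * v for v in vector))
        # Normalize to unit length; an empty text stays the zero vector.
        return [v / norm for v in vector] if norm else vector


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product, which equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))
```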

Acceptance criteria:

- Existing text search behavior remains stable.
- Semantic search can be enabled optionally.
- Search results remain source-linked and explainable.
- Tests cover ranking, filters, and fallback when embeddings are unavailable.
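
The ranking and fallback criteria could be exercised against something like the sketch below. The `Hit` shape, the weights, and the choice to fold the vector weight into the text score when no embedding is available are illustrative assumptions; filters are assumed to have been applied before ranking.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    entry_id: str
    text_score: float                      # 0.0 to 1.0 from the existing text search
    confidence: float                      # 0.0 to 1.0 from the approved entry
    vector_score: Optional[float] = None   # None when embeddings are unavailable


def hybrid_score(hit: Hit, text_weight: float = 0.5,
                 vector_weight: float = 0.3,
                 confidence_weight: float = 0.2) -> tuple[float, list[str]]:
    """Return a score plus the match reasons that contributed to it."""
    reasons = []
    if hit.text_score > 0:
        reasons.append("text")
    if hit.vector_score is None:
        # Fallback: give the vector weight to the text score instead.
        text_weight += vector_weight
        vector_part = 0.0
    else:
        vector_part = vector_weight * hit.vector_score
        if hit.vector_score > 0:
            reasons.append("semantic")
    score = (text_weight * hit.text_score
             + vector_part
             + confidence_weight * hit.confidence)
    return score, reasons


def rank(hits: list[Hit]) -> list[Hit]:
    """Order hits from highest to lowest hybrid score."""
    return sorted(hits, key=lambda h: hybrid_score(h)[0], reverse=True)
```

Returning the reasons alongside the score is one way to keep results explainable: the response can state whether a hit matched by text, by semantic similarity, or both.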

### P1: Discovery UI

Goal: make comparison, gap analysis, and export usable from the curator UI.

Current state:

- API endpoints exist for repository comparison, capability-gap reports, and YAML export.
- There are no dedicated UI workflows for these discovery helpers.

Deliverables:

- Repository comparison screen.
- Capability-gap input and report screen.
- Export action from repository profile.
- Clear empty states for repositories without approved profiles.

Acceptance criteria:

- A user can compare at least two approved repositories from the UI.
- A user can enter desired capabilities and inspect missing/weak/duplicate results.
- A user can export a registry entry without using raw API calls.

### P1: Fixture Breadth and Regression Confidence

Goal: broaden real-world coverage across repository styles.

Current state:

- Tests cover manual registry, FastAPI-like routes, docs/tests/examples, UI loops,
  LLM fallback, and source-linked approval.
- Fixture helper coverage exists for README-only, Python CLI, and misleading-docs
  repositories, but not every fixture style has full e2e coverage.

Deliverables:

- E2E tests for README-only repositories.
- E2E tests for Python CLI repositories.
- E2E tests for JavaScript/TypeScript packages.
- E2E tests for repositories with weak or misleading documentation.
- Negative tests for unsupported or empty repositories.

Acceptance criteria:

- Candidate extraction stays conservative when docs are weak.
- CLI and frontend/package repositories produce useful facts and candidates.
- Misleading docs do not become approved truth without review.
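
A regression test for the misleading-docs case might look roughly like this. The `Candidate` shape, the `evidence_kinds` labels, and the `needs_corroboration` helper are hypothetical stand-ins for whatever the extraction pipeline actually produces.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    claim: str
    evidence_kinds: set[str]    # e.g. {"docs"} or {"docs", "code", "tests"}
    status: str = "candidate"   # assumed to change only via a review decision


def needs_corroboration(candidate: Candidate) -> bool:
    """True when a claim rests on documentation alone, or on no evidence."""
    return candidate.evidence_kinds <= {"docs"}


def test_docs_only_claim_is_flagged_and_unapproved():
    candidate = Candidate("Supports OAuth login", {"docs"})
    assert needs_corroboration(candidate)
    assert candidate.status == "candidate"
```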

### P2: Operational Readiness

Goal: make the service easier to run and diagnose outside local development.

Deliverables:

- Structured logging around ingestion, analysis, LLM extraction, and review actions.
- Configuration documentation for database, checkout root, and LLM provider settings.
- Basic metrics or health details for database and checkout root reachability.
- Backup/restore guidance for SQLite MVP deployments.
- Migration strategy notes for a future PostgreSQL deployment.
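
For the SQLite backup guidance, one option is Python's built-in online backup API, which copies a database while it may still be in use. The function name and paths below are placeholders.

```python
import sqlite3
from contextlib import closing


def backup_sqlite(source_path: str, backup_path: str) -> None:
    """Copy a SQLite database to backup_path using the online backup API."""
    with closing(sqlite3.connect(source_path)) as source, \
         closing(sqlite3.connect(backup_path)) as target:
        source.backup(target)
```

Restoring is then a matter of stopping the service and swapping the backup file into place.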

Acceptance criteria:

- Operators can diagnose failed analyses from logs and API state.
- Configuration is documented in one place.
- Database migration and backup expectations are clear.
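
A minimal sketch of the logging and health deliverables, using only the standard library. The JSON field names, the `context` attribute convention, and the `health_details` function are assumptions for illustration, not existing service code.

```python
import json
import logging
import os
import sqlite3
from contextlib import closing


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Fields passed via extra={"context": {...}} are merged in.
        payload.update(getattr(record, "context", {}))
        return json.dumps(payload, sort_keys=True)


def health_details(db_path: str, checkout_root: str) -> dict[str, bool]:
    """Report whether the database and checkout root are reachable."""
    try:
        with closing(sqlite3.connect(db_path)) as conn:
            conn.execute("SELECT 1")
        database_ok = True
    except sqlite3.Error:
        database_ok = False
    return {"database": database_ok,
            "checkout_root": os.path.isdir(checkout_root)}
```

A call such as `logger.error("analysis_failed", extra={"context": {"repo_id": 7}})` would then emit a single line that operators can filter by repository.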

### P2: API Contract Stability

Goal: make agent/tooling integration safer as the API grows.

Deliverables:

- Versioned API path or explicit compatibility policy.
- Golden OpenAPI snapshot test or schema-diff check.
- More response examples for discovery and change-review endpoints.
- Error response schema for common 400/404 cases.

Acceptance criteria:

- Breaking API changes are deliberate and visible in tests.
- Agent-facing endpoints have stable request/response models.
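
A schema-diff check could start as small as the sketch below, which flags paths and operations that exist in a stored golden OpenAPI document but are missing from the current one. It assumes both documents are already loaded as dictionaries and does not inspect request or response models.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def breaking_changes(golden: dict, current: dict) -> list[str]:
    """List operations present in the golden snapshot but missing now."""
    problems = []
    current_paths = current.get("paths", {})
    for path, operations in golden.get("paths", {}).items():
        if path not in current_paths:
            problems.append(f"removed path: {path}")
            continue
        for method in operations:
            # Skip non-operation keys such as "parameters" or "summary".
            if method in HTTP_METHODS and method not in current_paths[path]:
                problems.append(f"removed operation: {method.upper()} {path}")
    return problems
```

A test would assert that `breaking_changes(golden, current)` is empty, so a removed endpoint fails the build until the golden snapshot is updated deliberately.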

## 2. First Implementation Sequence

Recommended order:

1. Add analysis-run diff data structures and a read-only diff endpoint.
2. Add e2e tests for rerun diff behavior with added/removed routes and evidence.
3. Add UI display for analysis-run diffs.
4. Add approval workflow for selected changes.
5. Broaden fixture e2e coverage for README-only, CLI, JS/TS, and weak-doc repos.
6. Add discovery UI for compare, gaps, and export.
7. Prototype optional semantic search behind a feature/config boundary.

## 3. Definition of Done

This hardening plan can be closed when:

- Re-analysis changes are explicit, reviewable, and tested end to end.
- Search supports an optional semantic mode while preserving explainability.
- Discovery workflows are available in both API and UI.
- Fixture coverage represents the major repository shapes in the use-case catalog.
- API contract changes are guarded by tests.
- Operational docs cover running, configuring, diagnosing, and backing up the service.