Repository Ability Registry Production Hardening Workplan

Status: open
Created: 2026-04-26

This workplan starts after the v0.1 MVP closure. The MVP proves the core loop:

Register -> Analyze -> Review -> Approve -> Search/Inspect

Production hardening should improve trust, search quality, update safety, and operational readiness without weakening the core rule:

observed facts are deterministic
interpreted claims are reviewable
approved registry truth is explicit

1. Priorities

P0: Update Safety and Change Review

Goal: make repeated analysis after repository changes first-class.

Current state:

  • A repository can be analyzed repeatedly.
  • Each run records a snapshot, observed facts, chunks, and candidates.
  • Existing approved profiles are not corrupted by later runs.
  • There is no explicit analysis-run diff or change-review workflow.

Deliverables:

  • Analysis-run diff model for facts, chunks, candidates, and approved entries.
  • API endpoint to compare two analysis runs for a repository.
  • Review view for changed, added, removed, or weakened claims.
  • Review decisions that record change acceptance/rejection.
  • Tests proving approved profiles remain stable until changes are approved.

Candidate endpoints:

GET  /repos/{id}/analysis-runs/{base_run_id}/diff/{target_run_id}
POST /repos/{id}/analysis-runs/{target_run_id}/changes/approve

Acceptance criteria:

  • A user can see what changed between two analysis runs.
  • New claims are not published automatically.
  • Removed or weakened evidence is visible before approval.
  • Review decisions preserve the reason for accepting or rejecting changes.
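
One possible shape for the analysis-run diff model is sketched below. It is only an illustration of the deliverable, not an agreed design: the `Claim`, `ClaimChange`, and `ChangeKind` names, the idea of a stable claim key, and the rule that "weakened" means "lost at least one piece of evidence" are all assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class ChangeKind(str, Enum):
    ADDED = "added"
    REMOVED = "removed"
    CHANGED = "changed"
    WEAKENED = "weakened"


@dataclass(frozen=True)
class Claim:
    # Hypothetical shape: a stable key, the claim text, and evidence references.
    key: str
    text: str
    evidence: frozenset[str]


@dataclass(frozen=True)
class ClaimChange:
    key: str
    kind: ChangeKind
    lost_evidence: frozenset[str] = frozenset()


def diff_runs(base: list[Claim], target: list[Claim]) -> list[ClaimChange]:
    """Compare the claims of two analysis runs, matched by Claim.key."""
    base_by_key = {c.key: c for c in base}
    target_by_key = {c.key: c for c in target}
    changes: list[ClaimChange] = []
    for key in sorted(base_by_key.keys() | target_by_key.keys()):
        old, new = base_by_key.get(key), target_by_key.get(key)
        if old is None:
            changes.append(ClaimChange(key, ChangeKind.ADDED))
        elif new is None:
            changes.append(ClaimChange(key, ChangeKind.REMOVED, old.evidence))
        elif old.evidence - new.evidence:
            # Assumption: a claim is "weakened" when evidence it had is gone.
            lost = old.evidence - new.evidence
            changes.append(ClaimChange(key, ChangeKind.WEAKENED, lost))
        elif old.text != new.text or old.evidence != new.evidence:
            changes.append(ClaimChange(key, ChangeKind.CHANGED))
    return changes
```

Because the function only reads both runs and returns a list of changes, it fits a read-only diff endpoint: nothing is published until a separate review decision accepts a change.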

P1: Search Quality and Semantic Retrieval

Goal: improve discovery beyond simple text matching while keeping results explainable.

Current state:

  • Text search covers repositories, abilities, capabilities, features, and evidence.
  • Filters exist for status, language, framework, ability, and capability.
  • Result explanations include matched field and context.
  • Semantic/vector search is not implemented.

Deliverables:

  • Embedding abstraction for approved registry entries and content chunks.
  • Local/offline fake embedding provider for tests.
  • Optional pgvector/PostgreSQL backend path, without breaking SQLite dev mode.
  • Hybrid ranking that combines text match, filters, confidence, and vector score.
  • Search response fields that explain text and semantic match reasons.

Acceptance criteria:

  • Existing text search behavior remains stable.
  • Semantic search can be enabled optionally.
  • Search results remain source-linked and explainable.
  • Tests cover ranking, filters, and fallback when embeddings are unavailable.
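
A minimal sketch of how hybrid ranking with an offline fake embedding provider could look. The function names, the weights, and the hashing scheme are assumptions chosen for illustration, and filters are assumed to be applied before scoring. When either vector is missing, the score falls back to text match and confidence only, and the response says so.

```python
import hashlib
import math
from typing import Optional, Sequence


def fake_embed(text: str, dim: int = 8) -> list[float]:
    """Deterministic offline embedding for tests: hash each token into a bucket."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    return vec


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_score(
    text_score: float,
    confidence: float,
    query_vec: Optional[Sequence[float]],
    entry_vec: Optional[Sequence[float]],
    w_text: float = 0.5,
    w_conf: float = 0.2,
    w_vec: float = 0.3,
) -> dict:
    """Combine text match, confidence, and vector similarity into one score."""
    if query_vec is None or entry_vec is None:
        # Fallback: embeddings unavailable, so renormalize the remaining weights.
        score = (w_text * text_score + w_conf * confidence) / (w_text + w_conf)
        semantic = None
    else:
        semantic = cosine(query_vec, entry_vec)
        score = w_text * text_score + w_conf * confidence + w_vec * semantic
    # The per-signal reasons keep the result explainable.
    return {
        "score": score,
        "reasons": {"text": text_score, "confidence": confidence, "semantic": semantic},
    }
```

Returning the individual signals alongside the combined score is one way to satisfy the "explain text and semantic match reasons" deliverable without changing how existing text-only results are ranked.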

P1: Discovery UI

Goal: make comparison, gap analysis, and export usable from the curator UI.

Current state:

  • API endpoints exist for repository comparison, capability-gap reports, and YAML export.
  • There are no dedicated UI workflows for these discovery helpers.

Deliverables:

  • Repository comparison screen.
  • Capability-gap input and report screen.
  • Export action from repository profile.
  • Clear empty states for repositories without approved profiles.

Acceptance criteria:

  • A user can compare at least two approved repositories from the UI.
  • A user can enter desired capabilities and inspect missing/weak/duplicate results.
  • A user can export a registry entry without using raw API calls.
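
To make the missing/weak/duplicate categories concrete for the report screen, here is a small sketch of how desired capabilities might be classified. It does not describe the existing capability-gap endpoint: the `gap_report` function, the registry shape (repository name mapped to capability confidences), and the `weak_below` threshold are hypothetical.

```python
def gap_report(
    desired: list[str],
    registry: dict[str, dict[str, float]],
    weak_below: float = 0.5,
) -> dict[str, str]:
    """Classify each desired capability against approved repository profiles."""
    report: dict[str, str] = {}
    for capability in desired:
        # Confidence values from every repository that claims this capability.
        scores = [caps[capability] for caps in registry.values() if capability in caps]
        if not scores:
            report[capability] = "missing"
        elif max(scores) < weak_below:
            report[capability] = "weak"
        elif len(scores) > 1:
            report[capability] = "duplicate"
        else:
            report[capability] = "covered"
    return report
```

For example, with `{"repo-a": {"auth": 0.9, "search": 0.3}, "repo-b": {"auth": 0.8}}` as the registry, the desired list `["auth", "search", "export"]` would be reported as duplicate, weak, and missing respectively.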

P1: Fixture Breadth and Regression Confidence

Goal: broaden real-world coverage across repository styles.

Current state:

  • Tests cover manual registry, FastAPI-like routes, docs/tests/examples, UI loops, LLM fallback, and source-linked approval.
  • Fixture helpers exist for README-only, Python CLI, and misleading-docs repositories, but not every fixture style has full e2e coverage.

Deliverables:

  • E2E tests for README-only repositories.
  • E2E tests for Python CLI repositories.
  • E2E tests for JavaScript/TypeScript packages.
  • E2E tests for repositories with weak or misleading documentation.
  • Negative tests for unsupported or empty repositories.

Acceptance criteria:

  • Candidate extraction stays conservative when docs are weak.
  • CLI and frontend/package repositories produce useful facts and candidates.
  • Misleading docs do not become approved truth without review.

P2: Operational Readiness

Goal: make the service easier to run and diagnose outside local development.

Deliverables:

  • Structured logging around ingestion, analysis, LLM extraction, and review actions.
  • Configuration documentation for database, checkout root, and LLM provider settings.
  • Basic metrics or health details for database and checkout root reachability.
  • Backup/restore guidance for SQLite MVP deployments.
  • Migration strategy notes for a future PostgreSQL deployment.

Acceptance criteria:

  • Operators can diagnose failed analyses from logs and API state.
  • Configuration is documented in one place.
  • Database migration and backup expectations are clear.
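
As an illustration of the structured-logging deliverable, the sketch below emits one JSON object per log line using only the standard library. The logger name and the context fields (`repo_id`, `analysis_run_id`, `stage`) are assumptions for this example, not existing identifiers in the service.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    # Hypothetical context fields an operator would want to filter on.
    CONTEXT_FIELDS = ("repo_id", "analysis_run_id", "stage")

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Values passed through `extra=` become attributes on the record.
        for field in self.CONTEXT_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, sort_keys=True)


def build_logger(name: str = "registry.analysis") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A failed analysis could then be recorded with `build_logger().error("analysis_failed", extra={"repo_id": 7, "analysis_run_id": 42, "stage": "llm_extraction"})`, which gives operators a run identifier to correlate with API state.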

P2: API Contract Stability

Goal: make agent/tooling integration safer as the API grows.

Deliverables:

  • Versioned API path or explicit compatibility policy.
  • Golden OpenAPI snapshot test or schema-diff check.
  • More response examples for discovery and change-review endpoints.
  • Error response schema for common 400/404 cases.

Acceptance criteria:

  • Breaking API changes are deliberate and visible in tests.
  • Agent-facing endpoints have stable request/response models.
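
A schema-diff check could be as small as the sketch below, which compares a stored golden OpenAPI document against the current one and reports removed paths, operations, and response codes. The `breaking_changes` name and the choice of what counts as breaking are assumptions; a real check would probably also compare request and response schemas.

```python
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def breaking_changes(golden: dict, current: dict) -> list[str]:
    """List items present in the golden OpenAPI document but missing now."""
    problems: list[str] = []
    golden_paths = golden.get("paths", {})
    current_paths = current.get("paths", {})
    for path, path_item in sorted(golden_paths.items()):
        if path not in current_paths:
            problems.append(f"removed path: {path}")
            continue
        for method, operation in sorted(path_item.items()):
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            current_op = current_paths[path].get(method)
            if current_op is None:
                problems.append(f"removed operation: {method.upper()} {path}")
                continue
            current_responses = current_op.get("responses", {})
            for status in sorted(operation.get("responses", {})):
                if status not in current_responses:
                    problems.append(
                        f"removed response: {method.upper()} {path} -> {status}"
                    )
    return problems
```

A test would load the golden snapshot from disk, obtain the current document from the running application (if the service happens to be built on FastAPI, `app.openapi()` returns such a dict), and assert that the returned list is empty. Additions such as new paths pass silently, so only deliberate removals require updating the snapshot.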

2. First Implementation Sequence

Recommended next sequence:

  1. Add analysis-run diff data structures and a read-only diff endpoint.
  2. Add e2e tests for rerun diff behavior with added/removed routes and evidence.
  3. Add UI display for analysis-run diffs.
  4. Add approval workflow for selected changes.
  5. Broaden fixture e2e coverage for README-only, CLI, JS/TS, and weak-doc repos.
  6. Add discovery UI for compare, gaps, and export.
  7. Prototype optional semantic search behind a feature/config boundary.

3. Definition of Done

This hardening plan can be closed when:

  • Re-analysis changes are explicit, reviewable, and tested end to end.
  • Search supports an optional semantic mode while preserving explainability.
  • Discovery workflows are available in both API and UI.
  • Fixture coverage represents the major repository shapes in the use case catalog.
  • API contract changes are guarded by tests.
  • Operational docs cover running, configuring, diagnosing, and backing up the service.