# Repository Ability Registry Production Hardening Workplan

Status: open
Created: 2026-04-26

This workplan starts after the v0.1 MVP closure. The MVP proves the core loop:

```text
Register -> Analyze -> Review -> Approve -> Search/Inspect
```

Production hardening should improve trust, search quality, update safety, and
operational readiness without weakening the core rule:

```text
observed facts are deterministic
interpreted claims are reviewable
approved registry truth is explicit
```

## 1. Priorities

### P0: Update Safety and Change Review

Goal: make repeated analysis after repository changes first-class.

Current state:

- A repository can be analyzed repeatedly.
- Each run records a snapshot, observed facts, chunks, and candidates.
- Existing approved profiles are not corrupted by later runs.
- There is no explicit analysis-run diff or change-review workflow.

Deliverables:

- Analysis-run diff model for facts, chunks, candidates, and approved entries.
- API endpoint to compare two analysis runs for a repository.
- Review view for changed, added, removed, or weakened claims.
- Review decisions that record change acceptance/rejection.
- Tests proving approved profiles remain stable until changes are approved.

Candidate endpoints:

```text
GET /repos/{id}/analysis-runs/{base_run_id}/diff/{target_run_id}
POST /repos/{id}/analysis-runs/{target_run_id}/changes/approve
```

Acceptance criteria:

- A user can see what changed between two analysis runs.
- New claims are not published automatically.
- Removed or weakened evidence is visible before approval.
- Review decisions preserve the reason for accepting or rejecting changes.
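
The diff model could take roughly the following shape. This is a minimal sketch, assuming each analysis run can be flattened to a mapping from a stable claim key to a confidence score; `ChangeKind`, `ClaimChange`, and `diff_runs` are hypothetical names, not part of the existing codebase.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class ChangeKind(str, Enum):
    ADDED = "added"
    REMOVED = "removed"
    WEAKENED = "weakened"
    CHANGED = "changed"


@dataclass(frozen=True)
class ClaimChange:
    key: str
    kind: ChangeKind
    base_confidence: Optional[float]
    target_confidence: Optional[float]


def diff_runs(base: Dict[str, float], target: Dict[str, float]) -> List[ClaimChange]:
    """Compare two runs, each given as {claim_key: confidence} (assumed shape)."""
    changes = []
    for key in sorted(base.keys() | target.keys()):
        before, after = base.get(key), target.get(key)
        if before is None:
            kind = ChangeKind.ADDED
        elif after is None:
            kind = ChangeKind.REMOVED
        elif after < before:
            kind = ChangeKind.WEAKENED
        elif after > before:
            # Any other difference is reported as a generic change.
            kind = ChangeKind.CHANGED
        else:
            continue  # Identical claims are not part of the diff.
        changes.append(ClaimChange(key, kind, before, after))
    return changes
```

The function only reads its inputs and returns a list for review. Under this sketch, publishing a change would be a separate, explicit decision, which fits the rule that new claims are not published automatically.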

### P1: Search Quality and Semantic Retrieval

Goal: improve discovery beyond simple text matching while keeping results
explainable.

Current state:

- Text search covers repositories, abilities, capabilities, features, and evidence.
- Filters exist for status, language, framework, ability, and capability.
- Result explanations include matched field and context.
- Semantic/vector search is not implemented.

Deliverables:

- Embedding abstraction for approved registry entries and content chunks.
- Local/offline fake embedding provider for tests.
- Optional pgvector/PostgreSQL backend path, without breaking SQLite dev mode.
- Hybrid ranking that combines text match, filters, confidence, and vector score.
- Search response fields that explain text and semantic match reasons.

Acceptance criteria:

- Existing text search behavior remains stable.
- Semantic search can be enabled optionally.
- Search results remain source-linked and explainable.
- Tests cover ranking, filters, and fallback when embeddings are unavailable.
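
One way the fake embedding provider and hybrid ranking might fit together is sketched below. The function names (`fake_embed`, `dot`, `hybrid_score`), the weights, and the reason labels are all assumptions made for illustration. Filters are assumed to be applied before scoring, so they do not appear here.

```python
import hashlib
import math
from typing import List, Optional, Sequence


def fake_embed(text: str, dims: int = 8) -> List[float]:
    """Deterministic, offline stand-in for a real embedding model (tests only)."""
    vec = [0.0] * dims
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dims] += 1.0  # Bucket each token by its hash.
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity for the unit vectors above."""
    return sum(x * y for x, y in zip(a, b))


def hybrid_score(text_score: float, confidence: float,
                 vector_score: Optional[float],
                 weights=(0.5, 0.2, 0.3)) -> dict:
    """Combine signals; fall back to text + confidence when no vector score."""
    w_text, w_conf, w_vec = weights
    if vector_score is None:
        # Renormalize so scores stay comparable when embeddings are unavailable.
        score = (w_text * text_score + w_conf * confidence) / (w_text + w_conf)
        reasons = ["text_match", "confidence"]
    else:
        score = w_text * text_score + w_conf * confidence + w_vec * vector_score
        reasons = ["text_match", "confidence", "semantic_match"]
    return {"score": score, "reasons": reasons}
```

Returning the list of reasons next to the score keeps each result explainable, and passing `vector_score=None` exercises the fallback path that the acceptance criteria call for.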

### P1: Discovery UI

Goal: make comparison, gap analysis, and export usable from the curator UI.

Current state:

- API endpoints exist for repository comparison, capability-gap reports, and YAML export.
- There are no dedicated UI workflows for these discovery helpers.

Deliverables:

- Repository comparison screen.
- Capability-gap input and report screen.
- Export action from repository profile.
- Clear empty states for repositories without approved profiles.

Acceptance criteria:

- A user can compare at least two approved repositories from the UI.
- A user can enter desired capabilities and inspect missing/weak/duplicate results.
- A user can export a registry entry without using raw API calls.
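
To illustrate what the missing/weak/duplicate buckets on the gap screen could mean, here is a small sketch. It does not describe the existing capability-gap endpoint; the `classify_gaps` name, the registry shape, and the `weak_below` threshold are assumptions.

```python
from typing import Dict, List


def classify_gaps(desired: List[str],
                  registry: Dict[str, Dict[str, float]],
                  weak_below: float = 0.5) -> Dict[str, List[str]]:
    """registry maps repo name -> {capability: confidence} (assumed shape)."""
    report: Dict[str, List[str]] = {"missing": [], "weak": [], "duplicate": []}
    for capability in desired:
        scores = [caps[capability] for caps in registry.values() if capability in caps]
        strong = [s for s in scores if s >= weak_below]
        if not scores:
            report["missing"].append(capability)    # No repository claims it.
        elif not strong:
            report["weak"].append(capability)       # Claimed, but only weakly.
        elif len(strong) > 1:
            report["duplicate"].append(capability)  # Several strong providers.
    return report
```

A capability with exactly one strong provider lands in none of the buckets, so the report lists only the items a curator needs to act on.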

### P1: Fixture Breadth and Regression Confidence

Goal: broaden real-world coverage across repository styles.

Current state:

- Tests cover manual registry, FastAPI-like routes, docs/tests/examples, UI loops,
  LLM fallback, and source-linked approval.
- Fixture helper coverage exists for README-only, Python CLI, and misleading-docs
  repositories, but not every fixture style has full e2e coverage.

Deliverables:

- E2E tests for README-only repositories.
- E2E tests for Python CLI repositories.
- E2E tests for JavaScript/TypeScript packages.
- E2E tests for repositories with weak or misleading documentation.
- Negative tests for unsupported or empty repositories.

Acceptance criteria:

- Candidate extraction stays conservative when docs are weak.
- CLI and frontend/package repositories produce useful facts and candidates.
- Misleading docs do not become approved truth without review.

### P2: Operational Readiness

Goal: make the service easier to run and diagnose outside local development.

Deliverables:

- Structured logging around ingestion, analysis, LLM extraction, and review actions.
- Configuration documentation for database, checkout root, and LLM provider settings.
- Basic metrics or health details for database and checkout root reachability.
- Backup/restore guidance for SQLite MVP deployments.
- Migration strategy notes for a future PostgreSQL deployment.

Acceptance criteria:

- Operators can diagnose failed analyses from logs and API state.
- Configuration is documented in one place.
- Database migration and backup expectations are clear.
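
Structured logging could be as simple as one JSON object per event. The sketch below uses only the Python standard library; the logger name and the `repo_id`, `run_id`, and `stage` fields are hypothetical and would need to match whatever identifiers the service actually uses.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "event": record.getMessage(),
        }
        # Fields passed via `extra=` become attributes on the record.
        for field in ("repo_id", "run_id", "stage"):
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        return json.dumps(payload, sort_keys=True)


def build_logger(stream) -> logging.Logger:
    logger = logging.getLogger("registry.analysis")  # hypothetical logger name
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


stream = io.StringIO()
log = build_logger(stream)
log.error("analysis_failed",
          extra={"repo_id": 7, "run_id": 42, "stage": "llm_extraction"})
```

With the run and stage attached to every event, an operator can filter logs for a single failed analysis and compare them against the API state for the same run.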

### P2: API Contract Stability

Goal: make agent/tooling integration safer as the API grows.

Deliverables:

- Versioned API path or explicit compatibility policy.
- Golden OpenAPI snapshot test or schema-diff check.
- More response examples for discovery and change-review endpoints.
- Error response schema for common 400/404 cases.

Acceptance criteria:

- Breaking API changes are deliberate and visible in tests.
- Agent-facing endpoints have stable request/response models.
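
A schema-diff check can start very small. The sketch below assumes the golden snapshot and the current schema are both available as parsed OpenAPI dictionaries, however the service produces them; `breaking_changes` is a hypothetical helper that only detects removed operations, not changed request or response models.

```python
from typing import Dict, List

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def breaking_changes(golden: Dict, current: Dict) -> List[str]:
    """List operations present in the golden snapshot but gone from current."""
    problems = []
    current_paths = current.get("paths", {})
    for path, operations in sorted(golden.get("paths", {}).items()):
        # Path items may hold non-operation keys (e.g. "parameters"); skip those.
        for method in sorted(m for m in operations if m in HTTP_METHODS):
            if method not in current_paths.get(path, {}):
                problems.append(f"removed: {method.upper()} {path}")
    return problems
```

A test would then assert that the returned list is empty. Added endpoints pass silently, while a removal fails until the golden snapshot is updated on purpose.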

## 2. First Implementation Sequence

Recommended next sequence:

1. Add analysis-run diff data structures and a read-only diff endpoint.
2. Add e2e tests for rerun diff behavior with added/removed routes and evidence.
3. Add UI display for analysis-run diffs.
4. Add approval workflow for selected changes.
5. Broaden fixture e2e coverage for README-only, CLI, JS/TS, and weak-doc repos.
6. Add discovery UI for compare, gaps, and export.
7. Prototype optional semantic search behind a feature/config boundary.

## 3. Definition of Done

This hardening plan can be closed when:

- Re-analysis changes are explicit, reviewable, and tested end to end.
- Search supports an optional semantic mode while preserving explainability.
- Discovery workflows are available in both API and UI.
- Fixture coverage represents the major repository shapes in the use case catalog.
- API contract changes are guarded by tests.
- Operational docs cover running, configuring, diagnosing, and backing up the service.