repo-scoping/docs/operations.md

# Operational Readiness

This note captures the runtime knobs and baseline operating procedures for the
Repository Scoping service.

## Configuration

Configuration is read from environment variables with the `REPO_SCOPING_`
prefix. The import package and the default local database path follow the same
naming, so every identifier stays aligned with the Repository Scoping service
name.

| Variable | Default | Purpose |
| --- | --- | --- |
| `REPO_SCOPING_DATABASE_PATH` | `var/repo-scoping.sqlite3` | SQLite database file used by the default store. |
| `REPO_SCOPING_CHECKOUT_ROOT` | `var/checkouts` | Local checkout cache used during repository ingestion. |
| `REPO_SCOPING_LLM_PROVIDER` | unset | Optional LLM provider name for candidate extraction. |
| `REPO_SCOPING_LLM_MODEL` | unset | Optional model name passed to the configured LLM provider. |
| `REPO_SCOPING_EMBEDDING_PROVIDER` | unset | Set to `hashing` to enable deterministic local hybrid search scoring. |
| `REPO_SCOPING_LOG_LEVEL` | `INFO` | Log level for the `repo_scoping.operations` structured event logger. |
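
The table above can be read as a small settings loader. The following is a minimal sketch, not the service's actual implementation: the `Settings` class, the `load_settings` function, and the choice to treat an empty value as unset are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

PREFIX = "REPO_SCOPING_"


@dataclass(frozen=True)
class Settings:
    """Hypothetical container mirroring the variables in the table above."""

    database_path: str
    checkout_root: str
    llm_provider: Optional[str]
    llm_model: Optional[str]
    embedding_provider: Optional[str]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read REPO_SCOPING_* variables, falling back to the documented defaults."""

    def get(name: str, default: Optional[str]) -> Optional[str]:
        # Assumption: an empty or whitespace-only value counts as unset.
        value = env.get(PREFIX + name, "").strip()
        return value or default

    return Settings(
        database_path=get("DATABASE_PATH", "var/repo-scoping.sqlite3"),
        checkout_root=get("CHECKOUT_ROOT", "var/checkouts"),
        llm_provider=get("LLM_PROVIDER", None),
        llm_model=get("LLM_MODEL", None),
        embedding_provider=get("EMBEDDING_PROVIDER", None),
        log_level=get("LOG_LEVEL", "INFO"),
    )
```

Passing the environment as a mapping keeps the loader easy to exercise in tests, for example `load_settings({"REPO_SCOPING_EMBEDDING_PROVIDER": "hashing"})`.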

## Health Checks

`GET /health` returns service status plus the operational dependencies that can
be checked locally:

```json
{
  "status": "ok",
  "database": {
    "path": "var/repo-scoping.sqlite3",
    "reachable": true,
    "error": null
  },
  "checkout_root": {
    "path": "var/checkouts",
    "exists": true
  }
}
```

`status` is `degraded` when the database cannot be initialized or queried. The
checkout root is reported as metadata only, because the ingestion path may
create it lazily.
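
The shape of this check can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the service's handler: `check_health` is a hypothetical name, and the probe uses a plain `SELECT 1` where the real service may run its own initialization.

```python
import sqlite3
from pathlib import Path
from typing import Any, Dict, Optional


def check_health(database_path: str, checkout_root: str) -> Dict[str, Any]:
    """Build a payload shaped like the GET /health response shown above."""
    error: Optional[str] = None
    try:
        # Note: sqlite3.connect creates the file if it does not exist yet.
        conn = sqlite3.connect(database_path)
        try:
            conn.execute("SELECT 1").fetchone()
        finally:
            conn.close()
    except sqlite3.Error as exc:
        error = str(exc)

    return {
        # Only the database result drives the overall status.
        "status": "ok" if error is None else "degraded",
        "database": {
            "path": database_path,
            "reachable": error is None,
            "error": error,
        },
        # Reported as metadata; a missing directory does not degrade status.
        "checkout_root": {
            "path": checkout_root,
            "exists": Path(checkout_root).is_dir(),
        },
    }
```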

## Structured Logs

Operational events are emitted through the `repo_scoping.operations` logger as
single-line JSON messages. Current events include repository registration,
analysis start/completion/failure, LLM extraction usage/failure, and review
decisions.

Configure the Python or ASGI server logging stack to route this logger to the
same sink as application logs. `REPO_SCOPING_LOG_LEVEL` controls the logger
level used by API-created service instances.
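
One way to route the logger with the standard `logging` module is sketched below. The helper name `configure_operations_logging` is hypothetical, and the sketch assumes each record's message is already a complete JSON document, so the formatter passes it through unchanged.

```python
import logging
import sys
from typing import Optional, TextIO


def configure_operations_logging(
    level: str = "INFO", stream: Optional[TextIO] = None
) -> logging.Logger:
    """Attach a pass-through handler to the repo_scoping.operations logger."""
    logger = logging.getLogger("repo_scoping.operations")
    logger.setLevel(level)

    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    # Emit only the message so every output line stays parseable JSON.
    handler.setFormatter(logging.Formatter("%(message)s"))

    # Replace existing handlers so repeated calls do not duplicate output.
    logger.handlers = [handler]
    # Assumption: the sink is attached here, so do not also emit via the root logger.
    logger.propagate = False
    return logger
```

If application logs already flow through the root logger, leaving `propagate` enabled and skipping the dedicated handler may be the simpler option.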

## SQLite Backup And Restore

For single-node SQLite deployments, prefer the `sqlite3` shell's `.backup`
command, which uses the SQLite online backup API, so readers can continue while
the backup is created:

```bash
mkdir -p backups
sqlite3 var/repo-scoping.sqlite3 ".backup 'backups/repo-scoping-$(date +%F).sqlite3'"
```

For the most conservative backup window, stop writes first, run the backup, then
resume the service. Verify a backup with `PRAGMA integrity_check`, which prints
`ok` when no problems are found:

```bash
sqlite3 backups/repo-scoping-YYYY-MM-DD.sqlite3 "PRAGMA integrity_check;"
```
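
The same backup and verification can be scripted from Python, whose `sqlite3.Connection.backup` method wraps the SQLite online backup API. The sketch below is illustrative: `backup_database` and `verify_backup` are hypothetical helper names, and the file naming simply mirrors the shell example above.

```python
import sqlite3
from datetime import date
from pathlib import Path
from typing import Union

PathLike = Union[str, Path]


def backup_database(source_path: PathLike, backup_dir: PathLike = "backups") -> Path:
    """Copy the live database into a dated file using the online backup API."""
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"repo-scoping-{date.today().isoformat()}.sqlite3"

    source = sqlite3.connect(source_path)
    dest = sqlite3.connect(target)
    try:
        source.backup(dest)
    finally:
        dest.close()
        source.close()
    return target


def verify_backup(path: PathLike) -> bool:
    """Return True when PRAGMA integrity_check reports a single 'ok' row."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("PRAGMA integrity_check").fetchall()
    finally:
        conn.close()
    return rows == [("ok",)]
```

A scheduled job could call `verify_backup(backup_database("var/repo-scoping.sqlite3"))` and alert when the result is `False`.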

To restore, stop the service, move the current database aside, copy the backup to
`REPO_SCOPING_DATABASE_PATH`, start the service, and verify `GET /health`.
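
The file-handling part of the restore can be sketched as follows. `restore_database` is a hypothetical helper that must only run while the service is stopped. Moving the `-wal` and `-shm` sidecar files aside is an assumption that matters only if the database runs in WAL mode.

```python
import shutil
import time
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def restore_database(backup_path: PathLike, database_path: PathLike) -> Optional[Path]:
    """Move the current database aside and copy the backup into its place.

    Returns the path of the moved-aside database, or None if none existed.
    """
    backup = Path(backup_path)
    db = Path(database_path)
    if not backup.is_file():
        raise FileNotFoundError(f"backup not found: {backup}")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    aside: Optional[Path] = None
    # Assumption: WAL sidecars, if present, belong to the old database.
    for suffix in ("", "-wal", "-shm"):
        current = db.with_name(db.name + suffix)
        if current.exists():
            moved = current.with_name(f"{current.name}.replaced-{stamp}")
            current.rename(moved)
            if suffix == "":
                aside = moved

    db.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(backup, db)
    return aside
```

After the copy, start the service and confirm that `GET /health` reports `"status": "ok"`.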

## PostgreSQL Migration Notes

The storage interface is intentionally kept behind `RegistryStore` so a
PostgreSQL-backed implementation can be introduced alongside SQLite before
cutover. A production migration should:

1. Add a PostgreSQL store that preserves the current repository, analysis,
   observed fact, content chunk, candidate, approved registry, and review
   decision contracts.
2. Manage schema changes with explicit migrations rather than implicit table
   creation.
3. Export from SQLite and import into PostgreSQL in a repeatable script, then
   compare repository counts, approved ability maps, search results, and recent
   review decisions.
4. Keep vector search optional. If pgvector is enabled, follow the plan in
   `docs/semantic-retrieval.md` and validate hybrid ranking before making it the
   default.
5. Take a final SQLite backup immediately before cutover and retain it until the
   PostgreSQL deployment has passed health and smoke tests.
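
The comparison in step 3 can start with per-table row counts. The sketch below covers only the SQLite side and a generic diff. Both function names are hypothetical, and the PostgreSQL counts are assumed to arrive as a plain mapping produced by whatever driver the migration script uses.

```python
import sqlite3
from typing import Dict, Mapping, Optional, Tuple


def sqlite_row_counts(database_path: str) -> Dict[str, int]:
    """Return {table_name: row_count} for every user table in the database."""
    conn = sqlite3.connect(database_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        counts = {}
        for table in tables:
            # Quote the identifier; names come from sqlite_master, not user input.
            quoted = '"' + table.replace('"', '""') + '"'
            counts[table] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()


def count_mismatches(
    source: Mapping[str, int], target: Mapping[str, int]
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return tables whose counts differ as {name: (source_count, target_count)}."""
    names = sorted(set(source) | set(target))
    return {
        name: (source.get(name), target.get(name))
        for name in names
        if source.get(name) != target.get(name)
    }
```

Matching row counts are necessary but not sufficient. The approved ability maps, search results, and recent review decisions listed in step 3 still need content-level comparison.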