reuse-surface/docs/deploy/reuse-kubernetes.md
tegwick 270065ff58
Implement REUSE-WP-0012 federation scale and intent alignment
Add hub sync and report cohorts CLI commands with pytest coverage, document
sibling index publish contract and hub hardening path, align INTENT layout,
raise external evidence on three registry entries, and close gap priorities
19-23 (priority 18 deferred on sibling index blocks).
2026-06-16 00:42:50 +02:00

# reuse-surface Service — Kubernetes Deployment
Companion to **RAILIANCE-WP-0007** (`railiance-apps` Helm release).
## Image
Repository: `gitea.coulomb.social/coulomb/reuse-surface` (Gitea org `coulomb`, repo `reuse-surface`).
```bash
docker build -t gitea.coulomb.social/coulomb/reuse-surface:<tag> .
docker push gitea.coulomb.social/coulomb/reuse-surface:<tag>
```
## Required environment
| Variable | Purpose |
|---|---|
| `REUSE_SURFACE_TOKEN` | Bearer token for write API |
| `REUSE_SURFACE_DB` | SQLite path (default `/data/reuse.db`) |
| `REUSE_SURFACE_CACHE_DIR` | Remote index cache (default `/data/cache`) |

Mount a PVC at `/data` for persistence. Inject secrets via Kubernetes Secret
`reuse-surface-env`.
## Probes
- Liveness/readiness: `GET /health` on port `8000`
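
The probe can also be exercised by hand, for example after a rollout. The following is a minimal sketch, not part of the chart: the helper name `wait_for_health`, its retry parameters, and the assumption that a healthy hub answers `GET /health` with HTTP 200 are all illustrative.

```bash
# Hypothetical helper: poll GET /health until it returns HTTP 200
# or the attempts run out.
# Usage: wait_for_health <base-url> [attempts] [delay-seconds]
wait_for_health() {
  local base_url="$1" attempts="${2:-10}" delay="${3:-3}"
  local i=1 status=""
  while [ "$i" -le "$attempts" ]; do
    # -s silences progress, -o discards the body, -w prints only the status code.
    status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "${base_url}/health") || true
    if [ "$status" = "200" ]; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "unhealthy: last status ${status:-none}" >&2
  return 1
}
```

Inside the cluster the hub listens on port `8000`, so one way to use this is to run `kubectl -n <namespace> port-forward deploy/reuse-surface 8000:8000` in one terminal and `wait_for_health http://localhost:8000` in another.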
## Browser landing page
Production ingress routes HTTPS `/` to a static landing Deployment
(`reuse-surface-landing`, **RAILIANCE-WP-0008**). API paths are unchanged:
- `/health` and `/v1/*` → hub service container
- `/` → informational HTML for browser visitors (no login, no secrets)

Agents and CLI clients should target `/health` and `/v1/*` only, not `/`.
## Public URL and DNS
| Item | Value |
|---|---|
| URL | `https://reuse.coulomb.social` |
| DNS A record | **`92.205.62.239`** (Railiance01 production) |

CoulombCore (`92.205.130.254`) hosted the bootstrap deployment; production releases use
`KUBECONFIG=~/.kube/config-hosteurope`. Verify propagation:
```bash
dig +short reuse.coulomb.social A # must return 92.205.62.239
```
## Client configuration
```bash
export REUSE_SURFACE_URL=https://reuse.coulomb.social
export REUSE_SURFACE_TOKEN=<write-token>
reuse-surface hub status
```
## Operational hardening
The hub runs as a single-replica Deployment with SQLite on a PVC (**A5**
containerized service). **A6** (managed platform) is deferred until multi-replica
or Postgres backing is required.
### Backup and restore (SQLite PVC)
1. Identify the PVC mounted at `/data` (stores `reuse.db` and remote index cache).
2. Take a WAL-safe online copy with SQLite's `.backup` command while the pod is
running, or briefly scale the Deployment to zero for a cold copy:
```bash
kubectl -n <namespace> exec deploy/reuse-surface -- \
  sqlite3 /data/reuse.db '.backup /tmp/reuse-backup.db'
# kubectl cp needs a pod name (not deploy/<name>), so stream the file via exec.
kubectl -n <namespace> exec deploy/reuse-surface -- \
  cat /tmp/reuse-backup.db > ./reuse-backup.db
```
3. Restore by replacing `/data/reuse.db` from backup and restarting the pod.
4. Run `reuse-surface hub list`; if the database is empty, re-register the repos.

Verify the backup once per environment after each deployment change.
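
One way to verify a copied backup is an integrity check on the workstation. This is a sketch under stated assumptions: the helper name `verify_backup` is hypothetical, and it requires the `sqlite3` CLI to be installed locally.

```bash
# Hypothetical helper: check that a copied backup is a sound SQLite database.
# Usage: verify_backup <path-to-backup.db>
verify_backup() {
  local backup="$1" result
  if [ ! -s "$backup" ]; then
    echo "backup missing or empty: $backup" >&2
    return 1
  fi
  # PRAGMA integrity_check prints the single line "ok" for a sound database.
  result=$(sqlite3 "$backup" 'PRAGMA integrity_check;') || return 1
  if [ "$result" != "ok" ]; then
    echo "integrity check failed: $result" >&2
    return 1
  fi
  echo "backup ok: $backup"
}
```

For example, `verify_backup ./reuse-backup.db` returns a non-zero status if the file is missing, empty, unreadable, or reports integrity errors.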
### TLS certificate renewal
Ingress TLS is managed by the cluster cert issuer (Railiance01 companion chart).
Monitor certificate expiry on `reuse.coulomb.social`. Renewal is automatic when
the issuer is healthy; on failure, check ingress secret `reuse-surface-tls` and
cert-manager / companion operator logs.
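
Certificate expiry can be checked from any machine with `openssl`. The helpers below are a rough sketch rather than existing tooling: the names `cert_enddate` and `cert_valid_for` are hypothetical, and port 443 is assumed for the public ingress.

```bash
# Hypothetical helper: print the notAfter date of the certificate served for a host.
# Usage: cert_enddate <host>
cert_enddate() {
  local host="$1"
  # </dev/null closes stdin so s_client exits after the handshake.
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null |
    openssl x509 -noout -enddate
}

# Hypothetical helper: succeed only if the served certificate stays valid
# for at least <days> more days.
# Usage: cert_valid_for <host> <days>
cert_valid_for() {
  local host="$1" days="$2"
  # -checkend N exits non-zero if the certificate expires within N seconds.
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null |
    openssl x509 -noout -checkend "$((days * 86400))" >/dev/null
}
```

A scheduled job could run `cert_valid_for reuse.coulomb.social 14 || echo "certificate expires within 14 days"`; the 14-day threshold is only an example value.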
### Token rotation
1. Generate a new `REUSE_SURFACE_TOKEN` value.
2. Update Kubernetes Secret `reuse-surface-env`.
3. Rolling restart the hub Deployment.
4. Update operator workstations and CI secrets that call write endpoints.
5. Confirm `reuse-surface hub register` fails with the old token and succeeds
with the new token.
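
Steps 1 to 3 above could be scripted roughly as follows. This is a sketch, not the project's tooling: the helper name `rotate_token`, the 64-character hex token format, and the 120-second rollout timeout are assumptions. It patches only the `REUSE_SURFACE_TOKEN` key, so any other keys in the Secret are left alone.

```bash
# Hypothetical helper: generate a token, patch the Secret, restart the hub.
# Usage: rotate_token <namespace>
rotate_token() {
  local namespace="$1" new_token
  # 32 random bytes, hex-encoded. The token format is an assumption.
  new_token=$(openssl rand -hex 32) || return 1
  # kubectl status messages go to stderr so stdout carries only the token.
  kubectl -n "$namespace" patch secret reuse-surface-env --type merge \
    -p "{\"stringData\":{\"REUSE_SURFACE_TOKEN\":\"${new_token}\"}}" >&2 || return 1
  kubectl -n "$namespace" rollout restart deploy/reuse-surface >&2 || return 1
  kubectl -n "$namespace" rollout status deploy/reuse-surface --timeout=120s >&2 || return 1
  # Print the token once so it can be copied into workstations and CI secrets.
  printf '%s\n' "$new_token"
}
```

Steps 4 and 5 remain manual: distribute the printed value, then confirm that `reuse-surface hub register` rejects the old token.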
### Image promotion checklist
1. Tag image from CI commit: `gitea.coulomb.social/coulomb/reuse-surface:<sha>`.
2. Run `pytest -q` and `reuse-surface validate` on that commit.
3. Update Helm values image tag in `railiance-apps`.
4. Deploy to Railiance01; verify `GET /health` and `GET /v1/repos`.
5. Smoke-test `reuse-surface hub list` and the `GET /v1/federated` capability count.
6. Record image digest in workplan or progress log.
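
The HTTP checks in steps 4 and 5 and the digest lookup in step 6 might be wrapped as below. This is a sketch: the helper names `smoke_hub` and `image_digest` are hypothetical, and it assumes the read endpoints answer without a token and that the image has already been pushed or pulled so Docker knows its registry digest.

```bash
# Hypothetical helper: fail on the first read endpoint that does not
# answer with a success status.
# Usage: smoke_hub <base-url>
smoke_hub() {
  local base_url="$1" path
  for path in /health /v1/repos /v1/federated; do
    # -f makes curl exit non-zero on HTTP 4xx/5xx responses.
    if ! curl -fsS --max-time 10 -o /dev/null "${base_url}${path}"; then
      echo "FAIL ${path}" >&2
      return 1
    fi
    echo "OK ${path}"
  done
}

# Hypothetical helper: print the registry digest of an image for the progress log.
# Usage: image_digest <image-ref>
image_digest() {
  docker inspect --format '{{index .RepoDigests 0}}' "$1"
}
```

For example, `smoke_hub https://reuse.coulomb.social` followed by `image_digest gitea.coulomb.social/coulomb/reuse-surface:<sha>` covers the HTTP checks and the digest entry; the capability count still needs a manual look at the `/v1/federated` response.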
### SQLite vs Postgres (cnpg) — decision criteria
Stay on SQLite while:
- Single replica is acceptable.
- The RPO offered by occasional PVC snapshots is sufficient.
- Write volume is low (repo registration changes only).

Consider Postgres (e.g. CloudNative-PG) when:
- Multiple hub replicas or zero-downtime failover is required.
- RPO/RTO targets need point-in-time recovery beyond PVC snapshots.
- Federation cache metadata or audit tables grow beyond a comfortable SQLite size.

**Implementation deferred** unless an operator approves migration. Until then,
this section is documentation only.