railiance-apps/docs/vergabe-teilnahme.md
tegwick 398b0fe211 RAILIANCE-WP-0002 finished: vergabe-teilnahme T07+T08 done
T07 smoke: migrate all apps; /health/ 200, /ausschreibungen/dashboard/ Übersicht, /admin/login/ Anmelden, static assets (Tailwind, Alpine, htmx, Django admin) all 200. Auth-required smoke and createsuperuser deferred to the operator (interactive credentials not safe through this session); seed_dev deliberately skipped (hardcoded dev user). T08 runbook in docs/vergabe-teilnahme.md: identity, secret rotation recipes, day-to-day make targets, image promotion + rollback, troubleshooting, deferred backup posture, cross-refs.

Workplan status: finished. vergabe-teilnahme is the second S5 application on railiance01 (after Gitea).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-19 20:43:04 +02:00

vergabe-teilnahme — operator runbook

Production deployment of the Django tender-management app, shipped under RAILIANCE-WP-0002.

Identity

| Item | Value |
| --- | --- |
| Public URL | https://vergabe-teilnahme.whywhynot.de |
| Namespace | vergabe-teilnahme |
| Helm release | vergabe-teilnahme |
| Chart | charts/vergabe-teilnahme/ |
| Values | helm/vergabe-teilnahme-values.yaml (plain, no SOPS) |
| Ingress | manifests/vergabe-teilnahme-ingress.yaml |
| Image | gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag> |
| Database | vergabe_db on shared cnpg apps-pg (see railiance-platform/docs/apps-pg.md) |
| TLS | vergabe-teilnahme-tls, issued by cert-manager letsencrypt-prod |

Secrets

Two K8s Secrets in the vergabe-teilnahme namespace:

| Secret | Type | Source of truth | Used for |
| --- | --- | --- | --- |
| vergabe-app-credentials | kubernetes.io/basic-auth | mirror of databases/vergabe-app-credentials (cnpg-owned) | raw DB role credential |
| vergabe-teilnahme-env | Opaque | created by operator | SECRET_KEY + URL-encoded DATABASE_URL (envFrom on the Deployment) |

No SOPS encryption for this app — all sensitive material lives in K8s Secrets, not in committed values files.

Rotating the DB password

  1. Have railiance-platform rotate the cnpg-managed Secret (databases/vergabe-app-credentials).
  2. Mirror the new password into vergabe-teilnahme/vergabe-app-credentials.
  3. Rebuild DATABASE_URL in vergabe-teilnahme-env, URL-encoding the password. The password is drawn from the base64 alphabet, and characters such as /, + and = break URL parsing when left raw (see RAILIANCE-WP-0004 I01):
    PW=$(kubectl get secret vergabe-app-credentials -n vergabe-teilnahme -o jsonpath='{.data.password}' | base64 -d)
    ENCODED=$(python3 -c "import urllib.parse,sys; print(urllib.parse.quote(sys.argv[1], safe=''))" "$PW")
    kubectl patch secret vergabe-teilnahme-env -n vergabe-teilnahme \
      --type=merge \
      -p "{\"stringData\":{\"DATABASE_URL\":\"postgresql://vergabe:$ENCODED@apps-pg-rw.databases:5432/vergabe_db\"}}"
    kubectl rollout restart deploy/vergabe-teilnahme -n vergabe-teilnahme
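To make the encoding step easier to check before patching the Secret, here is a minimal Python sketch of the same URL construction. The function name build_database_url and the sample password are hypothetical; the user, host, port and database name are the ones used in the recipe above. It only builds and re-parses the string and does not talk to the cluster.

```python
from urllib.parse import quote, unquote, urlsplit


def build_database_url(user: str, password: str, host: str, port: int, dbname: str) -> str:
    """Return a postgresql:// URL with the password percent-encoded."""
    # safe="" also encodes "/", which quote() would otherwise leave untouched.
    encoded = quote(password, safe="")
    return f"postgresql://{user}:{encoded}@{host}:{port}/{dbname}"


# Hypothetical password containing the characters that cause trouble.
sample_password = "ab/cd+ef=="
url = build_database_url("vergabe", sample_password, "apps-pg-rw.databases", 5432, "vergabe_db")

# Re-parse the result to confirm that host, port and database name survive.
parts = urlsplit(url)
print(parts.hostname, parts.port, parts.path)
print(unquote(parts.password) == sample_password)
```

If the re-parsed host, port and path match what was passed in, the encoded value should be safe to place in DATABASE_URL.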
    

Rotating SECRET_KEY

Django SECRET_KEY rotation invalidates active sessions but is otherwise zero-downtime:

NEW=$(openssl rand -base64 50 | tr -d '\n' | tr '/+=' 'abc')
kubectl patch secret vergabe-teilnahme-env -n vergabe-teilnahme \
  --type=merge -p "{\"stringData\":{\"SECRET_KEY\":\"$NEW\"}}"
kubectl rollout restart deploy/vergabe-teilnahme -n vergabe-teilnahme
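As an alternative to the openssl and tr pipeline, the key and the merge-patch body could be produced in Python. This is a sketch under assumptions, not the method this deployment uses: new_secret_key and secret_patch are hypothetical helper names, and the 50-byte default simply mirrors the openssl call above. Building the JSON with json.dumps avoids hand-escaping quotes in the shell.

```python
import json
import secrets


def new_secret_key(nbytes: int = 50) -> str:
    """Return a random key using only URL-safe characters (A-Z, a-z, 0-9, '-', '_')."""
    return secrets.token_urlsafe(nbytes)


def secret_patch(name: str, value: str) -> str:
    """Return a JSON merge-patch body that sets one stringData entry on a Secret."""
    return json.dumps({"stringData": {name: value}})


key = new_secret_key()
patch_body = secret_patch("SECRET_KEY", key)
# patch_body can be passed to: kubectl patch secret ... --type=merge -p "<patch_body>"
print(len(key))
```

The printed length is only a sanity check; the key itself should not be echoed into logs or shell history.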

Day-to-day commands

make vergabe-status      # pods, svc, ingress, certificate
make vergabe-logs        # tail app logs
make vergabe-dry-run     # helm template render (audit values)
make vergabe-deploy      # helm upgrade --install (idempotent)
make vergabe-migrate     # manage.py migrate against live deploy
make vergabe-seed        # seed_dev — DEV ONLY, creates max.muster/testpass123 (do not run in prod)
make vergabe-superuser   # interactive createsuperuser

Promoting a new image tag

  1. Build + push from the vergabe-teilnahme repo (its Dockerfile header documents the BuildKit --build-context invocation; see also RAILIANCE-WP-0004 I03).
  2. Update image.tag in helm/vergabe-teilnahme-values.yaml to the new git SHA.
  3. make vergabe-deploy: Helm upgrades the release and the Deployment rolls out a new ReplicaSet with zero downtime (maxSurge: 1, maxUnavailable: 0).
  4. Verify via make vergabe-status and an HTTPS probe.
  5. If migrations are needed, run make vergabe-migrate after the rollout completes.
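The HTTPS probe in step 4 can be as simple as the following Python sketch. The helper name probe is hypothetical, and which paths to check is an assumption; /health/ and /admin/login/ are the unauthenticated endpoints that returned 200 in the T07 smoke test.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 5.0) -> int:
    """Issue a GET request and return the HTTP status code."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # 4xx and 5xx responses arrive as exceptions; report the code instead of raising.
        return exc.code


# Example call (not executed here):
# probe("https://vergabe-teilnahme.whywhynot.de/health/")
```

Connection failures and TLS errors still raise urllib.error.URLError, which is deliberate: a probe that cannot connect should fail loudly rather than return a number.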

Rollback

helm history vergabe-teilnahme -n vergabe-teilnahme
helm rollback vergabe-teilnahme <REVISION> -n vergabe-teilnahme

Rollback does not unwind DB migrations. For any rollback that crosses a migration boundary, explicitly plan a reverse step with manage.py migrate <app> <name>, where <name> is the last migration the older image expects.

Troubleshooting

Pod stuck Running 0/1, kube-probe failing

Most likely the probe's Host header doesn't match ALLOWED_HOSTS. The chart sets probes.hostHeader: vergabe-teilnahme.whywhynot.de precisely to avoid this — if you change ALLOWED_HOSTS in values, also update probes.hostHeader. Symptom in kubectl logs: kube-probe requests returning HTTP 400.
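The mismatch can be illustrated with a simplified model of Django's host check. This is a sketch, not Django's actual code: host_allowed is a hypothetical function that covers exact matches, the "*" wildcard and leading-dot subdomain patterns, and ignores IPv6 literals. The pod IP and port are made up; a kubelet HTTP probe without an explicit Host header typically sends the pod IP, which is presumably what probes.hostHeader overrides.

```python
def host_allowed(host_header: str, allowed_hosts: list[str]) -> bool:
    """Simplified model of an ALLOWED_HOSTS check (IPv6 literals not handled)."""
    # Drop an optional ":port" suffix and compare case-insensitively.
    host = host_header.rsplit(":", 1)[0] if ":" in host_header else host_header
    host = host.lower()
    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern == "*" or pattern == host:
            return True
        # A leading dot matches the bare domain and any subdomain.
        if pattern.startswith(".") and (host.endswith(pattern) or host == pattern[1:]):
            return True
    return False


allowed = ["vergabe-teilnahme.whywhynot.de"]
print(host_allowed("vergabe-teilnahme.whywhynot.de", allowed))  # Host header set by the chart
print(host_allowed("10.42.0.17:8000", allowed))  # hypothetical pod IP sent by a default probe
```

In this model the second call is rejected, which corresponds to the HTTP 400 responses seen in the logs.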

dj-database-url error: "The database name 'XYZ...' is longer than 63 characters"

The DATABASE_URL password isn't URL-encoded. See the rotation recipe above. Tracked in RAILIANCE-WP-0004 I01.
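Why the error mentions the database name can be seen by parsing an unencoded URL with Python's standard library. The password below is hypothetical, and the sketch assumes dj-database-url derives the database name from the URL path, as URL-based settings parsers generally do.

```python
from urllib.parse import urlsplit

# Hypothetical raw password containing "/", inserted without URL-encoding.
raw_password = "ab/cd+ef=="
raw_url = f"postgresql://vergabe:{raw_password}@apps-pg-rw.databases:5432/vergabe_db"

parts = urlsplit(raw_url)
# The first "/" in the password ends the network location early,
# so the rest of the password, the host and the real database name land in the path.
print(parts.netloc)
print(parts.path)
```

With a long generated password, the leftover path can exceed PostgreSQL's 63-character identifier limit, which would match the message above.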

cert-manager: certificate stuck at READY False

Check the Order/Challenge resources:

kubectl get order,challenge -n vergabe-teilnahme
kubectl describe challenge -n vergabe-teilnahme

Common causes: DNS not yet propagated to all resolvers, Let's Encrypt rate limiting, or the ingress controller not forwarding /.well-known/acme-challenge/ requests.

make vergabe-status shows certificate False

The chart leaves the certificate lifecycle to cert-manager. If a renewal fails, the existing certificate stays in the TLS Secret and the ingress keeps serving it until it expires. Investigate with kubectl describe certificate vergabe-teilnahme-tls -n vergabe-teilnahme.

Backup posture (open)

The shared apps-pg cluster is not yet covered by an automated backup job — only the legacy PostgreSQL-HA setup is. Manual logical dump for now:

kubectl exec -n databases apps-pg-1 -- pg_dump -U postgres -Fc vergabe_db > vergabe_db-$(date +%F).dump

Tracked as a follow-up in RAILIANCE-WP-0003 Notes (CNPG backup configuration belongs to railiance-platform).

Deferred for v1

  • Multi-replica HA (replicaCount: 1).
  • Media-upload PVC (persistence.media.enabled: false — Django MEDIA_ROOT is in-pod ephemeral).
  • 3-stage canary (the Staged Promotion Lifecycle workstream is still 0/7).
  • SSO / Keycloak integration (Django built-in auth only).
  • Celery + Redis workers.

Cross-references

  • Workplan: workplans/railiance-apps-WP-0002-vergabe-teilnahme-on-railiance01.md
  • Improvements backlog: workplans/railiance-apps-WP-0004-app-deployment-improvements.md
  • Shared DB cluster: railiance-platform/docs/apps-pg.md
  • Container registry: docs/gitea-container-registry.md
  • App source: https://gitea.coulomb.social/coulomb/vergabe-teilnahme