---
id: RAILIANCE-WP-0002
type: workplan
title: "Establish vergabe-teilnahme as an Application on railiance01"
domain: railiance
repo: railiance-apps
status: finished
owner: railiance
topic_slug: railiance
created: "2026-05-18"
updated: "2026-05-18"
planning_priority: high
planning_order: 2
state_hub_workstream_id: "94522a85-80d5-4f2c-8eb0-8d0bcb15f3b0"
---

# Establish vergabe-teilnahme as an Application on railiance01

## Goal

Deploy the `vergabe-teilnahme` Django application as a governed Helm release on the production cluster node `railiance01` (`92.205.130.254`), reachable at `https://vergabe-teilnahme.whywhynot.de`, with its container image published through the Railiance Gitea OCI registry and its PostgreSQL data living in the shared cnpg cluster.

This establishes vergabe-teilnahme as the second application (after Gitea) running on the S5 layer of the Railiance OAS Stack and exercises the freshly enabled container registry from `RAILIANCE-WP-0001` end-to-end.

## Placement in the Railiance Tooling Set

This workplan lives in `railiance-apps` because vergabe-teilnahme is an S5 application workload. The deployment surface added by this workplan is:

- `helm/vergabe-teilnahme-values.sops.yaml` — SOPS-encrypted Helm values (`DJANGO_SECRET_KEY`, DB DSN, etc.).
- `releases/vergabe-teilnahme/` — chart selection / overlay (Bitnami generic chart or hand-rolled chart, decided in T05).
- `manifests/vergabe-teilnahme-ingress.yaml` — Traefik ingress + cert-manager TLS for `vergabe-teilnahme.whywhynot.de`.
- `Makefile` targets: `vergabe-deploy`, `vergabe-status`, `vergabe-migrate`.

Cross-repo coordination required:

| Concern | Owner repo | Coordination |
|---------|------------|--------------|
| Application Helm release | `railiance-apps` | This workplan |
| Containerization (Dockerfile, entrypoint, asset build) | `vergabe-teilnahme` | This workplan opens a task in that repo |
| PostgreSQL role + database in shared cnpg cluster | `railiance-platform` | Capability request via hub |
| DNS A record for `vergabe-teilnahme.whywhynot.de` | DNS owner of `whywhynot.de` | Out-of-band, captured in T06 |
| Ingress controller / cluster routing primitives | `railiance-cluster` | Reuse — no changes expected |
| cert-manager ClusterIssuer `letsencrypt-prod` | `railiance-platform` | Reuse — no changes expected |

## Current Evidence

- `vergabe-teilnahme/CLAUDE.md`: Django 6.x · uv · Tailwind CSS v4 (Vite) · HTMX 2.x · Alpine.js 3.x · PostgreSQL 16+ (psycopg3). German UI.
- `vergabe-teilnahme/` currently has no `Dockerfile`. `docker-compose.dev.yml` documents the local Postgres but isn't started when the shared `infra-postgres-1` is up.
- `railiance-apps/Makefile` deploys Gitea via `helm/gitea-values.sops.yaml`; the same SOPS + Helm pattern is the template for this workplan.
- `RAILIANCE-WP-0001` confirmed `https://gitea.coulomb.social/v2/` returns the OCI registry auth challenge. Image naming convention established: `gitea.coulomb.social//:`.
- `manifests/gitea-ingress.yaml` confirms the ingress recipe: `ingressClassName: traefik` + annotation `cert-manager.io/cluster-issuer: letsencrypt-prod`.
- Domain `whywhynot.de` has no prior references in any railiance repo — DNS and a fresh Let's Encrypt cert will need to be set up.
- Live cnpg state: `gitea-db` runs in the `databases` namespace. T01 confirms whether a single shared cluster exists for app DBs or whether one must be designated.

## Safety Contract

- Do not commit decrypted Helm values, the Django `SECRET_KEY`, DB credentials, or any other secret material.
- Use a dedicated PostgreSQL role with privileges scoped to a single database; never reuse the gitea role or grant superuser.
- No public exposure until cert-manager has successfully issued a TLS certificate for `vergabe-teilnahme.whywhynot.de`.
- Do not start with `DEBUG=True`; production settings only.
- Preserve Gitea behavior: the new ingress must not conflict with the `gitea` ingress on the cluster's default ingress controller.
- If DB role provisioning requires changes to the shared cnpg cluster resource limits, pause and create a `railiance-platform` task.
- If DNS for `whywhynot.de` is owned outside this operator's control, pause T06 until DNS ownership is confirmed.
- Start fresh — no migration of data from any existing dev database in this workplan.

## Target State

- `vergabe-teilnahme:` image is built and pushed to `gitea.coulomb.social/coulomb/vergabe-teilnahme:`.
- A `vergabe` PostgreSQL role and `vergabe_db` database exist in the shared cnpg cluster (single role, single DB, no cross-app grants).
- A Helm release `vergabe-teilnahme` is deployed in a dedicated namespace with a single replica, a Service, a PVC for media (if any), and the necessary secrets sourced from SOPS values.
- Django `migrate` and `make seed` have run successfully against the shared cnpg database.
- `https://vergabe-teilnahme.whywhynot.de` serves the Django app behind a valid Let's Encrypt certificate.
- Login as a superuser succeeds; the dashboard renders; static assets are served correctly (Tailwind/Vite build pipeline integrated into the image).
- Runbook, registry naming, DB credentials handling, and rollback steps are documented for the next operator.

## Tasks

### T01 — Inventory the deployment substrate

```task
id: RAILIANCE-WP-0002-T01
status: done
priority: high
state_hub_task_id: "49aa7d85-96bd-4d97-952c-80dcfff06610"
```

Confirm the pre-conditions before any code is written.

Checks:

- Identify the shared cnpg cluster intended for app databases (name, namespace, version, current databases/roles, available capacity).
- Verify `gitea.coulomb.social/v2/` still returns an OCI registry auth challenge and that the operator has a package-capable token.
- Verify cert-manager `letsencrypt-prod` ClusterIssuer is healthy and has successfully issued at least one production certificate recently (`gitea-tls` is a known good example).
- Confirm DNS ownership and the change path for `whywhynot.de` — record who owns the zone and how an A record is added.
- Confirm Traefik is the active ingress controller and note the public IP/hostname an A record must point at.

**Done when:** the workplan records (a) the cnpg cluster to use, (b) the operator's path to a Gitea package token, (c) the DNS change path for `whywhynot.de`, and (d) any pre-condition gaps.

**Findings (2026-05-18):**

- **cnpg landscape — no shared cluster yet.** `kubectl get clusters.postgresql.cnpg.io -A` returns two app-dedicated clusters in the `databases` namespace:
  - `gitea-db` — `ghcr.io/cloudnative-pg/postgresql:18.1-system-trixie`, 1 instance, 10Gi
  - `net-kingdom-pg` — `ghcr.io/cloudnative-pg/postgresql:16`, 1 instance, 10Gi

  Neither was provisioned as a shared cluster. The user's earlier choice ("shared cnpg cluster, new database role") therefore requires a sub-decision — see **Decision D-01** below.
- **Gitea registry reachable.** `curl --resolve gitea.coulomb.social:443:92.205.130.254 https://gitea.coulomb.social/v2/` returns `HTTP 405` for `HEAD` with a valid TLS chain (cert: `default/gitea-tls`, ready 3d). The OCI endpoint is up; a 405 on `HEAD` (as opposed to the auth challenge on `GET`) is expected.
- **Gitea package token — still required.** No package-capable PAT is currently held by the operator in this session (carryover blocker from `RAILIANCE-WP-0001-T04`). Token must be minted via the Gitea web UI by a user with `write:package` scope before T03.
- **Public DNS for `whywhynot.de`:** A-record currently `217.160.0.212` (IONOS web hosting). Authoritative NS = `ns1126.ui-dns.{de,biz,com,org}` (IONOS / 1&1). The zone is administered through the operator's IONOS web console — DNS change is a manual out-of-band step.
- **Traefik LB public IP:** `92.205.130.254` (`kube-system/traefik` LoadBalancer service, ports 80/443). This is the target the new A-record must point at.
- **cert-manager:** `ClusterIssuer/letsencrypt-prod` is `Ready=True` (59d). Most recent successful issuance: `default/gitea-tls`, 3d4h ago. Several stale failing certs in `mfa` and `sso` namespaces are unrelated to this workplan.
- **Pre-condition gaps before downstream tasks unblock:**
  1. D-01 below (cnpg target cluster) — blocks T04.
  2. Gitea package-capable PAT — blocks T03.
  3. DNS A-record for `vergabe-teilnahme.whywhynot.de → 92.205.130.254` — blocks T06.

**Decision D-01 — cnpg target for `vergabe_db`** (pending; required before T04):

| Option | Pros | Cons |
|--------|------|------|
| A. New dedicated cluster `vergabe-pg` | Matches the existing one-cluster-per-app pattern; clean blast radius | Resource cost grows linearly with apps; no actual "shared" cluster emerges |
| B. Add role+db to existing `net-kingdom-pg` (PG 16) | Reuses a healthy PG 16 cluster matching vergabe-teilnahme's minimum; lowest cost | Cluster name no longer reflects its content; coupling with netkingdom domain |
| C. Add role+db to existing `gitea-db` (PG 18) | Newest cluster image; same operator | Couples gitea ops with vergabe ops; name no longer reflects content |
| D. Provision a new general-purpose cluster `apps-pg` (PG 16+) | Establishes a real shared cluster that future apps adopt | Net-new infra; needs a `railiance-platform` task to own the cluster |

Recommendation: **D** (creates the "shared cluster" the user asked for as a real artifact rather than retrofitting an existing name). Recorded as a pending hub decision.

**Resolution (2026-05-18, bernd):** Option D.
Provision a new shared cnpg cluster `apps-pg` (PG 16, 1 instance, 10Gi initial) in namespace `databases`. cnpg `Cluster` CRs live in `railiance-platform` per ADR-003 (confirmed: `helm/gitea-db-cluster.yaml`). A coordination message has been sent to `railiance-platform` requesting the cluster. T04 below is now sequenced **after** that cluster reports healthy.

---

### T02 — Add Dockerfile and asset build to vergabe-teilnahme

```task
id: RAILIANCE-WP-0002-T02
status: done
priority: high
repo: vergabe-teilnahme
state_hub_task_id: "43ce85c4-0bdb-43c4-b0a5-81fa366800a6"
```

Open a companion task in the `vergabe-teilnahme` repo to add a production-ready container image definition.

Scope:

- Multi-stage `Dockerfile` at the repo root:
  - Stage 1: Node + Vite + Tailwind asset build (`npm ci` → `npm run build` → emits to `static/dist/`).
  - Stage 2: Python image, `uv sync --frozen`, copy app and built assets, run `manage.py collectstatic --noinput`.
  - Production WSGI/ASGI server (gunicorn) listening on `:8000`.
  - Whitenoise-style static asset serving (or document an alternative).
  - Non-root container user, sensible `WORKDIR`, healthcheck endpoint.
- `.dockerignore` excluding `node_modules`, `media/`, `__pycache__`, `.venv`, `static/dist` source, etc.
- Document the build command and the chosen image tag scheme in the vergabe-teilnahme README.

Coordination: this task crosses into `vergabe-teilnahme`. Track via a hub task and reference the PR/commit when complete.

**Done when:** `docker build` against the vergabe-teilnahme repo produces a runnable image that responds to a smoke request locally.

**Resolution (2026-05-18):** issue-facade was renamed to issue-core upstream. Rewired vergabe-teilnahme to depend on issue-core (`@ file:///home/worsch/issue-core`); the three Python imports were updated (`issue_tracker.*` → `issue_core.*`). All 20 aufgaben tests pass after the rewire. See vergabe-teilnahme commit `17f511f`.

Dockerfile delivered in vergabe-teilnahme repo:

- Three stages (assets / python-deps / runtime) with whitenoise static serving and `collectstatic` at build time.
- BuildKit named context resolves the `../issue-core` path dep: `docker build --build-context issue-core=/home/worsch/issue-core .`
- Non-root `app` user, `/health/` HEALTHCHECK, gunicorn on :8000.
- Smoke test: container reports `(healthy)`, `/health/` → 200.

**Original blocker (now resolved):** vergabe-teilnahme couldn't `uv sync` because `pyproject.toml` referenced `universal-issue-tracker @ file:///home/worsch/issue-facade`, but that directory was effectively empty (only `.claude/` remained).

```
error: Failed to generate package metadata for `universal-issue-tracker==0.1.0 @ directory+../issue-facade`
  Caused by: /home/worsch/issue-facade does not appear to be a Python project, as neither `pyproject.toml` nor `setup.py` are present.
```

Related candidate sources investigated:

- `/home/worsch/issue-core/` — a separate package (`issue-core`), not the missing `universal-issue-tracker` facade.
- `/home/worsch/markitect-main/_issue-tracking/issue-facade/` — does not exist.

At the time, this had to be resolved upstream in `vergabe-teilnahme` (or by restoring `issue-facade`) before T02 could produce a buildable image. Options considered:

1. **Restore `issue-facade`** — recover the missing source (git reflog, backup, or recreate from `issue-core`'s public surface).
2. **Repoint** `vergabe-teilnahme`'s dep to `issue-core` directly if that's the intended replacement, then update `uv.lock`. This is the path the resolution above took.
3. **Vendor** a minimal stub interface in `vergabe-teilnahme/vendor/` to unblock the container build; restore the real dep later.

Recommendation at the time: route to whoever owns `issue-facade` (likely a `markitect` or `personhood` domain task) and pause T02 until the dep resolves cleanly outside Docker.

---

### T03 — Build and publish image to Gitea container registry

```task
id: RAILIANCE-WP-0002-T03
status: done
priority: high
state_hub_task_id: "d0f8db8c-fad9-4e0b-a404-9e3a04cffb05"
```

Push the first production image of vergabe-teilnahme through the registry enabled in `RAILIANCE-WP-0001`.

Steps:

```bash
docker login gitea.coulomb.social
docker build -t gitea.coulomb.social/coulomb/vergabe-teilnahme: \
  /home/worsch/vergabe-teilnahme
docker push gitea.coulomb.social/coulomb/vergabe-teilnahme:
```

Choose a deterministic tag scheme (`` recommended). Record the exact image reference and digest used for the first deployment.

**Done when:** the image is fetchable from a disposable Kubernetes pod on the cluster.

**Done (2026-05-19):**

- Pushed `gitea.coulomb.social/coulomb/vergabe-teilnahme:483a4df` and `:latest` from the `vergabe-teilnahme:t02-smoke` build.
- Image digest: `sha256:e9bbceb35b0239c835d339295a0ae1d2d8b6d08c02a7b4e992c0ecd37de86d7a`.
- Token owner: `tegwick` (Bernd Worsch). Push namespace: `coulomb` org.
- Cluster-side pull verified via `kubectl run vt-pull-test` — pod reached `Running` in ~7s with no `imagePullSecret`. **Package is public by default**; T05 does not need to wire an imagePullSecret unless the package is later made private in the Gitea web UI.

---

### T04 — Provision PostgreSQL role and database

```task
id: RAILIANCE-WP-0002-T04
status: done
priority: high
state_hub_task_id: "925ace1c-f9bf-4644-bd0b-637705d72ea6"
```

Create a `vergabe` PostgreSQL role and `vergabe_db` database inside the new shared `apps-pg` cnpg cluster being provisioned by `railiance-platform` (per resolved decision D-01).

Blocked on: `apps-pg` cluster reaching `Cluster in healthy state` in namespace `databases`. Tracked by `railiance-platform` **`RAILIANCE-WP-0003`** (workstream `665b3b9b-608a-4be4-84b6-dcb8261ff57b`), proposed 2026-05-19 in response to the coordination thread.

Consumer recipe (from RAILIANCE-WP-0003 T06):

1. Label the `vergabe-teilnahme` namespace `railiance.io/postgres-client=apps-pg` so the platform's ingress NetworkPolicy permits the connection.
2. Create a credential Secret in that namespace for the `vergabe` role.
3. Create a cnpg `Database` CR pointing at cluster `apps-pg` with `ownerName: vergabe` and the credential Secret.
4. DSN: `postgresql://vergabe:...@apps-pg-rw.databases:5432/vergabe_db`, wired into the SOPS Helm values in T05.

Approach:

- Use a cnpg `Database` and `Role` resource — never an out-of-band `psql` change without recording it.
- The role owns only `vergabe_db`; no `CREATEDB`, no superuser, no grants on other databases.
- Capture the database DSN in the SOPS values file (T05).
- If the cluster needs to grow (more instances, more storage, backup inclusion), pause and add a follow-up `railiance-platform` task — do not edit cluster-level resources from this repo.

**Done when:** the new role can connect to `vergabe_db` from inside the cluster (`kubectl run --rm -it psql ...`) and is recorded in the SOPS values used by T05.

**Done (2026-05-19):**

Platform side (in `railiance-platform`, commit `017934d`):

- `helm/apps-pg-cluster.yaml` adds `spec.managed.roles[vergabe]` (CNPG 1.28 role lifecycle is cluster-scoped — no standalone Role CR).
- `helm/apps-pg-databases.yaml` (new) declares `Database/vergabe-db` with `name: vergabe_db`, `owner: vergabe`.
- Bootstrap credential `databases/vergabe-app-credentials` (`kubernetes.io/basic-auth`, `username: vergabe`, generated password).

Apps side (this workplan):

- Namespace `vergabe-teilnahme` created and labeled `railiance.io/postgres-client=apps-pg` (per docs/apps-pg.md opt-in contract).
- Credential Secret mirrored to `vergabe-teilnahme/vergabe-app-credentials` so the application pod can mount it. T05 will reference this Secret via `envFrom` or individual `valueFrom.secretKeyRef`.

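
The generated password in the mirrored Secret is base64-derived, so it can contain characters that are reserved in URLs (see the lesson recorded under T05). The following is a minimal sketch, in pure bash, of assembling a DSN with a percent-encoded password. The function names `urlencode` and `build_dsn` are hypothetical helpers for illustration, not part of the repo's `Makefile`, and the sketch assumes an ASCII password.

```bash
# Percent-encode every byte outside the RFC 3986 "unreserved" set.
# Hypothetical helper; assumes ASCII input such as a base64 password.
urlencode() {
  local s="$1" out="" c i
  local LC_ALL=C   # byte-wise iteration and plain ASCII ranges
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9.~_-]) out+="$c" ;;
      *) printf -v c '%%%02X' "'$c"; out+="$c" ;;
    esac
  done
  printf '%s' "$out"
}

# Assemble a PostgreSQL DSN; only the password is encoded here.
# Hypothetical helper; argument order: user, password, host, port, database.
build_dsn() {
  local user="$1" password="$2" host="$3" port="$4" db="$5"
  printf 'postgresql://%s:%s@%s:%s/%s\n' \
    "$user" "$(urlencode "$password")" "$host" "$port" "$db"
}
```

For example, `build_dsn vergabe "$PASSWORD" apps-pg-rw.databases 5432 vergabe_db` prints the DSN form used for T05, with `=`, `+`, and `/` in the password encoded as `%3D`, `%2B`, and `%2F`.
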
DSN for T05's SOPS Helm values:

```
postgresql://vergabe:${PASSWORD}@apps-pg-rw.databases:5432/vergabe_db
```

End-to-end verification: `kubectl exec` into a pod in the `vergabe-teilnahme` namespace and run psql with the mirrored credentials — returns `vergabe | vergabe_db | PostgreSQL 16.13`.

---

### T05 — Author Helm release for vergabe-teilnahme

```task
id: RAILIANCE-WP-0002-T05
status: done
priority: high
state_hub_task_id: "29ba6add-6f23-4053-acb9-9d7efa0b3881"
```

Add the chart selection (or bespoke chart) and SOPS-encrypted values that turn the published image into a Kubernetes Deployment.

Deliverables:

- Decide chart approach: Bitnami `common` template, a thin in-repo chart under `charts/vergabe-teilnahme/`, or raw manifests. Record the rationale in the workplan log.
- `helm/vergabe-teilnahme-values.sops.yaml` containing:
  - image repo + tag from T03,
  - env (`DJANGO_SETTINGS_MODULE=vergabe_teilnahme.settings.prod`, `ALLOWED_HOSTS`, `CSRF_TRUSTED_ORIGINS`),
  - `SECRET_KEY` (generated, never committed in clear),
  - DB DSN from T04,
  - resource requests/limits, single replica, readiness/liveness probes targeting the healthcheck endpoint introduced in T02.
- A dedicated namespace (`vergabe-teilnahme`).
- Optional: PVC for media uploads if Django `MEDIA_ROOT` is needed in v1; otherwise omit and document deferral.
- `Makefile` targets: `vergabe-deploy`, `vergabe-status`, `vergabe-migrate`.

**Done when:** `make vergabe-deploy` renders cleanly with `--dry-run` and produces no plaintext secrets in the rendered manifest source.

**Done (2026-05-19):**

- Chart approach: thin in-repo chart `charts/vergabe-teilnahme/` with plain (not SOPS-encrypted) values, because the only sensitive material (`SECRET_KEY`, `DATABASE_URL`) lives in K8s Secrets (cnpg's `vergabe-app-credentials` + the assembled `vergabe-teilnahme-env`), not in Helm values. `helm/vergabe-teilnahme-values.yaml` is therefore plain YAML — image tag, hostnames, no secrets.
- `make vergabe-dry-run` renders 2 objects (Deployment + Service); `grep -iE 'SECRET_KEY=|DATABASE_URL=|password'` returns empty.
- Deploy revision 2 is live: pod Running 1/1, probes green. The HTTP-probe `httpGet.httpHeaders[Host]` is set to the public hostname so Django's `ALLOWED_HOSTS` check passes for kube-probe (the v1 fix took one iteration — earlier attempts failed liveness with HTTP 400 because the probe sent `Host: 10.42.x.x:8000`).
- `Makefile` targets added: `vergabe-dry-run`, `vergabe-deploy`, `vergabe-ingress-deploy`, `vergabe-status`, `vergabe-migrate`, `vergabe-seed`, `vergabe-superuser`, `vergabe-logs`.

**Lesson recorded:** the base64-generated bootstrap password contains `=`, `+`, `/`; embedding it raw in `DATABASE_URL` confuses `dj-database-url`, which parses the `:password@host:5432/db` portion as a URL; the unescaped `=` led to a mis-parsed, 80-character DB name. The Secret now stores a URL-encoded password inside `DATABASE_URL` while the raw password remains in `vergabe-app-credentials.password`. Future apps should either URL-encode at Secret-build time or use individual env vars.

---

### T06 — DNS, ingress, and TLS for vergabe-teilnahme.whywhynot.de

```task
id: RAILIANCE-WP-0002-T06
status: done
priority: high
state_hub_task_id: "8e673ee6-5338-4eb5-8973-a1818b4dc7f5"
```

Make the application reachable behind a valid Let's Encrypt certificate.

Steps:

- Add an A record `vergabe-teilnahme.whywhynot.de` → cluster public IP (per T01). Use the DNS change path captured in T01.
- Add `manifests/vergabe-teilnahme-ingress.yaml` modeled on `gitea-ingress.yaml`:
  - `ingressClassName: traefik`,
  - annotation `cert-manager.io/cluster-issuer: letsencrypt-prod`,
  - `tls.secretName: vergabe-teilnahme-tls`,
  - host `vergabe-teilnahme.whywhynot.de`, backend the Service from T05.
- Wait for cert-manager to issue the cert.
- Validate `https://vergabe-teilnahme.whywhynot.de/healthz` (or equivalent) returns 200 with a trusted cert chain.

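
The ingress described in the steps above can be sketched as a small shell function that renders the manifest. The resource name, the Service name `vergabe-teilnahme`, and port `8000` are assumptions for illustration (the source only states that gunicorn listens on `:8000`); the committed `manifests/vergabe-teilnahme-ingress.yaml` may differ.

```bash
# Render an Ingress for Traefik + cert-manager (illustrative sketch).
# Args: host, Service name (assumed), Service port (assumed).
render_ingress() {
  local host="${1:-vergabe-teilnahme.whywhynot.de}"
  local service="${2:-vergabe-teilnahme}"   # assumed Service name
  local port="${3:-8000}"                   # assumed Service port
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vergabe-teilnahme
  namespace: vergabe-teilnahme
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - ${host}
      secretName: vergabe-teilnahme-tls
  rules:
    - host: ${host}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ${service}
                port:
                  number: ${port}
EOF
}
```

Piping the output through `kubectl apply --dry-run=client -f -` validates the rendered manifest client-side without changing the cluster.
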
Boundary note: ingress controller and cluster networking changes belong in `railiance-cluster`. This task only adds an `Ingress` resource that consumes the existing controller.

**Done when:** the public hostname serves the app over HTTPS and the certificate chain validates from outside the cluster.

**Progress (2026-05-18):**

- ✅ DNS A record live: `vergabe-teilnahme.whywhynot.de → 92.205.130.254` (TTL 3600; served authoritatively by `ns1126.ui-dns.*`).
- ✅ Traefik routing reaches the cluster: HTTP probe returns 404 — the expected pre-state because no Ingress rule matches the host yet.
- ✅ `manifests/vergabe-teilnahme-ingress.yaml` committed; Traefik + cert-manager letsencrypt-prod.
- ✅ `vergabe-teilnahme-tls` issued by cert-manager in ~35s (HTTP-01).
- ✅ External HTTPS probes: `/health/` returns 200 `{"status":"ok"}`; `/` redirects (302) to `/ausschreibungen/dashboard/` which renders `Übersicht` (German UI); `/admin/login/` shows the German Django admin login page. `curl` reports `SSL verify_result: 0` (trusted chain).

---

### T07 — Initial migration, seed, and smoke test

```task
id: RAILIANCE-WP-0002-T07
status: done
priority: high
state_hub_task_id: "be1decb5-b734-4312-b98d-20ed5299d02c"
```

Bring the app to a usable state in production.

Steps:

- Run `manage.py migrate` as a Kubernetes `Job` or one-shot `kubectl exec` against the running Deployment (record which).
- Run `manage.py seed` (the `make seed` target) — vergabe-teilnahme's idempotent seed.
- Create the first superuser via `manage.py createsuperuser`.
- Smoke checklist:
  - Login at `/admin/` succeeds.
  - The dashboard at `/` renders without errors.
  - Static assets (Tailwind build output) are served with correct content-type and 200 status.
  - HTMX partial requests succeed on at least one page.
  - A new `Ausschreibung` can be created and saved.

**Done when:** the smoke checklist passes and `kubectl logs` shows no unexpected errors.

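
The unauthenticated part of the smoke checklist can be scripted. The sketch below assumes `curl` is available and uses hypothetical helper names (`http_status`, `check`, `smoke`); the paths and expected status codes are the ones recorded in this workplan. The login, HTMX, and `Ausschreibung` checks need an authenticated session and are left out.

```bash
BASE_URL="${BASE_URL:-https://vergabe-teilnahme.whywhynot.de}"
CURL="${CURL:-curl}"   # overridable, e.g. to stub the HTTP client

# Print only the HTTP status code for a URL (redirects are not followed).
http_status() {
  "$CURL" -s -o /dev/null -w '%{http_code}' "$1"
}

# Compare the status of BASE_URL+path with the expected code.
check() {
  local path="$1" expected="$2" actual
  actual="$(http_status "${BASE_URL}${path}")"
  if [ "$actual" = "$expected" ]; then
    printf 'ok   %s -> %s\n' "$path" "$actual"
  else
    printf 'FAIL %s -> %s (expected %s)\n' "$path" "$actual" "$expected"
    return 1
  fi
}

# Run all no-auth checks; return non-zero if any of them failed.
smoke() {
  local rc=0
  check /health/ 200              || rc=1
  check / 302                     || rc=1
  check /admin/login/ 200         || rc=1
  check /static/dist/main.css 200 || rc=1
  return "$rc"
}
```

Calling `smoke` prints one line per endpoint and exits non-zero on the first mismatch it records, which makes it usable as a post-deploy gate.
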
**Done (2026-05-19, with deliberate deferrals):**

- ✅ `manage.py migrate` ran via `make vergabe-migrate` against the live deployment. All Django apps migrated (`accounts`, `core`, `ausschreibungen`, `lose`, `aufgaben`, `dokumente`, `preise`, `partner`, `bibliothek`, `marktbegleiter`, `nachbetrachtung`, `feedback`, plus framework apps).
- ❌ `make seed` (= `seed_dev`) deliberately **skipped**: it creates a hardcoded dev user `max.muster / testpass123`. Not prod-safe.
- ❌ `createsuperuser` deferred to the operator (interactive credential should not be minted through this session). Recipe in `docs/vergabe-teilnahme.md`.
- ✅ Smoke (no-auth surface):
  - `/health/` → `200 {"status":"ok"}`
  - `/` → `302 → /ausschreibungen/dashboard/` → `200`, page title `Übersicht`.
  - `/admin/login/` → `200`, title `Anmelden | Django-Systemverwaltung` (German Django admin).
  - Static assets: `/static/dist/main.css` 200 (Tailwind), `/static/admin/css/base.css` 200 (Django admin), `/static/vendor/{alpinejs,htmx}/...` referenced from the rendered HTML.
- ❌ Auth-required smoke (login, create Ausschreibung) deferred to the operator after `createsuperuser`.
- ✅ `kubectl logs` clean — only gunicorn boot + kube-probe 200s.

---

### T08 — Document handoff, runbook, and backup posture

```task
id: RAILIANCE-WP-0002-T08
status: done
priority: medium
state_hub_task_id: "594d3591-b61f-40c4-850c-efaa02c859ed"
```

Capture everything an on-call operator needs.

Deliverables in `docs/vergabe-teilnahme.md`:

- Registry image naming and tag scheme.
- Namespace, Deployment, Service, Ingress names.
- DB DSN handling (where secrets live, how to rotate).
- Restart, rollback (`helm rollback`), and migration commands.
- Backup posture: confirm whether the shared cnpg cluster's backup job includes `vergabe_db`; if not, open a `railiance-platform` follow-up.
- Pointer to the vergabe-teilnahme repo for app-level changes vs. `railiance-apps` for Helm/ingress changes.

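
A hedged sketch of the restart, promotion, and rollback commands the runbook is expected to cover. The chart path and values file follow T05; the `image.tag` value key and a Deployment named after the release are assumptions about the in-repo chart, and the function names are hypothetical. With `DRY_RUN=1` the commands are only printed.

```bash
RELEASE="${RELEASE:-vergabe-teilnahme}"
NAMESPACE="${NAMESPACE:-vergabe-teilnahme}"
CHART="${CHART:-charts/vergabe-teilnahme}"
VALUES="${VALUES:-helm/vergabe-teilnahme-values.yaml}"

# Execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Deploy a new image tag (assumes the chart reads .Values.image.tag).
promote() {
  local tag="${1:?usage: promote <image-tag>}"
  run helm upgrade --install "$RELEASE" "$CHART" \
    -n "$NAMESPACE" -f "$VALUES" --set "image.tag=${tag}"
}

# Roll the release back to an earlier Helm revision.
rollback() {
  local revision="${1:?usage: rollback <revision>}"
  run helm rollback "$RELEASE" "$revision" -n "$NAMESPACE"
}

# Restart the pods without changing the release
# (assumes the Deployment is named after the release).
restart() {
  run kubectl rollout restart "deployment/${RELEASE}" -n "$NAMESPACE"
}
```

Running `DRY_RUN=1 promote 483a4df` prints the `helm upgrade` invocation for the tag pushed in T03, which is a cheap way to review the command before executing it against the cluster.
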
**Done when:** a new operator can find vergabe-teilnahme, deploy a new image tag, and recover from a pod crash without reading this workplan.

**Done (2026-05-19):** `docs/vergabe-teilnahme.md` covers identity, secrets + rotation recipes (DB password and SECRET_KEY), day-to-day make targets, image promotion + rollback, troubleshooting (kube-probe Host header, DSN URL-encoding, cert-manager failure modes), open backup posture, and cross-references to the improvements backlog (`RAILIANCE-WP-0004`), the shared DB cluster doc, and the container registry doc.

## Completion Criteria

This workplan is complete when:

1. The vergabe-teilnahme image is published to `gitea.coulomb.social/coulomb/vergabe-teilnahme:`.
2. A dedicated PostgreSQL role and database serve the app from the shared cnpg cluster.
3. `helm/vergabe-teilnahme-values.sops.yaml` and the ingress manifest are committed; `make vergabe-deploy` is the single command to deploy.
4. `https://vergabe-teilnahme.whywhynot.de` serves the app over HTTPS with a valid Let's Encrypt cert.
5. Migrations + seed have run; the smoke checklist passes.
6. Runbook is in `docs/vergabe-teilnahme.md`.

## Notes

- This is the second application on `railiance01` (after Gitea). It intentionally adopts the same SOPS + Helm + Traefik + cert-manager pattern so the operator workflow stays consistent.
- v1 deliberately defers: 3-stage canary (Staged Promotion Lifecycle is still 0/7), SSO/Keycloak integration, S3-backed media storage, Celery + Redis workers (optional in the architecture blueprint), and multi-replica HA. Each can become its own follow-up workplan once the baseline runs.
- The `whywhynot.de` domain enters the Railiance stack for the first time here. Treat the DNS path established in T01/T06 as the reference for any future `*.whywhynot.de` workloads.