Propose RAILIANCE-WP-0002: vergabe-teilnahme on railiance01
8-task plan to deploy vergabe-teilnahme as a Helm release at vergabe-teilnahme.whywhynot.de with image from gitea.coulomb.social and a dedicated role on the shared cnpg cluster.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -0,0 +1,383 @@
---
id: RAILIANCE-WP-0002
type: workplan
title: "Establish vergabe-teilnahme as an Application on railiance01"
domain: railiance
repo: railiance-apps
status: proposed
owner: railiance
topic_slug: railiance
created: "2026-05-18"
updated: "2026-05-18"
planning_priority: high
planning_order: 2
---

# Establish vergabe-teilnahme as an Application on railiance01

## Goal

Deploy the `vergabe-teilnahme` Django application as a governed Helm release on
the production cluster node `railiance01` (`92.205.130.254`), reachable at
`https://vergabe-teilnahme.whywhynot.de`, with its container image published
through the Railiance Gitea OCI registry and its PostgreSQL data living in the
shared cnpg cluster.

This establishes vergabe-teilnahme as the second application (after Gitea)
running on the S5 layer of the Railiance OAS Stack and exercises the freshly
enabled container registry from `RAILIANCE-WP-0001` end-to-end.

## Placement in the Railiance Tooling Set

This workplan lives in `railiance-apps` because vergabe-teilnahme is an S5
application workload. The deployment surface added by this workplan is:

- `helm/vergabe-teilnahme-values.sops.yaml` — SOPS-encrypted Helm values
  (`DJANGO_SECRET_KEY`, DB DSN, etc.).
- `releases/vergabe-teilnahme/` — chart selection / overlay (Bitnami generic
  chart or hand-rolled chart, decided in T05).
- `manifests/vergabe-teilnahme-ingress.yaml` — Traefik ingress + cert-manager
  TLS for `vergabe-teilnahme.whywhynot.de`.
- `Makefile` targets: `vergabe-deploy`, `vergabe-status`, `vergabe-migrate`.

Cross-repo coordination required:

| Concern | Owner repo | Coordination |
|---------|------------|--------------|
| Application Helm release | `railiance-apps` | This workplan |
| Containerization (Dockerfile, entrypoint, asset build) | `vergabe-teilnahme` | This workplan opens a task in that repo |
| PostgreSQL role + database in shared cnpg cluster | `railiance-platform` | Capability request via hub |
| DNS A record for `vergabe-teilnahme.whywhynot.de` | DNS owner of `whywhynot.de` | Out-of-band, captured in T06 |
| Ingress controller / cluster routing primitives | `railiance-cluster` | Reuse — no changes expected |
| cert-manager ClusterIssuer `letsencrypt-prod` | `railiance-platform` | Reuse — no changes expected |

## Current Evidence

- `vergabe-teilnahme/CLAUDE.md`: Django 6.x · uv · Tailwind CSS v4 (Vite) ·
  HTMX 2.x · Alpine.js 3.x · PostgreSQL 16+ (psycopg3). German UI.
- `vergabe-teilnahme/` currently has no `Dockerfile`. `docker-compose.dev.yml`
  documents the local Postgres but isn't started when the shared
  `infra-postgres-1` is up.
- `railiance-apps/Makefile` deploys Gitea via `helm/gitea-values.sops.yaml`;
  the same SOPS + Helm pattern is the template for this workplan.
- `RAILIANCE-WP-0001` confirmed `https://gitea.coulomb.social/v2/` returns
  the OCI registry auth challenge. Image naming convention established:
  `gitea.coulomb.social/<org>/<image>:<tag>`.
- `manifests/gitea-ingress.yaml` confirms the ingress recipe:
  `ingressClassName: traefik` + annotation
  `cert-manager.io/cluster-issuer: letsencrypt-prod`.
- Domain `whywhynot.de` has no prior references in any railiance repo —
  DNS and a fresh Let's Encrypt cert will need to be set up.
- Live cnpg state: `gitea-db` runs in the `databases` namespace. T01
  confirms whether a single shared cluster exists for app DBs or whether
  one must be designated.

## Safety Contract

- Do not commit decrypted Helm values, the Django `SECRET_KEY`, DB
  credentials, or any other secret material.
- Use a dedicated PostgreSQL role with privileges scoped to a single
  database; never reuse the gitea role or grant superuser.
- No public exposure until cert-manager has successfully issued a TLS
  certificate for `vergabe-teilnahme.whywhynot.de`.
- Do not start with `DEBUG=True`; production settings only.
- Preserve Gitea behavior: the new ingress must not conflict with the
  `gitea` ingress on the cluster's default ingress controller.
- If DB role provisioning requires changes to the shared cnpg cluster
  resource limits, pause and create a `railiance-platform` task.
- If DNS for `whywhynot.de` is owned outside this operator's control,
  pause T06 until DNS ownership is confirmed.
- Start fresh — no migration of data from any existing dev database in
  this workplan.

## Target State

- `vergabe-teilnahme:<tag>` image is built and pushed to
  `gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag>`.
- A `vergabe` PostgreSQL role and `vergabe_db` database exist in the
  shared cnpg cluster (single role, single DB, no cross-app grants).
- A Helm release `vergabe-teilnahme` is deployed in a dedicated
  namespace with a single replica, a Service, a PVC for media (if any),
  and the necessary secrets sourced from SOPS values.
- Django `migrate` and `make seed` have run successfully against the
  shared cnpg database.
- `https://vergabe-teilnahme.whywhynot.de` serves the Django app behind
  a valid Let's Encrypt certificate.
- Login as a superuser succeeds; the dashboard renders; static assets
  are served correctly (Tailwind/Vite build pipeline integrated into the
  image).
- Runbook, registry naming, DB credentials handling, and rollback steps
  are documented for the next operator.

## Tasks

### T01 — Inventory the deployment substrate

```task
id: RAILIANCE-WP-0002-T01
status: todo
priority: high
```

Confirm the pre-conditions before any code is written.

Checks:

- Identify the shared cnpg cluster intended for app databases (name,
  namespace, version, current databases/roles, available capacity).
- Verify `gitea.coulomb.social/v2/` still returns an OCI registry auth
  challenge and that the operator has a package-capable token.
- Verify cert-manager `letsencrypt-prod` ClusterIssuer is healthy and
  has successfully issued at least one production certificate recently
  (`gitea-tls` is a known good example).
- Confirm DNS ownership and the change path for `whywhynot.de` — record
  who owns the zone and how an A record is added.
- Confirm Traefik is the active ingress controller and note the public
  IP/hostname an A record must point at.

**Done when:** the workplan records (a) the cnpg cluster to use,
(b) the operator's path to a Gitea package token, (c) the DNS change
path for `whywhynot.de`, and (d) any pre-condition gaps.

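The T01 checks could be scripted roughly as below. This is a read-only sketch, not existing tooling: the function names are hypothetical, and it assumes `kubectl` and `curl` are available with a kubeconfig that points at `railiance01`.

```bash
# Hypothetical helpers for the T01 inventory; none of them change cluster state.

# List CloudNativePG clusters in every namespace.
inventory_cnpg() {
  kubectl get clusters.postgresql.cnpg.io --all-namespaces
}

# Print the HTTP status of the registry base endpoint. An OCI registry
# that requires authentication is expected to answer 401 here.
registry_status() {
  curl -s -o /dev/null -w '%{http_code}\n' https://gitea.coulomb.social/v2/
}

# Show the ClusterIssuer and every Certificate with their READY columns.
inventory_certs() {
  kubectl get clusterissuer letsencrypt-prod
  kubectl get certificates --all-namespaces
}

# Show the ingress classes and the Services that may expose a public IP.
inventory_ingress() {
  kubectl get ingressclass
  kubectl get services --all-namespaces
}
```

Running each function and pasting the relevant output into the workplan log would cover items (a) and (d) of the done condition; the token path and DNS ownership still need a human answer.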
---

### T02 — Add Dockerfile and asset build to vergabe-teilnahme

```task
id: RAILIANCE-WP-0002-T02
status: todo
priority: high
repo: vergabe-teilnahme
```

Open a companion task in the `vergabe-teilnahme` repo to add a
production-ready container image definition.

Scope:

- Multi-stage `Dockerfile` at the repo root:
  - Stage 1: Node + Vite + Tailwind asset build (`npm ci` →
    `npm run build` → emits to `static/dist/`).
  - Stage 2: Python image, `uv sync --frozen`, copy app and built
    assets, run `manage.py collectstatic --noinput`.
- Production WSGI/ASGI server (gunicorn) listening on `:8000`.
- Whitenoise-style static asset serving (or document an alternative).
- Non-root container user, sensible `WORKDIR`, healthcheck endpoint.
- `.dockerignore` excluding `node_modules`, `media/`, `__pycache__`,
  `.venv`, any locally built `static/dist`, etc., so that only the
  Stage 1 build output ends up in the image.
- Document the build command and the chosen image tag scheme in the
  vergabe-teilnahme README.

Coordination: this task crosses into `vergabe-teilnahme`. Track via a
hub task and reference the PR/commit when complete.

**Done when:** `docker build` against the vergabe-teilnahme repo produces
a runnable image that responds to a smoke request locally.

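The local smoke request for T02 might look like the sketch below. The function name, the `/healthz` path, and the env-file approach are assumptions for illustration; the real health endpoint is whatever T02 introduces, and `ALLOWED_HOSTS` must accept `127.0.0.1` for this check to return success.

```bash
# Hypothetical local smoke test for the image built in T02.
# $1: image reference, $2: env file with KEY=value lines (kept out of git).
smoke_test_image() {
  local image="$1" env_file="$2" cid result=1 i=0
  # Start the container detached, publishing :8000 on loopback only.
  cid="$(docker run -d --rm --env-file "$env_file" \
    -p 127.0.0.1:8000:8000 "$image")" || return 1
  # Poll the assumed health endpoint for up to 30 attempts.
  while [ "$i" -lt 30 ]; do
    if curl -fsS -o /dev/null http://127.0.0.1:8000/healthz; then
      result=0
      break
    fi
    sleep 1
    i=$((i + 1))
  done
  # Always stop the container, then report whether the probe succeeded.
  docker stop "$cid" >/dev/null
  return "$result"
}
```

A call such as `smoke_test_image vergabe-teilnahme:dev .env.smoke` returns 0 once the endpoint answers and 1 if it never does.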
---

### T03 — Build and publish image to Gitea container registry

```task
id: RAILIANCE-WP-0002-T03
status: todo
priority: high
```

Push the first production image of vergabe-teilnahme through the
registry enabled in `RAILIANCE-WP-0001`.

Steps:

```bash
docker login gitea.coulomb.social
docker build -t gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag> \
  /home/worsch/vergabe-teilnahme
docker push gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag>
```

Choose a deterministic tag scheme (`<git-sha>` recommended). Record the
exact image reference and digest used for the first deployment.

**Done when:** the image is fetchable from a disposable Kubernetes pod
on the cluster.

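One way to make the `<git-sha>` tag scheme and the pull check concrete is sketched here. The helper names and the pod name `pull-check` are hypothetical, and the pull check assumes the image contains a `true` binary and that the node can authenticate to the registry (a private package would additionally need an image pull secret).

```bash
REGISTRY_REPO="gitea.coulomb.social/coulomb/vergabe-teilnahme"

# Compose the full image reference from a commit id used as the tag.
image_ref() {
  printf '%s:%s\n' "$REGISTRY_REPO" "$1"
}

# Build and push the checkout at $1, tagged with its HEAD commit,
# then print the registry digest so it can be recorded.
publish_image() {
  local src="$1" sha ref
  sha="$(git -C "$src" rev-parse --short=12 HEAD)" || return 1
  ref="$(image_ref "$sha")"
  docker build -t "$ref" "$src" || return 1
  docker push "$ref" || return 1
  docker image inspect --format '{{index .RepoDigests 0}}' "$ref"
}

# Start a throwaway pod from the image; it succeeds only if the
# cluster can pull the reference given in $1.
verify_pull() {
  kubectl run pull-check --rm -i --restart=Never \
    --image="$1" --command -- true
}
```

For example, `image_ref 0123abcd4567` prints `gitea.coulomb.social/coulomb/vergabe-teilnahme:0123abcd4567`, which is the value to carry into the Helm values in T05.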
---

### T04 — Provision PostgreSQL role and database

```task
id: RAILIANCE-WP-0002-T04
status: todo
priority: high
```

Create a `vergabe` PostgreSQL role and `vergabe_db` database inside the
shared cnpg cluster identified in T01.

Approach:

- Use declarative cnpg resources (a `Database` resource plus a managed
  role declared on the `Cluster`) or a recorded SQL bootstrap; never
  make an out-of-band `psql` change without recording it.
- The role owns only `vergabe_db`; no `CREATEDB`, no superuser, no grants
  on other databases.
- Capture the database DSN in the SOPS values file (T05).
- Coordinate with `railiance-platform` if any cluster-level change is
  needed (resource limits, backup inclusion, monitoring).

**Done when:** the new role can connect to `vergabe_db` from inside the
cluster (`kubectl run --rm -it psql ...`) and is recorded in the SOPS
values used by T05.

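If the SQL bootstrap route is chosen, the statements could be kept in the repo as a small script like this sketch. The function names are hypothetical, and the role password is deliberately not set here; it is assumed to be supplied separately through a secret rather than through version control.

```bash
# Emit the SQL for a least-privilege role that owns exactly one database.
bootstrap_sql() {
  cat <<'SQL'
CREATE ROLE vergabe LOGIN NOSUPERUSER NOCREATEDB NOCREATEROLE;
CREATE DATABASE vergabe_db OWNER vergabe;
REVOKE ALL ON DATABASE vergabe_db FROM PUBLIC;
SQL
}

# Emit queries that show the role attributes and the database owner,
# for recording the result in the workplan log.
verify_sql() {
  cat <<'SQL'
SELECT rolname, rolsuper, rolcreatedb, rolcreaterole
  FROM pg_roles WHERE rolname = 'vergabe';
SELECT datname, pg_get_userbyid(datdba) AS owner
  FROM pg_database WHERE datname = 'vergabe_db';
SQL
}
```

Piping `bootstrap_sql` into an administrative `psql` session, and committing the script alongside the workplan, keeps the change recorded as the approach above requires.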
---

### T05 — Author Helm release for vergabe-teilnahme

```task
id: RAILIANCE-WP-0002-T05
status: todo
priority: high
```

Add the chart selection (or bespoke chart) and SOPS-encrypted values
that turn the published image into a Kubernetes Deployment.

Deliverables:

- Decide chart approach: Bitnami `common` template, a thin in-repo
  chart under `charts/vergabe-teilnahme/`, or raw manifests. Record the
  rationale in the workplan log.
- `helm/vergabe-teilnahme-values.sops.yaml` containing:
  - image repo + tag from T03,
  - env (`DJANGO_SETTINGS_MODULE=vergabe_teilnahme.settings.prod`,
    `ALLOWED_HOSTS`, `CSRF_TRUSTED_ORIGINS`),
  - `SECRET_KEY` (generated, never committed in clear),
  - DB DSN from T04,
  - resource requests/limits, single replica, readiness/liveness probes
    targeting the healthcheck endpoint introduced in T02.
- A dedicated namespace (`vergabe-teilnahme`).
- Optional: PVC for media uploads if Django `MEDIA_ROOT` is needed in
  v1; otherwise omit and document deferral.
- `Makefile` targets: `vergabe-deploy`, `vergabe-status`,
  `vergabe-migrate`.

**Done when:** `make vergabe-deploy` renders cleanly with `--dry-run`
and produces no plaintext secrets in the rendered manifest source.

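The body of a `vergabe-deploy` target could follow the sketch below, which decrypts the values into a pipe so that no plaintext file is written to disk. The function name and the default chart path are assumptions (the chart approach is still to be decided in this task); `sops --decrypt` and Helm's `--values -` for standard input are existing options.

```bash
RELEASE="vergabe-teilnahme"
NAMESPACE="vergabe-teilnahme"
VALUES="helm/vergabe-teilnahme-values.sops.yaml"
# Assumption: the thin in-repo chart option; override CHART otherwise.
CHART="${CHART:-charts/vergabe-teilnahme}"

# Decrypt the SOPS values and hand them to Helm on stdin.
# Extra arguments (for example --dry-run) are passed through to helm.
vergabe_deploy() {
  sops --decrypt "$VALUES" |
    helm upgrade --install "$RELEASE" "$CHART" \
      --namespace "$NAMESPACE" --create-namespace \
      --values - "$@"
}
```

Note that `vergabe_deploy --dry-run` prints the rendered manifests, including any Secret data, to the terminal, so that output should not be saved or pasted into tickets.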
---

### T06 — DNS, ingress, and TLS for vergabe-teilnahme.whywhynot.de

```task
id: RAILIANCE-WP-0002-T06
status: todo
priority: high
```

Make the application reachable behind a valid Let's Encrypt certificate.

Steps:

- Add an A record `vergabe-teilnahme.whywhynot.de` →
  cluster public IP (per T01). Use the DNS change path captured in T01.
- Add `manifests/vergabe-teilnahme-ingress.yaml` modeled on
  `gitea-ingress.yaml`:
  - `ingressClassName: traefik`,
  - annotation `cert-manager.io/cluster-issuer: letsencrypt-prod`,
  - `tls.secretName: vergabe-teilnahme-tls`,
  - host `vergabe-teilnahme.whywhynot.de`, backend the Service from T05.
- Wait for cert-manager to issue the cert.
- Validate `https://vergabe-teilnahme.whywhynot.de/healthz` (or
  equivalent) returns 200 with a trusted cert chain.

Boundary note: ingress controller and cluster networking changes
belong in `railiance-cluster`. This task only adds an `Ingress`
resource that consumes the existing controller.

**Done when:** the public hostname serves the app over HTTPS and the
certificate chain validates from outside the cluster.

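As a sketch of the ingress recipe, the manifest could be rendered by a small helper like the one below. The Service name and port (`vergabe-teilnahme`, `8000`), the Ingress name, and the `/healthz` path are assumptions that depend on T02 and T05; the Certificate name follows the TLS secret name, which is how cert-manager names certificates it creates from Ingress annotations.

```bash
# Print an Ingress manifest for the public hostname.
render_ingress() {
  local host="vergabe-teilnahme.whywhynot.de"
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: vergabe-teilnahme
  namespace: vergabe-teilnahme
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - ${host}
      secretName: vergabe-teilnahme-tls
  rules:
    - host: ${host}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: vergabe-teilnahme   # assumed Service name from T05
                port:
                  number: 8000            # assumed Service port
EOF
}

# Wait for the certificate to become Ready, then probe the public URL.
check_tls() {
  kubectl -n vergabe-teilnahme wait --for=condition=Ready \
    certificate/vergabe-teilnahme-tls --timeout=300s &&
    curl -fsS -o /dev/null -w '%{http_code}\n' \
      https://vergabe-teilnahme.whywhynot.de/healthz
}
```

`render_ingress > manifests/vergabe-teilnahme-ingress.yaml` would produce the file to review and commit; `check_tls` is best run from a machine outside the cluster so the certificate chain is validated the way a visitor's client would.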
---

### T07 — Initial migration, seed, and smoke test

```task
id: RAILIANCE-WP-0002-T07
status: todo
priority: high
```

Bring the app to a usable state in production.

Steps:

- Run `manage.py migrate` as a Kubernetes `Job` or one-shot
  `kubectl exec` against the running Deployment (record which).
- Run `manage.py seed` (the `make seed` target) — vergabe-teilnahme's
  idempotent seed.
- Create the first superuser via `manage.py createsuperuser`.
- Smoke checklist:
  - Login at `/admin/` succeeds.
  - The dashboard at `/` renders without errors.
  - Static assets (Tailwind build output) are served with correct
    content-type and 200 status.
  - HTMX partial requests succeed on at least one page.
  - A new `Ausschreibung` can be created and saved.

**Done when:** the smoke checklist passes and `kubectl logs` shows no
unexpected errors.

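For the one-shot `kubectl exec` variant, the steps could be wrapped as in this sketch. The Deployment name and the availability of `python` on the container's PATH are assumptions that depend on T02 and T05; `check --deploy` is Django's built-in deployment check and is added here only as an extra sanity step.

```bash
# Run a manage.py subcommand inside the running Deployment.
manage() {
  kubectl -n vergabe-teilnahme exec deploy/vergabe-teilnahme -- \
    python manage.py "$@"
}

# Apply migrations, run the idempotent seed, then Django's deploy checks.
# Stops at the first step that fails.
run_initial_setup() {
  manage migrate --noinput &&
    manage seed &&
    manage check --deploy
}
```

Creating the superuser is interactive, so it would be run separately with a terminal attached (`kubectl exec -it ... -- python manage.py createsuperuser`).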
---

### T08 — Document handoff, runbook, and backup posture

```task
id: RAILIANCE-WP-0002-T08
status: todo
priority: medium
```

Capture everything an on-call operator needs.

Deliverables in `docs/vergabe-teilnahme.md`:

- Registry image naming and tag scheme.
- Namespace, Deployment, Service, Ingress names.
- DB DSN handling (where secrets live, how to rotate).
- Restart, rollback (`helm rollback`), and migration commands.
- Backup posture: confirm whether the shared cnpg cluster's backup job
  includes `vergabe_db`; if not, open a `railiance-platform` follow-up.
- Pointer to the vergabe-teilnahme repo for app-level changes vs.
  `railiance-apps` for Helm/ingress changes.

**Done when:** a new operator can find vergabe-teilnahme, deploy a new
image tag, and recover from a pod crash without reading this workplan.

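The restart and rollback entries of the runbook might be captured as the following sketch. The function names are hypothetical, and it assumes the Deployment carries the same name as the Helm release.

```bash
NS="vergabe-teilnahme"
RELEASE="vergabe-teilnahme"

# List recent release revisions to pick a rollback target.
show_history() {
  helm -n "$NS" history "$RELEASE" --max 10
}

# Roll back to revision $1 and wait until the resources are ready.
rollback_to() {
  helm -n "$NS" rollback "$RELEASE" "$1" --wait
}

# Restart the pods without changing the release, then watch the rollout.
restart_app() {
  kubectl -n "$NS" rollout restart "deployment/$RELEASE" &&
    kubectl -n "$NS" rollout status "deployment/$RELEASE" --timeout=120s
}
```

A Helm rollback only reverts Kubernetes resources; database migrations applied by the newer image stay in place, so the runbook should say whether the previous image tolerates the newer schema.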
## Completion Criteria

This workplan is complete when:

1. The vergabe-teilnahme image is published to
   `gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag>`.
2. A dedicated PostgreSQL role and database serve the app from the
   shared cnpg cluster.
3. `helm/vergabe-teilnahme-values.sops.yaml` and the ingress manifest
   are committed; `make vergabe-deploy` is the single command to deploy.
4. `https://vergabe-teilnahme.whywhynot.de` serves the app over HTTPS
   with a valid Let's Encrypt cert.
5. Migrations + seed have run; the smoke checklist passes.
6. Runbook is in `docs/vergabe-teilnahme.md`.

## Notes

- This is the second application on `railiance01` (after Gitea). It
  intentionally adopts the same SOPS + Helm + Traefik + cert-manager
  pattern so the operator workflow stays consistent.
- v1 deliberately defers: 3-stage canary (Staged Promotion Lifecycle is
  still 0/7), SSO/Keycloak integration, S3-backed media storage,
  Celery + Redis workers (optional in the architecture blueprint), and
  multi-replica HA. Each can become its own follow-up workplan once the
  baseline runs.
- The `whywhynot.de` domain enters the Railiance stack for the first
  time here. Treat the DNS path established in T01/T06 as the reference
  for any future `*.whywhynot.de` workloads.