Propose RAILIANCE-WP-0002: vergabe-teilnahme on railiance01
8-task plan to deploy vergabe-teilnahme as a Helm release at vergabe-teilnahme.whywhynot.de with image from gitea.coulomb.social and a dedicated role on the shared cnpg cluster.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -0,0 +1,383 @@
---
id: RAILIANCE-WP-0002
type: workplan
title: "Establish vergabe-teilnahme as an Application on railiance01"
domain: railiance
repo: railiance-apps
status: proposed
owner: railiance
topic_slug: railiance
created: "2026-05-18"
updated: "2026-05-18"
planning_priority: high
planning_order: 2
---

# Establish vergabe-teilnahme as an Application on railiance01

## Goal

Deploy the `vergabe-teilnahme` Django application as a governed Helm release on
the production cluster node `railiance01` (`92.205.130.254`), reachable at
`https://vergabe-teilnahme.whywhynot.de`, with its container image published
through the Railiance Gitea OCI registry and its PostgreSQL data living in the
shared cnpg cluster.

This establishes vergabe-teilnahme as the second application (after Gitea)
running on the S5 layer of the Railiance OAS Stack and exercises the freshly
enabled container registry from `RAILIANCE-WP-0001` end-to-end.

## Placement in the Railiance Tooling Set

This workplan lives in `railiance-apps` because vergabe-teilnahme is an S5
application workload. The deployment surface added by this workplan is:

- `helm/vergabe-teilnahme-values.sops.yaml` — SOPS-encrypted Helm values
  (`DJANGO_SECRET_KEY`, DB DSN, etc.).
- `releases/vergabe-teilnahme/` — chart selection / overlay (Bitnami generic
  chart or hand-rolled chart, decided in T05).
- `manifests/vergabe-teilnahme-ingress.yaml` — Traefik ingress + cert-manager
  TLS for `vergabe-teilnahme.whywhynot.de`.
- `Makefile` targets: `vergabe-deploy`, `vergabe-status`, `vergabe-migrate`.

Cross-repo coordination required:

| Concern | Owner repo | Coordination |
|---------|------------|--------------|
| Application Helm release | `railiance-apps` | This workplan |
| Containerization (Dockerfile, entrypoint, asset build) | `vergabe-teilnahme` | This workplan opens a task in that repo |
| PostgreSQL role + database in shared cnpg cluster | `railiance-platform` | Capability request via hub |
| DNS A record for `vergabe-teilnahme.whywhynot.de` | DNS owner of `whywhynot.de` | Out-of-band, captured in T06 |
| Ingress controller / cluster routing primitives | `railiance-cluster` | Reuse — no changes expected |
| cert-manager ClusterIssuer `letsencrypt-prod` | `railiance-platform` | Reuse — no changes expected |

## Current Evidence

- `vergabe-teilnahme/CLAUDE.md`: Django 6.x · uv · Tailwind CSS v4 (Vite) ·
  HTMX 2.x · Alpine.js 3.x · PostgreSQL 16+ (psycopg3). German UI.
- `vergabe-teilnahme/` currently has no `Dockerfile`. `docker-compose.dev.yml`
  documents the local Postgres but isn't started when the shared
  `infra-postgres-1` is up.
- `railiance-apps/Makefile` deploys Gitea via `helm/gitea-values.sops.yaml`;
  the same SOPS + Helm pattern is the template for this workplan.
- `RAILIANCE-WP-0001` confirmed `https://gitea.coulomb.social/v2/` returns
  the OCI registry auth challenge. Image naming convention established:
  `gitea.coulomb.social/<org>/<image>:<tag>`.
- `manifests/gitea-ingress.yaml` confirms the ingress recipe:
  `ingressClassName: traefik` + annotation
  `cert-manager.io/cluster-issuer: letsencrypt-prod`.
- Domain `whywhynot.de` has no prior references in any railiance repo —
  DNS and a fresh Let's Encrypt cert will need to be set up.
- Live cnpg state: `gitea-db` runs in the `databases` namespace. T01
  confirms whether a single shared cluster exists for app DBs or whether
  one must be designated.

## Safety Contract

- Do not commit decrypted Helm values, the Django `SECRET_KEY`, DB
  credentials, or any other secret material.
- Use a dedicated PostgreSQL role with privileges scoped to a single
  database; never reuse the gitea role or grant superuser.
- No public exposure until cert-manager has successfully issued a TLS
  certificate for `vergabe-teilnahme.whywhynot.de`.
- Do not start with `DEBUG=True`; production settings only.
- Preserve Gitea behavior: the new ingress must not conflict with the
  `gitea` ingress on the cluster's default ingress controller.
- If DB role provisioning requires changes to the shared cnpg cluster
  resource limits, pause and create a `railiance-platform` task.
- If DNS for `whywhynot.de` is owned outside this operator's control,
  pause T06 until DNS ownership is confirmed.
- Start fresh — no migration of data from any existing dev database in
  this workplan.

## Target State

- `vergabe-teilnahme:<tag>` image is built and pushed to
  `gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag>`.
- A `vergabe` PostgreSQL role and `vergabe_db` database exist in the
  shared cnpg cluster (single role, single DB, no cross-app grants).
- A Helm release `vergabe-teilnahme` is deployed in a dedicated
  namespace with a single replica, a Service, a PVC for media (if any),
  and the necessary secrets sourced from SOPS values.
- Django `migrate` and `make seed` have run successfully against the
  shared cnpg database.
- `https://vergabe-teilnahme.whywhynot.de` serves the Django app behind
  a valid Let's Encrypt certificate.
- Login as a superuser succeeds; the dashboard renders; static assets
  are served correctly (Tailwind/Vite build pipeline integrated into the
  image).
- Runbook, registry naming, DB credentials handling, and rollback steps
  are documented for the next operator.

## Tasks

### T01 — Inventory the deployment substrate

```task
id: RAILIANCE-WP-0002-T01
status: todo
priority: high
```

Confirm the pre-conditions before any code is written.

Checks:

- Identify the shared cnpg cluster intended for app databases (name,
  namespace, version, current databases/roles, available capacity).
- Verify `gitea.coulomb.social/v2/` still returns an OCI registry auth
  challenge and that the operator has a package-capable token.
- Verify cert-manager `letsencrypt-prod` ClusterIssuer is healthy and
  has successfully issued at least one production certificate recently
  (`gitea-tls` is a known good example).
- Confirm DNS ownership and the change path for `whywhynot.de` — record
  who owns the zone and how an A record is added.
- Confirm Traefik is the active ingress controller and note the public
  IP/hostname an A record must point at.

**Done when:** the workplan records (a) the cnpg cluster to use,
(b) the operator's path to a Gitea package token, (c) the DNS change
path for `whywhynot.de`, and (d) any pre-condition gaps.

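The T01 checks could be scripted roughly as follows. This is a read-only sketch, not an agreed procedure: the function names are hypothetical, and it assumes `curl` and `kubectl` are available with a kubeconfig pointing at the cluster. A 401 status plus a `WWW-Authenticate` header is taken here as the shape of the registry auth challenge.

```bash
# Sketch of the T01 inventory checks; nothing here changes cluster state.

# Pure helper: succeeds when an HTTP response head looks like a registry
# auth challenge (401 status line plus a WWW-Authenticate header).
is_registry_challenge() {
  local response="$1"
  printf '%s\n' "$response" | head -n 1 | grep -Eq '^HTTP/[0-9.]+ 401' &&
    printf '%s\n' "$response" | grep -iq '^www-authenticate:'
}

# Fetch only the response headers of the registry root and classify them.
check_registry() {
  is_registry_challenge "$(curl -sI https://gitea.coulomb.social/v2/)"
}

# List the resources the remaining checks ask about.
inventory_cluster() {
  kubectl get clusters.postgresql.cnpg.io --all-namespaces
  kubectl get clusterissuer letsencrypt-prod -o wide
  kubectl get certificate --all-namespaces
  kubectl get ingressclass
}
```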
---

### T02 — Add Dockerfile and asset build to vergabe-teilnahme

```task
id: RAILIANCE-WP-0002-T02
status: todo
priority: high
repo: vergabe-teilnahme
```

Open a companion task in the `vergabe-teilnahme` repo to add a
production-ready container image definition.

Scope:

- Multi-stage `Dockerfile` at the repo root:
  - Stage 1: Node + Vite + Tailwind asset build (`npm ci` →
    `npm run build` → emits to `static/dist/`).
  - Stage 2: Python image, `uv sync --frozen`, copy app and built
    assets, run `manage.py collectstatic --noinput`.
- Production WSGI/ASGI server (gunicorn) listening on `:8000`.
- Whitenoise-style static asset serving (or document an alternative).
- Non-root container user, sensible `WORKDIR`, healthcheck endpoint.
- `.dockerignore` excluding `node_modules`, `media/`, `__pycache__`,
  `.venv`, any locally built `static/dist` output, etc.
- Document the build command and the chosen image tag scheme in the
  vergabe-teilnahme README.

Coordination: this task crosses into `vergabe-teilnahme`. Track via a
hub task and reference the PR/commit when complete.

**Done when:** `docker build` against the vergabe-teilnahme repo produces
a runnable image that responds to a smoke request locally.

---

### T03 — Build and publish image to Gitea container registry

```task
id: RAILIANCE-WP-0002-T03
status: todo
priority: high
```

Push the first production image of vergabe-teilnahme through the
registry enabled in `RAILIANCE-WP-0001`.

Steps:

```bash
docker login gitea.coulomb.social
docker build -t gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag> \
  /home/worsch/vergabe-teilnahme
docker push gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag>
```

Choose a deterministic tag scheme (`<git-sha>` recommended). Record the
exact image reference and digest used for the first deployment.

**Done when:** the image is fetchable from a disposable Kubernetes pod
on the cluster.

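One way to make the `<git-sha>` tag scheme and the digest recording in T03 concrete is sketched below. The helper names and the 12-character short SHA are illustrative assumptions, not decisions from this workplan; `docker login` is assumed to have happened already.

```bash
REGISTRY="gitea.coulomb.social"
IMAGE_PATH="coulomb/vergabe-teilnahme"

# Pure helper: build the full image reference for a given tag.
image_ref() {
  printf '%s/%s:%s\n' "$REGISTRY" "$IMAGE_PATH" "$1"
}

# Derive the tag from the commit currently checked out in the source tree.
git_sha_tag() {
  git -C "$1" rev-parse --short=12 HEAD
}

# Build, push, and print the registry digest so it can be recorded.
# Argument: path to the vergabe-teilnahme checkout.
build_and_push() {
  local src="$1" ref
  ref="$(image_ref "$(git_sha_tag "$src")")" || return 1
  docker build -t "$ref" "$src" &&
    docker push "$ref" &&
    docker image inspect --format '{{index .RepoDigests 0}}' "$ref"
}
```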
---

### T04 — Provision PostgreSQL role and database

```task
id: RAILIANCE-WP-0002-T04
status: todo
priority: high
```

Create a `vergabe` PostgreSQL role and `vergabe_db` database inside the
shared cnpg cluster identified in T01.

Approach:

- Use cnpg's declarative mechanisms (a `Database` resource plus a
  `managed.roles` entry on the `Cluster`, or SQL bootstrap); never make
  an out-of-band `psql` change without recording it.
- The role owns only `vergabe_db`; no `CREATEDB`, no superuser, no grants
  on other databases.
- Capture the database DSN in the SOPS values file (T05).
- Coordinate with `railiance-platform` if any cluster-level change is
  needed (resource limits, backup inclusion, monitoring).

**Done when:** the new role can connect to `vergabe_db` from inside the
cluster (`kubectl run --rm -it psql ...`) and is recorded in the SOPS
values used by T05.

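The DSN that T04 hands to T05 could be assembled as sketched below. It relies on cnpg exposing a read-write Service named `<cluster>-rw`; the cluster name in the usage line is a placeholder until T01 settles it, and the helper assumes the password contains no characters that need URL-encoding. The login check is likewise only a sketch: the pod name and the `postgres:16` image are assumptions, and passing the DSN through `--env` makes it visible in the pod spec, so it suits a throwaway check only.

Usage would look like `cnpg_dsn <cluster> databases vergabe "$PASSWORD" vergabe_db`.

```bash
# Pure helper: DSN for the read-write Service of a cnpg cluster.
# Arguments: cluster name, namespace, role, password, database.
cnpg_dsn() {
  local cluster="$1" namespace="$2" role="$3" password="$4" db="$5"
  printf 'postgresql://%s:%s@%s-rw.%s.svc:5432/%s\n' \
    "$role" "$password" "$cluster" "$namespace" "$db"
}

# Start a disposable pod, connect with the given DSN, and print the
# connected role. The pod is removed again when the command exits.
check_db_login() {
  kubectl run psql-check --rm -i --restart=Never --image=postgres:16 \
    --env="DSN=$1" -- sh -c 'psql "$DSN" -c "select current_user"'
}
```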
---

### T05 — Author Helm release for vergabe-teilnahme

```task
id: RAILIANCE-WP-0002-T05
status: todo
priority: high
```

Add the chart selection (or bespoke chart) and SOPS-encrypted values
that turn the published image into a Kubernetes Deployment.

Deliverables:

- Decide chart approach: Bitnami `common` template, a thin in-repo
  chart under `charts/vergabe-teilnahme/`, or raw manifests. Record the
  rationale in the workplan log.
- `helm/vergabe-teilnahme-values.sops.yaml` containing:
  - image repo + tag from T03,
  - env (`DJANGO_SETTINGS_MODULE=vergabe_teilnahme.settings.prod`,
    `ALLOWED_HOSTS`, `CSRF_TRUSTED_ORIGINS`),
  - `SECRET_KEY` (generated, never committed in clear),
  - DB DSN from T04,
  - resource requests/limits, single replica, readiness/liveness probes
    targeting the healthcheck endpoint introduced in T02.
- A dedicated namespace (`vergabe-teilnahme`).
- Optional: PVC for media uploads if Django `MEDIA_ROOT` is needed in
  v1; otherwise omit and document deferral.
- `Makefile` targets: `vergabe-deploy`, `vergabe-status`,
  `vergabe-migrate`.

**Done when:** `make vergabe-deploy` renders cleanly with `--dry-run`
and produces no plaintext secrets in the rendered manifest source.

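A `vergabe-deploy` target could wrap something like the sketch below. It assumes the in-repo chart option (`charts/vergabe-teilnahme`) is chosen, which T05 has not decided yet, and the function names are hypothetical. The values are decrypted to stdout and piped into Helm, so no plaintext file is written; the guard relies on SOPS-encrypted YAML carrying a top-level `sops:` metadata key. Extra arguments are passed through, so `vergabe_deploy --dry-run` covers the Done-when check.

```bash
RELEASE="vergabe-teilnahme"
NAMESPACE="vergabe-teilnahme"
VALUES="helm/vergabe-teilnahme-values.sops.yaml"
CHART="${CHART:-charts/vergabe-teilnahme}"  # assumption: in-repo chart option

# Pure helper: succeeds when the file carries SOPS metadata at top level.
is_sops_encrypted() {
  grep -q '^sops:' "$1"
}

# Decrypt the values in memory and hand them to Helm on stdin.
vergabe_deploy() {
  is_sops_encrypted "$VALUES" || {
    echo "refusing to deploy: $VALUES is not SOPS-encrypted" >&2
    return 1
  }
  sops --decrypt "$VALUES" |
    helm upgrade --install "$RELEASE" "$CHART" \
      --namespace "$NAMESPACE" --create-namespace \
      --values - "$@"
}
```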
---

### T06 — DNS, ingress, and TLS for vergabe-teilnahme.whywhynot.de

```task
id: RAILIANCE-WP-0002-T06
status: todo
priority: high
```

Make the application reachable behind a valid Let's Encrypt certificate.

Steps:

- Add an A record `vergabe-teilnahme.whywhynot.de` →
  cluster public IP (per T01). Use the DNS change path captured in T01.
- Add `manifests/vergabe-teilnahme-ingress.yaml` modeled on
  `gitea-ingress.yaml`:
  - `ingressClassName: traefik`,
  - annotation `cert-manager.io/cluster-issuer: letsencrypt-prod`,
  - `tls.secretName: vergabe-teilnahme-tls`,
  - host `vergabe-teilnahme.whywhynot.de`, backend the Service from T05.
- Wait for cert-manager to issue the cert.
- Validate `https://vergabe-teilnahme.whywhynot.de/healthz` (or
  equivalent) returns 200 with a trusted cert chain.

Boundary note: ingress controller and cluster networking changes
belong in `railiance-cluster`. This task only adds an `Ingress`
resource that consumes the existing controller.

**Done when:** the public hostname serves the app over HTTPS and the
certificate chain validates from outside the cluster.

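The waiting and validation steps in T06 could be checked from an operator machine as sketched below. The function names are hypothetical. The sketch assumes the Ingress lives in the `vergabe-teilnahme` namespace, that cert-manager names the `Certificate` after the TLS secret (`vergabe-teilnahme-tls`), and that `/healthz` is the path chosen in T02; `dig` and `curl` must be installed locally.

```bash
HOST="vergabe-teilnahme.whywhynot.de"
NAMESPACE="vergabe-teilnahme"

# Succeeds when the host's A record resolves to the expected IP.
dns_matches() {
  local expected="$1"
  dig +short A "$HOST" | grep -Fxq "$expected"
}

# Block until cert-manager marks the certificate Ready, or time out.
wait_for_cert() {
  kubectl --namespace "$NAMESPACE" wait --for=condition=Ready \
    certificate/vergabe-teilnahme-tls --timeout=300s
}

# curl validates the chain against the system trust store by default;
# --fail turns HTTP status codes of 400 and above into a non-zero exit.
check_https() {
  curl --fail --silent --show-error --output /dev/null "https://$HOST/healthz"
}
```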
---

### T07 — Initial migration, seed, and smoke test

```task
id: RAILIANCE-WP-0002-T07
status: todo
priority: high
```

Bring the app to a usable state in production.

Steps:

- Run `manage.py migrate` as a Kubernetes `Job` or one-shot
  `kubectl exec` against the running Deployment (record which).
- Run `manage.py seed` (the `make seed` target) — vergabe-teilnahme's
  idempotent seed.
- Create the first superuser via `manage.py createsuperuser`.
- Smoke checklist:
  - Login at `/admin/` succeeds.
  - The dashboard at `/` renders without errors.
  - Static assets (Tailwind build output) are served with correct
    content-type and 200 status.
  - HTMX partial requests succeed on at least one page.
  - A new `Ausschreibung` can be created and saved.

**Done when:** the smoke checklist passes and `kubectl logs` shows no
unexpected errors.

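For the `kubectl exec` variant of the T07 steps, a thin wrapper such as the following might back the `vergabe-migrate` target. It assumes the Deployment is named `vergabe-teilnahme` and that `python` on the container's `PATH` is the interpreter with the app's dependencies; both depend on choices made in T02 and T05, and the function names are hypothetical.

```bash
NAMESPACE="vergabe-teilnahme"
DEPLOYMENT="deploy/vergabe-teilnahme"  # assumption: named after the release

# Run a Django management command inside the running Deployment.
manage() {
  kubectl --namespace "$NAMESPACE" exec "$DEPLOYMENT" -- \
    python manage.py "$@"
}

# Apply migrations first; seed only if they succeeded.
first_rollout() {
  manage migrate --noinput && manage seed
}

# createsuperuser prompts for input, so it needs an interactive TTY.
create_superuser() {
  kubectl --namespace "$NAMESPACE" exec -it "$DEPLOYMENT" -- \
    python manage.py createsuperuser
}
```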
---

### T08 — Document handoff, runbook, and backup posture

```task
id: RAILIANCE-WP-0002-T08
status: todo
priority: medium
```

Capture everything an on-call operator needs.

Deliverables in `docs/vergabe-teilnahme.md`:

- Registry image naming and tag scheme.
- Namespace, Deployment, Service, Ingress names.
- DB DSN handling (where secrets live, how to rotate).
- Restart, rollback (`helm rollback`), and migration commands.
- Backup posture: confirm whether the shared cnpg cluster's backup job
  includes `vergabe_db`; if not, open a `railiance-platform` follow-up.
- Pointer to the vergabe-teilnahme repo for app-level changes vs.
  `railiance-apps` for Helm/ingress changes.

**Done when:** a new operator can find vergabe-teilnahme, deploy a new
image tag, and recover from a pod crash without reading this workplan.

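The restart and rollback commands the T08 runbook should record might look like the sketch below. Release, namespace, and Deployment names are assumed to all be `vergabe-teilnahme`, and the function names are hypothetical. A typical sequence would be `vergabe_history` to find the last good revision, then `vergabe_rollback <revision>`.

```bash
RELEASE="vergabe-teilnahme"
NAMESPACE="vergabe-teilnahme"

# Show past revisions so the operator can pick a rollback target.
vergabe_history() {
  helm history "$RELEASE" --namespace "$NAMESPACE"
}

# Roll back to a specific revision and wait until resources are ready.
vergabe_rollback() {
  helm rollback "$RELEASE" "$1" --namespace "$NAMESPACE" --wait
}

# Recreate the pods without changing the release.
vergabe_restart() {
  kubectl --namespace "$NAMESPACE" rollout restart "deployment/$RELEASE"
}
```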
## Completion Criteria

This workplan is complete when:

1. The vergabe-teilnahme image is published to
   `gitea.coulomb.social/coulomb/vergabe-teilnahme:<tag>`.
2. A dedicated PostgreSQL role and database serve the app from the
   shared cnpg cluster.
3. `helm/vergabe-teilnahme-values.sops.yaml` and the ingress manifest
   are committed; `make vergabe-deploy` is the single command to deploy.
4. `https://vergabe-teilnahme.whywhynot.de` serves the app over HTTPS
   with a valid Let's Encrypt cert.
5. Migrations + seed have run; the smoke checklist passes.
6. Runbook is in `docs/vergabe-teilnahme.md`.

## Notes

- This is the second application on `railiance01` (after Gitea). It
  intentionally adopts the same SOPS + Helm + Traefik + cert-manager
  pattern so the operator workflow stays consistent.
- v1 deliberately defers: 3-stage canary (Staged Promotion Lifecycle is
  still 0/7), SSO/Keycloak integration, S3-backed media storage,
  Celery + Redis workers (optional in the architecture blueprint), and
  multi-replica HA. Each can become its own follow-up workplan once the
  baseline runs.
- The `whywhynot.de` domain enters the Railiance stack for the first
  time here. Treat the DNS path established in T01/T06 as the reference
  for any future `*.whywhynot.de` workloads.