Railiance01 Kubernetes Deployment
This bundle deploys activity-core as an internal production service on the railiance01 K3s cluster. The API is unauthenticated, so it is exposed only as a ClusterIP service; publish it through an authenticated ingress only after the final host name and access policy have been chosen.
Layout
00-namespace.yaml: namespace and shared labels
10-infrastructure.yaml: PostgreSQL for app data, PostgreSQL for Temporal, NATS JetStream, Temporal, and Temporal UI
20-runtime.yaml: migrate/sync jobs plus API, worker, and event-router
bootstrap-secrets.sh: idempotently creates generated Kubernetes secrets
The runtime image tag is activity-core:railiance01-prod and is expected to be
loaded into the railiance01 K3s containerd image store.
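Because the image is loaded into containerd directly rather than pulled from a registry, it can help to confirm it is present before applying the manifests. The following is a minimal sketch, not part of the bundle: `image_listed` is a hypothetical helper that reads image references on stdin and matches the tag with or without a registry prefix (images imported from `docker save` typically show up as `docker.io/library/<name>:<tag>`).

```shell
# Sketch: check that the runtime image is visible to K3s' containerd.
# image_listed is a hypothetical helper, not shipped with this bundle.
IMAGE=${IMAGE:-activity-core:railiance01-prod}

# Reads image references on stdin, one per line. Succeeds if any line equals
# $1 or ends with "/$1" (that is, $1 behind a registry/repository prefix).
image_listed() {
  want=$1
  while IFS= read -r ref; do
    case "$ref" in
      "$want"|*/"$want") return 0 ;;
    esac
  done
  return 1
}
```

A possible invocation from the build machine is `ssh railiance01 sudo k3s ctr images ls -q | image_listed "$IMAGE" && echo present`.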
20-runtime.yaml also projects the disabled Custodian-owned
ops-service-inventory-probes.md ActivityDefinition and a non-secret
actcore-ops-service-inventory ConfigMap snapshot. The source of truth for the
inventory remains /home/worsch/the-custodian/ops/service-inventory.yml; update
the ConfigMap projection from that file before enabling the probe schedule.
OPS_HUB_KEY is created only as an empty Secret placeholder until the operator
provisions the Inter-Hub ops-hub key.
The same runtime projection now includes the active
daily-statehub-wsjf-triage.md ActivityDefinition plus its JSON output schema
and a persistent working-memory volume mounted at
/home/worsch/the-custodian/memory/working. Before trusting the daily 07:20
Europe/Berlin schedule, verify both runtime dependencies:
actcore-state-hub-bridge can reach the State Hub API through the node-local tunnel expected at 127.0.0.1:18000.
LLM_CONNECT_URL is set to an operator-approved llm-connect endpoint that can serve the custodian-triage-balanced profile.
If LLM_CONNECT_URL is missing or broken, report-sink instructions write a
visible execution_failed diagnostic instead of silently producing no report.
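The two dependency checks above could be scripted roughly as follows. This is a sketch under assumptions: it is meant to run on the railiance01 node with bash and coreutils `timeout` available, and the workload that carries LLM_CONNECT_URL is passed in by the caller because this README does not name it. It only confirms that something listens on the tunnel port and that the variable holds an http(s) URL; it does not prove that the endpoint serves the custodian-triage-balanced profile.

```shell
#!/usr/bin/env bash
# Sketch of a pre-flight check for the daily triage schedule.
# All function names here are hypothetical helpers, not part of the bundle.
NS=${NS:-activity-core}

# Succeeds if a TCP connection to host $1, port $2 opens within 5 seconds.
# Relies on bash's /dev/tcp redirection and coreutils' timeout.
tcp_reachable() {
  timeout 5 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Succeeds if $1 is a non-empty http:// or https:// URL.
looks_like_url() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# $1: workload expected to carry LLM_CONNECT_URL (an assumption supplied by
# the operator, for example deploy/actcore-worker).
preflight() {
  tcp_reachable 127.0.0.1 18000 ||
    { echo "no listener on 127.0.0.1:18000" >&2; return 1; }
  url=$(kubectl -n "$NS" exec "$1" -- printenv LLM_CONNECT_URL) || url=""
  looks_like_url "$url" ||
    { echo "LLM_CONNECT_URL missing or malformed in $1" >&2; return 1; }
  echo "preflight ok"
}
```

For example, `preflight deploy/actcore-worker` prints `preflight ok` when both checks pass and returns non-zero with a message on stderr otherwise.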
Deploy
docker build -t activity-core:railiance01-prod .
docker save -o /tmp/activity-core-railiance01-prod.tar activity-core:railiance01-prod
scp /tmp/activity-core-railiance01-prod.tar railiance01:/tmp/
ssh railiance01 sudo k3s ctr images import /tmp/activity-core-railiance01-prod.tar
rsync -a k8s/railiance/ railiance01:activity-core/k8s/railiance/
ssh railiance01
cd ~/activity-core
bash k8s/railiance/bootstrap-secrets.sh
kubectl apply -f k8s/railiance/10-infrastructure.yaml
kubectl -n activity-core wait --for=condition=ready pod -l app.kubernetes.io/name=actcore-app-db --timeout=180s
kubectl -n activity-core wait --for=condition=ready pod -l app.kubernetes.io/name=actcore-temporal-db --timeout=180s
kubectl -n activity-core wait --for=condition=ready pod -l app.kubernetes.io/name=actcore-nats --timeout=180s
kubectl -n activity-core rollout status deploy/actcore-temporal --timeout=300s
kubectl -n activity-core delete job actcore-migrate --ignore-not-found
kubectl apply -f k8s/railiance/20-runtime.yaml
kubectl -n activity-core wait --for=condition=complete job/actcore-migrate --timeout=180s
kubectl -n activity-core rollout status deploy/actcore-api --timeout=180s
kubectl -n activity-core rollout status deploy/actcore-worker --timeout=180s
kubectl -n activity-core rollout status deploy/actcore-event-router --timeout=180s
kubectl -n activity-core delete job actcore-sync --ignore-not-found
kubectl apply -f k8s/railiance/20-runtime.yaml
kubectl -n activity-core wait --for=condition=complete job/actcore-sync --timeout=180s
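The steps above delete each Job before re-applying the manifest. A Job's pod template is immutable and a completed Job does not run again, so deleting it first lets `kubectl apply` recreate it and start a fresh run. The repeated delete/apply/wait sequence could be wrapped in a small helper; `rerun_job` below is a hypothetical name and the sketch only restates the commands already listed.

```shell
# Sketch: delete, re-apply, and wait for a one-shot Job.
# rerun_job is a hypothetical helper, not part of the bundle.
NS=${NS:-activity-core}

# $1: Job name, $2: manifest path, $3: optional wait timeout (default 180s).
rerun_job() {
  job=$1
  manifest=$2
  wait_timeout=${3:-180s}
  kubectl -n "$NS" delete job "$job" --ignore-not-found &&
    kubectl apply -f "$manifest" &&
    kubectl -n "$NS" wait --for=condition=complete "job/$job" --timeout="$wait_timeout"
}
```

With this helper, the sync step becomes `rerun_job actcore-sync k8s/railiance/20-runtime.yaml`. The ordering relative to the rollout checks in the steps above still has to be preserved by the caller.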
Verify
kubectl -n activity-core exec deploy/actcore-api -- \
  python -c "import urllib.request; print(urllib.request.urlopen('http://localhost:8010/health').read().decode())"
kubectl -n activity-core get pods
kubectl -n activity-core get svc
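To avoid scanning the pod list by eye, the STATUS column can be filtered. This is a minimal sketch with a hypothetical helper name; it assumes the default `kubectl get pods --no-headers` column order (NAME, READY, STATUS, ...) and treats Completed as healthy because the migrate and sync Jobs end in that state.

```shell
# Sketch: list pods that are neither Running nor Completed.
# unhealthy_pods is a hypothetical helper, not part of the bundle.

# Reads `kubectl get pods --no-headers` output on stdin and prints the name
# of every pod whose STATUS column is neither Running nor Completed.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 }'
}
```

A possible invocation is `kubectl -n activity-core get pods --no-headers | unhealthy_pods`; empty output means no pod is in an unexpected state. A Running pod is not necessarily Ready, so the READY column is still worth a glance.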