Railiance01 Kubernetes Deployment

This bundle deploys activity-core as an internal production service on the railiance01 K3s cluster. The API is unauthenticated, so it is exposed only as a ClusterIP service; publish it through an authenticated ingress only after the final host name and access policy have been chosen.

Layout

  • 00-namespace.yaml: namespace and shared labels
  • 10-infrastructure.yaml: PostgreSQL for app data, PostgreSQL for Temporal, NATS JetStream, Temporal, and Temporal UI
  • 20-runtime.yaml: migrate/sync jobs plus API, worker, and event-router
  • bootstrap-secrets.sh: idempotently creates generated Kubernetes secrets

The runtime image is tagged activity-core:railiance01-prod and must be imported into the railiance01 K3s containerd image store before the runtime manifests are applied.

20-runtime.yaml also projects the disabled Custodian-owned ops-service-inventory-probes.md ActivityDefinition and a non-secret actcore-ops-service-inventory ConfigMap snapshot. The source of truth for the inventory remains /home/worsch/the-custodian/ops/service-inventory.yml; update the ConfigMap projection from that file before enabling the probe schedule. OPS_HUB_KEY is created only as an empty Secret placeholder until the operator provisions the Inter-Hub ops-hub key.
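
Before enabling the probe schedule, it can help to confirm that the live ConfigMap still matches the source file. The following is a minimal sketch, not part of the bundle. It assumes the snapshot is stored under a data key named service-inventory.yml (a hypothetical key name; check 20-runtime.yaml for the real one), and the function name inventory_in_sync is illustrative.

```shell
#!/bin/sh
# Sketch: detect drift between the inventory source file and the live ConfigMap.
KUBECTL="${KUBECTL:-kubectl}"
# ASSUMPTION: the ConfigMap stores the snapshot under this data key.
INVENTORY_KEY="${INVENTORY_KEY:-service-inventory.yml}"

# Returns 0 when the ConfigMap content equals the source file, 1 on drift,
# and 2 when either side cannot be read.
inventory_in_sync() {
  src="$1"
  [ -r "$src" ] || { echo "cannot read $src" >&2; return 2; }
  # Escape dots so kubectl's jsonpath treats the key as a single field name.
  key=$(printf '%s' "$INVENTORY_KEY" | sed 's/\./\\./g')
  live=$("$KUBECTL" -n activity-core get configmap actcore-ops-service-inventory \
    -o "jsonpath={.data.$key}") || return 2
  # Command substitution strips trailing newlines on both sides,
  # so the comparison ignores a missing final newline.
  [ "$live" = "$(cat "$src")" ]
}
```

A possible use: `inventory_in_sync /home/worsch/the-custodian/ops/service-inventory.yml || echo "refresh the ConfigMap projection in 20-runtime.yaml first"`.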

The same runtime projection now includes the active daily-statehub-wsjf-triage.md ActivityDefinition plus its JSON output schema and a persistent working-memory volume mounted at /home/worsch/the-custodian/memory/working. Before trusting the daily 07:20 Europe/Berlin schedule, verify both runtime dependencies:

  • actcore-state-hub-bridge can reach the State Hub API through the node-local tunnel expected at 127.0.0.1:18000.
  • LLM_CONNECT_URL is set to an operator-approved llm-connect endpoint that can serve the custodian-triage-balanced profile.

If LLM_CONNECT_URL is missing or broken, report-sink instructions write a visible execution_failed diagnostic instead of silently producing no report.
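
The two dependency checks above can be sketched as a small pre-flight script. This is an illustration under assumptions, not part of the bundle: the function names are hypothetical, the URL check is only a shape check (it does not prove the endpoint serves the custodian-triage-balanced profile), and the tunnel check only confirms that something answers HTTP on the node-local port when run on railiance01.

```shell
#!/bin/sh
# Pre-flight sketch for the daily triage schedule.

# Succeeds only when the value looks like an http(s) URL with a non-empty remainder.
llm_connect_url_ok() {
  case "$1" in
    http://?*|https://?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Run on the railiance01 node. Without -f, curl exits 0 for any HTTP response,
# so this tests reachability of the tunnel port, not a specific API route.
state_hub_tunnel_ok() {
  curl -sS -o /dev/null --max-time 5 "http://127.0.0.1:18000/"
}
```

One way to feed the first check is to read the variable from the running workload, for example `llm_connect_url_ok "$(kubectl -n activity-core exec deploy/actcore-worker -- printenv LLM_CONNECT_URL)"`. Which deployment actually carries LLM_CONNECT_URL is an assumption here; adjust the target to match 20-runtime.yaml.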

Deploy

docker build -t activity-core:railiance01-prod .
docker save -o /tmp/activity-core-railiance01-prod.tar activity-core:railiance01-prod
scp /tmp/activity-core-railiance01-prod.tar railiance01:/tmp/
ssh railiance01 sudo k3s ctr images import /tmp/activity-core-railiance01-prod.tar
rsync -a k8s/railiance/ railiance01:activity-core/k8s/railiance/

# The remaining commands run in a shell on railiance01.
ssh railiance01
cd ~/activity-core
bash k8s/railiance/bootstrap-secrets.sh
kubectl apply -f k8s/railiance/10-infrastructure.yaml
kubectl -n activity-core wait --for=condition=ready pod -l app.kubernetes.io/name=actcore-app-db --timeout=180s
kubectl -n activity-core wait --for=condition=ready pod -l app.kubernetes.io/name=actcore-temporal-db --timeout=180s
kubectl -n activity-core wait --for=condition=ready pod -l app.kubernetes.io/name=actcore-nats --timeout=180s
kubectl -n activity-core rollout status deploy/actcore-temporal --timeout=300s

kubectl -n activity-core delete job actcore-migrate --ignore-not-found
kubectl apply -f k8s/railiance/20-runtime.yaml
kubectl -n activity-core wait --for=condition=complete job/actcore-migrate --timeout=180s
kubectl -n activity-core rollout status deploy/actcore-api --timeout=180s
kubectl -n activity-core rollout status deploy/actcore-worker --timeout=180s
kubectl -n activity-core rollout status deploy/actcore-event-router --timeout=180s
kubectl -n activity-core delete job actcore-sync --ignore-not-found
kubectl apply -f k8s/railiance/20-runtime.yaml
kubectl -n activity-core wait --for=condition=complete job/actcore-sync --timeout=180s
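
Kubernetes does not allow a Job's pod template to be changed in place, which is presumably why the sequence above deletes each Job before re-applying the manifest. The delete, apply, and wait steps can be wrapped in a helper. This is a minimal sketch: rerun_job and the KUBECTL and MANIFEST variables are illustrative names, not part of the bundle.

```shell
#!/bin/sh
KUBECTL="${KUBECTL:-kubectl}"
NS="activity-core"
MANIFEST="${MANIFEST:-k8s/railiance/20-runtime.yaml}"

# Delete a previous Job run, re-apply the manifest that defines it,
# and block until the new run completes. Stops at the first failing step.
rerun_job() {
  job="$1"
  timeout="${2:-180s}"
  "$KUBECTL" -n "$NS" delete job "$job" --ignore-not-found &&
  "$KUBECTL" apply -f "$MANIFEST" &&
  "$KUBECTL" -n "$NS" wait --for=condition=complete "job/$job" --timeout="$timeout"
}
```

With this helper, the two Job steps become `rerun_job actcore-migrate` before the rollout checks and `rerun_job actcore-sync` after them.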

Verify

kubectl -n activity-core exec deploy/actcore-api -- \
  python -c "import urllib.request; print(urllib.request.urlopen('http://localhost:8010/health').read().decode())"

kubectl -n activity-core get pods
kubectl -n activity-core get svc
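
Right after a rollout the API may need a few seconds before /health answers, so a single probe can fail spuriously. The sketch below retries the same in-pod probe shown above. The function names and the retry defaults are assumptions chosen for illustration.

```shell
#!/bin/sh
KUBECTL="${KUBECTL:-kubectl}"

# One probe: run the Python one-liner inside the API pod.
# urlopen raises on connection errors and non-2xx responses,
# so a failed probe yields a non-zero exit status.
probe_health() {
  "$KUBECTL" -n activity-core exec deploy/actcore-api -- \
    python -c "import urllib.request; print(urllib.request.urlopen('http://localhost:8010/health').read().decode())"
}

# Retry the probe up to $1 times (default 10), sleeping $2 seconds
# (default 3) between attempts.
wait_for_health() {
  attempts="${1:-10}"
  delay="${2:-3}"
  i=1
  while :; do
    if probe_health; then
      return 0
    fi
    if [ "$i" -ge "$attempts" ]; then
      break
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "health check failed after $attempts attempts" >&2
  return 1
}
```

Calling `wait_for_health 20 5` would allow up to 20 attempts with 5 seconds between them.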