OpenBao - Platform Secrets Service

Chart: openbao/openbao
Chart version: 0.28.2
App version: v2.5.3
Namespace: openbao
Managed by: railiance-platform (S3)
Workplan: RAIL-PL-WP-0002
Initial target: Railiance01 (92.205.62.239)


Architecture

S5 workloads / operators
  -> openbao.openbao.svc.cluster.local:8200
       -> openbao-0
            -> integrated Raft storage on local-path PVC
            -> audit storage PVC mounted at /openbao/audit
  • OpenBao is the canonical Railiance S3 secrets service.
  • SOPS/age remains the Git-at-rest bootstrap mechanism.
  • The first Railiance01 deployment is single-replica Raft, not true HA.
  • Public ingress is disabled. Operators use kubectl exec or port-forwarding.
  • TLS is disabled inside the pod listener for this internal-only bootstrap. Add cert-manager-backed internal TLS before exposing OpenBao beyond cluster-local traffic.
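As an illustration of the port-forwarding path, the sketch below derives the client address from the service coordinates shown in the diagram. The function name `openbao_addr` and the choice of local port are hypothetical; the scheme is plain `http` only because TLS is disabled on the pod listener in this bootstrap.

```shell
# Hypothetical helper: print the address a bao client should use.
# The scheme is http because the pod listener has TLS disabled in this bootstrap.
openbao_addr() {
  mode="${1:-cluster}"   # "cluster" (in-cluster DNS) or "local" (via port-forward)
  port="${2:-8200}"
  case "$mode" in
    cluster) printf 'http://openbao.openbao.svc.cluster.local:%s\n' "$port" ;;
    local)   printf 'http://127.0.0.1:%s\n' "$port" ;;
    *)       echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Typical operator session (run manually; not executed by this snippet):
#   kubectl port-forward -n openbao svc/openbao 8200:8200 &
#   export BAO_ADDR="$(openbao_addr local)"
#   bao status
```

Once internal TLS is added, the scheme in this helper would need to change to `https`.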

Deployment

The official OpenBao project recommends the Helm chart for Kubernetes deployments and advises running Helm with --dry-run before every install or upgrade.

From a host with kubeconfig access:

make openbao-dry-run
make openbao-deploy
make openbao-status

On Railiance01 directly:

cd ~/railiance-platform
sudo env KUBECONFIG=/etc/rancher/k3s/k3s.yaml make openbao-dry-run
sudo env KUBECONFIG=/etc/rancher/k3s/k3s.yaml make openbao-deploy
sudo env KUBECONFIG=/etc/rancher/k3s/k3s.yaml make openbao-status

If the repo is not present on Railiance01 yet, copy only the non-secret values file and run Helm directly:

scp helm/openbao-values.yaml tegwick@92.205.62.239:/tmp/openbao-values.yaml
ssh tegwick@92.205.62.239 \
  'sudo env KUBECONFIG=/etc/rancher/k3s/k3s.yaml helm upgrade --install openbao openbao/openbao \
     --version 0.28.2 \
     --namespace openbao \
     --create-namespace \
     -f /tmp/openbao-values.yaml \
     --dry-run'

Repeat without --dry-run to deploy.

Verification

kubectl get pods,svc,pvc -n openbao -o wide
kubectl exec -n openbao openbao-0 -- bao status

Expected immediately after install:

  • openbao-0 is Running.
  • openbao, openbao-active, openbao-internal, and openbao-ui services exist as cluster-internal services.
  • data and audit PVCs are Bound.
  • bao status reports Initialized: false and Sealed: true.

That state is intentional until the bootstrap ceremony is completed. bao status may return exit code 2 while sealed; this is expected for the pre-init state and does not by itself indicate a deployment failure.
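A small sketch of how a script might interpret the exit code, assuming the convention OpenBao inherits from Vault (0 for unsealed, 2 for sealed, anything else an error). The function name `describe_status_rc` is hypothetical.

```shell
# Map the exit code of `bao status` to a label.
# 0 = unsealed, 2 = sealed, anything else = error (for example, pod unreachable).
describe_status_rc() {
  case "$1" in
    0) echo "unsealed" ;;
    2) echo "sealed" ;;
    *) echo "error" ;;
  esac
}

# Usage against the live pod (commented out so the snippet has no side effects):
#   kubectl exec -n openbao openbao-0 -- bao status >/dev/null 2>&1
#   describe_status_rc "$?"
```

Treating 2 as a distinct, non-fatal result keeps CI or Makefile wrappers from flagging a healthy but sealed pod as a failed deployment.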

Bootstrap Ceremony

Do not initialize OpenBao in a casual shell session. Initialization emits the unseal keys and initial root token. Treat this as a break-glass event.

Setup Operator And King Credential

The initial accountable setup operator/contact is tegwick (bernd.worsch@gmail.com), with Gitea identity tegwick. This identity can assemble early infrastructure, receive notifications, and operate day-to-day Git/Gitea workflows, but it is not the desired long-term platform root of trust.

The actual platform-root target is a separate king credential created through the NetKingdom bootstrap path before OpenBao becomes live secret custody. Email may receive notifications, but Gitea, Git, State Hub, chat, tickets, shell history, and email must not store or transfer OpenBao unseal keys, root tokens, private keys, OTP seeds, recovery codes, or screenshots of secret output.

The canonical custody policy is in net-kingdom/docs/platform-root-custody.md. The preferred production posture is independent two-of-three custody. Temporary single-operator king custody is feasible for pre-production bootstrap only when second-factor protection, offline recovery storage, and a low-friction upgrade path to additional custodians are in place.

Pre-flight checks:

make openbao-status
make openbao-verify

Proceed only when:

  • openbao-0 is Running.
  • data and audit PVCs are Bound.
  • bao status reports Initialized: false and Sealed: true.
  • Railiance01 host/cluster backup posture is understood for this maintenance window.
  • the guided NetKingdom bootstrap path exists for creating or importing the king credential.
  • the OpenBao custody mode is recorded: preferred independent custody, or an explicit temporary single-custodian king bootstrap exception.
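The seal and init conditions in the checklist above can be checked mechanically. The sketch below is one possible gate, not the implementation behind `make openbao-verify`; it assumes the default table output of `bao status` (one key and value per line) and the function name `is_pre_init_state` is hypothetical.

```shell
# Succeeds only when the status text shows Initialized=false and Sealed=true.
# Reads `bao status` table output on stdin; a trailing colon on the key is tolerated.
is_pre_init_state() {
  awk '
    { key = $1; sub(/:$/, "", key) }
    key == "Initialized" { init = $2 }
    key == "Sealed"      { sealed = $2 }
    END { exit !(init == "false" && sealed == "true") }
  '
}

# Usage (commented out; requires cluster access):
#   kubectl exec -n openbao openbao-0 -- bao status | is_pre_init_state \
#     && echo "pre-init state confirmed"
```

Because the check reads the status text from a pipe, the exit code 2 that `bao status` returns while sealed does not abort the gate. The backup, king-credential, and custody-mode conditions remain manual checks.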

Recommended ceremony:

  1. Confirm the Railiance01 backup posture first.

  2. Prepare the king credential and approved escrow holders or offline single-custody locations.

  3. Run initialization once:

    kubectl exec -n openbao openbao-0 -- \
      bao operator init -key-shares=3 -key-threshold=2
    
  4. Give each unseal share to its escrow owner or approved king-custody location through an out-of-band channel.

  5. Unseal with two shares. Run the command once per share; each run prompts for one key, so allocate a terminal with -it:

    kubectl exec -it -n openbao openbao-0 -- bao operator unseal
    
  6. Log in with the initial root token only long enough to create durable admin auth, enable audit, and prepare policies.

  7. Revoke or tightly escrow the initial root token.

Do not paste unseal keys, root tokens, screenshots, or command output into Git, State Hub, chat, shell history, or issue trackers. Each unseal share goes to one escrow owner through an out-of-band channel. The initial root token is either revoked after a non-root platform-admin token exists or stored as offline break-glass material with the same handling as unseal shares.

Initial Configuration After Unseal

Enable file audit:

kubectl exec -n openbao openbao-0 -- \
  bao audit enable file file_path=/openbao/audit/openbao-audit.log

Enable the first KV v2 mount:

kubectl exec -n openbao openbao-0 -- \
  bao secrets enable -path=platform kv-v2

Kubernetes auth, database dynamic credentials, PKI, CSI, and External Secrets integration are follow-up tasks in RAIL-PL-WP-0002. Do not migrate live application secrets until those policies and restore drills are documented.

The repo now includes a non-secret helper for the first post-unseal configuration:

make openbao-configure-initial

The target prompts for a token, enables file audit when API-managed audit is available, enables the platform/ KV v2 mount, enables Kubernetes auth, configures Kubernetes auth from the in-pod service account, and loads:

  • openbao/policies/platform-admin.hcl
  • openbao/policies/platform-readonly.hcl

It does not print or store the token. You may also set OPENBAO_TOKEN_FILE=/path/to/token-file for an operator-local, uncommitted token file.
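For illustration only, the following sketch shows one way a script could accept a token from `OPENBAO_TOKEN_FILE` or from a hidden prompt without echoing it. It is not the code behind `make openbao-configure-initial`; the function name `load_openbao_token` is hypothetical, and exporting `BAO_TOKEN` assumes a local `bao` client is the consumer.

```shell
# Hypothetical sketch: load a token into BAO_TOKEN without echoing or printing it.
load_openbao_token() {
  BAO_TOKEN=""
  if [ -n "${OPENBAO_TOKEN_FILE:-}" ]; then
    [ -r "$OPENBAO_TOKEN_FILE" ] || { echo "token file not readable" >&2; return 1; }
    # Take the first line only, so a trailing newline does not end up in the token.
    IFS= read -r BAO_TOKEN < "$OPENBAO_TOKEN_FILE" || true
  else
    [ -t 0 ] || { echo "no terminal available for the token prompt" >&2; return 1; }
    printf 'OpenBao token: ' >&2
    stty -echo                 # hide keystrokes while the token is typed
    IFS= read -r BAO_TOKEN || true
    stty echo
    printf '\n' >&2
  fi
  [ -n "$BAO_TOKEN" ] || { echo "empty token" >&2; return 1; }
  export BAO_TOKEN
}
```

The token file should live outside the repository and be readable only by the operator.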

Current OpenBao releases may reject API-managed audit setup with a message that audit devices must be configured declaratively. In that case the helper exits successfully with a warning after applying the other bootstrap configuration. Treat declarative audit configuration in the OpenBao server config/Helm values as mandatory before production secrets move in.

After the helper succeeds, create a non-root admin token:

kubectl exec -n openbao openbao-0 -- \
  bao token create -policy=platform-admin -period=24h -orphan

Store that token through the approved operator secret path, then revoke or tightly escrow the initial root token. The root token should not become the normal operator credential.

Auth And Workload Integration

Initial auth model:

| Actor | Method | Notes |
| --- | --- | --- |
| Setup operator/contact | Gitea tegwick / bernd.worsch@gmail.com | low-trust assembly and notifications; not platform root of trust |
| King credential | NetKingdom custody record for dedicated platform-root identity | accountable bootstrap/recovery authority; not a Git or email secret store |
| Bootstrap operator | one-time root token | only for initial audit, mounts, auth, policies, and non-root token creation |
| Platform operator | token with platform-admin | temporary until NetKingdom OIDC/admin integration is ready |
| Read-only reviewer | token with platform-readonly | metadata and health visibility, no secret reads |
| Kubernetes workload | Kubernetes auth role | namespace/service-account bound, policy per workload |
| Human identity | NetKingdom IAM Profile/OIDC | target model; OpenBao is not the identity provider |
| Automation | Kubernetes auth or short-lived operator token | no root tokens in automation |

Workload delivery choice:

  • Prefer External Secrets Operator for values that should become Kubernetes Secrets consumed by ordinary Helm charts.
  • Use CSI-mounted files for workloads that need file references, sharper mount-level boundaries, or secret refresh without rewriting application manifests.
  • Do not use the OpenBao injector in the current deployment; the Helm values leave it disabled.
  • Application repositories request paths and policies; railiance-platform owns platform mounts, policy shape, and delivery mechanisms.

Path convention:

platform/workloads/<namespace>/<service-account>/<secret-name>
platform/object-storage/<consumer>
platform/databases/<consumer>
platform/operators/<purpose>

The template policy for workload KV reads is openbao/policies/workload-kv-read-template.hcl.
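To show how the workload convention composes, here is a minimal sketch that builds a path from its three parts. The function name `workload_secret_path` is hypothetical, and the character check (lowercase letters, digits, dots, hyphens) is an assumption modeled on Kubernetes object names, not a rule stated by the platform.

```shell
# Build a KV path following the workload convention:
#   platform/workloads/<namespace>/<service-account>/<secret-name>
workload_secret_path() {
  [ "$#" -eq 3 ] || {
    echo "usage: workload_secret_path NAMESPACE SERVICE_ACCOUNT SECRET_NAME" >&2
    return 1
  }
  for part in "$@"; do
    case "$part" in
      # Assumed rule: reject empty segments and anything outside [a-z0-9.-].
      ''|*[!a-z0-9.-]*) echo "invalid path segment: '$part'" >&2; return 1 ;;
    esac
  done
  printf 'platform/workloads/%s/%s/%s\n' "$1" "$2" "$3"
}
```

For example, `workload_secret_path payments api db-password` prints `platform/workloads/payments/api/db-password`. Rejecting slashes in a segment keeps one workload from addressing a path outside its own namespace and service account.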

Backup, Restore, Audit, And Monitoring

Before any live application secrets move into OpenBao:

  1. Enable file audit and confirm that audit entries are being written to /openbao/audit/openbao-audit.log.

  2. Create an OpenBao Raft snapshot from the unsealed pod:

    kubectl exec -n openbao openbao-0 -- \
      bao operator raft snapshot save /tmp/openbao-raft.snap
    kubectl cp openbao/openbao-0:/tmp/openbao-raft.snap ./openbao-raft.snap
    
  3. Encrypt the snapshot with age/SOPS-compatible custody before it leaves the operator machine.

  4. Run an isolated restore drill before treating OpenBao as live secret custody. The drill must prove that a fresh OpenBao instance can restore the snapshot, unseal, and read a test secret.

  5. Decide where audit logs are shipped durably. The audit PVC alone is not a durable audit sink.

  6. Run:

    make openbao-verify-post-unseal
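Steps 2 and 3 above can be sketched as two small helpers. This is an illustration under assumptions: the function names are hypothetical, the timestamped file name is a suggestion and not an existing convention, and the encryption step assumes the `age` CLI is installed and that the operator supplies an approved recipient public key.

```shell
# Build a timestamped snapshot file name (UTC), e.g. openbao-raft-20250101T000000Z.snap.
# An explicit timestamp can be passed as $1; otherwise the current time is used.
snapshot_name() {
  ts="${1:-$(date -u +%Y%m%dT%H%M%SZ)}"
  printf 'openbao-raft-%s.snap\n' "$ts"
}

# Encrypt a snapshot to an age recipient; remove the plaintext only if encryption succeeded.
encrypt_snapshot() {
  snap="$1"; recipient="$2"
  [ -s "$snap" ] || { echo "snapshot missing or empty: $snap" >&2; return 1; }
  age -r "$recipient" -o "$snap.age" "$snap" && rm -f -- "$snap"
}

# Usage after the kubectl cp step (commented out; needs cluster access and age):
#   out="$(snapshot_name)"
#   mv ./openbao-raft.snap "./$out"
#   encrypt_snapshot "./$out" "<age-recipient-public-key>"
```

The copy left in /tmp inside the pod is still plaintext and should be deleted once the encrypted file has been verified.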
    

Monitoring baseline:

  • pod readiness and liveness from Kubernetes probes
  • bao status seal/init state
  • PVC capacity for data and audit storage
  • audit log write success
  • future Prometheus scraping once the cluster monitoring stack exists
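Until Prometheus scraping exists, the PVC capacity item can be approximated from `df` output. The sketch below assumes `df` is available in the OpenBao container image; the function names and the 80 percent default threshold are hypothetical example values.

```shell
# Print the usage percentage (integer, no % sign) from `df -P` output on stdin.
# `df -P` prints a header line followed by one line for the requested mount.
usage_pct_from_df() {
  awk 'NR == 2 { sub(/%$/, "", $5); print $5 }'
}

# Succeeds when usage is below the limit (default 80, an arbitrary example value).
pvc_below_threshold() {
  pct="$1"; limit="${2:-80}"
  [ "$pct" -lt "$limit" ]
}

# Usage (commented out; requires cluster access):
#   pct="$(kubectl exec -n openbao openbao-0 -- df -P /openbao/audit | usage_pct_from_df)"
#   pvc_below_threshold "$pct" || echo "audit PVC usage at ${pct}%" >&2
```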

Artifact-Store Object Storage Handoff

artifact-store is the consumer-facing artifact preservation service for generated outputs, evidence packages, reports, logs, snapshots, exports, and release artifacts. It already has an S3-compatible backend with env:NAME and file:/mounted/path credential references, plus an artifactstore storage verify --backend s3 smoke path.

Railiance should avoid building a parallel object-storage client or credential vending flow in OpenBao. The ownership split is:

  • railiance-platform / OpenBao owns bootstrap secret custody, policy, audit, break-glass access, and workload secret delivery.
  • artifact-store owns artifact package manifests, the S3 backend, storage verification, and whether temporary credentials require backend refresh support or a sidecar/controller.
  • net-kingdom owns the identity issuer and role-claim model if object storage adopts STS with AssumeRoleWithWebIdentity.

Initial static-credential bridge, before STS is proven:

  1. Create a scoped object-store access key limited to the artifact-store bucket and prefix. Do not use object-store root credentials.

  2. Store the key pair in OpenBao under a platform-owned path such as platform/object-storage/artifact-store.

  3. Deliver the values to the artifact-store pod through CSI or External Secrets as mounted files.

  4. Configure artifact-store with file references:

    export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/s3-access-key
    export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/s3-secret-key
    
  5. Verify from artifact-store:

    artifactstore storage verify --backend s3
    

STS credential vending remains linked to ARTIFACT-STORE-WP-0007 - MinIO Compatibility, MaxIO Fork Assessment, And STS Credential Vending. If that workstream chooses MinIO-compatible AssumeRoleWithWebIdentity, OpenBao should not become the identity provider by default. Use the NetKingdom OIDC issuer for workload/user identity, map object storage roles and policies there, and keep OpenBao responsible for bootstrap, break-glass, audit, and delivery of any controller configuration.

Current artifact-store configuration exposes access key and secret key refs, but no session-token ref. ARTIFACT-STORE-WP-0007-T004 must either add temporary-session-token support to the S3 backend or choose a sidecar/secret controller pattern that keeps refreshed credentials available through the existing env/file reference contract.
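Before running the storage verification, a pod entrypoint or operator could confirm that each credential reference resolves to something non-empty. The sketch below covers the env:NAME and file:/mounted/path forms described above; it is not artifact-store's own resolver, and the function name `check_credential_ref` is hypothetical.

```shell
# Succeeds when a credential reference resolves to a non-empty value.
# Supports the two reference forms: env:NAME and file:/mounted/path.
# The value itself is never printed.
check_credential_ref() {
  ref="$1"
  case "$ref" in
    env:*)
      name="${ref#env:}"
      # Reject anything that is not a plain shell identifier before using eval.
      case "$name" in ''|[0-9]*|*[!A-Za-z0-9_]*) return 1 ;; esac
      eval "[ -n \"\${$name:-}\" ]"
      ;;
    file:/*)
      [ -s "${ref#file:}" ] && [ -r "${ref#file:}" ]
      ;;
    *)
      return 1
      ;;
  esac
}

# Usage (commented out; requires the mounted secrets and the artifactstore CLI):
#   check_credential_ref "$ARTIFACTSTORE_S3_ACCESS_KEY_REF" \
#     && check_credential_ref "$ARTIFACTSTORE_S3_SECRET_KEY_REF" \
#     && artifactstore storage verify --backend s3
```

A failed check here points at secret delivery (CSI or External Secrets) and not at the object store, which narrows the search when the storage verification fails.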

Upgrade And Rollback

  1. Read the OpenBao chart release notes.
  2. Update OPENBAO_CHART_VERSION in Makefile.
  3. Run make openbao-dry-run.
  4. Confirm current backup and audit log posture.
  5. Run make openbao-deploy.
  6. Run make openbao-status.

For rollback, run helm rollback openbao <REVISION> -n openbao on Railiance01 and re-check bao status.

Scaling To Three Nodes

When Railiance02 and Railiance03 join:

  1. Move storage from local-path to distributed storage.
  2. Set server.affinity back to anti-affinity.
  3. Set server.ha.replicas: 3.
  4. Re-enable a PodDisruptionBudget.
  5. Run an unseal, failover, backup, and restore drill before migrating secrets.