diff --git a/Makefile b/Makefile
index ac2e5c9..f4ada40 100644
--- a/Makefile
+++ b/Makefile
@@ -109,6 +109,18 @@ openbao-status: ## Show OpenBao pods, services, PVCs, and seal/init status
 		-l app.kubernetes.io/instance=$(OPENBAO_RELEASE) -o wide
 	-$(KUBECTL) exec -n $(OPENBAO_NAMESPACE) $(OPENBAO_RELEASE)-0 -- bao status
 
+openbao-verify: ## Run non-secret OpenBao deployment checks
+	KUBECTL='$(KUBECTL)' OPENBAO_NAMESPACE=$(OPENBAO_NAMESPACE) \
+		OPENBAO_RELEASE=$(OPENBAO_RELEASE) scripts/openbao-verify.sh basic
+
+openbao-verify-post-unseal: ## Run post-unseal OpenBao filesystem checks
+	KUBECTL='$(KUBECTL)' OPENBAO_NAMESPACE=$(OPENBAO_NAMESPACE) \
+		OPENBAO_RELEASE=$(OPENBAO_RELEASE) scripts/openbao-verify.sh post-unseal
+
+openbao-configure-initial: ## Apply first post-unseal audit, auth, mounts, and policies
+	KUBECTL='$(KUBECTL)' OPENBAO_NAMESPACE=$(OPENBAO_NAMESPACE) \
+		OPENBAO_RELEASE=$(OPENBAO_RELEASE) scripts/openbao-apply-initial-config.sh
+
 ##@ Backup
 
 backup: ## Backup platform services (PostgreSQL logical dump) — age-encrypted to Nextcloud
@@ -121,4 +133,4 @@ help: ## Show this help
 		/^[a-zA-Z_-]+:.*?##/ { printf " \033[36m%-22s\033[0m %s\n", $$1, $$2 } \
 		/^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) }' $(MAKEFILE_LIST)
 
-.PHONY: db-deploy db-status db-shell db-logs apps-pg-deploy apps-pg-status apps-pg-shell apps-pg-logs pg-deploy pg-status pg-pgpool-check valkey-deploy valkey-status openbao-repo openbao-dry-run openbao-deploy openbao-status backup help
+.PHONY: db-deploy db-status db-shell db-logs apps-pg-deploy apps-pg-status apps-pg-shell apps-pg-logs pg-deploy pg-status pg-pgpool-check valkey-deploy valkey-status openbao-repo openbao-dry-run openbao-deploy openbao-status openbao-verify openbao-verify-post-unseal openbao-configure-initial backup help
diff --git a/docs/openbao.md b/docs/openbao.md
index 12f4cb5..4f109cd 100644
--- a/docs/openbao.md
+++ b/docs/openbao.md
@@ -88,6 +88,22 @@ That state is intentional until the bootstrap ceremony is completed.
 Do not initialize OpenBao in a casual shell session. Initialization emits the
 unseal keys and initial root token. Treat this as a break-glass event.
 
+Pre-flight checks:
+
+```bash
+make openbao-status
+make openbao-verify
+```
+
+Proceed only when:
+
+- `openbao-0` is Running.
+- data and audit PVCs are Bound.
+- `bao status` reports `Initialized: false` and `Sealed: true`.
+- Railiance01 host/cluster backup posture is understood for this maintenance
+  window.
+- three human escrow recipients are named before the command is run.
+
 Recommended ceremony:
 
 1. Confirm the Railiance01 backup posture first.
@@ -110,6 +126,12 @@ Recommended ceremony:
    auth, enable audit, and prepare policies.
 7. Revoke or tightly escrow the initial root token.
 
+Do not paste unseal keys, root tokens, screenshots, or command output into Git,
+State Hub, chat, shell history, or issue trackers. Each unseal share goes to one
+escrow owner through an out-of-band channel. The initial root token is either
+revoked after a non-root platform-admin token exists or stored as offline
+break-glass material with the same handling as unseal shares.
+
 ## Initial Configuration After Unseal
 
 Enable file audit:
@@ -130,6 +152,107 @@ Kubernetes auth, database dynamic credentials, PKI, CSI, and External Secrets
 integration are follow-up tasks in `RAIL-PL-WP-0002`. Do not migrate live
 application secrets until those policies and restore drills are documented.
 
+The repo now includes a non-secret helper for the first post-unseal
+configuration:
+
+```bash
+make openbao-configure-initial
+```
+
+The target prompts for a token, enables file audit, enables the `platform/` KV
+v2 mount, enables Kubernetes auth, configures Kubernetes auth from the in-pod
+service account, and loads:
+
+- `openbao/policies/platform-admin.hcl`
+- `openbao/policies/platform-readonly.hcl`
+
+It does not print or store the token. You may also set
+`OPENBAO_TOKEN_FILE=/path/to/token-file` for an operator-local, uncommitted
+token file.
+
+After the helper succeeds, create a non-root admin token:
+
+```bash
+kubectl exec -n openbao openbao-0 -- \
+  bao token create -policy=platform-admin -period=24h -orphan
+```
+
+Store that token through the approved operator secret path, then revoke or
+tightly escrow the initial root token. The root token should not become the
+normal operator credential.
+
+## Auth And Workload Integration
+
+Initial auth model:
+
+| Actor | Method | Notes |
+|-------|--------|-------|
+| Bootstrap operator | one-time root token | only for initial audit, mounts, auth, policies, and non-root token creation |
+| Platform operator | token with `platform-admin` | temporary until NetKingdom OIDC/admin integration is ready |
+| Read-only reviewer | token with `platform-readonly` | metadata and health visibility, no secret reads |
+| Kubernetes workload | Kubernetes auth role | namespace/service-account bound, policy per workload |
+| Human identity | NetKingdom IAM Profile/OIDC | target model; OpenBao is not the identity provider |
+| Automation | Kubernetes auth or short-lived operator token | no root tokens in automation |
+
+Workload delivery choice:
+
+- Prefer External Secrets Operator for values that should become Kubernetes
+  Secrets consumed by ordinary Helm charts.
+- Use CSI-mounted files for workloads that need file references, sharper
+  mount-level boundaries, or secret refresh without rewriting application
+  manifests.
+- Do not use the OpenBao injector in the current deployment; the Helm values
+  leave it disabled.
+- Application repositories request paths and policies; `railiance-platform`
+  owns platform mounts, policy shape, and delivery mechanisms.
+
+Path convention:
+
+```text
+platform/workloads///
+platform/object-storage/
+platform/databases/
+platform/operators/
+```
+
+The template policy for workload KV reads is
+`openbao/policies/workload-kv-read-template.hcl`.
+
+## Backup, Restore, Audit, And Monitoring
+
+Before any live application secrets move into OpenBao:
+
+1. Enable file audit and confirm an audit file is written under
+   `/openbao/audit/openbao-audit.log`.
+2. Create an OpenBao Raft snapshot from the unsealed pod:
+
+   ```bash
+   kubectl exec -n openbao openbao-0 -- \
+     bao operator raft snapshot save /tmp/openbao-raft.snap
+   kubectl cp openbao/openbao-0:/tmp/openbao-raft.snap ./openbao-raft.snap
+   ```
+
+3. Encrypt the snapshot with age/SOPS-compatible custody before it leaves the
+   operator machine.
+4. Run an isolated restore drill before treating OpenBao as live secret
+   custody. The drill must prove that a fresh OpenBao instance can restore the
+   snapshot, unseal, and read a test secret.
+5. Decide where audit logs are shipped durably. The audit PVC alone is not a
+   durable audit sink.
+6. Run:
+
+   ```bash
+   make openbao-verify-post-unseal
+   ```
+
+Monitoring baseline:
+
+- pod readiness and liveness from Kubernetes probes
+- `bao status` seal/init state
+- PVC capacity for data and audit storage
+- audit log write success
+- future Prometheus scraping once the cluster monitoring stack exists
+
 ## Artifact-Store Object Storage Handoff
 
 `artifact-store` is the consumer-facing artifact preservation service for
diff --git a/openbao/policies/platform-admin.hcl b/openbao/policies/platform-admin.hcl
new file mode 100644
index 0000000..1fc7c77
--- /dev/null
+++ b/openbao/policies/platform-admin.hcl
@@ -0,0 +1,41 @@
+# Full platform-operator policy for the initial OpenBao bootstrap phase.
+#
+# Use only for trusted S3 platform operators. This is intentionally broad so
+# the root token can be retired after bootstrap. Prefer narrower workload
+# policies for application access.
+
+path "sys/*" {
+  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
+}
+
+path "auth/*" {
+  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
+}
+
+path "identity/*" {
+  capabilities = ["create", "read", "update", "delete", "list"]
+}
+
+path "platform/*" {
+  capabilities = ["create", "read", "update", "delete", "list"]
+}
+
+path "database/*" {
+  capabilities = ["create", "read", "update", "delete", "list"]
+}
+
+path "pki/*" {
+  capabilities = ["create", "read", "update", "delete", "list"]
+}
+
+path "ssh/*" {
+  capabilities = ["create", "read", "update", "delete", "list"]
+}
+
+path "cubbyhole/*" {
+  capabilities = ["create", "read", "update", "delete", "list"]
+}
+
+path "secret/*" {
+  capabilities = ["create", "read", "update", "delete", "list"]
+}
diff --git a/openbao/policies/platform-readonly.hcl b/openbao/policies/platform-readonly.hcl
new file mode 100644
index 0000000..1dbc011
--- /dev/null
+++ b/openbao/policies/platform-readonly.hcl
@@ -0,0 +1,28 @@
+# Read-only platform inspection policy.
+#
+# Useful for status dashboards and audit/review sessions that need visibility
+# into mounts and platform metadata without secret material mutation.
+
+path "sys/health" {
+  capabilities = ["read"]
+}
+
+path "sys/mounts" {
+  capabilities = ["read", "list"]
+}
+
+path "sys/auth" {
+  capabilities = ["read", "list"]
+}
+
+path "sys/policies/acl" {
+  capabilities = ["read", "list"]
+}
+
+path "auth/token/lookup-self" {
+  capabilities = ["read"]
+}
+
+path "platform/metadata/*" {
+  capabilities = ["read", "list"]
+}
diff --git a/openbao/policies/workload-kv-read-template.hcl b/openbao/policies/workload-kv-read-template.hcl
new file mode 100644
index 0000000..0e4ce60
--- /dev/null
+++ b/openbao/policies/workload-kv-read-template.hcl
@@ -0,0 +1,16 @@
+# Template for a namespace/service-account-specific workload KV policy.
+#
+# Copy this file for a real workload and replace:
+#   Kubernetes namespace, e.g. artifact-store
+#   Kubernetes service account, e.g. artifact-store
+#
+# The matching Kubernetes auth role should bind the same namespace and service
+# account and attach the copied policy.
+
+path "platform/data/workloads///*" {
+  capabilities = ["read"]
+}
+
+path "platform/metadata/workloads///*" {
+  capabilities = ["read", "list"]
+}
diff --git a/scripts/openbao-apply-initial-config.sh b/scripts/openbao-apply-initial-config.sh
new file mode 100755
index 0000000..d37b166
--- /dev/null
+++ b/scripts/openbao-apply-initial-config.sh
@@ -0,0 +1,139 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+OPENBAO_NAMESPACE="${OPENBAO_NAMESPACE:-openbao}"
+OPENBAO_RELEASE="${OPENBAO_RELEASE:-openbao}"
+KUBECTL="${KUBECTL:-kubectl}"
+TOKEN_FILE="${OPENBAO_TOKEN_FILE:-}"
+REPO_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+POLICY_DIR="${POLICY_DIR:-$REPO_DIR/openbao/policies}"
+DRY_RUN=0
+
+usage() {
+  cat <<'USAGE'
+Usage: scripts/openbao-apply-initial-config.sh [--dry-run]
+
+Applies the first post-unseal OpenBao configuration:
+  - file audit device
+  - platform KV v2 mount
+  - Kubernetes auth mount and in-cluster config
+  - platform-admin and platform-readonly policies
+
+This script must run only after the bootstrap ceremony initializes and unseals
+OpenBao. It reads the bootstrap/root or platform-admin token from:
+  1. OPENBAO_TOKEN_FILE, when set
+  2. an interactive hidden prompt
+
+It does not print the token and does not store it.
+USAGE
+}
+
+while [ "$#" -gt 0 ]; do
+  case "$1" in
+    --dry-run)
+      DRY_RUN=1
+      shift
+      ;;
+    -h|--help)
+      usage
+      exit 0
+      ;;
+    *)
+      echo "ERROR: unknown argument: $1" >&2
+      usage >&2
+      exit 2
+      ;;
+  esac
+done
+
+pod="${OPENBAO_RELEASE}-0"
+
+read_token() {
+  if [ -n "$TOKEN_FILE" ]; then
+    if [ ! -f "$TOKEN_FILE" ]; then
+      echo "ERROR: OPENBAO_TOKEN_FILE does not exist: $TOKEN_FILE" >&2
+      exit 1
+    fi
+    head -n 1 "$TOKEN_FILE"
+    return
+  fi
+
+  local token
+  read -r -s -p "OpenBao token: " token
+  printf '\n' >&2
+  printf '%s\n' "$token"
+}
+
+remote_bao() {
+  local token="$1"
+  shift
+  if [ "$DRY_RUN" -eq 1 ]; then
+    printf 'DRY-RUN: bao %s\n' "$*"
+    return 0
+  fi
+  printf '%s\n' "$token" | $KUBECTL exec -i -n "$OPENBAO_NAMESPACE" "$pod" -- \
+    sh -c 'read -r BAO_TOKEN; export BAO_TOKEN; exec bao "$@"' sh "$@"
+}
+
+remote_sh() {
+  local token="$1"
+  local script="$2"
+  if [ "$DRY_RUN" -eq 1 ]; then
+    printf 'DRY-RUN: remote shell: %s\n' "$script"
+    return 0
+  fi
+  printf '%s\n%s\n' "$token" "$script" | $KUBECTL exec -i -n "$OPENBAO_NAMESPACE" "$pod" -- \
+    sh -c 'read -r BAO_TOKEN; export BAO_TOKEN; sh'
+}
+
+write_policy() {
+  local token="$1"
+  local name="$2"
+  local file="$3"
+  if [ ! -f "$file" ]; then
+    echo "ERROR: missing policy file: $file" >&2
+    exit 1
+  fi
+  if [ "$DRY_RUN" -eq 1 ]; then
+    printf 'DRY-RUN: bao policy write %s %s\n' "$name" "$file"
+    return 0
+  fi
+  { printf '%s\n' "$token"; cat "$file"; } | $KUBECTL exec -i -n "$OPENBAO_NAMESPACE" "$pod" -- \
+    sh -c 'read -r BAO_TOKEN; export BAO_TOKEN; bao policy write "$1" -' sh "$name"
+}
+
+token="$(read_token)"
+if [ -z "$token" ]; then
+  echo "ERROR: empty token" >&2
+  exit 1
+fi
+
+remote_bao "$token" status
+
+remote_bao "$token" audit enable file file_path=/openbao/audit/openbao-audit.log || true
+remote_bao "$token" secrets enable -path=platform kv-v2 || true
+remote_bao "$token" auth enable kubernetes || true
+
+remote_sh "$token" 'bao write auth/kubernetes/config \
+  kubernetes_host="https://${KUBERNETES_SERVICE_HOST}:${KUBERNETES_SERVICE_PORT}" \
+  token_reviewer_jwt=@/var/run/secrets/kubernetes.io/serviceaccount/token \
+  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt'
+
+write_policy "$token" platform-admin "$POLICY_DIR/platform-admin.hcl"
+write_policy "$token" platform-readonly "$POLICY_DIR/platform-readonly.hcl"
+
+remote_bao "$token" audit list
+remote_bao "$token" secrets list
+remote_bao "$token" auth list
+remote_bao "$token" policy list
+
+cat <<'NEXT'
+
+Initial OpenBao configuration applied.
+
+Next manual steps:
+  1. Create a non-root platform-admin token with a short TTL or renewable period.
+  2. Store that token through the approved human/operator secret path.
+  3. Revoke or tightly escrow the initial root token.
+  4. Run the raft snapshot and restore drill before moving live secrets.
+NEXT
diff --git a/scripts/openbao-verify.sh b/scripts/openbao-verify.sh
new file mode 100755
index 0000000..d12f406
--- /dev/null
+++ b/scripts/openbao-verify.sh
@@ -0,0 +1,105 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+OPENBAO_NAMESPACE="${OPENBAO_NAMESPACE:-openbao}"
+OPENBAO_RELEASE="${OPENBAO_RELEASE:-openbao}"
+KUBECTL="${KUBECTL:-kubectl}"
+MODE="${1:-basic}"
+
+ok() { printf '[OK] %s\n' "$*"; }
+warn() { printf '[WARN] %s\n' "$*"; }
+err() { printf '[ERR] %s\n' "$*" >&2; }
+step() { printf '\n==> %s\n' "$*"; }
+
+usage() {
+  cat <<'USAGE'
+Usage: scripts/openbao-verify.sh [basic|post-unseal]
+
+Runs non-secret OpenBao deployment checks. It never initializes, unseals, or
+prints tokens.
+
+Environment:
+  OPENBAO_NAMESPACE Kubernetes namespace. Default: openbao
+  OPENBAO_RELEASE Helm release / pod prefix. Default: openbao
+  KUBECTL kubectl command, including --kubeconfig if needed.
+USAGE
+}
+
+if [ "$MODE" = "-h" ] || [ "$MODE" = "--help" ]; then
+  usage
+  exit 0
+fi
+
+if [ "$MODE" != "basic" ] && [ "$MODE" != "post-unseal" ]; then
+  err "unknown mode: $MODE"
+  usage >&2
+  exit 2
+fi
+
+pod="${OPENBAO_RELEASE}-0"
+
+check_cmd() {
+  if ! command -v "${KUBECTL%% *}" >/dev/null 2>&1; then
+    err "kubectl command not found: $KUBECTL"
+    exit 1
+  fi
+}
+
+run() {
+  # shellcheck disable=SC2086
+  $KUBECTL "$@"
+}
+
+check_cmd
+
+step "OpenBao Kubernetes objects"
+run get namespace "$OPENBAO_NAMESPACE" >/dev/null
+ok "namespace exists: $OPENBAO_NAMESPACE"
+
+run get pod "$pod" -n "$OPENBAO_NAMESPACE" >/dev/null
+ok "pod exists: $OPENBAO_NAMESPACE/$pod"
+
+phase="$(run get pod "$pod" -n "$OPENBAO_NAMESPACE" -o jsonpath='{.status.phase}')"
+ready="$(run get pod "$pod" -n "$OPENBAO_NAMESPACE" -o jsonpath='{range .status.containerStatuses[*]}{.ready}{end}')"
+printf 'Pod phase: %s\n' "$phase"
+printf 'Container ready flags: %s\n' "${ready:-none}"
+
+run get svc -n "$OPENBAO_NAMESPACE" \
+  "${OPENBAO_RELEASE}" \
+  "${OPENBAO_RELEASE}-active" \
+  "${OPENBAO_RELEASE}-internal" \
+  "${OPENBAO_RELEASE}-ui" >/dev/null
+ok "expected services exist"
+
+run get pvc -n "$OPENBAO_NAMESPACE" >/dev/null
+ok "PVC query succeeded"
+
+step "OpenBao seal/init status"
+if run exec -n "$OPENBAO_NAMESPACE" "$pod" -- bao status; then
+  ok "bao status command succeeded"
+else
+  warn "bao status failed. Check pod logs and command availability."
+fi
+
+if [ "$MODE" = "basic" ]; then
+  exit 0
+fi
+
+step "Post-unseal unauthenticated checks"
+if run exec -n "$OPENBAO_NAMESPACE" "$pod" -- sh -c 'test -d /openbao/audit'; then
+  ok "audit directory exists"
+else
+  warn "audit directory missing or inaccessible"
+fi
+
+if run exec -n "$OPENBAO_NAMESPACE" "$pod" -- sh -c 'test -d /openbao/data'; then
+  ok "raft data directory exists"
+else
+  warn "raft data directory missing or inaccessible"
+fi
+
+warn "Authenticated checks are intentionally not run here."
+warn "After unseal/configuration, verify with a platform-admin token:"
+warn " bao audit list"
+warn " bao secrets list"
+warn " bao auth list"
diff --git a/workplans/RAIL-PL-WP-0002-openbao-platform-secrets-service.md b/workplans/RAIL-PL-WP-0002-openbao-platform-secrets-service.md
index 28a1e9f..b22044a 100644
--- a/workplans/RAIL-PL-WP-0002-openbao-platform-secrets-service.md
+++ b/workplans/RAIL-PL-WP-0002-openbao-platform-secrets-service.md
@@ -10,7 +10,7 @@ topic_slug: railiance
 planning_priority: high
 planning_order: 2
 created: "2026-05-17"
-updated: "2026-05-17"
+updated: "2026-05-23"
 depends_on:
   - RAIL-PL-WP-0001
 state_hub_workstream_id: "fd1c045a-01d4-43be-980f-acbda6c64e6c"
@@ -128,11 +128,20 @@ SOPS/age bootstrap.
 needs human escrow assignment, root-token retirement details, and a
 restore/recovery drill before live secrets move into OpenBao.
 
+**2026-05-23:** Added non-secret bootstrap support: `make openbao-verify`,
+`make openbao-verify-post-unseal`, `make openbao-configure-initial`,
+`scripts/openbao-verify.sh`, `scripts/openbao-apply-initial-config.sh`, and
+initial platform policies under `openbao/policies/`. `docs/openbao.md` now
+spells out pre-flight checks, escrow handling, root-token retirement, and the
+post-unseal initial configuration path. The actual initialization/unseal
+ceremony remains gated on named human escrow recipients and must not happen in
+a casual agent shell.
+
 ### T04 - Auth Methods And Workload Integration
 
 ```task
 id: RAIL-PL-WP-0002-T04
-status: todo
+status: done
 priority: high
 state_hub_task_id: "ca2b3ac2-b522-4445-a418-c6ec312cd5f4"
 ```
@@ -142,6 +151,15 @@ NetKingdom identity, admins, agents, and automations. Decide when workloads use
 OpenBao directly, CSI-mounted secrets, External Secrets Operator, or
 sidecars/controllers.
 
+**2026-05-23:** Documented the auth and delivery model in `docs/openbao.md`.
+Bootstrap uses the one-time root token only for initial setup; platform
+operators use a non-root `platform-admin` token until NetKingdom OIDC/admin
+integration is ready; reviewers use `platform-readonly`; workloads use
+Kubernetes auth with namespace/service-account-bound policies. External
+Secrets Operator is preferred for Helm-compatible Kubernetes Secrets, CSI is
+reserved for mounted-file delivery and refresh-sensitive workloads, and the
+OpenBao injector remains disabled.
+
 ### T05 - Secret Engines And Dynamic Credentials
 
 ```task
@@ -171,7 +189,7 @@ identity if object storage adopts `AssumeRoleWithWebIdentity`.
 
 ```task
 id: RAIL-PL-WP-0002-T06
-status: todo
+status: done
 priority: medium
 state_hub_task_id: "cd61bc7d-8b9f-484f-97bd-7254c227b0ee"
 ```
@@ -180,6 +198,13 @@ Define backup/restore procedure, audit device configuration, metrics, logs,
 health checks, restore drill, and smoke tests. Include a developer/operator
 verification script for the deployed service.
 
+**2026-05-23:** Documented audit, Raft snapshot, encrypted snapshot custody,
+isolated restore drill, durable audit-log shipping, and monitoring baseline in
+`docs/openbao.md`. Added `scripts/openbao-verify.sh` plus Make targets for
+basic and post-unseal verification. The restore drill still must be executed
+before any live application secrets are migrated; that remains a gate under
+T03.
+
 ### T07 - Cross-Repo Transition Tasks
 
 ```task