diff --git a/Makefile b/Makefile
index 24bbf2c..96747fa 100644
--- a/Makefile
+++ b/Makefile
@@ -25,6 +25,7 @@ ARGOCD_BOOTSTRAP_DIR ?= argocd/bootstrap
 ARGOCD_REPOSITORY_SECRET ?=
 CREDENTIAL_GRANTS ?= credential-grants/catalog.yaml
 OPENBAO_TOKEN_GRANT_ARGS ?=
+OPENBAO_WORKLOAD_KV_ARGS ?=
 CREDENTIAL_HELPER_GLOBAL_ARGS ?=
 CREDENTIAL_HELPER_ARGS ?=
 CREDENTIAL_HELPER_PURPOSE ?= flex-auth-openbao-smoke
@@ -168,6 +169,14 @@ openbao-configure-external-secrets-issue-core: ## Configure OpenBao policy/role
 	OPENBAO_RELEASE=$(OPENBAO_RELEASE) ESO_NAMESPACE=$(EXTERNAL_SECRETS_NAMESPACE) \
 	scripts/openbao-apply-external-secrets-issue-core.sh
 
+openbao-workload-kv-lanes-dry-run: ## Dry-run OpenBao workload KV read-lane policy apply
+	scripts/openbao-apply-workload-kv-lanes.sh --dry-run $(OPENBAO_WORKLOAD_KV_ARGS)
+
+openbao-configure-workload-kv-lanes: ## Configure OpenBao workload KV read-lane policies
+	KUBECTL='$(KUBECTL)' OPENBAO_NAMESPACE=$(OPENBAO_NAMESPACE) \
+	OPENBAO_RELEASE=$(OPENBAO_RELEASE) \
+	scripts/openbao-apply-workload-kv-lanes.sh $(OPENBAO_WORKLOAD_KV_ARGS)
+
 openbao-validate-restore-evidence: ## Validate non-secret OpenBao restore-drill evidence JSON
 	OPENBAO_RESTORE_EVIDENCE='$(OPENBAO_RESTORE_EVIDENCE)' \
 	scripts/openbao-validate-restore-evidence.sh
diff --git a/docs/credential-change-approval.md b/docs/credential-change-approval.md
new file mode 100644
index 0000000..6230e68
--- /dev/null
+++ b/docs/credential-change-approval.md
@@ -0,0 +1,236 @@
+# Credential Change Approval Workflow
+
+This document sketches the operator workflow we want for it-sec and credential
+changes. The goal is to remove raw OpenBao command authoring from routine human
+operation while preserving explicit human approval, auditability, and safe
+handling of secret values.
+
+## Problem
+
+The current workflow still asks operators to translate a reviewed intent into
+OpenBao commands by hand:
+
+- create or update policies;
+- create auth roles with the right bound claims;
+- create or rotate secret paths and fields;
+- verify positive and negative access;
+- tell ops-warden or another access front door when a lane may become active.
+
+That is inefficient and easy to get wrong. It is also hard to review because
+the actual unit of work is spread across chat, workplans, OpenBao UI screens,
+State Hub notes, and shell commands.
+
+## Direction
+
+Treat OpenBao as the enforcement and audit engine, not the primary review UI.
+Add a small approval control plane in front of it:
+
+1. an agent or CLI creates a structured, non-secret credential change request;
+2. humans review the rendered proposal, risk notes, generated OpenBao plan, and
+   verification plan;
+3. a human approves or denies with a comment;
+4. only approved requests can be applied by an operator-controlled helper;
+5. the helper records non-secret evidence and marks the request active,
+   rejected, deactivated, rotated, or compromised.
+
+This can be implemented with repo files, State Hub, and CLI/chat integration
+first. An OpenBao UI extension can come later if the workflow proves itself.
+
+## Core Object
+
+The canonical unit is a credential change request, abbreviated `CCR`.
+
+The CCR must be non-secret. It may contain:
+
+- stable request id and title;
+- requester, reviewer, approver, and applier identities;
+- target domain, tenant, workload, environment, and purpose;
+- OpenBao mount, path, field names, policy names, and auth role names;
+- exact non-secret policy HCL or generated policy references;
+- proposed auth bindings and bound claims;
+- delivery surface such as ops-warden, External Secrets, CSI, or direct caller
+  fetch;
+- risk classification and approval requirements;
+- generated apply plan;
+- verification plan;
+- rollback, deactivate, rotate, and compromise response plan;
+- comments, approvals, denials, and timestamps;
+- non-secret OpenBao audit request ids or timestamps after execution.
+
+It must not contain:
+
+- secret values;
+- wrapped token values;
+- root, platform-admin, or issuer tokens;
+- passwords, API keys, private keys, OTP seeds, unseal shares, or recovery
+  codes;
+- command output that includes secret values.
+
+## State Machine
+
+Suggested states:
+
+```text
+draft
+proposed
+needs_changes
+approved
+denied
+apply_pending
+applied
+verified
+active
+deactivated
+rotated
+compromised
+superseded
+cancelled
+```
+
+Only `approved` requests may be applied. Only `verified` requests may become
+`active`.
+
+Emergency break-glass work may create a request after the fact, but it must be
+marked as break-glass, reviewed retrospectively, and linked to audit evidence.
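Reviewer sketch: the two gating rules in the State Machine section (only `approved` requests may be applied, only `verified` requests may become `active`) could be enforced by a small transition guard. This is a minimal Python sketch under stated assumptions, not the planned implementation. The names `ALLOWED` and `transition` are hypothetical, and every edge other than those two rules is an illustrative guess that the real schema would need to confirm.

```python
"""Sketch of a CCR lifecycle guard; the transition table is an assumption."""

# Hypothetical transition table. The design doc only fixes two rules:
#   - applying is reachable only from `approved`;
#   - `active` is reachable only from `verified`.
# All other edges below are illustrative guesses.
ALLOWED = {
    "draft": {"proposed", "cancelled"},
    "proposed": {"approved", "denied", "needs_changes", "cancelled"},
    "needs_changes": {"proposed", "cancelled"},
    "approved": {"apply_pending", "superseded", "cancelled"},
    "apply_pending": {"applied"},
    "applied": {"verified"},
    "verified": {"active"},
    "active": {"deactivated", "rotated", "compromised", "superseded"},
    "rotated": {"deactivated", "compromised"},
    "deactivated": {"compromised"},
}


def transition(current: str, target: str) -> str:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal CCR transition: {current} -> {target}")
    return target
```

Keeping the table as data rather than branching code means a reviewer can audit the lifecycle in one place, and break-glass handling could be added later as an explicit, separately reviewed edge.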
+
+## Review Surface
+
+A reviewer should see a concise rendered proposal:
+
+```text
+Request: whynot-design npm publish token lane
+Type: workload-kv-read
+Mount/path/field:
+  platform/workloads/whynot-design/whynot-design/npm-publish
+  NPM_AUTH_TOKEN
+Policy:
+  workload-kv-read-whynot-design-npm-publish
+Auth binding:
+  netkingdom OIDC role whynot-design-workload-kv-read
+  bound claim: groups includes whynot-design
+Access front door:
+  ops-warden whynot-design-npm-token
+Risk:
+  grants read access to npm publish credential
+Checks:
+  positive whynot fetch, negative non-whynot denial, OpenBao audit evidence
+Decision:
+  approve | deny | needs changes
+Comment:
+  free text
+```
+
+The reviewer should not need to know the exact `bao write` syntax. They should
+be able to discuss the proposal in chat, request changes, and then make a
+formal decision.
+
+## Minimal Implementation
+
+Version 1 should be boring:
+
+- store CCR files under `credential-change-requests/`;
+- validate CCR schema offline;
+- render a human-readable review summary;
+- generate OpenBao apply plans from approved CCRs;
+- require an approval record before apply;
+- apply only non-secret policy/auth/path metadata;
+- prompt or delegate separately for secret value entry;
+- record non-secret evidence in State Hub.
+
+The CLI shape can be:
+
+```bash
+scripts/credential-change.py propose workload-kv ...
+scripts/credential-change.py render CCR-YYYY-NNNN
+scripts/credential-change.py approve CCR-YYYY-NNNN --comment "..."
+scripts/credential-change.py deny CCR-YYYY-NNNN --comment "..."
+scripts/credential-change.py apply CCR-YYYY-NNNN
+scripts/credential-change.py verify CCR-YYYY-NNNN
+scripts/credential-change.py deactivate CCR-YYYY-NNNN --reason "..."
+```
+
+The same operations can be exposed through chat by having the agent create the
+proposal, show the rendered summary, then call the CLI only after the human
+gives an explicit approval phrase.
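Reviewer sketch: because CCR files must stay non-secret, the offline validation step could include a simple scan for secret-looking text before a request is rendered or approved. The following is a minimal Python sketch, assuming CCRs are plain text files. `SECRET_PATTERNS` and `find_secret_like` are hypothetical names, and the two patterns (a PEM private-key header and an `npm_`-prefixed token shape) are illustrative only; a real scanner would need a reviewed, broader pattern set.

```python
import re

# Illustrative patterns only (assumption): a PEM private-key header and an
# npm-style token. Field names such as NPM_AUTH_TOKEN must NOT match.
SECRET_PATTERNS = {
    "pem-private-key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "npm-token": re.compile(r"\bnpm_[A-Za-z0-9]{36}\b"),
}


def find_secret_like(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((number, name))
    return hits
```

The function reports only line numbers and pattern names, never the matched text, so its own output stays safe to paste into chat or State Hub.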
+
+## State Hub Role
+
+State Hub should hold:
+
+- request lifecycle events;
+- review comments;
+- approval/denial decisions;
+- non-secret apply and verification evidence;
+- links to workplans and CCR files.
+
+State Hub should not hold secret values. It can be the first review UI because
+it already supports messages, progress, task status, and cross-repo
+coordination.
+
+## OpenBao Role
+
+OpenBao remains authoritative for:
+
+- policy enforcement;
+- auth method configuration;
+- token issuance and revocation;
+- secret storage;
+- audit logs.
+
+Where OpenBao supports non-secret metadata on secret paths or auth roles, we can
+mirror CCR ids and status labels. The workflow must not depend on OpenBao being
+the only index, because operators need to see proposed, rejected, deactivated,
+rotated, and compromised items across repos and access front doors.
+
+## ops-warden Role
+
+ops-warden should consume only approved and active access lanes.
+
+For draft requests, ops-warden may create a draft catalog entry that points to
+the CCR, but it should not activate the entry until the CCR is verified.
+
+For `warden access --fetch` / `--exec`, the catalog should include the CCR id
+and refuse active use when the CCR state is not `active`.
+
+## Interactive Runbook Role
+
+The interactive runbook is the operator bridge:
+
+1. load a CCR;
+2. show the rendered summary and exact generated plan;
+3. confirm the request is approved;
+4. acquire operator authority through an approved path;
+5. apply the plan;
+6. ask for attended secret entry when needed;
+7. run positive and negative verification;
+8. record non-secret evidence;
+9. notify downstream front doors such as ops-warden.
+
+This lets operators safely drive privileged work without needing to remember
+every OpenBao command.
+
+## Compromise And Deactivation
+
+Every active CCR needs a deactivate and rotate path:
+
+- `deactivated`: access intentionally disabled but not necessarily compromised;
+- `rotated`: secret value replaced and old value no longer valid;
+- `compromised`: emergency state requiring immediate disablement, rotation,
+  blast-radius notes, and incident follow-up.
+
+The workflow must support marking an existing credential or lane as compromised
+even when the original request predates this system.
+
+## Near-Term Target
+
+Use the whynot-design npm token lane as the pilot:
+
+1. encode the existing non-secret lane as a CCR;
+2. render it for review;
+3. approve or request changes from chat;
+4. generate/apply the OpenBao policy and auth role only after approval;
+5. provision the secret value by attended operator custody;
+6. verify and activate the ops-warden catalog entry.
+
+Once that path feels good, reuse it for the sibling workload-KV lanes and the
+credential broker's OpenBao token-role gates.
diff --git a/docs/openbao.md b/docs/openbao.md
index 99f2021..27d53ce 100644
--- a/docs/openbao.md
+++ b/docs/openbao.md
@@ -404,6 +404,10 @@ platform/operators/
 The template policy for workload KV reads is
 `openbao/policies/workload-kv-read-template.hcl`.
 
+Concrete workload access lanes used by ops-warden and similar front doors are
+tracked in `docs/workload-kv-access-lanes.md`. These docs carry non-secret
+path, field, policy, auth-role, and verification pointers only.
+
 ## Backup, Restore, Audit, And Monitoring
 
 Before any live application secrets move into OpenBao:
diff --git a/docs/workload-kv-access-lanes.md b/docs/workload-kv-access-lanes.md
new file mode 100644
index 0000000..5f8ae15
--- /dev/null
+++ b/docs/workload-kv-access-lanes.md
@@ -0,0 +1,154 @@
+# Workload KV Access Lanes
+
+This document records concrete OpenBao workload KV paths that external access
+front doors can reference without storing or vending secret values themselves.
+The first lane is for ops-warden `warden access --fetch` / `--exec`.
+
+## Safety Rules
+
+- Do not put secret values in Git, State Hub, chat, prompts, workplans, or logs.
+- Store only non-secret pointers here: path, field name, policy name, auth role,
+  flex-auth reference, and verification status.
+- ops-warden may proxy a read as the caller, but it must not hold the returned
+  value beyond the caller-requested fetch/exec process.
+- Live writes require an approved OpenBao/operator path and attended handling
+  of the secret value.
+
+## whynot-design npm Publish Token
+
+Ops-warden request:
+`551031d1-335e-4db8-9535-820fea52d0a3`
+
+| Item | Value |
+| --- | --- |
+| ops-warden catalog id | `whynot-design-npm-token` |
+| KV mount | `platform` |
+| OpenBao CLI path | `platform/workloads/whynot-design/whynot-design/npm-publish` |
+| Secret field | `NPM_AUTH_TOKEN` |
+| Read policy | `workload-kv-read-whynot-design-npm-publish` |
+| Policy file | `openbao/policies/workload-kv-read-whynot-design-npm-publish.hcl` |
+| OIDC auth mount | `netkingdom` |
+| OIDC role | `whynot-design-workload-kv-read` |
+| Kubernetes auth role | `whynot-design-workload-kv-read` if an in-cluster service account consumes this lane |
+| flex-auth ref | `secret.read:whynot-design` if tenant policy requires pre-approval |
+
+Expected caller login shape:
+
+```bash
+bao login -method=oidc -path=netkingdom role=whynot-design-workload-kv-read
+```
+
+Expected fetch shape:
+
+```bash
+bao kv get -field=NPM_AUTH_TOKEN platform/workloads/whynot-design/whynot-design/npm-publish
+```
+
+The fetch command returns the secret value to the authenticated caller. Run it
+only in an attended shell or through a process that consumes the value without
+logging it.
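Reviewer sketch: "a process that consumes the value without logging it" could look like a wrapper that captures the fetch output in memory and hands it to a child process through one environment variable. This is a minimal Python sketch, not how ops-warden is implemented. `exec_with_secret` is a hypothetical helper, and the consuming command and variable name in the usage note are assumptions.

```python
import os
import subprocess


def exec_with_secret(fetch_cmd, child_cmd, env_name):
    """Run child_cmd with the fetched value in env_name; never print the value.

    fetch_cmd: argv list whose stdout is the secret value (captured, not echoed).
    child_cmd: argv list of the consumer process (hypothetical, caller-chosen).
    Returns the child's exit code.
    """
    fetched = subprocess.run(fetch_cmd, check=True, capture_output=True, text=True)
    value = fetched.stdout.strip()
    if not value:
        raise RuntimeError("fetch returned an empty value")
    # The value lives only in this process and the child's environment.
    child_env = {**os.environ, env_name: value}
    return subprocess.run(child_cmd, env=child_env, check=False).returncode


# Fetch shape taken from this runbook; only meaningful after `bao login`.
FETCH = [
    "bao", "kv", "get", "-field=NPM_AUTH_TOKEN",
    "platform/workloads/whynot-design/whynot-design/npm-publish",
]
```

A caller might run `exec_with_secret(FETCH, ["npm", "publish"], "NPM_AUTH_TOKEN")`, assuming the project's npm configuration reads the token from that variable. Note that an environment variable is still visible to the child process and to anything it spawns, so this narrows exposure rather than eliminating it.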
+
+## OpenBao Policy
+
+The source policy grants only:
+
+```text
+read platform/data/workloads/whynot-design/whynot-design/npm-publish
+read platform/metadata/workloads/whynot-design/whynot-design/npm-publish
+```
+
+It does not grant write, delete, patch, sudo, auth, sibling workload, or parent
+list capabilities.
+
+Dry-run the policy apply path:
+
+```bash
+make openbao-workload-kv-lanes-dry-run
+```
+
+Apply the policy with an approved platform-admin/operator token:
+
+```bash
+OPENBAO_TOKEN_FILE=~/.local/openbao/platform-admin.token \
+  make openbao-configure-workload-kv-lanes
+```
+
+If the OpenBao pod has an approved token-helper session, use:
+
+```bash
+make openbao-configure-workload-kv-lanes OPENBAO_WORKLOAD_KV_ARGS=--use-token-helper
+```
+
+Do not paste the token into shell history or logs. The helper reads a token
+from `OPENBAO_TOKEN_FILE` or an interactive hidden prompt unless
+`--use-token-helper` is set, and passes it to OpenBao through stdin.
+
+## Auth Role
+
+The intended OpenBao OIDC role is:
+
+```text
+auth/netkingdom/role/whynot-design-workload-kv-read
+```
+
+The role must attach only:
+
+```text
+workload-kv-read-whynot-design-npm-publish
+```
+
+Before applying the role, confirm the KeyCape/NetKingdom claim that identifies
+the whynot-design caller. The role must bind to that claim; do not create an
+unbounded OIDC role that grants this policy to every OIDC user.
+
+If the consumer is an in-cluster service account instead of an OIDC caller, use
+Kubernetes auth with the same role name and bind only the approved namespace
+and service account.
+
+## Secret Provisioning
+
+An approved operator must create or confirm the secret with:
+
+```text
+path: platform/workloads/whynot-design/whynot-design/npm-publish
+field: NPM_AUTH_TOKEN
+```
+
+The value must be entered directly through OpenBao/operator custody. Record only
+non-secret evidence: actor, timestamp, path, field name, policy name, and
+verification result.
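Reviewer sketch: the evidence rule in Secret Provisioning is easier to keep when the record type has no slot for a value at all. Below is a minimal Python sketch of such a record, assuming evidence is serialized as JSON before it is posted to State Hub. `LaneEvidence` and `build_evidence` are hypothetical names; the six fields are exactly the ones the runbook lists.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LaneEvidence:
    """Non-secret provisioning evidence; deliberately has no value field."""

    actor: str
    timestamp: str
    path: str
    field_name: str
    policy_name: str
    verification_result: str


def build_evidence(actor, path, field_name, policy_name, verification_result, now=None):
    """Return the evidence as a JSON string with a UTC ISO-8601 timestamp."""
    moment = now or datetime.now(timezone.utc)
    record = LaneEvidence(
        actor=actor,
        timestamp=moment.isoformat(),
        path=path,
        field_name=field_name,
        policy_name=policy_name,
        verification_result=verification_result,
    )
    return json.dumps(asdict(record), sort_keys=True)
```

Because the dataclass is frozen and closed, adding a secret-bearing field would require a visible code change rather than an ad hoc dictionary key.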
+
+## Verification
+
+Positive verification:
+
+1. Authenticate as the whynot-design caller using the approved OIDC or
+   Kubernetes auth role.
+2. Fetch the field in an attended session or through `warden access --fetch`.
+3. Record only that the fetch succeeded; do not record the value.
+
+Negative verification:
+
+1. Authenticate as a non-whynot identity.
+2. Confirm the same field read is denied.
+3. Record the non-secret OpenBao audit request ids or timestamps for the
+   allowed and denied attempts.
+
+## ops-warden Handoff
+
+Send ops-warden only these pointers:
+
+```text
+catalog id: whynot-design-npm-token
+mount: platform
+path: platform/workloads/whynot-design/whynot-design/npm-publish
+field: NPM_AUTH_TOKEN
+oidc login: bao login -method=oidc -path=netkingdom role=whynot-design-workload-kv-read
+policy: workload-kv-read-whynot-design-npm-publish
+policy file: openbao/policies/workload-kv-read-whynot-design-npm-publish.hcl
+flex-auth ref: secret.read:whynot-design, if tenant policy requires it
+runbook: docs/workload-kv-access-lanes.md
+```
+
+Until live provisioning and verification are complete, ops-warden should keep
+the catalog entry in `draft` or equivalent non-active state.
diff --git a/openbao/policies/workload-kv-read-whynot-design-npm-publish.hcl b/openbao/policies/workload-kv-read-whynot-design-npm-publish.hcl
new file mode 100644
index 0000000..5a84376
--- /dev/null
+++ b/openbao/policies/workload-kv-read-whynot-design-npm-publish.hcl
@@ -0,0 +1,13 @@
+# Least-privilege read policy for the whynot-design npm publish token.
+#
+# This policy intentionally grants only read access to the single KV-v2 secret
+# path used by ops-warden's caller-scoped access lane. It does not grant list
+# access to sibling workloads or mutation capabilities.
+
+path "platform/data/workloads/whynot-design/whynot-design/npm-publish" {
+  capabilities = ["read"]
+}
+
+path "platform/metadata/workloads/whynot-design/whynot-design/npm-publish" {
+  capabilities = ["read"]
+}
diff --git a/scripts/openbao-apply-workload-kv-lanes.sh b/scripts/openbao-apply-workload-kv-lanes.sh
new file mode 100755
index 0000000..191f988
--- /dev/null
+++ b/scripts/openbao-apply-workload-kv-lanes.sh
@@ -0,0 +1,137 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+OPENBAO_NAMESPACE="${OPENBAO_NAMESPACE:-openbao}"
+OPENBAO_RELEASE="${OPENBAO_RELEASE:-openbao}"
+KUBECTL="${KUBECTL:-kubectl}"
+TOKEN_FILE="${OPENBAO_TOKEN_FILE:-}"
+REPO_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
+POLICY_NAME="${WORKLOAD_KV_POLICY_NAME:-workload-kv-read-whynot-design-npm-publish}"
+POLICY_FILE="${WORKLOAD_KV_POLICY_FILE:-$REPO_DIR/openbao/policies/workload-kv-read-whynot-design-npm-publish.hcl}"
+DRY_RUN=0
+USE_TOKEN_HELPER=0
+
+usage() {
+  cat <<'USAGE'
+Usage: scripts/openbao-apply-workload-kv-lanes.sh [--dry-run] [--use-token-helper]
+
+Applies source-owned OpenBao workload KV read-lane policies.
+
+Current lane:
+  - policy: workload-kv-read-whynot-design-npm-publish
+  - path: platform/workloads/whynot-design/whynot-design/npm-publish
+  - field: NPM_AUTH_TOKEN
+
+The script reads an OpenBao operator token from OPENBAO_TOKEN_FILE or an
+interactive hidden prompt unless --dry-run or --use-token-helper is set. It
+never prints or stores the token.
+
+This script intentionally does not create an OIDC role until the whynot-design
+KeyCape/NetKingdom bound claim is confirmed.
+USAGE
+}
+
+while [ "$#" -gt 0 ]; do
+  case "$1" in
+    --dry-run)
+      DRY_RUN=1
+      shift
+      ;;
+    --use-token-helper)
+      USE_TOKEN_HELPER=1
+      shift
+      ;;
+    -h|--help)
+      usage
+      exit 0
+      ;;
+    *)
+      echo "ERROR: unknown argument: $1" >&2
+      usage >&2
+      exit 2
+      ;;
+  esac
+done
+
+pod="${OPENBAO_RELEASE}-0"
+
+read_token() {
+  if [ "$DRY_RUN" -eq 1 ] || [ "$USE_TOKEN_HELPER" -eq 1 ]; then
+    return
+  fi
+  if [ -n "$TOKEN_FILE" ]; then
+    if [ ! -f "$TOKEN_FILE" ]; then
+      echo "ERROR: OPENBAO_TOKEN_FILE does not exist: $TOKEN_FILE" >&2
+      exit 1
+    fi
+    head -n 1 "$TOKEN_FILE"
+    return
+  fi
+
+  local token
+  read -r -s -p "OpenBao token: " token
+  printf '\n' >&2
+  printf '%s\n' "$token"
+}
+
+remote_bao() {
+  local token="$1"
+  shift
+  if [ "$DRY_RUN" -eq 1 ]; then
+    printf 'DRY-RUN: bao %s\n' "$*"
+    return 0
+  fi
+  if [ "$USE_TOKEN_HELPER" -eq 1 ]; then
+    # shellcheck disable=SC2086
+    $KUBECTL exec -i -n "$OPENBAO_NAMESPACE" "$pod" -- bao "$@"
+    return
+  fi
+  # shellcheck disable=SC2086
+  printf '%s\n' "$token" | $KUBECTL exec -i -n "$OPENBAO_NAMESPACE" "$pod" -- \
+    sh -c 'read -r BAO_TOKEN; export BAO_TOKEN; exec bao "$@"' sh "$@"
+}
+
+write_policy() {
+  local token="$1"
+  if [ ! -f "$POLICY_FILE" ]; then
+    echo "ERROR: missing policy file: $POLICY_FILE" >&2
+    exit 1
+  fi
+  if [ "$DRY_RUN" -eq 1 ]; then
+    printf 'DRY-RUN: bao policy write %s %s\n' "$POLICY_NAME" "$POLICY_FILE"
+    return 0
+  fi
+  if [ "$USE_TOKEN_HELPER" -eq 1 ]; then
+    # shellcheck disable=SC2086
+    cat "$POLICY_FILE" | $KUBECTL exec -i -n "$OPENBAO_NAMESPACE" "$pod" -- \
+      bao policy write "$POLICY_NAME" -
+    return
+  fi
+  # shellcheck disable=SC2086
+  { printf '%s\n' "$token"; cat "$POLICY_FILE"; } | \
+    $KUBECTL exec -i -n "$OPENBAO_NAMESPACE" "$pod" -- \
+    sh -c 'read -r BAO_TOKEN; export BAO_TOKEN; bao policy write "$1" -' sh "$POLICY_NAME"
+}
+
+token="$(read_token)"
+if [ "$DRY_RUN" -eq 0 ] && [ "$USE_TOKEN_HELPER" -eq 0 ] && [ -z "$token" ]; then
+  echo "ERROR: empty OpenBao token" >&2
+  exit 1
+fi
+
+remote_bao "$token" status
+write_policy "$token"
+remote_bao "$token" policy read "$POLICY_NAME"
+
+cat <<'NEXT'
+
+Workload KV read-lane policy apply path completed.
+
+Remaining live steps:
+  1. Confirm the whynot-design KeyCape/NetKingdom bound claim or service account.
+  2. Create auth/netkingdom/role/whynot-design-workload-kv-read with only the
+     workload-kv-read-whynot-design-npm-publish policy.
+  3. Provision platform/workloads/whynot-design/whynot-design/npm-publish with
+     field NPM_AUTH_TOKEN through approved OpenBao/operator custody.
+  4. Run positive and negative fetch verification without printing the token.
+NEXT
diff --git a/workplans/RAILIANCE-WP-0006-workload-kv-access-lanes.md b/workplans/RAILIANCE-WP-0006-workload-kv-access-lanes.md
new file mode 100644
index 0000000..c760034
--- /dev/null
+++ b/workplans/RAILIANCE-WP-0006-workload-kv-access-lanes.md
@@ -0,0 +1,298 @@
+---
+id: RAILIANCE-WP-0006
+type: workplan
+title: "Workload KV Access Lanes for ops-warden Fetch"
+domain: financials
+repo: railiance-platform
+status: blocked
+owner: codex
+topic_slug: railiance
+planning_priority: high
+planning_order: 6
+created: "2026-06-27"
+updated: "2026-06-27"
+depends_on_workplans:
+  - RAIL-PL-WP-0002
+  - RAILIANCE-WP-0004
+related_state_hub_messages:
+  - "551031d1-335e-4db8-9535-820fea52d0a3"
+state_hub_workstream_id: "96c8a93d-7a5a-4fa9-8f7b-865119551da3"
+---
+
+# RAILIANCE-WP-0006 - Workload KV Access Lanes for ops-warden Fetch
+
+## Goal
+
+Provision concrete, least-privilege OpenBao workload KV read lanes that
+`ops-warden` can expose through `warden access --fetch` / `--exec` without
+holding secret values itself.
+
+The immediate request is for `whynot-design` to retrieve its npm publish token.
+The path must be concrete, policy-scoped, and documented so the ops-warden
+catalog can replace the current unresolved template path with a live
+`whynot-design-npm-token` entry.
+
+No task in this workplan may paste, commit, log, or send secret values through
+Git, State Hub, chat, prompts, or workplan text.
+
+## Requirements Reviewed
+
+Ops-warden message `551031d1-335e-4db8-9535-820fea52d0a3` asks
+`railiance-platform` to provide non-secret pointers for:
+
+- a concrete OpenBao KV path and field for `NPM_AUTH_TOKEN`;
+- the KV mount used by the path;
+- the OIDC login role for whynot-design or its operator identity;
+- a read policy scoped to whynot-design's identity/service account;
+- the flex-auth policy reference, if pre-approval is required.
+
+Once these pointers are live, ops-warden will add a dedicated
+`whynot-design-npm-token` access catalog entry and a playbook, then notify
+whynot-design.
+
+## Proposed Contract
+
+Use the existing workload convention documented in `docs/openbao.md` and
+`docs/argocd-gitops.md`:
+
+```text
+platform/workloads///
+```
+
+For this lane, the proposed non-secret contract is:
+
+| Item | Proposed value |
+| --- | --- |
+| KV mount | `platform` |
+| CLI path | `platform/workloads/whynot-design/whynot-design/npm-publish` |
+| KV-v2 policy data path | `platform/data/workloads/whynot-design/whynot-design/npm-publish` |
+| KV-v2 policy metadata path | `platform/metadata/workloads/whynot-design/whynot-design/npm-publish` |
+| Secret field | `NPM_AUTH_TOKEN` |
+| OpenBao read policy | `workload-kv-read-whynot-design-npm-publish` |
+| OIDC auth mount | `netkingdom` unless KeyCape compatibility requires `keycape` |
+| OIDC role | `whynot-design-workload-kv-read` |
+| Kubernetes auth role | `whynot-design-workload-kv-read` if an in-cluster service account consumes it |
+| flex-auth ref | `secret.read:whynot-design` if tenant policy requires pre-approval |
+
+The expected caller-facing read shape is:
+
+```bash
+bao login -method=oidc -path=netkingdom role=whynot-design-workload-kv-read
+bao kv get -field=NPM_AUTH_TOKEN platform/workloads/whynot-design/whynot-design/npm-publish
+```
+
+The command shape is illustrative only. Verification must avoid printing the
+secret value; use attended operator checks or commands that prove read access
+without persisting the token in logs.
+
+## Tasks
+
+## T01 - Capture ops-warden request and path contract
+
+```task
+id: RAILIANCE-WP-0006-T01
+status: done
+priority: high
+state_hub_task_id: "0c93496a-48bf-44e7-a75b-52e51e2639bc"
+```
+
+Record the ops-warden request, existing workload path convention, and proposed
+whynot-design contract in this workplan.
+
+Acceptance:
+
+- The workplan names the concrete path, field, mount, policy, auth role, and
+  optional flex-auth ref needed by ops-warden.
+- The plan distinguishes non-secret pointers from secret values.
+- The plan keeps this workload KV read lane separate from
+  `RAILIANCE-WP-0005`, which tracks short-lived OpenBao token issuance for the
+  ops-warden signing smoke.
+
+**2026-06-27:** Reviewed the unread ops-warden request and existing
+`platform/workloads///` convention.
+Captured the proposed `whynot-design` npm publish lane above with no secret
+values.
+
+## T02 - Add least-privilege OpenBao read policy
+
+```task
+id: RAILIANCE-WP-0006-T02
+status: done
+priority: high
+state_hub_task_id: "9c06d531-2566-4767-aa2f-8339605f23d5"
+```
+
+Create a concrete policy artifact for the whynot-design npm publish lane,
+derived from `openbao/policies/workload-kv-read-template.hcl` but narrowed to
+the selected `npm-publish` path.
+
+Acceptance:
+
+- A policy file under `openbao/policies/` defines read access to the exact
+  `platform/data/workloads/whynot-design/whynot-design/npm-publish` path.
+- Metadata/list capabilities are only as broad as needed for the caller and
+  ops-warden fetch UX.
+- The policy grants no write, delete, patch, sudo, auth, or unrelated workload
+  capabilities.
+- The policy name matches the pointer intended for ops-warden:
+  `workload-kv-read-whynot-design-npm-publish`.
+
+**2026-06-27:** Added the concrete policy artifact at
+`openbao/policies/workload-kv-read-whynot-design-npm-publish.hcl`. It grants
+only `read` on the exact KV-v2 data and metadata paths for
+`platform/workloads/whynot-design/whynot-design/npm-publish`; it does not grant
+write/delete/list/sudo/auth or sibling workload access. Added
+`scripts/openbao-apply-workload-kv-lanes.sh`,
+`make openbao-workload-kv-lanes-dry-run`, and
+`make openbao-configure-workload-kv-lanes` for the source-owned policy apply
+step. Dry-run passed. A live apply attempt with
+`OPENBAO_WORKLOAD_KV_ARGS=--use-token-helper` reached unsealed OpenBao but was
+denied with `403 permission denied` while writing the policy, so live policy
+application waits on an approved platform-admin/operator token or a narrow
+token-helper capability.
+
+## T03 - Define and apply auth bindings
+
+```task
+id: RAILIANCE-WP-0006-T03
+status: wait
+priority: high
+state_hub_task_id: "a217371a-0f85-40c6-b691-ac67834c86b5"
+```
+
+Define the auth role that lets whynot-design or an approved operator identity
+read the lane as itself.
+
+Acceptance:
+
+- The OIDC login role is documented as
+  `bao login -method=oidc -path=netkingdom role=whynot-design-workload-kv-read`,
+  or a different approved role is recorded with the reason.
+- The role attaches only the whynot-design npm publish read policy.
+- If an in-cluster whynot-design service account consumes the token, the
+  Kubernetes auth role binds only the approved namespace and service account.
+- Compatibility with the legacy `keycape` auth mount is either configured or
+  explicitly declined.
+
+**2026-06-27:** Documented the intended OIDC role pointer as
+`auth/netkingdom/role/whynot-design-workload-kv-read` in
+`docs/workload-kv-access-lanes.md`. Live application is waiting on confirmation
+of the KeyCape/NetKingdom whynot-design bound claim or approved service-account
+subject; do not create an unbounded OIDC role.
+
+## T04 - Provision the KV path without exposing the token
+
+```task
+id: RAILIANCE-WP-0006-T04
+status: wait
+priority: high
+state_hub_task_id: "c43724a3-c83e-4ab6-b7d1-e427fd93a9a9"
+```
+
+Have an approved operator create or confirm the OpenBao KV entry for the npm
+publish token.
+
+Acceptance:
+
+- The path exists at
+  `platform/workloads/whynot-design/whynot-design/npm-publish`.
+- The field is named exactly `NPM_AUTH_TOKEN`.
+- The token value is entered through an approved operator/OpenBao path and is
+  never written to Git, State Hub, chat, prompts, shell history, or workplan
+  text.
+- Non-secret evidence records only the path, field name, actor, timestamp,
+  policy name, and verification result.
+
+**2026-06-27:** The concrete path and field are now documented. Live secret
+provisioning is waiting on an approved operator/OpenBao custody path for the
+actual `NPM_AUTH_TOKEN` value.
+
+## T05 - Verify caller-scoped fetch behavior
+
+```task
+id: RAILIANCE-WP-0006-T05
+status: wait
+priority: high
+state_hub_task_id: "dc1f470b-e78a-48a9-9957-965aed47861f"
+```
+
+Prove that the authorized identity can read the token through the intended
+OpenBao path and that unauthorized identities cannot.
+
+Acceptance:
+
+- An approved whynot-design identity or operator role can authenticate and
+  perform the fetch without unresolved `<...>` placeholders.
+- Negative verification shows a non-whynot identity cannot read the path.
+- Verification output contains no token value.
+- OpenBao audit evidence exists for the authorized read and denied read, with
+  only non-secret request ids/timestamps recorded in the workplan or State Hub.
+
+**2026-06-27:** Verification is waiting on live policy/role application and
+secret provisioning. The runbook requires positive and negative fetch evidence
+without printing the token value.
+
+## T06 - Coordinate ops-warden catalog activation
+
+```task
+id: RAILIANCE-WP-0006-T06
+status: wait
+priority: high
+state_hub_task_id: "8e84ec19-01db-4baf-a532-de87e51d4994"
+```
+
+Send ops-warden the non-secret pointers needed to create and activate its
+dedicated access catalog entry.
+
+Acceptance:
+
+- The State Hub reply to ops-warden includes only path, field, KV mount,
+  OIDC role, policy name/path, optional flex-auth ref, and runbook location.
+- Ops-warden confirms the `whynot-design-npm-token` catalog entry no longer
+  contains unresolved placeholders.
+- `warden access "npm auth token" --fetch` or the agreed exact selector resolves
+  to the whynot-design lane and proxies the read as the caller.
+- ops-warden confirms it holds no token value and only proxies OpenBao access.
+
+**2026-06-27:** Added `docs/workload-kv-access-lanes.md` with the non-secret
+handoff payload for ops-warden and sent the pointers by State Hub message. The
+entry should remain draft/non-active until live OpenBao provisioning and
+verification complete.
+
+## T07 - Decide whether to batch sibling workload-KV requests
+
+```task
+id: RAILIANCE-WP-0006-T07
+status: done
+priority: medium
+state_hub_task_id: "0b3ab5f5-e933-41f2-b29a-ab4ac50593aa"
+```
+
+Ops-warden noted similar still-open access lanes for
+`issue-core-ingestion-api-key` and `openrouter-llm-connect`. Decide whether to
+batch those paths in the same provisioning pass or keep this workplan scoped to
+whynot-design.
+
+Acceptance:
+
+- The decision is recorded without secret values.
+- If batching is approved, add concrete sub-tasks or a follow-up workplan for
+  each additional lane.
+- If batching is deferred, notify ops-warden that this workplan will deliver
+  whynot-design first and leave the sibling entries for separate planning.
+
+**2026-06-27:** Deferred sibling lanes (`issue-core-ingestion-api-key` and
+`openrouter-llm-connect`) so the whynot-design npm token request can be serviced
+first. They should get concrete tasks or a follow-up workplan after this access
+lane pattern is validated.
+
+## Exit Criteria
+
+- The whynot-design npm publish token has a concrete OpenBao KV path, field,
+  read policy, and auth role.
+- The authorized caller can fetch the token as itself through OpenBao and
+  ops-warden without ops-warden storing the value.
+- Unauthorized reads are denied.
+- ops-warden has enough non-secret pointers to activate
+  `whynot-design-npm-token`.
+- No secret values appear in Git, State Hub, chat, prompts, logs, or workplans.
diff --git a/workplans/RAILIANCE-WP-0007-credential-change-approval-workflow.md b/workplans/RAILIANCE-WP-0007-credential-change-approval-workflow.md new file mode 100644 index 0000000..1ed5082 --- /dev/null +++ b/workplans/RAILIANCE-WP-0007-credential-change-approval-workflow.md @@ -0,0 +1,252 @@ +--- +id: RAILIANCE-WP-0007 +type: workplan +title: "Credential Change Proposal Review Workflow" +domain: financials +repo: railiance-platform +status: ready +owner: codex +topic_slug: railiance +planning_priority: high +planning_order: 7 +created: "2026-06-27" +updated: "2026-06-27" +depends_on_workplans: + - RAIL-PL-WP-0002 + - RAILIANCE-WP-0005 + - RAILIANCE-WP-0006 +state_hub_workstream_id: "4d7ce243-f40a-4249-a46a-a24f75d6fe4c" +--- + +# RAILIANCE-WP-0007 - Credential Change Proposal Review Workflow + +## Goal + +Create a proposal -> review -> approve/deny with comment -> apply -> verify +workflow for credential and it-sec changes, so operators do not need to author +or mentally validate raw OpenBao commands. + +The first target is the whynot-design npm token lane from `RAILIANCE-WP-0006`. +The workflow should then generalize to workload KV paths, OpenBao token roles, +ops-warden access catalog entries, External Secrets lanes, credential rotation, +deactivation, and compromise handling. + +## Direction + +Do not start by extending OpenBao. Instead, build a small approval control +plane around OpenBao: + +- OpenBao remains the enforcement, secret storage, token, and audit engine. +- State Hub stores non-secret request lifecycle, comments, decisions, and + evidence. +- Repo files store reviewable non-secret request specs and generated policy + artifacts. +- Agents and CLIs create proposals and render them for human review. +- Humans approve or deny with comments. +- Only approved requests can be applied by an operator-controlled runner or + interactive runbook. 
+ +If the workflow proves valuable, a later UI or OpenBao extension can surface the +same request index and statuses. + +## Proposed Object + +Introduce a non-secret Credential Change Request, or `CCR`. + +Each CCR captures: + +- request id, title, requester, reviewer, approver, and applier; +- target tenant/workload/environment/purpose; +- OpenBao mount, path, fields, policies, auth roles, and bound claims; +- access front door such as ops-warden, External Secrets, CSI, or direct caller + fetch; +- risk classification and approval requirements; +- generated apply plan and verification plan; +- rollback, deactivate, rotate, and compromise response plan; +- comments, decision, timestamps, and non-secret audit evidence. + +Each CCR explicitly excludes secret values, token values, private keys, +passwords, unseal/recovery material, and secret-bearing command output. + +## Tasks + +## T01 - Record the approval workflow design + +```task +id: RAILIANCE-WP-0007-T01 +status: done +priority: high +state_hub_task_id: "c82ee783-80f1-48da-a9ed-4565eac699fc" +``` + +Document the desired operator workflow and why it should sit around OpenBao +rather than inside the OpenBao UI initially. + +Acceptance: + +- The design describes the proposal, review, approval/denial, apply, verify, + activate, deactivate, rotate, and compromised states. +- The design names where State Hub, OpenBao, ops-warden, repo files, agents, + and interactive runbooks fit. +- The design keeps secret values out of State Hub, Git, chat, and prompts. + +**2026-06-27:** Added `docs/credential-change-approval.md` with the control +plane direction, CCR object, state machine, State Hub/OpenBao/ops-warden roles, +interactive runbook role, and compromise/deactivation path. 
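
The fields listed under Proposed Object could be modeled roughly as follows. This is only a sketch under assumptions: the class name, attribute names, and example values are hypothetical, and the real schema is left to T02. Note that `fields` holds field names only, never values.

```python
from dataclasses import dataclass, field


@dataclass
class CredentialChangeRequest:
    """Non-secret CCR record (hypothetical attribute names)."""

    request_id: str
    title: str
    requester: str
    mount: str
    path: str
    fields: list[str]             # field NAMES only, never secret values
    policy: str
    auth_role: str
    bound_claims: dict[str, str]
    front_door: str               # e.g. "ops-warden" or "external-secrets"
    status: str = "requested"
    comments: list[str] = field(default_factory=list)


def review_summary(ccr: CredentialChangeRequest) -> str:
    """Render a compact, non-secret review block for chat or State Hub."""
    lines = [
        f"CCR {ccr.request_id}: {ccr.title} [{ccr.status}]",
        f"  requester:  {ccr.requester}",
        f"  target:     {ccr.mount}/{ccr.path} fields={','.join(ccr.fields)}",
        f"  policy:     {ccr.policy}",
        f"  auth role:  {ccr.auth_role} claims={ccr.bound_claims}",
        f"  front door: {ccr.front_door}",
    ]
    return "\n".join(lines)
```

A caller would build one record per request, with placeholder values such as `mount="kv-example"` and `path="workloads/example/npm"`, and paste the output of `review_summary` into the review thread.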
+ +## T02 - Define the CCR schema and storage layout + +```task +id: RAILIANCE-WP-0007-T02 +status: todo +priority: high +state_hub_task_id: "d50fb9e2-68c2-4a2b-8476-ce646d13e60a" +``` + +Create a versioned non-secret schema for credential change requests. + +Acceptance: + +- A schema exists for `workload-kv-read` requests covering mount, path, fields, + policy name, auth role, bound claims, access front door, verification plan, + and activation conditions. +- The schema supports decision metadata: requested, proposed, approved, + denied, needs_changes, applied, verified, active, deactivated, rotated, + compromised, superseded, and cancelled. +- The schema supports comments and references State Hub ids without storing + secrets. +- Example CCR fixtures include the whynot-design npm token lane. + +## T03 - Add offline validation and rendering + +```task +id: RAILIANCE-WP-0007-T03 +status: todo +priority: high +state_hub_task_id: "012f05cd-30ce-43dd-802b-4acc938db133" +``` + +Add a helper that validates CCR files and renders human review summaries. + +Acceptance: + +- Invalid CCRs fail before any OpenBao apply is attempted. +- The renderer produces a compact review block that a human can understand in + chat or State Hub. +- The renderer highlights risky fields: broad claims, wildcard paths, + privileged policies, missing negative verification, and missing deactivation + plan. +- A secret-pattern scan rejects likely token values in CCR files. + +## T04 - Generate OpenBao apply plans from approved CCRs + +```task +id: RAILIANCE-WP-0007-T04 +status: todo +priority: high +state_hub_task_id: "1b2e7752-815c-46f8-a2e2-212e8d04da80" +``` + +Generate deterministic, reviewable OpenBao apply plans from CCRs. + +Acceptance: + +- A workload KV CCR can generate policy HCL and auth-role commands or API + payloads. +- The plan includes a dry-run mode and a diff against existing source + artifacts when available. +- Applying a plan is refused unless the CCR is approved. 
+- The applier uses an approved operator authority path and does not accept raw + tokens in argv or logs. + +## T05 - Add chat/CLI approval commands + +```task +id: RAILIANCE-WP-0007-T05 +status: todo +priority: high +state_hub_task_id: "e6d4d2d1-1881-4db7-92f8-05e3fdb846ae" +``` + +Make the workflow usable from chat and command line. + +Acceptance: + +- Operators can approve, deny, or request changes with a comment. +- Approvals/denials are recorded as non-secret State Hub events and in the CCR + file or linked decision record. +- The system refuses apply when the latest human decision is denied or + needs_changes. +- Agents can propose changes and respond to review comments without receiving + secret values. + +## T06 - Build an interactive runbook for apply and verify + +```task +id: RAILIANCE-WP-0007-T06 +status: todo +priority: high +state_hub_task_id: "3c3fc38c-afa4-4367-b3e6-ba4b286ced30" +``` + +Wrap privileged application in an operator-friendly guided runbook. + +Acceptance: + +- The runbook loads an approved CCR, shows the plan, asks for final attended + confirmation, then applies policy/auth metadata. +- Secret value entry is handled through an approved OpenBao/operator path and + is never echoed or logged. +- Positive and negative verification steps are guided. +- Non-secret evidence is recorded automatically. + +## T07 - Pilot with whynot-design and ops-warden + +```task +id: RAILIANCE-WP-0007-T07 +status: todo +priority: high +state_hub_task_id: "07a7d8bf-5528-41c8-a791-d6ccd0466a33" +``` + +Use the existing whynot-design npm token lane as the first end-to-end pilot. + +Acceptance: + +- The current whynot-design lane is represented as a CCR. +- The CCR is rendered and reviewed in chat or State Hub. +- A human approval or denial comment is recorded. +- If approved, the runbook applies the policy/auth metadata, guides secret + provisioning, verifies access, and notifies ops-warden. +- ops-warden activates its catalog entry only after CCR verification. 
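
The risk highlighting asked for in T03 could look roughly like this. It is a minimal sketch, assuming a CCR has been loaded into a plain dictionary; the key names (`path`, `bound_claims`, `verification`, `deactivation_plan`) and the `risk_flags` function are hypothetical, and the privileged-policy check is omitted because the source does not say how such policies are identified.

```python
def risk_flags(ccr: dict) -> list[str]:
    """Return human-readable risk flags for a CCR dictionary."""
    flags = []

    # A wildcard anywhere in the KV path widens the read lane.
    if "*" in ccr.get("path", ""):
        flags.append("wildcard path")

    # Missing claims, or claims containing a wildcard, bind too loosely.
    claims = ccr.get("bound_claims") or {}
    if not claims or any("*" in str(value) for value in claims.values()):
        flags.append("broad claims")

    # A lane should prove that unauthorized callers are denied.
    verification = ccr.get("verification") or {}
    if not verification.get("negative"):
        flags.append("missing negative verification")

    # Every lane needs a documented way to turn it off again.
    if not ccr.get("deactivation_plan"):
        flags.append("missing deactivation plan")

    return flags
```

An empty result means none of these checks fired; it does not mean the request is safe, so the rendered summary should still go to a human reviewer.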
+ +## T08 - Add deactivation, rotation, and compromise flows + +```task +id: RAILIANCE-WP-0007-T08 +status: todo +priority: medium +state_hub_task_id: "23d6ef9d-8dbc-4468-b486-5ec8ada71130" +``` + +Support lifecycle states beyond initial creation. + +Acceptance: + +- Existing credentials can be imported as CCR-backed inventory without secret + values. +- Operators can mark a lane deactivated, rotated, or compromised with reason + and evidence. +- Deactivation disables the relevant access front door and auth/policy path. +- Compromise flow records blast-radius notes and required follow-up tasks. + +## Exit Criteria + +- A human can review and approve or deny a credential/security change without + writing raw OpenBao commands. +- An approved request can be applied by an operator-controlled helper or + interactive runbook. +- State Hub and repo artifacts contain non-secret lifecycle, decision, and + evidence records. +- OpenBao remains the enforcement and audit source for actual secret access. +- The whynot-design npm token lane can complete through this workflow.
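
The lifecycle states named in T02 and T08 could be enforced with a small transition table. The status names below come from the T02 list, but the allowed transitions are an assumption made for illustration; the workplan does not define them, and `ALLOWED` and `transition` are hypothetical names.

```python
# Assumed transition table: status -> statuses it may move to next.
ALLOWED: dict[str, set[str]] = {
    "requested": {"proposed", "cancelled"},
    "proposed": {"approved", "denied", "needs_changes", "cancelled"},
    "needs_changes": {"proposed", "cancelled"},
    "approved": {"applied", "cancelled"},
    "applied": {"verified"},
    "verified": {"active"},
    "active": {"deactivated", "rotated", "compromised", "superseded"},
    "rotated": {"active", "deactivated", "compromised"},
    "compromised": {"deactivated", "rotated"},
    # Terminal states in this sketch.
    "denied": set(),
    "deactivated": set(),
    "superseded": set(),
    "cancelled": set(),
}


def transition(current: str, target: str) -> str:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"cannot move CCR from {current!r} to {target!r}")
    return target
```

Keeping the table as data makes it easy to review and change alongside the schema, and it gives the apply helper one place to refuse moves such as `denied` to `applied`.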