tegwick eb24e04b71 Correct whynot credential tenant path

2026-06-28 01:00:12 +02:00

Credential Change Approval Workflow

This document sketches the operator workflow we want for it-sec and credential changes. The goal is to remove raw OpenBao command authoring from routine human operation while preserving explicit human approval, auditability, and safe handling of secret values.

Problem

The current workflow still asks operators to translate a reviewed intent into OpenBao commands by hand:

create or update policies;
create auth roles with the right bound claims;
create or rotate secret paths and fields;
verify positive and negative access;
tell ops-warden or another access front door when a lane may become active.

That is inefficient and easy to get wrong. It is also hard to review because the actual unit of work is spread across chat, workplans, OpenBao UI screens, State Hub notes, and shell commands.

Direction

Treat OpenBao as the enforcement and audit engine, not the primary review UI. Add a small approval control plane in front of it:

an agent or CLI creates a structured, non-secret credential change request;
humans review the rendered proposal, risk notes, generated OpenBao plan, and verification plan;
a human approves or denies with a comment;
only approved requests can be applied by an operator-controlled helper;
the helper records non-secret evidence and marks the request active, rejected, deactivated, rotated, or compromised.

This can be implemented with repo files, State Hub, and CLI/chat integration first. An OpenBao UI extension can come later if the workflow proves itself.

Core Object

The canonical unit is a credential change request, abbreviated CCR.

The CCR must be non-secret. It may contain:

stable request id and title;
requester, reviewer, approver, and applier identities;
target domain, tenant, workload, environment, and purpose;
OpenBao mount, path, field names, policy names, and auth role names;
exact non-secret policy HCL or generated policy references;
proposed auth bindings and bound claims;
delivery surface such as ops-warden, External Secrets, CSI, or direct caller fetch;
machine-readable front-door status, including the readiness and resolvable fields;
risk classification and approval requirements;
generated apply plan;
verification plan;
rollback, deactivate, rotate, and compromise response plan;
comments, approvals, denials, and timestamps;
non-secret OpenBao audit request ids or timestamps after execution.

It must not contain:

secret values;
wrapped token values;
root, platform-admin, or issuer tokens;
passwords, API keys, private keys, OTP seeds, unseal shares, or recovery codes;
command output that includes secret values.
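
One way to back the non-secret rule with an offline check is sketched below. It is a minimal illustration, not the project's validator: the deny-list pattern, the function name `find_forbidden_keys`, and the example CCR keys are all assumptions, since this document does not define the CCR schema. A key-name check of this kind only catches obvious mistakes and cannot detect a secret value stored under an innocent-looking key.

```python
import re

# Hypothetical deny-list of key names; the real CCR schema is not given here.
FORBIDDEN_KEY_PATTERN = re.compile(
    r"(secret_value|token_value|password|api_key|private_key"
    r"|otp_seed|unseal|recovery_code)",
    re.IGNORECASE,
)


def find_forbidden_keys(node, prefix=""):
    """Return dotted paths of keys whose names suggest secret material."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            if FORBIDDEN_KEY_PATTERN.search(str(key)):
                hits.append(path)
            # Recurse so nested objects and lists are checked too.
            hits.extend(find_forbidden_keys(value, path))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_forbidden_keys(item, f"{prefix}[{index}]"))
    return hits


# Field *names* such as NPM_AUTH_TOKEN are allowed; they are values here, not keys.
clean = {"id": "CCR-2026-0001", "fields": ["NPM_AUTH_TOKEN"]}
dirty = {"delivery": {"surface": "ops-warden", "password": "not-allowed"}}
print(find_forbidden_keys(clean))  # []
print(find_forbidden_keys(dirty))  # ['delivery.password']
```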

State Machine

Suggested states:

draft
proposed
needs_changes
approved
denied
apply_pending
applied
verified
active
deactivated
rotated
compromised
superseded
cancelled

Only approved requests may be applied. Only verified requests may become active.
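
The two rules above can be enforced with a small transition table. The sketch below is illustrative only: the document lists the states and states just those two rules, so every other edge in `ALLOWED` is an assumption made for the example, as is the function name `transition`.

```python
# Assumed transition table. Only two constraints come from this document:
# apply_pending is reachable only from approved, and active only from verified.
ALLOWED = {
    "draft": {"proposed", "cancelled"},
    "proposed": {"approved", "denied", "needs_changes", "cancelled"},
    "needs_changes": {"proposed", "cancelled"},
    "approved": {"apply_pending", "superseded", "cancelled"},
    "apply_pending": {"applied"},
    "applied": {"verified"},
    "verified": {"active"},
    "active": {"deactivated", "rotated", "compromised", "superseded"},
    "deactivated": {"compromised", "superseded"},
    "rotated": {"verified", "compromised"},  # assumed: re-verify after rotation
    "denied": set(),
    "compromised": set(),
    "superseded": set(),
    "cancelled": set(),
}


def transition(current, target):
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"transition {current} -> {target} is not allowed")
    return target


state = transition("approved", "apply_pending")
state = transition(state, "applied")
state = transition(state, "verified")
state = transition(state, "active")
print(state)  # active
```

Keeping the table as data makes the two stated rules easy to audit: each is a single lookup of which states list the target.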

Emergency break-glass work may create a request after the fact, but it must be marked as break-glass, reviewed retrospectively, and linked to audit evidence.

Review Surface

A reviewer should see a concise rendered proposal:

Request: whynot-design npm publish token lane
Type: workload-kv-read
Mount/path/field:
  platform/workloads/coulomb/whynot-design/npm-publish
  NPM_AUTH_TOKEN
Policy:
  workload-kv-read-whynot-design-npm-publish
Auth binding:
  netkingdom OIDC role whynot-design-workload-kv-read
  bound claim: groups includes whynot-design
Access front door:
  ops-warden whynot-design-npm-publish
  readiness: template
  resolvable: false
Risk:
  grants read access to npm publish credential
Checks:
  positive whynot fetch, negative non-whynot denial, OpenBao audit evidence
Decision:
  approve | deny | needs changes
Comment:
  free text

The reviewer should not need to know the exact bao write syntax. They should be able to discuss the proposal in chat, request changes, and then make a formal decision.
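
A renderer for a summary like the one above could be as small as the following sketch. The dictionary keys (`title`, `type`, `path`, and so on) and the function name `render_summary` are hypothetical; the actual CCR file layout and the output of make credential-change-render are not specified in this document. The example values are taken from the proposal shown above.

```python
def render_summary(ccr):
    """Render a non-secret CCR dict as a plain-text review summary."""
    lines = [
        f"Request: {ccr['title']}",
        f"Type: {ccr['type']}",
        "Mount/path/field:",
        f"  {ccr['path']}",
        f"  {ccr['field']}",
        "Policy:",
        f"  {ccr['policy']}",
        "Access front door:",
        f"  {ccr['front_door']}",
        f"  readiness: {ccr['readiness']}",
        # Lowercase booleans to match the rendered example (true/false).
        f"  resolvable: {str(ccr['resolvable']).lower()}",
        "Decision:",
        "  approve | deny | needs changes",
    ]
    return "\n".join(lines)


example = {
    "title": "whynot-design npm publish token lane",
    "type": "workload-kv-read",
    "path": "platform/workloads/coulomb/whynot-design/npm-publish",
    "field": "NPM_AUTH_TOKEN",
    "policy": "workload-kv-read-whynot-design-npm-publish",
    "front_door": "ops-warden whynot-design-npm-publish",
    "readiness": "template",
    "resolvable": False,
}
print(render_summary(example))
```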

Minimal Implementation

Version 1 should be boring:

store CCR files under credential-change-requests/;
validate CCR schema offline;
render a human-readable review summary;
generate OpenBao apply plans from approved CCRs;
require an approval record before apply;
apply only non-secret policy/auth/path metadata;
prompt or delegate separately for secret value entry;
record non-secret evidence in State Hub.

The first implemented CLI slice is:

make credential-change-validate
make credential-change-render CREDENTIAL_CHANGE=CCR-2026-0001
make credential-change-plan CREDENTIAL_CHANGE=CCR-2026-0001
make credential-change-status CREDENTIAL_CHANGE=CCR-2026-0001
make credential-change-status-json CREDENTIAL_CHANGE=CCR-2026-0001
scripts/credential-change.py confirm-binding CCR-2026-0001 --reviewer <name> --comment "..."
scripts/credential-change.py approve CCR-2026-0001 --reviewer <name> --comment "..."
scripts/credential-change.py deny CCR-2026-0001 --reviewer <name> --comment "..."
scripts/credential-change.py needs-changes CCR-2026-0001 --reviewer <name> --comment "..."
make credential-change-sync-decision CREDENTIAL_CHANGE=CCR-2026-0001
make credential-change-apply-plan CREDENTIAL_CHANGE=CCR-2026-0001
make credential-change-operator-commands CREDENTIAL_CHANGE=CCR-2026-0001

apply-plan and operator-commands are intentionally guarded: they refuse anything not approved and refuse unconfirmed auth bindings. operator-commands renders the reviewed non-secret bao policy write and bao write auth/.../role commands for a platform operator; the actual secret value is still provisioned through approved OpenBao/operator custody only.
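
The guard described above amounts to two checks before any plan is rendered. The sketch below shows the idea under stated assumptions: the keys `state` and `binding_confirmed` and the function name `apply_refusals` are invented for illustration and may not match how scripts/credential-change.py stores or checks this.

```python
def apply_refusals(ccr):
    """Return the reasons an apply plan must be refused; empty means proceed."""
    reasons = []
    # Hypothetical key: the CCR's current state-machine state.
    if ccr.get("state") != "approved":
        reasons.append(f"state is {ccr.get('state')!r}, not 'approved'")
    # Hypothetical key: set once a reviewer has run confirm-binding.
    if not ccr.get("binding_confirmed", False):
        reasons.append("auth binding has not been confirmed")
    return reasons


print(apply_refusals({"state": "proposed", "binding_confirmed": True}))
print(apply_refusals({"state": "approved", "binding_confirmed": True}))  # []
```

Returning every reason rather than stopping at the first lets the operator fix all blockers in one pass.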

The same operations can be exposed through chat by having the agent create the proposal, show the rendered summary, then call the CLI only after the human gives an explicit approval phrase.

State Hub Role

State Hub should hold:

request lifecycle events;
review comments;
approval/denial decisions;
non-secret apply and verification evidence;
links to workplans and CCR files.

State Hub should not hold secret values. It can be the first review UI because it already supports messages, progress, task status, and cross-repo coordination.

For CCR review, create a pending State Hub decision that links to the CCR and contains only non-secret coordinates. Operators can inspect it in the dashboard at http://127.0.0.1:3000/decisions and resolve it with a rationale beginning with APPROVE:, DENY:, or NEEDS_CHANGES:. Then run make credential-change-sync-decision CREDENTIAL_CHANGE=<CCR> to copy the resolved decision back into the CCR file-backed state.
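
Mapping a resolved rationale back to a CCR state could look like the sketch below. The three prefixes come from the paragraph above; the function name `parse_rationale`, the case-sensitive match, and the choice to reject anything else are assumptions, not a description of what credential-change-sync-decision actually does.

```python
# Prefixes from the document, mapped to the state names listed under State Machine.
PREFIX_TO_STATE = {
    "APPROVE:": "approved",
    "DENY:": "denied",
    "NEEDS_CHANGES:": "needs_changes",
}


def parse_rationale(rationale):
    """Split a decision rationale into (state, comment); raise if unrecognised."""
    text = rationale.strip()
    for prefix, state in PREFIX_TO_STATE.items():
        if text.startswith(prefix):
            return state, text[len(prefix):].strip()
    # Assumed behaviour: refuse to guess rather than default to any decision.
    raise ValueError("rationale must begin with APPROVE:, DENY:, or NEEDS_CHANGES:")


print(parse_rationale("APPROVE: bound claims look right"))
# ('approved', 'bound claims look right')
```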

OpenBao Role

OpenBao remains authoritative for:

policy enforcement;
auth method configuration;
token issuance and revocation;
secret storage;
audit logs.

Where OpenBao supports non-secret metadata on secret paths or auth roles, we can mirror CCR ids and status labels. The workflow must not depend on OpenBao being the only index, because operators need to see proposed, rejected, deactivated, rotated, and compromised items across repos and access front doors.

ops-warden Role

ops-warden should consume only approved and active access lanes.

For draft requests, ops-warden may create a draft catalog entry that points to the CCR, but it should not activate the entry until the CCR is verified.

For warden access --fetch / --exec, the catalog should include the CCR id and refuse active use when the CCR state is not active, readiness is not ready, or resolvable is not true.
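
The three refusal conditions can be expressed as one gate. This is a sketch only: the catalog entry keys (`ccr_state`, `readiness`, `resolvable`) and the function name `lane_refusals` are hypothetical and do not describe the real ops-warden catalog format.

```python
def lane_refusals(entry):
    """Return the reasons active use must be refused; empty means allowed."""
    reasons = []
    if entry.get("ccr_state") != "active":
        reasons.append("CCR state is not active")
    if entry.get("readiness") != "ready":
        reasons.append("readiness is not ready")
    # Require the boolean True, so a string such as "true" does not pass.
    if entry.get("resolvable") is not True:
        reasons.append("resolvable is not true")
    return reasons


# Values from the review example earlier in this document: still a template.
pilot = {"ccr_state": "proposed", "readiness": "template", "resolvable": False}
print(lane_refusals(pilot))
```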

Interactive Runbook Role

The interactive runbook is the operator bridge:

load a CCR;
show the rendered summary and exact generated plan;
confirm the request is approved;
acquire operator authority through an approved path;
apply the plan;
ask for attended secret entry when needed;
run positive and negative verification;
record non-secret evidence;
notify downstream front doors such as ops-warden.

This lets operators safely drive privileged work without needing to remember every OpenBao command.
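
The runbook's ordering can be sketched as a list of named steps that halts on the first failure and records only step names and outcomes. Everything here is illustrative: `run_steps` and the placeholder lambdas are assumptions, and a real runbook would call the guarded helper and OpenBao rather than return constants.

```python
def run_steps(steps):
    """Run (name, action) pairs in order; stop at the first failing step."""
    evidence = []
    for name, action in steps:
        ok = bool(action())
        # Record only the step name and outcome, never command output.
        evidence.append({"step": name, "ok": ok})
        if not ok:
            break
    return evidence


# Placeholder actions standing in for the real operations.
steps = [
    ("confirm approved", lambda: True),
    ("apply plan", lambda: True),
    ("positive verification", lambda: False),  # simulated failure
    ("notify ops-warden", lambda: True),
]
for record in run_steps(steps):
    print(record)
```

Because the loop stops at the failed verification, the downstream front door is never notified about a lane that did not verify.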

Compromise And Deactivation

Every active CCR needs a deactivate and rotate path:

deactivated: access intentionally disabled but not necessarily compromised;
rotated: secret value replaced and old value no longer valid;
compromised: emergency state requiring immediate disablement, rotation, blast-radius notes, and incident follow-up.

The workflow must support marking an existing credential or lane as compromised even when the original request predates this system.

Near-Term Target

Use the whynot-design npm token lane as the pilot:

encode the existing non-secret lane as a CCR;
render it for review;
approve or request changes from chat;
generate/apply the OpenBao policy and auth role only after approval;
provision the secret value by attended operator custody;
verify and activate the ops-warden catalog entry.

Once that path has proven itself on the pilot, reuse it for the sibling workload-KV lanes and the credential broker's OpenBao token-role gates.
