
tegwick 7b211acd57 Add OpenBao runtime secret authority; complete NK-WP-0006/0007/0008

Refine the recursive platform security architecture to make OpenBao the
canonical runtime secret authority, with SOPS/age, K8s Secrets, and the
emergency bundle reframed as bootstrap/delivery/break-glass mechanisms.

- credential-management standard v0.2: add OpenBao runtime authority
  section, rotation rules, and prohibited patterns (OpenBao-as-PDP,
  tenant platform-root)
- platform-identity-security-architecture: mark implemented; add
  flex-auth/Topaz implications, Coulomb onboarding path, and a
  production-readiness checklist
- NK-WP-0004/0005: document bootstrap-to-OpenBao handoff boundary
- NK-WP-0006/0007: status -> done with implementation reviews; add
  recursive platform/tenant split and OpenBao broker/audit role for
  object-storage STS vending
- NK-WP-0008: status -> done; repoint corpus to infospace-bench
- new ADR-0007 (orchestration boundary), ADR-0008 (STS vending
  boundary), and the object-storage STS credential-vending architecture

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

2026-05-20 22:51:20 +02:00


id	type	title	domain	repo	status	owner	topic_slug	created	updated	depends_on	state_hub_workstream_id
NK-WP-0005	workplan	Agent-Driven Credential Bootstrap — Zero Human Ops	netkingdom	net-kingdom	done	custodian	netkingdom	2026-03-21	2026-05-18	NK-WP-0004	75bc472b-cc0a-48f2-afb6-62b896f7cc19

Agent-Driven Credential Bootstrap — Zero Human Ops

Problem

NK-WP-0004 built the right tooling but the wrong operator model. It was designed around a human-as-operator workflow:

Human runs gen-secrets.sh
Human manually types every secret into KeePassXC
Human confirms via keepass_confirmed: true
Human runs creds-apply

This is the wrong interface. You delegated the security setup. Being told "go open KeePassXC and type in 23 fields" is not delegation — it is manual labour with extra ceremony.

Goal

The agent owns the full credential lifecycle end-to-end. The only human touchpoint is receiving the emergency credential bundle — a minimal set of master keys for break-glass recovery — and storing it once in a personal password store.

Agent                                       Human
  │                                           │
  ├── generate all secrets                    │
  ├── encrypt via SOPS/age → commit           │
  ├── inject into cluster (kubectl)           │
  ├── verify all K8s secrets live             │
  ├── create age-encrypted ops bundle         │
  ├── assemble emergency bundle ─────────────►│ store in personal password manager
  │                                           │ (one-time, nothing else ever)
  └── mark state complete                     │

What the human stores (and nothing more):

Item	Why needed
age private key	Decrypt any SOPS-encrypted secret from git
break-glass passwords (3-4)	Direct service access if cluster/auth is down
ops bundle passphrase	Decrypt point-in-time secret snapshot

Everything else — service secrets, rotation, re-injection — is agent work.

NK-WP-0006 Runtime Secret Refinement

With OpenBao in the platform stack, the agent-driven bootstrap is the handoff mechanism from bootstrap secrets to runtime secret authority. The agent may generate, encrypt, inject, and verify initial secrets, but OpenBao becomes the normal authority for platform and workload secret delivery once the control plane is alive.

The bootstrap flow therefore has one additional boundary:

SOPS/age and the emergency bundle establish bootstrap and recovery authority.
Kubernetes Secrets carry the minimum initial material needed to start the identity, MFA, database, and OpenBao platform services.
OpenBao is initialized and unsealed (or auto-unsealed) by the approved mechanism; audit logging is enabled, backups are verified, and workload auth methods are configured.
Runtime workloads receive scoped secrets, dynamic credentials, or synchronized Kubernetes Secrets from OpenBao. They do not consume platform-root bootstrap material.

OpenBao root tokens, unseal keys, and recovery keys are break-glass material. They must not be stored as ordinary tenant secrets or exposed to tenant administrators. If they are included in an emergency bundle, that bundle is platform-control-plane break-glass material and requires the strongest storage and review procedure available for the deployment.

Design

What changes from NK-WP-0004

NK-WP-0004	NK-WP-0005
Human runs `make creds-generate`	Agent runs bootstrap automatically
Human enters secrets in KeePassXC	No KeePassXC in the operational path
`keepass_confirmed: false` gate	`emergency_bundle_delivered: false` gate
`/creds-bootstrap` skill = guided walkthrough	`/creds-init` skill = autonomous execution
Ops bundle created manually	Ops bundle created automatically
Rotation triggered manually	Rotation can be triggered by agent

KeePassXC role

KeePassXC is removed from the primary workflow. It becomes optional personal infrastructure: importing the emergency bundle into KeePassXC is your choice, but it is neither required nor assumed.

The age private key and SOPS-encrypted git files ARE the credential store. The ops bundle IS the backup. The emergency bundle IS the human's key ring.

What the agent cannot automate (genuine human gates)

Confirming receipt of the emergency bundle — the agent must not mark the bootstrap complete until the human confirms they have stored the bundle. This is a deliberate pause, not a workaround.
The privacyIDEA enckey bootstrap — must happen while the pod is live (time-sensitive window). The agent can detect when the pod is ready and run the step automatically, but must verify the human is present or at least that the operation succeeded.
Initial age keypair generation — if no age key exists yet, the agent generates it and the private key is the first thing that goes into the emergency bundle.
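
The second gate can be sketched in shell as a wait-then-run wrapper. This is a minimal sketch under assumptions: the helper name `bootstrap_enckey_when_ready`, the namespace, and the label selector are hypothetical, and `KUBECTL` is made overridable only so the function can be exercised without a cluster. The 5 minute timeout mirrors the limit given in the T02 flow.

```shell
# Sketch only: helper name, namespace, and selector are illustrative assumptions.
KUBECTL="${KUBECTL:-kubectl}"

# bootstrap_enckey_when_ready <namespace> <label-selector> <command...>
# Waits up to 5 minutes for a matching pod to be Ready, then runs the given
# command and reports failure of either part instead of continuing silently.
bootstrap_enckey_when_ready() {
  local namespace="$1" selector="$2"
  shift 2
  "$KUBECTL" wait --for=condition=Ready pod -l "$selector" -n "$namespace" --timeout=300s \
    || { echo "privacyIDEA pod not Ready within 5 minutes" >&2; return 1; }
  "$@" || { echo "enckey bootstrap failed" >&2; return 1; }
  echo "enckey bootstrap succeeded"
}
```

A call might look like `bootstrap_enckey_when_ready <namespace> <label-selector> bash enckey-bootstrap.sh`, with the deployment's real namespace and selector substituted. Returning non-zero on either failure gives the agent a clear signal to stop and involve the human.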

Tasks

T01 — Redesign creds-state.yaml for agent mode

id: NK-WP-0005-T01
status: done
priority: high
state_hub_task_id: "6748cf8d-a7c7-47a2-b32a-2e26e05c4cba"

Replace the human-confirmation model with an agent-progress model.

New schema (sso-mfa/bootstrap/creds-state.yaml):

# Credential state — net-kingdom SSO/MFA stack
# Safe to commit. Contains no secrets. Updated by agent.
schema_version: 2
agent_mode: true               # NK-WP-0005: fully automated

# Phase tracking
age_key_present: false         # ~/.config/sops/age/key.txt exists
secrets_generated: false       # gen-secrets.sh ran successfully
ops_bundle_created: false      # age-encrypted bundle created
ops_bundle_location: null      # path or storage hint

# Emergency bundle
emergency_bundle_delivered: false   # human confirmed receipt
emergency_bundle_delivered_at: null

# Cluster injection (per-component)
secrets_applied:
  postgres:    false
  lldap:       false
  authelia:    false
  privacyidea: false
  keycape:     false

# Post-apply bootstrap (agent-run when pod is Ready)
enckey_bootstrapped:  false
pi_admin_created:     false

# Derived: all true → bootstrap complete
bootstrap_complete: false

Remove keepass_confirmed and generated_at (now in git history). Add schema_version: 2 so scripts can detect which model they are running.
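
One way to compute the derived `bootstrap_complete` value is a line-oriented scan of the state file. The sketch below assumes the file keeps the flat `key: true|false` layout shown in the schema, so no YAML parser is used; the function names `pending_flags` and `is_bootstrap_complete` are hypothetical and not taken from the repository.

```shell
# Sketch only: assumes the flat "key: value" layout of creds-state.yaml.

# pending_flags <state-file>
# Prints every key whose value is still "false", one per line.
# bootstrap_complete is skipped because it is the derived value itself.
pending_flags() {
  awk '
    /^[ \t]*#/ { next }                      # skip full-line comments
    {
      line = $0
      sub(/[ \t]*#.*$/, "", line)            # strip trailing comments
      n = split(line, kv, ":")
      if (n < 2) next
      key = kv[1]; val = kv[2]
      gsub(/[ \t]/, "", key)
      gsub(/[ \t]/, "", val)
      if (key != "bootstrap_complete" && val == "false") print key
    }
  ' "$1"
}

# is_bootstrap_complete <state-file>
# Succeeds (exit 0) only when no flag is pending.
is_bootstrap_complete() {
  [ -z "$(pending_flags "$1")" ]
}
```

Listing the pending keys, rather than returning only a yes/no answer, also gives a status command something concrete to print.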

T02 — Agent bootstrap script: `creds-bootstrap-agent.sh`

id: NK-WP-0005-T02
status: done
priority: high
state_hub_task_id: "22940c39-8645-40e1-b947-17e85ea6d902"

Create sso-mfa/bootstrap/creds-bootstrap-agent.sh — the single entrypoint for fully automated credential bootstrap.

Flow:

1. pre-flight
   ├── check kubectl / KUBECONFIG reachable
   ├── check SOPS age key (generate if missing → add to emergency bundle)
   └── check cluster is healthy (kubectl get nodes)

2. generate
   └── run gen-secrets.sh ./secrets

3. encrypt + commit
   ├── sops --encrypt each secrets env file → sso-mfa/encrypted/
   └── git add + git commit "chore(creds): encrypted secrets [agent]"

4. inject
   └── run creds-apply.sh (existing — postgres → lldap → authelia → privacyidea)

5. verify
   └── run creds-verify.sh — exit if any K8s secret missing

6. post-apply bootstrap (waits for pods)
   ├── wait for privacyIDEA pod Ready (max 5 min)
   ├── run enckey-bootstrap.sh
   ├── run bootstrap-admin.sh → capture PI_ADMIN_TOKEN
   ├── inject keycape secrets (now PI_ADMIN_TOKEN is known)
   └── run creds-verify.sh again — all components now

7. ops bundle
   └── run pack-bundle.sh ./secrets <age-pub-key>

8. emergency bundle
   ├── assemble emergency-bundle.txt (see T03)
   ├── display to terminal with clear formatting
   └── prompt for confirmation and wait (see T03 confirmation gate)

9. cleanup
   ├── shred secrets/ plaintext
   └── write creds-state.yaml: bootstrap_complete: true

10. update state
    └── state-hub: mark NK-WP-0005 tasks done via API
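
Step 3 of the flow above (encrypt each env file before committing) might look like the following. This is a sketch, not the script's actual contents: the `.env` input extension, the `.enc.env` output naming, and the helper name `encrypt_env_files` are assumptions, and `SOPS` is overridable only so the loop can be tested without the real binary.

```shell
# Sketch only: file naming and helper name are illustrative assumptions.
SOPS="${SOPS:-sops}"

# encrypt_env_files <plaintext-dir> <output-dir> <age-recipient>
# Encrypts every *.env file to the given age recipient and writes the result
# into the output directory. Stops at the first failure.
encrypt_env_files() {
  local src_dir="$1" out_dir="$2" recipient="$3" f name
  mkdir -p "$out_dir"
  for f in "$src_dir"/*.env; do
    [ -e "$f" ] || continue                  # no matches: glob stays literal
    name="$(basename "$f" .env)"
    "$SOPS" --encrypt --age "$recipient" \
      --input-type dotenv --output-type dotenv "$f" \
      > "$out_dir/$name.enc.env" || return 1
  done
}
```

A call such as `encrypt_env_files ./secrets sso-mfa/encrypted <age-pub-key>` would then be followed by the `git add` and `git commit` from the flow.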

Error handling: Each phase updates creds-state.yaml so a restart resumes from where it left off (idempotent re-runs skip completed phases).
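
The resume behaviour can be sketched as a small phase wrapper keyed on the state file. The helper names `flag_is_true`, `set_flag`, and `run_phase` are hypothetical, and the sketch assumes top-level `key: value` lines; note that rewriting a line this way drops any trailing comment on it.

```shell
# Sketch only: helper names are illustrative, not taken from the repository.
STATE_FILE="${STATE_FILE:-sso-mfa/bootstrap/creds-state.yaml}"

# flag_is_true <key>: succeeds when "<key>: true" is recorded.
flag_is_true() {
  grep -Eq "^[[:space:]]*$1:[[:space:]]*true([[:space:]]|#|\$)" "$STATE_FILE"
}

# set_flag <key> <value>: rewrites a top-level key, or appends it if absent.
set_flag() {
  local tmp
  tmp="$(mktemp)" || return 1
  if grep -Eq "^$1:" "$STATE_FILE"; then
    sed "s|^$1:.*|$1: $2|" "$STATE_FILE" > "$tmp"
  else
    cat "$STATE_FILE" > "$tmp"
    printf '%s: %s\n' "$1" "$2" >> "$tmp"
  fi
  mv "$tmp" "$STATE_FILE"
}

# run_phase <flag> <command...>
# Skips the command when the flag is already true; otherwise runs it and
# records success. On failure the flag stays false so a re-run retries it.
run_phase() {
  local flag="$1"
  shift
  if flag_is_true "$flag"; then
    echo "skip: $flag already true"
    return 0
  fi
  "$@" || return 1
  set_flag "$flag" true
}
```

For example, `run_phase secrets_generated bash gen-secrets.sh ./secrets` would run the generator once and skip it on every later invocation.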

T03 — Emergency bundle format and delivery

id: NK-WP-0005-T03
status: done
priority: high
state_hub_task_id: "42ce1486-5322-4cf2-9c71-1c1c61db5f46"

Create sso-mfa/bootstrap/emergency-bundle.sh that assembles and displays the emergency credential bundle.

Bundle contents:

╔══════════════════════════════════════════════════════════════════╗
║          NET-KINGDOM EMERGENCY CREDENTIAL BUNDLE                 ║
║          Generated: <ISO-date>    Store this. Nothing else.      ║
╠══════════════════════════════════════════════════════════════════╣
║ AGE PRIVATE KEY (decrypt all SOPS secrets from git)              ║
║ ────────────────────────────────────────────────────────────── ║
║ AGE-SECRET-KEY-1...                                              ║
╠══════════════════════════════════════════════════════════════════╣
║ BREAK-GLASS PASSWORDS (direct service access, cluster-bypass)   ║
║ ────────────────────────────────────────────────────────────── ║
║ privacyIDEA admin  : <pi-admin password>                         ║
║ LLDAP admin        : <lldap admin password>                      ║
║ PostgreSQL root    : <postgres root password>                    ║
║ break-glass user   : <lldap break-glass password>                ║
╠══════════════════════════════════════════════════════════════════╣
║ OPS BUNDLE (age-encrypted point-in-time secret snapshot)        ║
║ ────────────────────────────────────────────────────────────── ║
║ Location : <path/url to ops bundle>                              ║
║ Decrypt  : age -d -i <age-key-path> ops-bundle-<date>.tar.age   ║
╠══════════════════════════════════════════════════════════════════╣
║ RECOVERY INSTRUCTIONS                                            ║
║ ────────────────────────────────────────────────────────────── ║
║ 1. Restore age key to ~/.config/sops/age/key.txt                 ║
║ 2. Clone net-kingdom repo + run: make creds-apply                ║
║ 3. Use break-glass passwords for direct service access if needed ║
╚══════════════════════════════════════════════════════════════════╝

Delivery:

Print to terminal (operator copies manually into personal password manager — 1Password, Bitwarden, KeePassXC, paper — agent does not care which)
Optionally write to ~/emergency-bundle-<date>.txt for 60 seconds, then shred automatically

Confirmation gate: After display, prompt:

Store the above in your personal password manager now.
Press Enter when done (this will clear the screen and shred any temp file):

Only after Enter does the script continue and mark emergency_bundle_delivered: true.
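
A minimal sketch of this gate follows, assuming hypothetical helper names `wipe_file` and `confirm_bundle_stored`. It treats end-of-input on stdin (for example, a run with no terminal attached) as "not confirmed", so the caller never sets `emergency_bundle_delivered: true` without a real keypress. The 60 second auto-shred timer is left out to keep the example short. `shred` is GNU coreutils and may not overwrite data in place on every filesystem, so the fallback to `rm` is a convenience rather than a guarantee.

```shell
# Sketch only: helper names are illustrative assumptions.

# wipe_file <path>: shreds the file if shred is available, else removes it.
wipe_file() {
  [ -e "$1" ] || return 0
  if command -v shred >/dev/null 2>&1; then
    shred -u "$1"
  else
    rm -f "$1"
  fi
}

# confirm_bundle_stored [temp-file]
# Returns 0 only after the human presses Enter. The temp file is wiped in
# both outcomes; the screen is cleared only when stdout is a terminal.
confirm_bundle_stored() {
  local tmp_file="${1:-}" reply
  printf 'Store the above in your personal password manager now.\n'
  printf 'Press Enter when done (this will clear the screen and shred any temp file): '
  if ! IFS= read -r reply; then
    wipe_file "$tmp_file"
    echo "no confirmation received; emergency_bundle_delivered stays false" >&2
    return 1
  fi
  wipe_file "$tmp_file"
  if [ -t 1 ]; then clear; fi
  return 0
}
```

The caller would update the state file only when `confirm_bundle_stored` succeeds.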

T04 — `/creds-init` Claude Code skill (autonomous)

id: NK-WP-0005-T04
status: done
priority: medium
state_hub_task_id: "ca713ce7-6f2c-4f0c-8b6c-88fc6e559190"

Replace the guided /creds-bootstrap skill with a fully autonomous /creds-init skill.

Behaviour:

Read creds-state.yaml to determine current phase
If bootstrap_complete: true → report status and exit
If emergency_bundle_delivered: false and some phases done → resume from where the state file says
Otherwise → run creds-bootstrap-agent.sh end-to-end
On completion → log progress event to state-hub
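
The branching above can be expressed as one small decision function. This is a sketch under assumptions: `next_action` is a hypothetical name, the state file is scanned line by line rather than parsed as YAML, and "some phases done" is read as "any flag other than `agent_mode` and `bootstrap_complete` is true".

```shell
# Sketch only: function name and the "some phases done" test are assumptions.

# next_action <state-file>
# Prints one of: report, resume, run.
next_action() {
  local state_file="$1"
  if grep -Eq '^bootstrap_complete:[[:space:]]*true' "$state_file"; then
    echo report
  elif ! grep -Eq '^emergency_bundle_delivered:[[:space:]]*true' "$state_file" \
       && awk '
            /^(agent_mode|bootstrap_complete):/ { next }
            /:[ \t]*true([ \t]|#|$)/ { found = 1 }
            END { exit found ? 0 : 1 }
          ' "$state_file"; then
    echo resume
  else
    echo run
  fi
}
```

Because `creds-bootstrap-agent.sh` is described as idempotent, `resume` and `run` can both call the same script; the distinction mainly affects what the skill reports to the user.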

Skill definition:

---
description: >
  Fully automated net-kingdom credential bootstrap. Generates all service
  secrets, encrypts and commits via SOPS, injects into cluster, and delivers
  a minimal emergency bundle for your personal password manager. No manual
  steps required.
argument-hint: "[--dry-run] [--resume]"
allowed-tools:
  - Bash(make creds-*)
  - Bash(bash sso-mfa/bootstrap/creds-bootstrap-agent.sh*)
  - Bash(kubectl get*)
  - Bash(git status*)
  - Read
---

The old /creds-bootstrap skill can be archived or updated to delegate to /creds-init.

T05 — Makefile: `creds-agent-init` target

id: NK-WP-0005-T05
status: done
priority: medium
state_hub_task_id: "ac5d887e-c499-4cf6-91e7-90e2e0e78d4a"

Add to the existing Makefile:

## Fully automated credential bootstrap (NK-WP-0005)
## Generates, encrypts, injects, and delivers emergency bundle.
## Resumes automatically if interrupted.
creds-agent-init:
	@bash sso-mfa/bootstrap/creds-bootstrap-agent.sh

## Show current bootstrap state
creds-agent-status:
	@bash sso-mfa/bootstrap/creds-status.sh --v2

## Re-deliver emergency bundle (if lost/stolen — generates new bundle, rotates nothing)
creds-emergency-reprint:
	@bash sso-mfa/bootstrap/emergency-bundle.sh --reprint

T06 — Agent-driven rotation

id: NK-WP-0005-T06
status: done
priority: low
state_hub_task_id: "2f0782f7-db5d-4b8a-920b-582548c4591f"

Extend creds-rotate.sh to run non-interactively when called from the agent (currently it is interactive/guided).

Add --secret <name> --non-interactive flags:

Generates new value
Applies the atomic update sequence for that secret type
Re-encrypts SOPS file
Commits
Verifies
Updates creds-state.yaml with last_rotated_<secret>: <ISO-date>
Does NOT require human confirmation (agent is the operator)

The emergency bundle is not reprinted on routine rotation — only on full re-bootstrap or explicit creds-emergency-reprint.

Exception: if the age private key is rotated, a new emergency bundle MUST be delivered before the old one is revoked.
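
The flag handling and the state update for a rotation could be sketched as follows. The function names `parse_rotate_args` and `record_rotation` are hypothetical, and the sketch covers only argument parsing and the `last_rotated_<secret>` entry, not the per-secret atomic update sequence.

```shell
# Sketch only: function names are illustrative assumptions.

# parse_rotate_args "$@"
# Sets ROTATE_SECRET and NON_INTERACTIVE; returns 2 on bad usage.
parse_rotate_args() {
  ROTATE_SECRET=""
  NON_INTERACTIVE=false
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --secret)
        [ "$#" -ge 2 ] || { echo "--secret needs a name" >&2; return 2; }
        ROTATE_SECRET="$2"
        shift 2
        ;;
      --non-interactive)
        NON_INTERACTIVE=true
        shift
        ;;
      *)
        echo "unknown argument: $1" >&2
        return 2
        ;;
    esac
  done
  [ -n "$ROTATE_SECRET" ] || { echo "--secret <name> is required" >&2; return 2; }
}

# record_rotation <state-file> <secret>
# Replaces any previous last_rotated_<secret> line with today's UTC date.
record_rotation() {
  local state_file="$1" key="last_rotated_$2" tmp
  tmp="$(mktemp)" || return 1
  grep -v "^$key:" "$state_file" > "$tmp" || true   # grep exits 1 if no lines remain
  printf '%s: %s\n' "$key" "$(date -u +%Y-%m-%d)" >> "$tmp"
  mv "$tmp" "$state_file"
}
```

Requiring `--secret` explicitly keeps a non-interactive run from rotating more than the caller asked for.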

T07 — Update credential management standard

id: NK-WP-0005-T07
status: done
priority: low
state_hub_task_id: "42ac193d-7b56-48f7-8eba-757a6dad2fba"

Update canon/standards/credential-management_v0.1.md to v0.2, reflecting the agent-driven model:

Section 2 (Trust Hierarchy): remove KeePassXC from operational path; add it as optional personal store for the emergency bundle
Section 3 Phase 0: replace manual steps with make creds-agent-init
New Section: Emergency Bundle — what it contains, how it is delivered, when to use it
Remove Section 4 (KeePassXC group structure) or demote to appendix (still useful if user chooses KeePassXC for their personal store)
Update Section 6 (Ops Bundle): bundle creation is now automated
Update Section 7 (Prohibited Patterns): add "agent MUST NOT skip the emergency_bundle_delivered gate, even in non-interactive runs"

Done Criteria

make creds-agent-init runs from scratch on a clean workstation without any human input until the emergency bundle prompt
Emergency bundle is displayed clearly; confirmation gate works
All K8s secrets verified live after bootstrap
creds-state.yaml shows bootstrap_complete: true after run
/creds-init skill in Claude Code runs the full flow autonomously
Rotation works non-interactively via --non-interactive flag
Credential management standard updated to v0.2
NK-WP-0003 unblocked (T01 was already unblocked by NK-WP-0004; this workplan makes T01 actually safe to run without human ops)
