feat(workplan): NK-WP-0005 — agent-driven credential bootstrap

Replaces the human-as-operator model from NK-WP-0004 with full agent
automation. Agent generates, encrypts (SOPS), injects into cluster,
and delivers a single emergency bundle (age key + break-glass passwords).
Human only stores that bundle in their personal password manager.

KeePassXC removed from operational path. creds-state.yaml redesigned
with agent_mode and emergency_bundle_delivered gate. Standard to be
updated to v0.2 (T07).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 09:25:36 +01:00
parent b4a3a5966f
commit 8db000e5f0


@@ -0,0 +1,409 @@
---
id: NK-WP-0005
type: workplan
title: "Agent-Driven Credential Bootstrap — Zero Human Ops"
domain: netkingdom
repo: net-kingdom
status: active
owner: custodian
topic_slug: netkingdom
created: "2026-03-21"
updated: "2026-03-21"
depends_on: NK-WP-0004
state_hub_workstream_id: "75bc472b-cc0a-48f2-afb6-62b896f7cc19"
---
# Agent-Driven Credential Bootstrap — Zero Human Ops
## Problem
NK-WP-0004 built the right tooling but the wrong operator model. It was
designed around a human-as-operator workflow:
- Human runs `gen-secrets.sh`
- Human manually types every secret into KeePassXC
- Human confirms via `keepass_confirmed: true`
- Human runs `creds-apply`
This is the wrong interface. You delegated the security setup. Being told
"go open KeePassXC and type in 23 fields" is not delegation — it is
manual labour with extra ceremony.
## Goal
The agent owns the full credential lifecycle end-to-end. The only human
touchpoint is receiving the **emergency credential bundle** — a minimal
set of master keys for break-glass recovery — and storing it once in a
personal password store.
```
Agent                                         Human
  │                                             │
  ├── generate all secrets                      │
  ├── encrypt via SOPS/age → commit             │
  ├── inject into cluster (kubectl)             │
  ├── verify all K8s secrets live               │
  ├── create age-encrypted ops bundle           │
  ├── assemble emergency bundle ───────────────►│  store in personal password manager
  │                                             │  (one-time, nothing else ever)
  └── mark state complete                       │
```
**What the human stores (and nothing more):**
| Item | Why needed |
|------|------------|
| age private key | Decrypt any SOPS-encrypted secret from git |
| break-glass passwords (3-4) | Direct service access if cluster/auth is down |
| ops bundle location | Find the age-encrypted point-in-time secret snapshot (decrypted with the age key) |
Everything else — service secrets, rotation, re-injection — is agent work.
## Design
### What changes from NK-WP-0004
| NK-WP-0004 | NK-WP-0005 |
|-----------|-----------|
| Human runs `make creds-generate` | Agent runs bootstrap automatically |
| Human enters secrets in KeePassXC | No KeePassXC in the operational path |
| `keepass_confirmed: false` gate | `emergency_bundle_delivered: false` gate |
| `/creds-bootstrap` skill = guided walkthrough | `/creds-init` skill = autonomous execution |
| Ops bundle created manually | Ops bundle created automatically |
| Rotation triggered manually | Rotation can be triggered by agent |
### KeePassXC role
KeePassXC is **removed from the primary workflow**. It becomes optional
personal infrastructure — if you choose to import the emergency bundle
into KeePassXC that is your business, but it is not required or assumed.
The age private key and SOPS-encrypted git files ARE the credential store.
The ops bundle IS the backup. The emergency bundle IS the human's key ring.
### What the agent cannot automate (genuine human gates)
1. **Confirming receipt of the emergency bundle**: the agent must not
mark the bootstrap complete until the human confirms they have stored
the bundle. This is a deliberate pause, not a workaround.
2. **The privacyIDEA enckey bootstrap**: must happen while the pod is
live (time-sensitive window). The agent can detect when the pod is
ready and run the step automatically, but it must confirm that the
operation succeeded (or that the human is present to intervene)
before continuing.
3. **Initial age keypair generation**: if no age key exists yet, the
agent generates it, and the private key is the first thing that goes
into the emergency bundle. The human gate is storing that key.
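The readiness check in gate 2 could be sketched as follows. This is a minimal illustration, not the workplan's implementation: the namespace `sso-mfa`, the label selector `app=privacyidea`, and the script path in `ENCKEY_SCRIPT` are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch: wait for the privacyIDEA pod, then run the enckey step.
# ASSUMPTIONS (not from the workplan): namespace, label selector, script path.
PI_NAMESPACE="${PI_NAMESPACE:-sso-mfa}"
PI_SELECTOR="${PI_SELECTOR:-app=privacyidea}"
ENCKEY_SCRIPT="${ENCKEY_SCRIPT:-sso-mfa/bootstrap/enckey-bootstrap.sh}"

wait_for_privacyidea() {
    # kubectl wait exits non-zero if the pod is not Ready within the timeout.
    kubectl wait --for=condition=Ready pod \
        --selector "$PI_SELECTOR" \
        --namespace "$PI_NAMESPACE" \
        --timeout=300s
}

bootstrap_enckey_when_ready() {
    if wait_for_privacyidea; then
        # The script's exit status becomes the function's exit status,
        # so a failed enckey step is never reported as success.
        bash "$ENCKEY_SCRIPT"
    else
        echo "privacyIDEA pod not Ready within 5 minutes; enckey step not run" >&2
        return 1
    fi
}
```

A caller would invoke `bootstrap_enckey_when_ready` and stop the bootstrap on a non-zero status, which covers the "operation succeeded" half of the gate.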
---
## Tasks
### T01 — Redesign creds-state.yaml for agent mode
```task
id: NK-WP-0005-T01
status: todo
priority: high
state_hub_task_id: "6748cf8d-a7c7-47a2-b32a-2e26e05c4cba"
```
Replace the human-confirmation model with an agent-progress model.
**New schema** (`sso-mfa/bootstrap/creds-state.yaml`):
```yaml
# Credential state — net-kingdom SSO/MFA stack
# Safe to commit. Contains no secrets. Updated by agent.
schema_version: 2
agent_mode: true # NK-WP-0005: fully automated
# Phase tracking
age_key_present: false # ~/.config/sops/age/key.txt exists
secrets_generated: false # gen-secrets.sh ran successfully
ops_bundle_created: false # age-encrypted bundle created
ops_bundle_location: null # path or storage hint
# Emergency bundle
emergency_bundle_delivered: false # human confirmed receipt
emergency_bundle_delivered_at: null
# Cluster injection (per-component)
secrets_applied:
  postgres: false
  lldap: false
  authelia: false
  privacyidea: false
  keycape: false
# Post-apply bootstrap (agent-run when pod is Ready)
enckey_bootstrapped: false
pi_admin_created: false
# Derived: all true → bootstrap complete
bootstrap_complete: false
```
Remove `keepass_confirmed` (replaced by the `emergency_bundle_delivered` gate) and `generated_at` (the generation time is now recoverable from git history).
Add `schema_version: 2` so scripts can detect which model they are running.
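One way a script might detect the schema version is sketched below. It assumes the state file keeps one top-level `key: value` pair per line, as in the schema above. The helper names `state_get` and `require_schema_v2` are hypothetical. A real implementation might prefer a YAML parser.

```shell
#!/usr/bin/env bash
# Sketch: read top-level scalars from creds-state.yaml without a YAML parser.
# ASSUMPTION: one "key: value  # comment" pair per line at column 0.
STATE_FILE="${STATE_FILE:-sso-mfa/bootstrap/creds-state.yaml}"

state_get() {
    # Print the value of top-level key $1, dropping any trailing comment.
    sed -n "s/^$1:[[:space:]]*\([^#]*\).*/\1/p" "$STATE_FILE" \
        | sed 's/[[:space:]]*$//' \
        | head -n 1
}

require_schema_v2() {
    local version
    version="$(state_get schema_version)"
    if [ "$version" != "2" ]; then
        echo "creds-state.yaml schema_version is '${version:-unset}', expected 2" >&2
        return 1
    fi
}
```

Nested keys such as the entries under `secrets_applied` are deliberately out of scope for `state_get`, since they are indented and the pattern only matches at column 0.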
---
### T02 — Agent bootstrap script: `creds-bootstrap-agent.sh`
```task
id: NK-WP-0005-T02
status: todo
priority: high
state_hub_task_id: "22940c39-8645-40e1-b947-17e85ea6d902"
```
Create `sso-mfa/bootstrap/creds-bootstrap-agent.sh` — the single
entrypoint for fully automated credential bootstrap.
**Flow:**
```
1. pre-flight
├── check kubectl / KUBECONFIG reachable
├── check SOPS age key (generate if missing → add to emergency bundle)
└── check cluster is healthy (kubectl get nodes)
2. generate
└── run gen-secrets.sh ./secrets
3. encrypt + commit
├── sops --encrypt each secrets env file → sso-mfa/encrypted/
└── git add + git commit "chore(creds): encrypted secrets [agent]"
4. inject
└── run creds-apply.sh (existing — postgres → lldap → authelia → privacyidea)
5. verify
└── run creds-verify.sh — exit if any K8s secret missing
6. post-apply bootstrap (waits for pods)
├── wait for privacyIDEA pod Ready (max 5 min)
├── run enckey-bootstrap.sh
├── run bootstrap-admin.sh → capture PI_ADMIN_TOKEN
├── inject keycape secrets (now PI_ADMIN_TOKEN is known)
└── run creds-verify.sh again — all components now
7. ops bundle
└── run pack-bundle.sh ./secrets <age-pub-key>
8. emergency bundle
├── assemble emergency-bundle.txt (see T03)
├── display to terminal with clear formatting
└── prompt: "Press Enter when done" (confirmation gate, see T03)
9. cleanup
├── shred secrets/ plaintext
└── write creds-state.yaml: bootstrap_complete: true
10. update state
└── state-hub: mark NK-WP-0005 tasks done via API
```
**Error handling:** Each phase updates `creds-state.yaml` so a restart
resumes from where it left off (idempotent re-runs skip completed phases).
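The resume behaviour could look roughly like this. It is a sketch under the assumption that each phase maps to one top-level boolean in `creds-state.yaml`. The helpers `state_is_true`, `state_set`, and `run_phase` are hypothetical names, not existing scripts.

```shell
#!/usr/bin/env bash
# Sketch: skip phases whose flag is already true; set the flag on success.
STATE_FILE="${STATE_FILE:-sso-mfa/bootstrap/creds-state.yaml}"

state_is_true() {
    grep -q "^$1:[[:space:]]*true" "$STATE_FILE"
}

state_set() {
    # Rewrite "key: value" for an existing top-level key.
    # Note: any trailing comment on that line is dropped.
    local tmp
    tmp="$(mktemp)"
    sed "s/^$1:.*/$1: $2/" "$STATE_FILE" > "$tmp" && cat "$tmp" > "$STATE_FILE"
    rm -f "$tmp"
}

run_phase() {
    # Usage: run_phase <state-flag> <command> [args...]
    local flag="$1"
    shift
    if state_is_true "$flag"; then
        echo "skip: $flag already done"
        return 0
    fi
    if ! "$@"; then
        echo "phase failed: $flag" >&2
        return 1
    fi
    state_set "$flag" true
}
```

A call such as `run_phase secrets_generated bash sso-mfa/bootstrap/gen-secrets.sh ./secrets` would then be safe to repeat: the second run only prints the skip message. The `gen-secrets.sh` path here is an assumption.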
---
### T03 — Emergency bundle format and delivery
```task
id: NK-WP-0005-T03
status: todo
priority: high
state_hub_task_id: "42ce1486-5322-4cf2-9c71-1c1c61db5f46"
```
Create `sso-mfa/bootstrap/emergency-bundle.sh` that assembles and
displays the emergency credential bundle.
**Bundle contents:**
```
╔══════════════════════════════════════════════════════════════════╗
║ NET-KINGDOM EMERGENCY CREDENTIAL BUNDLE ║
║ Generated: <ISO-date> Store this. Nothing else. ║
╠══════════════════════════════════════════════════════════════════╣
║ AGE PRIVATE KEY (decrypt all SOPS secrets from git) ║
║ ────────────────────────────────────────────────────────────── ║
║ AGE-SECRET-KEY-1... ║
╠══════════════════════════════════════════════════════════════════╣
║ BREAK-GLASS PASSWORDS (direct service access, cluster-bypass) ║
║ ────────────────────────────────────────────────────────────── ║
║ privacyIDEA admin : <pi-admin password> ║
║ LLDAP admin : <lldap admin password> ║
║ PostgreSQL root : <postgres root password> ║
║ break-glass user : <lldap break-glass password> ║
╠══════════════════════════════════════════════════════════════════╣
║ OPS BUNDLE (age-encrypted point-in-time secret snapshot) ║
║ ────────────────────────────────────────────────────────────── ║
║ Location : <path/url to ops bundle> ║
║ Decrypt : age -d -i <age-key-path> ops-bundle-<date>.tar.age ║
╠══════════════════════════════════════════════════════════════════╣
║ RECOVERY INSTRUCTIONS ║
║ ────────────────────────────────────────────────────────────── ║
║ 1. Restore age key to ~/.config/sops/age/key.txt ║
║ 2. Clone net-kingdom repo + run: make creds-apply ║
║ 3. Use break-glass passwords for direct service access if needed ║
╚══════════════════════════════════════════════════════════════════╝
```
**Delivery:**
1. Print to terminal (operator copies manually into personal password
manager — 1Password, Bitwarden, KeePassXC, paper — agent does not
care which)
2. Optionally write to `~/emergency-bundle-<date>.txt` for 60 seconds,
then shred automatically
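The optional temp-file delivery might be sketched as below. The function name `write_temp_bundle` and the `%F` date format in the file name are assumptions. `shred` is a GNU coreutils tool, so the sketch falls back to `rm` where it is unavailable.

```shell
#!/usr/bin/env bash
# Sketch: write the bundle to a private temp file that removes itself.
write_temp_bundle() {
    # $1 = bundle text, $2 = seconds before auto-shred (default 60)
    local ttl="${2:-60}"
    local bundle_path="${HOME}/emergency-bundle-$(date +%F).txt"
    # umask 077 in a subshell: file is created readable by the owner only.
    ( umask 077; printf '%s\n' "$1" > "$bundle_path" )
    # Background timer; output is detached so callers using $(...) do not block.
    (
        sleep "$ttl"
        if [ -f "$bundle_path" ]; then
            shred -u "$bundle_path" || rm -f "$bundle_path"
        fi
    ) >/dev/null 2>&1 &
    echo "$bundle_path"
}
```

Overwriting with `shred` is not reliable on every filesystem (copy-on-write and journaling filesystems in particular), so the terminal display remains the primary delivery path.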
**Confirmation gate:** After display, prompt:
```
Store the above in your personal password manager now.
Press Enter when done (this will clear the screen and shred any temp file):
```
Only after Enter does the script continue and mark
`emergency_bundle_delivered: true`.
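A minimal sketch of this gate follows, assuming the flat state-file layout from T01. The function name `confirm_bundle_stored` and the UTC timestamp format are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch: block on Enter, then shred the temp file and record delivery.
confirm_bundle_stored() {
    # $1 = state file, $2 = optional temp bundle file to shred
    local state_file="$1" tmp_bundle="${2:-}" now tmp
    printf 'Store the above in your personal password manager now.\n'
    printf 'Press Enter when done (this will clear the screen and shred any temp file): '
    # read fails on EOF, so a closed or empty stdin never passes the gate.
    if ! IFS= read -r _reply; then
        echo "no confirmation received; gate stays closed" >&2
        return 1
    fi
    if [ -n "$tmp_bundle" ] && [ -f "$tmp_bundle" ]; then
        shred -u "$tmp_bundle" 2>/dev/null || rm -f "$tmp_bundle"
    fi
    clear 2>/dev/null || true
    now="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    tmp="$(mktemp)"
    sed -e "s/^emergency_bundle_delivered:.*/emergency_bundle_delivered: true/" \
        -e "s/^emergency_bundle_delivered_at:.*/emergency_bundle_delivered_at: \"$now\"/" \
        "$state_file" > "$tmp" && cat "$tmp" > "$state_file"
    rm -f "$tmp"
}
```

Because `read` returns non-zero when stdin is closed, running the script with `</dev/null` leaves `emergency_bundle_delivered: false`, which matches the intent that the gate cannot be skipped in unattended runs.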
---
### T04 — `/creds-init` Claude Code skill (autonomous)
```task
id: NK-WP-0005-T04
status: todo
priority: medium
state_hub_task_id: "ca713ce7-6f2c-4f0c-8b6c-88fc6e559190"
```
Replace the guided `/creds-bootstrap` skill with a fully autonomous
`/creds-init` skill.
**Behaviour:**
1. Read `creds-state.yaml` to determine current phase
2. If `bootstrap_complete: true` → report status and exit
3. If `emergency_bundle_delivered: false` and some phases done → resume
from where the state file says
4. Otherwise → run `creds-bootstrap-agent.sh` end-to-end
5. On completion → log progress event to state-hub
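The decision in steps 2 to 4 can be sketched as a small function. The name `next_action` and the returned words `status`, `resume`, and `run` are hypothetical. The flag names are taken from the T01 schema.

```shell
#!/usr/bin/env bash
# Sketch: map creds-state.yaml contents to the skill's next action.
next_action() {
    # $1 = path to creds-state.yaml
    local f="$1"
    local phases='age_key_present|secrets_generated|ops_bundle_created'
    phases="$phases|postgres|lldap|authelia|privacyidea|keycape"
    phases="$phases|enckey_bootstrapped|pi_admin_created"
    if grep -q '^bootstrap_complete:[[:space:]]*true' "$f"; then
        echo status    # step 2: report and exit
    elif grep -q '^emergency_bundle_delivered:[[:space:]]*false' "$f" \
        && grep -Eq "^[[:space:]]*($phases):[[:space:]]*true" "$f"; then
        echo resume    # step 3: some phases done, bundle not yet delivered
    else
        echo run       # step 4: run the bootstrap end-to-end
    fi
}
```

Since the bootstrap script is meant to skip completed phases on its own, `resume` and `run` may end up invoking the same entrypoint. The distinction mainly affects what the skill reports to the user.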
**Skill definition:**
```yaml
---
description: >
  Fully automated net-kingdom credential bootstrap. Generates all service
  secrets, encrypts and commits via SOPS, injects into cluster, and delivers
  a minimal emergency bundle for your personal password manager. No manual
  steps required.
argument-hint: "[--dry-run] [--resume]"
allowed-tools:
- Bash(make creds-*)
- Bash(bash sso-mfa/bootstrap/creds-bootstrap-agent.sh*)
- Bash(kubectl get*)
- Bash(git status*)
- Read
---
```
The old `/creds-bootstrap` skill can be archived or updated to delegate
to `/creds-init`.
---
### T05 — Makefile: `creds-agent-init` target
```task
id: NK-WP-0005-T05
status: todo
priority: medium
state_hub_task_id: "ac5d887e-c499-4cf6-91e7-90e2e0e78d4a"
```
Add to the existing Makefile:
```makefile
.PHONY: creds-agent-init creds-agent-status creds-emergency-reprint

## Fully automated credential bootstrap (NK-WP-0005)
## Generates, encrypts, injects, and delivers emergency bundle.
## Resumes automatically if interrupted.
creds-agent-init:
	@bash sso-mfa/bootstrap/creds-bootstrap-agent.sh

## Show current bootstrap state
creds-agent-status:
	@bash sso-mfa/bootstrap/creds-status.sh --v2

## Re-deliver emergency bundle (if lost/stolen: generates new bundle, rotates nothing)
creds-emergency-reprint:
	@bash sso-mfa/bootstrap/emergency-bundle.sh --reprint
```
---
### T06 — Agent-driven rotation
```task
id: NK-WP-0005-T06
status: todo
priority: low
state_hub_task_id: "2f0782f7-db5d-4b8a-920b-582548c4591f"
```
Extend `creds-rotate.sh` to run non-interactively when called from the
agent (currently it is interactive/guided).
Add `--secret <name>` and `--non-interactive` flags. With both set, the script:
- Generates new value
- Applies the atomic update sequence for that secret type
- Re-encrypts SOPS file
- Commits
- Verifies
- Updates `creds-state.yaml` with `last_rotated_<secret>: <ISO-date>`
- Does NOT require human confirmation (agent is the operator)
The emergency bundle is **not** reprinted on routine rotation — only on
full re-bootstrap or explicit `creds-emergency-reprint`.
Exception: if the age private key is rotated, a new emergency bundle
MUST be delivered before the old one is revoked.
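The flag handling and the `last_rotated_<secret>` update might be sketched like this. The function names `parse_rotate_args` and `record_rotation` are hypothetical. The sketch also assumes that `--non-interactive` without `--secret` is an error, which the workplan does not state.

```shell
#!/usr/bin/env bash
# Sketch: argument parsing and state update for non-interactive rotation.
parse_rotate_args() {
    ROTATE_SECRET=""
    NON_INTERACTIVE=0
    while [ "$#" -gt 0 ]; do
        case "$1" in
            --secret)
                if [ "$#" -lt 2 ]; then
                    echo "--secret needs a name" >&2
                    return 2
                fi
                ROTATE_SECRET="$2"
                shift 2
                ;;
            --non-interactive)
                NON_INTERACTIVE=1
                shift
                ;;
            *)
                echo "unknown argument: $1" >&2
                return 2
                ;;
        esac
    done
    # ASSUMPTION: an unattended run must name the secret to rotate.
    if [ "$NON_INTERACTIVE" -eq 1 ] && [ -z "$ROTATE_SECRET" ]; then
        echo "--non-interactive requires --secret <name>" >&2
        return 2
    fi
}

record_rotation() {
    # $1 = state file, $2 = secret name; replaces or appends last_rotated_<secret>.
    local f="$1" key="last_rotated_$2" now tmp
    now="$(date -u +%Y-%m-%dT%H:%M:%SZ)"
    tmp="$(mktemp)"
    grep -v "^$key:" "$f" > "$tmp" || true
    printf '%s: "%s"\n' "$key" "$now" >> "$tmp"
    cat "$tmp" > "$f"
    rm -f "$tmp"
}
```

For example, `parse_rotate_args --secret lldap --non-interactive` sets `ROTATE_SECRET=lldap` and `NON_INTERACTIVE=1`. After the rotation is verified, `record_rotation "$STATE_FILE" lldap` writes the timestamp.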
---
### T07 — Update credential management standard
```task
id: NK-WP-0005-T07
status: todo
priority: low
state_hub_task_id: "42ac193d-7b56-48f7-8eba-757a6dad2fba"
```
Update `canon/standards/credential-management_v0.1.md` to v0.2,
reflecting the agent-driven model:
- Section 2 (Trust Hierarchy): remove KeePassXC from operational path;
add it as optional personal store for the emergency bundle
- Section 3 Phase 0: replace manual steps with `make creds-agent-init`
- New Section: Emergency Bundle — what it contains, how it is delivered,
when to use it
- Remove Section 4 (KeePassXC group structure) or demote to appendix
(still useful if user chooses KeePassXC for their personal store)
- Update Section 6 (Ops Bundle): bundle creation is now automated
- Update Section 7 (Prohibited Patterns): add "agent MUST NOT skip the
`emergency_bundle_delivered` gate, even in non-interactive runs"
---
## Done Criteria
- [ ] `make creds-agent-init` runs from scratch on a clean workstation
without any human input until the emergency bundle prompt
- [ ] Emergency bundle is displayed clearly; confirmation gate works
- [ ] All K8s secrets verified live after bootstrap
- [ ] `creds-state.yaml` shows `bootstrap_complete: true` after run
- [ ] `/creds-init` skill in Claude Code runs the full flow autonomously
- [ ] Rotation works non-interactively via `--non-interactive` flag
- [ ] Credential management standard updated to v0.2
- [ ] NK-WP-0003 unblocked (T01 was already unblocked by NK-WP-0004;
this workplan makes T01 actually safe to run without human ops)