From 8db000e5f03bc2bd80c19c99aedb993da0d68a63 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Sat, 21 Mar 2026 09:25:36 +0100
Subject: [PATCH] =?UTF-8?q?feat(workplan):=20NK-WP-0005=20=E2=80=94=20agen?=
 =?UTF-8?q?t-driven=20credential=20bootstrap?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Replaces the human-as-operator model from NK-WP-0004 with full agent
automation. Agent generates, encrypts (SOPS), injects into cluster, and
delivers a single emergency bundle (age key + break-glass passwords).
Human only stores that bundle in their personal password manager.

KeePassXC removed from operational path. creds-state.yaml redesigned
with agent_mode and emergency_bundle_delivered gate. Standard to be
updated to v0.2 (T07).

Co-Authored-By: Claude Sonnet 4.6
---
 ...-0005-agent-driven-credential-bootstrap.md | 409 ++++++++++++++++++
 1 file changed, 409 insertions(+)
 create mode 100644 workplans/NK-WP-0005-agent-driven-credential-bootstrap.md

diff --git a/workplans/NK-WP-0005-agent-driven-credential-bootstrap.md b/workplans/NK-WP-0005-agent-driven-credential-bootstrap.md
new file mode 100644
index 0000000..fb879d9
--- /dev/null
+++ b/workplans/NK-WP-0005-agent-driven-credential-bootstrap.md
@@ -0,0 +1,409 @@
+---
+id: NK-WP-0005
+type: workplan
+title: "Agent-Driven Credential Bootstrap — Zero Human Ops"
+domain: netkingdom
+repo: net-kingdom
+status: active
+owner: custodian
+topic_slug: netkingdom
+created: "2026-03-21"
+updated: "2026-03-21"
+depends_on: NK-WP-0004
+state_hub_workstream_id: "75bc472b-cc0a-48f2-afb6-62b896f7cc19"
+---
+
+# Agent-Driven Credential Bootstrap — Zero Human Ops
+
+## Problem
+
+NK-WP-0004 built the right tooling but the wrong operator model. It was
+designed around a human-as-operator workflow:
+
+- Human runs `gen-secrets.sh`
+- Human manually types every secret into KeePassXC
+- Human confirms via `keepass_confirmed: true`
+- Human runs `creds-apply`
+
+This is the wrong interface. You delegated the security setup. Being told
+"go open KeePassXC and type in 23 fields" is not delegation — it is
+manual labour with extra ceremony.
+
+## Goal
+
+The agent owns the full credential lifecycle end-to-end. The only human
+touchpoint is receiving the **emergency credential bundle** — a minimal
+set of master keys for break-glass recovery — and storing it once in a
+personal password store.
+
+```
+Agent                                       Human
+  │                                           │
+  ├── generate all secrets                    │
+  ├── encrypt via SOPS/age → commit           │
+  ├── inject into cluster (kubectl)           │
+  ├── verify all K8s secrets live             │
+  ├── create age-encrypted ops bundle         │
+  ├── assemble emergency bundle ─────────────►│  store in personal password manager
+  │                                           │  (one-time, nothing else ever)
+  └── mark state complete                     │
+```
+
+**What the human stores (and nothing more):**
+
+| Item | Why needed |
+|------|------------|
+| age private key | Decrypt any SOPS-encrypted secret from git |
+| break-glass passwords (3-4) | Direct service access if cluster/auth is down |
+| ops bundle passphrase | Decrypt point-in-time secret snapshot |
+
+Everything else — service secrets, rotation, re-injection — is agent work.
+
+## Design
+
+### What changes from NK-WP-0004
+
+| NK-WP-0004 | NK-WP-0005 |
+|-----------|-----------|
+| Human runs `make creds-generate` | Agent runs bootstrap automatically |
+| Human enters secrets in KeePassXC | No KeePassXC in the operational path |
+| `keepass_confirmed: false` gate | `emergency_bundle_delivered: false` gate |
+| `/creds-bootstrap` skill = guided walkthrough | `/creds-init` skill = autonomous execution |
+| Ops bundle created manually | Ops bundle created automatically |
+| Rotation triggered manually | Rotation can be triggered by agent |
+
+### KeePassXC role
+
+KeePassXC is **removed from the primary workflow**. It becomes optional
+personal infrastructure — if you choose to import the emergency bundle
+into KeePassXC that is your business, but it is not required or assumed.
+
+The age private key and SOPS-encrypted git files ARE the credential store.
+The ops bundle IS the backup. The emergency bundle IS the human's key ring.
+
+### What the agent cannot automate (genuine human gates)
+
+1. **Confirming receipt of the emergency bundle** — the agent must not
+   mark the bootstrap complete until the human confirms they have stored
+   the bundle. This is a deliberate pause, not a workaround.
+2. **The privacyIDEA enckey bootstrap** — must happen while the pod is
+   live (time-sensitive window). The agent can detect when the pod is
+   ready and run the step automatically, but must verify the human is
+   present or at least that the operation succeeded.
+3. **Initial age keypair generation** — if no age key exists yet, the
+   agent generates it and the private key is the first thing that goes
+   into the emergency bundle.
+
+---
+
+## Tasks
+
+### T01 — Redesign creds-state.yaml for agent mode
+
+```task
+id: NK-WP-0005-T01
+status: todo
+priority: high
+state_hub_task_id: "6748cf8d-a7c7-47a2-b32a-2e26e05c4cba"
+```
+
+Replace the human-confirmation model with an agent-progress model.
+
+**New schema** (`sso-mfa/bootstrap/creds-state.yaml`):
+
+```yaml
+# Credential state — net-kingdom SSO/MFA stack
+# Safe to commit. Contains no secrets. Updated by agent.
+schema_version: 2
+agent_mode: true                    # NK-WP-0005: fully automated
+
+# Phase tracking
+age_key_present: false              # ~/.config/sops/age/key.txt exists
+secrets_generated: false            # gen-secrets.sh ran successfully
+ops_bundle_created: false           # age-encrypted bundle created
+ops_bundle_location: null           # path or storage hint
+
+# Emergency bundle
+emergency_bundle_delivered: false   # human confirmed receipt
+emergency_bundle_delivered_at: null
+
+# Cluster injection (per-component)
+secrets_applied:
+  postgres: false
+  lldap: false
+  authelia: false
+  privacyidea: false
+  keycape: false
+
+# Post-apply bootstrap (agent-run when pod is Ready)
+enckey_bootstrapped: false
+pi_admin_created: false
+
+# Derived: all true → bootstrap complete
+bootstrap_complete: false
+```
+
+Remove `keepass_confirmed` and `generated_at` (now in git history).
+Add `schema_version: 2` so scripts can detect which model they are running.
+
+---
+
+### T02 — Agent bootstrap script: `creds-bootstrap-agent.sh`
+
+```task
+id: NK-WP-0005-T02
+status: todo
+priority: high
+state_hub_task_id: "22940c39-8645-40e1-b947-17e85ea6d902"
+```
+
+Create `sso-mfa/bootstrap/creds-bootstrap-agent.sh` — the single
+entrypoint for fully automated credential bootstrap.
+
+**Flow:**
+
+```
+1. pre-flight
+   ├── check kubectl / KUBECONFIG reachable
+   ├── check SOPS age key (generate if missing → add to emergency bundle)
+   └── check cluster is healthy (kubectl get nodes)
+
+2. generate
+   └── run gen-secrets.sh ./secrets
+
+3. encrypt + commit
+   ├── sops --encrypt each secrets env file → sso-mfa/encrypted/
+   └── git add + git commit "chore(creds): encrypted secrets [agent]"
+
+4. inject
+   └── run creds-apply.sh (existing — postgres → lldap → authelia → privacyidea)
+
+5. verify
+   └── run creds-verify.sh — exit if any K8s secret missing
+
+6. post-apply bootstrap (waits for pods)
+   ├── wait for privacyIDEA pod Ready (max 5 min)
+   ├── run enckey-bootstrap.sh
+   ├── run bootstrap-admin.sh → capture PI_ADMIN_TOKEN
+   ├── inject keycape secrets (now PI_ADMIN_TOKEN is known)
+   └── run creds-verify.sh again — all components now
+
+7. ops bundle
+   └── run pack-bundle.sh ./secrets
+
+8. emergency bundle
+   ├── assemble emergency-bundle.txt (see T03)
+   ├── display to terminal with clear formatting
+   └── prompt: "Have you stored this? [y/N]"
+
+9. cleanup
+   ├── shred secrets/ plaintext
+   └── write creds-state.yaml: bootstrap_complete: true
+
+10. update state
+   └── state-hub: mark NK-WP-0005 tasks done via API
+```
+
+**Error handling:** Each phase updates `creds-state.yaml` so a restart
+resumes from where it left off (idempotent re-runs skip completed phases).
+
+---
+
+### T03 — Emergency bundle format and delivery
+
+```task
+id: NK-WP-0005-T03
+status: todo
+priority: high
+state_hub_task_id: "42ce1486-5322-4cf2-9c71-1c1c61db5f46"
+```
+
+Create `sso-mfa/bootstrap/emergency-bundle.sh` that assembles and
+displays the emergency credential bundle.
+
+**Bundle contents:**
+
+```
+╔══════════════════════════════════════════════════════════════════╗
+║ NET-KINGDOM EMERGENCY CREDENTIAL BUNDLE                          ║
+║ Generated: Store this. Nothing else.                             ║
+╠══════════════════════════════════════════════════════════════════╣
+║ AGE PRIVATE KEY (decrypt all SOPS secrets from git)              ║
+║ ────────────────────────────────────────────────────────────── ║
+║ AGE-SECRET-KEY-1...                                              ║
+╠══════════════════════════════════════════════════════════════════╣
+║ BREAK-GLASS PASSWORDS (direct service access, cluster-bypass)    ║
+║ ────────────────────────────────────────────────────────────── ║
+║ privacyIDEA admin :                                              ║
+║ LLDAP admin       :                                              ║
+║ PostgreSQL root   :                                              ║
+║ break-glass user  :                                              ║
+╠══════════════════════════════════════════════════════════════════╣
+║ OPS BUNDLE (age-encrypted point-in-time secret snapshot)         ║
+║ ────────────────────────────────────────────────────────────── ║
+║ Location :                                                       ║
+║ Decrypt  : age -d -i ops-bundle-.tar.age                         ║
+╠══════════════════════════════════════════════════════════════════╣
+║ RECOVERY INSTRUCTIONS                                            ║
+║ ────────────────────────────────────────────────────────────── ║
+║ 1. Restore age key to ~/.config/sops/age/key.txt                 ║
+║ 2. Clone net-kingdom repo + run: make creds-apply                ║
+║ 3. Use break-glass passwords for direct service access if needed ║
+╚══════════════════════════════════════════════════════════════════╝
+```
+
+**Delivery:**
+1. Print to terminal (operator copies manually into personal password
+   manager — 1Password, Bitwarden, KeePassXC, paper — agent does not
+   care which)
+2. Optionally write to `~/emergency-bundle-.txt` for 60 seconds,
+   then shred automatically
+
+**Confirmation gate:** After display, prompt:
+```
+Store the above in your personal password manager now.
+Press Enter when done (this will clear the screen and shred any temp file):
+```
+Only after Enter does the script continue and mark
+`emergency_bundle_delivered: true`.
+
+---
+
+### T04 — `/creds-init` Claude Code skill (autonomous)
+
+```task
+id: NK-WP-0005-T04
+status: todo
+priority: medium
+state_hub_task_id: "ca713ce7-6f2c-4f0c-8b6c-88fc6e559190"
+```
+
+Replace the guided `/creds-bootstrap` skill with a fully autonomous
+`/creds-init` skill.
+
+**Behaviour:**
+1. Read `creds-state.yaml` to determine current phase
+2. If `bootstrap_complete: true` → report status and exit
+3. If `emergency_bundle_delivered: false` and some phases done → resume
+   from where the state file says
+4. Otherwise → run `creds-bootstrap-agent.sh` end-to-end
+5. On completion → log progress event to state-hub
+
+**Skill definition:**
+```yaml
+---
+description: >
+  Fully automated net-kingdom credential bootstrap. Generates all service
+  secrets, encrypts and commits via SOPS, injects into cluster, and delivers
+  a minimal emergency bundle for your personal password manager. No manual
+  steps required.
+argument-hint: "[--dry-run] [--resume]"
+allowed-tools:
+  - Bash(make creds-*)
+  - Bash(bash sso-mfa/bootstrap/creds-bootstrap-agent.sh*)
+  - Bash(kubectl get*)
+  - Bash(git status*)
+  - Read
+---
+```
+
+The old `/creds-bootstrap` skill can be archived or updated to delegate
+to `/creds-init`.
+
+---
+
+### T05 — Makefile: `creds-agent-init` target
+
+```task
+id: NK-WP-0005-T05
+status: todo
+priority: medium
+state_hub_task_id: "ac5d887e-c499-4cf6-91e7-90e2e0e78d4a"
+```
+
+Add to the existing Makefile:
+
+```makefile
+## Fully automated credential bootstrap (NK-WP-0005)
+## Generates, encrypts, injects, and delivers emergency bundle.
+## Resumes automatically if interrupted.
+creds-agent-init:
+	@bash sso-mfa/bootstrap/creds-bootstrap-agent.sh
+
+## Show current bootstrap state
+creds-agent-status:
+	@bash sso-mfa/bootstrap/creds-status.sh --v2
+
+## Re-deliver emergency bundle (if lost/stolen — generates new bundle, rotates nothing)
+creds-emergency-reprint:
+	@bash sso-mfa/bootstrap/emergency-bundle.sh --reprint
+```
+
+---
+
+### T06 — Agent-driven rotation
+
+```task
+id: NK-WP-0005-T06
+status: todo
+priority: low
+state_hub_task_id: "2f0782f7-db5d-4b8a-920b-582548c4591f"
+```
+
+Extend `creds-rotate.sh` to run non-interactively when called from the
+agent (currently it is interactive/guided).
+
+Add `--secret --non-interactive` flags:
+- Generates new value
+- Applies the atomic update sequence for that secret type
+- Re-encrypts SOPS file
+- Commits
+- Verifies
+- Updates `creds-state.yaml` with `last_rotated_: `
+- Does NOT require human confirmation (agent is the operator)
+
+The emergency bundle is **not** reprinted on routine rotation — only on
+full re-bootstrap or explicit `creds-emergency-reprint`.
+
+Exception: if the age private key is rotated, a new emergency bundle
+MUST be delivered before the old one is revoked.
+
+---
+
+### T07 — Update credential management standard
+
+```task
+id: NK-WP-0005-T07
+status: todo
+priority: low
+state_hub_task_id: "42ac193d-7b56-48f7-8eba-757a6dad2fba"
+```
+
+Update `canon/standards/credential-management_v0.1.md` to v0.2,
+reflecting the agent-driven model:
+
+- Section 2 (Trust Hierarchy): remove KeePassXC from operational path;
+  add it as optional personal store for the emergency bundle
+- Section 3 Phase 0: replace manual steps with `make creds-agent-init`
+- New Section: Emergency Bundle — what it contains, how it is delivered,
+  when to use it
+- Remove Section 4 (KeePassXC group structure) or demote to appendix
+  (still useful if user chooses KeePassXC for their personal store)
+- Update Section 6 (Ops Bundle): bundle creation is now automated
+- Update Section 7 (Prohibited Patterns): add "agent MUST NOT skip the
+  `emergency_bundle_delivered` gate, even in non-interactive runs"
+
+---
+
+## Done Criteria
+
+- [ ] `make creds-agent-init` runs from scratch on a clean workstation
+      without any human input until the emergency bundle prompt
+- [ ] Emergency bundle is displayed clearly; confirmation gate works
+- [ ] All K8s secrets verified live after bootstrap
+- [ ] `creds-state.yaml` shows `bootstrap_complete: true` after run
+- [ ] `/creds-init` skill in Claude Code runs the full flow autonomously
+- [ ] Rotation works non-interactively via `--non-interactive` flag
+- [ ] Credential management standard updated to v0.2
+- [ ] NK-WP-0003 unblocked (T01 was already unblocked by NK-WP-0004;
+      this workplan makes T01 actually safe to run without human ops)
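
Reviewer note on T02: the error-handling rule (each phase records its flag in `creds-state.yaml`, and a re-run skips phases already marked done) could look roughly like the sketch below. This is an illustration only, not the actual `creds-bootstrap-agent.sh`. The helper names `get_flag`, `set_flag`, and `run_phase` are hypothetical, the `STATE_FILE` override is an assumption added so the sketch can be pointed at a scratch file, and only top-level `key: value` lines are handled.

```shell
#!/usr/bin/env bash
# Sketch of idempotent phase tracking against a flat YAML state file.
# Assumptions: get_flag, set_flag and run_phase are hypothetical helpers;
# only top-level "key: value" lines are supported (no nested maps).

STATE_FILE="${STATE_FILE:-sso-mfa/bootstrap/creds-state.yaml}"

# Print the value of a top-level key, with any trailing comment removed.
get_flag() {
  sed -n "s/^$1:[[:space:]]*\([^[:space:]#]*\).*/\1/p" "$STATE_FILE" | head -n 1
}

# Rewrite "key: <old>" as "key: <new>", keeping the trailing comment.
# Values must not contain "/" because they are spliced into a sed expression.
set_flag() {
  local tmp
  tmp="$(mktemp)"
  sed "s/^\($1:[[:space:]]*\)[^[:space:]#]*/\1$2/" "$STATE_FILE" > "$tmp" \
    && mv "$tmp" "$STATE_FILE"
}

# Run a phase command unless its flag is already true.
# The flag is set to true only when the command succeeds.
run_phase() {
  local flag="$1"
  shift
  if [ "$(get_flag "$flag")" = "true" ]; then
    echo "skip: $flag already done"
    return 0
  fi
  "$@" && set_flag "$flag" true
}
```

A call such as `run_phase secrets_generated <generator command>` would run the command on the first invocation and print a skip message on a re-run. Nested keys such as the `secrets_applied` map would need a real YAML tool rather than `sed`, so the per-component flags are out of scope for this sketch.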