net-kingdom/workplans/NK-WP-0004-credential-management-foundation.md
tegwick b4a3a5966f chore(consistency): NK-WP-0004 complete — correct regressed task statuses
All 7 tasks were implemented in c10d7d2 but fix-consistency on the
workstation reverted all statuses to todo (CUST-WP-0026 regression bug).
Corrected all task blocks to status: done and workplan status to done.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 09:19:37 +01:00

---
id: NK-WP-0004
type: workplan
title: "Credential Management Foundation"
domain: netkingdom
repo: net-kingdom
status: done
owner: custodian
topic_slug: netkingdom
created: "2026-03-20"
updated: "2026-03-20"
state_hub_workstream_id: "d9cf7c4b-886b-4cd1-ad7b-99c4e1929c9e"
---
# Credential Management Foundation
## Goal
Make credential management a first-class, reliable foundation rather than
a manual side-task. By the end of this workplan an operator can:
1. Run `make creds-init` to set up the full SOPS + age + KeePassXC workflow
2. Run `make creds-generate` to produce all service secrets and be guided through
KeePassXC entry
3. Run `make creds-apply` to inject secrets into the cluster in the correct order
4. Run `make creds-status` to see what is generated, applied, and verified
5. Invoke `/creds-bootstrap` in Claude Code for guided assistance through
the bootstrap process
This workplan is a **pre-condition for NK-WP-0003** (cluster deployment).
NK-WP-0003-T01 is blocked until this workplan is complete.
## Problem
Current state:
- `gen-secrets.sh` and `pack-bundle.sh` exist but are run manually, in
isolation, with no orchestration
- The five `create-secrets.sh` scripts must be run in a specific order
(postgres → lldap → authelia → privacyidea → keycape) but this is
undocumented and unenforced
- Shared secrets (LLDAP_LDAP_USER_PASS, PI_DB_PASSWORD) are referenced
across component scripts, but nothing enforces that the source secret
exists before its consumer runs
- No git pre-commit hook — plaintext secrets can accidentally be committed
- No `.sops.yaml` — net-kingdom is not SOPS-enabled, unlike railiance-infra
- No credential state file — no way to know which secrets are generated,
which are applied, which are verified, without manual cluster inspection
- The enckey-bootstrap.sh step is time-sensitive (must run while the
privacyIDEA pod is live) but nothing flags this or sequences it
- Operator must hold all of this in their head
## Architecture
```
Operator
├── make creds-init # one-time: age key check, .sops.yaml, git hook
├── make creds-generate # run gen-secrets.sh → guided KeePassXC entry
├── make creds-bundle # age-encrypt ops bundle → offsite
├── make creds-apply # run all create-secrets.sh in correct order
├── make creds-verify # check all K8s secrets exist with expected keys
├── make creds-status # show credential state file
└── make creds-rotate SECRET=<name> # guided rotation for one secret
Claude Code skill: /creds-bootstrap
└── guided session for first-time bootstrap (reads credential state,
    knows what's done, provides KeePassXC entry instructions,
    warns about time-sensitive steps like enckey-bootstrap)
```
## Dependency on canon standard
All design decisions in this workplan follow
`canon/standards/credential-management_v0.1.md`.
The KeePassXC group structure, phase model, SOPS policy, and prohibited
patterns defined there are normative. This workplan implements them.
## Tasks
### T01 — SOPS integration
```task
id: NK-WP-0004-T01
status: done
priority: high
state_hub_task_id: "2340f2a3-9c11-44a8-b264-41d75b6dbc3e"
```
Add SOPS encryption infrastructure to net-kingdom, aligned with
railiance-infra (same age key, same approach).
**Steps:**
1. Verify the operator age key exists:
```bash
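# SOPS reads age identities from keys.txt; generate a key only if none exists
test -f ~/.config/sops/age/keys.txt || { mkdir -p ~/.config/sops/age && age-keygen -o ~/.config/sops/age/keys.txt; }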
```
The public key (`age1aq8twfd78wvpra0had8cezcnj96tj4q0068edrz5jez8d6xwmflqdepsh4`
for the primary operator) is already in railiance-infra. Reuse the same
keypair — one age key per operator across all repos.
2. Create `keys/age.pub` at the repo root:
```
age1aq8twfd78wvpra0had8cezcnj96tj4q0068edrz5jez8d6xwmflqdepsh4
```
3. Create `.sops.yaml` at the repo root:
```yaml
creation_rules:
  - path_regex: secrets/.*$
    key_groups:
      - age:
          - age1aq8twfd78wvpra0had8cezcnj96tj4q0068edrz5jez8d6xwmflqdepsh4
```
4. Add `secrets/` to `.gitignore` (plaintext secrets MUST NOT enter git).
SOPS-encrypted files (`.sops.yaml` extension) may be committed.
5. Create `.githooks/pre-commit` mirroring railiance-infra:
- Blocks any commit that includes a file under `secrets/` lacking a
`sops:` or `"sops":` marker (i.e. plaintext)
- Also blocks any `*.env` file outside `sso-mfa/bootstrap/` from
being committed
6. Add a `make hooks` target to enable the hook:
```makefile
.PHONY: hooks
hooks:
	git config core.hooksPath .githooks
```
### T02 — Makefile: SOPS targets
```task
id: NK-WP-0004-T02
status: done
priority: high
state_hub_task_id: "f6ad469c-e1d3-4253-b855-e0554e43f612"
```
Create the top-level `Makefile` for net-kingdom. Port SOPS targets from
railiance-infra and add net-kingdom-specific targets.
**Targets to implement:**
```makefile
## One-time setup
sops-setup: # Copy age key to ~/.config/sops/age/keys.txt
hooks: # Enable git pre-commit hook
## SOPS operations
sops-edit: # sops $(FILE)
sops-encrypt: # sops --encrypt --in-place $(FILE)
sops-decrypt: # sops -d $(FILE) (stdout only, never write plaintext to disk)
sops-rotate: # sops --rotate --in-place $(FILE) (after adding new recipient)
check-secrets: # fail if any secrets/ file is not SOPS-encrypted
## Credential lifecycle
creds-init: # prerequisite check + sops-setup + hooks
creds-generate: # run gen-secrets.sh + print KeePassXC entry guide
creds-bundle: # run pack-bundle.sh with operator age public key
creds-apply: # run all create-secrets.sh in dependency order
creds-verify: # check all expected K8s secrets exist
creds-status: # print credential state file
## Single-secret rotation
creds-rotate: # guided rotation for SECRET= (generate → KeePassXC → apply → verify)
```
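The `check-secrets` gate could look roughly like the following. This is a minimal sketch, not the implemented target: it assumes an encrypted file can be recognized by a `sops:` (YAML) or `"sops":` (JSON) marker, the same markers the pre-commit hook in T01 looks for. The names `SOPS_MARKER` and `check_secrets` are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the check-secrets gate; names are illustrative.

# Assumed marker: SOPS metadata key in YAML or JSON form.
SOPS_MARKER='^sops:|"sops":'

# Lists every file under the directory that lacks the marker and
# returns non-zero if at least one is found.
check_secrets() {
  dir="${1:-secrets}"
  [ -d "$dir" ] || return 0                 # nothing to check
  # grep -L prints the names of files that contain no match.
  plain=$(grep -rLE "$SOPS_MARKER" "$dir" || true)
  if [ -n "$plain" ]; then
    printf '%s\n' "$plain" | sed 's/^/PLAINTEXT: /' >&2
    return 1
  fi
}
```

A Makefile recipe could source this file and call `check_secrets secrets`, failing the target on a non-zero exit. Files encrypted in dotenv format would need an additional marker, which this sketch does not cover.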
### T03 — Credential orchestrator: `creds-apply` ordering
```task
id: NK-WP-0004-T03
status: done
priority: high
state_hub_task_id: "4b386b92-8db9-440c-b116-52dbb2bd68cb"
```
The `creds-apply` Makefile target must run `create-secrets.sh` scripts in
the correct dependency order, with prerequisite checks at each step.
**Dependency graph:**
```
postgres/create-secrets.sh      (no dependencies)
lldap/create-secrets.sh         (needs: lldap/secrets.env)
├── authelia/create-secrets.sh  (needs: lldap/secrets.env → LLDAP_LDAP_USER_PASS)
└── keycape/create-secrets.sh   (needs: lldap/secrets.env + PI_ADMIN_TOKEN)
    └── PI_ADMIN_TOKEN available only after privacyIDEA T04
privacyidea/create-secrets.sh   (needs: privacyidea/secrets.env)
└── enckey-bootstrap.sh         ← TIME-SENSITIVE: must run while pod is live
**Implementation:**
Create `sso-mfa/bootstrap/creds-apply.sh` that:
1. Checks `KUBECONFIG` is set and cluster is reachable
2. Checks each `secrets/<component>/secrets.env` exists before sourcing it
3. Runs scripts in order: postgres → lldap → authelia → privacyidea
4. Explicitly skips keycape (requires PI_ADMIN_TOKEN, available only after the privacyIDEA T04 bootstrap)
5. Prints the keycape step as a manual reminder with the exact command
6. On success, updates `sso-mfa/bootstrap/creds-state.yaml`
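
The ordering and prerequisite checks above could be sketched as follows. This is an illustration under stated assumptions, not the contents of `creds-apply.sh`: the directory layout (`<secrets_root>/<component>/secrets.env`, `<scripts_root>/<component>/create-secrets.sh`), the `DRY_RUN` switch, and all function names are hypothetical. The state-file update from step 6 is left out.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the creds-apply ordering; paths and names are assumptions.

# Dependency order; keycape is deliberately absent (it needs PI_ADMIN_TOKEN).
COMPONENTS="postgres lldap authelia privacyidea"

# Step 1: KUBECONFIG must be set and the cluster must answer.
check_cluster() {
  [ -n "${KUBECONFIG:-}" ] || { echo "KUBECONFIG is not set" >&2; return 1; }
  kubectl cluster-info >/dev/null 2>&1 || { echo "cluster unreachable" >&2; return 1; }
}

# Step 2: fail fast if a component's secrets.env is missing.
require_env_file() {
  [ -f "$1" ] || { echo "missing: $1 (run make creds-generate first)" >&2; return 1; }
}

# Steps 3 to 5: run each script in order, then print the keycape reminder.
apply_all() {
  secrets_root="${1:-secrets}"
  scripts_root="${2:-sso-mfa}"
  [ -n "${DRY_RUN:-}" ] || check_cluster || return 1
  for c in $COMPONENTS; do
    require_env_file "$secrets_root/$c/secrets.env" || return 1
    if [ -n "${DRY_RUN:-}" ]; then
      echo "would run: $scripts_root/$c/create-secrets.sh"
    else
      "$scripts_root/$c/create-secrets.sh" || return 1
    fi
  done
  echo "skipped: keycape (run $scripts_root/keycape/create-secrets.sh once PI_ADMIN_TOKEN exists)"
}
```

Running `DRY_RUN=1 apply_all secrets sso-mfa` prints the planned sequence without touching the cluster, which gives a cheap way to review the order before a real run.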
### T04 — Credential state file
```task
id: NK-WP-0004-T04
status: done
priority: high
state_hub_task_id: "5bc125a7-ae42-40a3-864c-c356e5fc122d"
```
Create `sso-mfa/bootstrap/creds-state.yaml` — a tracked file (safe to
commit, contains no secrets) that records what has been done:
```yaml
# Credential state — net-kingdom SSO/MFA stack
# This file is safe to commit. It contains no secrets.
# Updated automatically by make creds-* targets.
generated_at: null # ISO datetime from last gen-secrets.sh run
bundle_at: null # ISO datetime from last pack-bundle.sh run
keepass_confirmed: false # Manually set to true after KeePassXC entry
secrets_applied:
  postgres: false
  lldap: false
  authelia: false
  privacyidea: false
  keycape: false           # Requires PI_ADMIN_TOKEN (post privacyIDEA T04)
enckey_bootstrapped: false # Set after enckey-bootstrap.sh runs
pi_admin_created: false    # Set after bootstrap-admin.sh runs
```
The `make creds-status` target reads this file and prints a human-readable
status table. The `make creds-verify` target checks actual K8s secret
existence and updates `secrets_applied` accordingly.
`keepass_confirmed` is the only field that requires manual operator
intervention to set to `true` — it represents the irreducibly human step
in the bootstrap process.
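
A status table of this kind could be produced without a YAML library. The sketch below assumes the state file keeps the simple shape shown above (scalar values, at most one level of nesting, comments introduced by `#`). The function name `creds_status` is hypothetical and the parser is not a general YAML reader.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of a creds-status printer for the flat state file above.

# Prints one "field value" row per scalar entry; nested keys are prefixed
# with their parent, e.g. secrets_applied.postgres.
creds_status() {
  awk '
    /^[ \t]*#/ || /^[ \t]*$/ { next }          # skip comment and blank lines
    {
      line = $0
      sub(/[ \t]+#.*$/, "", line)              # drop trailing comments
      indented = (line ~ /^[ \t]/)
      gsub(/^[ \t]+|[ \t]+$/, "", line)
      key = line; sub(/:.*/, "", key)
      val = line; sub(/^[^:]*:[ \t]*/, "", val)
      if (val == "") { parent = key; next }    # mapping header such as secrets_applied:
      if (!indented) parent = ""
      printf "%-32s %s\n", (parent == "" ? key : parent "." key), val
    }
  ' "${1:-sso-mfa/bootstrap/creds-state.yaml}"
}
```

Calling `creds_status` with no argument reads the default path from this task. A tool such as `yq` would be more robust if the file ever grows beyond this shape.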
### T05 — git pre-commit hook + `check-secrets` gate
```task
id: NK-WP-0004-T05
status: done
priority: high
state_hub_task_id: "d8ea8fbf-ae89-4675-afba-958187ca37f1"
```
Implement `.githooks/pre-commit` that prevents plaintext secrets from
entering git. Port from railiance-infra with net-kingdom-specific additions:
**Blocks:**
- Any file under `secrets/` without a SOPS marker
- Any file matching `*.env` outside of `sso-mfa/bootstrap/`
- Any file containing any of these patterns: `PI_SECRET_KEY=`, `PI_PEPPER=`,
`LLDAP_JWT_SECRET=`, `AUTHELIA_`, `BREAKGLASS_PASSWORD=`
**Warning only (does not block):**
- Files matching `*-bundle*.tar.age` being committed (large encrypted
artifacts belong offsite, not in git)
Add `make hooks-test` target that verifies the hook blocks plaintext
(mirrors railiance-infra pattern).
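
The blocking rules could be expressed as two small checks, one on the path and one on the staged content. This is a sketch under assumptions rather than the ported hook: the function names are hypothetical, the warning-only bundle rule is omitted, and the SOPS marker test is the simple `sops:` / `"sops":` match described in T01.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the pre-commit blocking rules; names are illustrative.

# Path rule: prints a reason when the path alone is forbidden.
path_violation() {
  case "$1" in
    sso-mfa/bootstrap/*.env) ;;                       # allowed location
    *.env) echo "env file outside sso-mfa/bootstrap/" ;;
  esac
}

# Content rule: reads the staged blob on stdin and prints a reason if blocked.
content_violation() {
  path="$1"
  content=$(cat)
  case "$path" in
    secrets/*)
      printf '%s\n' "$content" | grep -Eq '^sops:|"sops":' \
        || { echo "file under secrets/ without SOPS marker"; return; } ;;
  esac
  if printf '%s\n' "$content" | grep -Eq 'PI_SECRET_KEY=|PI_PEPPER=|LLDAP_JWT_SECRET=|AUTHELIA_|BREAKGLASS_PASSWORD='; then
    echo "forbidden secret pattern"
  fi
}

# Applies both rules to every added, copied, or modified staged file.
run_hook() {
  report=$(git diff --cached --name-only --diff-filter=ACM | while IFS= read -r p; do
    r=$(path_violation "$p")
    [ -n "$r" ] || r=$(git show ":$p" | content_violation "$p")
    [ -z "$r" ] || echo "BLOCKED $p: $r"
  done)
  [ -z "$report" ] || { echo "$report" >&2; return 1; }
}
# A real hook would end by calling: run_hook
```

Applied literally, the pattern list also matches documentation that merely names these variables, this workplan included, so a real hook probably needs an allowlist for such files.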
### T06 — Claude Code skill: `/creds-bootstrap`
```task
id: NK-WP-0004-T06
status: done
priority: medium
state_hub_task_id: "b9ecbd3f-17f0-4c1d-97e5-84bfbb43d360"
```
Create `~/.claude/commands/creds-bootstrap.md` — a Claude Code skill that
provides guided assistance during the credential bootstrap process.
**When to use it:** First-time bootstrap or onboarding a new operator.
The skill reads `sso-mfa/bootstrap/creds-state.yaml` and provides
contextual guidance based on what has been done.
**Skill behavior:**
1. Read `creds-state.yaml` to determine current state
2. Identify the next required step (first `false` in dependency order)
3. For KeePassXC entry steps: display the exact group path and field names
to enter, with values sourced from `secrets/` env files (if present)
4. For time-sensitive steps (enckey-bootstrap): print a prominent warning
with the exact command and timing constraint
5. For verification steps: run `make creds-verify` and interpret results
6. After each confirmed step: prompt operator to update `creds-state.yaml`
or do it automatically when the state can be derived from cluster state
**Skill definition file structure:**
```yaml
---
description: "Guide through net-kingdom credential bootstrap. Reads creds-state.yaml and provides step-by-step KeePassXC entry instructions, timing warnings, and verification."
argument-hint: "[--repo-path /path/to/net-kingdom]"
allowed-tools:
- Read
- Bash(make creds-status:*)
- Bash(make creds-verify:*)
- Bash(kubectl get secret:*)
---
```
**Note:** The skill does NOT automate KeePassXC entry (that remains a
human step). It provides the information an operator needs to do it
correctly and verifies the result afterwards.
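
The "first `false` in dependency order" rule from step 2 could be sketched in shell as below. The ordering in `STEP_ORDER` is an assumption inferred from the dependency graph in T03 and the field comments in T04, not something the skill file is known to contain. `state_value` and `next_step` are hypothetical names.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the next-step lookup; STEP_ORDER is an assumed ordering.

STEP_ORDER="generated_at bundle_at keepass_confirmed postgres lldap authelia privacyidea enckey_bootstrapped pi_admin_created keycape"

# Prints the value of a key, ignoring indentation and trailing comments.
# usage: state_value <file> <key>
state_value() {
  sed -n "s/^[ ]*$2:[ ]*\([^ #]*\).*/\1/p" "$1" | head -n 1
}

# Prints the first step whose value is false, null, or missing.
next_step() {
  for step in $STEP_ORDER; do
    v=$(state_value "$1" "$step")
    case "$v" in
      false|null|"") echo "$step"; return 0 ;;
    esac
  done
  echo "complete"
}
```

For a state file where everything up to `postgres` is done and `lldap` is still `false`, `next_step sso-mfa/bootstrap/creds-state.yaml` prints `lldap`.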
### T07 — Secret rotation runbook
```task
id: NK-WP-0004-T07
status: done
priority: medium
state_hub_task_id: "e27762d9-aa6a-4a7e-9c34-f8c546797548"
```
Document and automate the rotation procedure for each secret type.
Different secrets have different rotation complexity:
| Secret | Rotation impact | Procedure |
|--------|----------------|-----------|
| PI_SECRET_KEY | Flask session reset — all users logged out | Stop pod, rotate, restart |
| PI_PEPPER | Cannot rotate without re-hashing all passwords | Treat as permanent |
| PI_DB_PASSWORD | DB + K8s Secret must be rotated atomically | pg `ALTER ROLE` password change + Secret update |
| LLDAP_JWT_SECRET | All LLDAP sessions invalidated | Rotate Secret, restart pod |
| LLDAP_LDAP_USER_PASS | Must update LLDAP + Authelia + KeyCape atomically | 3-step coordinated |
| AUTHELIA_SESSION_SECRET | All Authelia sessions invalidated | Rotate, restart |
| AUTHELIA_KEYCAPE_CLIENT_SECRET | Must update Authelia (bcrypt) + KeyCape simultaneously | Coordinated 2-step |
| KeyCape RSA signing key | All issued tokens immediately invalidated | Brief auth outage |
| PI_ENCFILE | Cannot rotate — replace and re-enroll all tokens | Major operation |
| BREAKGLASS_PASSWORD | Low impact, rotate freely | Simple update |
Implement `make creds-rotate SECRET=<name>` that:
1. Validates the secret name is known
2. Prints the rotation impact and required coordination steps
3. Generates a new value (same entropy as original)
4. Guides through the atomic update sequence for that secret
5. Updates `creds-state.yaml` and ops bundle after rotation
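
Steps 1 to 3 could be sketched as follows. The impact strings are copied from the table above. The KeyCape RSA signing key is left out because the table gives it no variable name. The 32-byte default in `new_secret_hex` is an assumption, since the original entropy of each secret is not stated here. Function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of creds-rotate validation and value generation.

# Steps 1 and 2: print the rotation impact; fail for unknown names.
rotation_impact() {
  case "$1" in
    PI_SECRET_KEY)                  echo "Flask session reset, all users logged out" ;;
    PI_PEPPER)                      echo "Cannot rotate without re-hashing all passwords" ;;
    PI_DB_PASSWORD)                 echo "DB + K8s Secret must be rotated atomically" ;;
    LLDAP_JWT_SECRET)               echo "All LLDAP sessions invalidated" ;;
    LLDAP_LDAP_USER_PASS)           echo "Must update LLDAP + Authelia + KeyCape atomically" ;;
    AUTHELIA_SESSION_SECRET)        echo "All Authelia sessions invalidated" ;;
    AUTHELIA_KEYCAPE_CLIENT_SECRET) echo "Must update Authelia (bcrypt) + KeyCape simultaneously" ;;
    PI_ENCFILE)                     echo "Cannot rotate, replace and re-enroll all tokens" ;;
    BREAKGLASS_PASSWORD)            echo "Low impact, rotate freely" ;;
    *) echo "unknown secret: $1" >&2; return 1 ;;
  esac
}

# Step 3: N random bytes from the kernel CSPRNG, hex-encoded (default 32 bytes).
new_secret_hex() {
  head -c "${1:-32}" /dev/urandom | od -An -tx1 | tr -d ' \n'
}
```

A `creds-rotate` recipe could call `rotation_impact "$SECRET"` first and stop on a non-zero exit, so a mistyped name never reaches the generation step.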
## Done criteria
- [ ] `make creds-init` runs cleanly on a fresh workstation (age key check + setup)
- [ ] `make creds-generate` produces all secrets and prints KeePassXC entry guide
- [ ] `make creds-bundle` produces an age-encrypted ops bundle
- [ ] `make creds-apply` runs all `create-secrets.sh` scripts in dependency order
- [ ] `make creds-verify` accurately reflects K8s secret state
- [ ] `make creds-status` shows a readable state table from `creds-state.yaml`
- [ ] `make hooks-test` confirms pre-commit hook blocks plaintext commits
- [ ] `/creds-bootstrap` skill loads, reads state, and provides correct next step
- [ ] NK-WP-0003-T01 can be marked done by referencing this workplan as complete