the-custodian/canon/standards/credential-management_v0.1.md
tegwick 0777e5b2f0 feat: add FOS/credential standards, big-picture guidance, and CUST-WP-0025 workplan
- canon/standards/credential-management_v0.1.md: single root-of-trust credential hierarchy standard
- canon/standards/federated-organization-standard_v1.0.md: FOS reference architecture (VSM-based)
- wiki/BigPictureGuidance.md: integration guidance for OAS + FOS orthogonal layers
- workplans/CUST-WP-0025-fos-hub-bootstrap.md: 4-phase plan (identity, hub-core extraction, ops-hub, fin-hub)
- state-hub/Makefile: treat exit 2 (warnings-only) as success in check-consistency targets

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-20 23:48:13 +01:00


---
title: "Credential Management Standard"
version: "0.1"
status: "Draft Standard"
domain: custodian
scope: all-domains
created: "2026-03-20"
---
# Credential Management Standard
**Version:** 0.1
**Status:** Draft Standard
**Scope:** All domains and repositories in the federated organization
---
## 1. Purpose
This standard defines how credentials, secrets, and key material are
managed across all systems — from a developer workstation with no
infrastructure, to a fully operational Kubernetes cluster.
The core principle is a **single root of trust**: one operator keypair
anchors all credential storage and encryption. Every secret can be
traced back to that root. No secret lives outside this hierarchy.
---
## 2. Trust Hierarchy
```
Operator passphrase (human memory only, never stored anywhere)
└── age keypair (~/.config/sops/age/key.txt, one per operator)
    ├── SOPS encryption (GitOps secrets in all repos)
    │   └── secrets/**/*.sops.yaml (encrypted at rest in git)
    ├── Ops bundle (age-encrypted tar, offsite backup)
    │   └── ops-bundle-<date>.tar.age
    │       └── all service secrets at point-in-time
    └── KeePassXC (pre-cluster primary credential store)
        │   └── master password = operator passphrase (or derived)
        ├── Infrastructure credentials
        │   ├── SSH keys (server access)
        │   ├── API tokens (Gitea, HostEurope, Hetzner)
        │   └── Cloud credentials
        ├── Service secrets (per-domain groups)
        │   ├── net-kingdom/privacyidea/
        │   ├── net-kingdom/lldap/
        │   ├── net-kingdom/authelia/
        │   ├── net-kingdom/keycape/
        │   └── railiance/postgres/
        └── Vault root token (in-cluster phase, stored here)
            └── HashiCorp Vault
                └── External Secrets Operator (ESO)
                    └── K8s Secrets → pods
```
---
## 3. Phases
### Phase 0 — Pre-cluster (bootstrap)
**Used when:** No Kubernetes cluster is available. Local development,
initial server provisioning, CI bootstrap.
**Tools:** age keypair + KeePassXC + ops bundle
**Flow:**
1. Generate service secrets with a `gen-secrets.sh` script
2. Copy each secret manually into KeePassXC (under the appropriate group)
3. Encrypt a point-in-time ops bundle: `pack-bundle.sh <secrets-dir> <age-pub-key>`
4. Store the ops bundle offsite (separate physical location from KeePassXC)
5. Shred the plaintext secrets directory: `find secrets/ -type f -exec shred -u {} \;`
6. When deploying to Kubernetes, read each secret from KeePassXC and inject it
via `create-secrets.sh` scripts that produce K8s Secrets
**Invariant:** Plaintext secrets MUST NOT persist on disk after being
stored in KeePassXC. The only durable forms are: KeePassXC + ops bundle.
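
Step 1 of the flow above could look roughly like the following. This is a minimal sketch of what a `gen-secrets.sh` might contain, not the actual script: the function names `rand_hex` and `gen_env_file`, the one-env-file-per-service layout, and the 32-byte hex format are all assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical gen-secrets.sh helpers (names and file layout are assumptions).
set -eu

# Print N random bytes from /dev/urandom as lowercase hex (N defaults to 32).
rand_hex() {
  head -c "${1:-32}" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Write KEY=<random hex> lines for each named key into <dir>/<service>.env.
# The file is created with mode 600 so only the operator can read it.
gen_env_file() {
  dir=$1; service=$2; shift 2
  mkdir -p "$dir"
  out="$dir/$service.env"
  ( umask 077; : > "$out" )
  chmod 600 "$out"
  for key in "$@"; do
    printf '%s=%s\n' "$key" "$(rand_hex 32)" >> "$out"
  done
}
```

For example, `gen_env_file ./secrets lldap LLDAP_JWT_SECRET LLDAP_LDAP_USER_PASS` would write `./secrets/lldap.env` with two random values, ready to be copied into KeePassXC and then shredded.
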
---
### Phase 1 — GitOps secrets (SOPS)
**Used when:** Secrets need to live alongside infrastructure code in git.
All repos with infrastructure manifests use this pattern.
**Tools:** SOPS + age
**Configuration (`.sops.yaml` in repo root):**
```yaml
creation_rules:
  - path_regex: secrets/.*$
    age: >-
      <operator-age-public-key>
  - path_regex: .*\.sops\.yaml$
    age: >-
      <operator-age-public-key>
```
**Multi-operator:** When a second operator joins, add their age public key
as an additional recipient and re-encrypt all secrets with `sops updatekeys`.
Both keys can decrypt independently — no single point of failure.
**Invariant:** The age private key is NEVER committed to git. The public
key is committed (in `.sops.yaml` and `keys/age.pub`). Encrypted values
in git are safe to store and review.
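
One way to guard the invariant above is a check that runs before each commit. The sketch below is a heuristic, not part of the standard: it relies on the fact that SOPS adds a top-level `sops:` metadata block to files it encrypts, and the function name `check_sops_encrypted` is hypothetical.

```bash
set -eu

# Return 0 if every *.sops.yaml file under <dir> carries a top-level
# `sops:` metadata block. Print each offending path and return 1 otherwise.
check_sops_encrypted() {
  dir=${1:?usage: check_sops_encrypted <dir>}
  # The inner loop prints only the files that lack the metadata block.
  bad=$(find "$dir" -type f -name '*.sops.yaml' -exec sh -c '
    for f do
      grep -q "^sops:" "$f" || printf "%s\n" "$f"
    done' sh {} +)
  if [ -n "$bad" ]; then
    printf 'not SOPS-encrypted:\n%s\n' "$bad" >&2
    return 1
  fi
}
```

Calling `check_sops_encrypted secrets` from a pre-commit hook would block a commit that contains a plaintext `*.sops.yaml` file. It does not prove the values are encrypted, only that the SOPS metadata is present.
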
---
### Phase 2 — In-cluster (HashiCorp Vault)
**Used when:** Kubernetes cluster is operational and stable.
**Tools:** HashiCorp Vault + External Secrets Operator (ESO)
**Why ESO over Vault Agent Injector:** ESO produces standard K8s Secrets,
which are compatible with plain Helm charts and do not require pod
annotation changes. See decision D4 in net-kingdom DECISIONS.md.
**Flow:**
1. Initialize Vault, store its root token and unseal keys in KeePassXC, and use that root token for the bootstrap steps below
2. Enable Kubernetes auth method (`vault auth enable kubernetes`)
3. Create per-service policies with least-privilege access
4. Migrate each service secret from KeePassXC into Vault
5. Deploy an ESO `SecretStore` (or a cluster-wide `ClusterSecretStore`, as in section 9) pointing to Vault
6. Replace `create-secrets.sh` calls with `ExternalSecret` manifests
7. ESO reconciles the secrets held in Vault into K8s Secrets automatically
**KeePassXC post-cluster:** Remains the source of truth for:
- The Vault root token and unseal keys (emergency use only)
- Dev/sandbox systems that do not connect to in-cluster Vault
- New secrets before they are migrated into Vault
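
Step 6 of the flow replaces imperative `create-secrets.sh` calls with declarative manifests. The sketch below prints one such `ExternalSecret`. Several details are assumptions rather than anything this standard fixes: the store name `vault-backend`, the Vault key layout `<namespace>/<service>`, the target Secret name `<service>-secrets`, the one-hour refresh interval, and the `external-secrets.io/v1beta1` API version (which depends on the installed ESO release).

```bash
set -eu

# Print an ExternalSecret manifest for one key of one service.
# Arguments: <namespace> <service> <key>
# Store name, Vault path layout, and API version are illustrative assumptions.
render_external_secret() {
  ns=${1:?usage: render_external_secret <namespace> <service> <key>}
  service=${2:?missing service}
  key=${3:?missing key}
  cat <<EOF
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: ${service}
  namespace: ${ns}
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: ${service}-secrets
  data:
    - secretKey: ${key}
      remoteRef:
        key: ${ns}/${service}
        property: ${key}
EOF
}
```

The output can be reviewed and committed, or piped straight to the cluster, for example `render_external_secret net-kingdom lldap LLDAP_JWT_SECRET | kubectl apply -f -`.
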
---
## 4. KeePassXC Group Structure
All service secrets are organized under a standardized group hierarchy:
```
KeePassXC root
├── Infrastructure
│   ├── SSH Keys
│   │   └── <hostname> (private key as attachment, public key as note)
│   ├── API Tokens
│   │   ├── gitea-admin
│   │   ├── hosteurope-api
│   │   └── hetzner-api
│   └── Cloud Credentials
│       └── <provider>
├── net-kingdom
│   ├── privacyidea
│   │   ├── PI_SECRET_KEY
│   │   ├── PI_PEPPER
│   │   ├── PI_DB_PASSWORD
│   │   ├── pi-admin (password + totp-seed)
│   │   ├── trigger-admin (password + API token)
│   │   └── enckey (attachment: enckey file + audit keypair)
│   ├── lldap
│   │   ├── LLDAP_JWT_SECRET
│   │   └── LLDAP_LDAP_USER_PASS
│   ├── authelia
│   │   ├── AUTHELIA_JWT_SECRET
│   │   ├── AUTHELIA_SESSION_SECRET
│   │   ├── AUTHELIA_STORAGE_ENCRYPTION_KEY
│   │   ├── AUTHELIA_OIDC_HMAC_SECRET
│   │   └── AUTHELIA_KEYCAPE_CLIENT_SECRET
│   └── keycape
│       ├── RSA signing key (attachment: private + public PEM)
│       └── PI_ADMIN_TOKEN
├── railiance
│   ├── postgres
│   │   └── PG_ROOT_PASSWORD
│   └── sops-age
│       └── age private key (attachment: key.txt)
└── vault
    ├── root-token
    └── unseal-keys (attachment: unseal-keys.txt, gpg-encrypted)
```
---
## 5. Age Keypair Management
**One keypair per operator.** The same key is used for:
- SOPS encryption across all repos
- Ops bundle encryption
**Generate:**
```bash
mkdir -p ~/.config/sops/age   # age-keygen does not create parent directories
age-keygen -o ~/.config/sops/age/key.txt
# output: Public key: age1...
```
**Add to repos:** Copy the public key into `.sops.yaml` of each repo and
into `keys/age.pub`. Commit both.
**Back up:** The private key file MUST be stored in KeePassXC as an
attachment under `railiance/sops-age/age private key`. The KeePassXC
database is the disaster recovery path for the age private key.
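**Rotation:** If the private key is compromised, generate a new keypair and
replace the old public key with the new one in every `.sops.yaml` and
`keys/age.pub`. Then run `sops updatekeys <file>` on every encrypted file so
the old key is dropped as a recipient, and rotate each file's data key
(`sops rotate -i <file>` in recent SOPS releases). Ciphertext already in git
history stays decryptable with the old key, so the affected secret values
themselves MUST also be rotated.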
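
A quick consistency check between the operator's key file and a repo's `.sops.yaml` can catch a stale or missing recipient. The sketch below reads the `# public key:` comment that `age-keygen` writes at the top of a key file. The function names are hypothetical.

```bash
set -eu

# Print the public key recorded in the comment header of an age key file
# (the line "# public key: age1..." written by age-keygen).
age_public_key() {
  sed -n 's/^# public key: //p' "${1:?usage: age_public_key <key-file>}" | head -n 1
}

# Return 0 if <sops-config> lists the public key of <key-file> as a recipient.
sops_config_has_key() {
  key_file=${1:?usage: sops_config_has_key <key-file> <sops-config>}
  sops_config=${2:?missing path to .sops.yaml}
  pub=$(age_public_key "$key_file")
  [ -n "$pub" ] && grep -qF "$pub" "$sops_config"
}
```

For example, `sops_config_has_key ~/.config/sops/age/key.txt .sops.yaml` succeeds only when the repo encrypts to this operator. If the comment header has been stripped from the key file, `age-keygen -y <key-file>` can derive the public key from the private key instead.
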
---
## 6. Ops Bundle
The ops bundle is a point-in-time snapshot of all service secrets,
encrypted with age and stored offsite.
**Create:**
```bash
bash gen-secrets.sh ./secrets # generates all secrets as env files
# ... enter each into KeePassXC ...
bash pack-bundle.sh ./secrets <age-pub-key> # → ops-bundle-<date>.tar.age
find secrets/ -type f -exec shred -u {} \; # shred plaintext
```
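
The standard names `pack-bundle.sh` but does not show its contents. A minimal sketch of what it might do follows, assuming that `<date>` is an ISO date (YYYY-MM-DD) and that an optional third argument selects the output directory. Both are assumptions, as is the function name `pack_bundle`.

```bash
set -eu

# Tar <secrets-dir> and encrypt the stream to <age-pub-key>, writing
# ops-bundle-<date>.tar.age into <out-dir> (default: current directory).
# Prints the path of the bundle it wrote.
pack_bundle() {
  secrets_dir=${1:?usage: pack_bundle <secrets-dir> <age-pub-key> [out-dir]}
  pub_key=${2:?missing age public key}
  out_dir=${3:-.}
  out="$out_dir/ops-bundle-$(date +%Y-%m-%d).tar.age"
  # The tar stream goes straight into age, so no plaintext archive is
  # written to disk. Under bash, `set -o pipefail` would also catch tar errors.
  tar -C "$secrets_dir" -cf - . | age -r "$pub_key" -o "$out"
  printf '%s\n' "$out"
}
```

Because only the public key is needed to encrypt, this step can run on a machine that never holds the age private key.
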
**Restore:**
```bash
age -d -i ~/.config/sops/age/key.txt -o secrets.tar ops-bundle-<date>.tar.age
tar xf secrets.tar
# re-run create-secrets.sh scripts from restored env files
shred -u secrets.tar   # then shred the extracted env files as in the Create step
```
**Frequency:** Create a new ops bundle:
- Before any major cluster operation (migration, upgrade, rekey)
- After adding or rotating any service secret
- At least once per quarter
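
The quarterly rule is the only one of the three triggers that can be checked from file timestamps alone. A minimal sketch, assuming bundles are kept in one directory under the `ops-bundle-*.tar.age` naming shown above and that 90 days stands in for a quarter (the function name is hypothetical):

```bash
set -eu

# Return 0 if <dir> holds at least one ops-bundle-*.tar.age file modified
# within the last <max-days> days (default 90, roughly one quarter).
bundle_is_fresh() {
  dir=${1:?usage: bundle_is_fresh <dir> [max-days]}
  max_days=${2:-90}
  recent=$(find "$dir" -type f -name 'ops-bundle-*.tar.age' -mtime "-$max_days")
  [ -n "$recent" ]
}
```

A scheduled job could run `bundle_is_fresh /path/to/bundles || echo "ops bundle overdue"` as a reminder. The event-driven triggers (major operations, secret rotation) still need to be part of the runbook.
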
---
## 7. Prohibited Patterns
These are hard violations regardless of context:
| Pattern | Why prohibited |
|---------|----------------|
| Plaintext secrets committed to git | Unrecoverable leak |
| Secrets typed on the command line or exported as environment variables in an interactive shell | Exposure via `~/.bash_history` |
| Sharing secrets via chat, email, or issue trackers | Uncontrolled propagation |
| Using the same password for multiple services | Single-point compromise |
| Storing the age private key on a single machine only | Catastrophic loss on disk failure |
| Hardcoded secrets in application code or Helm values | Accidental publishing |
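
Some of these patterns can be checked mechanically. As one example, the sketch below looks for reused values across the generated env files before they are shredded. It assumes the files use plain `KEY=VALUE` lines, and the function name is hypothetical. Only key names are printed, never the values.

```bash
set -eu

# Read KEY=VALUE lines from all *.env files under <dir> and report each key
# whose value was already used by another key. Return 1 if any are found.
find_reused_secrets() {
  dir=${1:?usage: find_reused_secrets <dir>}
  find "$dir" -type f -name '*.env' -exec cat {} + |
    awk -F= '
      /^[A-Za-z_][A-Za-z0-9_]*=/ {
        value = substr($0, length($1) + 2)   # everything after the first "="
        if (value in seen) {
          print $1 " reuses the value of " seen[value]
          dup = 1
        } else {
          seen[value] = $1
        }
      }
      END { exit (dup ? 1 : 0) }
    '
}
```

Running `find_reused_secrets ./secrets` between generation and shredding gives a cheap guard against the "same password for multiple services" violation.
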
---
## 8. Multi-operator Extension
When a second operator needs access:
1. They generate their own age keypair (`age-keygen`)
2. Share only the **public key** (never the private key)
3. Primary operator adds it to `.sops.yaml` in all repos
4. Primary operator runs `sops updatekeys <file>` on all encrypted files
5. Both operators can now encrypt and decrypt independently
6. Share the KeePassXC database over an encrypted channel (never in plaintext);
after import, the second operator protects their copy with their own master password
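
Step 4 has to touch every encrypted file, which is easy to get wrong by hand. A minimal sketch of a loop over one repo follows. It assumes encrypted files match `*.sops.yaml`, and the function name is hypothetical. The `--yes` flag of `sops updatekeys` skips the per-file confirmation prompt.

```bash
set -eu

# Run `sops updatekeys` on every *.sops.yaml file under <dir> so that each
# file's recipient list matches the repo's .sops.yaml.
update_all_keys() {
  dir=${1:?usage: update_all_keys <dir>}
  find "$dir" -type f -name '*.sops.yaml' |
    while IFS= read -r f; do
      # stdin is redirected so sops cannot consume the file list
      sops updatekeys --yes "$f" < /dev/null
    done
}
```

After `update_all_keys secrets`, reviewing the resulting diff before committing confirms that only the recipient metadata changed.
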
---
## 9. Vault Migration Checklist
When the cluster is stable enough to operate Vault:
- [ ] Deploy Vault via Helm with HA mode (3 replicas minimum)
- [ ] Store root token and unseal keys in KeePassXC (vault/ group)
- [ ] Enable Kubernetes auth method
- [ ] Create per-service Vault policies (least privilege)
- [ ] Deploy ESO `ClusterSecretStore` pointing to Vault
- [ ] For each service: create `ExternalSecret` manifest, verify K8s Secret
reconciles correctly, then delete the manually-created K8s Secret
- [ ] Verify that ESO refresh works (set `refreshInterval` to 1h, change a value in Vault, confirm the K8s Secret updates)
- [ ] Remove `create-secrets.sh` scripts from deployment runbooks
- [ ] Update this standard to Phase 2 operational status
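
For the per-service policy item, a least-privilege policy can be as small as one read-only path. The sketch below prints such a policy. It assumes a KV version 2 secrets engine mounted at `secret/` and a `<domain>/<service>` path layout mirroring the KeePassXC groups. Neither is specified by this standard, and the function name is hypothetical.

```bash
set -eu

# Print a read-only Vault policy for one service.
# Assumes KV v2 mounted at "secret/" (hence the "data/" path segment).
render_vault_policy() {
  domain=${1:?usage: render_vault_policy <domain> <service>}
  service=${2:?missing service}
  cat <<EOF
# Read-only access to the secrets of ${domain}/${service}
path "secret/data/${domain}/${service}" {
  capabilities = ["read"]
}
EOF
}
```

The output can be loaded with `render_vault_policy net-kingdom lldap | vault policy write lldap-read -`, where the policy name `lldap-read` is only an example.
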
---
## 10. Summary
| Situation | Tool | Source of truth |
|-----------|------|----------------|
| No cluster, local dev | KeePassXC + create-secrets.sh | KeePassXC |
| GitOps secrets in repo | SOPS + age | Git (ciphertext) |
| Cluster operational | Vault + ESO | Vault (KeePassXC holds root) |
| Disaster recovery | Ops bundle (age) | Offsite encrypted archive |
| Multi-operator | SOPS multi-recipient | Each operator's age keypair |