ops-warden/wiki/OpenBaoSshEngineChecklist.md
tegwick 1865e0744e WARDEN-WP-0006: NetKingdom stewardship docs and alignment
Add credential routing, actor patterns, security map, OpenBao SSH
checklist, and policy-gated signing design. Update registry and SCOPE;
record INTENT↔SCOPE reassessment (C3 completeness).
2026-06-17 08:22:45 +02:00


# OpenBao SSH Engine — Operational Checklist
Date: 2026-06-17

Verify the production SSH signing path for `warden` against platform OpenBao.
Cluster bootstrap and unseal are **not** in ops-warden scope; see
`railiance-platform/docs/openbao.md`.

---
## Prerequisites
- [ ] OpenBao deployed on Railiance (`railiance-platform` helm/Makefile)
- [ ] `bao status` reports initialized and **unsealed**
- [ ] Operator holds a **scoped token**; the root token is never set in `VAULT_TOKEN` for daily warden use
- [ ] `warden.yaml` points `vault.addr` at the correct endpoint:
  - Workstation: `https://bao.coulomb.social`
  - In-cluster: `http://openbao.openbao.svc.cluster.local:8200`
- [ ] Actor exists in inventory — `wiki/ActorInventoryPatterns.md`
- [ ] Test pubkey available (the matching private key is mode 600 and never committed)
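
A few of the prerequisites above can be checked from a shell. The sketch below is one possible approach, not part of warden: the function names are hypothetical, `bao_seal_state` relies on the documented exit codes of `bao status` (0 unsealed, 2 sealed, 1 error), and `key_mode_ok` assumes a `find` that supports `-perm`. Call them as, for example, `bao_seal_state` and `key_mode_ok ~/.ssh/<key>`.

```bash
# Report the seal state from the exit code of `bao status`
# (0 = unsealed, 2 = sealed, anything else = error or unreachable).
bao_seal_state() {
  local rc=0
  bao status >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "unsealed" ;;
    2) echo "sealed"; return 2 ;;
    *) echo "error"; return 1 ;;
  esac
}

# Succeed only if the path is a regular file with permissions exactly 600.
key_mode_ok() {
  [ -n "$(find "$1" -type f -perm 600 2>/dev/null)" ]
}
```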
---
## One-time SSH engine setup (operator)
Run as an operator holding an OpenBao admin policy. Do not run these commands from an agent chat session or paste them from chat logs.
```bash
# Confirm reachability
bao status

# Enable SSH secrets engine (skip if already enabled)
bao secrets enable ssh

# Roles: max TTL must match ActorType policy (wiki/OpsWardenConfig.md)
bao write ssh/roles/agt-role \
  key_type=ca \
  allowed_users="*" \
  allow_user_certificates=true \
  default_user="agt" \
  ttl=24h max_ttl=24h

bao write ssh/roles/adm-role \
  key_type=ca \
  allowed_users="*" \
  allow_user_certificates=true \
  default_user="adm" \
  ttl=48h max_ttl=48h

bao write ssh/roles/atm-role \
  key_type=ca \
  allowed_users="*" \
  allow_user_certificates=true \
  default_user="atm" \
  ttl=8h max_ttl=8h

# Verify roles listed
bao list ssh/roles
```
Document how the CA public key is distributed to hosts via railiance-infra; warden does
not deploy `TrustedUserCAKeys`.

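
For the handoff to railiance-infra, the CA public key can be read from the SSH mount. The sketch below assumes the mount is `ssh` and that a CA is already configured at `ssh/config/ca`. The role setup in this checklist does not show that step; on a fresh mount the SSH engine typically needs it first (for example `bao write ssh/config/ca generate_signing_key=true`), and whether that belongs to this checklist or to platform ops is an open assumption. The function name and output filename are hypothetical.

```bash
# Print the SSH CA public key from the given mount (default: ssh).
# Assumes a CA has already been configured at <mount>/config/ca.
ca_public_key() {
  local mount="${1:-ssh}"
  bao read -field=public_key "${mount}/config/ca"
}

# Example (not run here): save the key for handoff to railiance-infra.
# ca_public_key ssh > trusted-user-ca-keys.pem
```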
---
## Token policy expectations
| Rule | Rationale |
| --- | --- |
| No root token in `VAULT_TOKEN` for warden workflows | Root is break-glass only |
| Token scoped to `ssh/sign/<role>` for needed roles | Least privilege |
| Short TTL on operator tokens | Limit blast radius |
| Prefer OIDC/login-derived tokens via KeyCape where available | Platform admin path |

Example policy shape (illustrative; adjust in OpenBao policy admin):
```hcl
path "ssh/sign/agt-role" {
  capabilities = ["create", "update"]
}
```
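
Before running warden, it may help to confirm that the token in `VAULT_TOKEN` matches the expectations in the table. This is a minimal sketch using `bao token capabilities`, which prints the current token's capabilities on a path (`deny` when there are none, `root` for a root token). The function name `can_sign_role`, and the policy and file names in the comments, are hypothetical.

```bash
# Succeed if the current token has some capability on <mount>/sign/<role>
# and is not a root token. Diagnostics go to stderr.
can_sign_role() {
  local role="$1" mount="${2:-ssh}" caps
  caps="$(bao token capabilities "${mount}/sign/${role}")" || return 1
  case "$caps" in
    *root*) echo "root token in use; use a scoped token" >&2; return 1 ;;
    *deny*) echo "token cannot sign with ${role}" >&2; return 1 ;;
    *)      return 0 ;;
  esac
}

# Example (not run here; policy and file names are placeholders):
# bao policy write warden-agt-sign agt-sign.hcl
# bao token create -policy=warden-agt-sign -ttl=1h
# can_sign_role agt-role && echo "ok"
```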
---
## warden.yaml sanity check
```yaml
backend: vault
vault:
  addr: https://bao.coulomb.social
  mount: ssh
  role_map:
    adm: adm-role
    agt: agt-role
    atm: atm-role
  token_env: VAULT_TOKEN
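
To check that the configured `addr` answers, one option is to probe OpenBao's `/v1/sys/health` endpoint, whose HTTP status reflects seal and init state. The helpers below are a sketch with hypothetical names. They assume `curl` is available and that `addr:` appears unquoted on its own line in `warden.yaml`; this is a plain text match, not a YAML parser. Usage would look like `probe_configured_addr warden.yaml`.

```bash
# Print the first `addr:` value found in a warden.yaml-style file.
warden_vault_addr() {
  sed -n 's/^[[:space:]]*addr:[[:space:]]*//p' "$1" | head -n 1
}

# Translate a /v1/sys/health status code into a short verdict.
describe_health() {
  case "$1" in
    200) echo "initialized, unsealed, active" ;;
    429) echo "unsealed, standby" ;;
    501) echo "not initialized" ;;
    503) echo "sealed" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Probe the configured endpoint and print the verdict for its HTTP status.
probe_configured_addr() {
  local addr code
  addr="$(warden_vault_addr "$1")"
  code="$(curl -s -o /dev/null -w '%{http_code}' "${addr}/v1/sys/health")" || true
  describe_health "$code"
}
```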
---
## Verification procedure
```bash
export VAULT_TOKEN="<scoped-token>" # never paste in chat or commit
# 1. Config loads
warden status --help
# 2. Sign test actor (replace actor and pubkey paths)
warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub \
  | head -c 80 && echo "..."
# 3. Metadata
warden status agt-state-hub-bridge
# 4. Audit line
warden log --actor agt-state-hub-bridge --last 1
# 5. Compliance
warden scorecard
```
**Pass criteria:**
- Exit code 0 on sign and status
- Cert `valid_before` in the future
- `signatures.log` has new JSONL line with `"backend": "vault"`
- Scorecard passes on clean state dir
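
Two of the pass criteria can be checked mechanically. The sketch below uses hypothetical function names and takes the `signatures.log` path as an argument because its location is not stated here. `cert_validity` reads the output of `ssh-keygen -L -f <cert>` on stdin and prints the validity window for a manual comparison against the current time, as in `ssh-keygen -L -f <cert> | cert_validity`.

```bash
# Succeed if the last line of the given JSONL log records the vault backend.
last_signature_is_vault() {
  tail -n 1 "$1" | grep -Eq '"backend"[[:space:]]*:[[:space:]]*"vault"'
}

# Read `ssh-keygen -L` output on stdin and print the validity window,
# for example "from <start> to <end>".
cert_validity() {
  sed -n 's/^[[:space:]]*Valid:[[:space:]]*//p'
}
```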
---
## cert_command smoke (ops-bridge)
In `tunnels.yaml`, set:
```yaml
cert_command: "warden sign <actor> --pubkey <path/to>.pub"
```
Bring up the tunnel and confirm that SSH connects with the cert and key (see the ops-bridge docs).

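
Outside ops-bridge, the same cert and key pair can be exercised with a plain `ssh` call. This is a sketch under assumptions: the signed certificate has been saved to a file, the target accepts the cert's principal, and the function name `ssh_cert_smoke` is hypothetical. `CertificateFile` and `IdentitiesOnly` are standard OpenSSH client options. Usage would look like `ssh_cert_smoke <private-key> <cert-file> <user>@<host>`.

```bash
# Run a no-op command over SSH using an explicit private key and certificate.
# Exit code 0 means the host accepted the certificate.
ssh_cert_smoke() {
  local key="$1" cert="$2" target="$3"
  ssh -o IdentitiesOnly=yes -o CertificateFile="$cert" -i "$key" "$target" true
}
```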
---
## Failure modes
| Symptom | Likely cause | Action |
| --- | --- | --- |
| `Vault token not found` | `VAULT_TOKEN` unset | Log in or issue a scoped token, then export `VAULT_TOKEN` |
| HTTP 403 from OpenBao | Token lacks sign permission | Fix policy |
| `No Vault role mapped` | `role_map` mismatch | Fix warden.yaml |
| `ttl exceeds max` | Inventory TTL > ActorType max | Fix inventory or role |
| Connection refused or HTTP 503 | Wrong `addr`, OpenBao down, or OpenBao sealed | Check platform ops |
| Host rejects cert | Cert principal not authorized on host | Fix `auth_principals` via railiance-infra |
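
For the "Host rejects cert" row, listing the principals embedded in the certificate shows what the host must authorize. The helper below is a sketch with a hypothetical name. It reads the output of `ssh-keygen -L -f <cert>` on stdin and prints one principal per line, as in `ssh-keygen -L -f <cert> | principals_from_listing`. It assumes the usual OpenSSH listing layout, where principals appear between the `Principals:` and `Critical Options:` lines.

```bash
# Read `ssh-keygen -L` output on stdin and print each certificate principal.
principals_from_listing() {
  awk '
    /^[ \t]*Principals:/       { in_list = 1; next }
    /^[ \t]*Critical Options:/ { in_list = 0 }
    in_list { sub(/^[ \t]+/, ""); print }
  '
}
```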

**Lab fallback:** `backend: local` in warden.yaml is **not** a production substitute.
Use it only for offline dev when OpenBao is unreachable.

---
## Boundaries
- ops-warden does not unseal OpenBao or rotate unseal keys
- ops-warden does not store API keys alongside SSH signing
- Host trust of CA pubkey is railiance-infra responsibility
---
## See also
- `wiki/OpsWardenConfig.md`
- `railiance-platform/docs/openbao.md`
- `wiki/CredentialRouting.md`