OpenBao, SSH, and Bootstrap Custody — State Assessment
Date: 2026-06-17
Author: codex (with operator session evidence)
Purpose: Persist current state, concepts, and navigation map so security
setup work does not lose context while implementing NET-WP-0020 T5 and related
automation.
Repos: net-kingdom, railiance-platform, railiance-infra, ops-warden
1. Executive summary
NetKingdom’s first platform bootstrap is complete (console stage S6,
OpenBao live at https://bao.coulomb.social). SSH certificate infrastructure
via OpenBao is not started: no ssh/ secrets engine, hosts still on legacy
static-key SSH that predates OpenBao and ops-warden.
We adopted an automation-first custody strategy for future greenfield
rebuilds (sops-held-automation), while blocking unimplemented production
models (attended-ceremony, auto-unseal-transit) in the security bootstrap
console. That does not re-init the live cluster.
Next implementation slice (after this assessment): NET-WP-0020 T5 — declarative OpenBao SSH engine + railiance-infra host CA trust — on the live cluster first, then prove full unattended chain on greenfield 3-node.
2. Current state (verified 2026-06-17)
2.1 NetKingdom security bootstrap (operator workstation)
| Item | State |
|---|---|
| Metadata | net-kingdom/.local/security-bootstrap.json |
| Console stage | S6 — Reopen under custody |
| King custody | Approved (temporary-single-king or equivalent) |
| OpenBao unseal custody model | sops-held-automation selected (2026-06-17) |
| OpenBao init | openbao_initialized: true (from first attended bootstrap) |
| All bootstrap gates | done (preflight, OIDC, restore drill, platform reopen) |
| Plaintext bootstrap secrets | absent (good) |
| Encrypted bundle | sso-mfa/bootstrap/secrets.enc (11 files) |
Interpretation: Selecting sops-held-automation records the preferred
model for the next rebuild. The init ceremony gate shows done because OpenBao
was already initialized manually in NET-WP-0015–0017, not because SOPS-held
automation ran on this cluster.
2.2 OpenBao platform (Railiance01 / production endpoint)
| Check | Result |
|---|---|
| /v1/sys/health | initialized, unsealed, v2.5.4+ |
| UI login | netkingdom / platform-admin (KeyCape OIDC) — works |
| ssh/ secrets engine | Not enabled (operator confirmed) |
| platform/operators/ops-warden KV | Not required for SSH signing |
Evidence: ops-warden/history/2026-06-17-openbao-production-verify.md
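The two checks in the table above can be scripted roughly as follows. This is a minimal sketch, not the procedure used for the recorded evidence: the function names are hypothetical, the BAO_ADDR variable is an assumption about how the address is supplied to the CLI, and the mount check assumes the tabular output of `bao secrets list` with the mount path in the first column.

```shell
# Sketch only: the default address comes from this document; names are hypothetical.
BAO_ADDR="${BAO_ADDR:-https://bao.coulomb.social}"

# Reads a /v1/sys/health JSON body on stdin and succeeds only if it reports
# initialized=true and sealed=false.
health_is_ready() {
  local body
  body="$(cat)"
  printf '%s' "$body" | grep -Eq '"initialized"[[:space:]]*:[[:space:]]*true' &&
    printf '%s' "$body" | grep -Eq '"sealed"[[:space:]]*:[[:space:]]*false'
}

# Reads secrets-engine listing output on stdin and succeeds if an ssh/ mount
# appears at the start of a line.
ssh_engine_mounted() {
  grep -Eq '^ssh/[[:space:]]'
}

# Live usage (needs network, and a token for the second command):
#   curl -fsS "$BAO_ADDR/v1/sys/health" | health_is_ready && echo "OpenBao ready"
#   bao secrets list | ssh_engine_mounted || echo "ssh/ engine not enabled"
```

Both functions read from stdin, so they can be exercised against saved output without touching the production endpoint.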
2.3 ops-warden workstation
| Item | State |
|---|---|
| ~/.config/warden/warden.yaml | Present (backend: vault, bao.coulomb.social) |
| ~/.config/warden/inventory.yaml | Present (seed actors) |
| Test keypair | ~/.ssh/agt-state-hub-bridge_ed25519 created |
| warden sign against production | Blocked: no SSH engine |
| WP-0008 T2 | wait — SSH engine + host trust |
| Policy gate (WP-0007) | Shipped, policy.enabled: false default |
2.4 SSH infrastructure lineage
| Legacy (today on hosts) | Target (not built) |
|---|---|
| Static keys / authorized_keys | OpenSSH CA + short-lived certs |
| CA key on disk (if any) | OpenBao ssh/ engine CA |
| Predates OpenBao | ops-warden warden sign |
| | railiance-infra principals + TrustedUserCAKeys |
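On the host side, the target column comes down to two sshd_config directives: TrustedUserCAKeys and AuthorizedPrincipalsFile. The sketch below renders such a snippet. It is illustrative only and is not the planned bootstrap-ssh-ca artifact; the function name and both default paths are assumptions.

```shell
# Prints an sshd_config snippet that trusts a user CA and maps accounts to principals.
# $1: path of the CA public key on the host (hypothetical default)
# $2: AuthorizedPrincipalsFile pattern; sshd expands %u to the login user name
render_sshd_ca_dropin() {
  local ca_path="${1:-/etc/ssh/trusted-user-ca-keys.pem}"
  local principals="${2:-/etc/ssh/auth_principals/%u}"
  printf 'TrustedUserCAKeys %s\n' "$ca_path"
  printf 'AuthorizedPrincipalsFile %s\n' "$principals"
}

# Example (assumes the main sshd_config includes sshd_config.d/*.conf):
#   render_sshd_ca_dropin > /etc/ssh/sshd_config.d/50-user-ca.conf
#   sshd -t    # validate the configuration before reloading sshd
```

Legacy authorized_keys entries keep working alongside this snippet, so hosts can adopt CA trust before static keys are retired.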
3. Core concepts (do not conflate)
3.1 Two custody dimensions
| Dimension | Field / doc | What it governs |
|---|---|---|
| King / platform recovery custody | custody_mode in metadata | Who holds recovery authority (single king vs 2-of-3) |
| OpenBao init/unseal execution | openbao_unseal_custody_model | How init/unseal runs (automation vs attended vs KMS) |
Both are valid and orthogonal. See docs/openbao-unseal-custody-models.md.
3.2 Three unseal custody models (init/unseal execution)
| Model ID | Status | Use |
|---|---|---|
| sops-held-automation | Implemented (console) | Default for greenfield fast test cycles; entry: creds-bootstrap-agent.sh |
| attended-ceremony | Planned (blocked in console) | Production trust; matches first bootstrap already performed |
| auto-unseal-transit | Planned (blocked in console) | HA rebuilds without manual unseal |
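The selectable versus blocked split in the table above can be pictured as a small gate function. This is a hypothetical sketch of the behavior, not the console's actual implementation; the function name, messages, and exit codes are all assumptions.

```shell
# Hypothetical gate: exit 0 for a selectable model, 1 for a planned (blocked)
# model, 2 for an unknown model ID.
unseal_model_gate() {
  case "$1" in
    sops-held-automation)
      echo "selectable: entry creds-bootstrap-agent.sh" ;;
    attended-ceremony|auto-unseal-transit)
      echo "blocked: planned, not implemented in console" >&2
      return 1 ;;
    *)
      echo "unknown model: $1" >&2
      return 2 ;;
  esac
}

# Example:
#   unseal_model_gate sops-held-automation && echo "record model in metadata"
```

Separate exit codes for blocked and unknown let a caller tell a model that is planned apart from a typo in the model ID.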
Development strategy (agreed 2026-06-17):
- Max automation first → prove SSH engine + host CA + warden sign loops
- Add attended ceremony gates for production profiles
- Add auto-unseal for ThreePhoenix HA
3.3 Two operational tracks
Track A — LIVE cluster (Railiance01 today)
• OpenBao: up, attended init done
• Gap: enable ssh/ engine + host CA trust
• Work: NET-WP-0020 T5, ops-warden WP-0008 T2 verify
• Do NOT re-run init; do NOT require platform KV secret for warden
Track B — GREENFIELD 3-node (future automation proof)
• Clean Linux + root SSH on 3 machines
• S1 infra → S2 k3s HA → S3 OpenBao deploy
• sops-held-automation → creds-bootstrap-agent init/unseal (T2)
• T5 SSH engine + host CA → warden sign smoke
• Use separate metadata e.g. .local/security-bootstrap-greenfield.json
3.4 What does NOT help SSH signing
| Action | Why irrelevant |
|---|---|
| Create platform/operators/ops-warden KV secret | KV stores secrets; warden calls ssh/sign/<role> API |
| Browser UI login alone | Does not set VAULT_TOKEN for CLI/warden |
| Re-selecting custody model on S6 metadata | Records preference only; does not enable ssh/ engine |
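A direct call to the sign endpoint shows why the KV secret and a browser login are irrelevant: signing needs only a token in the shell and a POST to ssh/sign/<role>. The sketch below builds the request body. It assumes the engine is mounted at ssh/, as this document plans; the function name, the role name ops-agent, and the principal in the usage comment are placeholders, and this is not how warden itself is implemented.

```shell
# Builds the JSON body for POST /v1/ssh/sign/<role>.
# $1: path to an OpenSSH public key file
# $2: comma-separated principals
# $3: requested TTL (for example 15m)
build_sign_payload() {
  local pubkey
  pubkey="$(cat "$1")" || return 1
  # Assumes the key line contains no double quotes or backslashes, which holds
  # for ssh-keygen output unless the key comment was set to something unusual.
  printf '{"public_key":"%s","valid_principals":"%s","ttl":"%s"}' \
    "$pubkey" "$2" "$3"
}

# Live usage (placeholders: role "ops-agent", principal "agt-state-hub-bridge").
# VAULT_TOKEN must be exported in the shell; a browser UI login does not set it.
#   build_sign_payload ~/.ssh/agt-state-hub-bridge_ed25519.pub agt-state-hub-bridge 15m |
#     curl -fsS -X POST -H "X-Vault-Token: $VAULT_TOKEN" -d @- \
#       https://bao.coulomb.social/v1/ssh/sign/ops-agent
```

On success the response carries the certificate in data.signed_key. Until the ssh/ engine is enabled, the same call fails, which matches the blocked state recorded in section 2.3.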
4. Repo ownership (NetKingdom map)
| Concern | Owner | Artifact |
|---|---|---|
| Bootstrap orchestration & custody canon | net-kingdom | console, smooth-bootstrap-guide, NET-WP-0020 |
| OpenBao deploy + post-unseal config | railiance-platform | openbao-deploy, openbao-configure-initial |
| OpenBao SSH engine enable + roles | railiance-platform (T5) | openbao-configure-ssh (planned) |
| Host TrustedUserCAKeys + principals | railiance-infra (T5) | bootstrap-ssh-ca (planned) |
| Sign CLI + inventory + audit log | ops-warden | warden sign, WP-0007 policy gate |
| flex-auth pre-sign policies | flex-auth | WP-0008 T5 (later) |
5. Workplan map (active strands)
| ID | Repo | Focus | Status |
|---|---|---|---|
| NET-WP-0020 | net-kingdom | Unseal custody models + SSH automation path | T1 done; T5 next |
| WARDEN-WP-0008 | ops-warden | Production warden sign evidence | T2 wait on T5 |
| RAIL-BS-WP-0007 | railiance-cluster | ThreePhoenix 3-node HA | Prerequisite for Track B at scale |
| NET-WP-0018 | net-kingdom | Smooth bootstrap guide | S6 reached on live bootstrap |
6. Console commands reference (operator session)
cd ~/net-kingdom
make security-bootstrap-openbao-unseal-custody-models
make security-bootstrap-select-openbao-unseal-custody-model MODEL=sops-held-automation
make security-bootstrap-console
Observed (2026-06-17): All gates done, stage S6, unseal model gate
done with automation entry sso-mfa/bootstrap/creds-bootstrap-agent.sh, init
ceremony done (historical init). Next safe action: Review related workplans
— expected for completed bootstrap, not an error.
Greenfield preview (when T2 exists):
export METADATA=.local/security-bootstrap-greenfield.json
make security-bootstrap-metadata-init METADATA="$METADATA"
make security-bootstrap-select-openbao-unseal-custody-model \
MODEL=sops-held-automation METADATA="$METADATA"
make security-bootstrap-console METADATA="$METADATA"
# Expect lower stage, init gate status "automation"
7. Automation chain (target end state)
[3 nodes root SSH]
→ railiance-infra S1 baseline
→ railiance-cluster S2 k3s HA
→ railiance-platform openbao-deploy
→ net-kingdom creds-bootstrap-agent (sops-held init/unseal) [T2]
→ railiance-platform openbao-configure-initial [exists]
→ railiance-platform openbao-configure-ssh [T5 — scripted; operator apply pending]
→ railiance-infra bootstrap-ssh-ca (CA pubkey + principals) [T5]
→ ops-warden warden sign smoke [WP-0008 T2]
→ (later) flex-auth policy.enabled [WP-0008 T5]
On Track A (live): skip init/unseal steps; start at openbao-configure-ssh.
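The difference between the two tracks can be expressed as a step list that a wrapper script might iterate over. This is a sketch under stated assumptions: the function name and the repo:step labels are illustrative, and only openbao-configure-ssh and bootstrap-ssh-ca are named as make targets elsewhere in this document.

```shell
# Prints the ordered chain for a track: "live" (Track A) or "greenfield" (Track B).
# Labels are repo:step pairs for illustration, not verified make targets.
chain_steps() {
  case "$1" in
    greenfield)
      printf '%s\n' \
        "railiance-infra:S1-baseline" \
        "railiance-cluster:S2-k3s-ha" \
        "railiance-platform:openbao-deploy" \
        "net-kingdom:creds-bootstrap-agent" \
        "railiance-platform:openbao-configure-initial" ;;
    live) ;;  # Track A: OpenBao is already deployed, initialized, and unsealed
    *) echo "unknown track: $1" >&2; return 1 ;;
  esac
  # Steps shared by both tracks.
  printf '%s\n' \
    "railiance-platform:openbao-configure-ssh" \
    "railiance-infra:bootstrap-ssh-ca" \
    "ops-warden:warden-sign-smoke"
}

# Example: show what Track A would run.
#   chain_steps live
```

Because the shared tail lives in one place, a change to the SSH steps applies to both tracks, and Track A can never reach the init/unseal steps.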
8. Credential management note (ops-warden)
Operator feedback: manual ssh-keygen for WP-0008 T2 is acceptable for first
sign proof but insufficient long-term. ops-warden should eventually document or
automate actor key lifecycle (warden issue, credential roster, rotation).
Deferred until T5 + T2 sign path succeeds.
9. Decisions log
| Date | Decision |
|---|---|
| 2026-06-17 | All three unseal custody models are canon; start automation-first |
| 2026-06-17 | Console blocks planned models with hints; only sops-held-automation selectable |
| 2026-06-17 | Live cluster uses Track A; greenfield uses Track B + separate metadata |
| 2026-06-17 | No platform/operators/ops-warden KV for SSH signing bootstrap |
| 2026-06-17 | Implement T5 on live OpenBao before greenfield full loop |
10. Next actions (ordered)
- Persist this assessment (this file)
- NET-WP-0020 T5: automation artifacts in railiance-platform + railiance-infra (2026-06-18)
- Operator apply: make openbao-configure-ssh then make bootstrap-ssh-ca (Track A)
- WP-0008 T2: warden sign smoke + append openbao-production-verify.md
- NET-WP-0020 T2: wire creds-bootstrap-agent.sh for greenfield init/unseal
- NET-WP-0020 T3/T4: unlock attended + auto-unseal console paths
11. Related files
| Path | Role |
|---|---|
| docs/openbao-unseal-custody-models.md | Unseal custody canon |
| docs/smooth-bootstrap-guide.md | Step 5 unseal model table |
| workplans/NET-WP-0020-openbao-unseal-custody-and-ssh-automation.md | Active workplan |
| ops-warden/history/2026-06-17-openbao-production-verify.md | Health + SSH engine gap |
| ops-warden/workplans/WARDEN-WP-0008-*.md | Production sign verification |
| railiance-platform/docs/openbao.md | Deploy + attended ceremony |
| ops-warden/wiki/OpenBaoSshEngineChecklist.md | Role TTL + verify procedure |
| ops-warden/history/2026-06-17-post-wp0007-reassessment.md | ops-warden completeness |