From 6336c28626ccacc19c51a697838340efecde5189 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Thu, 18 Jun 2026 01:01:50 +0200
Subject: [PATCH] docs: persist OpenBao/SSH/bootstrap state assessment in history

Capture live vs greenfield tracks, unseal custody models, console S6
interpretation, repo ownership, and ordered next actions before NET-WP-0020 T5.
---
 ...ao-ssh-custody-and-bootstrap-assessment.md | 251 ++++++++++++++++++
 ...enbao-unseal-custody-and-ssh-automation.md |   1 +
 2 files changed, 252 insertions(+)
 create mode 100644 history/2026-06-17-openbao-ssh-custody-and-bootstrap-assessment.md

diff --git a/history/2026-06-17-openbao-ssh-custody-and-bootstrap-assessment.md b/history/2026-06-17-openbao-ssh-custody-and-bootstrap-assessment.md
new file mode 100644
index 0000000..74e800e
--- /dev/null
+++ b/history/2026-06-17-openbao-ssh-custody-and-bootstrap-assessment.md
@@ -0,0 +1,251 @@
+# OpenBao, SSH, and Bootstrap Custody — State Assessment
+
+**Date:** 2026-06-17
+**Author:** codex (with operator session evidence)
+**Purpose:** Persist current state, concepts, and navigation map so security
+setup work does not lose context while implementing NET-WP-0020 T5 and related
+automation.
+
+**Repos:** `net-kingdom`, `railiance-platform`, `railiance-infra`, `ops-warden`
+
+---
+
+## 1. Executive summary
+
+NetKingdom’s **first platform bootstrap is complete** (console stage **S6**,
+OpenBao live at `https://bao.coulomb.social`). **SSH certificate infrastructure
+via OpenBao is not started:** no `ssh/` secrets engine, hosts still on **legacy
+static-key SSH** that predates OpenBao and ops-warden.
+
+We adopted an **automation-first custody strategy** for *future* greenfield
+rebuilds (`sops-held-automation`), while **blocking** unimplemented production
+models (`attended-ceremony`, `auto-unseal-transit`) in the security bootstrap
+console. That does **not** re-init the live cluster.
+
+**Next implementation slice (after this assessment):** NET-WP-0020 **T5** —
+declarative OpenBao SSH engine + railiance-infra host CA trust — on the **live**
+cluster first, then prove full unattended chain on greenfield 3-node.
+
+---
+
+## 2. Current state (verified 2026-06-17)
+
+### 2.1 NetKingdom security bootstrap (operator workstation)
+
+| Item | State |
+| --- | --- |
+| Metadata | `net-kingdom/.local/security-bootstrap.json` |
+| Console stage | **S6 — Reopen under custody** |
+| King custody | Approved (`temporary-single-king` or equivalent) |
+| OpenBao unseal custody model | **`sops-held-automation`** selected (2026-06-17) |
+| OpenBao init | **`openbao_initialized: true`** (from **first attended bootstrap**) |
+| All bootstrap gates | **done** (preflight, OIDC, restore drill, platform reopen) |
+| Plaintext bootstrap secrets | **absent** (good) |
+| Encrypted bundle | `sso-mfa/bootstrap/secrets.enc` (11 files) |
+
+**Interpretation:** Selecting `sops-held-automation` records the **preferred
+model for the next rebuild**. Init ceremony gate shows **done** because OpenBao
+was already initialized manually in NET-WP-0015–0017 — not because SOPS-held
+automation ran on this cluster.
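The interpretation above can be sketched as a small check against the metadata file. This is only a minimal sketch: it assumes the metadata is flat JSON with one key per line and that `openbao_initialized` and `openbao_unseal_custody_model` are top-level keys. `json_field` and `describe_unseal_state` are hypothetical helpers, not console targets.

```shell
# Sketch only. Assumes flat JSON with one "key": value pair per line.
# json_field and describe_unseal_state are hypothetical helper names.

# Print the scalar value of top-level key $2 in file $1 (quotes and comma stripped).
json_field() {
  sed -n "s/^[[:space:]]*\"$2\"[[:space:]]*:[[:space:]]*\"\{0,1\}\([^\",]*\)\"\{0,1\},\{0,1\}[[:space:]]*\$/\1/p" "$1" | head -n 1
}

# Explain what the selected unseal custody model means for this metadata file.
describe_unseal_state() {
  local file="$1" init model
  init="$(json_field "$file" openbao_initialized)"
  model="$(json_field "$file" openbao_unseal_custody_model)"
  if [ "$init" = "true" ]; then
    # Already initialized: the model is a recorded preference, nothing re-runs.
    echo "initialized: '${model:-unset}' is a preference for the next rebuild only"
  else
    # Not initialized: the selected model is the one an init would follow.
    echo "not initialized: '${model:-unset}' would apply to init/unseal"
  fi
}
```

It could be called as `describe_unseal_state net-kingdom/.local/security-bootstrap.json`. Where `jq` is available it would likely be a more robust parser than the `sed` expression used here.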
+
+### 2.2 OpenBao platform (Railiance01 / production endpoint)
+
+| Check | Result |
+| --- | --- |
+| `/v1/sys/health` | initialized, unsealed, v2.5.4+ |
+| UI login | `netkingdom` / `platform-admin` (KeyCape OIDC) — works |
+| **`ssh/` secrets engine** | **Not enabled** (operator confirmed) |
+| `platform/operators/ops-warden` KV | **Not required** for SSH signing |
+
+Evidence: `ops-warden/history/2026-06-17-openbao-production-verify.md`
+
+### 2.3 ops-warden workstation
+
+| Item | State |
+| --- | --- |
+| `~/.config/warden/warden.yaml` | Present (`backend: vault`, `bao.coulomb.social`) |
+| `~/.config/warden/inventory.yaml` | Present (seed actors) |
+| Test keypair | `~/.ssh/agt-state-hub-bridge_ed25519` created |
+| `warden sign` against production | **Blocked** — no SSH engine |
+| WP-0008 T2 | **wait** — SSH engine + host trust |
+| Policy gate (WP-0007) | Shipped, `policy.enabled: false` default |
+
+### 2.4 SSH infrastructure lineage
+
+```text
+Legacy (today on hosts)          Target (not built)
+────────────────────────         ──────────────────
+Static keys / authorized_keys    OpenSSH CA + short-lived certs
+CA key on disk (if any)          OpenBao ssh/ engine CA
+Predates OpenBao                 ops-warden warden sign
+                                 railiance-infra principals + TrustedUserCAKeys
+```
+
+---
+
+## 3. Core concepts (do not conflate)
+
+### 3.1 Two custody dimensions
+
+| Dimension | Field / doc | What it governs |
+| --- | --- | --- |
+| **King / platform recovery custody** | `custody_mode` in metadata | Who holds recovery authority (single king vs 2-of-3) |
+| **OpenBao init/unseal execution** | `openbao_unseal_custody_model` | *How* init/unseal runs (automation vs attended vs KMS) |
+
+Both are valid and orthogonal. See `docs/openbao-unseal-custody-models.md`.
+
+### 3.2 Three unseal custody models (init/unseal execution)
+
+| Model ID | Status | Use |
+| --- | --- | --- |
+| `sops-held-automation` | **Implemented** (console) | Default for **greenfield fast test cycles**; entry: `creds-bootstrap-agent.sh` |
+| `attended-ceremony` | **Planned** (blocked in console) | Production trust; matches **first bootstrap** already performed |
+| `auto-unseal-transit` | **Planned** (blocked in console) | HA rebuilds without manual unseal |
+
+**Development strategy (agreed 2026-06-17):**
+
+1. Max automation first → prove SSH engine + host CA + `warden sign` loops
+2. Add attended ceremony gates for production profiles
+3. Add auto-unseal for ThreePhoenix HA
+
+### 3.3 Two operational tracks
+
+```text
+Track A — LIVE cluster (Railiance01 today)
+  • OpenBao: up, attended init done
+  • Gap: enable ssh/ engine + host CA trust
+  • Work: NET-WP-0020 T5, ops-warden WP-0008 T2 verify
+  • Do NOT re-run init; do NOT require platform KV secret for warden
+
+Track B — GREENFIELD 3-node (future automation proof)
+  • Clean Linux + root SSH on 3 machines
+  • S1 infra → S2 k3s HA → S3 OpenBao deploy
+  • sops-held-automation → creds-bootstrap-agent init/unseal (T2)
+  • T5 SSH engine + host CA → warden sign smoke
+  • Use separate metadata e.g. .local/security-bootstrap-greenfield.json
+```
+
+### 3.4 What does NOT help SSH signing
+
+| Action | Why irrelevant |
+| --- | --- |
+| Create `platform/operators/ops-warden` KV secret | KV stores secrets; warden calls **`ssh/sign/`** API |
+| Browser UI login alone | Does not set `VAULT_TOKEN` for CLI/`warden` |
+| Re-selecting custody model on S6 metadata | Records preference only; does not enable `ssh/` engine |
+
+---
+
+## 4. Repo ownership (NetKingdom map)
+
+| Concern | Owner | Artifact |
+| --- | --- | --- |
+| Bootstrap orchestration & custody canon | **net-kingdom** | console, smooth-bootstrap-guide, NET-WP-0020 |
+| OpenBao deploy + post-unseal config | **railiance-platform** | `openbao-deploy`, `openbao-configure-initial` |
+| OpenBao SSH engine enable + roles | **railiance-platform** (T5) | `openbao-configure-ssh` (planned) |
+| Host `TrustedUserCAKeys` + principals | **railiance-infra** (T5) | `bootstrap-ssh-ca` (planned) |
+| Sign CLI + inventory + audit log | **ops-warden** | `warden sign`, WP-0007 policy gate |
+| flex-auth pre-sign policies | **flex-auth** | WP-0008 T5 (later) |
+
+---
+
+## 5. Workplan map (active strands)
+
+| ID | Repo | Focus | Status |
+| --- | --- | --- | --- |
+| **NET-WP-0020** | net-kingdom | Unseal custody models + SSH automation path | T1 done; **T5 next** |
+| **WARDEN-WP-0008** | ops-warden | Production `warden sign` evidence | T2 wait on T5 |
+| **RAIL-BS-WP-0007** | railiance-cluster | ThreePhoenix 3-node HA | Prerequisite for Track B at scale |
+| NET-WP-0018 | net-kingdom | Smooth bootstrap guide | S6 reached on live bootstrap |
+
+---
+
+## 6. Console commands reference (operator session)
+
+```bash
+cd ~/net-kingdom
+
+make security-bootstrap-openbao-unseal-custody-models
+make security-bootstrap-select-openbao-unseal-custody-model MODEL=sops-held-automation
+make security-bootstrap-console
+```
+
+**Observed (2026-06-17):** All gates `done`, stage S6, unseal model gate
+`done` with automation entry `sso-mfa/bootstrap/creds-bootstrap-agent.sh`, init
+ceremony `done` (historical init). Next safe action: *Review related workplans*
+— expected for completed bootstrap, not an error.
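Because the init ceremony gate reports `done` for historical reasons, a wrapper may want to confirm the live state from OpenBao itself before any init-related step runs. The following is a minimal sketch, assuming the default `/v1/sys/health` status codes (200 active, 429 standby, 501 not initialized, 503 sealed). `classify_health_code` and `init_allowed` are hypothetical names, not part of the console or of `warden`.

```shell
# Sketch only: classify_health_code and init_allowed are hypothetical helpers.

# Map an HTTP status code from /v1/sys/health to a short state label.
classify_health_code() {
  case "$1" in
    200) echo "initialized-unsealed-active" ;;
    429) echo "unsealed-standby" ;;
    501) echo "not-initialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown" ;;
  esac
}

# Succeed only when the endpoint reports an uninitialized cluster,
# so a Track A (live) cluster can never pass this guard.
init_allowed() {
  [ "$(classify_health_code "$1")" = "not-initialized" ]
}

# Usage (needs network access, so it is left commented out):
# code="$(curl -s -o /dev/null -w '%{http_code}' https://bao.coulomb.social/v1/sys/health)"
# init_allowed "$code" || echo "refusing init: cluster is $(classify_health_code "$code")"
```

Keeping the classification separate from the `curl` call lets the guard be exercised offline with fixed codes.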
+
+**Greenfield preview (when T2 exists):**
+
+```bash
+export METADATA=.local/security-bootstrap-greenfield.json
+make security-bootstrap-metadata-init METADATA="$METADATA"
+make security-bootstrap-select-openbao-unseal-custody-model \
+  MODEL=sops-held-automation METADATA="$METADATA"
+make security-bootstrap-console METADATA="$METADATA"
+# Expect lower stage, init gate status "automation"
+```
+
+---
+
+## 7. Automation chain (target end state)
+
+```text
+[3 nodes root SSH]
+  → railiance-infra S1 baseline
+  → railiance-cluster S2 k3s HA
+  → railiance-platform openbao-deploy
+  → net-kingdom creds-bootstrap-agent (sops-held init/unseal)   [T2]
+  → railiance-platform openbao-configure-initial                [exists]
+  → railiance-platform openbao-configure-ssh                    [T5 — next]
+  → railiance-infra bootstrap-ssh-ca (CA pubkey + principals)   [T5]
+  → ops-warden warden sign smoke                                [WP-0008 T2]
+  → (later) flex-auth policy.enabled                            [WP-0008 T5]
+```
+
+On **Track A (live):** skip init/unseal steps; start at **openbao-configure-ssh**.
+
+---
+
+## 8. Credential management note (ops-warden)
+
+Operator feedback: manual `ssh-keygen` for WP-0008 T2 is acceptable for first
+sign proof but insufficient long-term. ops-warden should eventually document or
+automate actor key lifecycle (`warden issue`, credential roster, rotation).
+**Deferred** until T5 + T2 sign path succeeds.
+
+---
+
+## 9. Decisions log
+
+| Date | Decision |
+| --- | --- |
+| 2026-06-17 | All three unseal custody models are canon; start automation-first |
+| 2026-06-17 | Console blocks planned models with hints; only `sops-held-automation` selectable |
+| 2026-06-17 | Live cluster uses Track A; greenfield uses Track B + separate metadata |
+| 2026-06-17 | No `platform/operators/ops-warden` KV for SSH signing bootstrap |
+| 2026-06-17 | Implement T5 on live OpenBao before greenfield full loop |
+
+---
+
+## 10. Next actions (ordered)
+
+1. ~~Persist this assessment~~ (this file)
+2. **NET-WP-0020 T5** — `openbao-apply-ssh-engine.sh` + railiance-infra host CA role
+3. **WP-0008 T2** — `warden sign` smoke + append `openbao-production-verify.md`
+4. **NET-WP-0020 T2** — wire `creds-bootstrap-agent.sh` for greenfield init/unseal
+5. **NET-WP-0020 T3/T4** — unlock attended + auto-unseal console paths
+
+---
+
+## 11. Related files
+
+| Path | Role |
+| --- | --- |
+| `docs/openbao-unseal-custody-models.md` | Unseal custody canon |
+| `docs/smooth-bootstrap-guide.md` | Step 5 unseal model table |
+| `workplans/NET-WP-0020-openbao-unseal-custody-and-ssh-automation.md` | Active workplan |
+| `ops-warden/history/2026-06-17-openbao-production-verify.md` | Health + SSH engine gap |
+| `ops-warden/workplans/WARDEN-WP-0008-*.md` | Production sign verification |
+| `railiance-platform/docs/openbao.md` | Deploy + attended ceremony |
+| `ops-warden/wiki/OpenBaoSshEngineChecklist.md` | Role TTL + verify procedure |
+| `ops-warden/history/2026-06-17-post-wp0007-reassessment.md` | ops-warden completeness |
\ No newline at end of file
diff --git a/workplans/NET-WP-0020-openbao-unseal-custody-and-ssh-automation.md b/workplans/NET-WP-0020-openbao-unseal-custody-and-ssh-automation.md
index 9f1db86..557c359 100644
--- a/workplans/NET-WP-0020-openbao-unseal-custody-and-ssh-automation.md
+++ b/workplans/NET-WP-0020-openbao-unseal-custody-and-ssh-automation.md
@@ -90,5 +90,6 @@ priority: high
 
 ## See also
 
+- `history/2026-06-17-openbao-ssh-custody-and-bootstrap-assessment.md` — state + concepts (read before T5)
 - `ops-warden/workplans/WARDEN-WP-0008-production-ssh-path-and-stewardship-closeout.md`
 - `railiance-platform/docs/openbao.md`
\ No newline at end of file