docs: persist OpenBao/SSH/bootstrap state assessment in history
Capture live vs greenfield tracks, unseal custody models, console S6 interpretation, repo ownership, and ordered next actions before NET-WP-0020 T5.
@@ -0,0 +1,251 @@
# OpenBao, SSH, and Bootstrap Custody — State Assessment

**Date:** 2026-06-17
**Author:** codex (with operator session evidence)
**Purpose:** Persist current state, concepts, and navigation map so security
setup work does not lose context while implementing NET-WP-0020 T5 and related
automation.

**Repos:** `net-kingdom`, `railiance-platform`, `railiance-infra`, `ops-warden`

---

## 1. Executive summary

NetKingdom’s **first platform bootstrap is complete** (console stage **S6**,
OpenBao live at `https://bao.coulomb.social`). **SSH certificate infrastructure
via OpenBao has not been started:** there is no `ssh/` secrets engine, and hosts
are still on **legacy static-key SSH** that predates OpenBao and ops-warden.

We adopted an **automation-first custody strategy** for *future* greenfield
rebuilds (`sops-held-automation`), while **blocking** unimplemented production
models (`attended-ceremony`, `auto-unseal-transit`) in the security bootstrap
console. Selecting that model does **not** re-init the live cluster.

**Next implementation slice (after this assessment):** NET-WP-0020 **T5**
(declarative OpenBao SSH engine + railiance-infra host CA trust) on the **live**
cluster first, then prove the full unattended chain on a greenfield 3-node cluster.

---

## 2. Current state (verified 2026-06-17)

### 2.1 NetKingdom security bootstrap (operator workstation)

| Item | State |
| --- | --- |
| Metadata | `net-kingdom/.local/security-bootstrap.json` |
| Console stage | **S6 — Reopen under custody** |
| King custody | Approved (`temporary-single-king` or equivalent) |
| OpenBao unseal custody model | **`sops-held-automation`** selected (2026-06-17) |
| OpenBao init | **`openbao_initialized: true`** (from **first attended bootstrap**) |
| All bootstrap gates | **done** (preflight, OIDC, restore drill, platform reopen) |
| Plaintext bootstrap secrets | **absent** (good) |
| Encrypted bundle | `sso-mfa/bootstrap/secrets.enc` (11 files) |

**Interpretation:** Selecting `sops-held-automation` records the **preferred
model for the next rebuild**. The init ceremony gate shows **done** because
OpenBao was already initialized manually in NET-WP-0015–0017, not because
SOPS-held automation ran on this cluster.

### 2.2 OpenBao platform (Railiance01 / production endpoint)

| Check | Result |
| --- | --- |
| `/v1/sys/health` | initialized, unsealed, v2.5.4+ |
| UI login | `netkingdom` / `platform-admin` (KeyCape OIDC) — works |
| **`ssh/` secrets engine** | **Not enabled** (operator confirmed) |
| `platform/operators/ops-warden` KV | **Not required** for SSH signing |

Evidence: `ops-warden/history/2026-06-17-openbao-production-verify.md`
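
The health row above can be re-checked from any workstation. The following is a minimal sketch, assuming only the `initialized` and `sealed` fields of the standard `/v1/sys/health` response body; the helper name `classify_health` is hypothetical and does not exist in any of the listed repos.

```bash
#!/usr/bin/env bash
# classify_health: read a /v1/sys/health JSON body on stdin and print one of
# "uninitialized", "sealed", or "ready". Hypothetical helper for illustration.
classify_health() {
  local body
  body="$(cat)"
  if ! printf '%s' "$body" | grep -Eq '"initialized"[[:space:]]*:[[:space:]]*true'; then
    echo "uninitialized"
  elif printf '%s' "$body" | grep -Eq '"sealed"[[:space:]]*:[[:space:]]*true'; then
    echo "sealed"
  else
    echo "ready"
  fi
}
```

Against the live endpoint this would be invoked as `curl -s https://bao.coulomb.social/v1/sys/health | classify_health`. A `ready` result says nothing about the `ssh/` engine, which has to be checked separately.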

### 2.3 ops-warden workstation

| Item | State |
| --- | --- |
| `~/.config/warden/warden.yaml` | Present (`backend: vault`, `bao.coulomb.social`) |
| `~/.config/warden/inventory.yaml` | Present (seed actors) |
| Test keypair | `~/.ssh/agt-state-hub-bridge_ed25519` created |
| `warden sign` against production | **Blocked** — no SSH engine |
| WP-0008 T2 | **wait** — SSH engine + host trust |
| Policy gate (WP-0007) | Shipped, `policy.enabled: false` default |

### 2.4 SSH infrastructure lineage

```text
Legacy (today on hosts)          Target (not built)
────────────────────────         ──────────────────
Static keys / authorized_keys    OpenSSH CA + short-lived certs
CA key on disk (if any)          OpenBao ssh/ engine CA
Predates OpenBao                 ops-warden warden sign
                                 railiance-infra principals + TrustedUserCAKeys
```
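
On the host side, the target column comes down to two `sshd_config` directives. The sketch below renders such a fragment; both default paths and the function name `render_sshd_ca_trust` are illustrative assumptions, not values taken from railiance-infra.

```bash
#!/usr/bin/env bash
# render_sshd_ca_trust: print an sshd_config fragment that trusts a user CA.
# Arguments (both optional, both assumed paths):
#   $1  file holding the CA public key
#   $2  directory holding one principals file per local account
render_sshd_ca_trust() {
  local ca_pub_path="${1:-/etc/ssh/trusted-user-ca-keys.pem}"
  local principals_dir="${2:-/etc/ssh/auth_principals}"
  cat <<EOF
# Trust user certificates signed by the OpenBao SSH CA
TrustedUserCAKeys ${ca_pub_path}
# Map each local account (%u) to the certificate principals allowed to use it
AuthorizedPrincipalsFile ${principals_dir}/%u
EOF
}
```

Rendering to stdout rather than editing `/etc/ssh/sshd_config` in place keeps the step reviewable; a real role would still need to install the file and reload `sshd`.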

---

## 3. Core concepts (do not conflate)

### 3.1 Two custody dimensions

| Dimension | Field / doc | What it governs |
| --- | --- | --- |
| **King / platform recovery custody** | `custody_mode` in metadata | Who holds recovery authority (single king vs 2-of-3) |
| **OpenBao init/unseal execution** | `openbao_unseal_custody_model` | *How* init/unseal runs (automation vs attended vs KMS) |

Both are valid and orthogonal. See `docs/openbao-unseal-custody-models.md`.

### 3.2 Three unseal custody models (init/unseal execution)

| Model ID | Status | Use |
| --- | --- | --- |
| `sops-held-automation` | **Implemented** (console) | Default for **greenfield fast test cycles**; entry: `creds-bootstrap-agent.sh` |
| `attended-ceremony` | **Planned** (blocked in console) | Production trust; matches **first bootstrap** already performed |
| `auto-unseal-transit` | **Planned** (blocked in console) | HA rebuilds without manual unseal |

**Development strategy (agreed 2026-06-17):**

1. Max automation first → prove SSH engine + host CA + `warden sign` loops
2. Add attended ceremony gates for production profiles
3. Add auto-unseal for ThreePhoenix HA
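
The status column in the table above implies a simple selection guard: one model is accepted and the two planned ones are refused. The sketch below mirrors that behaviour under stated assumptions; the function name, messages, and exit codes are hypothetical and are not the console's actual implementation.

```bash
#!/usr/bin/env bash
# select_unseal_model: accept the implemented model, refuse the planned ones.
# Exit codes (assumed for this sketch): 0 selected, 2 blocked, 1 unknown.
select_unseal_model() {
  local model="${1:-}"
  case "$model" in
    sops-held-automation)
      echo "selected: $model"
      ;;
    attended-ceremony|auto-unseal-transit)
      echo "blocked: $model is planned but not implemented" >&2
      return 2
      ;;
    *)
      echo "unknown model: ${model:-<empty>}" >&2
      return 1
      ;;
  esac
}
```

Distinguishing "blocked" from "unknown" lets a caller tell a typo apart from a model that is canon but not yet unlocked.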

### 3.3 Two operational tracks

```text
Track A — LIVE cluster (Railiance01 today)
  • OpenBao: up, attended init done
  • Gap: enable ssh/ engine + host CA trust
  • Work: NET-WP-0020 T5, ops-warden WP-0008 T2 verify
  • Do NOT re-run init; do NOT require platform KV secret for warden

Track B — GREENFIELD 3-node (future automation proof)
  • Clean Linux + root SSH on 3 machines
  • S1 infra → S2 k3s HA → S3 OpenBao deploy
  • sops-held-automation → creds-bootstrap-agent init/unseal (T2)
  • T5 SSH engine + host CA → warden sign smoke
  • Use separate metadata e.g. .local/security-bootstrap-greenfield.json
```

### 3.4 What does NOT help SSH signing

| Action | Why irrelevant |
| --- | --- |
| Create `platform/operators/ops-warden` KV secret | KV stores secrets; warden calls **`ssh/sign/<role>`** API |
| Browser UI login alone | Does not set `VAULT_TOKEN` for CLI/`warden` |
| Re-selecting custody model on S6 metadata | Records preference only; does not enable `ssh/` engine |
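
To make the first two rows concrete, the sketch below shows the shape of a direct call to the `ssh/sign/<role>` endpoint and why a CLI token is a precondition. It is an illustration only and not how `warden sign` is implemented; the function names are hypothetical, and the address, role, key file, and principals are all caller-supplied.

```bash
#!/usr/bin/env bash
# sign_payload: build the JSON body for POST /v1/ssh/sign/<role>.
# Assumes the public key line contains no double quotes or backslashes,
# which holds for keys written by ssh-keygen with a plain comment.
sign_payload() {
  local pubkey_file="$1" principals="$2" pubkey
  pubkey="$(head -n 1 "$pubkey_file")"
  printf '{"public_key":"%s","valid_principals":"%s"}\n' "$pubkey" "$principals"
}

# require_token: fail early when no CLI token is exported.
# A browser UI session does not populate VAULT_TOKEN.
require_token() {
  if [ -z "${VAULT_TOKEN:-}" ]; then
    echo "VAULT_TOKEN is not set" >&2
    return 1
  fi
}

# sign_ssh_key: hypothetical wrapper that posts the payload with curl.
# Usage: sign_ssh_key <addr> <role> <pubkey-file> <principals>
sign_ssh_key() {
  local addr="$1" role="$2" pubkey_file="$3" principals="$4"
  require_token || return 1
  sign_payload "$pubkey_file" "$principals" |
    curl -sS -X POST -H "X-Vault-Token: ${VAULT_TOKEN}" \
      --data @- "${addr}/v1/ssh/sign/${role}"
}
```

Nothing in this path reads a KV secret, which is why creating `platform/operators/ops-warden` has no effect on signing. Until the `ssh/` engine is mounted, the request fails at the server regardless of the token.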

---

## 4. Repo ownership (NetKingdom map)

| Concern | Owner | Artifact |
| --- | --- | --- |
| Bootstrap orchestration & custody canon | **net-kingdom** | console, smooth-bootstrap-guide, NET-WP-0020 |
| OpenBao deploy + post-unseal config | **railiance-platform** | `openbao-deploy`, `openbao-configure-initial` |
| OpenBao SSH engine enable + roles | **railiance-platform** (T5) | `openbao-configure-ssh` (planned) |
| Host `TrustedUserCAKeys` + principals | **railiance-infra** (T5) | `bootstrap-ssh-ca` (planned) |
| Sign CLI + inventory + audit log | **ops-warden** | `warden sign`, WP-0007 policy gate |
| flex-auth pre-sign policies | **flex-auth** | WP-0008 T5 (later) |
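
The planned `openbao-configure-ssh` artifact does not exist yet. As a rough illustration of what "SSH engine enable + roles" involves, the sketch below prints (and does not execute) the `bao` CLI calls for a CA-type SSH engine mounted at `ssh/`. The function name and the role name, user, and TTL defaults are placeholders chosen for this example, not decisions recorded in this assessment.

```bash
#!/usr/bin/env bash
# plan_ssh_engine: print the bao CLI calls that would mount an SSH CA engine
# at ssh/, generate a signing key, and define one user-certificate role.
# Dry-run only: the commands are printed, never run.
# Arguments (all optional placeholders): role name, allowed user, TTL.
plan_ssh_engine() {
  local role="${1:-example-role}" user="${2:-example-user}" ttl="${3:-30m}"
  cat <<EOF
bao secrets enable -path=ssh ssh
bao write ssh/config/ca generate_signing_key=true
bao write ssh/roles/${role} key_type=ca allow_user_certificates=true allowed_users=${user} default_user=${user} ttl=${ttl}
bao read -field=public_key ssh/config/ca
EOF
}
```

The last printed command yields the CA public key that the railiance-infra side would distribute to hosts as the `TrustedUserCAKeys` file.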

---

## 5. Workplan map (active strands)

| ID | Repo | Focus | Status |
| --- | --- | --- | --- |
| **NET-WP-0020** | net-kingdom | Unseal custody models + SSH automation path | T1 done; **T5 next** |
| **WARDEN-WP-0008** | ops-warden | Production `warden sign` evidence | T2 wait on T5 |
| **RAIL-BS-WP-0007** | railiance-cluster | ThreePhoenix 3-node HA | Prerequisite for Track B at scale |
| NET-WP-0018 | net-kingdom | Smooth bootstrap guide | S6 reached on live bootstrap |

---

## 6. Console commands reference (operator session)

```bash
cd ~/net-kingdom

make security-bootstrap-openbao-unseal-custody-models
make security-bootstrap-select-openbao-unseal-custody-model MODEL=sops-held-automation
make security-bootstrap-console
```

**Observed (2026-06-17):** All gates `done`, stage S6, unseal model gate
`done` with automation entry `sso-mfa/bootstrap/creds-bootstrap-agent.sh`, init
ceremony `done` (historical init). Next safe action: *Review related workplans*
— expected for completed bootstrap, not an error.

**Greenfield preview (when T2 exists):**

```bash
export METADATA=.local/security-bootstrap-greenfield.json
make security-bootstrap-metadata-init METADATA="$METADATA"
make security-bootstrap-select-openbao-unseal-custody-model \
  MODEL=sops-held-automation METADATA="$METADATA"
make security-bootstrap-console METADATA="$METADATA"
# Expect lower stage, init gate status "automation"
```

---

## 7. Automation chain (target end state)

```text
[3 nodes root SSH]
    → railiance-infra S1 baseline
    → railiance-cluster S2 k3s HA
    → railiance-platform openbao-deploy
    → net-kingdom creds-bootstrap-agent (sops-held init/unseal)   [T2]
    → railiance-platform openbao-configure-initial                [exists]
    → railiance-platform openbao-configure-ssh                    [T5 — next]
    → railiance-infra bootstrap-ssh-ca (CA pubkey + principals)   [T5]
    → ops-warden warden sign smoke                                [WP-0008 T2]
    → (later) flex-auth policy.enabled                            [WP-0008 T5]
```

On **Track A (live):** skip init/unseal steps; start at **openbao-configure-ssh**.
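
The difference between the two tracks is only where the chain starts. The sketch below lists the steps that apply to each track; the step labels are shortened from the chain above, and the function name `chain_steps` is hypothetical.

```bash
#!/usr/bin/env bash
# chain_steps: list the automation chain steps that apply to a track.
#   A = live cluster, starts at openbao-configure-ssh (line 6 of the chain)
#   B = greenfield, runs the whole chain
# The later flex-auth step is left out because it is not part of the first loop.
chain_steps() {
  local track="${1:-}" first
  case "$track" in
    A) first=6 ;;
    B) first=1 ;;
    *) echo "unknown track: ${track:-<empty>}" >&2; return 1 ;;
  esac
  tail -n "+${first}" <<'EOF'
railiance-infra S1-baseline
railiance-cluster S2-k3s-ha
railiance-platform openbao-deploy
net-kingdom creds-bootstrap-agent
railiance-platform openbao-configure-initial
railiance-platform openbao-configure-ssh
railiance-infra bootstrap-ssh-ca
ops-warden warden-sign-smoke
EOF
}
```

`chain_steps A` prints only the last three steps, which matches the rule that the live cluster must never re-run init or unseal.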

---

## 8. Credential management note (ops-warden)

Operator feedback: manual `ssh-keygen` for WP-0008 T2 is acceptable for first
sign proof but insufficient long-term. ops-warden should eventually document or
automate actor key lifecycle (`warden issue`, credential roster, rotation).
**Deferred** until T5 + T2 sign path succeeds.

---

## 9. Decisions log

| Date | Decision |
| --- | --- |
| 2026-06-17 | All three unseal custody models are canon; start automation-first |
| 2026-06-17 | Console blocks planned models with hints; only `sops-held-automation` selectable |
| 2026-06-17 | Live cluster uses Track A; greenfield uses Track B + separate metadata |
| 2026-06-17 | No `platform/operators/ops-warden` KV for SSH signing bootstrap |
| 2026-06-17 | Implement T5 on live OpenBao before greenfield full loop |

---

## 10. Next actions (ordered)

1. ~~Persist this assessment~~ (this file)
2. **NET-WP-0020 T5** — `openbao-apply-ssh-engine.sh` + railiance-infra host CA role
3. **WP-0008 T2** — `warden sign` smoke + append `openbao-production-verify.md`
4. **NET-WP-0020 T2** — wire `creds-bootstrap-agent.sh` for greenfield init/unseal
5. **NET-WP-0020 T3/T4** — unlock attended + auto-unseal console paths

---

## 11. Related files

| Path | Role |
| --- | --- |
| `docs/openbao-unseal-custody-models.md` | Unseal custody canon |
| `docs/smooth-bootstrap-guide.md` | Step 5 unseal model table |
| `workplans/NET-WP-0020-openbao-unseal-custody-and-ssh-automation.md` | Active workplan |
| `ops-warden/history/2026-06-17-openbao-production-verify.md` | Health + SSH engine gap |
| `ops-warden/workplans/WARDEN-WP-0008-*.md` | Production sign verification |
| `railiance-platform/docs/openbao.md` | Deploy + attended ceremony |
| `ops-warden/wiki/OpenBaoSshEngineChecklist.md` | Role TTL + verify procedure |
| `ops-warden/history/2026-06-17-post-wp0007-reassessment.md` | ops-warden completeness |
@@ -90,5 +90,6 @@ priority: high

## See also

- `history/2026-06-17-openbao-ssh-custody-and-bootstrap-assessment.md` — state + concepts (read before T5)
- `ops-warden/workplans/WARDEN-WP-0008-production-ssh-path-and-stewardship-closeout.md`
- `railiance-platform/docs/openbao.md`