generated from coulomb/repo-seed
Close ops-warden's side of the last Partial INTENT criterion (ops-bridge integrates via a stable cert_command). The migration playbook and contract already existed; what was missing was an automated readiness gate before touching tunnel config.

T1. scripts/check_tunnel_cert_readiness.py, a read-only preflight that asserts the cert_command path is ready without signing: config/backend, actor inventory + TTL within type max, pubkey exists/parses/not-private, principals present, and optional host-principal deployment (mirrors check_principals_drift). Exit 0/1/2.

T2. Opt-in --sign-smoke: runs the cert_command against the local backend and validates identity/principals/TTL of the emitted cert; refuses a vault backend. Window measured from the cert's own valid_from->valid_before so it's timezone-robust (fixes a CEST off-by-2h artifact). Integration-marked test + a vault-refusal unit test.

T3. Playbook now leads with Step 0 readiness gate; ops-bridge handoff message sent.

T4. SCOPE INTENT row: Partial -> Pilot-ready; known-gaps + SSH-lane list updated.

9 unit + 1 integration test, 209 default passing, lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
143 lines
4.7 KiB
Markdown
# ops-bridge Tunnel: cert_command Migration

Date: 2026-06-24
Workplan: WARDEN-WP-0013 T3
Catalog: `ops-bridge-tunnel`

Migrate an ops-bridge tunnel from **static SSH keys** to **short-lived warden-signed
certificates** via the `cert_command` contract (`wiki/CertCommandInterface.md`).

ops-warden documents the migration; **ops-bridge** owns tunnel config changes.

---

## Step 0: Readiness gate (run this first)

Before editing any tunnel config, run the read-only readiness gate (WARDEN-WP-0016).
It confirms ops-warden's side is set (actor inventory, TTL, public key, and optionally
host principals) **without signing anything**:

```bash
python scripts/check_tunnel_cert_readiness.py \
  --actor agt-state-hub-bridge \
  --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub \
  --config ~/.config/warden/warden.yaml \
  --infra ~/railiance-infra/ansible/inventory/ssh_principals.yaml
```

Exit 0 = ready, 1 = a check failed (fix before proceeding), 2 = bad input. The
Prerequisites and Migration checklist below are the human-readable backing for what the
gate verifies. To additionally prove the `cert_command` contract end to end against a
**local** backend (issues a throwaway cert, validates identity/principals/TTL), add
`--sign-smoke` with a local `warden.yaml`.

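
If the gate is run from automation rather than by hand, its three exit codes can be mapped to outcomes as in the sketch below. Only the script path and the 0/1/2 meanings come from this playbook; the helper names `interpret_exit_code` and `run_gate` are hypothetical, and `run_gate` assumes it is called from the ops-warden repo root.

```python
import subprocess
import sys

# Exit-code meanings as documented for the readiness gate in this playbook.
EXIT_MEANINGS = {
    0: "ready",
    1: "check failed: fix before proceeding",
    2: "bad input",
}


def interpret_exit_code(code: int) -> str:
    """Map a gate exit code to a label; unknown codes are reported as such."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_gate(args: list) -> int:
    """Run the readiness gate with the given flags and return its exit code.

    Hypothetical helper: assumes the current directory is the ops-warden repo root.
    """
    cmd = [sys.executable, "scripts/check_tunnel_cert_readiness.py", *args]
    return subprocess.run(cmd, check=False).returncode
```
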
---

## Prerequisites

- [ ] Actor registered in `~/.config/warden/inventory.yaml` (see `wiki/ActorInventoryPatterns.md`)
- [ ] Actor keypair on disk (`ssh_key` private, `.pub` for signing)
- [ ] Production `warden.yaml` with `backend: vault` and valid scoped `VAULT_TOKEN`
- [ ] Host trusts warden/OpenBao CA (`railiance-infra` `bootstrap-ssh-ca`)
- [ ] Host principal allows the actor's principals (`railiance-infra` `ssh_principals.yaml`)

---

## Pilot tunnel: `agt-state-hub-bridge`

| Field | Value |
| --- | --- |
| Actor | `agt-state-hub-bridge` |
| Type | `agt` |
| Principals | `agt-task-bridge` |
| TTL | 24 h |
| Private key | `~/.ssh/agt-state-hub-bridge_ed25519` |
| Public key | `~/.ssh/agt-state-hub-bridge_ed25519.pub` |
| cert_command | `warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub` |

### Pre-migration smoke (operator workstation)

```bash
set -o pipefail   # so a failed `warden sign` is not masked by head's exit status
export VAULT_TOKEN="<scoped-warden-sign-token>"   # never commit or paste in chat
warden status agt-state-hub-bridge
warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub | head -1
```

Confirm that the commands exit 0 and that the cert line starts with `ssh-ed25519-cert-v01@openssh.com`.

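
The same check can be scripted. The sketch below mirrors `head -1` by looking only at the first line of the captured `warden sign` output and comparing its first whitespace-separated field with the expected certificate type. The function name `looks_like_ed25519_cert` is illustrative and not part of warden.

```python
# Certificate key type expected by the pre-migration smoke in this playbook.
CERT_TYPE = "ssh-ed25519-cert-v01@openssh.com"


def looks_like_ed25519_cert(output: str) -> bool:
    """Return True if the first output line begins with the expected cert type."""
    lines = output.splitlines()
    if not lines:
        return False
    fields = lines[0].split()
    # An OpenSSH cert line has the form "<type> <base64 blob> [comment]".
    return bool(fields) and fields[0] == CERT_TYPE
```
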
---

## Migration checklist

### 1. Inventory and signing path

- [ ] Actor exists: `warden inventory list` shows `agt-state-hub-bridge`
- [ ] `warden sign` succeeds with the production OpenBao backend
- [ ] `signatures.log` records the sign (`~/.local/state/warden/signatures.log`)

### 2. ops-bridge tunnel config

Edit `~/.config/bridge/tunnels.yaml` (the ops-bridge repo owns the schema; example below):

```yaml
tunnels:
  state-hub-coulombcore:
    host: coulombcore
    remote_port: 8001
    local_port: 8000
    ssh_user: agt-state-hub-bridge
    ssh_key: ~/.ssh/agt-state-hub-bridge_ed25519
    actor: agt-state-hub-bridge
    cert_command: "warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub"
```

- [ ] `cert_command` uses the **public** key path (warden reads pubkey, writes cert to stdout)
- [ ] `ssh_user` matches the certificate identity / host expectation
- [ ] Remove or disable static-key-only fallback once cert path is verified

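
The first checklist item (the public key path in `cert_command`) lends itself to a mechanical check. The sketch below takes one tunnel entry as a plain dict, already loaded from `tunnels.yaml` by whatever YAML loader is in use, and reports problems with the `--pubkey` argument. The function name `check_cert_command` is hypothetical; the field names are the ones in the example above, and the real schema is owned by ops-bridge.

```python
import shlex


def check_cert_command(tunnel: dict) -> list:
    """Return a list of problems found in one tunnel entry (empty list = OK)."""
    argv = shlex.split(tunnel.get("cert_command", ""))
    if not argv:
        return ["cert_command is missing or empty"]
    if "--pubkey" not in argv or argv.index("--pubkey") + 1 >= len(argv):
        return ["cert_command has no --pubkey <path> argument"]

    problems = []
    pubkey = argv[argv.index("--pubkey") + 1]
    # Paths are compared as written; "~" is not expanded here.
    if not pubkey.endswith(".pub"):
        problems.append(f"--pubkey should point at a .pub file, got {pubkey}")
    if pubkey == tunnel.get("ssh_key"):
        problems.append("--pubkey points at the private key path (ssh_key)")
    return problems
```

Called with the pilot entry above, this returns an empty list. Pointing `--pubkey` at `~/.ssh/agt-state-hub-bridge_ed25519` instead would return both problems.
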
### 3. Host-side verification

- [ ] Principal `agt-task-bridge` present in `railiance-infra` `ssh_principals.yaml` for the target host
- [ ] Run `scripts/check_principals_drift.py` if the inventory `hosts` section documents allowed principals

### 4. Tunnel smoke

```bash
# ops-bridge (from ops-bridge repo)
bridge status state-hub-coulombcore
bridge up state-hub-coulombcore
```

- [ ] Tunnel establishes without a static cert file on disk
- [ ] Re-run `bridge up` after the cert TTL expires; `cert_command` re-issues automatically

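
To time the second smoke item, it helps to know when the pilot's 24 h certificate actually lapses. The sketch below states the expiry condition on timezone-aware datetimes, so local UTC offsets cannot skew the result. It is an illustration only: the playbook does not describe how ops-bridge decides to re-run `cert_command`, and the name `cert_expired` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def cert_expired(valid_before: datetime, now: Optional[datetime] = None) -> bool:
    """Return True once the certificate's valid_before instant has passed.

    Both datetimes must be timezone-aware so the comparison is offset-safe.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= valid_before


# Example: a cert issued at 12:00 UTC with the pilot's 24 h TTL.
issued = datetime(2026, 6, 24, 12, 0, tzinfo=timezone.utc)
valid_before = issued + timedelta(hours=24)
```

With these values, `cert_expired(valid_before, now=issued + timedelta(hours=23))` is false and the same call at 25 hours is true, which is the point at which re-running `bridge up` exercises re-issuance.
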
### 5. Policy gate (optional, after FLEX-WP-0007)

When `policy.enabled: true`, confirm `signatures.log` includes `policy_decision_id`
on tunnel-driven signs. See `wiki/PolicyGatedSigning.md`.

---

## Rollback

Keep the static key path until the `cert_command` smoke passes. To roll back:

1. Remove `cert_command` from tunnel config
2. Restore prior static-key or `CertificateFile` workflow
3. Document rollback in ops-bridge session notes (not in git secrets)

---

## Static-key tunnels (legacy)

Tunnels using `agt-claude-*` or other long-lived keys are **out of scope** for this
pilot. Migrate them per tunnel when the ops-bridge owner prioritizes them.

---

## See also

- `wiki/CertCommandInterface.md`
- `wiki/OpsWardenConfig.md`: `cert_command` example
- `wiki/playbooks/operator-openbao-token-hygiene.md`
- `warden route show ops-bridge-tunnel --json`