Close ops-warden's side of the last Partial INTENT criterion (ops-bridge integrates via a stable cert_command). The migration playbook and contract already existed; what was missing was an automated readiness gate before touching tunnel config.
- T1: scripts/check_tunnel_cert_readiness.py, a read-only preflight that asserts the cert_command path is ready without signing: config/backend, actor inventory + TTL within type max, pubkey exists/parses/not-private, principals present, and optional host-principal deployment (mirrors check_principals_drift). Exit 0/1/2.
- T2: opt-in --sign-smoke runs the cert_command against the local backend and validates identity/principals/TTL of the emitted cert; refuses a vault backend. Window measured from the cert's own valid_from->valid_before so it's timezone-robust (fixes a CEST off-by-2h artifact). integration-marked test + a vault-refusal unit test.
- T3: playbook now leads with Step 0 readiness gate; ops-bridge handoff message sent.
- T4: SCOPE INTENT row: Partial -> Pilot-ready; known-gaps + SSH-lane list updated.
9 unit + 1 integration test, 209 default passing, lint clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ops-bridge Tunnel — cert_command Migration
Date: 2026-06-24
Workplan: WARDEN-WP-0013 T3
Catalog: ops-bridge-tunnel
Migrate an ops-bridge tunnel from static SSH keys to short-lived warden-signed
certificates via the cert_command contract (wiki/CertCommandInterface.md).
ops-warden documents the migration; ops-bridge owns tunnel config changes.
Step 0 — Readiness gate (run this first)
Before editing any tunnel config, run the read-only readiness gate (WARDEN-WP-0016). It confirms ops-warden's side is set — actor inventory, TTL, public key, and (optionally) host principals — without signing anything:
python scripts/check_tunnel_cert_readiness.py \
    --actor agt-state-hub-bridge \
    --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub \
    --config ~/.config/warden/warden.yaml \
    --infra ~/railiance-infra/ansible/inventory/ssh_principals.yaml
Exit 0 = ready, 1 = a check failed (fix before proceeding), 2 = bad input. The
Prerequisites and Migration checklist below are the human-readable backing for what the
gate verifies. To additionally prove the cert_command contract end to end against a
local backend (issues a throwaway cert, validates identity/principals/TTL), add
--sign-smoke with a local warden.yaml.
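When the gate is called from automation rather than by hand, its exit code can decide whether later steps run. The following is a minimal sketch that assumes only the documented 0/1/2 contract; the names GateResult, classify_exit, and run_gate are hypothetical helpers, not part of ops-warden.

```python
import subprocess
from enum import Enum


class GateResult(Enum):
    READY = "ready"                # exit 0
    CHECK_FAILED = "check failed"  # exit 1: fix before proceeding
    BAD_INPUT = "bad input"        # exit 2
    UNKNOWN = "unknown"            # anything outside the documented contract


def classify_exit(code: int) -> GateResult:
    """Map the documented exit codes to a result; other codes are UNKNOWN."""
    return {
        0: GateResult.READY,
        1: GateResult.CHECK_FAILED,
        2: GateResult.BAD_INPUT,
    }.get(code, GateResult.UNKNOWN)


def run_gate(argv: list[str]) -> GateResult:
    """Run the gate command (hypothetical wrapper) and classify its exit code.

    The command's own output passes straight through to the terminal.
    """
    completed = subprocess.run(argv, check=False)
    return classify_exit(completed.returncode)
```

A caller would pass the same arguments as the command above as a list. subprocess does not expand ~ on its own, so paths would need os.path.expanduser first.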
Prerequisites
- Actor registered in ~/.config/warden/inventory.yaml (see wiki/ActorInventoryPatterns.md)
- Actor keypair on disk (ssh_key private, .pub for signing)
- Production warden.yaml with backend: vault and valid scoped VAULT_TOKEN
- Host trusts warden/OpenBao CA (railiance-infra bootstrap-ssh-ca)
- Host principal allows the actor's principals (railiance-infra ssh_principals.yaml)
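The keypair prerequisite is easy to get wrong by pointing a tool at the private key instead of the .pub file. The sketch below shows one way a "public key parses and is not private" check could look. It relies on the standard OpenSSH public key line layout (type, base64 blob, optional comment, with the type name repeated inside the blob). The function name check_public_key_line is hypothetical and this is not the readiness gate's actual code.

```python
import base64
import binascii
import struct


def check_public_key_line(line: str) -> str:
    """Return the key type if `line` looks like an OpenSSH public key.

    Raises ValueError for private keys and for lines that do not parse.
    """
    if "PRIVATE KEY" in line:
        raise ValueError("this is a private key, not a public key")
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected '<type> <base64> [comment]'")
    key_type, b64_blob = fields[0], fields[1]
    try:
        blob = base64.b64decode(b64_blob, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"key blob is not valid base64: {exc}") from exc
    if len(blob) < 4:
        raise ValueError("key blob too short")
    # The blob starts with a big-endian uint32 length followed by the type name.
    (name_len,) = struct.unpack(">I", blob[:4])
    embedded = blob[4 : 4 + name_len]
    if embedded != key_type.encode("ascii"):
        raise ValueError("key type does not match the type embedded in the blob")
    return key_type
```

Reading the first line of the .pub file and passing it to this function would catch both a swapped private key and a truncated or hand-edited public key.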
Pilot tunnel: agt-state-hub-bridge
| Field | Value |
|---|---|
| Actor | agt-state-hub-bridge |
| Type | agt |
| Principals | agt-task-bridge |
| TTL | 24 h |
| Private key | ~/.ssh/agt-state-hub-bridge_ed25519 |
| Public key | ~/.ssh/agt-state-hub-bridge_ed25519.pub |
| cert_command | warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub |
Pre-migration smoke (operator workstation)
export VAULT_TOKEN="<scoped-warden-sign-token>" # never commit or paste in chat
warden status agt-state-hub-bridge
warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub | head -1
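Confirm warden sign exits 0 and that the cert line starts with ssh-ed25519-cert-v01@openssh.com. Because the output is piped to head, the shell reports head's exit status unless pipefail is set, so check the warden sign status separately if in doubt.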
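Beyond the prefix check, the certificate's lifetime can be compared with the expected TTL (24 h for the pilot). The sketch below parses the validity line that OpenSSH's ssh-keygen -L prints for a certificate and measures the window between its two timestamps, so the local timezone offset cancels out. The exact output format, the five-minute slack for backdated start times, and the names validity_window and ttl_within are assumptions for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed line format: "Valid: from 2026-06-24T10:00:00 to 2026-06-25T10:00:00"
VALID_RE = re.compile(r"Valid:\s+from\s+(\S+)\s+to\s+(\S+)")


def validity_window(keygen_output: str) -> timedelta:
    """Return valid_before - valid_from from `ssh-keygen -L` style output.

    Both timestamps are printed in the same local time, so their difference
    does not depend on the timezone offset.
    """
    match = VALID_RE.search(keygen_output)
    if match is None:
        raise ValueError("no bounded 'Valid: from ... to ...' line found")
    valid_from, valid_before = (datetime.fromisoformat(v) for v in match.groups())
    return valid_before - valid_from


def ttl_within(
    window: timedelta,
    max_ttl: timedelta,
    slack: timedelta = timedelta(minutes=5),  # assumed allowance for backdating
) -> bool:
    """True if the window is positive and no longer than max_ttl plus slack."""
    return timedelta(0) < window <= max_ttl + slack
```

Saving the signed certificate to a temporary file and feeding the ssh-keygen -L -f output to validity_window gives a TTL check that does not compare against the workstation clock.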
Migration checklist
1. Inventory and signing path
- Actor exists: warden inventory list shows agt-state-hub-bridge
- warden sign succeeds with production OpenBao backend
- signatures.log records the sign (~/.local/state/warden/signatures.log)
2. ops-bridge tunnel config
Edit ~/.config/bridge/tunnels.yaml (ops-bridge repo owns schema; example below):
tunnels:
  state-hub-coulombcore:
    host: coulombcore
    remote_port: 8001
    local_port: 8000
    ssh_user: agt-state-hub-bridge
    ssh_key: ~/.ssh/agt-state-hub-bridge_ed25519
    actor: agt-state-hub-bridge
    cert_command: "warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub"
- cert_command uses the public key path (warden reads pubkey, writes cert to stdout)
- ssh_user matches the certificate identity / host expectation
- Remove or disable static-key-only fallback once cert path is verified
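The first check in that list can be done mechanically before bringing the tunnel up. The sketch below takes one tunnel entry as a plain dict (for example, after loading tunnels.yaml with a YAML parser) and reports obvious mistakes in cert_command. The function name check_tunnel_entry and the specific rules are illustrative assumptions; ops-bridge owns the real schema and may validate differently.

```python
import shlex


def check_tunnel_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one tunnel entry (empty if none)."""
    cert_command = entry.get("cert_command")
    if not cert_command:
        return ["cert_command is not set"]

    problems = []
    argv = shlex.split(cert_command)

    if "--pubkey" not in argv or argv.index("--pubkey") + 1 >= len(argv):
        problems.append("cert_command has no --pubkey argument")
    else:
        pubkey = argv[argv.index("--pubkey") + 1]
        if not pubkey.endswith(".pub"):
            problems.append("--pubkey should point at the .pub file")
        if pubkey == entry.get("ssh_key"):
            problems.append("--pubkey and ssh_key are the same path")

    # Assumed convention: the actor name appears as an argument of cert_command.
    actor = entry.get("actor")
    if actor and actor not in argv:
        problems.append("cert_command does not mention the configured actor")
    return problems
```

For the pilot entry shown above the function returns an empty list. Pointing --pubkey at ~/.ssh/agt-state-hub-bridge_ed25519 instead would report two problems.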
3. Host-side verification
- Principal agt-task-bridge present in railiance-infra ssh_principals.yaml for target host
- Run scripts/check_principals_drift.py if inventory hosts section documents allowed principals
4. Tunnel smoke
# ops-bridge (from ops-bridge repo)
bridge status state-hub-coulombcore
bridge up state-hub-coulombcore
- Tunnel establishes without static cert file on disk
- Re-run bridge up after cert TTL expires; cert_command re-issues automatically
5. Policy gate (optional, after FLEX-WP-0007)
When policy.enabled: true, confirm signatures.log includes policy_decision_id
on tunnel-driven signs. See wiki/PolicyGatedSigning.md.
Rollback
Keep the static key path until cert_command smoke passes. To roll back:
- Remove cert_command from tunnel config
- Restore prior static-key or CertificateFile workflow
- Document rollback in ops-bridge session notes (not in git secrets)
Static-key tunnels (legacy)
Tunnels using agt-claude-* or other long-lived keys are out of scope for this
pilot. Migrate per-tunnel when ops-bridge owner prioritizes them.
See also
- wiki/CertCommandInterface.md
- wiki/OpsWardenConfig.md (cert_command example)
- wiki/playbooks/operator-openbao-token-hygiene.md
- warden route show ops-bridge-tunnel --json