# cert_command Interface

**Version:** 1.0
**Date:** 2026-03-28
**Purpose:** Define the contract between OpsWarden (issuer) and callers such as ops-bridge (consumer) for just-in-time SSH certificate acquisition.

---

## Overview

`cert_command` is a shell string that a caller executes to obtain a short-lived, CA-signed SSH certificate for a named actor. The caller passes the cert to the SSH process alongside the actor's private key.

This interface is intentionally tool-agnostic: the caller (`ops-bridge`, a script, a CI pipeline) does not need to know whether the CA is a local file or HashiCorp Vault. Any command that writes a cert to stdout and exits 0 satisfies the contract.

---

## Contract

### Invocation

```
warden sign --pubkey
```

Or any equivalent shell command:

```
vault write -field=signed_key ssh/sign/agt-role public_key=@/tmp/key.pub
ssh-keygen -s /path/to/ca -I agt-test -n agt-task -V +24h /tmp/key.pub && cat /tmp/key-cert.pub
```

### Success (exit 0)

- Stdout: certificate text only, as a single line starting with the key type, e.g.:
  ```
  ssh-ed25519-cert-v01@openssh.com AAAA...
  ```
- Stderr: ignored by the caller (warden may print warnings there)
- Side effect: cert is also written to `~/.local/state/warden/-cert.pub` by warden (for use by `warden status` and `warden scorecard`)

### Failure (exit non-zero)

- Exit code: any non-zero value
- Stdout: ignored
- Stderr: passed through to caller logs / audit detail field
- Caller behaviour: treat as a transient error; apply reconnect backoff and retry

---

## Caller Responsibilities (ops-bridge)

1. Run `cert_command` via `subprocess.run(shell=True)` before each SSH subprocess launch
2. Write stdout to a tempfile in the state dir: `~/.local/state/bridge/-cert.pub`
3. Add `-i ` after `-i ` in the `ssh` command
4. Parse `ssh-keygen -L -f ` to extract `Key ID` → log as `cert_identity` in audit
5. Parse `Valid before:` → schedule pre-emptive cert refresh ~5 min before expiry
6. On `cert_command` failure: log `BRIDGE_DISCONNECTED` with stderr; apply backoff

## What the Caller Must NOT Do

- Cache or reuse a cert across reconnects (always re-run `cert_command` per reconnect)
- Write the cert to disk with world-readable permissions (use mode 600)
- Ignore a non-zero exit from `cert_command` (must treat as failure, trigger backoff)

---

## Example: ops-bridge tunnels.yaml

```yaml
tunnels:
  state-hub-coulombcore:
    host: coulombcore
    remote_port: 8001
    local_port: 8000
    ssh_user: agt-state-hub-bridge
    ssh_key: ~/.ssh/agt-state-hub-bridge_ed25519
    actor: agt-state-hub-bridge
    # cert_command is optional. When absent, ssh_key is used directly (static key mode).
    cert_command: "warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub"
```

---

## TTL Guidelines (AccessManagementDirective §2)

| Actor type | Max TTL | Pre-emptive refresh |
|---|---|---|
| `adm` | 48 h | 5 min before expiry |
| `agt` | 24 h | 5 min before expiry |
| `atm` | 8 h | 5 min before expiry |

ops-bridge enforces the refresh schedule. OpsWarden enforces the max TTL at signing time.

---

## Backward Compatibility

Callers that do not set `cert_command` continue to use the static key (`ssh_key`) with no TTL, cert logic, or refresh. The two modes are fully independent.
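
---

## Example: caller-side sketch (Python)

The caller responsibilities described in this document (run `cert_command`, store the cert with mode 600, read `Key ID` and the expiry, refresh about 5 minutes early) could be sketched roughly as below. This is an illustration under stated assumptions, not the ops-bridge implementation. The names `acquire_cert`, `parse_cert_listing`, `refresh_at`, and `CertCommandError` are hypothetical. The single-line check on stdout is an extra defensive step, and the parser assumes the `Valid: from <start> to <end>` line that OpenSSH's `ssh-keygen -L` prints, which should be verified against the installed version.

```python
import os
import re
import subprocess
from datetime import datetime, timedelta
from pathlib import Path

# "~5 min before expiry" from the contract.
REFRESH_MARGIN = timedelta(minutes=5)


class CertCommandError(RuntimeError):
    """Hypothetical error type: cert_command failed or printed no usable cert."""


def acquire_cert(cert_command: str, cert_path: Path) -> Path:
    """Run cert_command and store its stdout at cert_path with mode 600."""
    proc = subprocess.run(cert_command, shell=True, capture_output=True, text=True)
    if proc.returncode != 0:
        # On failure stdout is ignored; stderr is surfaced for logs / audit.
        raise CertCommandError(proc.stderr.strip())
    cert = proc.stdout.strip()
    if not cert or "\n" in cert:
        # Defensive check (an assumption): the contract promises one line.
        raise CertCommandError("expected a single-line certificate on stdout")
    cert_path.parent.mkdir(parents=True, exist_ok=True)
    # Open with mode 600 and enforce it even if the file already existed.
    fd = os.open(cert_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        os.fchmod(fh.fileno(), 0o600)
        fh.write(cert + "\n")
    return cert_path


def parse_cert_listing(listing: str) -> tuple[str, datetime]:
    """Extract (Key ID, expiry) from the text printed by `ssh-keygen -L`."""
    key_id = re.search(r'^\s*Key ID:\s*"(.*)"\s*$', listing, re.MULTILINE)
    valid = re.search(
        r"^\s*Valid:\s*from\s+(\S+)\s+to\s+(\S+)\s*$", listing, re.MULTILINE
    )
    if not key_id or not valid:
        raise ValueError("could not find Key ID / Valid lines in listing")
    return key_id.group(1), datetime.fromisoformat(valid.group(2))


def refresh_at(expiry: datetime) -> datetime:
    """Time at which the caller should pre-emptively re-run cert_command."""
    return expiry - REFRESH_MARGIN
```

A caller would invoke `acquire_cert` before each SSH launch, feed the output of `ssh-keygen -L -f` on the stored file to `parse_cert_listing`, log the returned Key ID as `cert_identity`, and schedule the next run at `refresh_at(expiry)`. A raised `CertCommandError` corresponds to the failure path: log `BRIDGE_DISCONNECTED` with the message and apply backoff.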