# cert_command Interface

**Version:** 1.0
**Date:** 2026-03-28
**Purpose:** Define the contract between OpsWarden (issuer) and callers such as ops-bridge
(consumer) for just-in-time SSH certificate acquisition.

---

## Overview

`cert_command` is a shell string that a caller executes to obtain a short-lived, CA-signed
SSH certificate for a named actor. The caller passes the cert to the SSH process alongside
the actor's private key.

This interface is intentionally tool-agnostic: the caller (`ops-bridge`, a script, a CI
pipeline) does not need to know whether the CA is a local file, OpenBao, or another
Vault-compatible SSH secrets engine. Any command that writes a cert to stdout and exits 0
satisfies the contract.

---

## Contract

### Invocation

```
warden sign <actor-name> --pubkey <path/to/actor.pub>
```

Or any equivalent shell command, for example one of:

```
# OpenBao SSH secrets engine
bao write -field=signed_key ssh/sign/agt-role public_key=@/tmp/key.pub

# Local CA key file
ssh-keygen -s /path/to/ca -I agt-test -n agt-task -V +24h /tmp/key.pub && cat /tmp/key-cert.pub
```

### Success (exit 0)

- Stdout: certificate text only, a single line starting with the key type, e.g.:
  ```
  ssh-ed25519-cert-v01@openssh.com AAAA...
  ```
- Stderr: ignored by the caller (warden may print warnings there)
- Side effect (`warden sign` only): warden also writes the cert to
  `~/.local/state/warden/<actor>-cert.pub` (for use by `warden status` and `warden scorecard`)

### Failure (exit non-zero)

- Exit code: any non-zero value
- Stdout: ignored
- Stderr: passed through to caller logs / audit detail field
- Caller behaviour: treat as a transient error; apply reconnect backoff and retry
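
The success and failure rules above can be sketched from the caller's side as follows. This is a minimal illustration, not the ops-bridge implementation: the names `fetch_cert` and `CertCommandError`, the 30-second timeout, and the check that stdout looks like a single OpenSSH certificate line are all assumptions added for the example.

```python
import subprocess


class CertCommandError(RuntimeError):
    """Hypothetical error type: cert_command failed or printed no certificate."""


def fetch_cert(cert_command: str, timeout: float = 30.0) -> str:
    """Run cert_command through the shell and return the certificate line."""
    result = subprocess.run(
        cert_command,
        shell=True,           # cert_command is defined as a shell string
        capture_output=True,
        text=True,
        timeout=timeout,      # assumption: the contract does not specify a timeout
    )
    if result.returncode != 0:
        # On failure stdout is ignored; stderr is kept for logs / audit detail.
        raise CertCommandError(result.stderr.strip())

    cert = result.stdout.strip()
    key_type = cert.split(" ", 1)[0]
    # Assumed sanity check: one line whose key type is an OpenSSH cert type.
    if "\n" in cert or not key_type.endswith("-cert-v01@openssh.com"):
        raise CertCommandError("stdout is not a single-line OpenSSH certificate")
    return cert
```

A caller would wrap `fetch_cert` in its existing reconnect backoff and treat `CertCommandError` like any other transient connection failure.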

---

## Caller Responsibilities (ops-bridge)

1. Run `cert_command` via `subprocess.run(shell=True)` before each SSH subprocess launch
2. Write stdout to a tempfile in the state dir: `~/.local/state/bridge/<tunnel>-cert.pub`
3. Add `-i <cert_path>` after `-i <key_path>` in the `ssh` command
4. Parse `ssh-keygen -L -f <cert>` to extract `Key ID` → log as `cert_identity` in audit
5. Parse the `Valid:` line (the valid-before time) → schedule pre-emptive cert refresh ~5 min before expiry
6. On `cert_command` failure: log `BRIDGE_DISCONNECTED` with stderr; apply backoff
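
Steps 4 and 5 might look roughly like the sketch below. It assumes the listing printed by `ssh-keygen -L` contains a `Key ID: "<id>"` line and a `Valid: from <start> to <end>` line with ISO-style timestamps; certificates listed as `Valid: forever` are rejected here. The function names are hypothetical.

```python
import re
from datetime import datetime, timedelta

# Patterns for the two fields of interest in `ssh-keygen -L` output.
KEY_ID_RE = re.compile(r'^\s*Key ID:\s*"(?P<key_id>.*)"\s*$', re.MULTILINE)
VALID_RE = re.compile(
    r"^\s*Valid:\s*from\s+(?P<start>\S+)\s+to\s+(?P<end>\S+)\s*$", re.MULTILINE
)


def parse_cert_listing(listing: str) -> tuple[str, datetime]:
    """Return (key_id, expiry) parsed from the text of `ssh-keygen -L -f <cert>`."""
    key_id = KEY_ID_RE.search(listing)
    valid = VALID_RE.search(listing)
    if key_id is None or valid is None:
        raise ValueError("listing has no Key ID or bounded Valid line")
    expires = datetime.fromisoformat(valid.group("end"))
    return key_id.group("key_id"), expires


def refresh_time(expires: datetime, lead: timedelta = timedelta(minutes=5)) -> datetime:
    """Time at which the pre-emptive refresh should start (~5 min before expiry)."""
    return expires - lead
```

The returned key ID is what would be logged as `cert_identity`, and `refresh_time` gives the moment to re-run `cert_command`.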

## What the Caller Must NOT Do

- Cache or reuse a cert across reconnects (always re-run `cert_command` per reconnect)
- Write the cert to disk with world-readable permissions (use mode 600)
- Ignore a non-zero exit from `cert_command` (must treat as failure, trigger backoff)
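
One way to honour the mode 600 rule is to create the file through `tempfile.mkstemp`, which creates it readable and writable only by the owner, and then rename it into place. The helper name `write_cert` and the atomic-rename approach are assumptions of this sketch, not something the contract prescribes.

```python
import os
import tempfile


def write_cert(cert_path: str, cert: str) -> str:
    """Write cert to cert_path with owner-only permissions; return the final path."""
    cert_path = os.path.expanduser(cert_path)
    directory = os.path.dirname(cert_path) or "."
    os.makedirs(directory, mode=0o700, exist_ok=True)

    # mkstemp creates the file with mode 0600, so the cert is never
    # world-readable, not even briefly.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".cert-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(cert.rstrip("\n") + "\n")
        # Atomic rename: readers see either the old cert or the new one.
        os.replace(tmp_path, cert_path)
    except BaseException:
        os.unlink(tmp_path)
        raise
    return cert_path
```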

---

## Example: ops-bridge tunnels.yaml

```yaml
tunnels:
  state-hub-coulombcore:
    host: coulombcore
    remote_port: 8001
    local_port: 8000
    ssh_user: agt-state-hub-bridge
    ssh_key: ~/.ssh/agt-state-hub-bridge_ed25519
    actor: agt-state-hub-bridge
    # cert_command is optional. When absent, ssh_key is used directly (static key mode).
    cert_command: "warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub"
```

---

## TTL Guidelines (AccessManagementDirective §2)

| Actor type | Max TTL | Pre-emptive refresh |
|---|---|---|
| `adm` | 48 h | 5 min before expiry |
| `agt` | 24 h | 5 min before expiry |
| `atm` | 8 h | 5 min before expiry |

ops-bridge enforces the refresh schedule. OpsWarden enforces the max TTL at signing time.
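
The table can be expressed as a small lookup, sketched below. Deriving the actor type from the prefix before the first hyphen (as in `agt-state-hub-bridge`) is an assumption of this example, as are the function names; how OpsWarden reacts to an over-long TTL request is not specified here, so the sketch only reports whether a TTL is within the limit.

```python
from datetime import datetime, timedelta

# Max TTL per actor type, taken from the table above.
MAX_TTL = {
    "adm": timedelta(hours=48),
    "agt": timedelta(hours=24),
    "atm": timedelta(hours=8),
}
REFRESH_LEAD = timedelta(minutes=5)


def actor_type(actor: str) -> str:
    """Assumption: the actor type is the name prefix before the first hyphen."""
    prefix = actor.split("-", 1)[0]
    if prefix not in MAX_TTL:
        raise ValueError(f"unknown actor type for {actor!r}")
    return prefix


def ttl_allowed(actor: str, requested: timedelta) -> bool:
    """True if the requested TTL does not exceed the max for the actor's type."""
    return requested <= MAX_TTL[actor_type(actor)]


def seconds_until_refresh(expires: datetime, now: datetime) -> float:
    """Delay before the pre-emptive refresh; 0 if the refresh is already due."""
    return max(0.0, (expires - REFRESH_LEAD - now).total_seconds())
```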

---

## Backward Compatibility

Callers that do not set `cert_command` continue to use the static key (`ssh_key`) with no
TTL, cert logic, or refresh. The two modes are fully independent.
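
As a rough illustration of how a caller could keep the two modes separate, the sketch below builds only the identity-related `ssh` arguments from one tunnel entry. The function names and the dict shape (one parsed entry from `tunnels.yaml`) are assumptions; the `-i <key_path>` then `-i <cert_path>` ordering follows the caller responsibilities listed earlier.

```python
import os
from typing import List, Optional


def uses_cert_mode(tunnel: dict) -> bool:
    """A tunnel is in cert mode only when cert_command is set and non-empty."""
    return bool(tunnel.get("cert_command"))


def identity_args(tunnel: dict, cert_path: Optional[str] = None) -> List[str]:
    """Return the -i arguments for ssh: key only, or key followed by cert."""
    args = ["-i", os.path.expanduser(tunnel["ssh_key"])]
    if uses_cert_mode(tunnel):
        if cert_path is None:
            raise ValueError("cert mode requires a freshly written cert path")
        args += ["-i", cert_path]
    # Static key mode: no cert, no TTL tracking, no refresh scheduling.
    return args
```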