# OpsWarden Configuration Reference

Config file: `~/.config/warden/warden.yaml` (override with `WARDEN_CONFIG` env var)
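
As a rough illustration of the override, the lookup can be sketched in shell. The helper name `resolve_warden_config` and the lab path are hypothetical, and treating an empty `WARDEN_CONFIG` as unset is an assumption about warden's behavior.

```bash
# Print the config path: WARDEN_CONFIG wins when set, otherwise the default.
# (Assumption: an empty WARDEN_CONFIG is treated the same as unset.)
resolve_warden_config() {
    printf '%s\n' "${WARDEN_CONFIG:-$HOME/.config/warden/warden.yaml}"
}

# Point this shell at a lab config without touching the default file.
export WARDEN_CONFIG="$HOME/lab/warden-lab.yaml"
resolve_warden_config
```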

---

## Backend overview

| Backend | Config value | Use when |
|---------|--------------|----------|
| Local CA | `backend: local` | Labs, CI, air-gapped dev, hosts without platform secrets access |
| Platform CA | `backend: vault` | Production and shared ops environments |

**Platform standard:** Railiance S3 uses [OpenBao](https://openbao.org/) as the
runtime platform secrets service (`RAIL-PL-WP-0002` in `railiance-platform`).
OpenBao exposes a **Vault-compatible HTTP API**, so ops-warden keeps the config
keys `backend: vault` and the `vault:` block — no separate OpenBao backend name
is required. The same config works against OpenBao or HashiCorp Vault if you point
`vault.addr` at either service.

ops-warden signs SSH certificates only. It does **not** deploy OpenBao, manage
unseal keys, or store long-lived API secrets. Cluster bootstrap and custody live
in `railiance-platform` and NetKingdom docs.

---

## Local backend (lab / offline)

```yaml
# Uses ssh-keygen -s with a CA private key on disk.
backend: local

# Path to the CA private key. Keep this file mode 600 and never commit it.
ca_key: ~/.ssh/ops-ca-user

inventory_path: ~/.config/warden/inventory.yaml
state_dir: ~/.local/state/warden

# Optional flex-auth gate (default off — see wiki/PolicyGatedSigning.md)
policy:
  enabled: false
  flex_auth_url: http://127.0.0.1:8080
  fail_closed: true
```

### Bootstrapping the local CA key

```bash
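# Generate the CA keypair once (offline, in a secure location).
# -N "" leaves the private key without a passphrase, so file permissions
# are its only protection.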
ssh-keygen -t ed25519 -f ~/.ssh/ops-ca-user -C "Ops SSH User CA (2026)" -N ""
chmod 600 ~/.ssh/ops-ca-user
chmod 644 ~/.ssh/ops-ca-user.pub

# Distribute ops-ca-user.pub to every host and reference it in sshd_config:
#   TrustedUserCAKeys /etc/ssh/ca/ca_user.pub
# See the railiance-infra bootstrap-ssh-ca.yml playbook.
```
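
To check a freshly generated CA by hand, the `ssh-keygen -s` step that the local backend relies on can be reproduced in a throwaway directory. This is a sketch only: the key identity, principal, validity window, and temporary paths are illustrative, and the exact arguments `warden` passes are not documented on this page.

```bash
# Throwaway directory so the sketch never touches real keys.
workdir="$(mktemp -d)"

# Stand-ins for the CA key and an actor key (both hypothetical, lab only).
ssh-keygen -q -t ed25519 -f "$workdir/ca" -C "lab CA" -N ""
ssh-keygen -q -t ed25519 -f "$workdir/actor" -C "agt-state-hub-bridge" -N ""

# Sign the actor's public key: -s CA key, -I key identity, -n principals,
# -V validity window. Writes actor-cert.pub next to actor.pub.
ssh-keygen -q -s "$workdir/ca" \
    -I "agt-state-hub-bridge" \
    -n "agt-task-bridge" \
    -V "+24h" \
    "$workdir/actor.pub"

# Inspect the certificate: type, key ID, principals, validity, signing CA.
cert_info="$(ssh-keygen -L -f "$workdir/actor-cert.pub")"
printf '%s\n' "$cert_info"

# Clean up the lab keys.
rm -rf "$workdir"
```

The `Principals:` section of the output should list the value passed to `-n`; that is the string hosts compare against their authorized principals.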

---

## OpenBao / Vault-compatible backend (production)

Use this backend against the platform OpenBao instance or any other SSH secrets
engine that implements the Vault signing API (`POST /v1/<mount>/sign/<role>`).
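
For debugging outside of `warden`, the signing call can be approximated with `curl` and `jq`. The request fields (`public_key`, `valid_principals`, `ttl`, `cert_type`) and the `.data.signed_key` response field come from the Vault SSH secrets engine API; whether `warden` sends exactly this payload is an assumption. The helper names `sign_url` and `sign_pubkey` are hypothetical, and the address, mount, and role are the example values used on this page.

```bash
# Example values; adjust for your environment.
bao_addr="https://bao.coulomb.social"
mount="ssh"
role="agt-role"

# Build the endpoint: <addr>/v1/<mount>/sign/<role>.
sign_url() {
    printf '%s/v1/%s/sign/%s\n' "$1" "$2" "$3"
}

# Send one signing request and print the signed certificate.
# Requires curl, jq, and a valid VAULT_TOKEN in the environment.
# $1 = public key path, $2 = comma-separated principals, $3 = TTL (e.g. 24h)
sign_pubkey() {
    jq -n --arg key "$(cat "$1")" --arg principals "$2" --arg ttl "$3" \
        '{public_key: $key, valid_principals: $principals, ttl: $ttl, cert_type: "user"}' |
    curl --fail --silent --show-error \
        --header "X-Vault-Token: $VAULT_TOKEN" \
        --request POST --data @- \
        "$(sign_url "$bao_addr" "$mount" "$role")" |
    jq -r '.data.signed_key'
}

# Show the URL a request would go to (no network traffic).
sign_url "$bao_addr" "$mount" "$role"
```

A call such as `sign_pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub agt-task-bridge 24h` would then print the certificate on stdout, provided the token's policy allows signing against the role.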

### Example — Railiance01 (browser / operator workstation)

```yaml
backend: vault

vault:
  # OpenBao UI/API (KeyCape OIDC). Prefer short-lived tokens from policy, not root.
  addr: https://bao.coulomb.social

  mount: ssh

  role_map:
    adm: adm-role
    agt: agt-role
    atm: atm-role

  # OpenBao accepts the same X-Vault-Token header name as Vault.
  token_env: VAULT_TOKEN

inventory_path: ~/.config/warden/inventory.yaml
state_dir: ~/.local/state/warden

# Enable after flex-auth ssh-certificate policies are deployed:
# policy:
#   enabled: true
#   flex_auth_url: http://flex-auth.flex-auth.svc.cluster.local:8080
#   fail_closed: true
```

### Example — in-cluster caller (pod or trusted host)

```yaml
backend: vault

vault:
  addr: http://openbao.openbao.svc.cluster.local:8200
  mount: ssh
  role_map:
    adm: adm-role
    agt: agt-role
    atm: atm-role
  token_env: VAULT_TOKEN
```

Choose the `addr` that matches where `warden` runs: operators on a laptop use
the external HTTPS endpoint; workloads inside the cluster use the internal
service URL. See `railiance-platform/docs/openbao.md` for deployment and access
paths.

### Authentication

Export a token with permission to sign against the mapped roles:

```bash
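# After OIDC login or with a policy-issued token (OpenBao CLI)
export VAULT_TOKEN="<short-lived-token>"

# Or log in with the HashiCorp Vault CLI against a Vault-compatible endpoint.
# `vault login` stores the token in ~/.vault-token rather than exporting it,
# so export it explicitly for warden:
vault login
export VAULT_TOKEN="$(vault print token)"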
```

`warden` reads the token from the environment variable named in
`vault.token_env` (default `VAULT_TOKEN`). OpenBao accepts the same
`X-Vault-Token` header as Vault, so a separate `BAO_TOKEN` is only needed if you
set `token_env: BAO_TOKEN` yourself.
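
For wrapper scripts, a pre-flight check along these lines can catch a missing token before `warden sign` runs. This is a sketch, not warden's own lookup code: `read_token` is a hypothetical helper, and the script-level `token_env` variable stands in for the `vault.token_env` config key.

```bash
# Name of the variable that holds the token (mirrors vault.token_env).
token_env="${token_env:-VAULT_TOKEN}"

# Look the variable up by name. printenv exits non-zero when it is unset,
# so a missing token fails early instead of producing an empty header.
read_token() {
    printenv "$1" || {
        echo "error: $1 is not set; export a token first" >&2
        return 1
    }
}

if read_token "$token_env" >/dev/null 2>&1; then
    echo "token found in $token_env"
else
    echo "no token in $token_env"
fi
```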

If the backend call fails, `warden sign` suggests falling back to
`--backend local`. Treat that as a lab recovery path only, not as a production
substitute.

### SSH secrets engine setup (OpenBao)

Run once per environment after OpenBao is initialized and unsealed. Adjust TTL
limits to match `ActorType` policy in `wiki/AccessManagementDirective.md`
(adm 48 h, agt 24 h, atm 8 h).

```bash
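# OpenBao CLI (bao), preferred on Railiance
bao secrets enable ssh

# The mount needs a CA signing key before any role can sign. If the platform
# bootstrap has not configured one, generate it here, then distribute the
# resulting public key to hosts as TrustedUserCAKeys.
bao write ssh/config/ca generate_signing_key=true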

bao write ssh/roles/agt-role \
    key_type=ca \
    allowed_users="*" \
    allow_user_certificates=true \
    default_user="agt" \
    ttl=24h max_ttl=24h

bao write ssh/roles/adm-role \
    key_type=ca \
    allowed_users="*" \
    allow_user_certificates=true \
    default_user="adm" \
    ttl=48h max_ttl=48h

bao write ssh/roles/atm-role \
    key_type=ca \
    allowed_users="*" \
    allow_user_certificates=true \
    default_user="atm" \
    ttl=8h max_ttl=8h
```

HashiCorp Vault uses the same paths with the `vault` CLI:

```bash
vault secrets enable ssh
vault write ssh/roles/agt-role key_type=ca ...  # same role parameters
```

Mount path defaults to `ssh`; override with `vault.mount` in `warden.yaml` if
your engine lives elsewhere.

### Platform references

| Topic | Location |
|-------|----------|
| OpenBao deploy, unseal, OIDC admin | `railiance-platform/docs/openbao.md` |
| Host CA trust and principals | `railiance-infra` Ansible playbooks |
| Signing contract for callers | `wiki/CertCommandInterface.md` |

---

## Principals inventory (`inventory.yaml`)

```yaml
actors:
  # Actor name must carry the prefix matching its type:
  #   adm-*  for adm, agt-*  for agt, atm-*  for atm
  agt-state-hub-bridge:
    type: agt
    # Principals embedded in the cert; matched against /etc/ssh/auth_principals/%u
    principals:
      - agt-task-bridge
    # Certificate TTL in hours. Defaults: adm=48, agt=24, atm=8
    ttl_hours: 24
    description: "ops-bridge tunnel agent for state-hub"

  adm-bernd:
    type: adm
    principals:
      - adm-full
    ttl_hours: 48

  atm-backup-daily:
    type: atm
    principals:
      - atm-backup-daily
    ttl_hours: 8
    description: "nightly backup automation"

hosts:
  # Optional: documents which principals are allowed on each host.
  # Not enforced by warden; used for reference and future tooling.
  coulombcore:
    allowed_principals:
      agt:
        - agt-task-bridge
      atm:
        - atm-backup-daily
```
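
The naming rule and TTL defaults in the comments above can be checked with two small shell helpers before an actor is added. Both function names are hypothetical and this is not warden's own validation; it only restates the documented conventions.

```bash
# Return 0 when the actor name starts with "<type>-" and the type is known.
# $1 = actor name, $2 = actor type (adm | agt | atm)
actor_prefix_ok() {
    actor_name="$1"; actor_type="$2"
    case "$actor_type" in
        adm|agt|atm) ;;
        *) return 2 ;;          # unknown actor type
    esac
    case "$actor_name" in
        "$actor_type"-?*) return 0 ;;
        *) return 1 ;;          # prefix does not match the type
    esac
}

# Print the documented default TTL in hours for an actor type.
default_ttl_hours() {
    case "$1" in
        adm) echo 48 ;;
        agt) echo 24 ;;
        atm) echo 8 ;;
        *) return 1 ;;
    esac
}

if actor_prefix_ok "agt-state-hub-bridge" "agt"; then
    echo "agt-state-hub-bridge: ok, default TTL $(default_ttl_hours agt)h"
fi
```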

---

## Policy gate (flex-auth, opt-in)

When `policy.enabled: true`, `warden sign` and `warden issue` call flex-auth
`POST /v1/check` before signing. A deny decision blocks issuance, and so does an
unreachable flex-auth when `fail_closed: true`. Allowed decisions record their
`policy_decision_id` in `signatures.log`.

```yaml
policy:
  enabled: false                    # default — no behavior change
  flex_auth_url: http://127.0.0.1:8080
  fail_closed: true                 # deny when flex-auth unreachable
  tenant: tenant:platform
  subject_env: WARDEN_POLICY_SUBJECT
  system: ops-warden
```

Full request shape and rollout notes: `wiki/PolicyGatedSigning.md`.
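
The gate's decision table can be written out as a small shell function. It is a sketch of the behavior described in this section, not warden's code: `policy_permits_signing` is a hypothetical name, and letting signing proceed when flex-auth is unreachable and `fail_closed: false` is inferred from the text rather than stated outright.

```bash
# Decide whether signing may proceed.
# $1 = flex-auth outcome: allow | deny | unreachable
# $2 = fail_closed setting: true | false
policy_permits_signing() {
    outcome="$1"; fail_closed="$2"
    case "$outcome" in
        allow) return 0 ;;
        deny)  return 1 ;;
        unreachable)
            # Assumption: fail_closed=false lets signing continue when
            # flex-auth cannot be reached; any other value blocks it.
            [ "$fail_closed" = "false" ] ;;
        *) return 1 ;;   # unknown outcomes are treated as deny in this sketch
    esac
}

if policy_permits_signing unreachable true; then
    echo "sign"
else
    echo "blocked"
fi
```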

---

## Environment variables

| Variable | Default | Description |
|----------|---------|-------------|
| `WARDEN_CONFIG` | `~/.config/warden/warden.yaml` | Config file path |
| `VAULT_TOKEN` | — | API token for `backend: vault` (OpenBao or Vault; name configurable via `vault.token_env`) |
| `WARDEN_POLICY_SUBJECT` | — | IAM subject id for flex-auth checks (when `policy.enabled`) |

---

## cert_command integration with ops-bridge

Add `cert_command` to a tunnel in `~/.config/bridge/tunnels.yaml`:

```yaml
tunnels:
  state-hub-coulombcore:
    host: coulombcore
    remote_port: 8001
    local_port: 8000
    ssh_user: agt-state-hub-bridge
    ssh_key: ~/.ssh/agt-state-hub-bridge_ed25519
    actor: agt-state-hub-bridge
    cert_command: "warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub"
```

`ops-bridge` runs `cert_command` before each SSH launch, captures stdout as the cert,
and passes it alongside the private key via `ssh -i <key> -i <cert>`.
See `wiki/CertCommandInterface.md` for the full contract.
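
The capture step can be pictured with the shell sketch below. It is not the ops-bridge implementation: `fetch_cert` is a hypothetical helper, and treating a non-zero exit or empty stdout as "no certificate" is an assumption; the authoritative rules are in `wiki/CertCommandInterface.md`.

```bash
# Run a cert_command string and store its stdout as the certificate file.
# $1 = command string (as written in tunnels.yaml), $2 = output path
fetch_cert() {
    cert_command="$1"; cert_path="$2"
    if ! sh -c "$cert_command" > "$cert_path"; then
        echo "cert_command failed" >&2
        return 1
    fi
    # An empty file cannot be a certificate.
    [ -s "$cert_path" ] || {
        echo "cert_command produced no output" >&2
        return 1
    }
}

# Demo with a stand-in command; a real tunnel would run `warden sign ...`.
demo_cert="$(mktemp)"
fetch_cert "printf 'ssh-ed25519-cert-v01@openssh.com AAAA-placeholder\n'" "$demo_cert"
cat "$demo_cert"
```

Because stdout is taken as the certificate, a `cert_command` wrapper should send any diagnostics to stderr.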