ops-warden/wiki/PolicyGatedSigning.md
tegwick 8e9383a33a feat: opt-in flex-auth policy gate and OpenBao verify (WP-0007)
Add policy.py client that calls flex-auth /v1/check before sign/issue when
policy.enabled is true. Record policy_decision_id in signatures.log. Default
off preserves existing inventory-only behavior. Document production OpenBao
health probe and update config/wiki references.
2026-06-17 08:37:14 +02:00


# Policy-Gated SSH Signing

Date: 2026-06-17

Status: **implemented (opt-in)**, WARDEN-WP-0007

By default, `warden sign` authorizes via the **inventory allow-list** and TTL policy
only. When `policy.enabled: true` is set in `warden.yaml`, ops-warden calls flex-auth
before signing and records the decision id in `signatures.log`.

---
## Flow
```text
warden sign <actor> --pubkey <path>
        |
        v
Load actor from inventory (type, principals, ttl)
        |
        v
policy.enabled?
    no  -> skip
    yes -> flex-auth POST /v1/check
        |
        +-- DENY / unreachable (fail_closed) -> CAError
        |
        v ALLOW
CABackend.sign()  (local or OpenBao SSH engine)
        |
        v
Append signatures.log (+ policy_decision_id when set)
```
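The flow above can be sketched in Python roughly as follows. This is a minimal illustration, not the code in `policy.py`: `Decision`, `gated_sign`, and the `check`, `sign`, and `log` parameters are hypothetical stand-ins. It also assumes that an unreachable flex-auth surfaces as an `OSError`, and that with `fail_closed: false` signing proceeds without a decision id.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class CAError(Exception):
    """Raised when a sign request is refused (name taken from the flow diagram)."""


@dataclass
class Decision:
    # Hypothetical container for the outcome of a flex-auth check.
    allowed: bool
    decision_id: Optional[str] = None
    reason: str = ""


def gated_sign(
    actor: str,
    policy_enabled: bool,
    fail_closed: bool,
    check: Callable[[str], Decision],  # hypothetical wrapper around POST /v1/check
    sign: Callable[[str], str],        # hypothetical stand-in for CABackend.sign()
    log: list,                         # stand-in for appending to signatures.log
) -> str:
    decision_id = None
    if policy_enabled:
        try:
            decision = check(actor)
        except OSError as exc:  # assumption: network failures arrive as OSError
            if fail_closed:
                raise CAError(f"policy check unavailable: {exc}") from exc
            decision = None  # assumption: fail-open continues without a decision
        if decision is not None:
            if not decision.allowed:
                raise CAError(f"policy denied: {decision.reason}")
            decision_id = decision.decision_id

    cert = sign(actor)

    # Record the decision id only when a policy check actually produced one.
    entry = {"actor": actor}
    if decision_id is not None:
        entry["policy_decision_id"] = decision_id
    log.append(entry)
    return cert
```

Passing the check and sign steps in as callables keeps the ordering (policy first, CA backend second, log last) easy to exercise without a running flex-auth instance.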
The same gate runs for `warden issue` (local backend only).

---
## flex-auth request shape
| Field | Source |
| --- | --- |
| `subject.id` | `WARDEN_POLICY_SUBJECT` env var, or actor name |
| `subject.type` | Actor type (`adm` / `agt` / `atm`) |
| `tenant` | `policy.tenant` (default `tenant:platform`) |
| `resource.id` | `ssh-cert:actor/<actor-name>` |
| `resource.type` | `ssh-certificate` |
| `action` | `sign` |
| `context.principals` | From inventory |
| `context.actor_type` | adm \| agt \| atm |
| `context.pubkey_fingerprint` | SHA256 of pubkey text |
| `context.ttl_hours` | Requested TTL |

flex-auth must return `effect: allow` and an `id` (or `request_id`) on allow.
Deny responses include a `reason`, which is surfaced in the CLI error.
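As a rough sketch of the request and response handling described above, the payload could be assembled like this. The function names are hypothetical. The nested JSON layout is inferred from the dotted field names in the table, and the hex encoding of the fingerprint is an assumption; the actual client in `policy.py` may differ on both points.

```python
import hashlib
import os


def build_check_request(actor_name, actor_type, principals, pubkey_text, ttl_hours,
                        tenant="tenant:platform",
                        subject_env="WARDEN_POLICY_SUBJECT"):
    """Build a /v1/check body from the fields in the table (layout assumed)."""
    # The env var override wins; otherwise fall back to the actor name.
    subject_id = os.environ.get(subject_env) or actor_name
    # Assumption: SHA-256 over the pubkey text, hex-encoded.
    fingerprint = hashlib.sha256(pubkey_text.encode("utf-8")).hexdigest()
    return {
        "subject": {"id": subject_id, "type": actor_type},
        "tenant": tenant,
        "resource": {
            "id": f"ssh-cert:actor/{actor_name}",
            "type": "ssh-certificate",
        },
        "action": "sign",
        "context": {
            "principals": list(principals),
            "actor_type": actor_type,
            "pubkey_fingerprint": fingerprint,
            "ttl_hours": ttl_hours,
        },
    }


def decision_id_or_raise(response: dict) -> str:
    """Return the decision id on allow; raise with the deny reason otherwise."""
    if response.get("effect") != "allow":
        raise PermissionError(response.get("reason", "denied by policy"))
    decision_id = response.get("id") or response.get("request_id")
    if not decision_id:
        raise ValueError("allow response carried no id or request_id")
    return decision_id
```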
---
## Configuration
```yaml
# warden.yaml — policy gate (opt-in, default off)
policy:
  enabled: false
  flex_auth_url: http://127.0.0.1:8080
  fail_closed: true
  tenant: tenant:platform
  subject_env: WARDEN_POLICY_SUBJECT
  system: ops-warden
```
| Key | Default | Description |
| --- | --- | --- |
| `enabled` | `false` | When `true`, call flex-auth before every sign/issue |
| `flex_auth_url` | `http://127.0.0.1:8080` | flex-auth base URL |
| `fail_closed` | `true` | Deny signing when flex-auth is unreachable or returns an HTTP error |
| `tenant` | `tenant:platform` | Tenant sent in subject and resource |
| `subject_env` | `WARDEN_POLICY_SUBJECT` | Env var for IAM subject id override |
| `system` | `ops-warden` | Resource system identifier |

Set `WARDEN_POLICY_SUBJECT` to the caller's IAM profile `sub` when available.
If it is unset, the actor name is used as the subject id.
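One way to model these keys and their defaults in Python is a small frozen dataclass, sketched below. `PolicyConfig` and `load_policy_config` are hypothetical names, the input is assumed to be the already-parsed `warden.yaml` mapping, and rejecting unknown keys is a design choice of this sketch rather than documented ops-warden behavior.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PolicyConfig:
    # Defaults mirror the configuration table above.
    enabled: bool = False
    flex_auth_url: str = "http://127.0.0.1:8080"
    fail_closed: bool = True
    tenant: str = "tenant:platform"
    subject_env: str = "WARDEN_POLICY_SUBJECT"
    system: str = "ops-warden"


def load_policy_config(warden_cfg: dict) -> PolicyConfig:
    """Read the `policy` section of a parsed warden.yaml, applying defaults."""
    raw = warden_cfg.get("policy") or {}
    known = {f.name for f in fields(PolicyConfig)}
    unknown = set(raw) - known
    if unknown:
        # Assumption: a misspelled key should fail loudly instead of being
        # ignored, so a typo cannot silently leave the gate disabled.
        raise ValueError(f"unknown policy keys: {sorted(unknown)}")
    return PolicyConfig(**raw)
```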
---
## Versioning
| Version | Gate | Status |
| --- | --- | --- |
| **v1** | Inventory + TTL max | Shipped |
| **v2** | flex-auth opt-in via `policy.enabled` | Shipped (WP-0007) |
| **v2.1** | Identity claims required for `adm` signs | Planned |
| **v3** | Tenant-scoped policies per `tenant:*` | Planned |
---
## What stays in inventory
- Actor registration (name, type, default principals, default TTL)
- Host reference documentation
- Scorecard local checks

flex-auth decides **whether this sign request is allowed now**; inventory
defines **what the actor is allowed to request**.

---
## Production rollout
1. Deploy flex-auth policies for resource type `ssh-certificate`.
2. Enable `policy.enabled: true` in production `warden.yaml`.
3. Keep `fail_closed: true` unless an explicit break-glass procedure exists.
4. Verify `signatures.log` entries include `policy_decision_id`.
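Step 4 can be partly automated. The sketch below assumes, purely for illustration, that `signatures.log` holds one JSON object per line. This page does not specify the log format, so the parsing must be adapted to the real layout, and `entries_missing_decision_id` is a hypothetical helper name.

```python
import json


def entries_missing_decision_id(log_lines):
    """Return 1-based line numbers of entries without a policy_decision_id.

    Assumption: each non-empty line is a standalone JSON object.
    """
    missing = []
    for line_no, line in enumerate(log_lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        if not entry.get("policy_decision_id"):
            missing.append(line_no)
    return missing
```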
---
## See also
- `wiki/OpsWardenConfig.md` — full config reference
- `wiki/CredentialRouting.md`
- `flex-auth/INTENT.md`
- `net-kingdom/docs/platform-identity-security-architecture.md`