# Policy-Gated SSH Signing

Date: 2026-06-23
Status: **implemented (opt-in)** (WARDEN-WP-0007; policy package confirmed FLEX-WP-0006)

By default `warden sign` authorizes via **inventory allow-list** and TTL policy
only. When `policy.enabled: true` in `warden.yaml`, ops-warden calls flex-auth
before signing and records the decision id in `signatures.log`.

---

## Flow

```text
warden sign <actor> --pubkey <path>
        |
        v
Load actor from inventory (type, principals, ttl)
        |
        v
policy.enabled?
   no  -> skip
   yes -> flex-auth POST /v1/check
        |
        +-- DENY / unreachable (fail_closed) -> CAError
        |
        v ALLOW
CABackend.sign()  (local or OpenBao SSH engine)
        |
        v
Append signatures.log (+ policy_decision_id when set)
```

The same gate runs for `warden issue` (local backend only).

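The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ops-warden implementation: `gated_sign`, `Decision`, and the `check` / `sign` callables are hypothetical stand-ins for the flex-auth call and `CABackend.sign()`, and a plain list stands in for `signatures.log`. Only the `CAError` name and the `policy_decision_id` field come from this document.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class CAError(Exception):
    """Raised when signing is refused (name taken from the flow diagram)."""


@dataclass
class Decision:
    """Hypothetical result of a flex-auth check."""
    allowed: bool
    decision_id: Optional[str] = None
    reason: str = ""


def gated_sign(
    actor: str,
    policy_enabled: bool,
    check: Callable[[str], Decision],  # hypothetical wrapper around POST /v1/check
    sign: Callable[[str], str],        # hypothetical stand-in for CABackend.sign()
    log: list,                         # stand-in for signatures.log
) -> str:
    decision_id = None
    if policy_enabled:
        decision = check(actor)
        if not decision.allowed:
            # DENY stops the flow before the CA backend is touched.
            raise CAError(f"policy denied: {decision.reason}")
        decision_id = decision.decision_id

    cert = sign(actor)

    # The decision id is recorded only when the gate actually ran.
    entry = {"actor": actor}
    if decision_id is not None:
        entry["policy_decision_id"] = decision_id
    log.append(entry)
    return cert
```
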

---

## flex-auth request shape

| Field | Source |
| --- | --- |
| `subject.id` | `WARDEN_POLICY_SUBJECT` env var, or actor name |
| `subject.type` | Actor type (`adm` / `agt` / `atm`) |
| `tenant` | `policy.tenant` (default `tenant:platform`) |
| `resource.id` | `ssh-cert:actor/<actor-name>` |
| `resource.type` | `ssh-certificate` |
| `action` | `sign` |
| `context.principals` | From inventory |
| `context.actor_type` | adm \| agt \| atm |
| `context.pubkey_fingerprint` | SHA256 of pubkey text |
| `context.ttl_hours` | Requested TTL |

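As a rough illustration of the table above, the request body could be assembled as follows. The nesting is inferred from the dotted field names, and `build_check_request` is a hypothetical helper, not an ops-warden function. The fingerprint here is the hex SHA-256 of the UTF-8 pubkey text; the exact encoding flex-auth expects is an assumption (OpenSSH-style fingerprints, for instance, are base64 over the decoded key blob).

```python
import hashlib
import os


def build_check_request(actor_name, actor_type, principals, ttl_hours,
                        pubkey_text, tenant="tenant:platform",
                        subject_env="WARDEN_POLICY_SUBJECT", environ=None):
    """Assemble a /v1/check payload from the fields listed in the table."""
    env = os.environ if environ is None else environ
    # The env var overrides the subject id; otherwise fall back to the actor name.
    subject_id = env.get(subject_env) or actor_name
    # Assumption: hex digest of the pubkey text as UTF-8 bytes.
    fingerprint = hashlib.sha256(pubkey_text.encode("utf-8")).hexdigest()
    return {
        "subject": {"id": subject_id, "type": actor_type},
        "tenant": tenant,
        "resource": {
            "id": f"ssh-cert:actor/{actor_name}",
            "type": "ssh-certificate",
        },
        "action": "sign",
        "context": {
            "principals": list(principals),
            "actor_type": actor_type,
            "pubkey_fingerprint": fingerprint,
            "ttl_hours": ttl_hours,
        },
    }
```
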
flex-auth must return `effect: allow` and an `id` (or `request_id`) on allow.
Deny responses include a `reason` surfaced in the CLI error.

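A minimal sketch of handling that response, assuming the body has already been decoded from JSON into a dict. `parse_check_response` is a hypothetical name, and treating any `effect` other than `allow` as a deny is an assumption; only the `effect`, `id`, `request_id`, and `reason` fields come from this document.

```python
def parse_check_response(body: dict) -> str:
    """Return the decision id on allow; raise PermissionError with the reason otherwise."""
    if body.get("effect") == "allow":
        # Either field may carry the decision id.
        decision_id = body.get("id") or body.get("request_id")
        if not decision_id:
            raise ValueError("allow response carries neither 'id' nor 'request_id'")
        return decision_id
    # Assumption: anything that is not an explicit allow is treated as a deny.
    raise PermissionError(body.get("reason", "denied by flex-auth"))
```
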

---

## Configuration

```yaml
# warden.yaml: policy gate (opt-in, default off)
policy:
  enabled: false
  flex_auth_url: http://127.0.0.1:8080
  fail_closed: true
  tenant: tenant:platform
  subject_env: WARDEN_POLICY_SUBJECT
  system: ops-warden
```

| Key | Default | Description |
| --- | --- | --- |
| `enabled` | `false` | When `true`, call flex-auth before every sign/issue |
| `flex_auth_url` | `http://127.0.0.1:8080` | flex-auth base URL |
| `fail_closed` | `true` | Deny sign when flex-auth is unreachable or returns HTTP error |
| `tenant` | `tenant:platform` | Tenant sent in subject and resource |
| `subject_env` | `WARDEN_POLICY_SUBJECT` | Env var for IAM subject id override |
| `system` | `ops-warden` | Resource system identifier |

Set `WARDEN_POLICY_SUBJECT` to the caller's IAM profile `sub` when available.
If unset, the actor name is used as subject id.

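One way to model these keys and their defaults is a small dataclass. This is a sketch, not the ops-warden loader: `PolicyConfig` and `load_policy_config` are hypothetical names, the input is assumed to be `warden.yaml` already parsed into a dict, and rejecting unknown keys is a design choice made here so that a typo such as `fail_close` cannot silently weaken the gate.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PolicyConfig:
    # Defaults mirror the table above.
    enabled: bool = False
    flex_auth_url: str = "http://127.0.0.1:8080"
    fail_closed: bool = True
    tenant: str = "tenant:platform"
    subject_env: str = "WARDEN_POLICY_SUBJECT"
    system: str = "ops-warden"


def load_policy_config(document: dict) -> PolicyConfig:
    """Build a PolicyConfig from a parsed warden.yaml mapping."""
    raw = document.get("policy") or {}
    known = {f.name for f in fields(PolicyConfig)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"unknown policy keys: {sorted(unknown)}")
    return PolicyConfig(**raw)
```
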

---

## Versioning

| Version | Gate | Status |
| --- | --- | --- |
| **v1** | Inventory + TTL max | Shipped |
| **v2** | flex-auth opt-in via `policy.enabled` | Shipped (WP-0007) |
| **v2.1** | Identity claims required for `adm` signs | Planned |
| **v3** | Tenant-scoped policies per `tenant:*` | Planned |

---

## What stays in inventory

- Actor registration (name, type, default principals, default TTL)
- Host reference documentation
- Scorecard local checks

flex-auth decides **whether this sign request is allowed now**; inventory
defines **what the actor is allowed to request**.


---

## flex-auth policy package (FLEX-WP-0006)

flex-auth owns the `ssh-certificate` / `sign` policy package. ops-warden consumes
it via `POST /v1/check` when `policy.enabled: true`.

**Handoff (canonical):** `~/flex-auth/docs/ops-warden-policy-gate-handoff.md`

| Asset | flex-auth path |
| --- | --- |
| Policy package | `examples/ops-warden/policy_package.md` |
| Allow/deny fixtures | `examples/ops-warden/policy_fixtures.yaml` |
| Registry snapshot | `examples/ops-warden/registry_snapshot.json` |
| Subject manifest | `examples/ops-warden/subject_manifest.yaml` |
| Resource manifest | `examples/ops-warden/resource_manifest.yaml` |

### Tenant and subject bindings

| Field | Value |
| --- | --- |
| Tenant | `tenant:platform` (`policy.tenant`) |
| Resource system | `ops-warden` (`policy.system`) |
| Resource type | `ssh-certificate` |
| Action | `sign` |
| Resource id | `ssh-cert:actor/<actor-name>` |

| Actor type | Example flex-auth subject | ops-warden inventory name pattern |
| --- | --- | --- |
| `adm` | `platform-steward` | `adm-*` |
| `agt` | `ci-deploy-agent` | `agt-*` |
| `atm` | `backup-automation` | `atm-*` |

**Subject id sent to flex-auth:** `WARDEN_POLICY_SUBJECT` when set, otherwise the
inventory actor name. flex-auth may also allow `iam:<actor-name>` when listed in
`allowed_subjects` on the resource.

**Principals and TTL:** Taken from the sign request (inventory defaults). flex-auth
denies when principals are empty/disallowed or TTL exceeds `max_ttl_hours` on the
registered resource.

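The principals and TTL rules can be pictured with the sketch below. It is only an illustration of the deny conditions described above, not flex-auth's policy code: `evaluate_sign_request` and the `allowed_principals` field are hypothetical, while `max_ttl_hours` is the field named in this document.

```python
def evaluate_sign_request(resource: dict, principals: list, ttl_hours: float):
    """Return (allowed, reason) for a sign request against a registered resource."""
    if not principals:
        return False, "principals are empty"

    # Hypothetical field: the set of principals the resource may be signed for.
    allowed = set(resource.get("allowed_principals", []))
    disallowed = [p for p in principals if p not in allowed]
    if disallowed:
        return False, f"disallowed principals: {', '.join(disallowed)}"

    # A TTL equal to the maximum is still allowed; only values above it are denied.
    if ttl_hours > resource["max_ttl_hours"]:
        return False, f"ttl {ttl_hours}h exceeds max {resource['max_ttl_hours']}h"

    return True, ""
```

The order of the checks only affects which reason is reported first; any single failing rule is enough to deny.
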
### Fixture coverage (flex-auth)

Allow: `fixture:ops-warden-adm-sign-allow`, `fixture:ops-warden-agt-sign-allow`,
`fixture:ops-warden-atm-sign-allow`.

Deny: `fixture:ops-warden-unknown-subject-deny`,
`fixture:ops-warden-actor-type-mismatch-deny`, `fixture:ops-warden-ttl-above-max-deny`,
`fixture:ops-warden-disallowed-principal-deny`,
`fixture:ops-warden-missing-fingerprint-deny`.

### Local smoke

```bash
# flex-auth (from ~/flex-auth)
flex-auth serve --addr 127.0.0.1:8080 \
  --registry examples/ops-warden/registry_snapshot.json \
  --policy examples/ops-warden/policy_package.md \
  --log /tmp/flex-auth-ops-warden-decisions.jsonl

# warden.yaml: policy.enabled: true, flex_auth_url pointing at flex-auth
# Use an actor registered in the flex-auth registry (example fixtures use
# template names; production needs a registry slice for real inventory actors).
```

Local end-to-end evidence: `history/2026-06-23-flex-auth-policy-gate-local-smoke.md`.

### Production registry from inventory

Build a flex-auth registry snapshot that mirrors `inventory.yaml` actors:

```bash
python scripts/build_flex_auth_registry.py ~/.config/warden/inventory.yaml \
  -o registry/flex-auth/production_registry_snapshot.json
flex-auth load-registry --file registry/flex-auth/production_registry_snapshot.json
```

Re-run after adding or changing actors. Deploy the snapshot to the production
flex-auth runtime together with `~/flex-auth/examples/ops-warden/policy_package.md`.

Smoke (non-secret):

```bash
./scripts/policy_gate_production_smoke.sh
# OpenBao-backed when VAULT_TOKEN is valid:
SMOKE_VAULT=1 ./scripts/policy_gate_production_smoke.sh
```

Evidence: `history/2026-06-23-flex-auth-policy-gate-production-smoke.md`.

---

## Production rollout

**Keep `policy.enabled: false` until flex-auth is reachable** at `policy.flex_auth_url`.
With `fail_closed: true`, an unreachable flex-auth blocks all signs.

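The effect of `fail_closed` on an unreachable flex-auth can be sketched as below. `check_or_fail` and `call_flex_auth` are hypothetical names. The behaviour with `fail_closed: false` (continue without a decision id) is an assumption, since this document only specifies the fail-closed path.

```python
class CAError(Exception):
    """Raised when signing is refused (name taken from the flow diagram)."""


def check_or_fail(call_flex_auth, fail_closed: bool = True):
    """Run the policy check; decide what an unreachable flex-auth means."""
    try:
        return call_flex_auth()
    except OSError as exc:
        # Connection errors, timeouts, and urllib's URLError/HTTPError
        # are all OSError subclasses in Python 3.
        if fail_closed:
            raise CAError(f"flex-auth unreachable, refusing to sign: {exc}") from exc
        # Assumption: fail-open proceeds with no decision id recorded.
        return None
```

With the default `fail_closed=True`, any outage of flex-auth surfaces as a `CAError` on every sign, which is why the gate should stay disabled until the service is confirmed reachable.
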
### Operator checklist

| Step | Owner | Action |
| --- | --- | --- |
| 1 | flex-auth | Deploy runtime; confirm `curl <flex_auth_url>/healthz` → 200 (**FLEX-WP-0007**) |
| 2 | flex-auth | Load production registry + policy package (`~/flex-auth/examples/ops-warden/`) |
| 3 | ops-warden | Regenerate registry from inventory: `scripts/build_flex_auth_registry.py` |
| 4 | ops-warden | Local smoke: `./scripts/policy_gate_production_smoke.sh` |
| 5 | operator | Vault smoke: `SMOKE_VAULT=1 ./scripts/policy_gate_production_smoke.sh` (valid `VAULT_TOKEN`) |
| 6 | operator | Set `policy.flex_auth_url` in `~/.config/warden/warden.yaml` |
| 7 | operator | Set `policy.enabled: true`; keep `fail_closed: true` |
| 8 | operator | Allow smoke: `warden sign <actor>`; `signatures.log` has `policy_decision_id` |
| 9 | operator | Deny smoke: e.g. `--ttl` above max; CLI shows flex-auth `reason`, no cert |

Cross-repo references:

- `~/flex-auth/workplans/FLEX-WP-0007-ops-warden-policy-gate-production-deployment.md`
- `history/2026-06-23-flex-auth-production-pickup-suggestion.md`
- `history/2026-06-23-flex-auth-policy-gate-production-smoke.md`

### Summary

1. Deploy the flex-auth registry and policy package to the production flex-auth
   runtime, **not** only the example fixtures.
2. Set `policy.flex_auth_url` to the production flex-auth base URL.
3. Enable `policy.enabled: true` only after operator checklist steps 1–5 pass.
4. Keep `fail_closed: true` unless an explicit break-glass procedure exists.
5. Smoke allow and deny paths; preserve non-secret evidence only.

---

## See also

- `wiki/OpsWardenConfig.md`: full config reference
- `wiki/CredentialRouting.md`
- `~/flex-auth/docs/ops-warden-policy-gate-handoff.md`: flex-auth handoff
- `flex-auth/INTENT.md`
- `net-kingdom/docs/platform-identity-security-architecture.md`