Add production_registry_snapshot.json from ops-warden inventory with CI coverage for real actors, IAM subject binding, ttl_out_of_bounds, and unknown_actor_resource. Extend serve contract tests with /healthz and publish the registry sync contract for operator deployment.
Ops-Warden Registry Sync
Date: 2026-06-23
Workplan: FLEX-WP-0007
This is the flex-auth side of the production policy gate runbook for ops-warden SSH signing. ops-warden owns actor inventory and generated registry content; flex-auth hosts that registry, evaluates the policy package, and returns the decision envelope used by warden sign.
Production Runtime Target
Use the NetKingdom operator-reachable service URL as the canonical policy.flex_auth_url. The preferred target is an in-cluster flex-auth Service fronted by the existing operator access path:
http://flex-auth.flex-auth.svc.cluster.local:8080
If cluster DNS is not reachable from the workstation that runs warden sign, use an approved operator tunnel or ingress URL with the same base path semantics. Do not turn on policy.enabled with fail_closed true until this pre-flight succeeds from the same workstation:
curl -fsS <policy.flex_auth_url>/healthz
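The same pre-flight can be scripted so it runs before any configuration change. The following is a minimal Python sketch, not part of warden or flex-auth: the function names and the injectable `fetch` parameter are illustrative, and it assumes only that /healthz answers HTTP 200 when the runtime is up. It is slightly stricter than `curl -f`, which accepts any status below 400.

```python
import urllib.error
import urllib.request


def healthz_url(base_url: str) -> str:
    """Join the base URL and /healthz without doubling the slash."""
    return base_url.rstrip("/") + "/healthz"


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status of a GET, or 0 if the endpoint is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status (4xx/5xx).
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return 0


def preflight_ok(base_url: str, fetch=fetch_status) -> bool:
    """True only when /healthz answers 200 (hypothetical helper)."""
    return fetch(healthz_url(base_url)) == 200
```

Calling `preflight_ok` with the configured policy.flex_auth_url from the signing workstation gives a single boolean to gate the policy.enabled change on. The `fetch` parameter exists only so the check can be exercised without a network.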
Start the runtime with the production registry snapshot and the ops-warden policy package:
flex-auth serve --addr 0.0.0.0:8080 --registry examples/ops-warden/production_registry_snapshot.json --policy examples/ops-warden/policy_package.md --log /var/log/flex-auth/ops-warden-decisions.jsonl
The checked-in production snapshot is a non-secret fixture and initial load target. Regenerate it from ops-warden inventory whenever actors, principals, or TTL defaults change.
Current Operator Tunnel
As of 2026-06-24, the reachable operator-tunnel URL for CoulombCore is:
http://127.0.0.1:18090
The tunnel name is flex-auth-coulombcore. It forwards CoulombCore 127.0.0.1:18090 to the local flex-auth runtime on 127.0.0.1:18090. Verified checks from CoulombCore:
- GET /healthz returned HTTP 200.
- POST /v1/check for agt-state-hub-bridge returned allow with decision:873c6c682a52bebc.
This is an operator-tunnel pattern. It does not replace an in-cluster Service if flex-auth is later deployed inside the cluster.
Ownership Contract
| Concern | Owner | Notes |
|---|---|---|
| Actor names and actor types | ops-warden | inventory.yaml defines adm, agt, and atm actors. |
| Default principals and TTLs | ops-warden | Used by warden sign and by generated registry attributes. |
| Registry hosting and reload | flex-auth | Runtime serves the generated snapshot and evaluates it with the policy package. |
| Policy package semantics | flex-auth | examples/ops-warden/policy_package.md owns allow and deny reasons. |
| OpenBao SSH signing | ops-warden | flex-auth never receives SSH private keys or Vault tokens. |
| Production policy.enabled flip | ops-warden operator | Only after healthz and allow/deny smoke pass. |
Sync Procedure
- In ops-warden, update the managed inventory source or ~/.config/warden/inventory.yaml.
- Regenerate the flex-auth snapshot from ops-warden:
  python scripts/build_flex_auth_registry.py ~/.config/warden/inventory.yaml -o registry/flex-auth/production_registry_snapshot.json
- Validate the generated file before handoff:
  flex-auth load-registry --file registry/flex-auth/production_registry_snapshot.json
- Copy or promote the snapshot to the flex-auth runtime. For repo-level drift coverage, update examples/ops-warden/production_registry_snapshot.json when the intended production fixture changes.
- Restart or reload the flex-auth runtime with the new snapshot.
- From the workstation that runs warden sign, verify:
  curl -fsS <policy.flex_auth_url>/healthz
- Run one allow smoke and one deny smoke. Record only non-secret evidence: actor name, decision id, effect, reason, backend, and whether a certificate was issued.
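One way to keep smoke records limited to the non-secret fields named in the last step is to give the record a fixed shape. The sketch below is illustrative only: `SmokeEvidence` and `to_jsonl` are hypothetical names, and the field names are an assumed spelling of the six items listed above, not a format defined by ops-warden or flex-auth.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SmokeEvidence:
    """Exactly the six non-secret fields from the sync procedure."""
    actor: str
    decision_id: str
    effect: str
    reason: str
    backend: str
    certificate_issued: bool


def to_jsonl(evidence: SmokeEvidence) -> str:
    """Serialize one record as a single JSON line with stable key order."""
    return json.dumps(asdict(evidence), sort_keys=True)


# Placeholder values for illustration; not taken from a real run.
example = SmokeEvidence(
    actor="agt-state-hub-bridge",
    decision_id="decision:<id>",
    effect="deny",
    reason="ttl_out_of_bounds",
    backend="<backend>",
    certificate_issued=False,
)
print(to_jsonl(example))
```

Because the dataclass has no free-form field, a token or key cannot be attached to a record by accident. Anything beyond the six fields has to be an explicit schema change.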
Current Production Fixture
The initial fixture mirrors ops-warden production inventory as of 2026-06-23. It registers:
| Actor | Type | Principal | Max TTL hours | Allowed subjects |
|---|---|---|---|---|
| adm-example | adm | adm-full | 48 | adm-example, iam:adm-example |
| agt-codex-interhub-bootstrap | agt | agt-interhub-bootstrap | 2 | agt-codex-interhub-bootstrap, iam:agt-codex-interhub-bootstrap |
| agt-state-hub-bridge | agt | agt-task-bridge | 24 | agt-state-hub-bridge, iam:agt-state-hub-bridge |
| atm-backup-daily | atm | atm-backup-daily | 8 | atm-backup-daily, iam:atm-backup-daily |
The IAM subject form is intended for WARDEN_POLICY_SUBJECT. If that environment variable is unset, ops-warden sends the actor name and the same policy path continues to work.
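The fallback described above can be sketched in a few lines of Python. This is an illustration rather than ops-warden's actual code: the helper names are hypothetical, the allowed subjects are copied from the fixture table, and treating an empty WARDEN_POLICY_SUBJECT the same as an unset one is an assumption.

```python
import os

# Allowed subjects copied from the fixture table in this document.
ALLOWED_SUBJECTS = {
    "adm-example": {"adm-example", "iam:adm-example"},
    "agt-codex-interhub-bootstrap": {
        "agt-codex-interhub-bootstrap",
        "iam:agt-codex-interhub-bootstrap",
    },
    "agt-state-hub-bridge": {"agt-state-hub-bridge", "iam:agt-state-hub-bridge"},
    "atm-backup-daily": {"atm-backup-daily", "iam:atm-backup-daily"},
}


def policy_subject(actor: str, env=None) -> str:
    """Use WARDEN_POLICY_SUBJECT when set (and non-empty, by assumption),
    otherwise fall back to the actor name."""
    env = os.environ if env is None else env
    return env.get("WARDEN_POLICY_SUBJECT") or actor


def subject_allowed(actor: str, subject: str) -> bool:
    """True when the subject is registered for that actor in the fixture."""
    return subject in ALLOWED_SUBJECTS.get(actor, set())
```

With the variable unset, `policy_subject("atm-backup-daily", env={})` returns the actor name, which is itself a registered subject, so both forms follow the same policy path.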
Smoke Expectations
Allow path:
warden sign agt-state-hub-bridge
Expected non-secret evidence: decision effect allow, reason signing_policy_matched, signatures.log includes policy_decision_id.
Deny path:
warden sign agt-state-hub-bridge --ttl 999
Expected non-secret evidence: effect deny, reason ttl_out_of_bounds, no certificate issued. With fail_closed true, unreachable flex-auth must also block signing.
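The expected outcomes can be written down as a small Python sketch to compare against what a smoke run actually reports. It is not the policy package itself and makes several labeled assumptions: `--ttl` is read as hours to match the fixture's "Max TTL hours" column, a TTL must be greater than zero, `unknown_actor_resource` is taken to be the reason for an unregistered actor, and with fail_closed false an unreachable flex-auth is assumed not to block signing. The function names are hypothetical.

```python
# Max TTL hours copied from the fixture table in this document.
MAX_TTL_HOURS = {
    "adm-example": 48,
    "agt-codex-interhub-bootstrap": 2,
    "agt-state-hub-bridge": 24,
    "atm-backup-daily": 8,
}


def expected_decision(actor: str, ttl_hours: float) -> tuple[str, str]:
    """Return the (effect, reason) a smoke run should observe."""
    if actor not in MAX_TTL_HOURS:
        # Assumption: this reason covers actors missing from the registry.
        return ("deny", "unknown_actor_resource")
    if not 0 < ttl_hours <= MAX_TTL_HOURS[actor]:
        return ("deny", "ttl_out_of_bounds")
    return ("allow", "signing_policy_matched")


def may_sign(decision, fail_closed: bool) -> bool:
    """decision is (effect, reason), or None when flex-auth was unreachable."""
    if decision is None:
        # Unreachable: fail_closed true blocks signing; false is assumed to allow.
        return not fail_closed
    return decision[0] == "allow"
```

For the deny smoke above, `expected_decision("agt-state-hub-bridge", 999)` yields `("deny", "ttl_out_of_bounds")`, and `may_sign(None, fail_closed=True)` is `False`, matching the requirement that an unreachable flex-auth blocks signing.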
OpenBao-backed signing remains an operator-run smoke test because it requires a scoped VAULT_TOKEN. The previous session, on 2026-06-23, returned HTTP 403; retry with:
SMOKE_VAULT=1 ~/ops-warden/scripts/policy_gate_production_smoke.sh
References
- docs/ops-warden-policy-gate-handoff.md
- examples/ops-warden/production_registry_snapshot.json
- ~/ops-warden/wiki/PolicyGatedSigning.md
- ~/ops-warden/history/2026-06-23-flex-auth-policy-gate-production-smoke.md