---
id: FLEX-WP-0007
type: workplan
title: "Ops-Warden Policy Gate Production Deployment"
domain: infotech
repo: flex-auth
status: blocked
owner: codex
topic_slug: flex-auth
planning_priority: P0
planning_order: 70
depends_on_workplans:
  - FLEX-WP-0006
related_workplans:
  - WARDEN-WP-0009
created: "2026-06-23"
updated: "2026-06-23"
state_hub_workstream_id: "358ce697-2611-4fe9-89ab-63e86ceb00fa"
---

# FLEX-WP-0007: Ops-Warden Policy Gate Production Deployment

## Purpose

Deploy flex-auth as a reachable production runtime for ops-warden's opt-in SSH signing policy gate, load a production registry aligned with real inventory actors, and complete joint smoke evidence so operators can set policy.enabled: true in warden.yaml.

Review update: repo-side production readiness is now separated from operator-only work. flex-auth can publish the production fixture, tests, runtime command, and sync contract in this repo. The actual stable URL deployment and OpenBao smoke remain blocked because they need NetKingdom reachability and a refreshed scoped VAULT_TOKEN.

## Background

ops-warden finished WARDEN-WP-0009 on the caller side: local and production-registry smoke passed, and the production registry generator exists. The remaining risk is operational, not policy shape: warden workstations need a reachable flex-auth URL, and the vault-backed joint smoke needs a valid scoped VAULT_TOKEN.


Production registry artifacts:

- flex-auth fixture: examples/ops-warden/production_registry_snapshot.json
- ops-warden source artifact: ~/ops-warden/registry/flex-auth/production_registry_snapshot.json
- ops-warden generator: ~/ops-warden/scripts/build_flex_auth_registry.py

## Ownership Boundary

| Concern | Owner |
| --- | --- |
| Policy package and PDP decision | flex-auth |
| Actor inventory and TTL/principal defaults | ops-warden |
| SSH CA and OpenBao signing | ops-warden |
| Production registry content for SSH actors | Joint: ops-warden generates, flex-auth hosts |
| policy.enabled flip | ops-warden operator after flex-auth is reachable |

No SSH private keys, OpenBao tokens, or other secrets belong in fixtures, docs, State Hub messages, or smoke evidence.

## T1 - Deploy production flex-auth runtime

```task
id: FLEX-WP-0007-T01
status: done
priority: high
state_hub_task_id: "727573fc-86a3-4f5a-abd7-40b0ccb01e68"
```

Deploy flex-auth serve, or equivalent, to a stable URL reachable from workstations that run warden sign.

- [x] Choose preferred target: in-cluster Service at http://flex-auth.flex-auth.svc.cluster.local:8080 when reachable; otherwise approved operator tunnel or ingress with the same base path
- [x] Document canonical policy.flex_auth_url selection in docs/ops-warden-registry-sync.md
- [x] Document healthz pre-flight: GET /healthz returns HTTP 200
- [x] Add service test coverage for /healthz
- [x] Operator tunnel deployed as flex-auth-coulombcore and confirmed POST /v1/check is reachable from CoulombCore

Acceptance: operator runs curl /healthz from the warden workstation and receives HTTP 200. Verified from CoulombCore on 2026-06-24 with flex_auth_url http://127.0.0.1:18090.

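
The healthz pre-flight from T1 can also be scripted for workstations where a repeatable check is more convenient than an ad hoc curl. The following is a minimal sketch, not part of either repo: the helper names `healthz_url` and `healthz_ok` are hypothetical, and only the `/healthz` path and the HTTP 200 expectation come from this workplan.

```python
import http.client
import urllib.error
import urllib.request


def healthz_url(base_url: str) -> str:
    """Join the configured base URL and /healthz without doubling slashes."""
    return base_url.rstrip("/") + "/healthz"


def healthz_ok(base_url: str, timeout: float = 3.0) -> bool:
    """Return True only when GET <base_url>/healthz answers HTTP 200.

    Hypothetical helper: any connection error, timeout, or non-200 status
    is reported as a failed pre-flight rather than raised.
    """
    try:
        with urllib.request.urlopen(healthz_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # URLError also covers HTTPError (4xx/5xx responses).
        return False
```

On CoulombCore the call would be `healthz_ok("http://127.0.0.1:18090")`, using the flex_auth_url recorded in the acceptance note; other workstations would pass whatever policy.flex_auth_url they are configured with.
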

## T2 - Load production registry and verify real actors

```task
id: FLEX-WP-0007-T02
status: done
priority: high
state_hub_task_id: "6ec1e00c-4a3a-475b-aefb-af3961de7070"
```

Load the production registry snapshot derived from ops-warden inventory, not only the template actors in examples/ops-warden/registry_snapshot.json.

- [x] Add examples/ops-warden/production_registry_snapshot.json from the ops-warden generated artifact
- [x] Document regenerate and load procedure in docs/ops-warden-registry-sync.md
- [x] Verify allow for agt-state-hub-bridge / sign
- [x] Verify deny for ttl_out_of_bounds
- [x] Verify deny for unregistered actors with unknown_actor_resource
- [x] Add CI tests using production actor names: agt-state-hub-bridge, agt-codex-interhub-bootstrap, adm-example, atm-backup-daily

Acceptance: local flex-auth coverage allows agt-state-hub-bridge without ops-warden-local registry patching. Deployed runtime verification remains part of T1.

## T3 - Publish registry sync contract with ops-warden

```task
id: FLEX-WP-0007-T03
status: done
priority: medium
state_hub_task_id: "afa09ec3-516c-433d-87a7-330cb79845a8"
```

Document the two-repo workflow when inventory or policy boundaries change.

- [x] Publish docs/ops-warden-registry-sync.md
- [x] Cover ops-warden ownership of actor names, actor types, principals, and TTL defaults
- [x] Cover flex-auth ownership of hosted registry, relationships, and policy package evaluation
- [x] Document trigger: inventory add/change -> regenerate snapshot -> flex-auth reload
- [x] Cross-link from docs/ops-warden-policy-gate-handoff.md
- [x] Confirm ops-warden wiki/PolicyGatedSigning.md already points to the flex-auth handoff; flex-auth now points back from the sync runbook

Acceptance: a new agt-* actor addition has an unambiguous procedure across both repos.

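
One way to notice that the T3 trigger has fired (inventory changed, snapshot regenerated, fixture not yet refreshed) is to compare the ops-warden source artifact with the flex-auth fixture. The sketch below is an illustration only: `snapshot_in_sync` is a hypothetical helper, and it assumes nothing about the snapshot schema beyond both files being valid JSON.

```python
import json
from pathlib import Path


def load_snapshot(path: Path):
    """Parse a registry snapshot file as JSON."""
    return json.loads(path.read_text(encoding="utf-8"))


def snapshot_in_sync(source: Path, fixture: Path) -> bool:
    """True when both snapshots hold the same JSON content.

    Comparing parsed JSON rather than raw bytes ignores key order and
    whitespace, so only real content changes count as drift.
    """
    return load_snapshot(source) == load_snapshot(fixture)
```

With the paths listed under Production registry artifacts, the source would be `Path.home() / "ops-warden/registry/flex-auth/production_registry_snapshot.json"` and the fixture `Path("examples/ops-warden/production_registry_snapshot.json")`. A `False` result suggests the regenerate-and-reload procedure in docs/ops-warden-registry-sync.md is due.
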

## T4 - Joint OpenBao + policy gate production smoke

```task
id: FLEX-WP-0007-T04
status: wait
priority: medium
state_hub_task_id: "32a96f1c-e0e8-4e27-baa6-7b8c445cf7a1"
```

Coordinate with ops-warden for vault-backed signing through the deployed flex-auth runtime.

- [x] flex-auth deployed with production registry via operator tunnel, completing T1
- [ ] ops-warden policy.enabled: true and policy.flex_auth_url points to deployed URL http://127.0.0.1:18090 on CoulombCore
- [ ] Valid scoped VAULT_TOKEN with warden-sign policy, operator-provided
- [ ] Allow smoke: warden sign agt-state-hub-bridge records backend vault and policy_decision_id
- [ ] Deny smoke: TTL above registry max is denied by flex-auth before OpenBao
- [ ] Record non-secret evidence: decision ids, reasons, actor names only

Blocked on: scoped VAULT_TOKEN refresh. Previous ops-warden session returned HTTP 403 on 2026-06-23; no VAULT_TOKEN is present in this session.

Smoke runner when token is valid:

`SMOKE_VAULT=1 ~/ops-warden/scripts/policy_gate_production_smoke.sh`

## T5 - IAM subject binding for production

```task
id: FLEX-WP-0007-T05
status: done
priority: low
state_hub_task_id: "65dc3c59-1e4b-4335-b6a0-db492ea9b2b5"
```

Clarify how WARDEN_POLICY_SUBJECT maps to flex-auth allowed_subjects in production.

- [x] Document production default: actor name as subject.id unless WARDEN_POLICY_SUBJECT supplies the IAM subject
- [x] Confirm production registry allowed_subjects includes iam: entries
- [x] Add test coverage for iam:agt-state-hub-bridge allow path

Acceptance: documented subject-id strategy; no ops-warden special-casing is required beyond existing policy behavior.

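
The T5 default (actor name as subject.id unless WARDEN_POLICY_SUBJECT supplies the IAM subject) reduces to a small fallback rule. The sketch below illustrates that rule only and is not the ops-warden implementation: `resolve_subject_id` is a hypothetical name, and treating an empty or whitespace-only variable as unset is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_subject_id(actor: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the subject.id to send to flex-auth for a signing request.

    Hypothetical helper: WARDEN_POLICY_SUBJECT wins when it is set to a
    non-blank value; otherwise the actor name is used as-is.
    """
    env = os.environ if env is None else env
    override = env.get("WARDEN_POLICY_SUBJECT", "").strip()
    return override or actor
```

For example, `resolve_subject_id("agt-state-hub-bridge", {})` falls back to the actor name, while passing `{"WARDEN_POLICY_SUBJECT": "iam:agt-state-hub-bridge"}` selects the IAM subject that the production registry's allowed_subjects is expected to list.
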

## Exit Criteria

- flex-auth production runtime reachable from CoulombCore warden path: done via flex-auth-coulombcore operator tunnel
- Production registry loaded and real inventory actors covered locally: done
- Registry sync contract published and cross-linked: done
- Joint vault-backed smoke evidence recorded, or T4 explicitly waits on token: T4 waits on scoped VAULT_TOKEN
- ops-warden operator has the repo-side artifacts needed to set policy.enabled: true after the stable URL and token are ready

## Implementation Notes

2026-06-23 repo-side implementation:

- Added examples/ops-warden/production_registry_snapshot.json from the ops-warden generated production registry artifact.
- Added Go coverage for production actor allows, IAM subject allow, ttl_out_of_bounds, unknown_actor_resource, production registry counts, and /healthz.
- Published docs/ops-warden-registry-sync.md and cross-linked it from the handoff and examples docs.

Remaining blocked work:

- Operator refreshes scoped VAULT_TOKEN and reruns the OpenBao-backed smoke.
- After workplan file changes, run make fix-consistency REPO=flex-auth from ~/state-hub to mirror these statuses into State Hub.

## See Also

- docs/ops-warden-policy-gate-handoff.md
- docs/ops-warden-registry-sync.md
- workplans/FLEX-WP-0006-ops-warden-ssh-signing-policy-gate.md
- ~/ops-warden/wiki/PolicyGatedSigning.md
- ~/ops-warden/workplans/WARDEN-WP-0009-flex-auth-policy-gate-production.md
- ~/ops-warden/history/2026-06-23-flex-auth-production-pickup-suggestion.md

2026-06-24 operator tunnel update:

- Built /tmp/flex-auth and started the production registry runtime on local 127.0.0.1:18090.
- Added local ops-bridge tunnel flex-auth-coulombcore, forwarding CoulombCore 127.0.0.1:18090 to the local runtime.
- Verified remote health from CoulombCore: GET /healthz returned HTTP 200.
- Verified remote POST /v1/check from CoulombCore allowed agt-state-hub-bridge with decision:873c6c682a52bebc.
- VAULT_TOKEN is absent, so OpenBao-backed smoke remains blocked on operator credential refresh.
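
When the smoke does run, T4 allows only non-secret evidence: decision ids, reasons, and actor names. An allowlist is one way to enforce that before anything is pasted into docs or State Hub messages. The sketch below is illustrative: the field names in `ALLOWED_EVIDENCE_KEYS` and the shape of the input record are assumptions, not the actual output format of warden sign or flex-auth.

```python
# Hypothetical field names; adjust to whatever the smoke runner really emits.
ALLOWED_EVIDENCE_KEYS = ("actor", "policy_decision_id", "reason")


def evidence_record(raw: dict) -> dict:
    """Keep only allowlisted, non-secret fields from a smoke result.

    Anything not named in ALLOWED_EVIDENCE_KEYS (tokens, keys, signed
    certificates, headers) is dropped instead of being redacted in place.
    """
    return {key: raw[key] for key in ALLOWED_EVIDENCE_KEYS if key in raw}
```

An allowlist is used here instead of a denylist so that a field added later is excluded by default until someone decides it is safe to record.
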