| id | type | title | domain | repo | status | owner | topic_slug | planning_priority | planning_order | depends_on_workplans | related_workplans | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| FLEX-WP-0007 | workplan | Ops-Warden Policy Gate Production Deployment | infotech | flex-auth | finished | codex | flex-auth | P0 | 70 | | | 2026-06-23 | 2026-06-30 | 358ce697-2611-4fe9-89ab-63e86ceb00fa |
FLEX-WP-0007: Ops-Warden Policy Gate Production Deployment
Purpose
Deploy flex-auth as a production runtime reachable by ops-warden's opt-in SSH signing policy gate, load a production registry aligned with real inventory actors, and record joint smoke evidence. With these in place, operators can set policy.enabled: true in warden.yaml once the ecosystem maturity stage calls for live enforcement.
Review update: repo-side production readiness is now separated from
operator-only work. flex-auth can publish the production fixture, tests,
runtime command, and sync contract in this repo. The actual stable URL
deployment and OpenBao smoke were completed through the operator tunnel and a
scoped warden-sign OpenBao lane. The final policy.enabled production flip is
explicitly deferred until the ecosystem reaches testing/production maturity.
Background
ops-warden finished WARDEN-WP-0009 on the caller side: local and production-registry smoke passed, and the production registry generator exists. The remaining risk is operational, not policy shape: warden workstations need a reachable flex-auth URL and a vault-backed joint smoke before the gate can be banked for later enforcement.
Production registry artifacts:
- flex-auth fixture: examples/ops-warden/production_registry_snapshot.json
- ops-warden source artifact: ~/ops-warden/registry/flex-auth/production_registry_snapshot.json
- ops-warden generator: ~/ops-warden/scripts/build_flex_auth_registry.py
Ownership Boundary
| Concern | Owner |
|---|---|
| Policy package and PDP decision | flex-auth |
| Actor inventory and TTL/principal defaults | ops-warden |
| SSH CA and OpenBao signing | ops-warden |
| Production registry content for SSH actors | Joint: ops-warden generates, flex-auth hosts |
| policy.enabled flip | ops-warden operator after flex-auth is reachable |
No SSH private keys, OpenBao tokens, or other secrets belong in fixtures, docs, State Hub messages, or smoke evidence.
T1 - Deploy production flex-auth runtime
id: FLEX-WP-0007-T01
status: done
priority: high
state_hub_task_id: "727573fc-86a3-4f5a-abd7-40b0ccb01e68"
Deploy flex-auth serve, or equivalent, to a stable URL reachable from workstations that run warden sign.
- Choose preferred target: in-cluster Service at http://flex-auth.flex-auth.svc.cluster.local:8080 when reachable; otherwise approved operator tunnel or ingress with the same base path
- Document canonical policy.flex_auth_url selection in docs/ops-warden-registry-sync.md
- Document healthz pre-flight: GET /healthz returns HTTP 200
- Add service test coverage for /healthz
- Operator tunnel deployed as flex-auth-coulombcore and confirmed POST /v1/check is reachable from CoulombCore
Acceptance: operator runs curl <flex_auth_url>/healthz from the warden workstation and receives HTTP 200. Verified from CoulombCore on 2026-06-24 with flex_auth_url http://127.0.0.1:18090.
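The target selection and healthz pre-flight above can be sketched in Go as follows. This is a minimal illustration, not the flex-auth or ops-warden implementation: the function names (healthzURL, probeHealthz, selectFlexAuthURL) and the 3-second timeout are assumptions made for the example; only the two URLs and the GET /healthz returning HTTP 200 contract come from this workplan.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
	"time"
)

// healthzURL joins a base URL and the /healthz path without doubling slashes.
func healthzURL(base string) string {
	return strings.TrimRight(base, "/") + "/healthz"
}

// probeHealthz reports whether GET <base>/healthz answers HTTP 200.
// The timeout value is an assumption for this sketch.
func probeHealthz(base string) bool {
	client := &http.Client{Timeout: 3 * time.Second}
	resp, err := client.Get(healthzURL(base))
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode == http.StatusOK
}

// selectFlexAuthURL returns the first candidate, in preference order,
// that passes the probe. The probe is injected so it can be stubbed.
func selectFlexAuthURL(candidates []string, probe func(string) bool) (string, error) {
	for _, c := range candidates {
		if probe(c) {
			return c, nil
		}
	}
	return "", fmt.Errorf("no reachable flex-auth URL among %d candidates", len(candidates))
}

func main() {
	candidates := []string{
		"http://flex-auth.flex-auth.svc.cluster.local:8080", // preferred in-cluster Service
		"http://127.0.0.1:18090",                            // operator tunnel endpoint on CoulombCore
	}
	url, err := selectFlexAuthURL(candidates, probeHealthz)
	if err != nil {
		fmt.Println("pre-flight failed:", err)
		return
	}
	fmt.Println("policy.flex_auth_url candidate:", url)
}
```

Injecting the probe keeps the preference logic testable without a live runtime; the manual curl check in the acceptance line remains the authoritative pre-flight.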
T2 - Load production registry and verify real actors
id: FLEX-WP-0007-T02
status: done
priority: high
state_hub_task_id: "6ec1e00c-4a3a-475b-aefb-af3961de7070"
Load the production registry snapshot derived from ops-warden inventory, not only the template actors in examples/ops-warden/registry_snapshot.json.
- Add examples/ops-warden/production_registry_snapshot.json from the ops-warden generated artifact
- Document regenerate and load procedure in docs/ops-warden-registry-sync.md
- Verify allow for agt-state-hub-bridge / sign
- Verify deny for ttl_out_of_bounds
- Verify deny for unregistered actors with unknown_actor_resource
- Add CI tests using production actor names: agt-state-hub-bridge, agt-codex-interhub-bootstrap, adm-example, atm-backup-daily
Acceptance: local flex-auth coverage allows agt-state-hub-bridge without ops-warden-local registry patching. Deployed runtime verification remains part of T1.
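The three outcomes verified in this task (allow, ttl_out_of_bounds, unknown_actor_resource) can be illustrated with a small Go sketch. It is not the flex-auth policy package: the Actor and Decision types, the check function, the order of the checks, and the MaxTTL values are hypothetical, and the TTL unit is left unspecified because this workplan does not state it. Only the actor names and reason strings are taken from the text above.

```go
package main

import "fmt"

// Actor is a hypothetical registry entry; the real snapshot schema may differ.
type Actor struct {
	MaxTTL int // upper TTL bound, in whatever unit the sign request uses
}

// Decision holds an allow/deny outcome and, for denials, a reason string.
type Decision struct {
	Allow  bool
	Reason string
}

// check evaluates a sign request against the registry.
// Unknown actors are rejected first, then the TTL bound is applied.
func check(registry map[string]Actor, actor string, ttl int) Decision {
	entry, ok := registry[actor]
	if !ok {
		return Decision{Allow: false, Reason: "unknown_actor_resource"}
	}
	if ttl <= 0 || ttl > entry.MaxTTL {
		return Decision{Allow: false, Reason: "ttl_out_of_bounds"}
	}
	return Decision{Allow: true}
}

func main() {
	// MaxTTL values are invented for the example, not read from the snapshot.
	registry := map[string]Actor{
		"agt-state-hub-bridge": {MaxTTL: 24},
		"atm-backup-daily":     {MaxTTL: 8},
	}
	fmt.Printf("%+v\n", check(registry, "agt-state-hub-bridge", 1))
	fmt.Printf("%+v\n", check(registry, "agt-state-hub-bridge", 999))
	fmt.Printf("%+v\n", check(registry, "agt-not-registered", 1))
}
```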
T3 - Publish registry sync contract with ops-warden
id: FLEX-WP-0007-T03
status: done
priority: medium
state_hub_task_id: "afa09ec3-516c-433d-87a7-330cb79845a8"
Document the two-repo workflow when inventory or policy boundaries change.
- Publish docs/ops-warden-registry-sync.md
- Cover ops-warden ownership of actor names, actor types, principals, and TTL defaults
- Cover flex-auth ownership of hosted registry, relationships, and policy package evaluation
- Document trigger: inventory add/change -> regenerate snapshot -> flex-auth reload
- Cross-link from docs/ops-warden-policy-gate-handoff.md
- Confirm ops-warden wiki/PolicyGatedSigning.md already points to the flex-auth handoff; flex-auth now points back from the sync runbook
Acceptance: a new agt-* actor addition has an unambiguous procedure across both repos.
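One way to detect the sync trigger (inventory add/change -> regenerate snapshot -> flex-auth reload) is to compare actor names in the inventory with those in the hosted snapshot. The Go sketch below assumes both sides can be reduced to plain name lists; diffActors and the agt-new-example actor are hypothetical, and the actual generator (build_flex_auth_registry.py) may work differently.

```go
package main

import (
	"fmt"
	"sort"
)

// diffActors compares inventory actor names with the names in the hosted
// snapshot. It returns names missing from the snapshot (added) and names
// that are in the snapshot but no longer in the inventory (removed).
func diffActors(inventory, snapshot []string) (added, removed []string) {
	inSnapshot := make(map[string]bool, len(snapshot))
	for _, name := range snapshot {
		inSnapshot[name] = true
	}
	inInventory := make(map[string]bool, len(inventory))
	for _, name := range inventory {
		if inInventory[name] {
			continue // skip duplicate inventory entries
		}
		inInventory[name] = true
		if !inSnapshot[name] {
			added = append(added, name)
		}
	}
	for _, name := range snapshot {
		if !inInventory[name] {
			removed = append(removed, name)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	return added, removed
}

func main() {
	snapshot := []string{"agt-state-hub-bridge", "adm-example", "atm-backup-daily"}
	// agt-new-example is a made-up actor standing in for a new agt-* addition.
	inventory := append([]string{"agt-new-example"}, snapshot...)

	added, removed := diffActors(inventory, snapshot)
	if len(added)+len(removed) > 0 {
		fmt.Println("regenerate snapshot and reload flex-auth; added:", added, "removed:", removed)
	}
}
```

A name-only diff does not catch changes to principals or TTL defaults, which the sync contract also treats as triggers; those would need a field-level comparison.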
T4 - Joint OpenBao + policy gate production smoke
id: FLEX-WP-0007-T04
status: done
priority: medium
state_hub_task_id: "32a96f1c-e0e8-4e27-baa6-7b8c445cf7a1"
Coordinate with ops-warden for vault-backed signing through the deployed flex-auth runtime.
- flex-auth deployed with production registry via operator tunnel, completing T1
- policy.flex_auth_url validated against deployed URL http://127.0.0.1:18090 on CoulombCore; policy.enabled intentionally remains off until testing/production maturity
- Scoped warden-sign OpenBao lane available for the smoke; no token value recorded here
- Allow smoke: warden sign agt-state-hub-bridge recorded backend vault and policy_decision_id decision:032b096c433ad80c
- Deny smoke: TTL above registry max was denied by flex-auth before OpenBao with reason ttl_out_of_bounds
- Record non-secret evidence: decision ids, reasons, actor names only
Closed on 2026-06-30 from ops-warden non-secret smoke evidence received
2026-06-29. The operator deliberately keeps policy.enabled off for now because
the ecosystem is still build-stage/pre-testing; the gate is verified and banked
for later live enforcement rather than forced into premature production rigor.
Smoke runner when token is valid:
SMOKE_VAULT=1 ~/ops-warden/scripts/policy_gate_production_smoke.sh
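The rule that smoke evidence carries only decision ids, reasons, and actor names can be made concrete with a record type that has no field for secrets. The Go sketch below is illustrative: the SmokeEvidence type and its JSON field names are assumptions, not an existing State Hub or ops-warden format. The values shown are the non-secret ones already recorded in this workplan.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SmokeEvidence is a hypothetical record shape. It deliberately has no
// field that could hold a token, key, or certificate body.
type SmokeEvidence struct {
	Actor      string `json:"actor"`
	Outcome    string `json:"outcome"` // "allow" or "deny"
	DecisionID string `json:"policy_decision_id,omitempty"`
	Reason     string `json:"reason,omitempty"`
	Backend    string `json:"backend,omitempty"`
}

// encodeEvidence renders one record as a single JSON line.
func encodeEvidence(e SmokeEvidence) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	records := []SmokeEvidence{
		{Actor: "agt-state-hub-bridge", Outcome: "allow",
			DecisionID: "decision:032b096c433ad80c", Backend: "vault"},
		// No decision id is recorded for the deny smoke, so it is left empty.
		{Actor: "agt-state-hub-bridge", Outcome: "deny", Reason: "ttl_out_of_bounds"},
	}
	for _, r := range records {
		line, err := encodeEvidence(r)
		if err != nil {
			fmt.Println("encode failed:", err)
			continue
		}
		fmt.Println(line)
	}
}
```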
T5 - IAM subject binding for production
id: FLEX-WP-0007-T05
status: done
priority: low
state_hub_task_id: "65dc3c59-1e4b-4335-b6a0-db492ea9b2b5"
Clarify how WARDEN_POLICY_SUBJECT maps to flex-auth allowed_subjects in production.
- Document production default: actor name as subject.id unless WARDEN_POLICY_SUBJECT supplies the IAM subject
- Confirm production registry allowed_subjects includes iam: entries
- Add test coverage for iam:agt-state-hub-bridge allow path
Acceptance: documented subject-id strategy; no ops-warden special-casing is required beyond existing policy behavior.
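The subject-id strategy described in this task can be sketched in Go as below. The functions resolveSubject and subjectAllowed are hypothetical names, and the exact-match comparison and the contents of the allowed list are assumptions; the source only states that the actor name is the default subject.id, that WARDEN_POLICY_SUBJECT can supply the IAM subject, and that allowed_subjects includes iam: entries.

```go
package main

import (
	"fmt"
	"os"
)

// resolveSubject returns the override (for example the value of
// WARDEN_POLICY_SUBJECT) when it is set, otherwise the actor name.
func resolveSubject(actor, override string) string {
	if override != "" {
		return override
	}
	return actor
}

// subjectAllowed reports whether the subject appears verbatim in the
// allowed_subjects list. Exact matching is an assumption of this sketch.
func subjectAllowed(subject string, allowed []string) bool {
	for _, s := range allowed {
		if s == subject {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative list; the real entries live in the production registry snapshot.
	allowed := []string{"agt-state-hub-bridge", "iam:agt-state-hub-bridge"}

	subject := resolveSubject("agt-state-hub-bridge", os.Getenv("WARDEN_POLICY_SUBJECT"))
	fmt.Println("subject.id:", subject, "allowed:", subjectAllowed(subject, allowed))
}
```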
Exit Criteria
- flex-auth production runtime reachable from CoulombCore warden path: done via flex-auth-coulombcore operator tunnel
- Production registry loaded and real inventory actors covered locally: done
- Registry sync contract published and cross-linked: done
- Joint vault-backed smoke evidence recorded: done, decision:032b096c433ad80c
- ops-warden operator has the repo-side artifacts needed to set policy.enabled: true later, when maturity posture calls for live enforcement
Implementation Notes
2026-06-23 repo-side implementation:
- Added examples/ops-warden/production_registry_snapshot.json from the ops-warden generated production registry artifact.
- Added Go coverage for production actor allows, IAM subject allow, ttl_out_of_bounds, unknown_actor_resource, production registry counts, and /healthz.
- Published docs/ops-warden-registry-sync.md and cross-linked it from the handoff and examples docs.
Closeout note:
- The OpenBao-backed smoke passed through ops-warden with the scoped warden-sign lane.
- The policy.enabled flip is intentionally deferred by operator/maturity decision, not treated as an open repo-side blocker.
- After workplan file changes, run make fix-consistency REPO=flex-auth from ~/state-hub to mirror these statuses into State Hub.
See Also
- docs/ops-warden-policy-gate-handoff.md
- docs/ops-warden-registry-sync.md
- workplans/FLEX-WP-0006-ops-warden-ssh-signing-policy-gate.md
- ~/ops-warden/wiki/PolicyGatedSigning.md
- ~/ops-warden/workplans/WARDEN-WP-0009-flex-auth-policy-gate-production.md
- ~/ops-warden/history/2026-06-23-flex-auth-production-pickup-suggestion.md
2026-06-24 operator tunnel update:
- Built /tmp/flex-auth and started the production registry runtime on local 127.0.0.1:18090.
- Added local ops-bridge tunnel flex-auth-coulombcore, forwarding CoulombCore 127.0.0.1:18090 to the local runtime.
- Verified remote health from CoulombCore: GET /healthz returned HTTP 200.
- Verified remote POST /v1/check from CoulombCore allowed agt-state-hub-bridge with decision:873c6c682a52bebc.
- VAULT_TOKEN is absent, so OpenBao-backed smoke remains blocked on operator credential refresh.
2026-06-30 closeout from ops-warden smoke handoff:
- Mode: FLEX_AUTH_EXTERNAL against deployed runtime 127.0.0.1:18090 via the CoulombCore operator path.
- Allow: warden sign agt-state-hub-bridge returned policy_decision_id decision:032b096c433ad80c.
- Deny: --ttl 999 was rejected with ttl_out_of_bounds before OpenBao signing.
- Vault-backed allow: backend vault produced the same policy_decision_id through the scoped warden-sign OpenBao lane.
- Operator decision: keep policy.enabled off during build-stage/pre-testing and flip it later when the ecosystem reaches the appropriate maturity posture.