FLEX-WP-0007: production registry fixture, tests, and sync runbook
Add production_registry_snapshot.json from ops-warden inventory with CI coverage for real actors, IAM subject binding, ttl_out_of_bounds, and unknown_actor_resource. Extend serve contract tests with /healthz and publish the registry sync contract for operator deployment.
@@ -0,0 +1,211 @@
---
id: FLEX-WP-0007
type: workplan
title: "Ops-Warden Policy Gate Production Deployment"
domain: infotech
repo: flex-auth
status: blocked
owner: codex
topic_slug: flex-auth
planning_priority: P0
planning_order: 70
depends_on_workplans:
  - FLEX-WP-0006
related_workplans:
  - WARDEN-WP-0009
created: "2026-06-23"
updated: "2026-06-23"
state_hub_workstream_id: "358ce697-2611-4fe9-89ab-63e86ceb00fa"
---

# FLEX-WP-0007: Ops-Warden Policy Gate Production Deployment

## Purpose

Deploy flex-auth as a reachable production runtime for ops-warden's opt-in SSH
signing policy gate, load a production registry aligned with real inventory
actors, and complete joint smoke evidence so operators can set policy.enabled:
true in warden.yaml.

Review update: repo-side production readiness is now separated from
operator-only work. flex-auth can publish the production fixture, tests,
runtime command, and sync contract in this repo. The actual stable URL
deployment and OpenBao smoke remain blocked because they need NetKingdom
reachability and a refreshed scoped VAULT_TOKEN.

## Background

ops-warden finished WARDEN-WP-0009 on the caller side: local and
production-registry smoke passed, and the production registry generator exists.
The remaining risk is operational, not policy shape: warden workstations need a
reachable flex-auth URL, and the vault-backed joint smoke needs a valid scoped
VAULT_TOKEN.

Production registry artifacts:

- flex-auth fixture: examples/ops-warden/production_registry_snapshot.json
- ops-warden source artifact: ~/ops-warden/registry/flex-auth/production_registry_snapshot.json
- ops-warden generator: ~/ops-warden/scripts/build_flex_auth_registry.py

## Ownership Boundary

| Concern | Owner |
| --- | --- |
| Policy package and PDP decision | flex-auth |
| Actor inventory and TTL/principal defaults | ops-warden |
| SSH CA and OpenBao signing | ops-warden |
| Production registry content for SSH actors | Joint: ops-warden generates, flex-auth hosts |
| policy.enabled flip | ops-warden operator after flex-auth is reachable |

No SSH private keys, OpenBao tokens, or other secrets belong in fixtures, docs,
State Hub messages, or smoke evidence.

## T1 - Deploy production flex-auth runtime

```task
id: FLEX-WP-0007-T01
status: done
priority: high
state_hub_task_id: "727573fc-86a3-4f5a-abd7-40b0ccb01e68"
```

Deploy flex-auth serve, or equivalent, to a stable URL reachable from
workstations that run warden sign.

- [x] Choose preferred target: in-cluster Service at http://flex-auth.flex-auth.svc.cluster.local:8080 when reachable; otherwise approved operator tunnel or ingress with the same base path
- [x] Document canonical policy.flex_auth_url selection in docs/ops-warden-registry-sync.md
- [x] Document healthz pre-flight: GET /healthz returns HTTP 200
- [x] Add service test coverage for /healthz
- [x] Operator tunnel deployed as flex-auth-coulombcore and confirmed POST /v1/check is reachable from CoulombCore

Acceptance: operator runs curl <flex_auth_url>/healthz from the warden
workstation and receives HTTP 200. Verified from CoulombCore on 2026-06-24 with
flex_auth_url http://127.0.0.1:18090.
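
The healthz pre-flight above can also be scripted. The following is a minimal Python sketch, not part of flex-auth or ops-warden: the function names `healthz_url`, `http_status`, and `preflight` are hypothetical, and the only behavior taken from this workplan is that GET /healthz on the configured flex_auth_url must return HTTP 200.

```python
"""Pre-flight sketch: confirm <flex_auth_url>/healthz answers HTTP 200."""
import urllib.error
import urllib.request


def healthz_url(flex_auth_url: str) -> str:
    # Tolerate a trailing slash on the configured base URL.
    return flex_auth_url.rstrip("/") + "/healthz"


def http_status(url: str, timeout: float) -> int:
    # Return the HTTP status code, or 0 when the endpoint is unreachable.
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses still carry a status code worth reporting.
        return err.code
    except (urllib.error.URLError, OSError):
        return 0


def preflight(flex_auth_url: str, fetch=http_status, timeout: float = 3.0) -> bool:
    # The acceptance criterion is exactly HTTP 200, nothing looser.
    # `fetch` is injectable so the check can be exercised without a network.
    return fetch(healthz_url(flex_auth_url), timeout) == 200
```

From the warden workstation, `preflight("http://127.0.0.1:18090")` would be expected to return True only while the operator tunnel and the runtime behind it are both up.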

## T2 - Load production registry and verify real actors

```task
id: FLEX-WP-0007-T02
status: done
priority: high
state_hub_task_id: "6ec1e00c-4a3a-475b-aefb-af3961de7070"
```

Load the production registry snapshot derived from ops-warden inventory, not
only the template actors in examples/ops-warden/registry_snapshot.json.

- [x] Add examples/ops-warden/production_registry_snapshot.json from the ops-warden generated artifact
- [x] Document regenerate and load procedure in docs/ops-warden-registry-sync.md
- [x] Verify allow for agt-state-hub-bridge / sign
- [x] Verify deny for ttl_out_of_bounds
- [x] Verify deny for unregistered actors with unknown_actor_resource
- [x] Add CI tests using production actor names: agt-state-hub-bridge, agt-codex-interhub-bootstrap, adm-example, atm-backup-daily

Acceptance: local flex-auth coverage allows agt-state-hub-bridge without
ops-warden-local registry patching. Deployed runtime verification remains part
of T1.
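
The three outcomes verified above (allow, ttl_out_of_bounds, unknown_actor_resource) can be pictured with a toy in-memory model. This is an illustrative Python sketch only and is not the flex-auth policy package: the registry shape, the `max_ttl_seconds` field, the 3600-second limit, and the `"allow"` reason string are all assumptions. Only the actor name and the two deny reasons come from this workplan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    allow: bool
    reason: str


# Hypothetical registry shape; the real snapshot schema may differ.
REGISTRY = {
    "agt-state-hub-bridge": {"max_ttl_seconds": 3600},  # assumed limit
}


def check(registry: dict, actor: str, ttl_seconds: int) -> Decision:
    entry = registry.get(actor)
    if entry is None:
        # Actor is not in the registry at all.
        return Decision(False, "unknown_actor_resource")
    if not 0 < ttl_seconds <= entry["max_ttl_seconds"]:
        # Requested TTL falls outside the registered bounds.
        return Decision(False, "ttl_out_of_bounds")
    return Decision(True, "allow")  # "allow" is a placeholder reason
```

For example, `check(REGISTRY, "agt-state-hub-bridge", 600)` allows, while the same actor with a TTL above the assumed maximum is denied with ttl_out_of_bounds, and any name missing from the registry is denied with unknown_actor_resource.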

## T3 - Publish registry sync contract with ops-warden

```task
id: FLEX-WP-0007-T03
status: done
priority: medium
state_hub_task_id: "afa09ec3-516c-433d-87a7-330cb79845a8"
```

Document the two-repo workflow when inventory or policy boundaries change.

- [x] Publish docs/ops-warden-registry-sync.md
- [x] Cover ops-warden ownership of actor names, actor types, principals, and TTL defaults
- [x] Cover flex-auth ownership of hosted registry, relationships, and policy package evaluation
- [x] Document trigger: inventory add/change -> regenerate snapshot -> flex-auth reload
- [x] Cross-link from docs/ops-warden-policy-gate-handoff.md
- [x] Confirm ops-warden wiki/PolicyGatedSigning.md already points to the flex-auth handoff; flex-auth now points back from the sync runbook

Acceptance: a new agt-* actor addition has an unambiguous procedure across both
repos.
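
One way to decide when the trigger above fires is to compare inventory actor names with the names in the hosted snapshot. The Python sketch below is a hedged illustration, not the published sync contract: it assumes the snapshot has a top-level `actors` list of objects with a `name` field, which this workplan does not confirm, and the function names are hypothetical.

```python
def actor_names(snapshot: dict) -> set:
    # Assumed shape: {"actors": [{"name": "agt-..."}, ...]}.
    return {actor["name"] for actor in snapshot.get("actors", [])}


def registry_drift(inventory_names, snapshot: dict) -> dict:
    wanted = set(inventory_names)
    hosted = actor_names(snapshot)
    return {
        # In ops-warden inventory but not yet hosted by flex-auth.
        "missing_from_snapshot": sorted(wanted - hosted),
        # Hosted by flex-auth but no longer in ops-warden inventory.
        "stale_in_snapshot": sorted(hosted - wanted),
    }


def needs_regeneration(drift: dict) -> bool:
    # Any difference means: regenerate the snapshot, then reload flex-auth.
    return bool(drift["missing_from_snapshot"] or drift["stale_in_snapshot"])
```

A check like this only detects name drift. Changes to principals or TTL defaults would need a field-by-field comparison, which depends on the real snapshot schema.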

## T4 - Joint OpenBao + policy gate production smoke

```task
id: FLEX-WP-0007-T04
status: wait
priority: medium
state_hub_task_id: "32a96f1c-e0e8-4e27-baa6-7b8c445cf7a1"
```

Coordinate with ops-warden for vault-backed signing through the deployed
flex-auth runtime.

- [x] flex-auth deployed with production registry via operator tunnel, completing T1
- [ ] ops-warden policy.enabled: true and policy.flex_auth_url points to deployed URL http://127.0.0.1:18090 on CoulombCore
- [ ] Valid scoped VAULT_TOKEN with warden-sign policy, operator-provided
- [ ] Allow smoke: warden sign agt-state-hub-bridge records backend vault and policy_decision_id
- [ ] Deny smoke: TTL above registry max is denied by flex-auth before OpenBao
- [ ] Record non-secret evidence: decision ids, reasons, actor names only

Blocked on: scoped VAULT_TOKEN refresh. Previous ops-warden session returned
HTTP 403 on 2026-06-23; no VAULT_TOKEN is present in this session.

Smoke runner when token is valid:

SMOKE_VAULT=1 ~/ops-warden/scripts/policy_gate_production_smoke.sh
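
The evidence rule in the checklist above (decision ids, reasons, actor names only) lends itself to an allow-list filter applied before anything is written down. This Python sketch is an assumption-laden illustration, not the smoke script's behavior: `policy_decision_id` appears in this workplan, while the `actor` and `reason` key names and the `evidence_record` helper are hypothetical.

```python
# Only these keys may appear in recorded smoke evidence.
ALLOWED_KEYS = ("actor", "policy_decision_id", "reason")


def evidence_record(raw: dict) -> dict:
    # Copy allow-listed keys only; tokens, keys, and certificates
    # are dropped because they are never named in ALLOWED_KEYS.
    return {key: raw[key] for key in ALLOWED_KEYS if key in raw}
```

An allow-list is used rather than a deny-list so that a new secret-bearing field added later is excluded by default.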

## T5 - IAM subject binding for production

```task
id: FLEX-WP-0007-T05
status: done
priority: low
state_hub_task_id: "65dc3c59-1e4b-4335-b6a0-db492ea9b2b5"
```

Clarify how WARDEN_POLICY_SUBJECT maps to flex-auth allowed_subjects in
production.

- [x] Document production default: actor name as subject.id unless WARDEN_POLICY_SUBJECT supplies the IAM subject
- [x] Confirm production registry allowed_subjects includes iam:<actor> entries
- [x] Add test coverage for iam:agt-state-hub-bridge allow path

Acceptance: documented subject-id strategy; no ops-warden special-casing is
required beyond existing policy behavior.
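
The subject-id default described above reduces to a small lookup. The Python sketch below is a hedged illustration rather than ops-warden's code: `resolve_subject` is a hypothetical name, and treating an empty or whitespace-only WARDEN_POLICY_SUBJECT as unset is an assumption.

```python
import os


def resolve_subject(actor: str, env=None) -> str:
    # Use the process environment unless a mapping is passed in for testing.
    env = os.environ if env is None else env
    override = env.get("WARDEN_POLICY_SUBJECT", "").strip()
    # Default: the actor name is the subject.id.
    # Override: WARDEN_POLICY_SUBJECT supplies the IAM subject instead.
    return override or actor
```

With no override, `resolve_subject("agt-state-hub-bridge", {})` yields the actor name. With WARDEN_POLICY_SUBJECT set to iam:agt-state-hub-bridge it yields that IAM subject, which must then match an entry in the registry's allowed_subjects.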

## Exit Criteria

- flex-auth production runtime reachable from CoulombCore warden path: done via flex-auth-coulombcore operator tunnel
- Production registry loaded and real inventory actors covered locally: done
- Registry sync contract published and cross-linked: done
- Joint vault-backed smoke evidence recorded, or T4 explicitly waits on token: T4 waits on scoped VAULT_TOKEN
- ops-warden operator has the repo-side artifacts needed to set policy.enabled: true after the stable URL and token are ready

## Implementation Notes

2026-06-23 repo-side implementation:

- Added examples/ops-warden/production_registry_snapshot.json from the ops-warden generated production registry artifact.
- Added Go coverage for production actor allows, IAM subject allow, ttl_out_of_bounds, unknown_actor_resource, production registry counts, and /healthz.
- Published docs/ops-warden-registry-sync.md and cross-linked it from the handoff and examples docs.

Remaining blocked work:

- Operator refreshes scoped VAULT_TOKEN and reruns the OpenBao-backed smoke.
- After workplan file changes, run make fix-consistency REPO=flex-auth from ~/state-hub to mirror these statuses into State Hub.

## See Also

- docs/ops-warden-policy-gate-handoff.md
- docs/ops-warden-registry-sync.md
- workplans/FLEX-WP-0006-ops-warden-ssh-signing-policy-gate.md
- ~/ops-warden/wiki/PolicyGatedSigning.md
- ~/ops-warden/workplans/WARDEN-WP-0009-flex-auth-policy-gate-production.md
- ~/ops-warden/history/2026-06-23-flex-auth-production-pickup-suggestion.md

2026-06-24 operator tunnel update:

- Built /tmp/flex-auth and started the production registry runtime on local 127.0.0.1:18090.
- Added local ops-bridge tunnel flex-auth-coulombcore, forwarding CoulombCore 127.0.0.1:18090 to the local runtime.
- Verified remote health from CoulombCore: GET /healthz returned HTTP 200.
- Verified remote POST /v1/check from CoulombCore allowed agt-state-hub-bridge with decision:873c6c682a52bebc.
- VAULT_TOKEN is absent, so OpenBao-backed smoke remains blocked on operator credential refresh.