FLEX-WP-0007: production registry fixture, tests, and sync runbook

Add production_registry_snapshot.json from ops-warden inventory with CI
coverage for real actors, IAM subject binding, ttl_out_of_bounds, and
unknown_actor_resource. Extend serve contract tests with /healthz and
publish the registry sync contract for operator deployment.
2026-06-24 14:52:35 +02:00
parent fae0f00a69
commit 941501c590
7 changed files with 981 additions and 3 deletions


@@ -80,3 +80,25 @@ integration, host documentation, and signatures.log production evidence.
No SSH private keys, OpenBao tokens, database credentials, or real public-key
material are stored in these fixtures.
## FLEX-WP-0007 Production Update
Additional published assets:
- Production registry fixture: examples/ops-warden/production_registry_snapshot.json
- Registry sync runbook: docs/ops-warden-registry-sync.md
Production runtime command:

    flex-auth serve --addr 0.0.0.0:8080 --registry examples/ops-warden/production_registry_snapshot.json --policy examples/ops-warden/policy_package.md --log /var/log/flex-auth/ops-warden-decisions.jsonl

Use http://flex-auth.flex-auth.svc.cluster.local:8080 when cluster DNS is
reachable from warden workstations. Otherwise use the approved operator tunnel
or ingress URL. Before enabling policy.enabled with fail_closed true, always run
a pre-flight GET /healthz from the same workstation.
Production actor coverage now verifies the actors agt-state-hub-bridge,
agt-codex-interhub-bootstrap, adm-example, and atm-backup-daily; the decision
reasons ttl_out_of_bounds and unknown_actor_resource; and the
iam:agt-state-hub-bridge subject path used by WARDEN_POLICY_SUBJECT.


@@ -0,0 +1,128 @@
# Ops-Warden Registry Sync
Date: 2026-06-23
Workplan: FLEX-WP-0007
This is the flex-auth side of the production policy gate runbook for ops-warden
SSH signing. ops-warden owns actor inventory and generated registry content;
flex-auth hosts that registry, evaluates the policy package, and returns the
decision envelope used by warden sign.
## Production Runtime Target
Use the NetKingdom operator-reachable service URL as the canonical
policy.flex_auth_url. The preferred target is an in-cluster flex-auth Service
fronted by the existing operator access path:

    http://flex-auth.flex-auth.svc.cluster.local:8080

If cluster DNS is not reachable from the workstation that runs warden sign, use
an approved operator tunnel or ingress URL with the same base path semantics. Do
not turn on policy.enabled with fail_closed true until this pre-flight succeeds
from the same workstation:

    curl -fsS <policy.flex_auth_url>/healthz

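The pre-flight above can also be scripted. The following is a minimal sketch, not part of flex-auth or ops-warden: the function name `healthz_ok`, its `timeout` default, and the injectable `opener` parameter are all assumptions made for illustration. It treats only HTTP 200 as healthy, which is slightly stricter than `curl -f`.

```python
import urllib.error
import urllib.request


def healthz_ok(flex_auth_url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True only if GET <flex_auth_url>/healthz answers with HTTP 200.

    `opener` is a hypothetical injection point so the check can be exercised
    without a live runtime; by default it is urllib.request.urlopen.
    """
    url = flex_auth_url.rstrip("/") + "/healthz"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, timeouts, and HTTP errors
        # (urlopen raises HTTPError, a URLError subclass, for 4xx/5xx).
        return False
```

Run it from the workstation that runs warden sign, for example `healthz_ok("http://127.0.0.1:18090")` when the operator tunnel is in use, and keep policy.enabled off while it returns False.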
Start the runtime with the production registry snapshot and the ops-warden
policy package:

    flex-auth serve --addr 0.0.0.0:8080 --registry examples/ops-warden/production_registry_snapshot.json --policy examples/ops-warden/policy_package.md --log /var/log/flex-auth/ops-warden-decisions.jsonl

The checked-in production snapshot is a non-secret fixture and initial load
target. Regenerate it from ops-warden inventory whenever actors, principals, or
TTL defaults change.
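One way to notice that the checked-in fixture has drifted from a freshly regenerated snapshot is to compare the two files by parsed content rather than by bytes. This is a sketch under assumptions: `snapshots_match` is a hypothetical helper, not an existing flex-auth or ops-warden command, and it assumes only that both files are valid JSON.

```python
import json
from pathlib import Path


def _load(path: Path):
    """Parse one snapshot file as JSON."""
    return json.loads(path.read_text(encoding="utf-8"))


def snapshots_match(generated: Path, fixture: Path) -> bool:
    """Compare two snapshots by parsed content.

    Object key order and whitespace are ignored; the order of items inside
    JSON arrays still matters.
    """
    return _load(generated) == _load(fixture)
```

A False result would suggest that either the fixture or the inventory changed and that the snapshot should be regenerated and promoted.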
## Current Operator Tunnel
As of 2026-06-24, the reachable operator-tunnel URL for CoulombCore is:

    http://127.0.0.1:18090

The tunnel name is flex-auth-coulombcore. It forwards CoulombCore
127.0.0.1:18090 to the local flex-auth runtime on 127.0.0.1:18090. Verified
checks from CoulombCore:
- GET /healthz returned HTTP 200.
- POST /v1/check for agt-state-hub-bridge returned allow with decision:873c6c682a52bebc.
This is an operator-tunnel pattern. It does not replace an in-cluster Service
if flex-auth is later run inside the cluster.
## Ownership Contract
| Concern | Owner | Notes |
| --- | --- | --- |
| Actor names and actor types | ops-warden | inventory.yaml defines adm, agt, and atm actors. |
| Default principals and TTLs | ops-warden | Used by warden sign and by generated registry attributes. |
| Registry hosting and reload | flex-auth | Runtime serves the generated snapshot and evaluates it with the policy package. |
| Policy package semantics | flex-auth | examples/ops-warden/policy_package.md owns allow and deny reasons. |
| OpenBao SSH signing | ops-warden | flex-auth never receives SSH private keys or Vault tokens. |
| Production policy.enabled flip | ops-warden operator | Only after healthz and allow/deny smoke pass. |
## Sync Procedure
1. In ops-warden, update the managed inventory source or ~/.config/warden/inventory.yaml.
2. Regenerate the flex-auth snapshot from ops-warden:

       python scripts/build_flex_auth_registry.py ~/.config/warden/inventory.yaml -o registry/flex-auth/production_registry_snapshot.json

3. Validate the generated file before handoff:

       flex-auth load-registry --file registry/flex-auth/production_registry_snapshot.json

4. Copy or promote the snapshot to the flex-auth runtime. For repo-level drift
   coverage, update examples/ops-warden/production_registry_snapshot.json when
   the intended production fixture changes.
5. Restart or reload the flex-auth runtime with the new snapshot.
6. From the workstation that runs warden sign, verify:

       curl -fsS <policy.flex_auth_url>/healthz

7. Run one allow smoke and one deny smoke. Record only non-secret evidence:
   actor name, decision id, effect, reason, backend, and whether a certificate
   was issued.
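The "record only non-secret evidence" rule in the last step can be enforced with an allowlist, so that anything not explicitly named is dropped before it reaches a log or ticket. This is a sketch only: the field names in `EVIDENCE_FIELDS` and the helper `evidence_record` are hypothetical spellings of the six items listed above, not a schema defined by flex-auth or ops-warden.

```python
# Hypothetical field names for the six evidence items named in the procedure.
EVIDENCE_FIELDS = (
    "actor",
    "decision_id",
    "effect",
    "reason",
    "backend",
    "certificate_issued",
)


def evidence_record(raw: dict) -> dict:
    """Keep only allowlisted evidence fields; reject incomplete records."""
    missing = [field for field in EVIDENCE_FIELDS if field not in raw]
    if missing:
        raise ValueError(f"missing evidence fields: {', '.join(missing)}")
    # Any key outside the allowlist (tokens, key material, etc.) is dropped.
    return {field: raw[field] for field in EVIDENCE_FIELDS}
```

An allowlist is used instead of a denylist so that a newly added sensitive field is excluded by default.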
## Current Production Fixture
The initial fixture mirrors ops-warden production inventory as of 2026-06-23.
It registers:
| Actor | Type | Principal | Max TTL hours | Allowed subjects |
| --- | --- | --- | --- | --- |
| adm-example | adm | adm-full | 48 | adm-example, iam:adm-example |
| agt-codex-interhub-bootstrap | agt | agt-interhub-bootstrap | 2 | agt-codex-interhub-bootstrap, iam:agt-codex-interhub-bootstrap |
| agt-state-hub-bridge | agt | agt-task-bridge | 24 | agt-state-hub-bridge, iam:agt-state-hub-bridge |
| atm-backup-daily | atm | atm-backup-daily | 8 | atm-backup-daily, iam:atm-backup-daily |
The IAM subject form is intended for WARDEN_POLICY_SUBJECT. If that environment
variable is unset, ops-warden sends the actor name and the same policy path
continues to work.
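The subject fallback described above can be sketched as follows. The helper names `policy_subject` and `subject_allowed` are hypothetical, the allowed-subject sets are copied from the fixture table, and treating an empty WARDEN_POLICY_SUBJECT the same as an unset one is an assumption rather than documented ops-warden behavior.

```python
import os

# Allowed subjects per actor, copied from the fixture table above.
ALLOWED_SUBJECTS = {
    "adm-example": {"adm-example", "iam:adm-example"},
    "agt-codex-interhub-bootstrap": {
        "agt-codex-interhub-bootstrap",
        "iam:agt-codex-interhub-bootstrap",
    },
    "agt-state-hub-bridge": {"agt-state-hub-bridge", "iam:agt-state-hub-bridge"},
    "atm-backup-daily": {"atm-backup-daily", "iam:atm-backup-daily"},
}


def policy_subject(actor: str, env=os.environ) -> str:
    """Use WARDEN_POLICY_SUBJECT when set, otherwise fall back to the actor name.

    Assumption: an empty value is treated like an unset variable.
    """
    return env.get("WARDEN_POLICY_SUBJECT") or actor


def subject_allowed(actor: str, subject: str) -> bool:
    """Check a subject against the fixture's allowed subjects for the actor."""
    return subject in ALLOWED_SUBJECTS.get(actor, set())
```

With the variable unset, `policy_subject("agt-state-hub-bridge", env={})` yields the plain actor name, which the fixture also lists as an allowed subject.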
## Smoke Expectations
Allow path:

    warden sign agt-state-hub-bridge

Expected non-secret evidence: decision effect allow, reason
signing_policy_matched, signatures.log includes policy_decision_id.
Deny path:

    warden sign agt-state-hub-bridge --ttl 999

Expected non-secret evidence: effect deny, reason ttl_out_of_bounds, no
certificate issued. With fail_closed true, unreachable flex-auth must also block
signing.
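The expected outcomes above can be modelled in a few lines. This is an illustrative sketch, not the policy package: the authoritative semantics live in examples/ops-warden/policy_package.md. It assumes that --ttl is expressed in hours, that a TTL must be positive, that an unregistered actor yields unknown_actor_resource, and that signing proceeds when flex-auth is unreachable and fail_closed is false. The source states only the fail_closed true case.

```python
# Max TTL hours per actor, copied from the fixture table.
MAX_TTL_HOURS = {
    "adm-example": 48,
    "agt-codex-interhub-bootstrap": 2,
    "agt-state-hub-bridge": 24,
    "atm-backup-daily": 8,
}


def check_ttl(actor: str, ttl_hours: float):
    """Return (effect, reason) for a sign request under the assumptions above."""
    if actor not in MAX_TTL_HOURS:
        return ("deny", "unknown_actor_resource")
    if not 0 < ttl_hours <= MAX_TTL_HOURS[actor]:
        return ("deny", "ttl_out_of_bounds")
    return ("allow", "signing_policy_matched")


def may_sign(decision, fail_closed: bool) -> bool:
    """Decide whether signing may proceed.

    `decision` is None when flex-auth could not be reached.
    """
    if decision is None:
        # Unreachable runtime: block when fail_closed is true.
        return not fail_closed
    effect, _reason = decision
    return effect == "allow"
```

For the deny smoke, `check_ttl("agt-state-hub-bridge", 999)` exceeds the 24-hour bound in the fixture, so no certificate should be issued.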
OpenBao-backed signing remains an operator smoke because it requires a scoped
VAULT_TOKEN. The previous attempt on 2026-06-23 returned HTTP 403. After
refreshing the token, retry with:

    SMOKE_VAULT=1 ~/ops-warden/scripts/policy_gate_production_smoke.sh

## References
- docs/ops-warden-policy-gate-handoff.md
- examples/ops-warden/production_registry_snapshot.json
- ~/ops-warden/wiki/PolicyGatedSigning.md
- ~/ops-warden/history/2026-06-23-flex-auth-policy-gate-production-smoke.md


@@ -25,6 +25,7 @@ This document captures the current sequencing view for flex-auth workplans.
| `FLEX-WP-0003` | complete | completed | `FLEX-WP-0002` | Markitect consumer integration and first CARING benchmark are complete: resource namespace, manifest import, action vocabulary, descriptor fixtures, decision fixtures, integration docs. |
| `FLEX-WP-0004` | complete | completed | `FLEX-WP-0002`, `FLEX-WP-0005` | Delegated PDP and directory adapter boundary work is complete: Topaz adapter shape, OpenFGA/SpiceDB, OPA/Cedar, Keycloak Authorization Services, Entra/Graph/SCIM, CARING envelope preservation. |
| `FLEX-WP-0006` | complete | finished | `FLEX-WP-0002`, `FLEX-WP-0005` | Ops-warden unblocker is complete: flex-auth publishes `ssh-certificate` / `sign` policies, fixtures, and `/v1/check` smoke evidence for the opt-in pre-sign gate shipped in ops-warden `WARDEN-WP-0007` and tracked for production in `WARDEN-WP-0009`. |
| `FLEX-WP-0007` | `P0` | blocked | `FLEX-WP-0006` | Repo-side production registry fixture, sync contract, runtime command, healthz coverage, and real actor/IAM tests are implemented. Operator deployment and OpenBao smoke remain blocked on reachable runtime selection and scoped VAULT_TOKEN refresh. |
## Dependency Notes
@@ -79,5 +80,6 @@ Native State Hub dependency edges:
- `FLEX-WP-0004 -> FLEX-WP-0005` (Topaz adapter consumes the spike)
- `FLEX-WP-0006 -> FLEX-WP-0002`
- `FLEX-WP-0006 -> FLEX-WP-0005`
- ops-warden: `WARDEN-WP-0009` waits for `FLEX-WP-0006` output before
production enablement of `policy.enabled`.
- ops-warden: `WARDEN-WP-0009` finished (caller + registry smoke). Production
`policy.enabled: true` waits for `FLEX-WP-0007` (reachable flex-auth runtime).
- `FLEX-WP-0007 -> FLEX-WP-0006`