flex-auth/docs/ops-warden-registry-sync.md

# Ops-Warden Registry Sync

Date: 2026-06-23
Workplan: FLEX-WP-0007

This is the flex-auth side of the production policy gate runbook for ops-warden
SSH signing. ops-warden owns actor inventory and generated registry content;
flex-auth hosts that registry, evaluates the policy package, and returns the
decision envelope used by warden sign.

## Production Runtime Target

Use the NetKingdom operator-reachable service URL as the canonical
policy.flex_auth_url. The preferred target is an in-cluster flex-auth Service
fronted by the existing operator access path:

    http://flex-auth.flex-auth.svc.cluster.local:8080

If cluster DNS is not reachable from the workstation that runs warden sign, use
an approved operator tunnel or ingress URL with the same base path semantics. Do
not turn on policy.enabled with fail_closed set to true until the following
pre-flight check succeeds from that same workstation:

    curl -fsS <policy.flex_auth_url>/healthz
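
For scripted pre-flight checks, the same probe can be sketched in Python. This is a minimal illustration, not part of flex-auth or ops-warden: the function name `preflight_healthz` and the 5-second timeout are assumptions, and the check is slightly stricter than `curl -f` because it accepts only HTTP 200.

```python
import urllib.error
import urllib.request


def preflight_healthz(base_url: str, timeout: float = 5.0) -> bool:
    """Return True only if GET <base_url>/healthz answers with HTTP 200.

    Hypothetical helper: base_url stands in for policy.flex_auth_url.
    """
    url = base_url.rstrip("/") + "/healthz"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection refused, DNS failure, timeouts, and HTTP error
        # statuses (HTTPError is a subclass of URLError).
        return False
```

A caller could refuse to flip policy.enabled while `preflight_healthz(url)` returns False for the configured URL.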

Start the runtime with the production registry snapshot and the ops-warden
policy package:

    flex-auth serve --addr 0.0.0.0:8080 \
      --registry examples/ops-warden/production_registry_snapshot.json \
      --policy examples/ops-warden/policy_package.md \
      --log /var/log/flex-auth/ops-warden-decisions.jsonl

The checked-in production snapshot is a non-secret fixture and initial load
target. Regenerate it from ops-warden inventory whenever actors, principals, or
TTL defaults change.

## Current Operator Tunnel

As of 2026-06-24, the reachable operator-tunnel URL for CoulombCore is:

    http://127.0.0.1:18090

The tunnel is named flex-auth-coulombcore. It forwards 127.0.0.1:18090 on
CoulombCore to the local flex-auth runtime on 127.0.0.1:18090. The following
checks were verified from CoulombCore:

- GET /healthz returned HTTP 200.
- POST /v1/check for agt-state-hub-bridge returned allow with decision:873c6c682a52bebc.

This is an operator-tunnel pattern. It does not replace an in-cluster Service
if flex-auth later runs inside the cluster.

## Ownership Contract

| Concern | Owner | Notes |
| --- | --- | --- |
| Actor names and actor types | ops-warden | inventory.yaml defines adm, agt, and atm actors. |
| Default principals and TTLs | ops-warden | Used by warden sign and by generated registry attributes. |
| Registry hosting and reload | flex-auth | Runtime serves the generated snapshot and evaluates it with the policy package. |
| Policy package semantics | flex-auth | examples/ops-warden/policy_package.md owns allow and deny reasons. |
| OpenBao SSH signing | ops-warden | flex-auth never receives SSH private keys or Vault tokens. |
| Production policy.enabled flip | ops-warden operator | Only after the healthz check and the allow/deny smoke tests pass. |

## Sync Procedure

1. In ops-warden, update the managed inventory source or ~/.config/warden/inventory.yaml.
2. Regenerate the flex-auth snapshot from ops-warden:

       python scripts/build_flex_auth_registry.py ~/.config/warden/inventory.yaml \
         -o registry/flex-auth/production_registry_snapshot.json

3. Validate the generated file before handoff:

       flex-auth load-registry --file registry/flex-auth/production_registry_snapshot.json

4. Copy or promote the snapshot to the flex-auth runtime. For repo-level drift
   coverage, update examples/ops-warden/production_registry_snapshot.json when
   the intended production fixture changes.
5. Restart or reload the flex-auth runtime with the new snapshot.
6. From the workstation that runs warden sign, verify:

       curl -fsS <policy.flex_auth_url>/healthz

7. Run one allow smoke and one deny smoke. Record only non-secret evidence:
   actor name, decision id, effect, reason, backend, and whether a certificate
   was issued.
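
One way to keep step 7 honest is to build the evidence record from an allowlist, so anything outside the six named fields is dropped before it is written down. The sketch below is an illustration only: the key spellings in `EVIDENCE_FIELDS` and the function name `non_secret_evidence` are hypothetical, and the real ops-warden log format may differ.

```python
import json

# The six evidence fields named in step 7. The key spellings are hypothetical.
EVIDENCE_FIELDS = (
    "actor",
    "decision_id",
    "effect",
    "reason",
    "backend",
    "certificate_issued",
)


def non_secret_evidence(raw: dict) -> str:
    """Return a one-line JSON record holding only the allowlisted fields.

    Keys outside EVIDENCE_FIELDS (tokens, key material, request bodies)
    are silently dropped; a missing evidence field raises ValueError.
    """
    missing = [name for name in EVIDENCE_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing evidence fields: {missing}")
    record = {name: raw[name] for name in EVIDENCE_FIELDS}
    return json.dumps(record, sort_keys=True)
```

An allowlist is used rather than a blocklist so that a new, unexpected field in a smoke result cannot leak into the recorded evidence by default.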

## Current Production Fixture

The initial fixture mirrors ops-warden production inventory as of 2026-06-23.
It registers:

| Actor | Type | Principal | Max TTL hours | Allowed subjects |
| --- | --- | --- | --- | --- |
| adm-example | adm | adm-full | 48 | adm-example, iam:adm-example |
| agt-codex-interhub-bootstrap | agt | agt-interhub-bootstrap | 2 | agt-codex-interhub-bootstrap, iam:agt-codex-interhub-bootstrap |
| agt-state-hub-bridge | agt | agt-task-bridge | 24 | agt-state-hub-bridge, iam:agt-state-hub-bridge |
| atm-backup-daily | atm | atm-backup-daily | 8 | atm-backup-daily, iam:atm-backup-daily |

The iam: subject form (for example, iam:agt-state-hub-bridge) is the value
intended for WARDEN_POLICY_SUBJECT. If that environment variable is unset,
ops-warden sends the bare actor name, which is also an allowed subject, so the
same policy path continues to work.
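
The subject fallback can be sketched as follows. This is an illustration of the behavior described here, not ops-warden's actual code: the function names are hypothetical, the allowed-subject sets are copied from the fixture table, and treating an empty WARDEN_POLICY_SUBJECT the same as an unset one is an assumption.

```python
import os
from typing import Mapping, Optional

# Allowed subjects per actor, copied from the fixture table above.
ALLOWED_SUBJECTS = {
    actor: {actor, "iam:" + actor}
    for actor in (
        "adm-example",
        "agt-codex-interhub-bootstrap",
        "agt-state-hub-bridge",
        "atm-backup-daily",
    )
}


def policy_subject(actor: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return WARDEN_POLICY_SUBJECT if set, otherwise the actor name."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated like an unset variable.
    return env.get("WARDEN_POLICY_SUBJECT") or actor


def subject_allowed(actor: str, subject: str) -> bool:
    """Check a subject against the fixture's allowed subjects for an actor."""
    return subject in ALLOWED_SUBJECTS.get(actor, set())
```

Both `policy_subject("agt-state-hub-bridge", {})` and the same call with WARDEN_POLICY_SUBJECT set to iam:agt-state-hub-bridge produce a subject that `subject_allowed` accepts for that actor.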

## Smoke Expectations

Allow path:

    warden sign agt-state-hub-bridge

Expected non-secret evidence: effect allow, reason signing_policy_matched, and
a signatures.log entry that includes policy_decision_id.

Deny path:

    warden sign agt-state-hub-bridge --ttl 999

Expected non-secret evidence: effect deny, reason ttl_out_of_bounds, and no
certificate issued. With fail_closed set to true, an unreachable flex-auth must
also block signing.
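
The two expectations above can be modeled in a few lines. This is a sketch under stated assumptions, not the policy package or the warden sign implementation: it assumes --ttl is expressed in hours, that only the TTL bound matters, and that fail_closed false lets signing proceed when flex-auth is unreachable (this document only specifies the fail_closed true case). All function names are hypothetical.

```python
from typing import Callable, Tuple

# Max TTL hours per actor, copied from the fixture table in this document.
MAX_TTL_HOURS = {
    "adm-example": 48,
    "agt-codex-interhub-bootstrap": 2,
    "agt-state-hub-bridge": 24,
    "atm-backup-daily": 8,
}


def expected_decision(actor: str, ttl_hours: int) -> Tuple[str, str]:
    """Effect and reason the smoke tests expect for a requested TTL.

    Assumption: TTL is in hours and must be positive and within the max.
    """
    if 0 < ttl_hours <= MAX_TTL_HOURS[actor]:
        return ("allow", "signing_policy_matched")
    return ("deny", "ttl_out_of_bounds")


def may_sign(fetch_decision: Callable[[], dict], fail_closed: bool) -> bool:
    """Decide whether signing may proceed, given a decision fetcher.

    fetch_decision is a hypothetical callable that returns the decision
    envelope as a dict or raises OSError when flex-auth is unreachable.
    """
    try:
        decision = fetch_decision()
    except OSError:
        # Unreachable flex-auth: block when fail_closed, else proceed
        # (the fail-open branch is an assumption, not stated in this runbook).
        return not fail_closed
    return decision.get("effect") == "allow"
```

Under these assumptions, `expected_decision("agt-state-hub-bridge", 999)` yields the deny outcome because 999 exceeds the 24-hour maximum in the fixture.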

OpenBao-backed signing remains an operator-run smoke test because it requires a
scoped VAULT_TOKEN. The previous session, on 2026-06-23, returned HTTP 403;
retry with:

    SMOKE_VAULT=1 ~/ops-warden/scripts/policy_gate_production_smoke.sh

## References

- docs/ops-warden-policy-gate-handoff.md
- examples/ops-warden/production_registry_snapshot.json
- ~/ops-warden/wiki/PolicyGatedSigning.md
- ~/ops-warden/history/2026-06-23-flex-auth-policy-gate-production-smoke.md