ops-warden/wiki/AccessRouting.md
tegwick 1237cc767b Complete WARDEN-WP-0012 routing scenario playbooks
Add platform-secret playbooks for issue-core ingestion, OpenRouter llm-connect,
object-storage STS, and database dynamic credentials. Extend the routing catalog
with draft entries and implement `warden route list --stale` for quarterly drift
review. Document the review cadence in AccessRouting and mark the workplan finished.
2026-06-25 10:27:23 +02:00


# Access Routing — what ops-warden answers
Date: 2026-06-18
ops-warden **issues short-lived SSH certificates** and **routes every other
credential need to the subsystem that owns it.** This page states that role
plainly so it cannot be misread as a desk that wraps the platform.
- **What ops-warden executes:** the SSH certificate lane only (`warden sign`,
`cert_command`, `ops-ssh-wrapper`).
- **What ops-warden answers:** *where* a credential need belongs and *who owns it*,
pointing at the owner's docs and never restating their procedure.
- **What ops-warden never does:** vend API keys, log you in, decide policy, open
tunnels, or deploy hosts.
For the worker-facing decision tree see `CredentialRouting.md`; for component
literacy see `NetKingdomSecurityMap.md`. This page is the steward's statement of
**role and boundary**.
---
## Issue vs route
| Need | Subsystem | ops-warden role | Who acts |
| --- | --- | --- | --- |
| SSH cert for host/ops access (`adm`/`agt`/`atm`) | **ops-warden** | **Issue** (`warden sign`) | ops-warden signs; worker uses cert |
| API key / DB cred / dynamic lease | OpenBao | Route — point at path | Worker calls OpenBao |
| "May I perform action X?" | flex-auth (+ Topaz PDP) | Route — point at policy | Worker/PEP calls flex-auth |
| Login / OIDC token / MFA | key-cape / Keycloak | Route — point at IAM Profile | Worker authenticates |
| Object-storage STS / S3 creds | net-kingdom + flex-auth + OpenBao | Route — point at vending path | Worker follows NK-WP-0007 |
| SSH tunnel / port forward | ops-bridge | Route — supply `cert_command` | ops-bridge opens tunnel |
| Host principal / force-command | railiance-infra | Route — point at Ansible | infra deploys host |
| OpenBao cluster init / unseal | railiance-platform | Route — point at ceremony | platform operates |
Only the first row is something ops-warden **executes**. Every other row is a
**pointer**: ops-warden names the owner and the doc, and the worker acts on the
owning system directly.
---
## Anti-patterns (not coming to ops-warden)
These commands do **not** exist and will **not** be added — they belong to other
subsystems. If you find yourself wanting one, you are on the wrong desk:
| Tempting command | Why it's wrong | Right path |
| --- | --- | --- |
| `warden secret` / `warden bao` | ops-warden does not store or vend secrets | OpenBao |
| `warden login` | ops-warden does not establish identity | key-cape / Keycloak |
| `warden policy` | ops-warden does not decide authorization | flex-auth |
| `warden tunnel` | ops-warden does not manage transport | ops-bridge |
ops-warden authors step-by-step procedure for exactly one lane — SSH issuance —
because it owns it. For everything else it carries a **pointer**, not a fork of
the owner's runbook. See the no-double-source rule in
`workplans/WARDEN-WP-0010-access-routing-charter.md`.
---
## Routing lookup CLI (`warden route`)
Agents and operators query the pointer catalog directly instead of re-deriving
routing from wiki prose. The command group is **read-only** — it never calls
OpenBao, flex-auth, key-cape, or any other subsystem, and never returns secret
material.
```bash
warden route list [--json] [--all] [--tag <keyword>] # active-only unless --all
warden route list --stale [--stale-days 90] [--all] [--json] # past review cadence
warden route show <id> [--json] # owner + pointers; SSH adds steps
warden route find "<free text need>" [--json] [--all] # rank by keyword overlap
```
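The synopsis describes `find` only as "rank by keyword overlap". A minimal Python sketch of that idea follows, assuming each catalog entry carries an `id` and a `need_keywords` list (the field named in the cadence table further down). The tokenizer, the scoring rule, and the sample entries are illustrative assumptions, not ops-warden's actual implementation.

```python
import re

# Hypothetical catalog entries; real ones live in registry/routing/catalog.yaml.
CATALOG = [
    {"id": "openbao-api-key", "need_keywords": ["api", "key", "openrouter", "token"]},
    {"id": "ssh-cert-host-access", "need_keywords": ["ssh", "cert", "host", "access"]},
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_by_overlap(query: str, catalog: list[dict]) -> list[tuple[str, int]]:
    """Return (entry id, shared-keyword count) pairs, best match first.

    Entries that share no keyword with the query are dropped.
    """
    words = tokenize(query)
    scored = [
        (entry["id"], len(words & {k.lower() for k in entry["need_keywords"]}))
        for entry in catalog
    ]
    hits = [pair for pair in scored if pair[1] > 0]
    # Sort by descending score, then by id so ties are deterministic.
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


print(rank_by_overlap("openrouter api key", CATALOG))
```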
Agent-oriented examples:
```bash
# "I need an API key" — find the owner, get a pointer, act there yourself
warden route find "openrouter api key" --json
warden route show openbao-api-key --json
# → {"warden_executes": false, "next_action": "next action on `railiance-platform` — see `wiki/CredentialRouting.md#routing-table`"}
# The one lane ops-warden executes: SSH. `show` appends the authored steps + cert pattern.
warden route show ssh-cert-host-access --json
# → {"warden_executes": true, "cert_command": "warden sign <actor> --pubkey <path>", "steps": [...]}
```
`show` on a routed (non-SSH) need always ends with **"next action on
`<owner_repo>` — see `<wiki_ref>`"** and never implies ops-warden performed
anything. Draft scenarios (owner path not yet shipped) are hidden unless `--all`.
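An agent consuming `show --json` mainly needs to branch on `warden_executes`. The sketch below parses payloads shaped like the two example outputs above. It relies only on fields that appear in those examples; the `plan_next_step` helper is hypothetical and the `next_action` text is abbreviated.

```python
import json


def plan_next_step(show_json: str) -> str:
    """Turn a `warden route show --json` payload into a one-line plan."""
    route = json.loads(show_json)
    if route.get("warden_executes"):
        # SSH lane: ops-warden itself signs, so the cert pattern applies.
        return f"run: {route['cert_command']}"
    # Routed need: ops-warden only points; the worker acts on the owner.
    return f"hand off: {route['next_action']}"


# Payloads modelled on the example outputs above (abbreviated, steps left empty).
routed = '{"warden_executes": false, "next_action": "next action on `railiance-platform`"}'
ssh = '{"warden_executes": true, "cert_command": "warden sign <actor> --pubkey <path>", "steps": []}'

print(plan_next_step(routed))
print(plan_next_step(ssh))
```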
---
## Audience notes
- **Human operators** read this page and `CredentialRouting.md` to choose the
right subsystem, then follow that subsystem's own docs.
- **Agents / CI** read the machine-readable routing catalog
(`registry/routing/catalog.yaml`) via `warden route` (above) so routing does
not have to be re-derived from wiki prose each session.
- **Same truth, two shapes:** humans read the wiki; agents read the catalog. The
catalog references wiki sections by anchor so the two cannot drift apart — a
test (`tests/test_routing.py`) fails CI if any `wiki_ref` anchor stops resolving.
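
The anchor check can be pictured as follows. This is a sketch of the idea only, not the contents of `tests/test_routing.py`: it assumes `wiki_ref` values look like `wiki/CredentialRouting.md#routing-table` (the form shown in the CLI example) and that anchors follow a GitHub-style heading slug. The `slugify`, `heading_anchors`, and `anchor_resolves` helpers are hypothetical.

```python
import re


def slugify(heading: str) -> str:
    """Approximate a GitHub-style anchor: lowercase, drop punctuation, hyphenate spaces."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)  # remove punctuation such as backticks
    return re.sub(r"\s", "-", text)


def heading_anchors(markdown: str) -> set[str]:
    """Collect the anchors of every ATX heading (`#`, `##`, ...) in a document."""
    anchors = set()
    for line in markdown.splitlines():
        match = re.match(r"#{1,6}\s+(.*)", line)
        if match:
            anchors.add(slugify(match.group(1)))
    return anchors


def anchor_resolves(wiki_ref: str, pages: dict[str, str]) -> bool:
    """True if the page named in `wiki_ref` exists and contains the anchor."""
    path, _, anchor = wiki_ref.partition("#")
    return path in pages and anchor in heading_anchors(pages[path])


# In-memory stand-in for the wiki; a real check would read the files from disk.
pages = {"wiki/CredentialRouting.md": "# Credential Routing\n## Routing table\n"}
print(anchor_resolves("wiki/CredentialRouting.md#routing-table", pages))
print(anchor_resolves("wiki/CredentialRouting.md#missing-section", pages))
```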
---
## How this stays aligned
NetKingdom security architecture is canonical in `net-kingdom`. ops-warden tracks
it: when canon changes, the wiki section is updated and the catalog pointer
(`wiki_ref` + `canon_ref`) follows. ops-warden never overrides canon and never
silently forks it.
Report drift via a custodian workplan or a State Hub message to `ops-warden`.
---
## Drift review cadence
Every catalog entry carries a `reviewed:` date (`YYYY-MM-DD`) — the last time an
ops-warden steward confirmed the pointer still matches net-kingdom canon and the
owner repo's shipped path.
| Cadence | Action |
| --- | --- |
| **Quarterly** (default 90 days) | Run `warden route list --stale` — reconcile every listed entry against canon |
| **On canon change** | When net-kingdom security docs change, review affected `canon_ref` entries immediately |
| **On owner ship** | When an owning repo merges a new OpenBao path or playbook, promote the entry from `draft` to `active` and bump `reviewed` |
| **On agent confusion** | If `warden route find` misses a common query, add `need_keywords` or a playbook — do not restate owner procedure in the catalog |
### Stale check (operators and agents)
```bash
# Entries not reviewed in the last 90 days (default threshold)
warden route list --stale
# Include draft scenarios in the stale report
warden route list --stale --all
# Custom threshold (e.g. monthly review)
warden route list --stale --stale-days 30 --json
```
For each stale entry:
1. Open `canon_ref` in net-kingdom — confirm ownership and vocabulary unchanged.
2. Open `wiki_ref` in this repo — update the playbook section if canon moved.
3. Confirm the owner path still exists (anti-stale rule: unshipped paths stay `draft`).
4. Bump `reviewed:` in `registry/routing/catalog.yaml` to today's date.
5. Run `uv run pytest tests/test_routing.py` — anchor resolution must still pass.
CI catches **structural** drift: every `wiki_ref` anchor must resolve and the
no-double-source rule must hold. The quarterly cadence catches **semantic** drift
that CI cannot detect, where canon has moved but the anchors still resolve.
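
The threshold behind `--stale` reduces to a date comparison. A minimal sketch, assuming each entry exposes an `id` and a `reviewed` date in `YYYY-MM-DD` form as described above. The sample entries, the `stale_entries` helper, and the choice to flag an entry only when its age strictly exceeds the threshold are assumptions, not the shipped behaviour.

```python
from datetime import date, timedelta


def stale_entries(entries: list[dict], today: date, stale_days: int = 90) -> list[str]:
    """Return ids of entries whose `reviewed` date is more than `stale_days` old."""
    cutoff = today - timedelta(days=stale_days)
    return [
        entry["id"]
        for entry in entries
        if date.fromisoformat(entry["reviewed"]) < cutoff
    ]


# Hypothetical catalog entries for illustration.
entries = [
    {"id": "ssh-cert-host-access", "reviewed": "2026-06-18"},
    {"id": "openbao-api-key", "reviewed": "2026-02-01"},
]

print(stale_entries(entries, today=date(2026, 6, 25)))
print(stale_entries(entries, today=date(2026, 6, 25), stale_days=180))
```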
---
## See also
- `CredentialRouting.md` — worker decision tree and routing table
- `NetKingdomSecurityMap.md` — component literacy
- `INTENT.md` — steward mission ("issue SSH, route the rest")
- `workplans/WARDEN-WP-0010-access-routing-charter.md` — charter + no-double-source rule
- `net-kingdom/docs/platform-identity-security-architecture.md` — platform canon