ops-warden/INTENT.md
tegwick dcfcc4b20a docs(WP-0010): rewire INTENT to "issue SSH, route the rest"; add access-routing plan
Drop the "operational access desk" framing (and the rejected "coach"
metaphor) for plain language: ops-warden issues short-lived SSH certs and
routes every other credential need to its owner. SSH is the only lane it
executes.

Adds WARDEN-WP-0010/0011/0012 with a pointer-layer routing catalog that
points at owner docs rather than restating them, enforced structurally
(non-SSH entries carrying a steps block fail CI). Drops the scope-creep-prone
`check` command; hides unshipped-path scenarios as draft.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-18 20:07:01 +02:00

# INTENT
> This file captures **why this repository exists**, the **direction it is
> moving toward**, and the **kind of system it is meant to become**.
> It is intentionally **aspirational and stable**, not a description of
> current implementation. See `SCOPE.md` for what is implemented today.
---
## One-liner
**Operational access steward for the NetKingdom security model — knows the platform
credential lanes, keeps them aligned, and issues short-lived SSH certificates where
that lane belongs to ops-warden.**
---
## Why This Exists
Development workers — human operators, kaizen agents, CI automations, and
custodian tooling — need **safe, attributable access** across an increasingly
complex NetKingdom stack: identity, MFA, authorization, runtime secrets, SSH
reachability, and tunnel transport.
That stack is easy to misuse:
- static SSH keys and pasted API tokens in chat or Git
- wrong subsystem chosen for a credential need (OpenBao vs warden vs key-cape)
- drift between NetKingdom architecture canon and what operators actually run
- ad hoc rediscovery of bootstrap and custody rules every time a worker needs access
**ops-warden exists so operational access has a custodian-domain home** that
understands NetKingdom security infrastructure, routes workers to the right
subsystem, keeps local guidance current, and **directly operates only the SSH
short-lived certificate lane** it owns.
---
## The Mission
> *Where we are going.*
ops-warden **issues short-lived SSH certificates and routes every other credential
need to the subsystem that owns it.** It is not a desk that wraps the platform; it
owns one lane and points at the rest:
1. **Know** the NetKingdom security model — identity, authorization, secrets,
SSH access, tunnels, bootstrap custody, and tenant/platform boundaries.
2. **Route** workers to the correct subsystem for each credential type instead
of becoming a universal secret vending machine — through the wiki and a
machine-readable routing catalog that *points at* the owner's docs rather than
restating them.
3. **Align** runbooks, wiki, inventory patterns, and scorecard checks with
NetKingdom canon as the platform evolves (OpenBao-first, flex-auth policy,
key-cape IAM Profile, railiance deployment layers).
4. **Issue** short-lived SSH certificates for `adm` / `agt` / `atm` actors when
host or ops reachability requires the SSH lane — via `warden sign`,
`cert_command`, and `ops-ssh-wrapper`. This is the **only** lane ops-warden
executes.
5. **Audit** SSH signing operations and cert-side compliance so gatekeeping is
observable, not tribal knowledge.
---
## NetKingdom Security Literacy
ops-warden should be fluent in the platform architecture documented in
`net-kingdom` — especially:
| Plane / component | Role in access | ops-warden relationship |
| --- | --- | --- |
| **key-cape / Keycloak** | Identity — who is the actor, MFA, IAM Profile claims | Instruct identity path; do not re-implement OIDC |
| **flex-auth + Topaz** | Authorization — may this actor perform this action | Future policy gate before SSH issuance; document integration |
| **OpenBao** | Runtime secrets — API keys, dynamic creds, leases, audit | Instruct secret custody paths; SSH engine is signing backend only |
| **ops-warden** | Operational SSH certificates — short-lived host access | **Own and issue** this lane |
| **ops-bridge** | Tunnel transport — consumes certs via `cert_command` | Primary consumer; document integration |
| **railiance-infra** | Host principals, force-command, SSH hardening | Instruct host-side deployment; do not own Ansible |
| **railiance-platform** | OpenBao/K8s/platform service deployment | Instruct production endpoints; do not deploy clusters |
Canonical references:
- `net-kingdom/docs/platform-identity-security-architecture.md`
- `net-kingdom/docs/responsibility-map.md`
- `wiki/AccessManagementDirective.md` (ops SSH actor model)
---
## Responsibility Boundary
### ops-warden owns
- NetKingdom-aligned **operational SSH access** guidance and stewardship
- **SSH certificate issuance** for registered `adm` / `agt` / `atm` actors
- Actor inventory, TTL/principal policy, cert-side scorecard, signatures log
- `cert_command` contract and `ops-ssh-wrapper` automation surface
- Keeping ops-warden docs and patterns aligned with NetKingdom security evolution
### ops-warden instructs but does not own
| Need | Route to |
| --- | --- |
| OIDC login, MFA, human identity claims | key-cape / Keycloak (NetKingdom IAM Profile) |
| Policy decision — may actor X access resource Y | flex-auth |
| API keys, provider secrets, DB creds, object-storage STS | OpenBao (+ flex-auth policy where required) |
| Inter-Hub operator keys, LLM provider credentials | OpenBao or approved operator secret store |
| Tunnel lifecycle, port forwarding | ops-bridge |
| `/etc/ssh/auth_principals/`, host hardening | railiance-infra |
| OpenBao cluster init/unseal, platform deploy | railiance-platform |
**ops-warden is not a general secrets manager.** It may document *how* workers
obtain non-SSH credentials; it must not store long-lived secrets in Git, State
Hub, workplans, logs, or chat.
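
The "route, do not restate" boundary above can be checked structurally. The following is a minimal sketch, assuming a catalog of entries where only the SSH lane may carry executable steps and every other entry is just a pointer to the owner's docs. The `RouteEntry` fields, the lane identifiers, and the `validate_catalog` helper are hypothetical names for illustration, not the shipped catalog schema, and the `owner_doc` values reuse paths already cited in this file as stand-ins.

```python
from dataclasses import dataclass

SSH_LANE = "ssh"  # per this document, the only lane ops-warden executes


@dataclass(frozen=True)
class RouteEntry:
    """One routing-catalog entry (hypothetical schema, for illustration only)."""
    need: str                    # what the worker is asking for
    lane: str                    # hypothetical lane identifier
    owner: str                   # subsystem that owns the lane
    owner_doc: str               # pointer to the owner's docs, never a copy
    steps: tuple[str, ...] = ()  # executable steps; allowed on the SSH lane only


def validate_catalog(entries: list[RouteEntry]) -> list[str]:
    """Return a list of structural violations; an empty list means the catalog passes."""
    errors = []
    for entry in entries:
        if entry.lane != SSH_LANE and entry.steps:
            errors.append(
                f"{entry.need}: non-SSH entry carries steps; point at {entry.owner} docs instead"
            )
        if not entry.owner_doc:
            errors.append(f"{entry.need}: missing owner_doc pointer")
    return errors


# Stand-in catalog; the owner_doc paths are placeholders taken from this file.
CATALOG = [
    RouteEntry("SSH host or ops reachability", "ssh", "ops-warden",
               "wiki/AccessManagementDirective.md", ("warden sign",)),
    RouteEntry("API keys, provider secrets, DB creds", "secrets", "OpenBao",
               "net-kingdom/docs/platform-identity-security-architecture.md"),
    RouteEntry("Policy decision for actor X on resource Y", "authorization", "flex-auth",
               "net-kingdom/docs/responsibility-map.md"),
]

# A CI job could fail the build whenever this list is non-empty.
violations = validate_catalog(CATALOG)
```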
---
## Design Principles
### 1. Right lane, right subsystem
Every credential request should land in the subsystem NetKingdom designed for it.
ops-warden optimizes for **correct routing** as much as for **fast issuance**.
### 2. Short-lived by default (SSH lane)
Operational SSH access uses CA-signed certificates with TTL and principals —
never unbounded static keys in worker workflows.
### 3. Align with canon, reduce drift
When NetKingdom security architecture changes (e.g. OpenBao standardization,
new bootstrap lanes), ops-warden updates its wiki, SCOPE, and runbooks so dev
workers do not reconstruct decisions from stale chat history.
### 4. Attributable actors
Humans, agents, and automations are distinct actor types (`adm` / `agt` / `atm`)
with naming, TTL, and principal conventions — matching the Access Management
Directive and NetKingdom agent-operating model.
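
Principles 2 and 4 combine naturally into a per-actor-type TTL cap. The sketch below is illustrative only: the `type-name` naming shape (for example `agt-kaizen-01`), the cap values, and the `actor_type` / `clamp_ttl` helpers are all assumptions, since the real conventions live in the Access Management Directive and the actor inventory rather than in this file.

```python
from datetime import timedelta

# Hypothetical TTL caps per actor type; the real values belong to ops-warden policy.
MAX_TTL = {
    "adm": timedelta(hours=8),     # human operators (assumed cap)
    "agt": timedelta(hours=1),     # agents (assumed cap)
    "atm": timedelta(minutes=15),  # automations (assumed cap)
}


def actor_type(actor: str) -> str:
    """Derive the actor type from an assumed 'type-name' convention."""
    prefix = actor.split("-", 1)[0]
    if prefix not in MAX_TTL:
        raise ValueError(f"unknown actor type in {actor!r}")
    return prefix


def clamp_ttl(actor: str, requested: timedelta) -> timedelta:
    """Never grant more than the cap for the actor's type; reject non-positive TTLs."""
    if requested <= timedelta(0):
        raise ValueError("TTL must be positive")
    return min(requested, MAX_TTL[actor_type(actor)])
```

Clamping rather than rejecting an over-long request is a design choice made here for brevity; a stricter policy could refuse the request and log the attempt instead.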
### 5. Implement narrowly, guide broadly
**Implement** only what belongs in the SSH certificate lane.
**Guide** across the full NetKingdom security surface through documentation,
scorecard checks, inventory patterns, and future policy-integration hooks.
### 6. Observable gatekeeping
Every successful SSH sign is auditable (`signatures.log`). Compliance checks
(scorecard) make cert-side policy violations visible before they become incidents.
---
## Credential flow (target mental model)
```text
Development worker needs access
|
v
ops-warden (issue SSH; route the rest)
|
+-- SSH host / ops reachability? ----> warden sign / cert_command
|
+-- Runtime API / platform secret? --> OpenBao path (documented)
|
+-- Authorization required? ---------> flex-auth decision (future hook)
|
+-- Identity / MFA required? --------> key-cape / Keycloak path
|
+-- Tunnel only? --------------------> ops-bridge + cert_command
```
Today the steward role is primarily documentation, runbooks, and the implemented
SSH CLI. The machine-readable routing catalog and `warden route` lookup, plus
policy-gated issuance, are intentional follow-ups, not current promises.
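
As a rough illustration of the flow above, the lookup below maps each branch of the diagram to its owner and target. It is a sketch under stated assumptions, not the planned `warden route` implementation: the `Need` enum, the `Route` tuple, and the `route` function are hypothetical names, and the target strings are copied from the diagram. An unrecognized need raises an error rather than falling back to the SSH lane.

```python
from enum import Enum
from typing import NamedTuple


class Need(Enum):
    """Branches of the credential flow diagram (identifiers are hypothetical)."""
    SSH_REACHABILITY = "ssh"
    RUNTIME_SECRET = "secret"
    AUTHORIZATION = "authz"
    IDENTITY_MFA = "identity"
    TUNNEL = "tunnel"


class Route(NamedTuple):
    owner: str   # subsystem that owns the lane
    target: str  # where the worker is pointed, as written in the diagram


ROUTES = {
    Need.SSH_REACHABILITY: Route("ops-warden", "warden sign / cert_command"),
    Need.RUNTIME_SECRET: Route("OpenBao", "OpenBao path (documented)"),
    Need.AUTHORIZATION: Route("flex-auth", "flex-auth decision (future hook)"),
    Need.IDENTITY_MFA: Route("key-cape / Keycloak", "key-cape / Keycloak path"),
    Need.TUNNEL: Route("ops-bridge", "ops-bridge + cert_command"),
}


def route(need: str) -> Route:
    """Look up a route by need identifier; Need(...) raises ValueError if it is unknown."""
    return ROUTES[Need(need)]


# Only one lane is owned by ops-warden; every other need is a pointer elsewhere.
warden_lanes = [n for n, r in ROUTES.items() if r.owner == "ops-warden"]
```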
---
## Relationship to NetKingdom
NetKingdom owns the **canonical security architecture** and meta-orchestration
across orchestrated repos. ops-warden is a **custodian-domain execution repo**
for one security lane plus operational guidance.
- NetKingdom defines *what the platform security model is*
- ops-warden keeps *operational SSH access and worker routing* aligned with it
- Railiance repos *deploy* what NetKingdom and component repos specify
ops-warden should appear in NetKingdom responsibility and pattern material as
the **operational SSH credential authority**, not as a replacement for
OpenBao or flex-auth.
---
## Success criteria
ops-warden is succeeding when:
1. A dev worker can determine **which subsystem** to use for a credential need
without guessing or pasting secrets into agent sessions.
2. SSH access for agents and operators is **short-lived, inventoried, and audited**.
3. ops-bridge and other consumers integrate via a **stable `cert_command` contract** without
backend-specific branching.
4. NetKingdom security evolution (OpenBao, IAM Profile, bootstrap lanes) is
reflected in ops-warden docs within the same maintenance cycle.
5. Non-SSH secrets remain **out of ops-warden storage**; ops-warden keeps only documented paths to them.
---
## Non-goals
- Universal credential broker for all secret types
- Replacing OpenBao, flex-auth, key-cape, or railiance deployment ownership
- Storing Inter-Hub, LLM provider, or other long-lived API keys
- Host-side SSH configuration deployment
- **Duplicating or restating another subsystem's procedure** — routing material
points at the owner's docs; it does not fork them
- SSO / Teleport at scale (trigger per Access Management Directive §6.2)
---
## Evolution notes
The repository shipped the SSH CA CLI first (WARDEN-WP-0001 through 0003). The
stewardship and NetKingdom-alignment mission is the **next stratum** — docs,
routing canon, inventory standards, production OpenBao SSH engine alignment,
flex-auth integration design, and NetKingdom cross-links — without collapsing
platform boundaries.
See `wiki/CredentialRouting.md` for worker-facing routing,
`wiki/NetKingdomSecurityMap.md` for component literacy,
`history/2026-06-18-post-wp0008-intent-scope-reassessment.md` for the latest
gap analysis (production SSH path verified), and archived workplans WP-0006 through 0008
for stewardship and production closeout execution.