WARDEN-WP-0006: NetKingdom stewardship docs and alignment

Add credential routing, actor patterns, security map, OpenBao SSH
checklist, and policy-gated signing design. Update registry and SCOPE;
record INTENT↔SCOPE reassessment (C3 completeness).
2026-06-17 08:22:45 +02:00
parent 5ae3821b88
commit 1865e0744e
14 changed files with 879 additions and 108 deletions


@@ -11,7 +11,23 @@ This repo owns **ops-warden** only. It does not own:
| State Hub service code and consistency tooling | `state-hub` |
| Workstream coordination across custodian domain | `the-custodian` |
| Human admin SSH key generation | self-service (`ssh-keygen`) |
| Identity / OIDC / MFA | `key-cape`, Keycloak |
| Authorization policy | `flex-auth` |
| Runtime secrets (non-SSH) | OpenBao |
ops-warden issues **short-lived SSH certificates** only. It is not a general
secrets manager and must not store long-lived API keys in Git, State Hub, or
workplans.
## NetKingdom credential routing (quick reference)
| Worker need | Route to | ops-warden |
|-------------|----------|------------|
| SSH cert for host/ops access | ops-warden | Issue (`warden sign`) |
| API key / DB cred / lease | OpenBao | Document only — `wiki/CredentialRouting.md` |
| May I perform action X? | flex-auth | Design: `wiki/PolicyGatedSigning.md` |
| Login / MFA / OIDC | key-cape / Keycloak | Document only |
| SSH tunnel | ops-bridge | cert_command consumer |
| Host principals | railiance-infra | Document only |
Full map: `wiki/NetKingdomSecurityMap.md`.
ops-warden issues **short-lived SSH certificates** and maintains **operational
access stewardship docs**. It is not a general secrets manager and must not
store long-lived API keys in Git, State Hub, workplans, logs, or chat.


@@ -219,6 +219,8 @@ routing canon, inventory standards, production OpenBao SSH engine alignment,
flex-auth integration design, and NetKingdom cross-links — without collapsing
platform boundaries.
See `history/2026-06-17-intent-scope-assessment.md` for the current SCOPE ↔
INTENT gap analysis and `workplans/WARDEN-WP-0006-netkingdom-alignment-and-access-stewardship.md`
for the active execution plan.
See `wiki/CredentialRouting.md` for worker-facing routing,
`wiki/NetKingdomSecurityMap.md` for component literacy,
`history/2026-06-17-intent-scope-assessment.md` for the initial gap analysis,
and `workplans/WARDEN-WP-0006-netkingdom-alignment-and-access-stewardship.md`
for stewardship execution.


@@ -57,7 +57,9 @@ uv run ruff check .
## Documentation
- `INTENT.md` — operational access steward mission (NetKingdom-aligned)
- `wiki/CredentialRouting.md` *planned WP-0006* which subsystem for each credential type
- `wiki/CredentialRouting.md` — which subsystem for each credential type
- `wiki/NetKingdomSecurityMap.md` — platform security component map
- `wiki/ActorInventoryPatterns.md` — standard adm/agt/atm actor patterns
- `wiki/OpsWardenConfig.md` — configuration reference
- `wiki/CertCommandInterface.md` — `cert_command` contract for callers
- `wiki/InterHubBootstrapAccessLane.md` — short-lived cert envelope for bootstrap tasks


@@ -52,13 +52,19 @@ Vault-compatible SSH secrets engine API, production).
- Capability registry entry for SSH certificate issuance
- Keeping ops access patterns consistent with `net-kingdom` platform architecture
### Planned (see workplan)
### Stewardship (shipped WP-0006)
- NetKingdom cross-links and responsibility-map alignment
- Credential routing runbook for dev workers
- Standard actor inventory patterns for agents and CI
- flex-auth policy hook design for gated SSH issuance
- Production OpenBao SSH engine operational checklist
- `wiki/CredentialRouting.md` — credential type → subsystem routing
- `wiki/NetKingdomSecurityMap.md` — NetKingdom component literacy
- `wiki/ActorInventoryPatterns.md` + `examples/inventory.seed.yaml`
- `wiki/OpenBaoSshEngineChecklist.md` — production SSH signing verify
- `wiki/PolicyGatedSigning.md` — flex-auth integration design (not implemented)
### Planned (follow-up)
- flex-auth policy hook implementation (WARDEN-WP-0007, proposed)
- Live production OpenBao SSH engine verification on Railiance
- NK-WP-0009 SSH tutorial joint with net-kingdom
---
@@ -101,8 +107,9 @@ Vault-compatible SSH secrets engine API, production).
- **SSH CLI:** shipped v0.1.0 (WARDEN-WP-0001–0003)
- **Docs:** OpenBao-first config (WARDEN-WP-0005), Inter-Hub bootstrap runbook
- **Registry:** `capability.security.ssh-certificate-issuance` published
- **INTENT:** defined 2026-06-17; stewardship layer largely **documentation-only**
- **Gap:** see `history/2026-06-17-intent-scope-assessment.md`
- **INTENT:** operational access steward (2026-06-17)
- **Stewardship docs:** WP-0006 complete — routing, inventory patterns, OpenBao checklist
- **Gap reassessment:** `history/2026-06-17-intent-scope-reassessment.md`
---
@@ -166,7 +173,9 @@ keywords: [ssh, certificate, ca, credential, warden, ops-warden, pki, openbao, v
| --- | --- |
| `INTENT.md` | Why ops-warden exists and where it is going |
| `SCOPE.md` | What is implemented today (this file) |
| `history/2026-06-17-intent-scope-assessment.md` | INTENT ↔ SCOPE gaps |
| `wiki/CredentialRouting.md` | Which subsystem for each credential need |
| `wiki/NetKingdomSecurityMap.md` | Platform security component map |
| `history/2026-06-17-intent-scope-reassessment.md` | Latest INTENT ↔ SCOPE assessment |
| `wiki/AccessManagementDirective.md` | SSH actor model |
| `wiki/OpsWardenConfig.md` | warden.yaml and OpenBao |
| `wiki/CertCommandInterface.md` | cert_command contract |


@@ -0,0 +1,41 @@
# Non-secret inventory template — copy to ~/.config/warden/inventory.yaml
# and adjust for your environment. Do not commit real operator paths or keys.
#
# See wiki/ActorInventoryPatterns.md and wiki/OpsWardenConfig.md
actors:
agt-state-hub-bridge:
type: agt
principals:
- agt-task-bridge
ttl_hours: 24
description: "ops-bridge tunnel agent for state-hub"
agt-codex-interhub-bootstrap:
type: agt
principals:
- agt-interhub-bootstrap
ttl_hours: 2
description: "Short-lived agent access for attended Inter-Hub bootstrap"
adm-example:
type: adm
principals:
- adm-full
ttl_hours: 48
description: "Example human operator — replace with per-person adm-* actors"
atm-backup-daily:
type: atm
principals:
- atm-backup-daily
ttl_hours: 8
description: "Example nightly automation actor"
hosts:
example-host:
allowed_principals:
agt:
- agt-task-bridge
atm:
- atm-backup-daily


@@ -0,0 +1,74 @@
# INTENT ↔ SCOPE Reassessment — ops-warden
**Date:** 2026-06-17
**Author:** codex
**Trigger:** WARDEN-WP-0006 complete (T1–T7).
**Prior assessment:** `history/2026-06-17-intent-scope-assessment.md`
---
## 1. Executive summary
WARDEN-WP-0006 closed the primary **stewardship documentation gaps**. ops-warden
now has worker-facing credential routing, NetKingdom security literacy, actor
inventory patterns, OpenBao SSH verification checklist, and flex-auth integration
design. NetKingdom canon updated (`responsibility-map`, platform architecture
Operational SSH Path).
**Vector movement:** `D4/A3/C2/R2` → **`D5/A3/C3/R2`**
| Dimension | Was | Now | Notes |
| --- | --- | --- | --- |
| Discovery | D4 | **D5** | Routing + security map + NK canon cross-links |
| Availability | A3 | A3 | CLI unchanged; no desk API yet |
| Completeness | C2 | **C3** | Stewardship operationalized in wiki; policy gate not coded |
| Reliability | R2 | R2 | Production OpenBao sign still operator-verified, not CI-proven |
---
## 2. Deliverables (WP-0006)
| Task | Deliverable | Status |
| --- | --- | --- |
| T1 | `wiki/CredentialRouting.md` | Done |
| T2 | `wiki/ActorInventoryPatterns.md`, `examples/inventory.seed.yaml` | Done |
| T3 | `wiki/NetKingdomSecurityMap.md`, registry, repo-boundary | Done |
| T4 | net-kingdom responsibility-map + platform SSH path | Done |
| T5 | `wiki/OpenBaoSshEngineChecklist.md` | Done |
| T6 | `wiki/PolicyGatedSigning.md` | Done (design) |
| T7 | This reassessment | Done |
---
## 3. Success criteria (INTENT.md) — updated
| Criterion | Was | Now |
| --- | --- | --- |
| Worker knows which subsystem for each credential type | No | **Yes** — `wiki/CredentialRouting.md` |
| SSH access short-lived, inventoried, audited | Yes (tooling) | **Yes** — + patterns seed |
| ops-bridge integrates via cert_command | Yes (contract) | Yes |
| NetKingdom evolution reflected in ops-warden docs | Partial | **Yes** — NK canon patched + security map |
| Non-SSH secrets stay out of ops-warden | Yes | Yes |
**Score: 4 yes, 1 unchanged (live tunnel matrix)**
---
## 4. Remaining gaps (next work)
| Prio | Gap | Proposed work |
| --- | --- | --- |
| P1 | Production OpenBao SSH sign not executed in CI | Operator run checklist on Railiance; log evidence |
| P2 | flex-auth pre-sign not implemented | WARDEN-WP-0007 from `wiki/PolicyGatedSigning.md` |
| P3 | NK-WP-0009 SSH tutorial not yet a joint effort with net-kingdom | Coordinate net-kingdom SSH tutorial |
| P4 | Optional `warden guide` CLI | Ad hoc if doc-only routing insufficient |
---
## 5. Recommendation
Mark **WARDEN-WP-0006 finished**. Open **WARDEN-WP-0007** when ready for
flex-auth integration or the production OpenBao verification milestone.
**Completeness C3** is justified: the central stewardship use case (routing + alignment)
works, SSH issuance was already C3, and the policy gate remains a bounded, known gap.


@@ -1,7 +1,7 @@
---
id: capability.security.ssh-certificate-issuance
name: SSH Certificate Issuance
summary: Issue short-lived CA-signed SSH certificates for adm, agt, and atm actors through a stable cert_command CLI interface.
summary: Issue short-lived CA-signed SSH certificates for adm, agt, and atm actors through a stable cert_command CLI interface; steward operational access routing across NetKingdom security lanes.
owner: ops-warden
status: draft
domain: helix_forge
@@ -62,13 +62,15 @@ discovery:
intent: >
Give the ops fleet short-lived SSH credentials for humans, agents, and
automations without static keys, through a single cert_command surface that
callers can rely on regardless of CA backend.
callers can rely on regardless of CA backend; route non-SSH credential needs
to the correct NetKingdom subsystems (OpenBao, flex-auth, key-cape).
includes:
- certificate signing for adm, agt, and atm actors
- actor principals inventory and TTL policy
- cert_command interface (`warden sign`)
- cert-side compliance scorecard and signatures log
- ops-ssh-wrapper for automatic cert acquisition
- NetKingdom credential routing and alignment documentation
excludes:
- tunnel lifecycle
- host /etc/ssh/auth_principals deployment
@@ -108,6 +110,7 @@ consumer_guidance:
- issuing short-lived SSH certs for ops-bridge tunnels
- agent or automation access with TTL-bound principals
- checking cert-side compliance before rotation windows
- orienting dev workers on which NetKingdom subsystem owns each credential type
not_recommended_for:
- storing OpenRouter or Inter-Hub API keys
- replacing OpenBao deployment or host SSH hardening playbooks


@@ -5,7 +5,7 @@ capabilities:
- id: capability.security.ssh-certificate-issuance
name: SSH Certificate Issuance
summary: Issue short-lived CA-signed SSH certificates for adm, agt, and atm actors
through a stable cert_command CLI interface.
through a stable cert_command CLI interface; steward NetKingdom operational access routing.
vector: D4 / A3 / C3 / R2
domain: helix_forge
status: draft


@@ -0,0 +1,141 @@
# Actor Inventory Patterns
Date: 2026-06-17
Standard naming and TTL patterns for `~/.config/warden/inventory.yaml` (or
Git-tracked inventory in your environment). Actor names **must** use the prefix
matching `ActorType`: `adm-`, `agt-`, `atm-`.
See `wiki/AccessManagementDirective.md` for policy background and
`examples/inventory.seed.yaml` for a copy-paste template.
---
## Naming convention
```text
<type>-<scope>-<purpose>[-<instance>]
```
| Segment | Meaning |
| --- | --- |
| `type` | `adm` \| `agt` \| `atm` |
| `scope` | team, repo, or environment slug (`codex`, `state-hub`, `ci`) |
| `purpose` | narrow function (`bridge`, `bootstrap`, `backup`) |
| `instance` | optional disambiguator (`railiance01`) |
**Examples:** `agt-state-hub-bridge`, `agt-codex-interhub-bootstrap`, `atm-nightly-backup`.
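A minimal sketch of how the naming convention could be checked before an actor is added, assuming every segment is a lowercase alphanumeric slug. Because `scope` may itself contain hyphens (`state-hub`), the check only confirms the type prefix and the overall shape. `ACTOR_NAME_RE` and `actor_type_of` are hypothetical names for illustration, not part of the `warden` CLI.

```python
import re

# Assumption: each segment after the type prefix is a lowercase
# alphanumeric slug. Hypothetical helper, not part of the warden CLI.
ACTOR_NAME_RE = re.compile(r"^(adm|agt|atm)(?:-[a-z0-9]+)+$")


def actor_type_of(name: str) -> str:
    """Return the type prefix of an actor name, or raise ValueError."""
    match = ACTOR_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"malformed actor name: {name!r}")
    return match.group(1)
```

A caller could compare the returned prefix with the actor's `type` field and refuse the entry when they differ.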
---
## Pattern catalog
### Tunnel agents (`agt`)
Used by ops-bridge `cert_command` for SSH tunnels.
```yaml
agt-state-hub-bridge:
type: agt
principals: [agt-task-bridge]
ttl_hours: 24
description: "ops-bridge tunnel to state-hub backend"
```
- One actor per tunnel identity (match `ssh_user` / `actor` in `tunnels.yaml`).
- Principal should match host `auth_principals` entry deployed by railiance-infra.
- TTL default 24 h; shorten for high-risk paths.
### Kaizen / Codex agents (`agt`)
Attended or semi-attended agent work on trusted hosts.
```yaml
agt-codex-interhub-bootstrap:
type: agt
principals: [agt-interhub-bootstrap]
ttl_hours: 2
description: "Short-lived agent access for Inter-Hub bootstrap execution"
```
- Prefer **1–2 h TTL** for bootstrap; never multi-day agent certs.
- Principal narrower than general ops access (`agt-interhub-bootstrap` not `agt-ops-full`).
- Remove or disable actor when lane is retired.
- See `wiki/InterHubBootstrapAccessLane.md`.
### Human operators (`adm`)
```yaml
adm-bernd:
type: adm
principals: [adm-full]
ttl_hours: 48
description: "Human operator — interactive shell when policy allows"
```
- Humans bring their own keypair (`ssh-keygen`); warden signs pubkey only.
- Use separate actors per person, not shared `adm-shared`.
- Principals may be narrowed (`adm-readonly`) where railiance-infra supports it.
### CI / cron automations (`atm`)
```yaml
atm-backup-daily:
type: atm
principals: [atm-backup-daily]
ttl_hours: 8
description: "Nightly backup automation — force-command on host"
```
- Lowest TTL practical (≤ 8 h per directive max).
- Principal tied to single force-command on host.
- Use `warden issue` only in CI contexts backed by a secured secret store.
---
## TTL guidance
| Type | Default max (warden) | Typical attended | Typical automation |
| --- | --- | --- | --- |
| `adm` | 48 h | 24–48 h | N/A |
| `agt` | 24 h | 1–4 h bootstrap | 8–24 h supervised |
| `atm` | 8 h | N/A | 1–8 h |
`warden sign` **rejects** TTL above type maximum (WARDEN-WP-0002).
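The per-type maximums in the table can be expressed as a small lookup. This is an illustrative sketch only: `MAX_TTL_HOURS` and `check_ttl` are hypothetical names, and the real check inside `warden sign` may be structured differently.

```python
# Maximums copied from the TTL guidance table.
# Hypothetical helper; not the actual warden implementation.
MAX_TTL_HOURS = {"adm": 48, "agt": 24, "atm": 8}


def check_ttl(actor_type: str, ttl_hours: int) -> int:
    """Return ttl_hours if it is within the type maximum, else raise."""
    if actor_type not in MAX_TTL_HOURS:
        raise ValueError(f"unknown actor type: {actor_type!r}")
    limit = MAX_TTL_HOURS[actor_type]
    if not 0 < ttl_hours <= limit:
        raise ValueError(
            f"ttl {ttl_hours} h outside 1..{limit} h for type {actor_type}"
        )
    return ttl_hours
```

Under these limits a 72 h `agt` request is rejected, which matches the anti-pattern listed further down.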
---
## Principal narrowing
1. One principal per automation purpose — avoid `agt-ops-admin`.
2. Match host-side `auth_principals` exactly — coordinate with railiance-infra before adding a principal.
3. Document `description` field for audit and scorecard reviews.
4. Use `hosts:` section in inventory for reference (not enforced by warden).
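Since warden does not enforce the `hosts:` section, a reviewer may want a quick cross-check of actors against it. The sketch below assumes the inventory has already been loaded into a plain dict shaped like `examples/inventory.seed.yaml`; `unmatched_actors` is a hypothetical helper, not a warden command.

```python
def unmatched_actors(inventory: dict) -> dict:
    """Map actor name -> principals that no host lists for that actor type."""
    # Collect, per actor type, every principal accepted by at least one host.
    allowed = {}
    for host in inventory.get("hosts", {}).values():
        for actor_type, principals in host.get("allowed_principals", {}).items():
            allowed.setdefault(actor_type, set()).update(principals)

    report = {}
    for name, actor in inventory.get("actors", {}).items():
        accepted = allowed.get(actor["type"], set())
        missing = [p for p in actor["principals"] if p not in accepted]
        if missing:
            report[name] = missing
    return report
```

An empty result means every actor principal appears under some host for its type. It says nothing about what is actually deployed in `/etc/ssh/auth_principals/`, which remains a railiance-infra concern.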
---
## Adding a new worker
```bash
warden inventory add agt-myrepo-ci \
--type agt \
--principal agt-myrepo-ci \
--ttl 4 \
--description "CI deploy actor for myrepo"
warden inventory list
warden sign agt-myrepo-ci --pubkey /path/to/ci.pub
```
Copy patterns from `examples/inventory.seed.yaml` before inventing new names.
---
## Anti-patterns
| Do not | Why |
| --- | --- |
| Reuse `adm` actor for agents | Breaks attribution; use `agt-*` |
| Store private keys in inventory YAML | Inventory is registry only — keys live in secure paths |
| 72 h `agt` cert for convenience | Violates TTL policy and directive |
| One `agt-ops` for all tunnels | Cannot revoke or audit per tunnel |
| Put API keys in inventory | Route to OpenBao — `wiki/CredentialRouting.md` |

wiki/CredentialRouting.md

@@ -0,0 +1,140 @@
# Credential Routing — NetKingdom Access Desk
Date: 2026-06-17
Use this page when a development worker (human, kaizen agent, CI job, or
custodian tool) needs **access or credentials** and is unsure which subsystem
owns the request.
ops-warden maintains this routing guide. It **issues SSH certificates only**.
For every other credential type, follow the routed path — do not paste secrets
into Git, State Hub, agent chat, or workplans.
---
## Quick decision tree
```text
What do you need?
|
+-- Log in as a human / get OIDC claims / MFA
| -> key-cape (lightweight) or Keycloak (expanded)
| net-kingdom/docs/platform-identity-security-architecture.md
|
+-- Permission to perform an action on a resource
| -> flex-auth (policy decision)
| flex-auth/INTENT.md
|
+-- API key, DB password, provider token, K8s secret, dynamic lease
| -> OpenBao (after flex-auth approval where policy requires it)
| railiance-platform/docs/openbao.md
| NEVER ops-warden
|
+-- S3 / object-storage temporary credentials
| -> NK-WP-0007 vending path (flex-auth + OpenBao + storage STS)
| net-kingdom/docs/object-storage-sts-credential-vending.md
| NEVER ops-warden
|
+-- SSH certificate for host / ops reachability (adm/agt/atm)
| -> ops-warden (warden sign / cert_command)
| wiki/OpsWardenConfig.md
|
+-- SSH tunnel / port forward (already have or will get a cert)
| -> ops-bridge
| ops-bridge tunnels.yaml + cert_command from ops-warden
|
+-- Host accepts your SSH principal / force-command on server
| -> railiance-infra Ansible
| /etc/ssh/auth_principals/, sshd hardening
```
**Under two minutes:** match your need to a branch above, open the linked doc,
stop if you landed on "NEVER ops-warden" for non-SSH secrets.
---
## Routing table
| I need… | Subsystem | ops-warden role |
| --- | --- | --- |
| Interactive login, OIDC token, MFA | key-cape / Keycloak | Document only — use IAM Profile |
| "May I do X on resource Y?" | flex-auth (+ Topaz PDP) | Future pre-sign gate for SSH; document only today |
| OpenRouter / LLM provider API key | OpenBao → K8s Secret | **Do not** ask ops-warden |
| Inter-Hub operator / runtime API key | OpenBao or `0600` temp file | See `wiki/InterHubBootstrapAccessLane.md` |
| Database or service password | OpenBao dynamic/KV | Document only |
| Short-lived SSH cert for operator | ops-warden (`adm-*`) | **Issue** via `warden sign` |
| Short-lived SSH cert for agent | ops-warden (`agt-*`) | **Issue** via `warden sign` / wrapper |
| Short-lived SSH cert for CI/cron | ops-warden (`atm-*`) | **Issue** via `warden sign` / `warden issue` |
| Tunnel to remote service | ops-bridge | Consumer of `cert_command` |
| Principal file on host | railiance-infra | Document only |
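For tooling that wants to answer the routing question programmatically, the table can be reduced to a lookup. This is a sketch under assumptions: the need keys (`"api-key"`, `"ssh-cert"`, and so on) are invented for illustration, and only the subsystem names come from the table.

```python
from typing import Tuple

# Hypothetical need keys; subsystem names are taken from the routing table.
ROUTES = {
    "login": ("key-cape / Keycloak", "document only"),
    "authorization": ("flex-auth", "document only today"),
    "api-key": ("OpenBao", "do not ask ops-warden"),
    "db-password": ("OpenBao", "document only"),
    "ssh-cert": ("ops-warden", "issue via warden sign"),
    "tunnel": ("ops-bridge", "consumer of cert_command"),
    "host-principal": ("railiance-infra", "document only"),
}


def route(need: str) -> Tuple[str, str]:
    """Return (subsystem, ops-warden role) for a need key."""
    try:
        return ROUTES[need]
    except KeyError:
        raise ValueError(f"no route for {need!r}; consult the routing table")


def handled_by_ops_warden(need: str) -> bool:
    """True only for needs that ops-warden itself issues."""
    return route(need)[0] == "ops-warden"
```

Only the SSH certificate need resolves to ops-warden; every other key points elsewhere, which mirrors the rule that ops-warden issues SSH certificates only.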
---
## Examples — do NOT ask ops-warden
| Request | Correct path |
| --- | --- |
| "Populate `OPENROUTER_API_KEY` for llm-connect" | Operator → OpenBao/K8s Secret in `activity-core` namespace |
| "Store Inter-Hub admin key for bootstrap" | Operator → OpenBao or `IHUB_OPERATOR_KEY_FILE` (`CUST-WP-0049`) |
| "Give me Vault root token" | Break-glass ceremony → `railiance-platform/docs/openbao.md` |
| "S3 credentials for artifact upload" | NK-WP-0007 / artifact-store consumer path |
| "JWT for my app" | key-cape / Keycloak IAM Profile |
---
## Examples — ops-warden IS correct
| Request | Command / pattern |
| --- | --- |
| ops-bridge tunnel needs a cert | `cert_command: warden sign <actor> --pubkey <path>` |
| Agent reaching bootstrap host | `agt-codex-interhub-bootstrap` — `wiki/InterHubBootstrapAccessLane.md` |
| Check cert expiry before shift | `warden status <actor>` |
| New tunnel actor | `warden inventory add` — `wiki/ActorInventoryPatterns.md` |
| Lab without OpenBao | `backend: local` — `wiki/OpsWardenConfig.md` |
---
## Typical flows
### Human operator → remote host
1. Identity: key-cape login if web/API access needed (optional for pure SSH).
2. SSH cert: `warden sign adm-<you> --pubkey ~/.ssh/id_ed25519.pub`.
3. Tunnel (if needed): ops-bridge with `cert_command` pointing at warden.
4. Host: principal deployed by railiance-infra.
### Kaizen / Codex agent → attended task
1. Register actor: `agt-codex-<task>` per `wiki/ActorInventoryPatterns.md`.
2. SSH cert: `WARDEN_ACTOR=... ops-ssh-wrapper ssh ...` or `warden sign`.
3. Secrets for task (API keys): OpenBao path — not warden.
4. Tunnel: ops-bridge if required.
### CI automation → scheduled job
1. Actor: `atm-<job>` with narrow principal and low TTL (≤ 8 h).
2. `warden issue atm-<job>` or sign with pre-provisioned key.
3. No long-lived keys in CI env vars.
---
## When guidance drifts
NetKingdom security architecture is canonical in `net-kingdom`. When it
changes (OpenBao, IAM Profile, new bootstrap lanes), ops-warden updates:
- This file
- `wiki/NetKingdomSecurityMap.md`
- `SCOPE.md` / `INTENT.md` as needed
Report drift via custodian workplan or State Hub message to `ops-warden`.
---
## See also
- `INTENT.md` — steward mission
- `wiki/NetKingdomSecurityMap.md` — component literacy
- `wiki/ActorInventoryPatterns.md` — actor naming
- `wiki/OpenBaoSshEngineChecklist.md` — production SSH signing verify
- `net-kingdom/docs/platform-identity-security-architecture.md` — platform canon


@@ -0,0 +1,101 @@
# NetKingdom Security Map (ops-warden view)
Date: 2026-06-17
Condensed literacy guide for ops-warden stewards and development workers.
Canonical source remains `net-kingdom/docs/platform-identity-security-architecture.md`.
ops-warden **implements** the operational SSH lane and **documents** how the
other lanes connect.
---
## Planes
```text
Bootstrap plane railiance-infra, railiance-cluster, net-kingdom bootstrap
Platform control key-cape, flex-auth, OpenBao, Topaz, railiance-platform
Tenant plane railiance-apps, coulomb workloads, future tenants
Operational access ops-warden (SSH certs), ops-bridge (tunnels)
```
---
## Component map
| Component | Answers | Credential types | ops-warden |
| --- | --- | --- | --- |
| **key-cape** | Who are you? (lightweight IAM) | OIDC tokens, MFA | Route — do not issue |
| **Keycloak** | Who are you? (expanded IAM) | OIDC/SAML federation | Route — do not issue |
| **privacyIDEA** | MFA / step-up | OTP, hardware tokens | Route — do not issue |
| **flex-auth** | May you do this action? | Policy decisions, audit envelopes | Future SSH pre-sign; route today |
| **Topaz** | PDP runtime for flex-auth | Authorization evaluations | Route — do not issue |
| **OpenBao** | Runtime secret authority | API keys, DB creds, leases, K8s auth | SSH engine **signing backend** only |
| **ops-warden** | SSH ops access | Short-lived SSH certificates | **Own and issue** |
| **ops-bridge** | Tunnel transport | Uses certs via cert_command | Consumer |
| **railiance-infra** | Host enforcement | auth_principals, sshd | Route — deploy hosts |
| **railiance-platform** | Platform deploy | OpenBao, Postgres, ingress | Route — do not deploy from warden |
---
## Credential lanes (summary)
| Lane | Owner | Lifetime | Worker entrypoint |
| --- | --- | --- | --- |
| Identity | key-cape / Keycloak | Session / token TTL | Login / OIDC |
| Authorization | flex-auth | Per request | Policy API / embedded PEP |
| Runtime secrets | OpenBao | Lease-bound | `bao` CLI, K8s ESO, app integration |
| SSH operational | ops-warden | adm 48h / agt 24h / atm 8h | `warden sign` |
| Tunnel | ops-bridge | Session | `bridge` + cert_command |
Full routing: `wiki/CredentialRouting.md`.
---
## Trust flow (simplified)
```text
Worker request
-> Identity? key-cape / Keycloak
-> Authorized? flex-auth
-> Secret material? OpenBao
-> SSH cert? ops-warden
-> Tunnel? ops-bridge (cert from warden)
-> Host accepts? railiance-infra principals
```
OpenBao does **not** replace identity or authorization. flex-auth decides;
OpenBao stores/issues; ops-warden signs SSH certs when host reachability is
the need.
---
## NetKingdom documents to watch
| Document | Why ops-warden cares |
| --- | --- |
| `platform-identity-security-architecture.md` | Planes, secret path, Operational SSH Path section |
| `responsibility-map.md` | Operational SSH dependency section |
| `platform-root-custody.md` | OpenBao ceremony — not warden's job |
| `object-storage-sts-credential-vending.md` | S3 creds — never warden |
| `canon/standards/iam-profile_v0.2.md` | Claims for future policy-gated sign |
When these change, update ops-warden wiki and `wiki/CredentialRouting.md`.
---
## Recursive platform rule
Tenant admins (including `tenant:coulomb`) must not gain platform-root
authority. ops-warden SSH actors should use **narrow principals** for agent
and automation work — not platform-admin equivalents on hosts.
---
## See also
- `INTENT.md`
- `wiki/CredentialRouting.md`
- `wiki/PolicyGatedSigning.md` (future flex-auth hook)
- `net-kingdom/docs/platform-identity-security-architecture.md`


@@ -0,0 +1,172 @@
# OpenBao SSH Engine — Operational Checklist
Date: 2026-06-17
Verify the production SSH signing path for `warden` against platform OpenBao.
Cluster bootstrap and unseal are **not** ops-warden scope — see
`railiance-platform/docs/openbao.md`.
---
## Prerequisites
- [ ] OpenBao deployed on Railiance (`railiance-platform` helm/Makefile)
- [ ] `bao status` reports initialized and **unsealed**
- [ ] Operator has **scoped token** — not root token in `VAULT_TOKEN` for daily warden use
- [ ] `warden.yaml` points `vault.addr` at correct endpoint:
- Workstation: `https://bao.coulomb.social`
- In-cluster: `http://openbao.openbao.svc.cluster.local:8200`
- [ ] Actor exists in inventory — `wiki/ActorInventoryPatterns.md`
- [ ] Test pubkey available (mode 600 private key, never commit)
---
## One-time SSH engine setup (operator)
Run with OpenBao admin policy — not from agent chat logs.
```bash
# Confirm reachability
bao status
# Enable SSH secrets engine (skip if already enabled)
bao secrets enable ssh
# Roles — TTL max must match ActorType policy (wiki/OpsWardenConfig.md)
bao write ssh/roles/agt-role \
key_type=ca \
allowed_users="*" \
allow_user_certificates=true \
default_user="agt" \
ttl=24h max_ttl=24h
bao write ssh/roles/adm-role \
key_type=ca \
allowed_users="*" \
allow_user_certificates=true \
default_user="adm" \
ttl=48h max_ttl=48h
bao write ssh/roles/atm-role \
key_type=ca \
allowed_users="*" \
allow_user_certificates=true \
default_user="atm" \
ttl=8h max_ttl=8h
# Verify roles listed
bao list ssh/roles
```
Document CA public key distribution to hosts via railiance-infra — warden does
not deploy `TrustedUserCAKeys`.
---
## Token policy expectations
| Rule | Rationale |
| --- | --- |
| No root token in `VAULT_TOKEN` for warden workflows | Root is break-glass only |
| Token scoped to `ssh/sign/<role>` for needed roles | Least privilege |
| Short TTL on operator tokens | Limit blast radius |
| Prefer OIDC/login-derived tokens via KeyCape where available | Platform admin path |
Example policy shape (illustrative — adjust in OpenBao policy admin):
```hcl
path "ssh/sign/agt-role" {
capabilities = ["create", "update"]
}
```
---
## warden.yaml sanity check
```yaml
backend: vault
vault:
addr: https://bao.coulomb.social
mount: ssh
role_map:
adm: adm-role
agt: agt-role
atm: atm-role
token_env: VAULT_TOKEN
```
---
## Verification procedure
```bash
export VAULT_TOKEN="<scoped-token>" # never paste in chat or commit
# 1. Config loads
warden status --help
# 2. Sign test actor (replace actor and pubkey paths)
warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub \
| head -c 80 && echo "..."
# 3. Metadata
warden status agt-state-hub-bridge
# 4. Audit line
warden log --actor agt-state-hub-bridge --last 1
# 5. Compliance
warden scorecard
```
**Pass criteria:**
- Exit code 0 on sign and status
- Cert `valid_before` in the future
- `signatures.log` has new JSONL line with `"backend": "vault"`
- Scorecard passes on clean state dir
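The audit-line criterion can be checked mechanically. The sketch below assumes only what the pass criteria state: `signatures.log` is JSONL and each entry carries a `"backend"` field. Any other field is an unknown and is not relied on. `last_entry_backend` is a hypothetical helper.

```python
import json
from typing import Optional


def last_entry_backend(log_text: str) -> Optional[str]:
    """Return the "backend" value of the last JSONL entry, or None if empty."""
    lines = [line for line in log_text.splitlines() if line.strip()]
    if not lines:
        return None
    return json.loads(lines[-1]).get("backend")
```

An operator could read the log file contents, pass them in, and treat any result other than `"vault"` as a failed verification.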
---
## cert_command smoke (ops-bridge)
In `tunnels.yaml`, set:
```yaml
cert_command: "warden sign <actor> --pubkey <path/to>.pub"
```
Bring up tunnel; confirm SSH connects with cert + key (ops-bridge docs).
---
## Failure modes
| Symptom | Likely cause | Action |
| --- | --- | --- |
| `Vault token not found` | `VAULT_TOKEN` unset | Log in or issue a scoped token, then export `VAULT_TOKEN` |
| HTTP 403 from OpenBao | Token lacks sign permission | Fix policy |
| `No Vault role mapped` | `role_map` mismatch | Fix warden.yaml |
| `ttl exceeds max` | Inventory TTL > ActorType max | Fix inventory or role |
| Connection refused | Wrong `addr` or OpenBao sealed | Check platform ops |
| Host rejects cert | Principal not on host | railiance-infra auth_principals |
**Lab fallback:** `backend: local` in warden.yaml — **not** a production substitute.
Use only for offline dev when OpenBao is unreachable.
---
## Boundaries
- ops-warden does not unseal OpenBao or rotate unseal keys
- ops-warden does not store API keys alongside SSH signing
- Host trust of CA pubkey is railiance-infra responsibility
---
## See also
- `wiki/OpsWardenConfig.md`
- `railiance-platform/docs/openbao.md`
- `wiki/CredentialRouting.md`

wiki/PolicyGatedSigning.md

@@ -0,0 +1,129 @@
# Policy-Gated SSH Signing (design)
Date: 2026-06-17
Status: **design only** — not implemented in WARDEN-WP-0006
Today `warden sign` authorizes via **inventory allow-list** and TTL policy only.
This document proposes flex-auth integration so SSH issuance matches the
NetKingdom authorization path before OpenBao/SSH engine signing.
---
## Problem
Inventory-only gating is sufficient for early ops but weak for:
- many agents and automations across tenants
- temporary elevation without inventory edits
- unified audit with flex-auth decision envelopes
- aligning SSH issuance with IAM Profile claims
---
## Target flow (v2)
```text
warden sign <actor> --pubkey <path>
|
v
Load actor from inventory (type, principals, ttl)
|
v
Obtain identity claims (optional v2.1)
OIDC token / env-injected JWT from key-cape session
|
v
flex-auth Evaluate
resource: ssh-certificate / actor:<name>
action: sign
context: tenant, principal list, pubkey fingerprint, requestor
|
+-- DENY -> CAError with flex-auth explanation
|
v ALLOW
CABackend.sign() (local or OpenBao SSH engine)
|
v
Append signatures.log (+ optional flex-auth audit correlation id)
```
---
## flex-auth request shape (proposed)
| Field | Source |
| --- | --- |
| `subject` | IAM Profile `sub` or service identity |
| `tenant` | `tenant:platform` or `tenant:coulomb` |
| `resource` | `ssh-cert:actor/<actor-name>` |
| `action` | `sign` |
| `context.principals` | From inventory |
| `context.actor_type` | adm \| agt \| atm |
| `context.pubkey_fingerprint` | SHA256 of pubkey |
| `context.ttl_hours` | Requested TTL |
Decision envelope should return `allow` \| `deny` and `audit_correlation_id`
stored in `signatures.log`.
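To make the proposed shape concrete, here is a sketch that assembles such a request from an inventory entry. The nesting under `context` is an assumption read from the dotted field names, and `build_sign_request` is hypothetical; flex-auth's real request schema is still to be agreed. The fingerprint follows the OpenSSH convention (SHA256 over the decoded key blob, base64 without padding).

```python
import base64
import hashlib


def pubkey_fingerprint(pubkey_line: str) -> str:
    """OpenSSH-style SHA256 fingerprint of a one-line public key."""
    # A public key line is "<key-type> <base64-blob> [comment]".
    blob = base64.b64decode(pubkey_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def build_sign_request(subject, tenant, actor_name, actor, pubkey_line):
    """Assemble the proposed flex-auth request (hypothetical schema)."""
    return {
        "subject": subject,
        "tenant": tenant,
        "resource": f"ssh-cert:actor/{actor_name}",
        "action": "sign",
        "context": {
            "principals": list(actor["principals"]),
            "actor_type": actor["type"],
            "pubkey_fingerprint": pubkey_fingerprint(pubkey_line),
            "ttl_hours": actor["ttl_hours"],
        },
    }
```

Sending the fingerprint rather than the key itself keeps the request small while still letting the decision be correlated with the issued certificate.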
---
## Versioning
| Version | Gate | Notes |
| --- | --- | --- |
| **v1 (today)** | Inventory + TTL max | Shipped |
| **v2** | flex-auth required for `backend: vault` production | Config flag |
| **v2.1** | Identity claims required for `adm` signs | OIDC from key-cape |
| **v3** | Tenant-scoped policies per `tenant:*` | NK recursive rule |
---
## Configuration sketch (future)
```yaml
# warden.yaml — not implemented
policy:
enabled: true
flex_auth_url: http://flex-auth.flex-auth.svc.cluster.local:8080
require_identity_for_adm: true
fail_closed: true
```
`fail_closed: true` — if flex-auth unreachable, deny sign (no silent bypass).
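The fail-closed rule can be sketched as a thin wrapper around whatever client eventually calls flex-auth. Everything here is hypothetical: `evaluate` stands in for that client and is assumed to return `"allow"` or `"deny"`, raising an `OSError` subclass (such as `ConnectionError`) when flex-auth is unreachable.

```python
def gate_sign(evaluate, request, fail_closed=True):
    """Return True if signing may proceed under the policy gate."""
    try:
        decision = evaluate(request)
    except OSError:
        # flex-auth unreachable: deny when fail_closed, otherwise let the
        # request fall through to the inventory-only gate.
        return not fail_closed
    return decision == "allow"
```

With `fail_closed=True` an outage blocks issuance instead of silently skipping the policy check. Any response other than an explicit `"allow"` is treated as a denial.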
---
## What stays in inventory (v2)
- Actor registration (name, type, default principals, default TTL)
- Host reference documentation
- Scorecard local checks
flex-auth decides **whether this sign request is allowed now**; inventory
defines **what the actor is allowed to request**.
---
## Non-goals (this design)
- flex-auth implementation changes in WP-0006
- Replacing OpenBao SSH engine with flex-auth
- Storing flex-auth policies in ops-warden repo
---
## Implementation follow-up
Promote to **WARDEN-WP-0007** (proposed) after:
1. flex-auth resource type for `ssh-certificate` agreed
2. NK platform policy for platform vs tenant sign paths
3. Operator approval for `fail_closed` production behavior
---
## See also
- `flex-auth/INTENT.md`
- `wiki/CredentialRouting.md`
- `net-kingdom/docs/platform-identity-security-architecture.md`


@@ -4,7 +4,7 @@ type: workplan
title: "NetKingdom Alignment and Operational Access Stewardship"
domain: custodian
repo: ops-warden
status: ready
status: finished
owner: codex
topic_slug: custodian
planning_priority: high
@@ -24,13 +24,6 @@ to optional CLI ergonomics.
**Out of scope:** flex-auth integration implementation, OpenBao cluster deploy,
universal credential broker, net-kingdom INTENT.md rewrite.
**References:**
- `INTENT.md`, `SCOPE.md`, `history/2026-06-17-intent-scope-assessment.md`
- `net-kingdom/docs/platform-identity-security-architecture.md`
- `net-kingdom/docs/responsibility-map.md`
- `NK-WP-0009` (SSH tutorial, net-kingdom)
---
## Goal
@@ -51,147 +44,95 @@ After this workplan, a development worker or agent can:
```task
id: WARDEN-WP-0006-T01
status: todo
status: done
priority: high
state_hub_task_id: "ffc6a0c2-4312-4584-be7a-c8411cb01899"
```
Create `wiki/CredentialRouting.md`:
- Decision tree: SSH vs runtime secret vs identity vs authorization vs tunnel
- Per-subsystem links (OpenBao, flex-auth, key-cape, ops-bridge, railiance-infra)
- Explicit “do not ask ops-warden for API keys” examples
- Link from `SCOPE.md`, `INTENT.md`, `README.md`
**Done when:** A worker with no prior context can route a credential request in
under two minutes using this page alone.
- [x] `wiki/CredentialRouting.md` with decision tree and anti-examples
- [x] Linked from SCOPE, INTENT, README
### T2 — Actor inventory patterns
```task
id: WARDEN-WP-0006-T02
status: todo
status: done
priority: high
state_hub_task_id: "3816463d-7dfd-469d-9324-fd7880b50608"
```
Create `wiki/ActorInventoryPatterns.md` with standard patterns:
- Tunnel agents (`agt-*-bridge`)
- Kaizen / codex agents (`agt-codex-*`)
- CI automations (`atm-*`)
- Human admins (`adm-*`)
- TTL and principal narrowing guidance
Optional: `examples/inventory.seed.yaml` (non-secret, Git-safe template).
**Done when:** Adding a new dev worker actor does not require inventing naming
from scratch.
- [x] `wiki/ActorInventoryPatterns.md`
- [x] `examples/inventory.seed.yaml`
### T3 — NetKingdom cross-links (ops-warden side)
```task
id: WARDEN-WP-0006-T03
status: todo
status: done
priority: high
state_hub_task_id: "f158366a-5746-48b8-acce-472dce8f925e"
```
- Add `wiki/NetKingdomSecurityMap.md` — condensed literacy table from INTENT
- Update `registry/capabilities/capability.security.ssh-certificate-issuance.md`
summary to mention stewardship/routing
- Update `.claude/rules/repo-boundary.md` with NetKingdom routing table
**Done when:** ops-warden docs stand alone for NetKingdom operational access
orientation without reading net-kingdom first (but link to canon).
- [x] `wiki/NetKingdomSecurityMap.md`
- [x] Registry capability stewardship summary
- [x] `.claude/rules/repo-boundary.md` routing table
### T4 — NetKingdom canon patch (coordination)
```task
id: WARDEN-WP-0006-T04
status: todo
status: done
priority: medium
state_hub_task_id: "e40e4395-8f01-4f79-a539-d0de8e427321"
```
Coordinate updates in `net-kingdom` (separate commit/PR there):
- `docs/responsibility-map.md` — move ops-warden from pure out-of-scope to
**operational SSH credential dependency**
- `docs/platform-identity-security-architecture.md` — add Operational SSH Path
(ops-warden → ops-bridge → hosts)
**Done when:** NetKingdom canon names ops-warden's lane; ops-warden wiki links
back to the updated sections.
**Note:** Requires `net-kingdom` repo write access; may need `needs_human` if
blocked on review.
- [x] `net-kingdom/docs/responsibility-map.md` — Operational SSH dependency
- [x] `net-kingdom/docs/platform-identity-security-architecture.md` — Operational SSH Path
### T5 — OpenBao SSH engine operational checklist
```task
id: WARDEN-WP-0006-T05
status: todo
status: done
priority: medium
state_hub_task_id: "a94e20a2-970b-4a0c-bd23-8510b841b938"
```
Create `wiki/OpenBaoSshEngineChecklist.md`:
- Prerequisites (OpenBao initialized/unsealed per railiance-platform)
- Role creation commands (from OpsWardenConfig)
- Token policy expectations (no root token in warden workflows)
- Verification: `warden sign` against production endpoint
- Failure modes and fallback boundaries
**Done when:** Operator can verify production SSH signing path without
reconstructing steps from multiple repos.
- [x] `wiki/OpenBaoSshEngineChecklist.md`
### T6 — Policy-gated signing design (design only)
```task
id: WARDEN-WP-0006-T06
status: todo
status: done
priority: low
state_hub_task_id: "b10a4b4d-bfa1-4f49-b6a5-f339f1e6a2e1"
```
Create `wiki/PolicyGatedSigning.md`:
- flex-auth decision before `warden sign` — proposed flow
- Claims needed from IAM Profile
- What stays inventory-based in v1 vs policy-based in v2
- Explicit non-implementation in this workplan
**Done when:** Reviewable design exists; no code dependency.
- [x] `wiki/PolicyGatedSigning.md`
### T7 — Re-assess INTENT ↔ SCOPE
```task
id: WARDEN-WP-0006-T07
status: todo
status: done
priority: medium
state_hub_task_id: "ef8b5c57-2343-4cfc-9fee-48db1e56f69a"
```
After T1–T5 are complete:
- Update `history/2026-06-17-intent-scope-assessment.md` or add
`history/YYYYMMDD-intent-scope-reassessment.md`
- Refresh SCOPE.md Current State and completeness notes
- Run `make fix-consistency REPO=ops-warden`
**Done when:** Completeness target C3+ documented with evidence.
- [x] `history/2026-06-17-intent-scope-reassessment.md`
- [x] SCOPE.md Current State updated
- [x] `make fix-consistency REPO=ops-warden`
---
## Acceptance Criteria
- [ ] `wiki/CredentialRouting.md` exists and is linked from README/SCOPE
- [ ] `wiki/ActorInventoryPatterns.md` exists
- [ ] `wiki/NetKingdomSecurityMap.md` exists
- [ ] NetKingdom responsibility-map recognizes ops-warden SSH lane (T4)
- [ ] OpenBao SSH checklist documented (T5)
- [ ] Policy-gated signing design drafted (T6)
- [ ] INTENT ↔ SCOPE re-assessment recorded (T7)
- [ ] `reuse-surface validate --root .` passes if registry entry changed
- [x] `wiki/CredentialRouting.md` exists and is linked from README/SCOPE
- [x] `wiki/ActorInventoryPatterns.md` exists
- [x] `wiki/NetKingdomSecurityMap.md` exists
- [x] NetKingdom responsibility-map recognizes ops-warden SSH lane (T4)
- [x] OpenBao SSH checklist documented (T5)
- [x] Policy-gated signing design drafted (T6)
- [x] INTENT ↔ SCOPE re-assessment recorded (T7)
- [x] `reuse-surface validate --root .` passes