From 1865e0744e6aedbdece0b2047dd507d1fdfb1e32 Mon Sep 17 00:00:00 2001 From: tegwick Date: Wed, 17 Jun 2026 08:22:45 +0200 Subject: [PATCH] WARDEN-WP-0006: NetKingdom stewardship docs and alignment MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add credential routing, actor patterns, security map, OpenBao SSH checklist, and policy-gated signing design. Update registry and SCOPE; record INTENT↔SCOPE reassessment (C3 completeness). --- .claude/rules/repo-boundary.md | 22 ++- INTENT.md | 8 +- README.md | 4 +- SCOPE.md | 27 ++- examples/inventory.seed.yaml | 41 +++++ .../2026-06-17-intent-scope-reassessment.md | 74 ++++++++ ...ility.security.ssh-certificate-issuance.md | 7 +- registry/indexes/capabilities.yaml | 2 +- wiki/ActorInventoryPatterns.md | 141 ++++++++++++++ wiki/CredentialRouting.md | 140 ++++++++++++++ wiki/NetKingdomSecurityMap.md | 101 ++++++++++ wiki/OpenBaoSshEngineChecklist.md | 172 ++++++++++++++++++ wiki/PolicyGatedSigning.md | 129 +++++++++++++ ...ingdom-alignment-and-access-stewardship.md | 119 +++--------- 14 files changed, 879 insertions(+), 108 deletions(-) create mode 100644 examples/inventory.seed.yaml create mode 100644 history/2026-06-17-intent-scope-reassessment.md create mode 100644 wiki/ActorInventoryPatterns.md create mode 100644 wiki/CredentialRouting.md create mode 100644 wiki/NetKingdomSecurityMap.md create mode 100644 wiki/OpenBaoSshEngineChecklist.md create mode 100644 wiki/PolicyGatedSigning.md diff --git a/.claude/rules/repo-boundary.md b/.claude/rules/repo-boundary.md index 8b0715e..3a1e360 100644 --- a/.claude/rules/repo-boundary.md +++ b/.claude/rules/repo-boundary.md @@ -11,7 +11,23 @@ This repo owns **ops-warden** only. 
It does not own: | State Hub service code and consistency tooling | `state-hub` | | Workstream coordination across custodian domain | `the-custodian` | | Human admin SSH key generation | self-service (`ssh-keygen`) | +| Identity / OIDC / MFA | `key-cape`, Keycloak | +| Authorization policy | `flex-auth` | +| Runtime secrets (non-SSH) | OpenBao | -ops-warden issues **short-lived SSH certificates** only. It is not a general -secrets manager and must not store long-lived API keys in Git, State Hub, or -workplans. \ No newline at end of file +## NetKingdom credential routing (quick reference) + +| Worker need | Route to | ops-warden | +|-------------|----------|------------| +| SSH cert for host/ops access | ops-warden | Issue (`warden sign`) | +| API key / DB cred / lease | OpenBao | Document only — `wiki/CredentialRouting.md` | +| May I perform action X? | flex-auth | Design: `wiki/PolicyGatedSigning.md` | +| Login / MFA / OIDC | key-cape / Keycloak | Document only | +| SSH tunnel | ops-bridge | cert_command consumer | +| Host principals | railiance-infra | Document only | + +Full map: `wiki/NetKingdomSecurityMap.md`. + +ops-warden issues **short-lived SSH certificates** and maintains **operational +access stewardship docs**. It is not a general secrets manager and must not +store long-lived API keys in Git, State Hub, workplans, logs, or chat. \ No newline at end of file diff --git a/INTENT.md b/INTENT.md index 8ea5b9a..67032fc 100644 --- a/INTENT.md +++ b/INTENT.md @@ -219,6 +219,8 @@ routing canon, inventory standards, production OpenBao SSH engine alignment, flex-auth integration design, and NetKingdom cross-links — without collapsing platform boundaries. -See `history/2026-06-17-intent-scope-assessment.md` for the current SCOPE ↔ -INTENT gap analysis and `workplans/WARDEN-WP-0006-netkingdom-alignment-and-access-stewardship.md` -for the active execution plan. 
\ No newline at end of file +See `wiki/CredentialRouting.md` for worker-facing routing, +`wiki/NetKingdomSecurityMap.md` for component literacy, +`history/2026-06-17-intent-scope-assessment.md` for the initial gap analysis, +and `workplans/WARDEN-WP-0006-netkingdom-alignment-and-access-stewardship.md` +for stewardship execution. \ No newline at end of file diff --git a/README.md b/README.md index 94d18cc..b0912ec 100644 --- a/README.md +++ b/README.md @@ -57,7 +57,9 @@ uv run ruff check . ## Documentation - `INTENT.md` — operational access steward mission (NetKingdom-aligned) -- `wiki/CredentialRouting.md` — *planned WP-0006* — which subsystem for each credential type +- `wiki/CredentialRouting.md` — which subsystem for each credential type +- `wiki/NetKingdomSecurityMap.md` — platform security component map +- `wiki/ActorInventoryPatterns.md` — standard adm/agt/atm actor patterns - `wiki/OpsWardenConfig.md` — configuration reference - `wiki/CertCommandInterface.md` — `cert_command` contract for callers - `wiki/InterHubBootstrapAccessLane.md` — short-lived cert envelope for bootstrap tasks diff --git a/SCOPE.md b/SCOPE.md index 1b32492..4dbb32e 100644 --- a/SCOPE.md +++ b/SCOPE.md @@ -52,13 +52,19 @@ Vault-compatible SSH secrets engine API, production). 
- Capability registry entry for SSH certificate issuance - Keeping ops access patterns consistent with `net-kingdom` platform architecture -### Planned (see workplan) +### Stewardship (shipped WP-0006) -- NetKingdom cross-links and responsibility-map alignment -- Credential routing runbook for dev workers -- Standard actor inventory patterns for agents and CI -- flex-auth policy hook design for gated SSH issuance -- Production OpenBao SSH engine operational checklist +- `wiki/CredentialRouting.md` — credential type → subsystem routing +- `wiki/NetKingdomSecurityMap.md` — NetKingdom component literacy +- `wiki/ActorInventoryPatterns.md` + `examples/inventory.seed.yaml` +- `wiki/OpenBaoSshEngineChecklist.md` — production SSH signing verify +- `wiki/PolicyGatedSigning.md` — flex-auth integration design (not implemented) + +### Planned (follow-up) + +- flex-auth policy hook implementation (WARDEN-WP-0007, proposed) +- Live production OpenBao SSH engine verification on Railiance +- NK-WP-0009 SSH tutorial joint with net-kingdom --- @@ -101,8 +107,9 @@ Vault-compatible SSH secrets engine API, production). 
- **SSH CLI:** shipped v0.1.0 (WARDEN-WP-0001–0003) - **Docs:** OpenBao-first config (WARDEN-WP-0005), Inter-Hub bootstrap runbook - **Registry:** `capability.security.ssh-certificate-issuance` published -- **INTENT:** defined 2026-06-17; stewardship layer largely **documentation-only** -- **Gap:** see `history/2026-06-17-intent-scope-assessment.md` +- **INTENT:** operational access steward (2026-06-17) +- **Stewardship docs:** WP-0006 complete — routing, inventory patterns, OpenBao checklist +- **Gap reassessment:** `history/2026-06-17-intent-scope-reassessment.md` --- @@ -166,7 +173,9 @@ keywords: [ssh, certificate, ca, credential, warden, ops-warden, pki, openbao, v | --- | --- | | `INTENT.md` | Why ops-warden exists and where it is going | | `SCOPE.md` | What is implemented today (this file) | -| `history/2026-06-17-intent-scope-assessment.md` | INTENT ↔ SCOPE gaps | +| `wiki/CredentialRouting.md` | Which subsystem for each credential need | +| `wiki/NetKingdomSecurityMap.md` | Platform security component map | +| `history/2026-06-17-intent-scope-reassessment.md` | Latest INTENT ↔ SCOPE assessment | | `wiki/AccessManagementDirective.md` | SSH actor model | | `wiki/OpsWardenConfig.md` | warden.yaml and OpenBao | | `wiki/CertCommandInterface.md` | cert_command contract | diff --git a/examples/inventory.seed.yaml b/examples/inventory.seed.yaml new file mode 100644 index 0000000..3fd272c --- /dev/null +++ b/examples/inventory.seed.yaml @@ -0,0 +1,41 @@ +# Non-secret inventory template — copy to ~/.config/warden/inventory.yaml +# and adjust for your environment. Do not commit real operator paths or keys. 
+# +# See wiki/ActorInventoryPatterns.md and wiki/OpsWardenConfig.md + +actors: + agt-state-hub-bridge: + type: agt + principals: + - agt-task-bridge + ttl_hours: 24 + description: "ops-bridge tunnel agent for state-hub" + + agt-codex-interhub-bootstrap: + type: agt + principals: + - agt-interhub-bootstrap + ttl_hours: 2 + description: "Short-lived agent access for attended Inter-Hub bootstrap" + + adm-example: + type: adm + principals: + - adm-full + ttl_hours: 48 + description: "Example human operator — replace with per-person adm-* actors" + + atm-backup-daily: + type: atm + principals: + - atm-backup-daily + ttl_hours: 8 + description: "Example nightly automation actor" + +hosts: + example-host: + allowed_principals: + agt: + - agt-task-bridge + atm: + - atm-backup-daily \ No newline at end of file diff --git a/history/2026-06-17-intent-scope-reassessment.md b/history/2026-06-17-intent-scope-reassessment.md new file mode 100644 index 0000000..f98c4c0 --- /dev/null +++ b/history/2026-06-17-intent-scope-reassessment.md @@ -0,0 +1,74 @@ +# INTENT ↔ SCOPE Reassessment — ops-warden + +**Date:** 2026-06-17 +**Author:** codex +**Trigger:** WARDEN-WP-0006 complete (T1–T7). +**Prior assessment:** `history/2026-06-17-intent-scope-assessment.md` + +--- + +## 1. Executive summary + +WARDEN-WP-0006 closed the primary **stewardship documentation gaps**. ops-warden +now has worker-facing credential routing, NetKingdom security literacy, actor +inventory patterns, OpenBao SSH verification checklist, and flex-auth integration +design. NetKingdom canon updated (`responsibility-map`, platform architecture +Operational SSH Path). 
+ +**Vector movement:** `D4/A3/C2/R2` → **`D5/A3/C3/R2`** + +| Dimension | Was | Now | Notes | +| --- | --- | --- | --- | +| Discovery | D4 | **D5** | Routing + security map + NK canon cross-links | +| Availability | A3 | A3 | CLI unchanged; no desk API yet | +| Completeness | C2 | **C3** | Stewardship operationalized in wiki; policy gate not coded | +| Reliability | R2 | R2 | Production OpenBao sign still operator-verified, not CI-proven | + +--- + +## 2. Deliverables (WP-0006) + +| Task | Deliverable | Status | +| --- | --- | --- | +| T1 | `wiki/CredentialRouting.md` | Done | +| T2 | `wiki/ActorInventoryPatterns.md`, `examples/inventory.seed.yaml` | Done | +| T3 | `wiki/NetKingdomSecurityMap.md`, registry, repo-boundary | Done | +| T4 | net-kingdom responsibility-map + platform SSH path | Done | +| T5 | `wiki/OpenBaoSshEngineChecklist.md` | Done | +| T6 | `wiki/PolicyGatedSigning.md` | Done (design) | +| T7 | This reassessment | Done | + +--- + +## 3. Success criteria (INTENT.md) — updated + +| Criterion | Was | Now | +| --- | --- | --- | +| Worker knows which subsystem for each credential type | No | **Yes** — `wiki/CredentialRouting.md` | +| SSH access short-lived, inventoried, audited | Yes (tooling) | **Yes** — + patterns seed | +| ops-bridge integrates via cert_command | Yes (contract) | Yes | +| NetKingdom evolution reflected in ops-warden docs | Partial | **Yes** — NK canon patched + security map | +| Non-SSH secrets stay out of ops-warden | Yes | Yes | + +**Score: 4 yes, 1 unchanged (live tunnel matrix)** + +--- + +## 4. 
Remaining gaps (next work) + +| Prio | Gap | Proposed work | +| --- | --- | --- | +| P1 | Production OpenBao SSH sign not executed in CI | Operator run checklist on Railiance; log evidence | +| P2 | flex-auth pre-sign not implemented | WARDEN-WP-0007 from `wiki/PolicyGatedSigning.md` | +| P3 | NK-WP-0009 tutorial not joint | Coordinate net-kingdom SSH tutorial | +| P4 | Optional `warden guide` CLI | Ad hoc if doc-only routing insufficient | + +--- + +## 5. Recommendation + +Mark **WARDEN-WP-0006 finished**. Open **WARDEN-WP-0007** when ready for +flex-auth integration or production OpenBao verification milestone. + +**Completeness C3** is justified: central stewardship use case (routing + alignment) +works; SSH issuance was already C3; policy gate remains bounded known gap. \ No newline at end of file diff --git a/registry/capabilities/capability.security.ssh-certificate-issuance.md b/registry/capabilities/capability.security.ssh-certificate-issuance.md index 9c6a450..0a0e10b 100644 --- a/registry/capabilities/capability.security.ssh-certificate-issuance.md +++ b/registry/capabilities/capability.security.ssh-certificate-issuance.md @@ -1,7 +1,7 @@ --- id: capability.security.ssh-certificate-issuance name: SSH Certificate Issuance -summary: Issue short-lived CA-signed SSH certificates for adm, agt, and atm actors through a stable cert_command CLI interface. +summary: Issue short-lived CA-signed SSH certificates for adm, agt, and atm actors through a stable cert_command CLI interface; steward operational access routing across NetKingdom security lanes. owner: ops-warden status: draft domain: helix_forge @@ -62,13 +62,15 @@ discovery: intent: > Give the ops fleet short-lived SSH credentials for humans, agents, and automations without static keys, through a single cert_command surface that - callers can rely on regardless of CA backend. 
+ callers can rely on regardless of CA backend; route non-SSH credential needs + to the correct NetKingdom subsystems (OpenBao, flex-auth, key-cape). includes: - certificate signing for adm, agt, and atm actors - actor principals inventory and TTL policy - cert_command interface (`warden sign`) - cert-side compliance scorecard and signatures log - ops-ssh-wrapper for automatic cert acquisition + - NetKingdom credential routing and alignment documentation excludes: - tunnel lifecycle - host /etc/ssh/auth_principals deployment @@ -108,6 +110,7 @@ consumer_guidance: - issuing short-lived SSH certs for ops-bridge tunnels - agent or automation access with TTL-bound principals - checking cert-side compliance before rotation windows + - orienting dev workers on which NetKingdom subsystem owns each credential type not_recommended_for: - storing OpenRouter or Inter-Hub API keys - replacing OpenBao deployment or host SSH hardening playbooks diff --git a/registry/indexes/capabilities.yaml b/registry/indexes/capabilities.yaml index 7811a80..218a9ab 100644 --- a/registry/indexes/capabilities.yaml +++ b/registry/indexes/capabilities.yaml @@ -5,7 +5,7 @@ capabilities: - id: capability.security.ssh-certificate-issuance name: SSH Certificate Issuance summary: Issue short-lived CA-signed SSH certificates for adm, agt, and atm actors - through a stable cert_command CLI interface. + through a stable cert_command CLI interface; steward NetKingdom operational access routing. vector: D4 / A3 / C3 / R2 domain: helix_forge status: draft diff --git a/wiki/ActorInventoryPatterns.md b/wiki/ActorInventoryPatterns.md new file mode 100644 index 0000000..23d8c72 --- /dev/null +++ b/wiki/ActorInventoryPatterns.md @@ -0,0 +1,141 @@ +# Actor Inventory Patterns + +Date: 2026-06-17 + +Standard naming and TTL patterns for `~/.config/warden/inventory.yaml` (or +Git-tracked inventory in your environment). Actor names **must** use the prefix +matching `ActorType`: `adm-`, `agt-`, `atm-`. 
+ +See `wiki/AccessManagementDirective.md` for policy background and +`examples/inventory.seed.yaml` for a copy-paste template. + +--- + +## Naming convention + +```text +<type>-<scope>-<purpose>[-<instance>] +``` + +| Segment | Meaning | +| --- | --- | +| `type` | `adm` \| `agt` \| `atm` | +| `scope` | team, repo, or environment slug (`codex`, `state-hub`, `ci`) | +| `purpose` | narrow function (`bridge`, `bootstrap`, `backup`) | +| `instance` | optional disambiguator (`railiance01`) | + +**Examples:** `agt-state-hub-bridge`, `agt-codex-interhub-bootstrap`, `atm-nightly-backup`. + +--- + +## Pattern catalog + +### Tunnel agents (`agt`) + +Used by ops-bridge `cert_command` for SSH tunnels. + +```yaml +agt-state-hub-bridge: + type: agt + principals: [agt-task-bridge] + ttl_hours: 24 + description: "ops-bridge tunnel to state-hub backend" +``` + +- One actor per tunnel identity (match `ssh_user` / `actor` in `tunnels.yaml`). +- Principal should match host `auth_principals` entry deployed by railiance-infra. +- TTL default 24 h; shorten for high-risk paths. + +### Kaizen / Codex agents (`agt`) + +Attended or semi-attended agent work on trusted hosts. + +```yaml +agt-codex-interhub-bootstrap: + type: agt + principals: [agt-interhub-bootstrap] + ttl_hours: 2 + description: "Short-lived agent access for Inter-Hub bootstrap execution" +``` + +- Prefer **1–2 h TTL** for bootstrap; never multi-day agent certs. +- Principal narrower than general ops access (`agt-interhub-bootstrap` not `agt-ops-full`). +- Remove or disable actor when lane is retired. +- See `wiki/InterHubBootstrapAccessLane.md`. + +### Human operators (`adm`) + +```yaml +adm-bernd: + type: adm + principals: [adm-full] + ttl_hours: 48 + description: "Human operator — interactive shell when policy allows" +``` + +- Humans bring their own keypair (`ssh-keygen`); warden signs pubkey only. +- Use separate actors per person, not shared `adm-shared`. +- Principals may be narrowed (`adm-readonly`) where railiance-infra supports it.
+ +### CI / cron automations (`atm`) + +```yaml +atm-backup-daily: + type: atm + principals: [atm-backup-daily] + ttl_hours: 8 + description: "Nightly backup automation — force-command on host" +``` + +- Lowest TTL practical (≤ 8 h per directive max). +- Principal tied to single force-command on host. +- Prefer `warden issue` only in secured CI secret store contexts. + +--- + +## TTL guidance + +| Type | Default max (warden) | Typical attended | Typical automation | +| --- | --- | --- | --- | +| `adm` | 48 h | 24–48 h | N/A | +| `agt` | 24 h | 1–4 h bootstrap | 8–24 h supervised | +| `atm` | 8 h | N/A | 1–8 h | + +`warden sign` **rejects** TTL above type maximum (WARDEN-WP-0002). + +--- + +## Principal narrowing + +1. One principal per automation purpose — avoid `agt-ops-admin`. +2. Match host-side `auth_principals` exactly — coordinate with railiance-infra before add. +3. Document `description` field for audit and scorecard reviews. +4. Use `hosts:` section in inventory for reference (not enforced by warden). + +--- + +## Adding a new worker + +```bash +warden inventory add agt-myrepo-ci \ + --type agt \ + --principal agt-myrepo-ci \ + --ttl 4 \ + --description "CI deploy actor for myrepo" +warden inventory list +warden sign agt-myrepo-ci --pubkey /path/to/ci.pub +``` + +Copy patterns from `examples/inventory.seed.yaml` before inventing new names. 
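The naming and TTL rules above can be checked before an entry ever reaches `warden inventory add`. The following is a minimal sketch, not warden's actual validation code: `check_actor` and `TTL_MAX_HOURS` are hypothetical names, and the limits are copied from the TTL guidance table (adm 48 h, agt 24 h, atm 8 h).

```python
# Hypothetical pre-flight check; not part of the warden CLI.
# Limits mirror the "Default max (warden)" column of the TTL guidance table.
TTL_MAX_HOURS = {"adm": 48, "agt": 24, "atm": 8}


def check_actor(name: str, actor_type: str, ttl_hours: int) -> list[str]:
    """Return a list of problems; an empty list means the entry looks fine."""
    if actor_type not in TTL_MAX_HOURS:
        return [f"unknown actor type {actor_type!r}"]

    problems = []
    # Actor names must carry the prefix matching their type (adm-, agt-, atm-).
    if not name.startswith(f"{actor_type}-"):
        problems.append(f"{name!r} must start with '{actor_type}-'")
    # TTL must be positive and no larger than the per-type maximum.
    limit = TTL_MAX_HOURS[actor_type]
    if not 0 < ttl_hours <= limit:
        problems.append(f"ttl_hours={ttl_hours} outside 1..{limit} for {actor_type}")
    return problems


if __name__ == "__main__":
    print(check_actor("agt-myrepo-ci", "agt", 4))
    print(check_actor("agt-ops", "agt", 72))
```

A check like this only catches the mechanical rules; principal narrowing and host-side `auth_principals` alignment still need review with railiance-infra.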
+ +--- + +## Anti-patterns + +| Do not | Why | +| --- | --- | +| Reuse `adm` actor for agents | Breaks attribution; use `agt-*` | +| Store private keys in inventory YAML | Inventory is registry only — keys live in secure paths | +| 72 h `agt` cert for convenience | Violates TTL policy and directive | +| One `agt-ops` for all tunnels | Cannot revoke or audit per tunnel | +| Put API keys in inventory | Route to OpenBao — `wiki/CredentialRouting.md` | \ No newline at end of file diff --git a/wiki/CredentialRouting.md b/wiki/CredentialRouting.md new file mode 100644 index 0000000..4b16912 --- /dev/null +++ b/wiki/CredentialRouting.md @@ -0,0 +1,140 @@ +# Credential Routing — NetKingdom Access Desk + +Date: 2026-06-17 + +Use this page when a development worker (human, kaizen agent, CI job, or +custodian tool) needs **access or credentials** and is unsure which subsystem +owns the request. + +ops-warden maintains this routing guide. It **issues SSH certificates only**. +For every other credential type, follow the routed path — do not paste secrets +into Git, State Hub, agent chat, or workplans. + +--- + +## Quick decision tree + +```text +What do you need? 
+| ++-- Log in as a human / get OIDC claims / MFA +| -> key-cape (lightweight) or Keycloak (expanded) +| net-kingdom/docs/platform-identity-security-architecture.md +| ++-- Permission to perform an action on a resource +| -> flex-auth (policy decision) +| flex-auth/INTENT.md +| ++-- API key, DB password, provider token, K8s secret, dynamic lease +| -> OpenBao (after flex-auth approval where policy requires it) +| railiance-platform/docs/openbao.md +| NEVER ops-warden +| ++-- S3 / object-storage temporary credentials +| -> NK-WP-0007 vending path (flex-auth + OpenBao + storage STS) +| net-kingdom/docs/object-storage-sts-credential-vending.md +| NEVER ops-warden +| ++-- SSH certificate for host / ops reachability (adm/agt/atm) +| -> ops-warden (warden sign / cert_command) +| wiki/OpsWardenConfig.md +| ++-- SSH tunnel / port forward (already have or will get a cert) +| -> ops-bridge +| ops-bridge tunnels.yaml + cert_command from ops-warden +| ++-- Host accepts your SSH principal / force-command on server +| -> railiance-infra Ansible +| /etc/ssh/auth_principals/, sshd hardening +``` + +**Under two minutes:** match your need to a branch above, open the linked doc, +stop if you landed on "NEVER ops-warden" for non-SSH secrets. + +--- + +## Routing table + +| I need… | Subsystem | ops-warden role | +| --- | --- | --- | +| Interactive login, OIDC token, MFA | key-cape / Keycloak | Document only — use IAM Profile | +| "May I do X on resource Y?" 
| flex-auth (+ Topaz PDP) | Future pre-sign gate for SSH; document only today | +| OpenRouter / LLM provider API key | OpenBao → K8s Secret | **Do not** ask ops-warden | +| Inter-Hub operator / runtime API key | OpenBao or `0600` temp file | See `wiki/InterHubBootstrapAccessLane.md` | +| Database or service password | OpenBao dynamic/KV | Document only | +| Short-lived SSH cert for operator | ops-warden (`adm-*`) | **Issue** via `warden sign` | +| Short-lived SSH cert for agent | ops-warden (`agt-*`) | **Issue** via `warden sign` / wrapper | +| Short-lived SSH cert for CI/cron | ops-warden (`atm-*`) | **Issue** via `warden sign` / `warden issue` | +| Tunnel to remote service | ops-bridge | Consumer of `cert_command` | +| Principal file on host | railiance-infra | Document only | + +--- + +## Examples — do NOT ask ops-warden + +| Request | Correct path | +| --- | --- | +| "Populate `OPENROUTER_API_KEY` for llm-connect" | Operator → OpenBao/K8s Secret in `activity-core` namespace | +| "Store Inter-Hub admin key for bootstrap" | Operator → OpenBao or `IHUB_OPERATOR_KEY_FILE` (`CUST-WP-0049`) | +| "Give me Vault root token" | Break-glass ceremony → `railiance-platform/docs/openbao.md` | +| "S3 credentials for artifact upload" | NK-WP-0007 / artifact-store consumer path | +| "JWT for my app" | key-cape / Keycloak IAM Profile | + +--- + +## Examples — ops-warden IS correct + +| Request | Command / pattern | +| --- | --- | +| ops-bridge tunnel needs a cert | `cert_command: warden sign --pubkey ` | +| Agent reaching bootstrap host | `agt-codex-interhub-bootstrap` — `wiki/InterHubBootstrapAccessLane.md` | +| Check cert expiry before shift | `warden status ` | +| New tunnel actor | `warden inventory add` — `wiki/ActorInventoryPatterns.md` | +| Lab without OpenBao | `backend: local` — `wiki/OpsWardenConfig.md` | + +--- + +## Typical flows + +### Human operator → remote host + +1. Identity: key-cape login if web/API access needed (optional for pure SSH). +2. 
SSH cert: `warden sign adm- --pubkey ~/.ssh/id_ed25519.pub`. +3. Tunnel (if needed): ops-bridge with `cert_command` pointing at warden. +4. Host: principal deployed by railiance-infra. + +### Kaizen / Codex agent → attended task + +1. Register actor: `agt-codex-` per `wiki/ActorInventoryPatterns.md`. +2. SSH cert: `WARDEN_ACTOR=... ops-ssh-wrapper ssh ...` or `warden sign`. +3. Secrets for task (API keys): OpenBao path — not warden. +4. Tunnel: ops-bridge if required. + +### CI automation → scheduled job + +1. Actor: `atm-` with narrow principal and low TTL (≤ 8 h). +2. `warden issue atm-` or sign with pre-provisioned key. +3. No long-lived keys in CI env vars. + +--- + +## When guidance drifts + +NetKingdom security architecture is canonical in `net-kingdom`. When it +changes (OpenBao, IAM Profile, new bootstrap lanes), ops-warden updates: + +- This file +- `wiki/NetKingdomSecurityMap.md` +- `SCOPE.md` / `INTENT.md` as needed + +Report drift via custodian workplan or State Hub message to `ops-warden`. + +--- + +## See also + +- `INTENT.md` — steward mission +- `wiki/NetKingdomSecurityMap.md` — component literacy +- `wiki/ActorInventoryPatterns.md` — actor naming +- `wiki/OpenBaoSshEngineChecklist.md` — production SSH signing verify +- `net-kingdom/docs/platform-identity-security-architecture.md` — platform canon \ No newline at end of file diff --git a/wiki/NetKingdomSecurityMap.md b/wiki/NetKingdomSecurityMap.md new file mode 100644 index 0000000..2019919 --- /dev/null +++ b/wiki/NetKingdomSecurityMap.md @@ -0,0 +1,101 @@ +# NetKingdom Security Map (ops-warden view) + +Date: 2026-06-17 + +Condensed literacy guide for ops-warden stewards and development workers. +Canonical source remains `net-kingdom/docs/platform-identity-security-architecture.md`. + +ops-warden **implements** the operational SSH lane and **documents** how the +other lanes connect. 
+ +--- + +## Planes + +```text +Bootstrap plane railiance-infra, railiance-cluster, net-kingdom bootstrap +Platform control key-cape, flex-auth, OpenBao, Topaz, railiance-platform +Tenant plane railiance-apps, coulomb workloads, future tenants +Operational access ops-warden (SSH certs), ops-bridge (tunnels) +``` + +--- + +## Component map + +| Component | Answers | Credential types | ops-warden | +| --- | --- | --- | --- | +| **key-cape** | Who are you? (lightweight IAM) | OIDC tokens, MFA | Route — do not issue | +| **Keycloak** | Who are you? (expanded IAM) | OIDC/SAML federation | Route — do not issue | +| **privacyIDEA** | MFA / step-up | OTP, hardware tokens | Route — do not issue | +| **flex-auth** | May you do this action? | Policy decisions, audit envelopes | Future SSH pre-sign; route today | +| **Topaz** | PDP runtime for flex-auth | Authorization evaluations | Route — do not issue | +| **OpenBao** | Runtime secret authority | API keys, DB creds, leases, K8s auth | SSH engine **signing backend** only | +| **ops-warden** | SSH ops access | Short-lived SSH certificates | **Own and issue** | +| **ops-bridge** | Tunnel transport | Uses certs via cert_command | Consumer | +| **railiance-infra** | Host enforcement | auth_principals, sshd | Route — deploy hosts | +| **railiance-platform** | Platform deploy | OpenBao, Postgres, ingress | Route — do not deploy from warden | + +--- + +## Credential lanes (summary) + +| Lane | Owner | Lifetime | Worker entrypoint | +| --- | --- | --- | --- | +| Identity | key-cape / Keycloak | Session / token TTL | Login / OIDC | +| Authorization | flex-auth | Per request | Policy API / embedded PEP | +| Runtime secrets | OpenBao | Lease-bound | `bao` CLI, K8s ESO, app integration | +| SSH operational | ops-warden | adm 48h / agt 24h / atm 8h | `warden sign` | +| Tunnel | ops-bridge | Session | `bridge` + cert_command | + +Full routing: `wiki/CredentialRouting.md`. 
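As an illustration of the lane table above, the ownership mapping can be expressed as a small lookup. This is a sketch only: the lane keys (`identity`, `runtime-secret`, and so on) and the function names are hypothetical labels chosen for this example, while the owners come from the table.

```python
# Hypothetical lane labels mapped to the owners listed in the summary table.
LANE_OWNERS = {
    "identity": "key-cape / Keycloak",
    "authorization": "flex-auth",
    "runtime-secret": "OpenBao",
    "ssh-cert": "ops-warden",
    "tunnel": "ops-bridge",
}


def route(lane: str) -> str:
    """Return the owning subsystem for a credential lane."""
    try:
        return LANE_OWNERS[lane]
    except KeyError:
        raise ValueError(f"unknown lane {lane!r}; see wiki/CredentialRouting.md") from None


def warden_issues(lane: str) -> bool:
    """True only for the one lane where ops-warden issues credentials."""
    return route(lane) == "ops-warden"


if __name__ == "__main__":
    for lane in LANE_OWNERS:
        print(f"{lane:15} -> {route(lane)} (warden issues: {warden_issues(lane)})")
```

The point of the sketch is the asymmetry: exactly one lane resolves to ops-warden, and every other lane is a routing answer rather than an issuance.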
+ +--- + +## Trust flow (simplified) + +```text +Worker request + -> Identity? key-cape / Keycloak + -> Authorized? flex-auth + -> Secret material? OpenBao + -> SSH cert? ops-warden + -> Tunnel? ops-bridge (cert from warden) + -> Host accepts? railiance-infra principals +``` + +OpenBao does **not** replace identity or authorization. flex-auth decides; +OpenBao stores/issues; ops-warden signs SSH certs when host reachability is +the need. + +--- + +## NetKingdom documents to watch + +| Document | Why ops-warden cares | +| --- | --- | +| `platform-identity-security-architecture.md` | Planes, secret path, SSH path | +| `responsibility-map.md` | Operational SSH dependency section | +| `platform-identity-security-architecture.md` | Operational SSH Path section | +| `platform-root-custody.md` | OpenBao ceremony — not warden's job | +| `object-storage-sts-credential-vending.md` | S3 creds — never warden | +| `canon/standards/iam-profile_v0.2.md` | Claims for future policy-gated sign | + +When these change, update ops-warden wiki and `wiki/CredentialRouting.md`. + +--- + +## Recursive platform rule + +Tenant admins (including `tenant:coulomb`) must not gain platform-root +authority. ops-warden SSH actors should use **narrow principals** for agent +and automation work — not platform-admin equivalents on hosts. + +--- + +## See also + +- `INTENT.md` +- `wiki/CredentialRouting.md` +- `wiki/PolicyGatedSigning.md` (future flex-auth hook) +- `net-kingdom/docs/platform-identity-security-architecture.md` \ No newline at end of file diff --git a/wiki/OpenBaoSshEngineChecklist.md b/wiki/OpenBaoSshEngineChecklist.md new file mode 100644 index 0000000..a8c17f9 --- /dev/null +++ b/wiki/OpenBaoSshEngineChecklist.md @@ -0,0 +1,172 @@ +# OpenBao SSH Engine — Operational Checklist + +Date: 2026-06-17 + +Verify the production SSH signing path for `warden` against platform OpenBao. +Cluster bootstrap and unseal are **not** ops-warden scope — see +`railiance-platform/docs/openbao.md`. 
+ +--- + +## Prerequisites + +- [ ] OpenBao deployed on Railiance (`railiance-platform` helm/Makefile) +- [ ] `bao status` reports initialized and **unsealed** +- [ ] Operator has **scoped token** — not root token in `VAULT_TOKEN` for daily warden use +- [ ] `warden.yaml` points `vault.addr` at correct endpoint: + - Workstation: `https://bao.coulomb.social` + - In-cluster: `http://openbao.openbao.svc.cluster.local:8200` +- [ ] Actor exists in inventory — `wiki/ActorInventoryPatterns.md` +- [ ] Test pubkey available (mode 600 private key, never commit) + +--- + +## One-time SSH engine setup (operator) + +Run with OpenBao admin policy — not from agent chat logs. + +```bash +# Confirm reachability +bao status + +# Enable SSH secrets engine (skip if already enabled) +bao secrets enable ssh + +# Roles — TTL max must match ActorType policy (wiki/OpsWardenConfig.md) +bao write ssh/roles/agt-role \ + key_type=ca \ + allowed_users="*" \ + allow_user_certificates=true \ + default_user="agt" \ + ttl=24h max_ttl=24h + +bao write ssh/roles/adm-role \ + key_type=ca \ + allowed_users="*" \ + allow_user_certificates=true \ + default_user="adm" \ + ttl=48h max_ttl=48h + +bao write ssh/roles/atm-role \ + key_type=ca \ + allowed_users="*" \ + allow_user_certificates=true \ + default_user="atm" \ + ttl=8h max_ttl=8h + +# Verify roles listed +bao list ssh/roles +``` + +Document CA public key distribution to hosts via railiance-infra — warden does +not deploy `TrustedUserCAKeys`. 
+ +--- + +## Token policy expectations + +| Rule | Rationale | +| --- | --- | +| No root token in `VAULT_TOKEN` for warden workflows | Root is break-glass only | +| Token scoped to `ssh/sign/` for needed roles | Least privilege | +| Short TTL on operator tokens | Limit blast radius | +| Prefer OIDC/login-derived tokens via KeyCape where available | Platform admin path | + +Example policy shape (illustrative — adjust in OpenBao policy admin): + +```hcl +path "ssh/sign/agt-role" { + capabilities = ["create", "update"] +} +``` + +--- + +## warden.yaml sanity check + +```yaml +backend: vault +vault: + addr: https://bao.coulomb.social + mount: ssh + role_map: + adm: adm-role + agt: agt-role + atm: atm-role + token_env: VAULT_TOKEN +``` + +--- + +## Verification procedure + +```bash +export VAULT_TOKEN="" # never paste in chat or commit + +# 1. Config loads +warden status --help + +# 2. Sign test actor (replace actor and pubkey paths) +warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub \ + | head -c 80 && echo "..." + +# 3. Metadata +warden status agt-state-hub-bridge + +# 4. Audit line +warden log --actor agt-state-hub-bridge --last 1 + +# 5. Compliance +warden scorecard +``` + +**Pass criteria:** + +- Exit code 0 on sign and status +- Cert `valid_before` in the future +- `signatures.log` has new JSONL line with `"backend": "vault"` +- Scorecard passes on clean state dir + +--- + +## cert_command smoke (ops-bridge) + +In `tunnels.yaml`, set: + +```yaml +cert_command: "warden sign --pubkey .pub" +``` + +Bring up tunnel; confirm SSH connects with cert + key (ops-bridge docs). 
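One of the pass criteria in the verification procedure is that `signatures.log` gains a JSONL line with `"backend": "vault"`. A minimal sketch of that check follows, assuming only what the checklist states: one JSON object per line with a `backend` field. The function name and the `actor` field in the sample records are hypothetical.

```python
import json
from typing import Optional


def last_backend(log_text: str) -> Optional[str]:
    """Return the 'backend' value of the last non-empty JSONL record, if any."""
    lines = [line for line in log_text.splitlines() if line.strip()]
    if not lines:
        return None
    record = json.loads(lines[-1])
    return record.get("backend")


if __name__ == "__main__":
    # Sample content; real records will carry more fields than shown here.
    sample = (
        '{"actor": "agt-state-hub-bridge", "backend": "local"}\n'
        '{"actor": "agt-state-hub-bridge", "backend": "vault"}\n'
    )
    assert last_backend(sample) == "vault"
    print("last sign used backend:", last_backend(sample))
```

In practice the text would be read from the state directory's `signatures.log`; the path is left out here because the checklist does not state it.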
+ +--- + +## Failure modes + +| Symptom | Likely cause | Action | +| --- | --- | --- | +| `Vault token not found` | `VAULT_TOKEN` unset | Scoped login/token issue | +| HTTP 403 from OpenBao | Token lacks sign permission | Fix policy | +| `No Vault role mapped` | `role_map` mismatch | Fix warden.yaml | +| `ttl exceeds max` | Inventory TTL > ActorType max | Fix inventory or role | +| Connection refused | Wrong `addr` or OpenBao sealed | Check platform ops | +| Host rejects cert | Principal not on host | railiance-infra auth_principals | + +**Lab fallback:** `backend: local` in warden.yaml — **not** a production substitute. +Use only for offline dev when OpenBao is unreachable. + +--- + +## Boundaries + +- ops-warden does not unseal OpenBao or rotate unseal keys +- ops-warden does not store API keys alongside SSH signing +- Host trust of CA pubkey is railiance-infra responsibility + +--- + +## See also + +- `wiki/OpsWardenConfig.md` +- `railiance-platform/docs/openbao.md` +- `wiki/CredentialRouting.md` \ No newline at end of file diff --git a/wiki/PolicyGatedSigning.md b/wiki/PolicyGatedSigning.md new file mode 100644 index 0000000..22f7f37 --- /dev/null +++ b/wiki/PolicyGatedSigning.md @@ -0,0 +1,129 @@ +# Policy-Gated SSH Signing (design) + +Date: 2026-06-17 +Status: **design only** — not implemented in WARDEN-WP-0006 + +Today `warden sign` authorizes via **inventory allow-list** and TTL policy only. +This document proposes flex-auth integration so SSH issuance matches the +NetKingdom authorization path before OpenBao/SSH engine signing. 
+ +--- + +## Problem + +Inventory-only gating is sufficient for early ops but weak for: + +- many agents and automations across tenants +- temporary elevation without inventory edits +- unified audit with flex-auth decision envelopes +- aligning SSH issuance with IAM Profile claims + +--- + +## Target flow (v2) + +```text +warden sign --pubkey + | + v +Load actor from inventory (type, principals, ttl) + | + v +Obtain identity claims (optional v2.1) + OIDC token / env-injected JWT from key-cape session + | + v +flex-auth Evaluate + resource: ssh-certificate / actor: + action: sign + context: tenant, principal list, pubkey fingerprint, requestor + | + +-- DENY -> CAError with flex-auth explanation + | + v ALLOW +CABackend.sign() (local or OpenBao SSH engine) + | + v +Append signatures.log (+ optional flex-auth audit correlation id) +``` + +--- + +## flex-auth request shape (proposed) + +| Field | Source | +| --- | --- | +| `subject` | IAM Profile `sub` or service identity | +| `tenant` | `tenant:platform` or `tenant:coulomb` | +| `resource` | `ssh-cert:actor/` | +| `action` | `sign` | +| `context.principals` | From inventory | +| `context.actor_type` | adm \| agt \| atm | +| `context.pubkey_fingerprint` | SHA256 of pubkey | +| `context.ttl_hours` | Requested TTL | + +Decision envelope should return `allow` \| `deny` and `audit_correlation_id` +stored in `signatures.log`. 
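The proposed request shape can be sketched as a plain dictionary builder. This is an illustration under stated assumptions, not an agreed flex-auth API. The function names and the `decision` key in the envelope are hypothetical, and the fingerprint uses the OpenSSH `SHA256:` style (unpadded base64 of the SHA-256 of the key blob), which is one possible reading of "SHA256 of pubkey". The caller supplies `resource`; the table above only fixes its `ssh-cert:actor/` prefix.

```python
import base64
import hashlib


def pubkey_fingerprint(pubkey_line: str) -> str:
    """OpenSSH-style SHA256 fingerprint of a one-line key ("type base64 [comment]")."""
    blob = base64.b64decode(pubkey_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def build_request(subject, tenant, resource, actor_type, principals, pubkey_line, ttl_hours):
    """Assemble the proposed request; field names follow the table above."""
    return {
        "subject": subject,
        "tenant": tenant,
        "resource": resource,
        "action": "sign",
        "context": {
            "principals": list(principals),
            "actor_type": actor_type,
            "pubkey_fingerprint": pubkey_fingerprint(pubkey_line),
            "ttl_hours": ttl_hours,
        },
    }


def interpret_decision(envelope: dict):
    """Return (allowed, audit_correlation_id).

    The "decision" key is an assumed name; anything other than an
    explicit "allow" is treated as a deny.
    """
    allowed = envelope.get("decision") == "allow"
    return allowed, envelope.get("audit_correlation_id")
```

Treating every value other than an explicit `allow` as a deny means a malformed or partial envelope cannot be read as approval.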
+ +--- + +## Versioning + +| Version | Gate | Notes | +| --- | --- | --- | +| **v1 (today)** | Inventory + TTL max | Shipped | +| **v2** | flex-auth required for `backend: vault` production | Config flag | +| **v2.1** | Identity claims required for `adm` signs | OIDC from key-cape | +| **v3** | Tenant-scoped policies per `tenant:*` | NK recursive rule | + +--- + +## Configuration sketch (future) + +```yaml +# warden.yaml — not implemented +policy: + enabled: true + flex_auth_url: http://flex-auth.flex-auth.svc.cluster.local:8080 + require_identity_for_adm: true + fail_closed: true +``` + +`fail_closed: true` — if flex-auth unreachable, deny sign (no silent bypass). + +--- + +## What stays in inventory (v2) + +- Actor registration (name, type, default principals, default TTL) +- Host reference documentation +- Scorecard local checks + +flex-auth decides **whether this sign request is allowed now**; inventory +defines **what the actor is allowed to request**. + +--- + +## Non-goals (this design) + +- flex-auth implementation changes in WP-0006 +- Replacing OpenBao SSH engine with flex-auth +- Storing flex-auth policies in ops-warden repo + +--- + +## Implementation follow-up + +Promote to **WARDEN-WP-0007** (proposed) after: + +1. flex-auth resource type for `ssh-certificate` agreed +2. NK platform policy for platform vs tenant sign paths +3. 
Operator approval for `fail_closed` production behavior + +--- + +## See also + +- `flex-auth/INTENT.md` +- `wiki/CredentialRouting.md` +- `net-kingdom/docs/platform-identity-security-architecture.md` \ No newline at end of file diff --git a/workplans/WARDEN-WP-0006-netkingdom-alignment-and-access-stewardship.md b/workplans/WARDEN-WP-0006-netkingdom-alignment-and-access-stewardship.md index 40c8b50..9d12ba8 100644 --- a/workplans/WARDEN-WP-0006-netkingdom-alignment-and-access-stewardship.md +++ b/workplans/WARDEN-WP-0006-netkingdom-alignment-and-access-stewardship.md @@ -4,7 +4,7 @@ type: workplan title: "NetKingdom Alignment and Operational Access Stewardship" domain: custodian repo: ops-warden -status: ready +status: finished owner: codex topic_slug: custodian planning_priority: high @@ -24,13 +24,6 @@ to optional CLI ergonomics. **Out of scope:** flex-auth integration implementation, OpenBao cluster deploy, universal credential broker, net-kingdom INTENT.md rewrite. -**References:** - -- `INTENT.md`, `SCOPE.md`, `history/2026-06-17-intent-scope-assessment.md` -- `net-kingdom/docs/platform-identity-security-architecture.md` -- `net-kingdom/docs/responsibility-map.md` -- `NK-WP-0009` (SSH tutorial, net-kingdom) - --- ## Goal @@ -51,147 +44,95 @@ After this workplan, a development worker or agent can: ```task id: WARDEN-WP-0006-T01 -status: todo +status: done priority: high state_hub_task_id: "ffc6a0c2-4312-4584-be7a-c8411cb01899" ``` -Create `wiki/CredentialRouting.md`: - -- Decision tree: SSH vs runtime secret vs identity vs authorization vs tunnel -- Per-subsystem links (OpenBao, flex-auth, key-cape, ops-bridge, railiance-infra) -- Explicit “do not ask ops-warden for API keys” examples -- Link from `SCOPE.md`, `INTENT.md`, `README.md` - -**Done when:** A worker with no prior context can route a credential request in -under two minutes using this page alone. 
+- [x] `wiki/CredentialRouting.md` with decision tree and anti-examples +- [x] Linked from SCOPE, INTENT, README ### T2 — Actor inventory patterns ```task id: WARDEN-WP-0006-T02 -status: todo +status: done priority: high state_hub_task_id: "3816463d-7dfd-469d-9324-fd7880b50608" ``` -Create `wiki/ActorInventoryPatterns.md` with standard patterns: - -- Tunnel agents (`agt-*-bridge`) -- Kaizen / codex agents (`agt-codex-*`) -- CI automations (`atm-*`) -- Human admins (`adm-*`) -- TTL and principal narrowing guidance - -Optional: `examples/inventory.seed.yaml` (non-secret, Git-safe template). - -**Done when:** Adding a new dev worker actor does not require inventing naming -from scratch. +- [x] `wiki/ActorInventoryPatterns.md` +- [x] `examples/inventory.seed.yaml` ### T3 — NetKingdom cross-links (ops-warden side) ```task id: WARDEN-WP-0006-T03 -status: todo +status: done priority: high state_hub_task_id: "f158366a-5746-48b8-acce-472dce8f925e" ``` -- Add `wiki/NetKingdomSecurityMap.md` — condensed literacy table from INTENT -- Update `registry/capabilities/capability.security.ssh-certificate-issuance.md` - summary to mention stewardship/routing -- Update `.claude/rules/repo-boundary.md` with NetKingdom routing table - -**Done when:** ops-warden docs stand alone for NetKingdom operational access -orientation without reading net-kingdom first (but link to canon). 
+- [x] `wiki/NetKingdomSecurityMap.md` +- [x] Registry capability stewardship summary +- [x] `.claude/rules/repo-boundary.md` routing table ### T4 — NetKingdom canon patch (coordination) ```task id: WARDEN-WP-0006-T04 -status: todo +status: done priority: medium state_hub_task_id: "e40e4395-8f01-4f79-a539-d0de8e427321" ``` -Coordinate updates in `net-kingdom` (separate commit/PR there): - -- `docs/responsibility-map.md` — move ops-warden from pure out-of-scope to - **operational SSH credential dependency** -- `docs/platform-identity-security-architecture.md` — add Operational SSH Path - (ops-warden → ops-bridge → hosts) - -**Done when:** NetKingdom canon names ops-warden’s lane; ops-warden wiki links -back to the updated sections. - -**Note:** Requires `net-kingdom` repo write access; may need `needs_human` if -blocked on review. +- [x] `net-kingdom/docs/responsibility-map.md` — Operational SSH dependency +- [x] `net-kingdom/docs/platform-identity-security-architecture.md` — Operational SSH Path ### T5 — OpenBao SSH engine operational checklist ```task id: WARDEN-WP-0006-T05 -status: todo +status: done priority: medium state_hub_task_id: "a94e20a2-970b-4a0c-bd23-8510b841b938" ``` -Create `wiki/OpenBaoSshEngineChecklist.md`: - -- Prerequisites (OpenBao initialized/unsealed per railiance-platform) -- Role creation commands (from OpsWardenConfig) -- Token policy expectations (no root token in warden workflows) -- Verification: `warden sign` against production endpoint -- Failure modes and fallback boundaries - -**Done when:** Operator can verify production SSH signing path without -reconstructing steps from multiple repos. 
+- [x] `wiki/OpenBaoSshEngineChecklist.md` ### T6 — Policy-gated signing design (design only) ```task id: WARDEN-WP-0006-T06 -status: todo +status: done priority: low state_hub_task_id: "b10a4b4d-bfa1-4f49-b6a5-f339f1e6a2e1" ``` -Create `wiki/PolicyGatedSigning.md`: - -- flex-auth decision before `warden sign` — proposed flow -- Claims needed from IAM Profile -- What stays inventory-based in v1 vs policy-based in v2 -- Explicit non-implementation in this workplan - -**Done when:** Reviewable design exists; no code dependency. +- [x] `wiki/PolicyGatedSigning.md` ### T7 — Re-assess INTENT ↔ SCOPE ```task id: WARDEN-WP-0006-T07 -status: todo +status: done priority: medium state_hub_task_id: "ef8b5c57-2343-4cfc-9fee-48db1e56f69a" ``` -After T1–T5 complete: - -- Update `history/2026-06-17-intent-scope-assessment.md` or add - `history/YYYYMMDD-intent-scope-reassessment.md` -- Refresh SCOPE.md Current State and completeness notes -- Run `make fix-consistency REPO=ops-warden` - -**Done when:** Completeness target C3+ documented with evidence. 
+- [x] `history/2026-06-17-intent-scope-reassessment.md` +- [x] SCOPE.md Current State updated +- [x] `make fix-consistency REPO=ops-warden` --- ## Acceptance Criteria -- [ ] `wiki/CredentialRouting.md` exists and is linked from README/SCOPE -- [ ] `wiki/ActorInventoryPatterns.md` exists -- [ ] `wiki/NetKingdomSecurityMap.md` exists -- [ ] NetKingdom responsibility-map recognizes ops-warden SSH lane (T4) -- [ ] OpenBao SSH checklist documented (T5) -- [ ] Policy-gated signing design drafted (T6) -- [ ] INTENT ↔ SCOPE re-assessment recorded (T7) -- [ ] `reuse-surface validate --root .` passes if registry entry changed \ No newline at end of file +- [x] `wiki/CredentialRouting.md` exists and is linked from README/SCOPE +- [x] `wiki/ActorInventoryPatterns.md` exists +- [x] `wiki/NetKingdomSecurityMap.md` exists +- [x] NetKingdom responsibility-map recognizes ops-warden SSH lane (T4) +- [x] OpenBao SSH checklist documented (T5) +- [x] Policy-gated signing design drafted (T6) +- [x] INTENT ↔ SCOPE re-assessment recorded (T7) +- [x] `reuse-surface validate --root .` passes \ No newline at end of file