From 7b211acd57a70f66e72f119ee75ef199fac455df Mon Sep 17 00:00:00 2001
From: tegwick
Date: Wed, 20 May 2026 22:51:20 +0200
Subject: [PATCH] Add OpenBao runtime secret authority; complete NK-WP-0006/0007/0008

Refine the recursive platform security architecture to make OpenBao the
canonical runtime secret authority, with SOPS/age, K8s Secrets, and the
emergency bundle reframed as bootstrap/delivery/break-glass mechanisms.

- credential-management standard v0.2: add OpenBao runtime authority
  section, rotation rules, and prohibited patterns (OpenBao-as-PDP,
  tenant platform-root)
- platform-identity-security-architecture: mark implemented; add
  flex-auth/Topaz implications, Coulomb onboarding path, and a
  production-readiness checklist
- NK-WP-0004/0005: document bootstrap-to-OpenBao handoff boundary
- NK-WP-0006/0007: status -> done with implementation reviews; add
  recursive platform/tenant split and OpenBao broker/audit role for
  object-storage STS vending
- NK-WP-0008: status -> done; repoint corpus to infospace-bench
- new ADR-0007 (orchestration boundary), ADR-0008 (STS vending
  boundary), and the object-storage STS credential-vending architecture

Co-Authored-By: Claude Opus 4.7
---
 canon/standards/credential-management_v0.2.md | 103 +++-
 ...DR-0007-security-orchestration-boundary.md |  87 ++++
 ...8-object-storage-sts-credential-vending.md |  82 +++
 docs/object-storage-sts-credential-vending.md | 482 ++++++++++++++++++
 ...platform-identity-security-architecture.md |  84 ++-
 ...P-0004-credential-management-foundation.md |  28 +-
 ...-0005-agent-driven-credential-bootstrap.md |  29 +-
 ...platform-identity-security-architecture.md |  48 +-
 ...7-object-storage-sts-credential-vending.md |  74 ++-
 ...ecurity-architecture-patterns-infospace.md | 202 +++++++-
 10 files changed, 1150 insertions(+), 69 deletions(-)
 create mode 100644 docs/adr/ADR-0007-security-orchestration-boundary.md
 create mode 100644 docs/adr/ADR-0008-object-storage-sts-credential-vending.md
 create mode 100644
docs/object-storage-sts-credential-vending.md diff --git a/canon/standards/credential-management_v0.2.md b/canon/standards/credential-management_v0.2.md index 0d76cdd..9d7fc13 100644 --- a/canon/standards/credential-management_v0.2.md +++ b/canon/standards/credential-management_v0.2.md @@ -1,14 +1,16 @@ # Credential Management Standard — net-kingdom -**Version:** 0.2 **Status:** current **Supersedes:** v0.1 (retired with NK-WP-0004) +**Version:** 0.2 **Status:** current with OpenBao runtime refinement **Supersedes:** v0.1 (retired with NK-WP-0004) --- ## 1. Purpose -Define how service credentials are generated, stored, rotated, and recovered -in the net-kingdom SSO/MFA platform. This standard governs operational -security of all secrets used by the Authelia + LLDAP + KeyCape + privacyIDEA -stack and its PostgreSQL backend. +Define how service credentials are generated, stored, rotated, handed off +to runtime secret authority, and recovered in the net-kingdom SSO/MFA +platform. This standard governs operational security of all secrets used +by the Authelia + LLDAP + KeyCape + privacyIDEA stack, its PostgreSQL +backend, and the OpenBao runtime secret authority used by the platform +control plane. 
--- @@ -24,17 +26,24 @@ break-glass passwords ────────► direct service access if clu │ └──► stored in human's personal password manager -K8s Secrets ──────────────────► live credential store for running services +K8s Secrets ──────────────────► bootstrap/delivery state for running services │ - └──► created by create-secrets.sh scripts; sourced from secrets.enc/ + └──► bootstrap/delivery mechanism; created by create-secrets.sh scripts + +OpenBao runtime authority ─────► scoped workload secrets and dynamic credentials + │ + ├──► leases, revocation, and audit records + └──► direct client, External Secrets Operator, or CSI delivery ``` **KeePassXC is NOT in the operational path.** If you choose to import the emergency bundle into KeePassXC for personal use, that is your business — it is not required or assumed by any tooling in this repo. -The age private key and SOPS/age-encrypted git files are the credential store. -The ops bundle is the backup. The emergency bundle is the human's key ring. +The age private key and SOPS/age-encrypted git files are the bootstrap +credential store. The ops bundle is the bootstrap backup. The emergency +bundle is the human's break-glass key ring. OpenBao is the runtime secret +authority once the platform control plane is alive. --- @@ -55,10 +64,12 @@ This single command runs the full bootstrap end-to-end: 6. Verifies all K8s Secrets exist 7. Waits for privacyIDEA to be Ready, then runs enckey bootstrap + admin creation 8. Applies KeyCape secrets (requires pi-admin) -9. Creates the ops bundle (age-encrypted snapshot) -10. Delivers the emergency bundle to the terminal for human storage +9. Hands off runtime secret authority to OpenBao when the Railiance + platform layer is present and verified +10. Creates the ops bundle (age-encrypted snapshot) +11. Delivers the emergency bundle to the terminal for human storage ← **only human touchpoint** -11. Shreds all plaintext and marks `bootstrap_complete: true` +12. 
Shreds all plaintext and marks `bootstrap_complete: true` The script resumes from where it left off if interrupted — each phase is tracked in `creds-state.yaml`. @@ -67,7 +78,13 @@ tracked in `creds-state.yaml`. No human credential management is needed after bootstrap. All secrets live in: - `secrets.enc/` — encrypted in git (decrypt with age key) -- K8s Secrets — live cluster state (updated by `creds-apply.sh`) +- K8s Secrets — bootstrap or delivery state (updated by `creds-apply.sh`, + External Secrets Operator, CSI, or other approved delivery paths) +- OpenBao — runtime secret authority for platform services, workload + secrets, dynamic credentials, leases, revocation, and audit + +Once OpenBao is ready, new long-lived workload secret authority should be +introduced there rather than by expanding the bootstrap SOPS/age surface. ### Phase 2 — Rotation @@ -77,7 +94,7 @@ SECRET= bash sso-mfa/bootstrap/creds-rotate.sh --non-interactive # agent ``` Rotation is handled per-secret with appropriate atomicity guarantees. -See Section 5 for details. +See Section 6 for details. ### Phase 3 — Recovery @@ -98,6 +115,7 @@ See Section 4 — Emergency Bundle. | PostgreSQL root password | Direct database access | | break-glass user password | Emergency login if Authelia/KeyCape is down | | ops bundle location + decrypt command | Point-in-time snapshot of all secrets | +| OpenBao unseal/recovery material, if used | Platform-control-plane break-glass only | ### Delivery @@ -138,12 +156,45 @@ The agent does not care which — it only cares that you confirm receipt. --- -## 5. Secret Rotation +## 5. OpenBao Runtime Authority + +OpenBao is the canonical runtime secret authority for the platform +control plane once deployed and verified. It stores, issues, leases, +audits, and revokes runtime secret material; it does not replace identity +or authorization. + +Required boundaries: + +- SOPS/age remains the bootstrap and Git-at-rest protection mechanism. 
+- OpenBao root tokens, unseal keys, recovery keys, platform mounts, and + global auth methods are platform-root material. +- Tenant administrators may manage tenant-scoped secret paths only + through approved policies; they must not receive OpenBao platform-root + authority. +- Workloads should use scoped OpenBao auth roles, External Secrets + Operator, CSI-mounted secrets, or another approved delivery mechanism. +- flex-auth decides whether a secret or dynamic credential request is + allowed when the request is authorization-sensitive; OpenBao performs + storage, issuance, lease, revocation, and audit. +- OpenBao audit logs must be shipped to durable storage and included in + restore and break-glass drills. + +OpenBao may issue dynamic credentials for databases, object storage, or +other systems where provider support and policy make that safer than +static secret distribution. Provider-native STS remains valid where it +gives better-scoped temporary credentials; OpenBao can still broker, +store, or audit the handoff where appropriate. + +--- + +## 6. Secret Rotation ### Rotatable secrets | Secret | Blast radius | Notes | |--------|-------------|-------| +| OpenBao workload secret | Variable | Prefer lease expiry or dynamic regeneration | +| OpenBao auth role/policy | Variable | Requires policy review and audit check | | `PI_SECRET_KEY` | Low — invalidates PI sessions | Safe to rotate anytime | | `PI_DB_PASSWORD` | Medium — DB + pod | Atomic update required | | `LLDAP_JWT_SECRET` | Low — invalidates LLDAP sessions | Safe to rotate anytime | @@ -168,9 +219,19 @@ Rotating the age private key is a special case: 4. A **new emergency bundle must be delivered** before the old key is revoked (see Section 4) +### OpenBao root and recovery material + +OpenBao root tokens, unseal keys, and recovery keys are not routine +rotation targets. Treat them as platform-root break-glass material: + +1. Prefer short-lived or revoked root tokens after initial setup. +2. 
Use scoped policies and auth methods for normal operations. +3. Rotate recovery or unseal material only with an explicit maintenance + window, backup verification, and post-rotation emergency-bundle update. + --- -## 6. Ops Bundle +## 7. Ops Bundle The ops bundle is an age-encrypted tar archive of all plaintext secrets at a point in time. It is created automatically during bootstrap and can be @@ -188,7 +249,7 @@ age -d -i ~/.config/sops/age/keys.txt ops-bundle-.tar.age | tar xf - --- -## 7. Prohibited Patterns +## 8. Prohibited Patterns The following are permanently prohibited: @@ -209,6 +270,14 @@ The following are permanently prohibited: 5. **Storing the age private key in the repo** — the key lives outside the repo at `~/.config/sops/age/keys.txt` +6. **Using OpenBao as a policy decision point** — OpenBao stores, leases, + audits, and revokes secret material; identity comes from the IAM + Profile and authorization decisions come from flex-auth where a + decision boundary is needed + +7. **Giving tenants OpenBao platform-root authority** — tenants may only + receive scoped access paths approved for their tenant resources + --- ## Appendix A — KeePassXC Group Structure (optional) diff --git a/docs/adr/ADR-0007-security-orchestration-boundary.md b/docs/adr/ADR-0007-security-orchestration-boundary.md new file mode 100644 index 0000000..a07fae6 --- /dev/null +++ b/docs/adr/ADR-0007-security-orchestration-boundary.md @@ -0,0 +1,87 @@ +# ADR-0007 - Security Orchestration Boundary + +**Status:** Accepted +**Date:** 2026-05-18 +**Deciders:** Bernd Worsch, Codex + +## Context + +The recursive platform security architecture needs careful sequencing: +host trust, cluster trust, bootstrap secrets, runtime secret authority, +runtime identity, runtime authorization, tenant onboarding, and readiness +verification. + +That sequencing crosses NetKingdom and Railiance ownership boundaries. 
+NetKingdom owns the canonical security architecture, IAM Profile, +credential/bootstrap standards, and authorization semantics. Railiance +owns deployment layering for infrastructure, clusters, platform services, +and applications. OpenBao adds an important runtime-secret authority to +the platform control plane, but it does not change those ownership +boundaries. + +Creating a dedicated orchestration repo too early would risk encoding +temporary bootstrap order and accidental stack assumptions as a permanent +interface. Leaving every sequence implicit would also be risky: platform +root actions, OpenBao initialization, policy import, and tenant onboarding +must be auditable and repeatable. + +## Decision + +Security orchestration will stay in Railiance playbooks for now. + +NetKingdom will define the trust-state model, readiness checks, policy +semantics, OpenBao boundaries, and tenant/control-plane rules. Railiance +playbooks will own the concrete deployment sequencing across +`railiance-infra`, `railiance-cluster`, `railiance-platform`, and +`railiance-apps`. + +A dedicated orchestration repo is deferred until the sequencing surface is +stable enough to justify its own product boundary. If created later, it +must coordinate safe sequencing and readiness reporting; it must not own +security policy semantics or bypass Railiance stack ownership. + +## Consequences + +- NK-WP-0006 is implemented as architecture, standards, ADRs, and + workplan constraints rather than a new repo. +- OpenBao bootstrap, unseal/recovery, audit, backup, and workload-secret + delivery belong in Railiance platform playbooks, governed by + NetKingdom standards. +- Cross-repo readiness should be reported as checks against explicit + trust states, not as a hidden imperative script. +- A future orchestration repo needs a new ADR before creation. 
+ +## Future Repo Trigger + +Revisit a dedicated orchestration repo only if at least two of these are +true: + +- multiple Railiance deployments need the same security sequencing + interface; +- readiness reporting becomes a reusable artifact consumed by operators, + agents, or CI; +- rollback and recovery workflows need a cross-repo state machine that no + single Railiance layer can own cleanly; +- tenant onboarding becomes a repeatable workflow spanning identity, + flex-auth, Topaz, OpenBao, object storage, and application repos. + +## Alternatives Considered + +### Create A Dedicated Orchestration Repo Now + +This would give sequencing a visible home, but it would probably encode +unstable details before OpenBao runtime operations, flex-auth/Topaz +policy import, and tenant onboarding have enough implementation feedback. + +### Put Orchestration In NetKingdom + +NetKingdom owns the security model, but it should not become the +deployment repo for every stack layer. This would blur architecture +ownership with platform deployment ownership. + +### Leave Sequencing Entirely Informal + +This avoids premature structure but leaves bootstrap and runtime trust +transitions too dependent on operator memory. The accepted approach keeps +the sequence explicit while leaving concrete deployment in the Railiance +stack. diff --git a/docs/adr/ADR-0008-object-storage-sts-credential-vending.md b/docs/adr/ADR-0008-object-storage-sts-credential-vending.md new file mode 100644 index 0000000..6dd312f --- /dev/null +++ b/docs/adr/ADR-0008-object-storage-sts-credential-vending.md @@ -0,0 +1,82 @@ +# ADR-0008 - Object Storage STS Credential Vending Boundary + +**Status:** Accepted +**Date:** 2026-05-18 +**Deciders:** Bernd Worsch, Codex + +## Context + +NetKingdom needs a canonical pattern for issuing short-lived +object-storage credentials to platform and tenant workloads. 
The first +known consumer is `artifact-store`, but the pattern must work for future +S3-compatible consumers without making each application repo own identity, +authorization, root object-store credentials, or backend-specific STS +differences. + +The backend landscape is not uniform. AWS S3, Ceph RGW, and MinIO/AIStor +can use web-identity STS-style flows. Cloudflare R2 exposes temporary +credentials through a provider API or local signing with parent access +material. OpenBao is now part of the Railiance platform stack as runtime +secret authority, but it is not an identity provider or authorization +policy engine. + +## Decision + +NetKingdom will define a provider-neutral credential-vending interface +backed by provider-native temporary credential mechanisms where possible. + +The trust path is: + +1. IAM Profile token proves the actor or workload. +2. flex-auth decides whether the actor may receive credentials for the + requested protected system, tenant, bucket, prefix, action set, TTL, + and assurance level. +3. The credential-vending service exchanges the approved request with + the backend-specific temporary credential mechanism. +4. OpenBao stores parent credentials, broker configuration, lease + metadata, and audit evidence where useful, but it does not replace + flex-auth authorization. +5. Consumers receive normalized temporary credentials containing access + key id, secret access key, session token, and expiration. + +## Consequences + +- `artifact-store` needs temporary credential support, especially + `AWS_SESSION_TOKEN` and refresh behavior, before it can fully consume + the production vending pattern. +- Backend-specific differences are isolated in the vending service, not + leaked into application policy. +- OpenBao remains runtime secret infrastructure and audit support; it + does not become the object-storage policy source. 
+- Provider-native STS is preferred when available because it gives the + storage backend direct lease/expiration semantics. +- Cloudflare R2 requires a broker path that protects parent access + material, most likely through OpenBao custody. + +## Alternatives Considered + +### Give Applications Long-Lived Access Keys + +This is simple but leaves applications holding durable credentials and +pushes policy into ad hoc bucket configuration. It is acceptable only as +a transitional bridge with scoped credentials and explicit rotation. + +### Put Object-Storage Policy In Keycloak Or key-cape + +Identity providers can assert who the actor is and coarse groups or +roles, but they should not become the canonical source of bucket, +prefix, action, TTL, and explanation semantics. + +### Use OpenBao As The Credential Vending Policy Engine + +OpenBao is valuable for secret custody, broker configuration, leases, +and audit records. Making it the policy decision point would duplicate +flex-auth, blur the platform/tenant boundary, and make authorization +semantics backend-specific. + +### Require One Backend Everywhere + +A single backend would simplify implementation but does not match the +platform direction. Railiance and NetKingdom need a stable security +interface across AWS, self-hosted S3-compatible stores, and Cloudflare +R2-like APIs. diff --git a/docs/object-storage-sts-credential-vending.md b/docs/object-storage-sts-credential-vending.md new file mode 100644 index 0000000..14c859c --- /dev/null +++ b/docs/object-storage-sts-credential-vending.md @@ -0,0 +1,482 @@ +# Object Storage STS Credential Vending + +Status: architecture baseline for NK-WP-0007 +Date: 2026-05-18 + +## Purpose + +This document defines the NetKingdom pattern for vending short-lived +object-storage credentials from verified identity and policy decisions. +It is provider-neutral at the NetKingdom boundary and provider-aware at +the backend exchange boundary. 
+ +The goal is to let consumers such as `artifact-store` use S3-compatible +temporary credentials without owning identity, authorization, secret +custody, or object-storage root credentials. + +## Ownership Boundary + +| Capability | Owner | +| --- | --- | +| IAM Profile, issuer and claim requirements | NetKingdom | +| Resource/action vocabulary and policy decision envelope | flex-auth, governed by NetKingdom architecture | +| Delegated PDP runtime | Topaz first, behind flex-auth | +| Runtime secret custody, broker configuration, audit, leases | OpenBao, deployed by Railiance platform | +| Object-storage backend configuration | Railiance platform | +| Artifact package behavior and S3 client refresh behavior | artifact-store | +| Application deployment | Railiance apps or the owning application repo | + +OpenBao may store parent credentials, broker configuration, or issued +credential metadata where appropriate. It does not replace flex-auth as +the authorization decision point and must not become the object-storage +policy model. + +## Core Flow + +```text +Human, service, or agent principal + | + v +NetKingdom IAM Profile token + key-cape lightweight mode or Keycloak expanded mode + | + v +credential-vending service + verifies issuer, audience, subject, assurance, tenant + | + v +flex-auth decision + tenant, protected-system, bucket, prefix, actions, TTL, obligations + | + v +backend exchange + AWS STS, Ceph RGW STS, MinIO/AIStor STS, Cloudflare R2 temp API, + or OpenBao-assisted broker path + | + v +temporary S3 credentials + access key id, secret access key, session token, expiration + | + v +consumer + artifact-store, SDK, CLI, sidecar, controller, or batch job +``` + +## Trust Boundaries + +### Platform Control Plane + +`tenant:platform` administers the credential-vending service, approved +issuer list, flex-auth policy import pipeline, OpenBao mounts/auth +methods, backend parent credentials, audit retention, and emergency +recovery. 
+ +### Tenant Plane + +`tenant:coulomb` and later tenants may request scoped credentials for +registered tenant resources. Tenant administrators must not receive +OpenBao root tokens, object-storage root credentials, global backend STS +configuration, or platform policy import authority. + +### Backend Boundary + +The credential-vending service is the only component that exchanges an +approved decision for provider-native credentials. Consumers receive only +short-lived credentials scoped to the approved bucket, prefix, actions, +and TTL. + +## Token And Decision Flow + +1. The caller authenticates through a NetKingdom IAM Profile + implementation. +2. The caller sends a request to the credential-vending service with a + bearer token or a workload identity binding. +3. The service validates issuer, audience, signature, expiration, + subject, tenant claim, and assurance evidence. +4. The service builds a flex-auth request with the protected-system id, + resource, action set, requested TTL, tenant, actor, and context. +5. flex-auth evaluates policy through its standalone evaluator or a + delegated PDP such as Topaz. +6. If denied, the service returns a deny envelope with a stable reason + code and audit correlation id. +7. If allowed, the service exchanges the approved request with the + backend or OpenBao-assisted broker path. +8. The service returns normalized temporary credentials and records + identity, policy, backend, lease, and audit metadata. + +## Resource Model + +Every object-storage resource belongs to a protected system and tenant. + +Suggested identifiers: + +```text +protected_system:object-storage:artifact-store-prod +tenant:platform +tenant:coulomb + +bucket:artifact-store-prod +prefix:tenant/coulomb/packages/ +object:tenant/coulomb/packages/ +``` + +The protected-system id names the storage integration boundary, not just +the backend product. 
For example, a MinIO tenant and an AWS bucket used +by the same application should still be distinct protected systems if +their trust, audit, or policy lifecycle differs. + +## flex-auth Vocabulary + +| Resource | Example | Notes | +| --- | --- | --- | +| protected system | `object-storage:artifact-store-prod` | Required in every decision | +| bucket | `bucket:artifact-store-prod` | Coarse storage boundary | +| prefix | `prefix:tenant/coulomb/packages/` | Preferred grant boundary for workloads | +| object | `object:tenant/coulomb/packages/a.tar.zst` | Use for exceptional single-object decisions | + +Canonical action names: + +| Action | Meaning | +| --- | --- | +| `s3:GetObject` | Read object data | +| `s3:PutObject` | Create or replace object data | +| `s3:DeleteObject` | Delete object data | +| `s3:ListBucket` | List bucket or prefix contents | +| `s3:GetObjectAttributes` | Read metadata, checksums, or object attributes | +| `s3:AbortMultipartUpload` | Abort multipart state | +| `s3:CreateMultipartUpload` | Start multipart upload | +| `s3:UploadPart` | Upload multipart chunk | +| `s3:CompleteMultipartUpload` | Complete multipart upload | + +Required decision inputs: + +- subject id, subject type, issuer, audience, and tenant; +- protected-system id; +- bucket and prefix or object; +- requested action set; +- requested TTL; +- assurance level and MFA evidence where privileged or destructive + actions are requested; +- workload identity evidence for service or agent callers; +- request purpose and audit correlation id when available. + +Required decision outputs: + +- allow or deny; +- maximum TTL; +- permitted actions; +- permitted bucket and prefix/object scope; +- obligations such as read-only, checksum-required, write-once, or + audit-detail-required; +- deny reason code; +- explanation/audit correlation id; +- backend exchange hint where policy deliberately restricts backend use. 
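The decision outputs listed above can be read as an upper bound on what the credential-vending service may hand to the backend exchange. The following is a minimal sketch of that idea, not the flex-auth API: the `Decision`, `Grant`, and `CredentialDenied` types, the `apply_decision` function, and the reason codes `prefix_outside_permitted_scope` and `no_permitted_actions` are all hypothetical names introduced for illustration. It also assumes that narrowing a request (dropping unpermitted actions, clamping the TTL) is acceptable; a policy could equally require an outright deny.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    """Hypothetical container for the decision outputs listed above."""
    allowed: bool
    max_ttl_seconds: int
    permitted_actions: frozenset[str]
    permitted_prefix: str  # assumed to end with "/", as in the examples
    obligations: tuple[str, ...] = ()
    deny_reason_code: str | None = None


@dataclass(frozen=True)
class Grant:
    """What the service would pass on to the backend exchange."""
    actions: frozenset[str]
    prefix: str
    ttl_seconds: int
    obligations: tuple[str, ...]


class CredentialDenied(Exception):
    def __init__(self, reason_code: str):
        super().__init__(reason_code)
        self.reason_code = reason_code


def apply_decision(requested_actions, requested_prefix, requested_ttl, decision):
    """Never grant more than the decision permits."""
    if not decision.allowed:
        raise CredentialDenied(decision.deny_reason_code or "denied")
    if not decision.permitted_prefix.endswith("/"):
        raise ValueError("permitted prefix must end with '/'")
    # The requested prefix must sit inside the permitted prefix.
    if not requested_prefix.startswith(decision.permitted_prefix):
        raise CredentialDenied("prefix_outside_permitted_scope")  # hypothetical code
    # Assumption: unpermitted actions are dropped rather than failing the request.
    actions = frozenset(requested_actions) & decision.permitted_actions
    if not actions:
        raise CredentialDenied("no_permitted_actions")  # hypothetical code
    # Assumption: an over-long TTL is reduced to the policy maximum.
    ttl = min(requested_ttl, decision.max_ttl_seconds)
    return Grant(actions, requested_prefix, ttl, decision.obligations)
```

Clamping the TTL here corresponds to the "reduce to policy maximum" option; a stricter deployment would raise `CredentialDenied` instead.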
+ +TTL policy: + +- default interactive TTL: 15 minutes; +- default workload TTL: 30 minutes; +- maximum normal TTL: 1 hour; +- longer TTLs require explicit policy and should not exceed backend + limits; +- destructive or platform-scoped credentials should use shorter TTLs and + MFA or dual-control obligations. + +## IAM Profile Requirements + +Accepted issuers: + +- key-cape lightweight mode for local, sandbox, and small deployments; +- Keycloak expanded mode for production and enterprise federation; +- local-identity only for development or bootstrap contexts explicitly + marked non-production. + +Required token properties: + +- `iss` matches an approved NetKingdom issuer; +- `aud` targets the credential-vending service or an approved backend + exchange audience; +- `sub` is stable for the principal; +- `exp`, `nbf`, and `iat` are present and within skew tolerance; +- `tenant` or equivalent tenant mapping is present for tenant-scoped + requests; +- service accounts and agents are distinguishable from humans; +- assurance/MFA claims are present when policy needs them; +- groups or roles are mapped through IAM Profile semantics, not + provider-specific bucket policy. + +Local-dev restrictions: + +- local issuers must only be accepted by explicitly configured dev + vending instances; +- local issuer tokens must not be trusted by production backends; +- credentials minted from local issuers must be restricted to local or + sandbox object stores. + +Emergency principals: + +- break-glass use is platform-control-plane access, not tenant access; +- emergency credentials must be short-lived where possible; +- every emergency vending event requires a post-event review record. + +## Backend Assessment + +| Backend | Temporary credential path | NetKingdom stance | +| --- | --- | --- | +| AWS S3 | AWS STS `AssumeRoleWithWebIdentity` returns access key id, secret access key, session token, and expiration | Best fit for AWS-native deployments. 
Use IAM OIDC provider and role trust policies, with flex-auth deciding before exchange. | +| Ceph RGW | RGW implements a subset of STS, including `AssumeRoleWithWebIdentity` for OIDC-backed temporary credentials | Good fit for self-hosted S3-compatible storage when RGW IAM/STS maturity is acceptable for the deployment. | +| MinIO/AIStor | MinIO STS supports `AssumeRoleWithWebIdentity` with OIDC JWTs and AWS-like response semantics | Strong fit for lightweight/self-hosted deployments if session-token support is wired through consumers. | +| Cloudflare R2 | R2 temporary credentials are created through the R2 Temporary Credentials API or local signing with parent access material | Use a backend-specific broker. Store parent material in OpenBao; do not expose parent credentials to workloads. | +| OpenBao | Can store parent credentials, broker dynamic material, record leases, and audit secret access | Runtime secret infrastructure and audit point, not the canonical object-storage authorization engine. | + +Decision summary: prefer provider-native temporary credentials when the +backend has a mature STS or temporary-credentials API. Keep the +NetKingdom interface stable and normalize backend differences in the +credential-vending service. + +## OpenBao Role + +OpenBao participates in credential vending only after flex-auth approval. +Allowed OpenBao responsibilities: + +- store backend parent credentials for Cloudflare R2 or other APIs that + need privileged signing material; +- store broker configuration and backend endpoint metadata; +- issue or lease dynamic credentials where a supported backend plugin or + controlled broker path exists; +- provide audit records for parent credential access and broker + operations; +- deliver credential-vending service configuration through Kubernetes + auth, CSI, or External Secrets Operator. 
+ +Prohibited OpenBao responsibilities: + +- deciding whether a tenant may access a bucket or prefix; +- storing tenant policy as the canonical object-storage authorization + model; +- exposing platform mounts, root tokens, unseal/recovery material, or + parent credentials to tenants; +- bypassing flex-auth because a backend secret path is readable. + +## Interface Prototype + +HTTP request: + +```http +POST /v1/object-storage/credentials +Authorization: Bearer +Content-Type: application/json +``` + +```json +{ + "protected_system_id": "object-storage:artifact-store-prod", + "tenant_id": "tenant:coulomb", + "bucket": "artifact-store-prod", + "prefix": "tenant/coulomb/packages/", + "actions": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"], + "ttl_seconds": 1800, + "purpose": "artifact-store package upload", + "correlation_id": "01JYNETKINGDOMSTS000000000001" +} +``` + +Normalized response: + +```json +{ + "credentials": { + "access_key_id": "AKIA...", + "secret_access_key": "redacted-by-client-logging", + "session_token": "token...", + "expiration": "2026-05-18T16:45:00Z" + }, + "scope": { + "protected_system_id": "object-storage:artifact-store-prod", + "tenant_id": "tenant:coulomb", + "bucket": "artifact-store-prod", + "prefix": "tenant/coulomb/packages/", + "actions": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"] + }, + "lease": { + "ttl_seconds": 1800, + "renewable": false, + "backend": "minio-assume-role-with-web-identity", + "openbao_lease_id": null + }, + "decision": { + "decision_id": "dec_01JYNETKINGDOMSTS000000000001", + "policy_package": "object-storage-artifact-store-prod@2026-05-18", + "obligations": ["checksum-required"], + "audit_correlation_id": "01JYNETKINGDOMSTS000000000001" + } +} +``` + +Deny response: + +```json +{ + "error": "credential_denied", + "reason_code": "prefix_not_registered_for_tenant", + "decision_id": "dec_01JYNETKINGDOMSTS000000000002", + "audit_correlation_id": "01JYNETKINGDOMSTS000000000002" +} +``` + +`credential_process` 
output for SDK consumers: + +```json +{ + "Version": 1, + "AccessKeyId": "AKIA...", + "SecretAccessKey": "...", + "SessionToken": "...", + "Expiration": "2026-05-18T16:45:00Z" +} +``` + +CLI shape: + +```bash +netkingdom-object-creds vend \ + --protected-system object-storage:artifact-store-prod \ + --tenant tenant:coulomb \ + --bucket artifact-store-prod \ + --prefix tenant/coulomb/packages/ \ + --action s3:GetObject \ + --action s3:PutObject \ + --ttl 1800 \ + --credential-process +``` + +## Audit Event + +Each successful or denied request should emit one canonical audit event: + +```json +{ + "event_type": "object_storage_credential_vending", + "outcome": "allowed", + "actor": { + "subject": "service:artifact-store", + "issuer": "https://kc.coulomb.social", + "tenant": "tenant:coulomb", + "assurance": "workload" + }, + "request": { + "protected_system_id": "object-storage:artifact-store-prod", + "bucket": "artifact-store-prod", + "prefix": "tenant/coulomb/packages/", + "actions": ["s3:GetObject", "s3:PutObject"], + "ttl_seconds": 1800 + }, + "decision": { + "decision_id": "dec_01JYNETKINGDOMSTS000000000001", + "policy_package": "object-storage-artifact-store-prod@2026-05-18" + }, + "backend": { + "type": "minio-assume-role-with-web-identity", + "credential_expiration": "2026-05-18T16:45:00Z", + "openbao_lease_id": null + } +} +``` + +OpenBao audit events should be correlated when OpenBao parent material, +broker config, dynamic secret engines, or delivery paths are used. + +## Consumer Guidance + +### artifact-store + +`artifact-store` should consume temporary credentials without owning the +vending authority. + +Required consumer support: + +- `AWS_ACCESS_KEY_ID`; +- `AWS_SECRET_ACCESS_KEY`; +- `AWS_SESSION_TOKEN`; +- credential expiration awareness; +- refresh before expiration, preferably with jitter; +- env, file, sidecar, controller, or `credential_process` delivery. 
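The "refresh before expiration, preferably with jitter" requirement above can be sketched as a small scheduling helper. This is an illustration under assumptions, not artifact-store code: the function names, the 75% refresh point, and the 10% jitter window are hypothetical choices. The timestamp format follows the `Expiration` values used in the examples in this document.

```python
import random
from datetime import datetime, timedelta, timezone


def parse_expiration(value: str) -> datetime:
    """Parse UTC timestamps of the form 2026-05-18T16:45:00Z."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def next_refresh_at(issued_at, expiration, *, refresh_fraction=0.75,
                    max_jitter_fraction=0.1, rng=random):
    """Pick a refresh time that falls before expiration, spread by jitter.

    refresh_fraction and max_jitter_fraction are assumed defaults.
    """
    lifetime = (expiration - issued_at).total_seconds()
    if lifetime <= 0:
        raise ValueError("expiration must be after issued_at")
    # Jitter only moves the refresh earlier, so many consumers holding
    # credentials with the same expiration do not refresh at once.
    jitter = rng.uniform(0.0, max_jitter_fraction) * lifetime
    delay = max(0.0, refresh_fraction * lifetime - jitter)
    return issued_at + timedelta(seconds=delay)
```

A consumer would call `next_refresh_at` each time new credentials arrive and schedule the next vending request for the returned time.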
+ +The existing static bridge can remain transitional: + +```bash +export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/s3-access-key +export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/s3-secret-key +``` + +Temporary credentials require either a session-token ref or a refresh +pattern that updates all three credential values atomically: + +```bash +export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/aws-access-key-id +export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/aws-secret-access-key +export ARTIFACTSTORE_S3_SESSION_TOKEN_REF=file:/run/secrets/artifactstore/aws-session-token +export ARTIFACTSTORE_S3_CREDENTIAL_EXPIRATION_REF=file:/run/secrets/artifactstore/expiration +``` + +Recommended deployment patterns: + +- CLI or SDK `credential_process` for developer and batch use; +- sidecar refresh process for pods that cannot call the vending API + directly; +- controller plus mounted files when platform operators need centralized + refresh and audit; +- direct vending API call only when the workload can protect its IAM + token and handle refresh safely. + +### Other S3 Consumers + +Consumers must support the session token. Access-key/secret-key-only +clients are limited to transitional static credentials and should not be +used for production tenant workloads. + +Prohibited patterns: + +- object-store root credentials in application pods; +- long-lived tenant access keys for normal workload traffic; +- bucket policy managed by application repos as the source of truth; +- storing parent R2/API credentials in tenant namespaces; +- ignoring credential expiration and retrying indefinitely with expired + credentials; +- accepting local-identity tokens in production. 
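A consumer can guard against two of the prohibited patterns (session-token-less credentials and use of expired credentials) with a small pre-flight check. The sketch below assumes the `credential_process` field names shown earlier in this document; the function name and the 60-second safety margin are hypothetical.

```python
from datetime import datetime, timedelta, timezone

REQUIRED_FIELDS = ("AccessKeyId", "SecretAccessKey", "SessionToken", "Expiration")


def check_temporary_credentials(payload, now=None,
                                min_remaining=timedelta(seconds=60)):
    """Return a list of problems; an empty list means the set is usable.

    Assumes `Expiration` is an ISO 8601 UTC timestamp ending in 'Z'.
    min_remaining is an illustrative safety margin, not a platform value.
    """
    now = now or datetime.now(timezone.utc)
    problems = [f"missing {name}" for name in REQUIRED_FIELDS
                if not payload.get(name)]
    expiration = payload.get("Expiration")
    if expiration:
        expires_at = datetime.fromisoformat(expiration.replace("Z", "+00:00"))
        if expires_at - now < min_remaining:
            problems.append("expired or too close to expiration")
    return problems
```

A consumer that gets a non-empty list should stop using the credential set and request a new one rather than retry with it.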
+ +## Failure Modes + +| Failure | Expected behavior | +| --- | --- | +| IAM token invalid or wrong audience | Deny before policy evaluation; emit audit event | +| Tenant missing or mismatched | Deny with `tenant_scope_missing` or `tenant_mismatch` | +| Prefix not registered | Deny with `prefix_not_registered_for_tenant` | +| TTL too long | Reduce to policy maximum or deny, depending on policy | +| flex-auth or Topaz unavailable | Fail closed except for explicitly documented emergency platform workflows | +| Backend STS unavailable | Do not mint credentials; return retryable backend error | +| OpenBao unavailable | Fail if parent material or broker config requires OpenBao; otherwise continue only for backend paths that do not depend on it | +| Audit sink unavailable | Deny privileged/platform-scoped requests; allow low-risk tenant requests only if policy permits buffered audit | +| Consumer refresh fails | Stop writes before expiration; retry vending with backoff; never fall back to root credentials | + +## Readiness Checks + +- IAM Profile token validation test passes for key-cape or Keycloak. +- flex-auth has policy packages for platform and tenant scopes. +- Topaz policy load and health are verified where delegated PDP is used. +- Backend-specific STS or temporary credential path returns credentials + with session token and expiration. +- OpenBao parent credential access, lease metadata, and audit correlation + work where OpenBao is in the path. +- artifact-store or the consumer can refresh all credential fields before + expiration. +- Deny paths produce stable reason codes and audit records. +- Break-glass operation is documented and post-event review is required. 
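The "TTL too long" row of the failure table and the stable-reason-code readiness check can be illustrated together. This is a sketch under stated assumptions: the deny body reuses the field names from the deny response shown earlier, while `resolve_ttl`, `deny_response`, and the reason codes `ttl_invalid` and `ttl_exceeds_policy_maximum` are hypothetical and not defined by this document.

```python
def resolve_ttl(requested_ttl, policy_max_ttl, clamp=True):
    """Return (granted_ttl, reason_code); reason_code is None on success.

    `clamp` stands in for the policy choice between reducing the TTL to
    the policy maximum and denying the request outright.
    """
    if requested_ttl <= 0:
        return None, "ttl_invalid"  # hypothetical reason code
    if requested_ttl <= policy_max_ttl:
        return requested_ttl, None
    if clamp:
        return policy_max_ttl, None
    return None, "ttl_exceeds_policy_maximum"  # hypothetical reason code


def deny_response(reason_code, decision_id, correlation_id):
    """Build a deny body with the same fields as the documented shape."""
    return {
        "error": "credential_denied",
        "reason_code": reason_code,
        "decision_id": decision_id,
        "audit_correlation_id": correlation_id,
    }
```

Keeping reason codes as fixed strings, not free-form messages, is what lets deny-path tests and audit queries stay stable across releases.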
+ +## References + +- [AWS STS AssumeRoleWithWebIdentity](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html) +- [Ceph RGW STS](https://docs.ceph.com/en/latest/radosgw/STS/) +- [MinIO AssumeRoleWithWebIdentity](https://min.io/docs/minio/linux/developers/security-token-service/AssumeRoleWithWebIdentity.html) +- [Cloudflare R2 Temporary Credentials API](https://developers.cloudflare.com/api/resources/r2/subresources/temporary_credentials/) +- [Cloudflare R2 temporary credential example](https://developers.cloudflare.com/r2/examples/authenticate-r2-temp-credentials/) diff --git a/docs/platform-identity-security-architecture.md b/docs/platform-identity-security-architecture.md index eb87953..8f92984 100644 --- a/docs/platform-identity-security-architecture.md +++ b/docs/platform-identity-security-architecture.md @@ -1,7 +1,7 @@ # Platform Identity and Security Architecture -Status: draft architecture baseline for NetKingdom/Railiance/Coulomb -Date: 2026-05-17 +Status: implemented architecture baseline for NetKingdom/Railiance/Coulomb +Date: 2026-05-18 ## Purpose @@ -305,6 +305,86 @@ Possible responsibilities: This orchestration layer should build on Railiance capabilities rather than bypassing the Railiance stack boundaries. +ADR-0007 records the current decision: keep orchestration in Railiance +playbooks for now, with NetKingdom defining the trust-state model, +readiness checks, OpenBao boundaries, and security semantics. + +## flex-auth And Topaz Implications + +flex-auth work must preserve the recursive boundary between platform +control-plane resources and tenant resources. + +Required implications: + +- CARING descriptors must include scope and tenant metadata for + tenant-scoped access, and must mark rare platform-scoped access + explicitly. +- Policy packages must distinguish `tenant:platform` policy from + tenant-local packages such as `tenant:coulomb`. 
+- Decision envelopes must carry subject, issuer, audience, tenant, + protected-system id, resource, action, requested TTL where relevant, + assurance evidence, obligations, deny reasons, and audit correlation + ids. +- Topaz is a delegated PDP runtime behind flex-auth. It must not become + the canonical policy model, identity provider, or platform control + plane. +- Audit and explain records must be durable enough to reconstruct why a + platform-root, secret, credential, or tenant-administration decision was + allowed or denied. +- Platform-root guardrails must deny tenant administrators the ability to + alter IAM Profile semantics, OpenBao platform mounts/auth methods, + flex-auth policy import pipelines, Topaz runtime configuration, or + platform audit retention. + +OpenBao secret access and dynamic credential requests follow the same +authorization rule: identity proves the actor or workload, flex-auth +decides whether the request is permitted, and OpenBao stores, issues, +leases, audits, and revokes the secret material. + +## Coulomb Tenant Onboarding Path + +The first Coulomb tenant onboarding path should be repeatable before it +becomes automated: + +1. Register `tenant:coulomb` as a tenant distinct from + `tenant:platform`. +2. Map Coulomb human, service, and agent principals to IAM Profile claims + with issuer, audience, subject, group, tenant, and assurance evidence. +3. Register Coulomb protected systems and resources in flex-auth with + stable protected-system ids. +4. Import tenant-scoped policy packages and CARING descriptors for + Coulomb resources. +5. Initialize the delegated PDP runtime, starting with Topaz, using only + the policy packages approved for the tenant and platform boundary. +6. Provision Coulomb workload secret paths, Kubernetes auth roles, or + delivery mechanisms in OpenBao without granting access to platform + mounts, unseal/recovery material, or global auth configuration. +7. 
Run audit readiness checks before admitting production traffic: + identity issuance, flex-auth decision envelope, Topaz health, + OpenBao audit event, workload enforcement event, and correlation id. + +The onboarding path is complete when a Coulomb workload can authenticate, +receive a scoped authorization decision, obtain only the allowed secret or +short-lived credential, enforce the decision locally, and produce an +auditable record without receiving platform-root authority. + +## Production Readiness Checks + +Before the security platform is production-ready, each trust state needs +an explicit check: + +| Area | Readiness check | +| --- | --- | +| MFA and identity | key-cape or Keycloak issues IAM Profile-compatible tokens; privacyIDEA or the selected MFA provider enforces required assurance for privileged actions | +| Bootstrap and recovery | age/SOPS material, emergency bundle, and break-glass credentials are present, tested, and separated from tenant administration | +| OpenBao runtime secrets | OpenBao is initialized, unsealed or auto-unsealed by the approved mechanism, backed up, audited, and using scoped auth methods and mounts | +| Secret rotation | service, database, OpenBao-issued, and break-glass rotation paths have documented blast radius and verification steps | +| flex-auth policy state | platform and tenant policy packages are versioned, reviewable, imported, and explainable | +| Topaz runtime | delegated PDP health, data freshness, policy load status, and fail-closed behavior are verified | +| Tenant onboarding | `tenant:coulomb` resources, claims, policies, OpenBao paths, and audit correlation are registered and tested | +| Audit sink | identity, flex-auth, Topaz, OpenBao, Kubernetes, and workload audit records land in durable storage with restore/drill coverage | +| Break-glass | emergency access works when normal identity is unavailable and produces a post-event review record | + ## Open Questions - Where is the durable audit log stored for 
platform-root decisions? diff --git a/workplans/NK-WP-0004-credential-management-foundation.md b/workplans/NK-WP-0004-credential-management-foundation.md index 04e21b0..45497ef 100644 --- a/workplans/NK-WP-0004-credential-management-foundation.md +++ b/workplans/NK-WP-0004-credential-management-foundation.md @@ -8,7 +8,7 @@ status: done owner: custodian topic_slug: netkingdom created: "2026-03-20" -updated: "2026-03-20" +updated: "2026-05-18" state_hub_workstream_id: "d9cf7c4b-886b-4cd1-ad7b-99c4e1929c9e" --- @@ -68,10 +68,34 @@ Operator warns about time-sensitive steps like enckey-bootstrap) ``` +## NK-WP-0006 Runtime Secret Refinement + +This workplan remains the bootstrap credential foundation. With OpenBao in +the platform stack, its outputs are not the final runtime secret model. +They establish enough trust to bring up identity, MFA, and platform +services safely. + +Trust-state mapping: + +- bare host and cluster trust are established by Railiance layers; +- bootstrap secret trust is established by SOPS/age, encrypted bundles, + emergency material, and Kubernetes Secret injection; +- bootstrap identity trust is established by local/key-cape/bootstrap + identity paths; +- runtime secret trust begins only after OpenBao is deployed, + initialized, unsealed or auto-unsealed by the approved mechanism, + audited, backed up, and ready to issue scoped secrets or dynamic + credentials. + +After runtime secret trust exists, Kubernetes Secrets created here should +be treated as bootstrap artifacts, delivery caches, or compatibility +mechanisms. Long-lived workload secret authority belongs in OpenBao, +governed by NetKingdom policy and Railiance platform operations. + ## Dependency on canon standard All design decisions in this workplan follow -`canon/standards/credential-management_v0.1.md`. +`canon/standards/credential-management_v0.2.md`. The KeePassXC group structure, phase model, SOPS policy, and prohibited patterns defined there are normative. 
This workplan implements them. diff --git a/workplans/NK-WP-0005-agent-driven-credential-bootstrap.md b/workplans/NK-WP-0005-agent-driven-credential-bootstrap.md index d464c9f..e72a86b 100644 --- a/workplans/NK-WP-0005-agent-driven-credential-bootstrap.md +++ b/workplans/NK-WP-0005-agent-driven-credential-bootstrap.md @@ -8,7 +8,7 @@ status: done owner: custodian topic_slug: netkingdom created: "2026-03-21" -updated: "2026-03-21" +updated: "2026-05-18" depends_on: NK-WP-0004 state_hub_workstream_id: "75bc472b-cc0a-48f2-afb6-62b896f7cc19" --- @@ -59,6 +59,33 @@ Agent Human Everything else — service secrets, rotation, re-injection — is agent work. +## NK-WP-0006 Runtime Secret Refinement + +With OpenBao in the platform stack, the agent-driven bootstrap is the +handoff mechanism from bootstrap secrets to runtime secret authority. +The agent may generate, encrypt, inject, and verify initial secrets, but +OpenBao becomes the normal authority for platform and workload secret +delivery once the control plane is alive. + +The bootstrap flow therefore has one additional boundary: + +1. SOPS/age and the emergency bundle establish bootstrap and recovery + authority. +2. Kubernetes Secrets carry the minimum initial material needed to start + the identity, MFA, database, and OpenBao platform services. +3. OpenBao is initialized, unsealed or auto-unsealed by the approved + mechanism, audit logging is enabled, backups are verified, and + workload auth methods are configured. +4. Runtime workloads receive scoped secrets, dynamic credentials, or + synchronized Kubernetes Secrets from OpenBao. They do not consume + platform-root bootstrap material. + +OpenBao root tokens, unseal keys, or recovery keys are break-glass +material. They must not be stored as ordinary tenant secrets or exposed +to tenant administrators. 
If they are included in an emergency bundle, +that bundle is platform-control-plane break-glass material and requires +the strongest storage and review procedure available for the deployment. + ## Design ### What changes from NK-WP-0004 diff --git a/workplans/NK-WP-0006-recursive-platform-identity-security-architecture.md b/workplans/NK-WP-0006-recursive-platform-identity-security-architecture.md index fbbd0b4..5b5c1c2 100644 --- a/workplans/NK-WP-0006-recursive-platform-identity-security-architecture.md +++ b/workplans/NK-WP-0006-recursive-platform-identity-security-architecture.md @@ -4,11 +4,11 @@ type: workplan title: Recursive platform identity and security architecture domain: netkingdom repo: net-kingdom -status: proposed +status: done owner: Bernd Worsch topic_slug: netkingdom created: 2026-05-17 -updated: 2026-05-17 +updated: 2026-05-18 depends_on: - NK-WP-0001 - NK-WP-0004 @@ -27,7 +27,7 @@ accidentally becoming the platform root of trust. The workplan turns the recursive insight into operational structure: bootstrap plane, platform control plane, tenant plane, IAM Profile, flex-auth authorization, Topaz runtime, privacyIDEA MFA/token handling, -and safe orchestration boundaries. +OpenBao runtime secret authority, and safe orchestration boundaries. ## Context @@ -49,7 +49,7 @@ In scope: - document the three-plane architecture - define platform-root versus tenant authority - define how NetKingdom, key-cape, Keycloak, privacyIDEA, flex-auth, - Topaz, and Railiance relate + Topaz, OpenBao, and Railiance relate - define bootstrap-to-runtime trust states - update related workplans and ADRs when implementation details become concrete @@ -59,6 +59,7 @@ Out of scope: - implementing flex-auth adapters - deploying Keycloak, key-cape, privacyIDEA, Topaz, or Railiance services +- deploying OpenBao itself - designing customer-specific tenant policy - replacing existing Railiance layer ownership @@ -84,7 +85,7 @@ to a stable decision. 
```task id: NK-WP-0006-T3 -status: todo +status: done priority: high state_hub_task_id: "842ba5a7-5199-490a-8af5-3150388e0d42" ``` @@ -94,7 +95,7 @@ scope, audit/explain records, and platform-root guardrails. ```task id: NK-WP-0006-T4 -status: todo +status: done priority: high state_hub_task_id: "ce153339-f493-44ed-a2c5-befb578334fe" ``` @@ -104,7 +105,7 @@ runtime identity, runtime authorization, tenant onboarding. ```task id: NK-WP-0006-T5 -status: todo +status: done priority: medium state_hub_task_id: "6c9a3561-4e63-4acd-87a7-bf0f374fa6b2" ``` @@ -114,7 +115,7 @@ audit readiness. ```task id: NK-WP-0006-T6 -status: todo +status: done priority: medium state_hub_task_id: "27760e30-f773-4552-97f4-7fbe56507f9e" ``` @@ -123,7 +124,7 @@ a dedicated repo. Capture the decision as an ADR before implementation. ```task id: NK-WP-0006-T7 -status: todo +status: done priority: medium state_hub_task_id: "f09519ac-cf97-4f8b-8a7b-6ff828bbd8d9" ``` @@ -131,6 +132,33 @@ Define production readiness checks for the security platform: MFA state, secret rotation state, flex-auth policy state, Topaz health, audit sink, and break-glass verification. +## Implementation Review - 2026-05-18 + +The recursive architecture remains the right framing. The refinement from +the current stack is that OpenBao is now part of the platform control +plane as the runtime secret authority. SOPS/age and emergency bundles +remain bootstrap and recovery mechanisms; they must not become the +long-lived runtime authority for every workload secret once OpenBao is +available. + +Implemented refinements: + +- `docs/platform-identity-security-architecture.md` now includes explicit + flex-auth/Topaz implications, Coulomb tenant onboarding, production + readiness checks, and OpenBao secret authority boundaries. 
+- `docs/adr/ADR-0007-security-orchestration-boundary.md` records that + orchestration stays in Railiance playbooks for now; a dedicated repo is + deferred until sequencing has a stable, cross-repo product surface. +- `workplans/NK-WP-0007-object-storage-sts-credential-vending.md` now + treats OpenBao as the runtime broker/audit option without letting it + replace flex-auth authorization or storage-native STS semantics. +- `workplans/NK-WP-0004-credential-management-foundation.md`, + `workplans/NK-WP-0005-agent-driven-credential-bootstrap.md`, and + `canon/standards/credential-management_v0.2.md` now distinguish + bootstrap credential handling from the OpenBao runtime-secret handoff. + +State Hub task statuses should be synchronized to match this workplan. + ## Acceptance Criteria - Architecture docs distinguish bootstrap plane, platform control plane, @@ -138,7 +166,7 @@ and break-glass verification. - Coulomb is represented as tenant zero/reference tenant, not platform root. - The role of NetKingdom, key-cape, Keycloak, privacyIDEA, flex-auth, - Topaz, and Railiance is clear. + Topaz, OpenBao, and Railiance is clear. - Follow-up workplans identify where flex-auth and bootstrap work need to adapt. - Any future orchestration repo is justified by an ADR before it is diff --git a/workplans/NK-WP-0007-object-storage-sts-credential-vending.md b/workplans/NK-WP-0007-object-storage-sts-credential-vending.md index 54b43ee..6c3b98b 100644 --- a/workplans/NK-WP-0007-object-storage-sts-credential-vending.md +++ b/workplans/NK-WP-0007-object-storage-sts-credential-vending.md @@ -4,13 +4,13 @@ type: workplan title: Object Storage STS Credential Vending domain: netkingdom repo: net-kingdom -status: proposed +status: done owner: codex topic_slug: netkingdom planning_priority: high planning_order: 7 created: 2026-05-17 -updated: 2026-05-17 +updated: 2026-05-18 depends_on: - NK-WP-0004 - NK-WP-0005 @@ -50,6 +50,11 @@ security architecture. 
The surrounding ecosystem matters: lightweight stack, not object-storage policy owners. - flex-auth owns policy-as-code decisions, resource/action vocabulary, decision envelopes, delegated PDP adapters, and audit semantics. +- OpenBao is now part of the platform stack as the runtime secret + authority, dynamic credential broker where appropriate, and audit + source for secret access. It can broker or store credential material, + but it does not replace flex-auth authorization or provider-native STS + semantics. - ops-warden and ops-bridge provide a useful precedent for short-lived credentials and actor attribution, but they are SSH-specific and should not be overloaded with object-storage credentials. @@ -80,12 +85,38 @@ Out of scope: - replacing flex-auth with provider-specific bucket policies - putting object-storage policy inside key-cape, ops-warden, or ops-bridge +- letting OpenBao root/admin authority become the object-storage policy + model + +## Recursive Platform Implications + +This workplan depends on NK-WP-0006, so object-storage credential vending +must honor the platform/tenant split: + +- `tenant:platform` may administer the vending service, OpenBao mounts, + storage backends, policy import pipeline, and audit retention. +- `tenant:coulomb` and future tenants may request scoped credentials only + for registered tenant resources. +- flex-auth decision envelopes must include tenant id, protected-system + id, bucket or prefix, action set, TTL, assurance evidence, obligations, + deny reasons, and audit correlation ids. +- CARING descriptors must mark whether a request is platform-scoped or + tenant-scoped; platform-scoped descriptor use is rare, reviewed, and + auditable. +- Topaz is the first delegated PDP runtime behind flex-auth. Its data and + policy loading must not give a tenant administrator control over + platform policies. +- OpenBao may broker, lease, audit, or store temporary credential + material after flex-auth approval. 
OpenBao must not become the source of + object-storage authorization policy, and tenants must not receive + OpenBao root tokens, unseal/recovery material, platform mounts, or + global auth-method control. ## Tasks ```task id: NK-WP-0007-T1 -status: todo +status: done priority: high state_hub_task_id: "3b50c48f-1ab2-4631-b176-d49d9d705f1e" ``` @@ -97,7 +128,7 @@ flow, and failure modes. ```task id: NK-WP-0007-T2 -status: todo +status: done priority: high state_hub_task_id: "5b942d22-6f29-4975-88fb-e3e5bcaf4029" ``` @@ -109,7 +140,7 @@ multipart operations), TTL limits, obligations, and deny reasons. ```task id: NK-WP-0007-T3 -status: todo +status: done priority: high state_hub_task_id: "8d27e5b4-9bbb-4a53-a079-0df1047d755e" ``` @@ -121,30 +152,31 @@ issuer restrictions. ```task id: NK-WP-0007-T4 -status: todo +status: done priority: medium state_hub_task_id: "c0c4f297-6cff-419b-9ce3-be5537c92e93" ``` Assess backend STS implementations and write a decision record covering Ceph RGW STS, MinIO/AIStor STS, AWS STS, Cloudflare R2 temporary -credentials, and whether OpenBao/Vault should broker any of these -directly. +credentials, and when OpenBao should broker, lease, audit, or store the +resulting credential material. ```task id: NK-WP-0007-T5 -status: todo +status: done priority: medium state_hub_task_id: "ccb10b2d-6378-4824-90b1-c31bd882d93d" ``` Prototype the smallest credential-vending interface: CLI or HTTP request shape, normalized response shape, lease metadata, audit event, -and a `credential_process`-compatible option for SDK consumers. +OpenBao lease/audit metadata where used, and a +`credential_process`-compatible option for SDK consumers. ```task id: NK-WP-0007-T6 -status: todo +status: done priority: medium state_hub_task_id: "63c6859b-980e-44da-a5a6-b92a8a3225dd" ``` @@ -154,12 +186,32 @@ environment variables, `AWS_SESSION_TOKEN`, refresh behavior, sidecar or controller refresh options, and prohibited patterns such as long-lived root access keys. 
+## Implementation Review - 2026-05-18 + +Implemented as architecture and decision artifacts: + +- `docs/object-storage-sts-credential-vending.md` defines the target + architecture, actors, trust boundaries, token flow, flex-auth + vocabulary, IAM Profile requirements, backend assessment, OpenBao + role, request/response prototype, audit event, failure modes, and + consumer guidance. +- `docs/adr/ADR-0008-object-storage-sts-credential-vending.md` records + the decision to use a provider-neutral NetKingdom vending boundary with + provider-native temporary credential mechanisms where possible. + +The implementation deliberately stops before building a live vending +service. Service implementation belongs in a follow-up workplan once +artifact-store has session-token/refresh support and the Railiance +OpenBao bootstrap/unseal/break-glass work is ready. + ## Acceptance Criteria - NetKingdom has a canonical, provider-neutral pattern for object-storage STS credential vending. - flex-auth is the policy decision point for bucket/prefix/action/TTL authorization. +- OpenBao is treated as runtime secret/lease infrastructure where useful, + not as the canonical authorization policy engine. - key-cape and Keycloak are treated as IAM Profile implementations, not object-storage policy engines. 
- ops-warden and ops-bridge remain SSH/tunnel-specific but their diff --git a/workplans/NK-WP-0008-it-security-architecture-patterns-infospace.md b/workplans/NK-WP-0008-it-security-architecture-patterns-infospace.md index 3e1ee1f..3cdf277 100644 --- a/workplans/NK-WP-0008-it-security-architecture-patterns-infospace.md +++ b/workplans/NK-WP-0008-it-security-architecture-patterns-infospace.md @@ -4,16 +4,18 @@ type: workplan title: IT Security Architecture Patterns Infospace domain: netkingdom repo: net-kingdom -status: proposed +status: done owner: codex topic_slug: netkingdom planning_priority: high planning_order: 8 created: 2026-05-17 -updated: 2026-05-17 +updated: 2026-05-19 depends_on: - NK-WP-0006 state_hub_workstream_id: "053c6d96-9396-40c9-a2e5-c36531e7810d" +execution_repo: infospace-bench +infospace_path: infospaces/patterns-of-it-securita-architecture --- # NK-WP-0008 - IT Security Architecture Patterns Infospace @@ -49,13 +51,40 @@ and secrets management. Several patterns repeat across repos: An infospace makes these repeatable architectural patterns explicit, searchable, comparable, and teachable. +## Seeded Infospace + +Bernd seeded the working corpus in `infospace-bench`: + +```text +/home/worsch/infospace-bench/infospaces/patterns-of-it-securita-architecture/ +``` + +The current seed is: + +```text +genesis/InitialExploration.md +``` + +That file already sketches the two intended collections: a security +capability catalog and a security architecture pattern catalog, with +readiness levels and mappings to standards such as NIST CSF, CIS, OWASP, +SLSA, and OpenSSF. + +NK-WP-0008 will therefore use `infospace-bench` as the canonical working +space instead of creating a separate `docs/security-patterns/` tree in +`net-kingdom`. `net-kingdom` remains the owner of the NetKingdom security +architecture mapping; `infospace-bench` owns the concrete infospace +artifact lifecycle, manifests, evaluation reports, and exports. 
+ ## Scope In scope: -- define a pattern document template -- create an infospace directory under `docs/security-patterns/` -- capture initial patterns and their NetKingdom mapping +- promote the seeded directory into a valid `infospace-bench` infospace + with `infospace.yaml`, `artifacts/index.yaml`, and the standard layout +- import `genesis/InitialExploration.md` as the seed source artifact +- define capability and pattern document templates inside the infospace +- capture initial capability, pattern, readiness, and mapping artifacts - record source references and current product/tool options - connect patterns to implementation repos and workplans - distinguish canonical patterns from experiments and anti-patterns @@ -63,74 +92,195 @@ In scope: Out of scope: - implementing every pattern +- building new lower-level `infospace-bench` engine features - replacing ADRs - duplicating vendor documentation +- using `net-kingdom/docs/security-patterns/` as the primary corpus - writing full tutorials; tutorials are handled by NK-WP-0009 ## Tasks +### T01 - Promote The Seed Into A Valid Infospace + ```task id: NK-WP-0008-T1 -status: todo +status: done priority: high state_hub_task_id: "d1b7213c-3315-49d2-90c9-efdf2bea3563" ``` -Define the security-pattern infospace structure and template: -problem, forces, applicability, threat model, canonical NetKingdom -mapping, implementation variants, operational checks, audit hooks, -anti-patterns, and references. +Promote +`/home/worsch/infospace-bench/infospaces/patterns-of-it-securita-architecture` +from a genesis note into a valid `infospace-bench` project: create +`infospace.yaml`, `artifacts/index.yaml`, the required artifact/output +directories, and a manifest entry for `genesis/InitialExploration.md`. 
+ +Define the initial artifact taxonomy for: + +- capabilities +- patterns +- readiness levels +- mappings +- source references +- generated reports + +### T02 - Extract The Initial Capability And Pattern Catalogs ```task id: NK-WP-0008-T2 -status: todo +status: done priority: high state_hub_task_id: "59966187-27f1-4b9c-9dfc-e59d11ff115c" ``` -Create initial pattern entries for: STS credential vending, workload -identity, secret zero avoidance, dynamic secrets, short-lived SSH -certificates, delegated authorization, break-glass access, and -policy-as-code. +Extract the first structured artifacts from the seeded exploration: + +- a security capability catalog +- a security architecture pattern catalog +- readiness levels RL0-RL4 +- a production readiness baseline + +Initial patterns must include STS credential vending, workload identity, +secret zero avoidance, dynamic secrets, short-lived SSH certificates, +delegated authorization, break-glass access, tenant isolation, central +audit ledger, policy-as-code admission, and supply-chain provenance. + +Each pattern artifact should use a repeatable template: problem, context, +forces, solution, implementation sketch, failure modes, related +capabilities, maturity, verification, and references. + +### T03 - Map Patterns To NetKingdom And Ecosystem Ownership ```task id: NK-WP-0008-T3 -status: todo +status: done priority: medium state_hub_task_id: "927c08a5-1a7e-4634-a514-0f562e286708" ``` -Map each pattern to NetKingdom/Railiance repos and components: -net-kingdom, key-cape, flex-auth, ops-warden, ops-bridge, -railiance-platform, artifact-store, Keycloak, Authelia, LLDAP, -privacyIDEA, OpenBao, Ceph RGW, and MinIO-compatible stores. 
+Map each capability and pattern to the owning repos, components, and +workplans: + +- `net-kingdom`, NK-WP-0006, NK-WP-0007, NK-WP-0008, NK-WP-0009 +- `key-cape`, Keycloak, Authelia, LLDAP, privacyIDEA +- `flex-auth`, Topaz, CARING descriptors, decision envelopes +- `railiance-platform`, OpenBao, Ceph RGW, MinIO-compatible stores +- `ops-warden`, `ops-bridge`, short-lived SSH and agent access +- `artifact-store`, object storage, STS consumers, artifact integrity +- relevant standards and references from the seed document + +This mapping should distinguish platform ownership, product/application +ownership, tenant responsibility, and external provider responsibility. + +### T04 - Build The Index, Maturity Matrix, And Report ```task id: NK-WP-0008-T4 -status: todo +status: done priority: medium state_hub_task_id: "884626ea-243e-4806-9267-77ef643158b7" ``` -Create an index page with pattern status, maturity, owning repo, -implementation links, and open decisions. +Create the first infospace index/report with capability status, pattern +maturity, owning repo, implementation links, open decisions, and gaps. + +The report should make it easy to answer: + +- which patterns are canonical versus experimental +- which patterns already have NetKingdom/Railiance implementation anchors +- which patterns need ADRs, workplans, or tutorials +- which patterns feed NK-WP-0009 tutorials + +### T05 - Define Admission And Review Criteria ```task id: NK-WP-0008-T5 -status: todo +status: done priority: medium state_hub_task_id: "d3b29f3d-0da5-43b5-a93a-d95fb8a0ceef" ``` Add review criteria for admitting new patterns: vendor neutrality, threat-model clarity, open-source/commercial implementation options, -operability, auditability, and failure-mode behavior. +operability, auditability, failure-mode behavior, readiness-level fit, +evidence quality, and ownership clarity. 
+ +Define when a pattern is allowed to graduate from: + +```text +seed -> draft -> reviewed -> canonical -> deprecated +``` + +The criteria should be expressed as an infospace evaluation/checklist so +future pattern additions can be reviewed consistently. + +## Implementation Review - 2026-05-19 + +NK-WP-0008 has been implemented in `infospace-bench` at: + +```text +/home/worsch/infospace-bench/infospaces/patterns-of-it-securita-architecture/ +``` + +Created artifacts: + +- `infospace.yaml` +- `artifacts/index.yaml` +- `artifacts/entities/security-capability-catalog.md` +- `artifacts/entities/security-architecture-pattern-catalog.md` +- `artifacts/entities/security-readiness-levels.md` +- `artifacts/relations/netkingdom-ownership-map.md` +- `artifacts/generated/security-pattern-index.md` +- `artifacts/generated/pattern-admission-review.md` +- `reports/initial-security-pattern-report.md` + +Population pass: + +- copied the NetKingdom object-storage STS credential-vending document, + platform identity/security architecture document, ADR-0008, and + Railiance OpenBao platform secrets service note into + `artifacts/sources/` +- added `artifacts/entities/capability-object-storage-access.md` +- added `artifacts/entities/pattern-sts-credential-vending.md` +- added `artifacts/relations/sts-credential-vending-relationship-map.md` +- added `artifacts/generated/sts-credential-vending-extraction.md` +- registered the new STS/OpenBao evidence, capability, pattern, relation + map, extraction, index, and report edges in `artifacts/index.yaml` +- normalized relationship direction so the infospace graph remains + connected and acyclic + +Catalogue population pass: + +- added first-class pattern artifacts for the initial NetKingdom pattern + set beyond STS credential vending: workload identity, secret zero + avoidance, dynamic secrets, short-lived SSH certificates, delegated + authorization, break-glass access, tenant isolation, central audit + ledger, policy-as-code admission, 
supply-chain provenance, network + default deny, object-level authorization check, human/agent identity + split, and tenant context propagation +- added `artifacts/generated/research-pattern-normalization.md` to map + the broader `genesis/InitialExploration.md` research tables into the + first-class pattern set and future promotion candidates +- registered the pattern artifacts in `artifacts/index.yaml` with source + seed, catalog inclusion, ownership map, admission review, readiness, + and index summary relationships + +Verification: + +- `.venv/bin/python -m infospace_bench inspect infospaces/patterns-of-it-securita-architecture` +- `.venv/bin/python -m infospace_bench validate infospaces/patterns-of-it-securita-architecture` +- `.venv/bin/python -m infospace_bench metrics infospaces/patterns-of-it-securita-architecture` passed viability in snapshot `502d0933` with 16 artifacts, 100% coverage, one connected component, and zero consistency cycles +- `.venv/bin/python -m infospace_bench graph infospaces/patterns-of-it-securita-architecture --format mermaid` +- `.venv/bin/python -m pytest` passed with 179 passed, 2 skipped ## Acceptance Criteria -- `docs/security-patterns/` exists with a repeatable pattern template. +- `infospace-bench/infospaces/patterns-of-it-securita-architecture/` + is a valid infospace-bench project with a manifest and standard layout. - Initial high-value security patterns are documented in a consistent shape. - Each pattern names the canonical NetKingdom mapping and the repos that own implementation. - The infospace distinguishes patterns, tutorials, ADRs, and vendor docs. +- State Hub task titles and descriptions reflect the concrete + infospace-bench workflow instead of generic placeholder task names.