Add security architecture pattern infospace

2026-05-19 07:12:07 +02:00
parent 3ca891de4a
commit 5bb4b40b86
81 changed files with 6836 additions and 0 deletions

@@ -0,0 +1,115 @@
# Capability: Object Storage Access
Status: draft
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform, artifact-store
## Intent
Provide safe object-storage access for platform and tenant workloads
without giving applications long-lived root credentials or making each
application own storage authorization policy.
## Scope
Included:
- identity-backed access requests;
- bucket, prefix, action, tenant, and TTL scoping;
- temporary credentials with session token and expiration;
- audit correlation across identity, authorization, OpenBao, backend,
and workload events;
- transitional static credential bridge where STS is not ready.
Excluded:
- object-storage backend deployment;
- product-specific artifact package semantics;
- replacing flex-auth with provider-specific bucket policy;
- exposing parent object-store credentials to tenant workloads.
## Threats Addressed
- leaked long-lived object-store access keys;
- application repos becoming policy owners for storage access;
- cross-tenant bucket or prefix access;
- workloads using platform-root object-store credentials;
- unaudited access to generated artifacts, backups, reports, and
evidence packages;
- failure to revoke or expire credentials after task completion.
## Required Controls
- IAM Profile token validation for human, service, and agent callers.
- flex-auth decision envelope covering protected system, tenant,
resource, action set, TTL, assurance, obligations, and deny reason.
- Provider-native temporary credentials where the backend supports them.
- OpenBao custody for parent credentials, broker configuration, delivery
secrets, and audit records where used.
- Consumer support for `AWS_SESSION_TOKEN` and expiration-aware refresh.
- Durable audit sink and correlation id.
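
The expiration-aware refresh control can be sketched as a small cache around whatever call vends temporary credentials. This is a minimal illustration, not the artifact-store implementation: `TemporaryCredentials`, `RefreshingCredentials`, the `vend` callable, and the five-minute margin are all assumptions introduced here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass(frozen=True)
class TemporaryCredentials:
    access_key_id: str
    secret_access_key: str
    session_token: str      # handed to consumers as AWS_SESSION_TOKEN
    expires_at: datetime


class RefreshingCredentials:
    """Caches temporary credentials and re-vends them shortly before expiry."""

    def __init__(self, vend: Callable[[], TemporaryCredentials],
                 margin: timedelta = timedelta(minutes=5),
                 now: Callable[[], datetime] = lambda: datetime.now(timezone.utc)):
        self._vend = vend      # hypothetical call into the credential-vending service
        self._margin = margin  # assumed safety margin before expiration
        self._now = now        # injectable clock, mainly for tests
        self._current: Optional[TemporaryCredentials] = None

    def get(self) -> TemporaryCredentials:
        cur = self._current
        # Refresh when nothing is cached or the margin before expiry is reached.
        if cur is None or self._now() >= cur.expires_at - self._margin:
            cur = self._current = self._vend()
        return cur

    def as_env(self) -> dict:
        c = self.get()
        return {
            "AWS_ACCESS_KEY_ID": c.access_key_id,
            "AWS_SECRET_ACCESS_KEY": c.secret_access_key,
            "AWS_SESSION_TOKEN": c.session_token,
        }
```

A consumer would call `get()` or `as_env()` before each storage operation, so a long-running task picks up fresh credentials without holding a static key.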
## Implementation Options
| Option | Use when | Notes |
| --- | --- | --- |
| AWS STS `AssumeRoleWithWebIdentity` | AWS-native object storage | Strong native fit; use IAM OIDC provider and role trust policies |
| Ceph RGW STS | self-hosted Ceph object storage | Use when RGW IAM/STS maturity fits deployment risk |
| MinIO/AIStor STS | lightweight or self-hosted S3-compatible storage | Good fit if consumers support session tokens |
| Cloudflare R2 temporary credentials | Cloudflare object storage | Requires backend-specific broker protecting parent credentials |
| Transitional static bridge | before STS support is ready | Store scoped static credentials in OpenBao; rotate and retire quickly |
## Platform Responsibility
- define issuer, audience, tenant, and assurance requirements;
- define flex-auth resource/action vocabulary;
- operate or approve the credential-vending service;
- protect backend parent credentials through OpenBao;
- provide audit retention and break-glass procedure.
## Product Responsibility
- use temporary credentials rather than root/static credentials;
- refresh credentials before expiration;
- include correlation ids in storage operations where possible;
- handle deny and expiration cleanly;
- avoid embedding policy decisions in application code.
## Tenant Responsibility
- request access only to registered tenant resources;
- manage tenant-scoped groups or memberships where delegated;
- review tenant-visible audit events where available.
## Readiness Criteria
| Level | Criteria |
| --- | --- |
| RL1 | static scoped credentials are not committed; object-store root credentials are not used |
| RL2 | tenant/bucket/prefix mapping exists; OpenBao or equivalent custody protects credentials |
| RL3 | temporary credentials, session token support, flex-auth decisions, audit correlation, and refresh behavior are verified |
| RL4 | tenant-visible audit, dual control for platform-scoped access, restore/revocation drills, and standards mapping are complete |
## Evidence
- `docs/object-storage-sts-credential-vending.md`
- `ADR-0008 - Object Storage STS Credential Vending Boundary`
- artifact-store support for `AWS_SESSION_TOKEN`
- OpenBao audit record for broker/parent credential access
- flex-auth decision record with stable reason codes
- backend credential expiration and revocation proof
## Related Patterns
- `pattern-sts-credential-vending.md`
- secret zero avoidance
- delegated authorization
- workload identity
- central audit ledger
## Related Standards
- NIST CSF Protect and Detect functions.
- OWASP API Security broken object-level authorization risk.
- SLSA and artifact integrity patterns where object storage holds release
artifacts or provenance.

@@ -0,0 +1,44 @@
# Pattern: API Gateway as Security Boundary
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, product repos
Genesis family: Application/API security
## Problem
Public APIs need consistent edge protections before traffic reaches
product services.
## Context
Use this pattern for public HTTP APIs, tenant-facing APIs, admin APIs,
and other ingress paths that need rate limiting, schema checks,
authentication, and edge logging.
## Forces
- Gateways can centralize common controls.
- Applications still need local authorization and validation.
- Edge policies must not hide tenant or object-level checks.
- Admin APIs require stricter exposure rules than public product APIs.
## Solution
Place a managed gateway at API ingress to enforce authentication
prechecks, TLS, rate limits, request size, schema constraints, logging,
and routing before forwarding to application enforcement points.
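
As a rough sketch of the edge prechecks described above, the function below rejects oversized, unauthenticated, or rate-limited requests before anything is forwarded. The limits, the `Request` shape, and the `EdgeGateway` class are assumptions for illustration; a managed gateway would express the same checks as configuration, and window resets are omitted here.

```python
from dataclasses import dataclass, field

MAX_BODY_BYTES = 1_000_000  # assumed request size limit
RATE_LIMIT = 3              # assumed requests per client per window


@dataclass
class Request:
    client_id: str
    path: str
    headers: dict
    body: bytes = b""


@dataclass
class EdgeGateway:
    # Accepted-request count per client in the current window.
    seen: dict = field(default_factory=dict)

    def precheck(self, req: Request) -> int:
        """Return an HTTP status; 200 means forward to the application."""
        if len(req.body) > MAX_BODY_BYTES:
            return 413
        # Authentication precheck only: the token is verified and object-level
        # authorization is enforced behind the gateway.
        if not req.headers.get("Authorization", "").startswith("Bearer "):
            return 401
        count = self.seen.get(req.client_id, 0) + 1
        self.seen[req.client_id] = count
        if count > RATE_LIMIT:
            return 429
        return 200
```

A 200 from `precheck` only means the request may reach the application enforcement point; it is not an authorization decision.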
## Verification
- Unauthenticated or malformed requests are rejected at the edge.
- Rate limits and abuse controls are active.
- Admin surfaces use separate routes and stronger controls.
- Application-level object authorization still runs behind the gateway.
## Related Patterns
- Object-Level Authorization Check.
- Schema-First API Security.
- Network Default Deny.
- Central Audit Ledger.

@@ -0,0 +1,46 @@
# Pattern: Backend-for-Frontend
Status: seed
Readiness target: RL2 private beta
Primary owners: product repos, NetKingdom
Genesis family: Application/API security
## Problem
Different UI clients often need different API shapes, but exposing broad
backend APIs directly to browsers or mobile clients increases token and
data exposure.
## Context
Use this pattern for web frontends, mobile clients, admin consoles,
tenant portals, and agent-facing UI surfaces.
## Forces
- UI clients need tailored data and workflow APIs.
- Backend services may expose fields or operations that clients should
not see.
- Tokens and sessions need client-appropriate handling.
- Authorization decisions still need tenant and object context.
## Solution
Create a client-specific backend layer that mediates session handling,
data shaping, authorization calls, and downstream API access for one
frontend class.
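
The data-shaping part of this solution can be illustrated with a per-client field allowlist. The client classes, field names, and header names below are hypothetical; the point is only that the BFF decides what each frontend class may see and attaches trusted context to downstream calls.

```python
# Hypothetical allowlists: frontend class -> fields it may receive.
CLIENT_FIELDS = {
    "tenant-portal": {"id", "name", "status"},
    "admin-console": {"id", "name", "status", "owner_email"},
}


def shape(client_class: str, record: dict) -> dict:
    """Drop every backend field that is not allowlisted for this client."""
    allowed = CLIENT_FIELDS[client_class]  # unknown client classes raise KeyError
    return {k: v for k, v in record.items() if k in allowed}


def downstream_headers(session: dict) -> dict:
    """Build trusted context from the server-side session, never from client input."""
    # Header names are assumptions for illustration.
    return {"X-User-Id": session["user_id"], "X-Tenant-Id": session["tenant"]}
```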
## Verification
- The BFF exposes only client-appropriate operations and fields.
- Downstream calls include trusted user, tenant, and authorization
context.
- Sensitive backend tokens are not exposed to clients.
- Session and CSRF controls match the client type.
## Related Patterns
- API Gateway as Security Boundary.
- Object-Level Authorization Check.
- Tenant Context Propagation.
- Schema-First API Security.

@@ -0,0 +1,77 @@
# Pattern: Break-Glass Access
Status: reviewed
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform
## Problem
Operators need a recovery path when normal identity, policy, cluster, or
secret services fail, but emergency access can easily become an
unbounded platform-root bypass.
## Context
Use this pattern for OpenBao recovery, cluster recovery, privileged
account recovery, incident containment, and platform restore workflows.
## Forces
- Emergency access must work during partial outages.
- It must be limited, auditable, and rarely used.
- Tenant administrators must not receive platform-root powers.
- Post-event review must turn emergency use into durable fixes.
## Solution
Define a small emergency path with explicit custody, MFA or quorum where
possible, narrow scope, recorded use, and mandatory post-event review.
Keep it separate from ordinary administration.
## Implementation Sketch
1. Identify emergency scenarios and required minimum authority.
2. Store emergency material separately with named custodians.
3. Require ceremony, reason, and timestamp for use.
4. Alert on activation where systems are available.
5. Rotate or reseal affected credentials after use.
6. Run post-event review and close follow-up tasks.
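
Steps 1 to 3 of the sketch above could be enforced by a small activation check along these lines. The scenario names, quorum sizes, and the `Activation` record are assumptions introduced for illustration; alerting, rotation, and review tracking are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Hypothetical scenario bundles: scenario -> distinct custodians required.
SCENARIO_QUORUM = {"openbao-recovery": 2, "cluster-recovery": 2}


@dataclass(frozen=True)
class Activation:
    scenario: str
    custodians: tuple
    reason: str
    activated_at: datetime
    review_open: bool = True  # stays open until the post-event review closes it


def activate(scenario: str, custodians: list, reason: str,
             now: Optional[datetime] = None) -> Activation:
    """Record a break-glass activation, refusing unknown or under-quorum use."""
    if scenario not in SCENARIO_QUORUM:
        raise PermissionError("unknown emergency scenario")
    distinct = tuple(sorted(set(custodians)))  # the same custodian cannot count twice
    if len(distinct) < SCENARIO_QUORUM[scenario]:
        raise PermissionError("custodian quorum not met")
    if not reason.strip():
        raise ValueError("a reason is required")
    return Activation(scenario, distinct, reason.strip(),
                      now or datetime.now(timezone.utc))
```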
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Break-glass becomes routine admin | require review and track frequency |
| Emergency access is too broad | define scenario-specific bundles |
| Recovery material is stale | run drills and rotation checks |
| Tenant admins gain platform-root access | hard-separate tenant and platform authority |
## Related Capabilities
- Incident response and recovery.
- Privileged access management.
- Secrets, keys, and credentials.
- Security governance and production readiness.
## Maturity
Reviewed. The concept is anchored in NetKingdom/OpenBao planning, but
drills and custody evidence are required before canonical graduation.
## Verification
- Emergency path is documented and tested.
- Activation produces an event record and follow-up review.
- Credentials are rotated or revalidated after use.
- Tenant and platform emergency powers are separated.
## Research Basis
Seeded by break-glass access, incident response process, backup restore,
and secret-zero avoidance requirements.
## References
- Initial exploration: Identity and access patterns.
- Initial exploration: Incident response and recovery.
- Railiance OpenBao platform secrets service.

@@ -0,0 +1,44 @@
# Pattern: Cell-based Architecture
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, product repos
Genesis family: Tenant isolation
## Problem
Large multi-tenant systems need blast-radius control when a single
shared runtime or data plane would make incidents too broad.
## Context
Use this pattern when tenants can be grouped into cells, shards, or
regional service instances with independent capacity, deployment, and
failure boundaries.
## Forces
- Cells reduce blast radius and scaling contention.
- Routing and tenant placement become platform responsibilities.
- Identity, policy, and audit must work across cells.
- Cross-cell operations need strict control and observability.
## Solution
Assign tenants to isolated cells that contain runtime, data, or service
subsystems. Keep global control-plane operations minimal and require
tenant-to-cell mapping in deployment, routing, policy, and audit.
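
A minimal sketch of explicit tenant-to-cell placement follows. The placement table and cell names are hypothetical, and the `cross_cell_grant` flag stands in for whatever authorization decision the platform actually requires.

```python
# Hypothetical placement table; in practice this would live in a platform registry.
PLACEMENT = {"tenant-a": "cell-eu-1", "tenant-b": "cell-eu-2"}


def cell_for(tenant: str) -> str:
    """Resolve a tenant's home cell; unplaced tenants fail rather than default."""
    try:
        return PLACEMENT[tenant]
    except KeyError:
        raise LookupError(f"no cell placement for tenant {tenant!r}") from None


def route(tenant: str, target_cell: str, cross_cell_grant: bool = False) -> str:
    """Allow an operation in the home cell, or elsewhere only with an explicit grant."""
    home = cell_for(tenant)
    if target_cell != home and not cross_cell_grant:
        raise PermissionError("cross-cell operation requires explicit authorization")
    return target_cell
```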
## Verification
- Tenant-to-cell placement is explicit and auditable.
- Failure in one cell does not grant access to another cell.
- Deployments can be rolled out cell by cell.
- Cross-cell administrative actions are explicitly authorized.
## Related Patterns
- Tenant Isolation.
- Tenant Context Propagation.
- Central Audit Ledger.
- Kill Switch / Tenant Freeze.

@@ -0,0 +1,78 @@
# Pattern: Central Audit Ledger
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform, State Hub
## Problem
Security-relevant events lose accountability when identity, policy,
OpenBao, Kubernetes, deployment, and workload logs remain disconnected.
## Context
Use this pattern when platform actions, tenant actions, agent actions,
policy decisions, secret access, deployments, and data access must be
correlated for operations, incident response, and customer trust.
## Forces
- Logs are high volume, but audit events must be durable and searchable.
- Tenants may need partial visibility without seeing platform secrets.
- Agents and humans need distinct attribution.
- Correlation ids must cross system boundaries.
## Solution
Define a central security event taxonomy and durable audit ledger for
security-sensitive actions. Every protected system emits events with
actor, tenant, resource, action, decision, correlation id, and source.
## Implementation Sketch
1. Define security event classes and required fields.
2. Emit events from key-cape, flex-auth, Topaz, OpenBao, Kubernetes,
artifact-store, ops-bridge, and workloads.
3. Preserve correlation ids across request, decision, secret, and data
paths.
4. Protect ledger retention, access, and integrity.
5. Add tenant-visible projections where appropriate.
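
The required-field, correlation, and projection steps above might look like this in miniature. The field list follows the Solution section; `source_detail` and the function names are assumptions, and a real ledger would add durable storage and integrity protection.

```python
REQUIRED_FIELDS = ("actor", "tenant", "resource", "action",
                   "decision", "correlation_id", "source")
PLATFORM_ONLY_FIELDS = ("source_detail",)  # hypothetical field hidden from tenants


def make_event(**fields) -> dict:
    """Build an audit event, rejecting it if any required field is missing or empty."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError(f"audit event missing required fields: {missing}")
    return dict(fields)


def trace(ledger: list, correlation_id: str) -> list:
    """Return every event that shares one correlation id, in ledger order."""
    return [e for e in ledger if e["correlation_id"] == correlation_id]


def tenant_view(ledger: list, tenant: str) -> list:
    """Project the ledger for one tenant, dropping platform-only fields."""
    return [
        {k: v for k, v in e.items() if k not in PLATFORM_ONLY_FIELDS}
        for e in ledger
        if e["tenant"] == tenant
    ]
```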
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Logs exist but cannot answer who did what | require actor/resource/action fields |
| Tenant-visible logs expose platform internals | define projection and redaction rules |
| Agent events hide behind human account | require explicit agent identity |
| Audit sink outage loses privileged events | fail closed for privileged paths or buffer under policy |
## Related Capabilities
- Observability, detection, and audit.
- Incident response and recovery.
- Authorization and access control.
- Agent access control.
## Maturity
Seed. The need is clear, but storage, retention, projection, and State
Hub integration decisions remain open.
## Verification
- Critical systems emit events with required fields.
- A single correlation id links identity, policy, secret, and workload
events.
- Ledger access is protected and audited.
- Tenant-visible views contain only tenant-appropriate records.
## Research Basis
Seeded by security logging, central log collection, audit trail,
tenant-visible audit logs, and security event taxonomy.
## References
- Initial exploration: Observability, detection, and audit.
- Initial exploration: Detection and response patterns.

@@ -0,0 +1,47 @@
# Pattern: Central Identity Provider
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, key-cape, Keycloak
Genesis family: Identity and access
## Problem
Services drift into local user stores and inconsistent login behavior
when there is no shared identity source.
## Context
Use this pattern for platform services, admin tools, product
applications, and operational interfaces that need interactive user
login or identity claims.
## Forces
- Product teams need a simple integration point.
- Platform operators need lifecycle, MFA, and audit consistency.
- Lightweight local identity and expanded Keycloak deployments need to
share a stable profile contract.
- Tenant users and platform operators must remain distinguishable.
## Solution
Route interactive authentication through a central IdP and expose a
stable NetKingdom IAM Profile to consumers. The implementation may be
lightweight key-cape mode or expanded Keycloak mode, but applications
consume the same issuer, audience, subject, tenant, role, and assurance
shape.
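
To make the shared profile shape concrete, here is a hedged sketch of the claim checks a consumer might run after signature verification. The `iss`, `aud`, and `sub` names are standard JWT claims; the `tenant`, `roles`, and `assurance` claim names, the issuer URL, and the audience value are assumptions, since the IAM Profile claim shape is not specified here.

```python
TRUSTED_ISSUERS = {"https://idp.example.test"}  # hypothetical issuer
EXPECTED_AUDIENCE = "artifact-store"            # assumed audience value
PROFILE_CLAIMS = ("sub", "tenant", "roles", "assurance")  # assumed claim names


def validate_profile(claims: dict) -> dict:
    """Check already-verified token claims against the expected profile shape.

    Signature and expiry verification are assumed to have happened upstream.
    """
    if claims.get("iss") not in TRUSTED_ISSUERS:
        raise PermissionError("unknown issuer")
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if EXPECTED_AUDIENCE not in audiences:
        raise PermissionError("wrong audience")
    for name in PROFILE_CLAIMS:
        if name not in claims:
            raise PermissionError(f"missing claim: {name}")
    return claims
```

Because the checks depend only on the claim shape, the same function would apply whether the token came from key-cape mode or Keycloak mode.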
## Verification
- Applications reject tokens from unknown issuers or audiences.
- Privileged flows require MFA or equivalent assurance evidence.
- User disablement removes access across integrated services.
- Audit events identify user, tenant, issuer, and client.
## Related Patterns
- Identity Broker.
- Tenant Membership Boundary.
- Human/Agent Identity Split.
- Delegated Authorization.

@@ -0,0 +1,43 @@
# Pattern: Certificate Automation
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, NetKingdom
Genesis family: Secrets and cryptography
## Problem
Manual certificate issuance and renewal create outages, weak defaults,
and stale trust anchors.
## Context
Use this pattern for ingress TLS, internal service TLS, mTLS, workload
certificates, admin endpoints, and platform APIs.
## Forces
- Certificates need automated issuance and renewal.
- Trust roots must be owned and rotated.
- Internal and external certificate policies differ.
- Expiry must be observable before outage.
## Solution
Automate certificate issuance, renewal, monitoring, and rotation through
approved issuers and scoped identities. Treat certificate authority
material as platform-root secret material.
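
The renewal and monitoring parts of this solution reduce to two date comparisons. The sketch below assumes renewal once the final third of the certificate lifetime begins and a 14-day alert window; both thresholds are illustrative choices, not values taken from this pattern.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(not_before: datetime, not_after: datetime, now: datetime) -> bool:
    """Renew once the remaining validity drops to one third of the lifetime."""
    lifetime = not_after - not_before
    return now >= not_after - lifetime / 3


def expiring_soon(certs: dict, now: datetime,
                  warn: timedelta = timedelta(days=14)) -> list:
    """Return names of certificates whose expiry falls inside the alert window.

    `certs` maps a certificate name to its not-after timestamp.
    """
    return sorted(name for name, not_after in certs.items()
                  if not_after - now <= warn)
```

Monitoring should fire independently of the renewal job, so a stuck issuer is noticed before certificates lapse.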
## Verification
- Certificates renew before expiry.
- Issuers and trust roots are documented and protected.
- Expiry monitoring alerts before service impact.
- mTLS or internal TLS policies are tested where required.
## Related Patterns
- Workload Identity.
- Secret Zero Avoidance.
- Secure Cluster Baseline.
- Network Default Deny.

@@ -0,0 +1,45 @@
# Pattern: Cluster-per-Tenant
Status: seed
Readiness target: RL4 regulated production
Primary owners: Railiance platform
Genesis family: Tenant isolation
## Problem
Some tenants require stronger runtime and control-plane separation than
namespace isolation can provide.
## Context
Use this pattern for regulated customers, high-trust deployments,
dedicated environments, high-risk workloads, or tenants with contractual
isolation requirements.
## Forces
- Separate clusters increase isolation and blast-radius control.
- More clusters increase operational complexity and cost.
- Shared platform services still need consistent identity, policy,
secrets, and audit contracts.
- Tenant lifecycle and upgrades must remain manageable.
## Solution
Allocate a dedicated Kubernetes cluster or equivalent control boundary
per tenant while preserving shared NetKingdom identity, authorization,
secret, deployment, and audit contracts.
## Verification
- Tenant workloads cannot share Kubernetes control-plane authority.
- Cluster credentials, secrets, and audit sinks are tenant scoped.
- Shared platform integrations preserve tenant identity and ownership.
- Restore and upgrade procedures are tested per tenant cluster class.
## Related Patterns
- Tenant Isolation.
- Shared Control Plane, Isolated Data Plane.
- Central Audit Ledger.
- Secure Cluster Baseline.

@@ -0,0 +1,78 @@
# Pattern: Delegated Authorization
Status: reviewed
Readiness target: RL3 production
Primary owners: flex-auth, NetKingdom
## Problem
Identity providers and application code should not become the scattered
home for every tenant, resource, and object-level authorization rule.
## Context
Use this pattern for protected systems that need consistent decisions
for tenant-scoped resources, privileged operations, object storage,
agent access, and application APIs.
## Forces
- Applications need local enforcement, but policy needs central shape.
- Tenant, resource, action, assurance, and context must travel together.
- Some decisions can be delegated to PDP runtimes such as Topaz.
- Deny reasons and obligations need to be auditable.
## Solution
Use flex-auth as the canonical authorization boundary. Callers submit a
standard decision request; flex-auth evaluates directly or delegates to
Topaz; applications enforce the returned allow/deny, obligations, and
audit metadata at the boundary.
## Implementation Sketch
1. Register protected systems and resource/action vocabulary.
2. Define the decision envelope and CARING descriptors.
3. Add policy packages with tenant/platform separation.
4. Delegate to Topaz where ReBAC or policy runtime support is useful.
5. Return stable allow/deny, reason, obligation, and audit fields.
6. Require applications to enforce decisions before resource access.
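
The decision envelope and fail-closed enforcement can be sketched as below. This is not the flex-auth API: the `DecisionRequest` and `Decision` shapes, the reason codes, and the `pdp` callable are assumptions built from the field names used in this pattern.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class DecisionRequest:
    actor: str
    tenant: Optional[str]
    resource: str
    action: str
    context: dict = field(default_factory=dict)


@dataclass(frozen=True)
class Decision:
    allow: bool
    reason: str              # stable reason code
    obligations: tuple = ()
    audit_id: str = ""


def enforce(request: DecisionRequest,
            pdp: Callable[[DecisionRequest], Decision]) -> Decision:
    """Ask the policy decision point and fail closed on any gap."""
    if not request.tenant:
        # A decision context without a tenant is denied before any policy runs.
        return Decision(False, "tenant_missing")
    try:
        return pdp(request)
    except Exception:
        # A PDP outage must never turn into an implicit allow.
        return Decision(False, "pdp_unavailable")
```

The enforcement point would still need to apply any returned obligations before touching the resource; that step is omitted here.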
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| App ignores deny obligations | add conformance tests at enforcement points |
| Policies mix platform and tenant authority | separate policy packages and review paths |
| Decision context omits tenant | fail closed |
| PDP outage becomes implicit allow | fail closed except documented emergency flows |
## Related Capabilities
- Authorization and access control.
- Tenant isolation.
- Application and API security.
- Observability, detection, and audit.
## Maturity
Reviewed. This is a core NetKingdom boundary and should become
canonical once flex-auth conformance fixtures are stable.
## Verification
- Decision envelopes include actor, tenant, resource, action, context,
obligations, reason, and audit id.
- Enforcement points deny when flex-auth denies or is unavailable.
- Topaz delegation is visible in decision records.
- Tenant and platform policy packages are separated.
## Research Basis
Seeded by the policy decision point/enforcement point pattern,
tenant-scoped authorization, API authorization, and CARING modeling notes.
## References
- NetKingdom platform identity/security architecture.
- Initial exploration: Authorization and access control.

@@ -0,0 +1,42 @@
# Pattern: Dependency Update Bot
Status: seed
Readiness target: RL2 private beta
Primary owners: product repos
Genesis family: Supply chain
## Problem
Dependency updates become stale, risky, and manual when there is no
repeatable intake and test path.
## Context
Use this pattern for application dependencies, container base images,
GitHub Actions, Helm charts, Terraform providers, and platform tools.
## Forces
- Automated updates reduce known-vulnerability exposure.
- Update noise can overwhelm reviewers.
- Security updates need prioritization.
- Tests must catch compatibility breakage.
## Solution
Use automated dependency update pull requests with grouping rules,
security prioritization, test gates, review ownership, and release notes.
## Verification
- Dependency inventory is covered by update automation.
- Security updates are surfaced with priority.
- Update PRs run relevant tests.
- Deferred updates have owner and reason.
## Related Patterns
- Protected Main Branch.
- SBOM-per-Release.
- Quarantined Build Runner.
- Supply-Chain Provenance.

@@ -0,0 +1,76 @@
# Pattern: Dynamic Secrets
Status: draft
Readiness target: RL3 production
Primary owners: Railiance platform, OpenBao
## Problem
Static service credentials accumulate, drift from ownership, and remain
useful after compromise.
## Context
Use this pattern for databases, object stores, message brokers, internal
APIs, and operator workflows where credentials can be issued with a
lease and revoked after use.
## Forces
- Consumers need credentials on demand.
- Backends vary in their ability to mint short-lived credentials.
- Lease and revocation behavior must be observable.
- Application teams need stable integration contracts even when backend
credential mechanisms differ.
## Solution
Use OpenBao or a credential broker to issue scoped credentials with TTL,
lease metadata, renewal rules, and revocation. Keep parent credentials
inside the platform secret authority.
## Implementation Sketch
1. Define a protected system and role for each dynamic credential type.
2. Authenticate the caller with workload or human identity.
3. Authorize requested scope and TTL through policy.
4. Generate backend-native credentials or brokered session material.
5. Record lease id, caller, tenant, backend, and expiry.
6. Revoke credentials on expiry, deployment teardown, or incident.
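
A toy broker makes the lease lifecycle above easier to see. It does not call OpenBao; `Broker`, `Lease`, and the one-hour `MAX_TTL` are assumptions, and real revocation would also have to invalidate the credential at the backend.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_TTL = timedelta(hours=1)  # assumed platform-wide cap on requested TTLs


@dataclass
class Lease:
    lease_id: str
    caller: str
    tenant: str
    backend: str
    expires_at: datetime
    revoked: bool = False

    def is_valid(self, now: datetime) -> bool:
        return not self.revoked and now < self.expires_at


class Broker:
    """Issues leases with capped TTLs and records who holds them."""

    def __init__(self):
        self.leases = {}

    def issue(self, caller: str, tenant: str, backend: str,
              ttl: timedelta, now: datetime) -> Lease:
        ttl = min(ttl, MAX_TTL)  # requests above the cap are shortened
        lease = Lease(secrets.token_hex(8), caller, tenant, backend, now + ttl)
        self.leases[lease.lease_id] = lease
        return lease

    def revoke(self, lease_id: str) -> None:
        self.leases[lease_id].revoked = True
```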
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Backend does not support dynamic users | use brokered credentials or a shorter-lived static bridge with an explicit exception |
| Lease renewal hides stale consumers | cap max TTL and require owner metadata |
| Parent credential exposed to apps | keep parent material only in OpenBao or broker config |
| Revocation is untested | include revocation drills in readiness gates |
## Related Capabilities
- Secrets, keys, and credentials.
- Authorization and access control.
- Observability, detection, and audit.
## Maturity
Draft. The OpenBao direction is established, but each backend needs a
verified lease and revocation story.
## Verification
- Issued credentials have owner, scope, TTL, and lease metadata.
- Revocation invalidates access at the backend.
- Expired credentials are rejected.
- Audit records link issuance and revocation to actor and tenant.
## Research Basis
Seeded by central secrets management, workload secret injection, secret
rotation, short-lived credentials, and OpenBao runtime authority.
## References
- Initial exploration: Secrets, keys, and credentials.
- Railiance OpenBao platform secrets service.

@@ -0,0 +1,44 @@
# Pattern: External Secrets Operator
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, OpenBao
Genesis family: Secrets and cryptography
## Problem
Kubernetes applications need secrets without making Kubernetes itself
the long-term source of truth for secret material.
## Context
Use this pattern when workloads consume secrets from OpenBao or another
external secret manager through Kubernetes-native references.
## Forces
- Workloads often expect Kubernetes Secrets.
- Secret source of truth should remain in OpenBao.
- Sync creates copies that need scope, ownership, and rotation.
- Tenant and namespace boundaries must be respected.
## Solution
Use an external secrets controller to reconcile secret references from
OpenBao into scoped Kubernetes Secrets, with explicit ownership, refresh
intervals, RBAC, namespace boundaries, and audit.
## Verification
- Kubernetes Secrets are derived from OpenBao references, not committed
plaintext.
- Sync permissions are namespace and tenant scoped.
- Rotation in OpenBao reaches consumers within the expected interval.
- Sync failures are visible and fail safe.
## Related Patterns
- Workload Identity.
- Dynamic Secrets.
- Secret Zero Avoidance.
- Namespace-per-Tenant.

@@ -0,0 +1,44 @@
# Pattern: GitOps with Guardrails
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, product repos
Genesis family: Kubernetes and platform
## Problem
GitOps can make operations reproducible while still deploying unsafe
state if review, policy, secrets, and provenance controls are weak.
## Context
Use this pattern for platform and product deployment repositories,
environment promotion, configuration changes, and operational rollbacks.
## Forces
- Desired state should be reviewable and auditable.
- Secrets must not be exposed in Git.
- Policy checks need to run before reconciliation.
- Emergency changes need traceability.
## Solution
Use Git as the reviewed desired-state source while enforcing branch
protection, policy-as-code checks, encrypted secret references, signed
artifact admission, and clear rollback procedures.
## Verification
- Production changes enter through reviewed commits or documented
emergency paths.
- Reconciliation rejects policy failures.
- Secret plaintext is absent from Git.
- Rollbacks preserve audit and policy evidence.
## Related Patterns
- Protected Main Branch.
- Policy-as-Code Admission Control.
- Sealed Secret / Encrypted Git Secret.
- Signed Image Admission.

@@ -0,0 +1,75 @@
# Pattern: Human/Agent Identity Split
Status: draft
Readiness target: RL3 production
Primary owners: NetKingdom, ops-bridge, product repos
## Problem
Agents acting as invisible extensions of human users make access scope,
accountability, rate limits, and incident response ambiguous.
## Context
Use this pattern for AI agents, automation workers, repository agents,
ops agents, scheduled tasks, and delegated user workflows.
## Forces
- Agents need to act on behalf of people or systems.
- Human approval does not mean unlimited agent authority.
- Audit must distinguish sponsor, agent, tool, and target action.
- Agents may need tighter scopes and shorter TTLs than humans.
## Solution
Give agents explicit identities with their own scopes, limits,
credentials, and audit records. Link agent activity to a human or system
sponsor without collapsing them into the same principal.
## Implementation Sketch
1. Define agent identity type in IAM Profile or equivalent registry.
2. Bind agent to sponsor, purpose, tenant, allowed tools, and TTL.
3. Issue scoped credentials or certificates for agent activity.
4. Require flex-auth to evaluate agent context separately.
5. Emit audit events with both sponsor and agent ids.
6. Support revocation by agent, sponsor, tenant, and task.
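
The binding and audit steps above can be illustrated with a small record that keeps sponsor and agent as separate principals. The `AgentBinding` fields and the event keys are assumptions; the actual IAM Profile claim shape for agents is still open according to this pattern.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AgentBinding:
    agent_id: str
    sponsor_id: str          # human or system the agent acts for
    tenant: str
    purpose: str
    allowed_tools: frozenset
    expires_at: datetime


def authorize_tool(binding: AgentBinding, tool: str, now: datetime,
                   revoked_agents: frozenset = frozenset()) -> dict:
    """Decide one tool call and return an audit event naming agent and sponsor."""
    allowed = (
        binding.agent_id not in revoked_agents
        and now < binding.expires_at
        and tool in binding.allowed_tools
    )
    return {
        "actor_type": "agent",
        "agent_id": binding.agent_id,
        "sponsor_id": binding.sponsor_id,
        "tenant": binding.tenant,
        "tool": tool,
        "decision": "allow" if allowed else "deny",
    }
```

Revoking `agent_id` leaves the sponsor's own access untouched, which is the property the Verification section asks for.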
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Agent uses human token directly | require separate agent credentials |
| Audit only records sponsor | include agent id and tool/action metadata |
| Agent keeps broad long-lived access | enforce TTL and purpose-bound scopes |
| Tenant cannot revoke delegated agent | support tenant-scoped revocation controls |
## Related Capabilities
- Agent access control.
- Identity and user management.
- Authorization and access control.
- Observability, detection, and audit.
## Maturity
Draft. The need is explicit in the platform direction; the detailed IAM
Profile claim shape and ops integration are still open.
## Verification
- Agent events are distinguishable from human events.
- Revoking the agent does not require disabling the sponsor.
- flex-auth decisions include agent context.
- Agent credentials have explicit scope and TTL.
## Research Basis
Seeded by agent access control, human/agent identity split,
time-boxed privilege elevation, and auditability requirements.
## References
- Initial exploration: Authorization and access control.
- Initial exploration: Identity and access patterns.

@@ -0,0 +1,43 @@
# Pattern: Idempotent Command API
Status: seed
Readiness target: RL2 private beta
Primary owners: product repos
Genesis family: Application/API security
## Problem
Retries, duplicate submissions, and partial failures can create unsafe
state changes when command APIs are not idempotent.
## Context
Use this pattern for payment-like operations, provisioning, tenant
configuration, file processing, job submission, and external callbacks.
## Forces
- Networks and clients retry requests.
- Commands need audit and correlation.
- Duplicate execution can create data or authorization errors.
- Some commands must be scoped to actor and tenant.
## Solution
Require command identifiers and replay-safe semantics for state-changing
operations. Bind idempotency keys to actor, tenant, command type, and
resource scope.
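
A minimal in-memory sketch of scope-bound idempotency keys follows. `CommandLog` and its method are hypothetical names; a production version would need durable storage and handling for commands that fail partway, neither of which is shown.

```python
class CommandLog:
    """Remembers command results and binds each command id to its original scope."""

    def __init__(self):
        self._seen = {}  # command_id -> (scope, result)

    def execute(self, command_id, actor, tenant, command_type, resource, handler):
        scope = (actor, tenant, command_type, resource)
        if command_id in self._seen:
            seen_scope, result = self._seen[command_id]
            if seen_scope != scope:
                # The same key from another actor, tenant, or resource is rejected
                # rather than answered with someone else's result.
                raise PermissionError("idempotency key reused outside its original scope")
            return result  # replay: return the prior result, run nothing
        result = handler()
        self._seen[command_id] = (scope, result)
        return result
```

Replaying a command id inside its scope returns the stored result without running the handler again.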
## Verification
- Replaying a command id returns the prior result or safe status.
- Idempotency keys cannot be reused across tenants or actors.
- Command audit records include correlation id and outcome.
- Partial failures are recoverable without duplicate effects.
## Related Patterns
- Tenant Context Propagation.
- Central Audit Ledger.
- Schema-First API Security.
- Incident Runbook Library.

@@ -0,0 +1,47 @@
# Pattern: Identity Broker
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, key-cape, Keycloak
Genesis family: Identity and access
## Problem
External, customer, or ecosystem identities cannot be trusted directly
by every product service without duplicating federation and mapping
logic.
## Context
Use this pattern when tenants bring their own IdP, when operators need
multiple upstream identity sources, or when the platform must normalize
federated identity into the IAM Profile.
## Forces
- Upstream IdPs have different claim shapes and assurance semantics.
- Tenant membership is not the same as global user identity.
- Product applications need stable claims.
- Federation errors can create cross-tenant or privilege confusion.
## Solution
Broker external identity through a controlled IAM layer that validates
upstream issuer trust, maps claims into the NetKingdom IAM Profile, and
records federation source, tenant membership, assurance, and lifecycle
state.
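
The brokering flow might be sketched as follows. The trust table, issuer URLs, output claim names, and reason codes are all hypothetical; `amr` is the standard OIDC authentication-methods claim, used here only as one possible source of MFA evidence.

```python
# Hypothetical trust metadata: upstream issuer -> tenant and claim mapping.
UPSTREAM_TRUST = {
    "https://login.customer-a.example": {"tenant": "customer-a", "subject_claim": "sub"},
}
PLATFORM_ISSUER = "https://iam.netkingdom.example"  # hypothetical issuer URL


def broker(upstream: dict, memberships: set) -> dict:
    """Map upstream claims into a normalized profile, failing closed on any gap.

    `memberships` is a set of (tenant, subject) pairs held by the platform.
    """
    trust = UPSTREAM_TRUST.get(upstream.get("iss"))
    if trust is None:
        raise PermissionError("untrusted_upstream_issuer")
    subject = upstream.get(trust["subject_claim"])
    if not subject:
        raise PermissionError("missing_subject")
    # Tenant comes from trust metadata and is confirmed against platform
    # membership; any tenant value in the raw upstream claims is ignored.
    tenant = trust["tenant"]
    if (tenant, subject) not in memberships:
        raise PermissionError("no_tenant_membership")
    assurance = "mfa" if "mfa" in upstream.get("amr", []) else "single_factor"
    return {
        "iss": PLATFORM_ISSUER,
        "sub": f"{tenant}:{subject}",
        "tenant": tenant,
        "assurance": assurance,
        "federation_source": upstream["iss"],
    }
```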
## Verification
- Each upstream IdP has explicit trust metadata and claim mappings.
- Tenant membership is resolved after federation, not assumed from raw
upstream claims.
- Assurance and MFA evidence are normalized for privileged flows.
- Federation failures fail closed with auditable reason codes.
## Related Patterns
- Central Identity Provider.
- Tenant Membership Boundary.
- Role Composition.
- Object-Level Authorization Check.

@@ -0,0 +1,44 @@
# Pattern: Incident Runbook Library
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform, product repos
Genesis family: Detection and response
## Problem
Incident response becomes slow and inconsistent when teams rely on
memory or ad hoc decisions during security events.
## Context
Use this pattern for credential compromise, tenant isolation incidents,
malicious uploads, policy bypass, OpenBao recovery, build compromise,
and platform outage scenarios.
## Forces
- Incidents need quick containment.
- Actions must be safe, authorized, and reversible where possible.
- Evidence preservation matters.
- Tenant communication and post-incident learning need structure.
## Solution
Maintain a reviewed library of incident runbooks with triggers,
severity, roles, containment steps, evidence handling, tenant
communication, recovery, and post-incident follow-up.
## Verification
- High-value scenarios have runbooks with owners.
- Runbooks are tested through drills or tabletop exercises.
- Containment actions link to audit and decision records.
- Post-incident reviews create tracked follow-up work.
## Related Patterns
- Break-glass Access.
- Kill Switch / Tenant Freeze.
- Token Revocation Sweep.
- Central Audit Ledger.

@@ -0,0 +1,43 @@
# Pattern: Key-per-Tenant
Status: seed
Readiness target: RL4 regulated production
Primary owners: NetKingdom, Railiance platform, product repos
Genesis family: Secrets and cryptography
## Problem
Shared encryption keys make tenant data separation and incident
containment weaker than the tenant model may require.
## Context
Use this pattern for sensitive tenant data, regulated tenants, object
storage, databases, backups, and export/deletion workflows.
## Forces
- Per-tenant keys strengthen isolation and revocation.
- Key lifecycle is operationally complex.
- Applications need safe key selection without tenant spoofing.
- Backups and derived data need the same key boundary.
## Solution
Assign tenant-specific encryption keys or key hierarchy roots where risk
requires it. Bind key use to trusted tenant context, policy, audit, and
rotation procedures.
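A minimal sketch of tenant-bound key selection, assuming a key hierarchy in which per-tenant keys are derived from a root key. The HMAC-based derivation, the label format, and all names are illustrative assumptions; a production setup would more likely hold the keys in a KMS or in OpenBao rather than derive them in application code.

```python
import hashlib
import hmac


class CrossTenantKeyUse(Exception):
    """Raised when a caller asks for a key outside its trusted tenant context."""


def derive_tenant_key(root_key: bytes, tenant_id: str, version: int = 1) -> bytes:
    # Simplified derivation: HMAC-SHA256 over a tenant and version label.
    label = f"tenant-key:{tenant_id}:v{version}".encode("utf-8")
    return hmac.new(root_key, label, hashlib.sha256).digest()


def key_for_resource(
    root_key: bytes,
    trusted_tenant_id: str,
    resource_tenant_id: str,
    version: int = 1,
) -> bytes:
    # trusted_tenant_id must come from validated identity evidence,
    # never from a request parameter the caller controls.
    if trusted_tenant_id != resource_tenant_id:
        raise CrossTenantKeyUse(
            f"tenant {trusted_tenant_id} may not use keys of {resource_tenant_id}"
        )
    return derive_tenant_key(root_key, trusted_tenant_id, version)
```

Carrying a key version in the label gives rotation a natural hook: new data uses the next version while old versions remain derivable for decryption until re-encryption completes.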
## Verification
- Tenant data is encrypted with the correct tenant key or key hierarchy.
- Cross-tenant key use is denied.
- Rotation and revocation are tested.
- Backups and exports preserve tenant key boundaries.
## Related Patterns
- Tenant Data Partitioning.
- Tenant Context Propagation.
- OpenBao runtime secret authority.
- Central Audit Ledger.

@@ -0,0 +1,43 @@
# Pattern: Kill Switch / Tenant Freeze
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, product repos, Railiance platform
Genesis family: Detection and response
## Problem
Operators need a fast containment mechanism when a tenant, app, token,
or integration appears compromised or abusive.
## Context
Use this pattern for tenant compromise, runaway automation, abusive API
usage, malicious uploads, credential exposure, or legal/security holds.
## Forces
- Containment must be fast.
- Freezing a tenant can be high impact.
- Actions need scope, reason, approval, and audit.
- Recovery must be predictable and reversible where possible.
## Solution
Provide controlled kill-switch or tenant-freeze states that can block
login, API access, background jobs, credential vending, uploads, or data
mutation according to scoped policy.
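A minimal in-memory sketch of scoped freeze state, assuming each enforcement point (login, API, jobs, and so on) consults a shared registry before acting. The surface names, the `FreezeRegistry` class, and the audit tuple layout are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical names for the surfaces a freeze can block.
SURFACES = frozenset(
    {"login", "api", "jobs", "credential_vending", "uploads", "mutation"}
)


@dataclass(frozen=True)
class Freeze:
    tenant_id: str
    surfaces: frozenset
    reason: str
    approved_by: str


class FreezeRegistry:
    def __init__(self):
        self._active = {}
        self.audit = []

    def freeze(self, tenant_id, surfaces, reason, approved_by):
        surfaces = frozenset(surfaces)
        if not surfaces or not surfaces <= SURFACES:
            raise ValueError("unknown or empty freeze scope")
        if not reason or not approved_by:
            raise ValueError("freeze requires a reason and an approver")
        self._active[tenant_id] = Freeze(tenant_id, surfaces, reason, approved_by)
        self.audit.append(("freeze", tenant_id, sorted(surfaces), reason, approved_by))

    def is_blocked(self, tenant_id, surface):
        freeze = self._active.get(tenant_id)
        return freeze is not None and surface in freeze.surfaces

    def unfreeze(self, tenant_id, reviewed_by):
        if not reviewed_by:
            raise ValueError("unfreeze requires an authorized reviewer")
        if self._active.pop(tenant_id, None) is not None:
            self.audit.append(("unfreeze", tenant_id, reviewed_by))
```

An enforcement point would call `is_blocked(tenant_id, "uploads")` before accepting an upload, so a single freeze record takes effect across every surface in its scope.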
## Verification
- Freeze actions take effect across API, jobs, credentials, and uploads.
- Scope and reason are recorded.
- Tenant-visible communication rules are defined.
- Unfreeze requires authorized review and audit.
## Related Patterns
- Tenant Isolation.
- Token Revocation Sweep.
- Incident Runbook Library.
- Central Audit Ledger.

@@ -0,0 +1,44 @@
# Pattern: Namespace-per-Tenant
Status: seed
Readiness target: RL2 private beta
Primary owners: Railiance platform, product repos
Genesis family: Tenant isolation
## Problem
Shared Kubernetes clusters need tenant boundaries without the cost and
operational overhead of one cluster per tenant.
## Context
Use this pattern for medium-strength tenant isolation where workloads can
share a cluster but need separate Kubernetes namespaces, resource
quotas, network policies, and access controls.
## Forces
- Shared clusters reduce platform cost.
- Namespaces are not a hard security boundary by themselves.
- Tenant workloads need quotas, labels, policies, and ownership.
- Platform controllers can accidentally gain cross-tenant reach.
## Solution
Assign each tenant one or more namespaces with mandatory labels,
resource quotas, network default deny, RBAC boundaries, admission
policies, and tenant-aware audit events.
## Verification
- Tenant service accounts cannot access other tenant namespaces.
- Network policies block cross-namespace traffic unless allowed.
- Quotas and Pod Security profiles apply to every tenant namespace.
- Audit records include namespace and tenant id.
## Related Patterns
- Tenant Isolation.
- Network Default Deny.
- Pod Security Baseline/Restricted.
- Tenant Context Propagation.

@@ -0,0 +1,76 @@
# Pattern: Network Default Deny
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, product repos
## Problem
Flat internal networks allow accidental exposure and lateral movement
when one workload, namespace, or tenant is compromised.
## Context
Use this pattern for Kubernetes namespaces, tenant workloads, platform
services, ingress paths, egress control, admin surfaces, and
service-to-service communication.
## Forces
- Services need explicit communication paths.
- Product teams need a manageable way to declare network intent.
- Some platform services must be reachable across tenants or namespaces.
- Debugging becomes harder when default connectivity disappears.
## Solution
Deny network traffic by default and allow only explicit, reviewed paths
between workloads, namespaces, platform services, ingress, and egress
destinations.
## Implementation Sketch
1. Apply namespace-level default deny policies.
2. Define service-specific ingress and egress policies.
3. Separate tenant, platform, admin, and observability networks.
4. Route public traffic through managed ingress.
5. Log or sample denied flows where practical.
6. Provide policy templates for product teams.
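Steps 1 and 2 above can be sketched as small template functions that emit Kubernetes `NetworkPolicy` manifests as plain dictionaries. The policy names and the helper functions are hypothetical, and the DNS rule assumes cluster DNS pods run in `kube-system` with the conventional `k8s-app: kube-dns` label, which should be checked against the actual cluster.

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # empty selector matches all pods
            "policyTypes": ["Ingress", "Egress"],
        },
    }


def allow_dns_egress_policy(namespace: str) -> dict:
    """Platform allow template so default deny does not break DNS silently."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-dns-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
            "egress": [
                {
                    "to": [
                        {
                            "namespaceSelector": {
                                "matchLabels": {
                                    "kubernetes.io/metadata.name": "kube-system"
                                }
                            },
                            # Assumed label for the cluster DNS pods.
                            "podSelector": {"matchLabels": {"k8s-app": "kube-dns"}},
                        }
                    ],
                    "ports": [
                        {"protocol": "UDP", "port": 53},
                        {"protocol": "TCP", "port": 53},
                    ],
                }
            ],
        },
    }


def render(policy: dict) -> str:
    # JSON is accepted wherever Kubernetes accepts YAML manifests.
    return json.dumps(policy, indent=2)
```

Calling `render(default_deny_policy("tenant-a"))` produces a manifest that a namespace bootstrap job could apply before any workload is scheduled.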
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Broad allow rules recreate flat network | review wildcard selectors and CIDRs |
| DNS or observability breaks silently | maintain platform allow templates |
| Admin tools exposed like public apps | separate admin surfaces and access paths |
| Teams bypass policy through host networking | enforce pod security and admission rules |
## Related Capabilities
- Network and edge security.
- Tenant isolation.
- Platform and Kubernetes hardening.
- Observability, detection, and audit.
## Maturity
Seed. The pattern is a production baseline candidate; implementation
needs Railiance network policy conventions.
## Verification
- New namespaces start with deny-all ingress and egress.
- Required service paths have explicit policies.
- Cross-tenant connectivity is denied by default.
- Admin surfaces are not reachable through public workload paths.
## Research Basis
Seeded by network segmentation, egress control, service-to-service
trust, network default deny, and Kubernetes hardening requirements.
## References
- Initial exploration: Network and edge security.
- Initial exploration: Kubernetes and platform patterns.

@@ -0,0 +1,76 @@
# Pattern: Object-Level Authorization Check
Status: draft
Readiness target: RL3 production
Primary owners: flex-auth, product repos, NetKingdom
## Problem
APIs often authenticate callers correctly while still allowing access to
objects, records, files, or tenant resources outside the caller's scope.
## Context
Use this pattern for product APIs, admin APIs, object storage brokers,
artifact-store, tenant data, background jobs, and any endpoint that
accepts resource identifiers.
## Forces
- Object ownership and scope are application-specific.
- Authorization must happen before data is returned or mutated.
- Tenant context must be trusted, not copied from user input.
- Bulk, search, and background operations need the same checks.
## Solution
Require every object access path to ask an authorization boundary with
trusted actor, tenant, resource, action, and context before reading,
writing, deleting, exporting, or sharing an object.
## Implementation Sketch
1. Define resource types and action vocabulary.
2. Derive actor and tenant from trusted identity/session evidence.
3. Resolve object ownership or scope before access.
4. Ask flex-auth or local policy adapter for a decision.
5. Enforce allow/deny before data access.
6. Log object-level decisions with correlation ids.
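The steps above can be sketched as one access function that every read or write path goes through. This is a minimal illustration: `local_policy` stands in for a flex-auth call whose real decision contract is not shown here, and the `Session`, `Document`, and audit record shapes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    actor_id: str
    tenant_id: str  # derived from validated identity evidence, not request input


@dataclass(frozen=True)
class Document:
    doc_id: str
    tenant_id: str
    owner_id: str


class AccessDenied(Exception):
    """Single non-revealing error for both missing and forbidden objects."""


def local_policy(session: Session, doc: Document, action: str):
    """Stand-in policy adapter returning (allowed, reason_code)."""
    if doc.tenant_id != session.tenant_id:
        return False, "tenant_mismatch"
    if action in {"write", "delete"} and doc.owner_id != session.actor_id:
        return False, "not_owner"
    return True, "allowed"


def access_document(session, doc_id, action, store, audit, correlation_id):
    # Resolve ownership metadata first, then decide, then release the object.
    doc = store.get(doc_id)
    if doc is None:
        allowed, reason = False, "not_found"
    else:
        allowed, reason = local_policy(session, doc, action)
    audit.append(
        {
            "correlation_id": correlation_id,
            "actor": session.actor_id,
            "tenant": session.tenant_id,
            "resource": doc_id,
            "action": action,
            "allowed": allowed,
            "reason": reason,  # precise reason stays in the audit trail only
        }
    )
    if not allowed:
        raise AccessDenied("not found")
    return doc
```

Returning the same error for a missing object and a forbidden one keeps the deny response from leaking object existence, while the audit record still carries the precise reason.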
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Endpoint checks role but not object ownership | add object-level conformance tests |
| Search/list endpoints bypass item checks | enforce tenant/resource filters in query layer |
| Background jobs run with global authority | carry tenant and actor context in job envelopes |
| Deny reason leaks object existence | use stable, non-revealing deny responses |
## Related Capabilities
- Application and API security.
- Authorization and access control.
- Tenant isolation.
- Data protection and privacy.
## Maturity
Draft. The pattern maps directly to flex-auth but requires product-level
adoption and tests.
## Verification
- Cross-tenant object access attempts are denied in tests.
- List/search endpoints cannot reveal out-of-scope objects.
- Background jobs preserve authorization context.
- Deny paths are audited and do not leak sensitive existence details.
## Research Basis
Seeded by API authorization, object-level authorization, OWASP API
security framing, and tenant-scoped authorization.
## References
- Initial exploration: Application and API security.
- Initial exploration: Application/API patterns.

@@ -0,0 +1,44 @@
# Pattern: Pod Security Baseline/Restricted
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, product repos
Genesis family: Kubernetes and platform
## Problem
Workloads can gain host, network, filesystem, or privilege escalation
capabilities that are unnecessary and dangerous in production.
## Context
Use this pattern for Kubernetes workloads, tenant namespaces, platform
services, controllers, and admission policies.
## Forces
- Some platform components need elevated privileges.
- Most product workloads should run with restricted settings.
- Exceptions must not become broad namespace bypasses.
- Developers need clear guidance for safe pod specs.
## Solution
Apply Kubernetes Pod Security baseline or restricted profiles with
admission enforcement, explicit exceptions, and tests for privileged
features such as host networking, hostPath, root users, and privilege
escalation.
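A minimal sketch of a policy test helper for the privileged features named above. It checks only a small subset of what the Kubernetes restricted profile requires and ignores init and ephemeral containers, so it is an illustration for CI fixtures rather than a replacement for admission enforcement. The function name is hypothetical; the field names follow the Kubernetes pod spec.

```python
def pod_spec_violations(pod_spec: dict) -> list:
    """Return human-readable violations for a pod spec given as a dict."""
    violations = []
    if pod_spec.get("hostNetwork"):
        violations.append("hostNetwork is enabled")
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            violations.append(f"volume {volume.get('name')} uses hostPath")
    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        name = container.get("name")
        ctx = container.get("securityContext", {})
        if ctx.get("privileged"):
            violations.append(f"container {name} is privileged")
        # Must be explicitly false; an unset value is treated as a violation.
        if ctx.get("allowPrivilegeEscalation") is not False:
            violations.append(f"container {name} may escalate privileges")
        # Container-level setting overrides the pod-level one when present.
        run_as_non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot"))
        if run_as_non_root is not True:
            violations.append(f"container {name} may run as root")
    return violations
```

Running this over representative manifests in CI gives product teams early feedback before the same rules reject the workload at admission.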
## Verification
- Non-exempt workloads run as non-root and cannot escalate privileges.
- hostPath, host networking, and privileged mode are rejected by default.
- Exceptions are scoped, owned, and time bounded.
- Policy tests cover representative manifests.
## Related Patterns
- Secure Cluster Baseline.
- Policy-as-Code Admission Control.
- Namespace-per-Tenant.
- Network Default Deny.

@@ -0,0 +1,42 @@
# Pattern: Policy-as-Code Admission Control
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, NetKingdom
Genesis family: Kubernetes and platform
## Problem
Unsafe Kubernetes manifests can reach runtime when deployment safety
depends only on convention or manual review.
## Context
Use this pattern for CI checks, GitOps flows, admission webhooks,
namespace guardrails, image trust, pod security, and exception handling.
## Forces
- Product teams need self-service deployment.
- Platform teams need enforceable guardrails.
- Policies must be versioned, reviewable, and tested.
- Emergency exceptions need expiry and audit.
## Solution
Encode deployment rules as policy packages evaluated before workloads
run. Reject or quarantine manifests that violate baseline controls.
## Verification
- Unsafe manifests fail in CI and at admission.
- Policy packages have tests and review history.
- Exceptions carry owner, reason, risk, and expiry.
- Admission decisions are logged.
## Related Patterns
- Policy-as-Code Admission.
- Secure Cluster Baseline.
- Pod Security Baseline/Restricted.
- Signed Image Admission.

@@ -0,0 +1,76 @@
# Pattern: Policy-as-Code Admission
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, NetKingdom
## Problem
Unsafe workloads can enter production when deployment checks depend on
manual review, inconsistent conventions, or late runtime detection.
## Context
Use this pattern for Kubernetes manifests, Helm releases, GitOps
deployments, image admission, network policy, pod security, and tenant
guardrails.
## Forces
- Product teams need self-service deployments.
- Platform teams need enforceable production baselines.
- Policies must be reviewable and testable.
- Emergency exceptions need explicit expiry and audit.
## Solution
Represent deployment and platform safety rules as policy packages that
run before workloads are admitted. Reject or quarantine unsafe manifests
before runtime.
## Implementation Sketch
1. Define baseline policies for pod security, RBAC, image trust,
namespace ownership, labels, resources, and network intent.
2. Run policy checks in CI and admission.
3. Version policy packages through Git review.
4. Support exceptions with owner, reason, expiry, and risk acceptance.
5. Emit policy decision events to the audit ledger.
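A minimal sketch of steps 1, 2, and 4: policies as small check functions plus time-bounded exceptions. The policy names, the `owner` label, and the decision shape are hypothetical; a real deployment would more likely express the rules in a dedicated policy engine, and tool choice is still open per the Maturity note.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PolicyException:
    policy: str
    namespace: str
    owner: str
    reason: str
    expires_at: datetime


def require_owner_label(manifest: dict):
    labels = manifest.get("metadata", {}).get("labels", {})
    return None if labels.get("owner") else "missing owner label"


def require_resource_limits(manifest: dict):
    for container in manifest.get("spec", {}).get("containers", []):
        if "limits" not in container.get("resources", {}):
            return f"container {container.get('name')} has no resource limits"
    return None


POLICIES = {
    "owner-label": require_owner_label,
    "resource-limits": require_resource_limits,
}


def admit(manifest: dict, exceptions: list, now: datetime) -> dict:
    """Evaluate every policy; only unexpired, matching exceptions waive a violation."""
    namespace = manifest.get("metadata", {}).get("namespace", "default")
    denied, waived = [], []
    for name, check in POLICIES.items():
        violation = check(manifest)
        if violation is None:
            continue
        active = any(
            e.policy == name and e.namespace == namespace and e.expires_at > now
            for e in exceptions
        )
        (waived if active else denied).append(f"{name}: {violation}")
    return {"allowed": not denied, "denied": denied, "waived": waived}
```

The same `admit` function can run in CI and behind an admission webhook, and its returned dictionary is a natural payload for the policy decision event in step 5.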
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Policies only run in CI | enforce at admission too |
| Exceptions never expire | require expiry and review |
| Policy language is opaque to teams | publish examples and test fixtures |
| Admission outage blocks recovery | document break-glass admission process |
## Related Capabilities
- Platform and Kubernetes hardening.
- Security governance and production readiness.
- Software supply chain security.
- Observability, detection, and audit.
## Maturity
Seed. The pattern is a baseline candidate, but tool choice and policy
package lifecycle need implementation work.
## Verification
- Unsafe manifests are rejected before runtime.
- Policy packages have tests and change review.
- Exceptions are time bounded and visible.
- Admission decisions are logged.
## Research Basis
Seeded by policy-as-code admission control, pod security
baseline/restricted, signed image admission, and GitOps with guardrails.
## References
- Initial exploration: Kubernetes and platform patterns.
- Initial exploration: Platform and Kubernetes hardening.

@@ -0,0 +1,46 @@
# Pattern: Policy Decision Point / Policy Enforcement Point
Status: reviewed
Readiness target: RL3 production
Primary owners: flex-auth, NetKingdom
Genesis family: Identity and access
## Problem
Authorization becomes inconsistent when policy decisions live inside
many applications without a shared decision contract.
## Context
Use this pattern when product services, platform APIs, object-storage
brokers, admin tools, and agents need consistent allow or deny decisions
for protected resources.
## Forces
- Applications must enforce decisions at local boundaries.
- Policy needs central shape, testability, and audit.
- Tenant, resource, action, assurance, and context must be included.
- PDP outages must not become implicit allow.
## Solution
Separate policy decision from policy enforcement. flex-auth acts as the
canonical PDP boundary and may delegate to Topaz; applications and
gateways act as PEPs that enforce allow, deny, obligations, and reason
codes.
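A minimal sketch of a fail-closed PEP wrapper. The decision shape (`effect`, `reason_code`, `obligations`) and the reason codes are assumptions for illustration; they are not the flex-auth decision envelope, which is defined elsewhere. `pdp_client` is any callable that takes the request and returns a decision.

```python
from dataclasses import dataclass

REQUIRED_FIELDS = {"actor", "tenant", "resource", "action", "assurance", "context"}


@dataclass(frozen=True)
class Enforcement:
    allowed: bool
    reason_code: str
    obligations: tuple = ()


def enforce(pdp_client, request: dict) -> Enforcement:
    """Allow only on a well-formed explicit allow; everything else denies."""
    if not REQUIRED_FIELDS.issubset(request):
        return Enforcement(False, "incomplete_request")
    try:
        decision = pdp_client(request)
    except Exception:
        # A PDP outage must never become an implicit allow.
        return Enforcement(False, "pdp_unavailable")
    if not isinstance(decision, dict) or decision.get("effect") not in ("allow", "deny"):
        return Enforcement(False, "malformed_decision")
    if decision["effect"] == "deny":
        return Enforcement(False, decision.get("reason_code", "denied"))
    return Enforcement(
        True,
        decision.get("reason_code", "allowed"),
        tuple(decision.get("obligations", ())),
    )
```

Keeping the wrapper this small makes the deny-on-outage and deny-on-malformed behavior easy to cover with unit tests in every service that acts as a PEP.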
## Verification
- Decision requests include actor, tenant, resource, action, assurance,
and context.
- PEPs deny on explicit deny, malformed decision, or PDP outage.
- Decisions produce stable reason codes and audit correlation ids.
- Policy packages are tested before production use.
## Related Patterns
- Delegated Authorization.
- Role Composition.
- Object-Level Authorization Check.
- Policy-as-Code Admission Control.

@@ -0,0 +1,44 @@
# Pattern: Protected Main Branch
Status: seed
Readiness target: RL2 private beta
Primary owners: product repos, NetKingdom
Genesis family: Supply chain
## Problem
Production code can be changed without review, tests, or traceable
approval when the main branch is not protected.
## Context
Use this pattern for repositories that produce production services,
platform manifests, policy packages, documentation standards, or release
artifacts.
## Forces
- Teams need fast iteration.
- Production branches need review and checks.
- Emergency fixes need traceable override.
- Branch rules should match artifact criticality.
## Solution
Protect main and release branches with review, required checks, signed
or verified changes where useful, and explicit emergency override
procedures.
## Verification
- Direct pushes to protected branches are blocked.
- Required tests and reviews pass before merge.
- Emergency overrides are logged and reviewed.
- Release artifacts link back to protected branch commits.
## Related Patterns
- GitOps with Guardrails.
- SLSA Build Provenance.
- SBOM-per-Release.
- Supply-Chain Provenance.

@@ -0,0 +1,43 @@
# Pattern: Quarantined Build Runner
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, product repos
Genesis family: Supply chain
## Problem
Build jobs often process untrusted code while also having access to
secrets, registries, signing keys, or deployment credentials.
## Context
Use this pattern for CI/CD runners, release builders, image builders,
dependency test jobs, and signing workflows.
## Forces
- Builds need network and artifact access.
- Pull requests may contain untrusted code.
- Signing and deployment credentials are high impact.
- Runners need cleanup and isolation between jobs.
## Solution
Run build jobs in isolated, least-privilege environments with limited
secrets, scoped network access, clean workspaces, and separate trusted
release/signing stages.
## Verification
- Untrusted jobs cannot access release or deployment secrets.
- Runner workspaces are isolated and cleaned.
- Network and registry access are scoped.
- Signing happens only in trusted release contexts.
## Related Patterns
- SLSA Build Provenance.
- Signed Container Images.
- Dependency Update Bot.
- Protected Main Branch.

@@ -0,0 +1,43 @@
# Pattern: Role Composition
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, flex-auth
Genesis family: Identity and access
## Problem
Hardcoded roles become too broad, inconsistent, and difficult to review
as products, tenants, agents, and operational tasks grow.
## Context
Use this pattern for platform roles, tenant roles, product roles,
operator privileges, and agent scopes.
## Forces
- Roles need stable names for people and policy.
- Permissions are resource and action specific.
- Tenant roles differ from platform roles.
- Agents need narrower scopes than human sponsors.
## Solution
Compose roles from named capabilities, resource scopes, actions,
constraints, and obligations. Keep role vocabulary in a reviewable model
that can be evaluated by flex-auth or a delegated PDP.
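A minimal sketch of a reviewable role model, assuming roles are plain data composed from named capabilities. The capability and role names, the `plane` field, and `explain_access` are hypothetical; constraints and obligations are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    name: str
    resource_type: str
    actions: frozenset


@dataclass(frozen=True)
class Role:
    name: str
    plane: str  # "tenant" or "platform", kept explicit to avoid confusion
    capabilities: tuple


READ_ARTIFACTS = Capability("read-artifacts", "artifact", frozenset({"read", "list"}))
PUBLISH_ARTIFACTS = Capability("publish-artifacts", "artifact", frozenset({"write"}))

ROLES = {
    "tenant-viewer": Role("tenant-viewer", "tenant", (READ_ARTIFACTS,)),
    "tenant-publisher": Role(
        "tenant-publisher", "tenant", (READ_ARTIFACTS, PUBLISH_ARTIFACTS)
    ),
}


def explain_access(role_names, plane, resource_type, action):
    """Return the (role, capability) pairs that grant the action."""
    grants = []
    for role_name in role_names:
        role = ROLES.get(role_name)
        if role is None or role.plane != plane:
            continue  # unknown roles and roles from another plane grant nothing
        for cap in role.capabilities:
            if cap.resource_type == resource_type and action in cap.actions:
                grants.append((role.name, cap.name))
    return grants
```

An empty result means deny, and a non-empty result doubles as the explanation an access review needs for why an actor holds an action.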
## Verification
- Each role maps to explicit capabilities and action sets.
- Privileged roles are time bounded or separately approved.
- Tenant and platform roles cannot be confused.
- Access reviews can explain why an actor has an action.
## Related Patterns
- Policy Decision Point / Policy Enforcement Point.
- Time-boxed Privilege Elevation.
- Human/Agent Identity Split.
- Object-Level Authorization Check.

@@ -0,0 +1,46 @@
# Pattern: Runtime Threat Detection
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, NetKingdom
Genesis family: Kubernetes and platform
## Problem
Admission controls and build checks do not detect every compromise that
appears after workloads are running.
## Context
Use this pattern for Kubernetes runtime events, process and network
signals, container behavior, privileged action monitoring, and incident
response triggers.
## Forces
- Runtime signals can be noisy.
- Detection must distinguish platform, tenant, human, service, and
agent activity.
- Alerts need enough context for response.
- Detection coverage should feed audit and incident workflows.
## Solution
Collect runtime process, network, Kubernetes, and workload signals,
classify them using a security event taxonomy, and route actionable
alerts into incident response and audit workflows.
## Verification
- Runtime detections include actor, workload, namespace, tenant, and
severity where available.
- Known suspicious events trigger alerts or findings.
- False positives are tuned without disabling critical coverage.
- Detection events link to incident runbooks.
## Related Patterns
- Security Event Taxonomy.
- Central Audit Ledger.
- Incident Runbook Library.
- Kill Switch / Tenant Freeze.

@@ -0,0 +1,43 @@
# Pattern: SBOM-per-Release
Status: seed
Readiness target: RL3 production
Primary owners: artifact-store, product repos
Genesis family: Supply chain
## Problem
Teams cannot assess exposure, license risk, or incident impact if
release components are not recorded.
## Context
Use this pattern for containers, packages, product releases, platform
images, and deployable artifacts.
## Forces
- SBOM generation should be automated.
- SBOMs need to stay attached to release artifacts.
- Consumers need stable formats and storage.
- Vulnerability triage needs component evidence.
## Solution
Generate a software bill of materials for each release artifact and
store it with artifact metadata, provenance, signature, and deployment
records.
## Verification
- Each production release has an SBOM.
- SBOMs are stored with immutable artifact identity.
- Vulnerability scans can reference SBOM components.
- Release promotion checks require SBOM presence.
## Related Patterns
- Supply-Chain Provenance.
- SLSA Build Provenance.
- Signed Container Images.
- Protected Main Branch.

@@ -0,0 +1,43 @@
# Pattern: Schema-First API Security
Status: seed
Readiness target: RL3 production
Primary owners: product repos, NetKingdom
Genesis family: Application/API security
## Problem
APIs become difficult to validate, test, and protect when the request
and response contract is implicit.
## Context
Use this pattern for OpenAPI, async APIs, event schemas, public APIs,
tenant APIs, and internal service contracts.
## Forces
- Schemas can drive validation and tests.
- Schemas alone do not prove authorization.
- Backward compatibility must be managed.
- Sensitive fields need explicit treatment.
## Solution
Define API schemas before or alongside implementation and use them to
drive validation, compatibility checks, security tests, documentation,
and gateway/application enforcement.
## Verification
- Requests and responses are validated against versioned schemas.
- Sensitive fields are marked and tested.
- Breaking changes are detected before release.
- Authorization tests cover resources described by the schema.
## Related Patterns
- API Gateway as Security Boundary.
- Object-Level Authorization Check.
- Backend-for-Frontend.
- Secure File Upload Pipeline.

@@ -0,0 +1,44 @@
# Pattern: Sealed Secret / Encrypted Git Secret
Status: seed
Readiness target: RL2 private beta
Primary owners: NetKingdom, Railiance platform
Genesis family: Secrets and cryptography
## Problem
GitOps workflows need reproducible secret references, but plaintext
secrets in Git are unacceptable.
## Context
Use this pattern for bootstrap secrets, local development secrets,
environment configuration, and recovery material that must be carried in
versioned operational bundles.
## Forces
- Git gives review and history.
- Secret plaintext must not be stored in Git.
- Encryption recipients and rotation need ownership.
- Bootstrap secrets should transition to runtime secret authority.
## Solution
Store only encrypted secret payloads or sealed secret manifests in Git,
with documented recipients, rotation, and bootstrap scope. Move runtime
secret issuance to OpenBao where possible.
## Verification
- Plaintext secret scans pass.
- Encryption recipients and owners are documented.
- Decryption is limited to approved bootstrap or recovery contexts.
- Runtime workloads do not depend on bootstrap root material.
## Related Patterns
- Secret Zero Avoidance.
- GitOps with Guardrails.
- External Secrets Operator.
- Break-glass Access.

@@ -0,0 +1,81 @@
# Pattern: Secret Zero Avoidance
Status: reviewed
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform
## Problem
A secret manager cannot safely improve the platform if its bootstrap
material becomes a larger unmanaged secret than the credentials it is
meant to protect.
## Context
Use this pattern when introducing OpenBao, encrypted Git secrets,
emergency bundles, age/SOPS bootstrap material, or recovery ceremonies.
## Forces
- The platform needs an initial trust anchor.
- Operators need recoverability, but ordinary workloads must never use
platform-root material.
- GitOps needs reproducible configuration, but secret plaintext does not
belong in repositories.
- Emergency access must exist without becoming normal operating access.
## Solution
Separate bootstrap trust from runtime secret authority. Use encrypted
bootstrap material only for installation and recovery, then transfer
ordinary secret issuance, rotation, audit, and lease tracking to
OpenBao.
## Implementation Sketch
1. Store bootstrap and recovery material in encrypted form with explicit
custody.
2. Require a documented ceremony for unseal, restore, and emergency
recovery.
3. Move runtime workloads to OpenBao-issued or brokered secrets.
4. Keep root/recovery material out of normal deployment paths.
5. Record every break-glass or bootstrap use as a review event.
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Bootstrap root token reused by workloads | prohibit root material in runtime manifests |
| Encrypted secret repository becomes undocumented | maintain owner, recipient, and rotation inventory |
| Recovery ceremony untested | run restore and unseal drills |
| Emergency bundle becomes daily admin path | require post-event review and replacement |
## Related Capabilities
- Secrets, keys, and credentials.
- Security governance and production readiness.
- Incident response and recovery.
- Observability, detection, and audit.
## Maturity
Reviewed. NetKingdom credential and OpenBao architecture already anchor
the pattern; operational drills are still needed before canonical status.
## Verification
- Runtime workloads do not reference bootstrap root material.
- OpenBao audit is enabled before production use.
- Recovery and unseal steps are documented and tested.
- Emergency access events produce reviewable records.
## Research Basis
Seeded by encrypted Git secret, central secrets management, break-glass,
and incident recovery requirements in the initial exploration.
## References
- NetKingdom credential-management standard.
- NetKingdom recursive platform identity/security architecture.
- Railiance OpenBao platform secrets service.

@@ -0,0 +1,44 @@
# Pattern: Secure Cluster Baseline
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform
Genesis family: Kubernetes and platform
## Problem
Production Kubernetes clusters inherit unsafe defaults unless baseline
hardening is explicit, versioned, and verified.
## Context
Use this pattern for every cluster class that hosts platform services,
tenant workloads, identity services, secret managers, or production
applications.
## Forces
- Kubernetes exposes many powerful APIs by default.
- Platform add-ons need privileged access but must be bounded.
- Baseline controls must survive upgrades.
- Product teams need predictable guardrails.
## Solution
Define a secure cluster baseline covering API server settings, RBAC,
node hardening, pod security, admission, network policy, secret
handling, audit, backups, and upgrade posture.
## Verification
- Cluster baseline checks run before production admission.
- Privileged Kubernetes APIs are limited and reviewed.
- Audit logging, backup, and restore paths are enabled.
- Upgrade tests verify baseline controls remain active.
## Related Patterns
- Pod Security Baseline/Restricted.
- Policy-as-Code Admission Control.
- Network Default Deny.
- Runtime Threat Detection.

@@ -0,0 +1,43 @@
# Pattern: Secure File Upload Pipeline
Status: seed
Readiness target: RL3 production
Primary owners: product repos, artifact-store, NetKingdom
Genesis family: Application/API security
## Problem
User-supplied files can carry malware, parser attacks, data leakage, and
unsafe object-storage exposure.
## Context
Use this pattern for tenant file uploads, artifact ingestion, document
processing, media upload, and user-controlled object storage paths.
## Forces
- Users need convenient uploads.
- Uploaded files should not be trusted until processed.
- Scanning, classification, and transformation may be asynchronous.
- Object access must remain tenant and authorization scoped.
## Solution
Route uploads through a controlled pipeline: accept to quarantine,
record metadata, scan/classify, transform if needed, promote to trusted
storage, and serve through authorized access paths.
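A minimal sketch of the pipeline as an explicit state machine, assuming each upload record tracks its own state and evidence. The state names and the `Upload` class are hypothetical, and the optional transform step is folded into promotion for brevity.

```python
# Allowed state transitions; anything not listed is rejected.
ALLOWED_TRANSITIONS = {
    "quarantined": {"scanned_clean", "rejected"},
    "scanned_clean": {"promoted", "rejected"},
    "promoted": set(),
    "rejected": set(),
}


class Upload:
    def __init__(self, upload_id: str, tenant_id: str):
        self.upload_id = upload_id
        self.tenant_id = tenant_id
        self.state = "quarantined"  # every raw upload starts untrusted
        self.history = [("quarantined", "accepted")]

    def transition(self, new_state: str, evidence: str) -> None:
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"cannot move from {self.state} to {new_state}")
        if not evidence:
            raise ValueError("transition requires recorded evidence")
        self.state = new_state
        self.history.append((new_state, evidence))

    def is_servable(self) -> bool:
        # Only promoted objects may be served; authorization is a separate check.
        return self.state == "promoted"
```

Because `promoted` is reachable only through `scanned_clean`, a file cannot be served without a recorded scan result, which matches the promotion rule in the verification list.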
## Verification
- Raw uploads land in quarantine or untrusted storage.
- Scanning and classification results are recorded before promotion.
- Access to uploaded objects uses tenant and object-level authorization.
- Malicious or unsupported files fail safely.
## Related Patterns
- Object-Level Authorization Check.
- STS Credential Vending.
- Tenant Data Partitioning.
- Central Audit Ledger.

@@ -0,0 +1,43 @@
# Pattern: Security Event Taxonomy
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform
Genesis family: Detection and response
## Problem
Logs become noisy and hard to correlate when security-relevant events
do not share names, fields, severities, and actor/resource semantics.
## Context
Use this pattern for identity, authorization, OpenBao, Kubernetes,
artifact-store, ops access, workload events, and incident response.
## Forces
- Different systems emit different event shapes.
- Tenant-visible audit requires careful projection.
- Agents and humans need separate attribution.
- Detection and response need severity and category consistency.
## Solution
Define a shared taxonomy for security events with event class, actor,
tenant, resource, action, outcome, severity, source, correlation id, and
visibility rules.
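A minimal sketch of a shared event record carrying the fields listed above. The severity levels, outcome values, and the choice of tenant-visible fields are assumptions for illustration; the actual taxonomy values are not defined in this pattern.

```python
from dataclasses import asdict, dataclass

# Assumed vocabularies; the real taxonomy would define these centrally.
SEVERITIES = ("info", "low", "medium", "high", "critical")
OUTCOMES = ("success", "failure", "denied")
TENANT_VISIBLE_FIELDS = (
    "event_class", "actor", "tenant", "resource", "action", "outcome",
    "correlation_id",
)


@dataclass(frozen=True)
class SecurityEvent:
    event_class: str
    actor: str
    tenant: str
    resource: str
    action: str
    outcome: str
    severity: str
    source: str
    correlation_id: str

    def __post_init__(self):
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity}")
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome}")

    def tenant_view(self) -> dict:
        """Project only the fields marked tenant-visible."""
        full = asdict(self)
        return {name: full[name] for name in TENANT_VISIBLE_FIELDS}
```

Validating at construction time means an emitter cannot produce an event with an ad hoc severity, and the explicit projection keeps internal fields such as `source` out of tenant-facing audit views.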
## Verification
- Critical systems emit events matching required fields.
- Event classes map to detection and response workflows.
- Tenant-visible fields are explicitly marked.
- Correlation ids link identity, policy, secret, and workload events.
## Related Patterns
- Central Audit Ledger.
- Tenant Audit Log View.
- Runtime Threat Detection.
- Incident Runbook Library.

@@ -0,0 +1,45 @@
# Pattern: Shared Control Plane, Isolated Data Plane
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform
Genesis family: Tenant isolation
## Problem
Platforms often need centralized management while tenant workloads and
data require stronger separation than the control plane itself.
## Context
Use this pattern for SaaS management planes, tenant runtime clusters,
dedicated object storage, per-tenant databases, or cell-based data
planes.
## Forces
- Shared control planes reduce management overhead.
- Tenant data planes need stronger isolation and blast-radius control.
- Control-plane actions are high impact and must be tenant scoped.
- Audit must explain who affected which tenant data plane.
## Solution
Keep management APIs and policy orchestration in a shared control plane,
but isolate tenant runtime and data paths. Every control action carries
tenant, target plane, actor, policy, and audit context.
## Verification
- Tenant users cannot mutate global control-plane state unless
delegated.
- Data-plane credentials and network paths are tenant scoped.
- Control-plane actions produce tenant and target-plane audit events.
- A compromised tenant data plane cannot directly control another
tenant's data plane.
## Related Patterns
- Tenant Isolation.
- Cluster-per-Tenant.
- Cell-based Architecture.
- Tenant Data Partitioning.

@@ -0,0 +1,43 @@
# Pattern: Short-lived Credentials
Status: reviewed
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform, flex-auth
Genesis family: Secrets and cryptography
## Problem
Static credentials remain useful after compromise and are difficult to
inventory, rotate, and scope.
## Context
Use this pattern for object storage, SSH, API tokens, database access,
workload secrets, and operator elevation.
## Forces
- Consumers need stable integration contracts.
- Backends differ in session or lease support.
- Refresh and expiration behavior must be tested.
- Audit must link issuance to actor, tenant, resource, and purpose.
## Solution
Prefer credentials with explicit TTL, scope, lease metadata, and
revocation path. Normalize issuance through identity, authorization,
and brokered secret authority.
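The expiry handling can be sketched as follows. This is a hedged example: the `LeasedCredential` shape and the 80 percent refresh threshold are assumptions chosen for illustration, not values taken from a NetKingdom contract.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LeasedCredential:
    owner: str
    scope: str
    issued_at: datetime
    expires_at: datetime


def needs_refresh(cred: LeasedCredential, now: datetime,
                  refresh_fraction: float = 0.8) -> bool:
    """Refresh once the given fraction of the credential lifetime has elapsed."""
    lifetime = cred.expires_at - cred.issued_at
    return now >= cred.issued_at + lifetime * refresh_fraction


def is_expired(cred: LeasedCredential, now: datetime) -> bool:
    """Enforcement points should reject the credential from this moment on."""
    return now >= cred.expires_at
```

Refreshing at a fraction of the lifetime, rather than at a fixed offset before expiry, keeps the same consumer logic usable for backends with very different TTLs.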
## Verification
- Credentials include scope, expiry, and owner metadata.
- Consumers refresh before expiration.
- Expired credentials are rejected by backends.
- Issuance and revocation are auditable.
## Related Patterns
- STS Credential Vending.
- Dynamic Secrets.
- Short-Lived SSH Certificates.
- Time-boxed Privilege Elevation.

@@ -0,0 +1,76 @@
# Pattern: Short-Lived SSH Certificates
Status: draft
Readiness target: RL3 production
Primary owners: ops-warden, ops-bridge, NetKingdom
## Problem
Long-lived SSH keys make operator and agent access hard to revoke,
audit, and scope.
## Context
Use this pattern for administrative shell access, tunnel access,
automation access, and agent access to infrastructure where SSH remains
necessary.
## Forces
- Operators need reliable emergency and maintenance access.
- Access must be time-boxed, attributable, and least privilege.
- Agents need identities separate from their human sponsors.
- SSH remains useful but should not bypass platform authorization.
## Solution
Issue short-lived SSH certificates from a controlled authority after
identity and policy checks. Consumers use certificates through
ops-bridge or equivalent access paths that record actor, purpose,
target, TTL, and correlation ids.
## Implementation Sketch
1. Authenticate the human, automation, or agent identity.
2. Authorize target, role, command class, and TTL.
3. Have ops-warden issue an SSH certificate with principal and expiry.
4. Route access through ops-bridge where practical for audit capture.
5. Revoke certificates after use, or let them expire shortly afterwards.
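Step 3 can be illustrated with the OpenSSH certificate authority workflow. This sketch assumes the authority signs keys with `ssh-keygen`; how ops-warden actually issues certificates is not described here, and the function name, paths, and the 60-minute ceiling are hypothetical.

```python
from typing import List

MAX_TTL_MINUTES = 60  # hypothetical policy ceiling


def ssh_cert_command(ca_key_path: str, public_key_path: str, key_id: str,
                     principals: List[str], ttl_minutes: int) -> List[str]:
    """Build an ssh-keygen invocation that signs a user key with a short validity."""
    if not principals:
        raise ValueError("at least one principal is required")
    if not 1 <= ttl_minutes <= MAX_TTL_MINUTES:
        raise ValueError("ttl outside policy bounds")
    return [
        "ssh-keygen",
        "-s", ca_key_path,            # CA private key used for signing
        "-I", key_id,                 # key identity, recorded in sshd logs
        "-n", ",".join(principals),   # principals the certificate is valid for
        "-V", f"+{ttl_minutes}m",     # validity: from now until now + TTL
        public_key_path,
    ]
```

Encoding actor and correlation id into the key identity is one way to make sshd logs attributable, since the identity string is logged when the certificate is used.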
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Static SSH key fallback persists | inventory and remove unmanaged keys |
| Certificates have broad principals | bind principals to role and target class |
| Agent access borrows human identity | issue explicit agent certificates |
| Audit path bypassed | restrict direct network/admin paths |
## Related Capabilities
- Privileged access management.
- Agent access control.
- Incident response and recovery.
- Observability, detection, and audit.
## Maturity
Draft. Pattern ownership is clear, but implementation details live in
the ops repos and need verification fixtures.
## Verification
- Certificates expire quickly and cannot be renewed silently.
- Certificate principal, target, actor, and reason are logged.
- Static key exceptions are inventoried and reviewed.
- Agent identities are distinguishable from human identities.
## Research Basis
Seeded by privileged access management, time-boxed privilege elevation,
human/agent identity split, and break-glass access patterns.
## References
- Initial exploration: Authorization and access control.
- NetKingdom ownership map: ops-warden and ops-bridge.

@@ -0,0 +1,43 @@
# Pattern: Signed Container Images
Status: seed
Readiness target: RL3 production
Primary owners: artifact-store, Railiance platform, product repos
Genesis family: Supply chain
## Problem
Container images can be replaced, confused, or deployed from untrusted
origins unless image identity is cryptographically verifiable.
## Context
Use this pattern for production container images, platform controllers,
tenant workloads, and deployment admission.
## Forces
- Tags are mutable unless controlled.
- Signatures need trusted keys or identities.
- Admission must verify the image that actually runs.
- Rollbacks need signed artifacts too.
## Solution
Sign container image digests after trusted builds and require production
deployments to reference images whose signatures and provenance satisfy
policy.
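A small sketch of the digest-pinning check, assuming image references of the form `repository@sha256:<64 hex characters>`. The helper name and example registry are illustrative; signature verification itself is out of scope for this snippet.

```python
import re
from typing import Optional

# Matches references pinned to an immutable sha256 digest.
_DIGEST_REF = re.compile(r"^(?P<repo>[^@\s]+)@(?P<digest>sha256:[0-9a-f]{64})$")


def pinned_digest(image_ref: str) -> Optional[str]:
    """Return the sha256 digest if the reference is pinned, else None."""
    match = _DIGEST_REF.match(image_ref)
    return match.group("digest") if match else None
```

A tag-only reference such as `registry.example.internal/app:latest` yields `None`, which a deployment check can treat as "not pinned".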
## Verification
- Images are signed by trusted build or release identities.
- Admission rejects unsigned or untrusted images.
- Deployments pin or verify immutable digests.
- Signature rotation and key compromise processes exist.
## Related Patterns
- Signed Image Admission.
- SLSA Build Provenance.
- SBOM-per-Release.
- Quarantined Build Runner.

@@ -0,0 +1,43 @@
# Pattern: Signed Image Admission
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, artifact-store, product repos
Genesis family: Kubernetes and platform
## Problem
Clusters cannot trust that an image came from reviewed source or an
approved build unless admission verifies signature and provenance.
## Context
Use this pattern for Kubernetes admission, container registries,
artifact-store, release promotion, and supply-chain evidence.
## Forces
- Developers need fast image builds.
- Production needs evidence of review and provenance.
- Not every image source is equally trusted.
- Emergency rollback must still use trusted artifacts.
## Solution
Require production workloads to use signed images from approved
registries. Admission verifies signature, digest, provenance, and policy
before the workload runs.
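The admission decision can be sketched as an ordered set of checks. This example assumes signature and provenance verification happen in a separate verifier whose results arrive as booleans; the registry prefix and reason codes are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class ImageEvidence:
    image_ref: str
    signature_verified: bool    # result from an external signature verifier
    provenance_verified: bool   # result from an external provenance verifier


def admit(evidence: ImageEvidence,
          approved_registries: Sequence[str]) -> Tuple[bool, str]:
    """Return (allowed, reason code) for a production workload image."""
    if not evidence.image_ref.startswith(tuple(approved_registries)):
        return False, "registry_not_approved"
    if "@sha256:" not in evidence.image_ref:
        return False, "digest_not_pinned"
    if not evidence.signature_verified:
        return False, "signature_missing_or_untrusted"
    if not evidence.provenance_verified:
        return False, "provenance_missing_or_untrusted"
    return True, "ok"
```

Because emergency rollbacks pass through the same function, they follow the same trust policy as ordinary deployments.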
## Verification
- Unsigned images are rejected in production namespaces.
- Admission pins by digest or verifies immutable references.
- Signatures and provenance link to reviewed source and build identity.
- Emergency images follow the same trust policy.
## Related Patterns
- Supply-Chain Provenance.
- Signed Container Images.
- SLSA Build Provenance.
- Policy-as-Code Admission Control.

@@ -0,0 +1,43 @@
# Pattern: SLSA Build Provenance
Status: seed
Readiness target: RL3 production
Primary owners: artifact-store, product repos, Railiance platform
Genesis family: Supply chain
## Problem
Production artifacts are hard to trust if the platform cannot prove
which source, builder, dependencies, and process produced them.
## Context
Use this pattern for release pipelines, container builds, package
publishing, artifact-store metadata, and deployment admission.
## Forces
- Provenance must be generated by a trustworthy build process.
- Developers need usable build workflows.
- Provenance must be verifiable after release.
- Admission and incident response need artifact lineage.
## Solution
Emit SLSA-style provenance for production artifacts, linking artifact
digest to source repository, commit, builder identity, workflow, inputs,
and build parameters.
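For orientation, the sketch below assembles a provenance statement as a plain dictionary. The field names are intended to follow the in-toto Statement v1 and SLSA v1.0 provenance layouts and should be checked against those specifications before use; the function name and arguments are illustrative assumptions.

```python
from typing import Any, Dict


def provenance_statement(artifact_name: str, sha256_hex: str, repo_uri: str,
                         commit: str, builder_id: str, build_type: str,
                         parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Link an artifact digest to source, builder identity, and build parameters."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": artifact_name, "digest": {"sha256": sha256_hex}}],
        "predicateType": "https://slsa.dev/provenance/v1",
        "predicate": {
            "buildDefinition": {
                "buildType": build_type,
                "externalParameters": parameters,
                # Source repository and commit recorded as a resolved input.
                "resolvedDependencies": [
                    {"uri": repo_uri, "digest": {"gitCommit": commit}}
                ],
            },
            "runDetails": {"builder": {"id": builder_id}},
        },
    }
```

The statement only becomes evidence once a trusted builder signs it; an unsigned dictionary like this one proves nothing on its own.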
## Verification
- Production artifacts carry verifiable provenance.
- Provenance links to protected source and trusted builder identity.
- Admission or release promotion checks provenance policy.
- Incident triage can trace an artifact back to source and build.
## Related Patterns
- Supply-Chain Provenance.
- Protected Main Branch.
- Quarantined Build Runner.
- Signed Container Images.

@@ -0,0 +1,117 @@
# Pattern: STS Credential Vending
Status: reviewed
Readiness target: RL3 production
Primary owners: NetKingdom, flex-auth, Railiance platform
## Problem
Applications need object-storage access, but long-lived access keys in
application pods, repositories, or tenant namespaces create a durable
compromise path. Provider-specific bucket policy alone is not enough for
NetKingdom because access decisions must include identity, tenant,
resource, action, TTL, assurance, and audit context.
## Context
Use this pattern when a workload, service, agent, or human needs
temporary access to S3-compatible object storage. The first target
consumer is `artifact-store`, but the pattern is intended for any
NetKingdom-enabled platform or tenant workload.
The pattern applies across AWS S3, Ceph RGW, MinIO/AIStor, Cloudflare R2,
and OpenBao-assisted broker paths.
## Forces
- Provider-native STS gives strong backend expiration semantics, but
providers differ in API shape and OIDC support.
- Applications need a simple credential contract, but security decisions
require rich context.
- OpenBao can protect parent credentials and audit broker operations, but
must not become the authorization policy engine.
- Tenant administrators need self-service within tenant scope, but must
not receive platform-root object-store or OpenBao authority.
- Temporary credentials reduce blast radius, but consumers must refresh
safely and support session tokens.
## Solution
Introduce a credential-vending service that accepts a NetKingdom IAM
Profile token or workload identity, asks flex-auth for an authorization
decision, and exchanges approved requests for provider-native temporary
credentials or an OpenBao-assisted broker path.
The consumer receives normalized credentials:
```text
access key id
secret access key
session token
expiration
scope metadata
decision/audit correlation id
```
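As a hedged illustration of the normalization step, the sketch below maps an AWS STS style response (a `Credentials` object with `AccessKeyId`, `SecretAccessKey`, `SessionToken`, and `Expiration`) onto the contract above. The output key names and the function itself are assumptions; other backends would need their own adapters.

```python
from typing import Any, Dict


def normalize_sts_response(response: Dict[str, Any], scope: Dict[str, Any],
                           correlation_id: str) -> Dict[str, Any]:
    """Convert an AWS STS style response into the normalized credential contract."""
    creds = response["Credentials"]
    return {
        "access_key_id": creds["AccessKeyId"],
        "secret_access_key": creds["SecretAccessKey"],
        "session_token": creds["SessionToken"],
        "expiration": creds["Expiration"],
        "scope": scope,                    # bucket, prefix, actions, tenant
        "correlation_id": correlation_id,  # links back to the decision and audit
    }
```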
## Implementation Sketch
1. Caller authenticates through key-cape or Keycloak.
2. Credential-vending service validates issuer, audience, subject,
tenant, expiration, and assurance evidence.
3. Service builds a flex-auth request with protected-system id, bucket,
prefix/object, actions, TTL, actor, tenant, and context.
4. flex-auth evaluates policy, delegating to Topaz where appropriate.
5. Deny returns a stable reason code and audit id.
6. Allow invokes backend exchange:
- AWS STS `AssumeRoleWithWebIdentity`;
- Ceph RGW STS;
- MinIO/AIStor STS;
- Cloudflare R2 temporary credential API;
- OpenBao-assisted broker path for protected parent material.
7. Service records identity, policy, backend, lease, and audit metadata.
8. Consumer refreshes before expiration.
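Steps 3 to 7 can be sketched as a single function with the policy call and backend exchange injected as callables. The `Decision` shape, the reason codes, and the stub signatures are assumptions for illustration; token validation (step 2) is assumed to have happened before `vend` is called.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Decision:
    allow: bool
    reason: str
    max_ttl_seconds: int
    audit_id: str


def vend(request: Dict[str, Any],
         authorize: Callable[[Dict[str, Any]], Decision],
         exchange: Callable[[Dict[str, Any], int], Dict[str, Any]]) -> Dict[str, Any]:
    """Authorize a storage access request, then exchange it for temporary credentials."""
    try:
        decision = authorize(request)
    except Exception:
        # Fail closed when the policy service cannot be reached.
        return {"status": "denied", "reason": "authorization_unavailable", "audit_id": None}
    if not decision.allow:
        return {"status": "denied", "reason": decision.reason, "audit_id": decision.audit_id}
    # Reduce the requested TTL to the policy maximum rather than trusting the caller.
    ttl = min(request["ttl_seconds"], decision.max_ttl_seconds)
    credentials = exchange(request, ttl)
    return {"status": "issued", "credentials": credentials,
            "ttl_seconds": ttl, "audit_id": decision.audit_id}
```

Injecting `exchange` keeps the flow identical across AWS STS, Ceph RGW, MinIO/AIStor, R2, and broker paths; only the adapter changes.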
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Consumer lacks `AWS_SESSION_TOKEN` support | keep static bridge only as transitional; add session token refs |
| flex-auth unavailable | fail closed except documented emergency platform workflows |
| OpenBao unavailable | fail if parent material or broker config is required |
| Backend STS unavailable | return retryable backend error; do not fall back to root credentials |
| Tenant mismatch | deny with stable reason code |
| TTL too long | reduce to policy maximum or deny |
| Audit sink unavailable | deny privileged/platform-scoped requests; buffer only if policy permits |
## Related Capabilities
- Object storage access.
- Authorization and access control.
- Secrets, keys, and credentials.
- Tenant isolation.
- Observability, detection, and audit.
## Maturity
Reviewed. The pattern has architecture and ADR coverage in NetKingdom.
It should not be marked canonical until artifact-store temporary session
token support and at least one backend exchange path are verified.
## Verification
- IAM Profile validation rejects wrong issuer or audience.
- flex-auth decision includes tenant, protected system, bucket/prefix,
action set, TTL, obligations, and deny reason.
- Backend returns access key id, secret access key, session token, and
expiration.
- OpenBao audit record exists when parent material or broker config is
accessed.
- Consumer refreshes before expiration.
- Deny paths emit stable reason codes.
## References
- NetKingdom `docs/object-storage-sts-credential-vending.md`.
- NetKingdom `ADR-0008 - Object Storage STS Credential Vending Boundary`.
- Railiance Platform `docs/openbao.md`.
- Artifact-store `ARTIFACT-STORE-WP-0007`.

@@ -0,0 +1,78 @@
# Pattern: Supply-Chain Provenance
Status: seed
Readiness target: RL3 production
Primary owners: Railiance platform, artifact-store, product repos
## Problem
Production artifacts become hard to trust when source, dependencies,
build runners, images, signatures, SBOMs, and deployment admission are
not connected.
## Context
Use this pattern for container images, packages, release artifacts,
SBOMs, dependency updates, GitHub/GitLab workflows, artifact-store, and
Kubernetes admission.
## Forces
- Teams need fast dependency updates and builds.
- Production needs evidence that artifacts came from reviewed source.
- Build systems need secrets, but secret exposure in CI is high impact.
- Admission should verify artifacts without blocking all development.
## Solution
Require production artifacts to carry review, dependency, build,
signature, and provenance evidence. Admission and release workflows use
that evidence to decide what can run or be promoted.
## Implementation Sketch
1. Protect main branches and release tags.
2. Generate SBOMs per release.
3. Sign container images and release artifacts.
4. Emit SLSA-style build provenance from trusted runners.
5. Keep build runners isolated and least privilege.
6. Verify signatures and provenance before production admission.
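A promotion gate for step 6 might start as simply as the sketch below. The evidence keys and the release record layout are hypothetical; a real gate would verify each item cryptographically instead of only checking that it is present.

```python
from typing import Any, Dict, List

# Evidence a release record must reference before promotion (illustrative names).
REQUIRED_EVIDENCE = ("sbom", "signature", "provenance")


def missing_evidence(release: Dict[str, Any]) -> List[str]:
    """List the evidence items that are absent or empty on a release record."""
    return [item for item in REQUIRED_EVIDENCE if not release.get(item)]


def may_promote(release: Dict[str, Any]) -> bool:
    """Promotion is allowed only when no required evidence is missing."""
    return not missing_evidence(release)
```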
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| SBOM generated but not stored with releases | store SBOMs in artifact-store or release records |
| Signatures exist but admission ignores them | enforce signed image admission |
| CI runner has broad production secrets | quarantine runners and restrict secret access |
| Dependency bot floods unreviewed changes | require tests and review gates |
## Related Capabilities
- Software supply chain security.
- Platform and Kubernetes hardening.
- Security governance and production readiness.
- Observability, detection, and audit.
## Maturity
Seed. The pattern has strong external standards, but NetKingdom still
needs concrete artifact-store and admission integration.
## Verification
- Releases include SBOM, signature, and provenance.
- Admission rejects unsigned or untrusted production artifacts.
- Build runner access to secrets is minimized.
- Dependency updates are tested and reviewed.
## Research Basis
Seeded by protected main branch, dependency update bot, SBOM-per-release,
SLSA build provenance, signed container images, and quarantined build
runner patterns.
## References
- Initial exploration: Software supply chain security.
- Initial exploration: Supply-chain patterns.

@@ -0,0 +1,44 @@
# Pattern: Tenant Audit Log View
Status: seed
Readiness target: RL4 regulated production
Primary owners: product repos, NetKingdom, Railiance platform
Genesis family: Detection and response
## Problem
Enterprise tenants need visibility into relevant security events without
being exposed to platform-only internals or other tenants' data.
## Context
Use this pattern for tenant admin portals, compliance exports, data
access events, privileged tenant actions, and platform actions that
affect a tenant.
## Forces
- Tenants need trust and accountability.
- Platform logs may contain sensitive internal details.
- Event redaction must preserve evidence value.
- Tenant exports require retention and integrity rules.
## Solution
Project central audit events into tenant-scoped views with explicit
visibility rules, redaction, retention, export controls, and access
authorization.
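The projection step can be sketched as an allowlist filter. The visible field set below is an assumption for illustration; the actual visibility rules would come from the security event taxonomy.

```python
from typing import Any, Dict, FrozenSet, Iterable, List

# Hypothetical allowlist of fields a tenant may see.
TENANT_VISIBLE_FIELDS: FrozenSet[str] = frozenset(
    {"timestamp", "actor", "action", "resource", "outcome", "correlation_id"}
)


def project_for_tenant(events: Iterable[Dict[str, Any]], tenant: str,
                       visible: FrozenSet[str] = TENANT_VISIBLE_FIELDS
                       ) -> List[Dict[str, Any]]:
    """Keep only this tenant's events and drop every field not explicitly visible."""
    return [
        {key: value for key, value in event.items() if key in visible}
        for event in events
        if event.get("tenant") == tenant
    ]
```

Using an allowlist instead of a redaction denylist means a newly added platform-only field stays hidden until someone marks it tenant visible.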
## Verification
- Tenants can see their own relevant events only.
- Platform-only fields are redacted or omitted.
- Tenant audit access is itself audited.
- Exports preserve integrity and retention rules.
## Related Patterns
- Central Audit Ledger.
- Security Event Taxonomy.
- Tenant Context Propagation.
- Object-Level Authorization Check.

@@ -0,0 +1,77 @@
# Pattern: Tenant Context Propagation
Status: draft
Readiness target: RL3 production
Primary owners: NetKingdom, product repos, Railiance platform
## Problem
Tenant isolation breaks when request handlers, background jobs, events,
storage access, or audit records lose the tenant context that justified
the action.
## Context
Use this pattern for APIs, workers, queues, schedulers, event streams,
data stores, object storage, audit events, and admin workflows.
## Forces
- Tenant context must be explicit and authenticated.
- Some actions are platform-scoped and must not masquerade as tenant
actions.
- Background jobs often outlive the original request.
- Logs and policy decisions need the same tenant correlation.
## Solution
Carry tenant context as a required field in trusted request, job, event,
authorization, storage, and audit envelopes. Derive it from identity or
controlled platform context, not from arbitrary user input.
## Implementation Sketch
1. Define tenant id and platform-scope semantics.
2. Add tenant context to API, job, and event envelopes.
3. Require flex-auth decisions to include tenant scope.
4. Persist tenant context through queues and background workers.
5. Include tenant id in audit and detection events.
6. Reject ambiguous or conflicting tenant context.
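The envelope rules in steps 2, 4, and 6 can be sketched with a typed job envelope. The class, the `scope` values, and the `tenant` claim name are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class JobEnvelope:
    job_id: str
    scope: str                 # "tenant" or "platform"
    tenant_id: Optional[str]
    payload: Dict[str, Any]

    def __post_init__(self) -> None:
        if self.scope not in ("tenant", "platform"):
            raise ValueError("unknown scope")
        if self.scope == "tenant" and not self.tenant_id:
            raise ValueError("tenant-scoped job without tenant id")
        if self.scope == "platform" and self.tenant_id is not None:
            raise ValueError("platform-scoped job must not carry a tenant id")


def envelope_from_claims(job_id: str, claims: Dict[str, Any],
                         payload: Dict[str, Any]) -> JobEnvelope:
    """Derive tenant context from trusted claims; reject conflicting payload input."""
    tenant = claims.get("tenant")
    requested = payload.get("tenant_id")
    if requested is not None and requested != tenant:
        raise ValueError("conflicting tenant context")
    return JobEnvelope(job_id, "tenant", tenant, payload)
```

Workers that accept only `JobEnvelope` instances cannot run a tenant job whose tenant id was dropped somewhere in the queue.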
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Tenant id is passed as untrusted parameter | derive from trusted identity/session claims |
| Worker jobs omit tenant id | require typed job envelopes and tests |
| Platform job uses tenant scope accidentally | distinguish platform-scope explicitly |
| Audit records miss tenant | make tenant field required for tenant actions |
## Related Capabilities
- Tenant isolation.
- Authorization and access control.
- Application and API security.
- Observability, detection, and audit.
## Maturity
Draft. This is foundational for multi-tenant correctness and should be
promoted with product conformance tests.
## Verification
- Request, job, event, policy, and audit envelopes include tenant scope.
- Ambiguous tenant context fails closed.
- Platform-scope operations are explicitly marked and reviewed.
- Tests confirm that cross-tenant propagation attempts are rejected by
APIs and workers.
## Research Basis
Seeded by tenant context propagation, tenant identity boundary,
tenant-scoped authorization, and tenant data partitioning.
## References
- Initial exploration: Tenant isolation patterns.
- Initial exploration: Application and API security.

@@ -0,0 +1,43 @@
# Pattern: Tenant Data Partitioning
Status: seed
Readiness target: RL3 production
Primary owners: product repos, NetKingdom
Genesis family: Tenant isolation
## Problem
Tenant data can leak when storage, query, object, or cache access does
not enforce tenant partition boundaries.
## Context
Use this pattern for shared databases, object storage, search indexes,
caches, queues, analytics stores, and backup/export paths.
## Forces
- Shared data stores improve efficiency.
- Every data access must include trusted tenant scope.
- Query bugs can bypass application-level checks.
- Backups and exports need the same tenant boundary as live systems.
## Solution
Partition data by tenant at the storage or query layer and require
tenant-scoped authorization before access. Use schema, prefix, row-level
security, bucket/prefix policy, or dedicated stores according to risk.
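For the prefix option, a key builder can keep every object under a tenant prefix. The `tenants/<id>/` layout and the path normalization are assumptions of this sketch; backend prefix policy remains the enforcing control.

```python
import posixpath


def tenant_object_key(tenant_id: str, relative_key: str) -> str:
    """Build an object key confined to the tenant's prefix."""
    if not tenant_id or "/" in tenant_id:
        raise ValueError("invalid tenant id")
    normalized = posixpath.normpath(relative_key)
    # Reject empty keys, absolute keys, and keys that climb out of the prefix.
    if normalized in (".", "..") or normalized.startswith(("/", "../")):
        raise ValueError("key escapes tenant prefix")
    return f"tenants/{tenant_id}/{normalized}"
```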
## Verification
- Tests confirm that cross-tenant query and object access attempts are
rejected.
- Background jobs and exports carry tenant context.
- Backups, search, and caches preserve tenant partitioning.
- Audit records identify tenant, object, and actor.
## Related Patterns
- Tenant Context Propagation.
- Object-Level Authorization Check.
- Key-per-Tenant.
- STS Credential Vending.

@@ -0,0 +1,81 @@
# Pattern: Tenant Isolation
Status: draft
Readiness target: RL3 production
Primary owners: NetKingdom, Railiance platform, product repos
## Problem
Multi-tenant systems fail dangerously when tenant identity, runtime,
data, control-plane authority, or background jobs can cross boundaries
implicitly.
## Context
Use this pattern for SaaS products, shared clusters, shared databases,
object storage, platform services, admin tools, and asynchronous jobs.
## Forces
- Shared infrastructure improves efficiency.
- Tenants need strong data and authorization boundaries.
- Isolation may be implemented at namespace, cluster, cell, database,
key, policy, or API layers.
- Product teams need a clear contract for carrying tenant context.
## Solution
Make tenant context an explicit security boundary across identity,
authorization, runtime, data, audit, and operations. Choose isolation
strength per risk: namespace, cluster, cell, data partition, or isolated
data plane.
## Implementation Sketch
1. Define tenant id format and trust source.
2. Require tenant context in request and job envelopes.
3. Enforce tenant scope in flex-auth decisions.
4. Partition data and object storage by tenant.
5. Apply runtime and network boundaries for tenant workloads.
6. Record tenant id in audit and detection events.
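Step 4 can be illustrated for a shared database with a query helper that cannot run without tenant scope. The `documents` table and its columns are hypothetical, and SQLite stands in for whatever store a product actually uses.

```python
import sqlite3
from typing import List, Tuple


def fetch_documents(conn: sqlite3.Connection, tenant_id: str) -> List[Tuple[int, str]]:
    """Return only the rows that belong to the given tenant."""
    if not tenant_id:
        raise ValueError("tenant scope is required")
    cursor = conn.execute(
        # The tenant filter is part of the query, bound as a parameter.
        "SELECT id, title FROM documents WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    )
    return cursor.fetchall()
```

Routing all reads through helpers like this makes a missing tenant filter a code-review finding instead of a runtime leak; stores that support row-level security can enforce the same rule below the application.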
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Tenant id accepted from untrusted input | derive from trusted identity/session claims |
| Background jobs lose tenant context | require job envelope tenant binding |
| Shared database queries miss tenant filter | add query guards and tests |
| Control plane can mutate tenant resources globally | add guardrails and review flows |
## Related Capabilities
- Tenant isolation.
- Authorization and access control.
- Data protection and privacy.
- Observability, detection, and audit.
## Maturity
Draft. The capability is central and well described; individual products
need concrete verification patterns.
## Verification
- Tests confirm that cross-tenant access attempts are rejected for APIs,
jobs, storage, and admin paths.
- Tenant id is present in identity, authorization, and audit records.
- Data access includes tenant partition enforcement.
- Control-plane operations are tenant scoped unless explicitly platform
scoped.
## Research Basis
Seeded by tenant identity boundary, namespace-per-tenant,
cluster-per-tenant, cell-based architecture, isolated data plane,
tenant context propagation, and tenant data partitioning.
## References
- Initial exploration: Tenant isolation capability group.
- Initial exploration: Tenant isolation patterns.

@@ -0,0 +1,45 @@
# Pattern: Tenant Membership Boundary
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, product repos
Genesis family: Identity and access
## Problem
Multi-tenant systems become unsafe when global user identity is treated
as proof of membership, authority, or role inside a tenant.
## Context
Use this pattern for invitations, groups, tenant admin roles, product
accounts, background jobs, audit events, and tenant-scoped policy.
## Forces
- One human can belong to multiple tenants.
- Tenant role assignment may be delegated to tenant administrators.
- Platform roles must not leak into tenant roles.
- Offboarding must revoke tenant membership without necessarily deleting
global identity.
## Solution
Represent tenant membership as an explicit relationship separate from
global identity. Every tenant-scoped decision uses trusted identity plus
membership, role, tenant, and resource context.
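A minimal sketch of membership as its own relationship, assuming an in-memory directory. The class and method names are illustrative; a real implementation would persist memberships and audit every change.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Membership:
    user_id: str     # global identity
    tenant_id: str
    role: str        # role inside this tenant only


class MembershipDirectory:
    def __init__(self) -> None:
        self._by_key: Dict[Tuple[str, str], Membership] = {}

    def grant(self, membership: Membership) -> None:
        self._by_key[(membership.user_id, membership.tenant_id)] = membership

    def revoke(self, user_id: str, tenant_id: str) -> None:
        # Removes the tenant relationship; the global identity is untouched.
        self._by_key.pop((user_id, tenant_id), None)

    def role_in(self, user_id: str, tenant_id: str) -> Optional[str]:
        """Return the user's role in the tenant, or None when not a member."""
        membership = self._by_key.get((user_id, tenant_id))
        return membership.role if membership else None
```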
## Verification
- Removing tenant membership blocks tenant access while preserving the
global account where appropriate.
- Cross-tenant membership checks fail closed under test.
- Tenant admins can manage only delegated tenant-scoped membership.
- Audit records include global identity and tenant membership context.
## Related Patterns
- Tenant Context Propagation.
- Role Composition.
- Delegated Authorization.
- Tenant Isolation.

@@ -0,0 +1,44 @@
# Pattern: Time-boxed Privilege Elevation
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, ops-warden, flex-auth
Genesis family: Identity and access
## Problem
Permanent privileged access increases blast radius and makes it hard to
distinguish ordinary access from exceptional authority.
## Context
Use this pattern for operator access, tenant admin elevation, emergency
maintenance, agent tasks, production data access, and SSH certificate
issuance.
## Forces
- Operators need enough authority to fix incidents.
- Privilege should expire automatically.
- Elevation should include reason, scope, approval, and audit.
- Break-glass must remain separate from ordinary elevation.
## Solution
Grant privileged roles or credentials only for a bounded purpose, scope,
and TTL. The elevation request records actor, tenant, resource, reason,
approval, assurance, and expiration.
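The grant record and its expiry check can be sketched as follows. Field names are assumptions drawn from the list above; assurance evidence is omitted to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ElevationGrant:
    actor: str
    tenant: str
    resource: str
    role: str
    reason: str
    approved_by: str
    expires_at: datetime


def is_active(grant: ElevationGrant, actor: str, tenant: str,
              resource: str, now: datetime) -> bool:
    """A grant applies only to its own actor, tenant, and resource, and only before expiry."""
    return (
        grant.actor == actor
        and grant.tenant == tenant
        and grant.resource == resource
        and now < grant.expires_at
    )
```

Evaluating `is_active` at every enforcement point, instead of caching the elevated role in a session, is what makes expiry effective without manual cleanup.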
## Verification
- Elevated access expires without manual cleanup.
- The platform records who elevated, why, for what, and until when.
- Expired elevation cannot be reused by agents or background sessions.
- Emergency break-glass paths are distinguishable from normal elevation.
## Related Patterns
- Short-Lived SSH Certificates.
- Short-lived Credentials.
- Break-glass Access.
- Central Audit Ledger.

@@ -0,0 +1,45 @@
# Pattern: Token Revocation Sweep
Status: seed
Readiness target: RL3 production
Primary owners: NetKingdom, key-cape, flex-auth
Genesis family: Detection and response
## Problem
Credential compromise requires quickly invalidating related tokens,
sessions, keys, leases, and grants across multiple systems.
## Context
Use this pattern for user compromise, agent compromise, tenant incident,
OpenBao lease revocation, object-storage session exposure, and SSH
certificate containment.
## Forces
- Tokens may be issued by different systems.
- Some credentials expire naturally but still need immediate revocation.
- Revocation must target scope without disabling unrelated tenants.
- Audit and evidence must survive the sweep.
## Solution
Define revocation sweep procedures that identify affected actor, tenant,
credential class, session, lease, key, and token families, then revoke or
expire them through owning systems.
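The selection step of a sweep can be sketched as a scoped filter over a credential inventory. The inventory record and selector names are hypothetical; actual revocation would be carried out by each owning system.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class IssuedCredential:
    credential_id: str
    actor: str
    tenant: str
    credential_class: str   # for example "sts-session", "ssh-cert", "lease"
    issued_at: datetime


def select_for_revocation(inventory: Iterable[IssuedCredential],
                          actor: Optional[str] = None,
                          tenant: Optional[str] = None,
                          credential_class: Optional[str] = None,
                          issued_after: Optional[datetime] = None
                          ) -> List[IssuedCredential]:
    """Select credentials matching every given selector; refuse unscoped sweeps."""
    if actor is None and tenant is None and credential_class is None:
        raise ValueError("refusing an unscoped sweep")
    selected = []
    for cred in inventory:
        if actor is not None and cred.actor != actor:
            continue
        if tenant is not None and cred.tenant != tenant:
            continue
        if credential_class is not None and cred.credential_class != credential_class:
            continue
        if issued_after is not None and cred.issued_at < issued_after:
            continue
        selected.append(cred)
    return selected
```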
## Verification
- Sweep inputs can target actor, tenant, session, token class, and time
window.
- Revoked credentials fail at enforcement points.
- Sweep actions are logged with reason and operator.
- Follow-up rotation and user communication are tracked.
## Related Patterns
- Short-lived Credentials.
- Dynamic Secrets.
- STS Credential Vending.
- Incident Runbook Library.

@@ -0,0 +1,87 @@
# Pattern: Workload Identity
Status: draft
Readiness target: RL3 production
Primary owners: Railiance platform, NetKingdom
## Problem
Workloads need to authenticate to platform services without inheriting
human credentials, static shared secrets, or tenant-ambiguous service
accounts.
## Context
Use this pattern for Kubernetes workloads, platform services, agents,
and background jobs that need access to OpenBao, object storage,
databases, queues, or internal APIs.
## Forces
- Workloads need stable identity, but credentials should be short lived.
- Kubernetes service accounts are useful local identity evidence, but
consumers need a NetKingdom-level identity contract.
- Tenant context must be explicit for multi-tenant workloads.
- OpenBao can issue or broker secrets, but should trust verified
workload identity rather than static bootstrap credentials.
## Solution
Bind runtime workload identity to the NetKingdom IAM Profile. Validate
the Kubernetes service account, namespace, audience, issuer, tenant, and
deployment context, then exchange that evidence for scoped credentials,
tokens, or policy decisions.
## Implementation Sketch
1. Define workload identity subjects and tenant scope in IAM Profile
claims.
2. Use Kubernetes projected service account tokens or equivalent runtime
attestation.
3. Map service account, namespace, and deployment labels to protected
systems and tenant scope.
4. Let OpenBao Kubernetes auth or a credential broker validate runtime
evidence.
5. Issue scoped, short-lived credentials with audit correlation.
6. Deny requests when workload identity and tenant context disagree.
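Steps 2 to 6 can be sketched as checks over already-decoded token claims. The sketch assumes the token signature was verified elsewhere and relies on the Kubernetes convention that a service account subject reads `system:serviceaccount:<namespace>:<name>`. The namespace-to-tenant mapping and the function name are illustrative assumptions.

```python
from typing import Any, Dict, List


def validate_workload_claims(claims: Dict[str, Any], *, issuer: str, audience: str,
                             namespace: str, service_account: str,
                             requested_tenant: str,
                             tenant_by_namespace: Dict[str, str],
                             now_epoch: float) -> List[str]:
    """Return the list of failed checks; an empty list means the evidence is acceptable."""
    problems = []
    if claims.get("iss") != issuer:
        problems.append("issuer")
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if audience not in audiences:
        problems.append("audience")
    if claims.get("sub") != f"system:serviceaccount:{namespace}:{service_account}":
        problems.append("subject")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now_epoch:
        problems.append("expired")
    # Deny when the tenant bound to the namespace disagrees with the request.
    if tenant_by_namespace.get(namespace) != requested_tenant:
        problems.append("tenant")
    return problems
```

A broker would issue credentials only when the returned list is empty, and would log the failed checks with the correlation id otherwise.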
## Failure Modes
| Failure | Mitigation |
| --- | --- |
| Shared namespace account reused across services | require workload-specific service accounts |
| Tenant missing from identity evidence | fail closed and require explicit tenant binding |
| Long-lived mounted credentials | use short TTLs and rotation |
| OpenBao trusts weak Kubernetes metadata | validate issuer, audience, namespace, service account, and bound claims |
## Related Capabilities
- Identity and user management.
- Secrets, keys, and credentials.
- Tenant isolation.
- Observability, detection, and audit.
## Maturity
Draft. The pattern is well aligned with OpenBao and IAM Profile goals,
but it needs a concrete Railiance implementation path and verification
fixture before graduation.
## Verification
- Workload tokens have expected issuer, audience, subject, and tenant.
- OpenBao or broker policy rejects wrong namespace/service account
combinations.
- Credentials are short lived and auditable.
- Tenant mismatches fail closed under test.
## Research Basis
Seeded by the initial catalogue entries for service identities, workload
secret injection, tenant context propagation, and external secrets.
## References
- Initial exploration: Identity and user management.
- Initial exploration: Secrets and cryptography patterns.
- Railiance OpenBao platform secrets service.

@@ -0,0 +1,126 @@
# Security Architecture Pattern Catalog
Status: completed genesis pattern catalog for NK-WP-0010
Owner: NetKingdom architecture, maintained in infospace-bench
## Purpose
This catalog collects reusable security architecture patterns for
NetKingdom-enabled infrastructures. Patterns describe recurring
implementation shapes, tradeoffs, failure modes, and verification
signals.
Patterns are not tutorials, ADRs, or vendor docs. A tutorial shows how to
do a concrete implementation. An ADR records a decision. A vendor doc
describes a product. A pattern captures a reusable architecture idea and
how NetKingdom maps it into its platform.
## Pattern Template
```text
Problem
Context
Forces
Solution
Implementation sketch
Failure modes
Related capabilities
Maturity
Verification
References
```
## Initial Pattern Set
| Pattern | Capability group | Maturity | Canonical NetKingdom mapping |
| --- | --- | --- | --- |
| STS credential vending | Secrets, authorization, data access | reviewed | IAM Profile + flex-auth + backend STS, OpenBao broker/audit where useful |
| Workload identity | Identity and secrets | draft | Kubernetes service account identity, IAM Profile mapping, OpenBao Kubernetes auth |
| Secret zero avoidance | Secrets and bootstrap | reviewed | SOPS/age bootstrap, emergency bundle, OpenBao runtime handoff |
| Dynamic secrets | Secrets and credentials | draft | OpenBao dynamic credentials with leases and revocation |
| Short-lived SSH certificates | Privileged access | draft | ops-warden issues certificates, ops-bridge consumes and audits |
| Delegated authorization | Authorization | reviewed | flex-auth as canonical boundary, Topaz as first delegated PDP |
| Break-glass access | Recovery and incident response | reviewed | emergency bundle, limited principals, audit and post-event review |
| Tenant isolation | Tenant boundary | draft | tenant ids, tenant-scoped resources, control-plane guardrails |
| Central audit ledger | Detection and audit | seed | identity, flex-auth, Topaz, OpenBao, Kubernetes, workload correlation |
| Policy-as-code admission | Kubernetes hardening | seed | deployment gates and reviewable policy packages |
| Supply-chain provenance | Supply chain | seed | SBOM, signed images, SLSA-style provenance |
| Network default deny | Network security | seed | Kubernetes NetworkPolicy and explicit service communication |
| Object-level authorization check | Application and API security | draft | every resource access includes tenant/resource/action decision |
| Human/agent identity split | Agent access control | draft | agents have explicit identities, scopes, and audit trails |
| Tenant context propagation | Tenant isolation | draft | every request and background job carries tenant context |
## First-Class Pattern Artifacts
The genesis catalog now has one first-class artifact per exact pattern
name. The authoritative completion matrix is
`artifacts/generated/research-pattern-normalization.md`.
| Family | Exact pattern artifacts |
| --- | --- |
| Identity and access | Central Identity Provider; Identity Broker; Tenant Membership Boundary; Role Composition; Policy Decision Point / Policy Enforcement Point; Time-boxed Privilege Elevation; Break-glass Access; Human/Agent Identity Split |
| Tenant isolation | Namespace-per-Tenant; Cluster-per-Tenant; Cell-based Architecture; Shared Control Plane, Isolated Data Plane; Tenant Context Propagation; Tenant Data Partitioning |
| Kubernetes and platform | Secure Cluster Baseline; Policy-as-Code Admission Control; Pod Security Baseline/Restricted; Network Default Deny; Signed Image Admission; GitOps with Guardrails; Runtime Threat Detection |
| Secrets and cryptography | External Secrets Operator; Sealed Secret / Encrypted Git Secret; Short-lived Credentials; Key-per-Tenant; Certificate Automation |
| Application/API security | API Gateway as Security Boundary; Backend-for-Frontend; Object-Level Authorization Check; Schema-First API Security; Idempotent Command API; Secure File Upload Pipeline |
| Supply chain | Protected Main Branch; Dependency Update Bot; SBOM-per-Release; SLSA Build Provenance; Signed Container Images; Quarantined Build Runner |
| Detection and response | Security Event Taxonomy; Central Audit Ledger; Tenant Audit Log View; Incident Runbook Library; Kill Switch / Tenant Freeze; Token Revocation Sweep |
The NetKingdom umbrella artifacts created during NK-WP-0008 remain in
the infospace where they describe platform-specific compositions, such
as STS credential vending, workload identity, dynamic secrets, delegated
authorization, tenant isolation, policy-as-code admission, and
supply-chain provenance.
## Pattern Notes
### STS Credential Vending
Problem: applications need object-storage access without holding
long-lived root credentials.
Solution: use IAM Profile tokens to identify the actor, flex-auth to
authorize bucket/prefix/action/TTL, provider-native STS or temporary
credential APIs to mint credentials, and OpenBao for parent material,
leases, broker configuration, and audit where needed.
Verification: credentials include session token and expiration; deny
paths produce stable reason codes; consumers refresh before expiration.
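
The flow above can be sketched in a few lines. This is a minimal
illustration, not the NetKingdom implementation: the registry, the TTL
ceiling, the reason-code strings, and every function name are
assumptions, and token validation is assumed to have happened before
`vend` is called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

MAX_TTL_SECONDS = 3600  # hypothetical platform ceiling, not from the source

@dataclass(frozen=True)
class AccessRequest:
    subject: str          # actor taken from an already validated IAM Profile token
    tenant: str
    bucket: str
    prefix: str
    actions: frozenset
    ttl_seconds: int

@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str           # stable reason code, also set on allow for uniform audit

# Hypothetical registry: (tenant, bucket, prefix) -> permitted actions.
REGISTERED = {
    ("tenant-a", "artifacts", "reports/"): frozenset({"s3:GetObject", "s3:PutObject"}),
}

def authorize(req):
    """Stand-in for the authorization decision; returns a Decision."""
    permitted = REGISTERED.get((req.tenant, req.bucket, req.prefix))
    if permitted is None:
        return Decision(False, "resource_not_registered")
    if not req.actions <= permitted:
        return Decision(False, "action_not_permitted")
    if req.ttl_seconds > MAX_TTL_SECONDS:
        return Decision(False, "ttl_exceeds_maximum")
    return Decision(True, "ok")

def vend(req, now=None):
    """Mint placeholder temporary credentials only after an allow decision."""
    decision = authorize(req)
    if not decision.allowed:
        # Fail closed: no fallback to parent or root credentials.
        return {"decision": decision, "credentials": None}
    now = now or datetime.now(timezone.utc)
    credentials = {
        "access_key_id": "EXAMPLE-TEMP-KEY",        # placeholder values only
        "secret_access_key": "example-secret",
        "session_token": "example-session-token",
        "expiration": now + timedelta(seconds=req.ttl_seconds),
    }
    return {"decision": decision, "credentials": credentials}
```

In a real deployment the credential dictionary would come from the
backend's temporary-credential API; the sketch only shows the ordering
of decision and exchange and the fail-closed deny path.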
### Secret Zero Avoidance
Problem: runtime secret managers need initial trust without creating a
worse unmanaged secret.
Solution: use SOPS/age and emergency bundles for bootstrap and recovery,
then hand runtime workload secret authority to OpenBao once initialized,
audited, backed up, and governed.
Verification: OpenBao root and recovery material are treated as
platform-root break-glass material; workloads do not consume bootstrap
root material.
### Delegated Authorization
Problem: identity providers and application code should not become the
canonical home for every resource-specific authorization decision.
Solution: flex-auth owns the canonical request/decision envelope,
resource/action vocabulary, CARING descriptors, audit/explain records,
and backend adapter boundary. Topaz is the first delegated PDP runtime.
Verification: policy packages distinguish `tenant:platform` from tenant
packages; decision envelopes include tenant, protected-system, resource,
action, assurance, obligations, deny reasons, and audit correlation.
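
One way to picture the decision envelope listed in the verification
note is a small immutable record. The field names follow the list
above, but the types, defaults, and validation rules are assumptions
for illustration and do not describe the actual flex-auth schema;
`tenant:acme` in the usage note is a made-up tenant id.

```python
import uuid
from dataclasses import asdict, dataclass, field

PLATFORM_TENANT = "tenant:platform"

@dataclass(frozen=True)
class DecisionEnvelope:
    tenant: str
    protected_system: str
    resource: str
    action: str
    allowed: bool
    assurance: str = "unspecified"
    obligations: tuple = ()
    deny_reason: str = ""
    # Correlation id used to join identity, policy, and workload audit events.
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def __post_init__(self):
        # Assumed invariant: deny decisions always explain themselves.
        if not self.allowed and not self.deny_reason:
            raise ValueError("a deny decision must carry a reason code")
        if self.allowed and self.deny_reason:
            raise ValueError("an allow decision must not carry a deny reason")

    @property
    def is_platform_scope(self):
        """True when the decision belongs to the platform tenant."""
        return self.tenant == PLATFORM_TENANT

    def audit_record(self):
        """Flatten the envelope into a plain dict for an audit sink."""
        return asdict(self)
```

For example, `DecisionEnvelope(tenant="tenant:acme",
protected_system="object-storage", resource="artifacts/reports/",
action="read", allowed=False, deny_reason="wrong_tenant")` yields an
audit record that already carries the reason code and correlation id.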
### Break-Glass Access
Problem: operators need recovery access when normal identity, policy, or
cluster services are unavailable.
Solution: define a minimal emergency path with scoped credentials,
separate storage, event logging where possible, and mandatory post-event
review.
Verification: break-glass is tested in drills and never grants ordinary
tenant administrators platform-root authority.

@@ -0,0 +1,91 @@
# Security Capability Catalog
Status: initial catalog extracted from the genesis exploration
Owner: NetKingdom architecture, maintained in infospace-bench
## Purpose
This catalog names the security outcomes a NetKingdom-enabled platform
must provide before production use. Capabilities describe what must
exist; patterns describe how the capability may be implemented.
The catalog is intentionally platform-oriented. It separates platform
responsibility from product/application responsibility and tenant
responsibility so security does not become scattered repo-local lore.
## Capability Template
Each capability should eventually use this shape:
```text
Intent
Scope
Threats addressed
Required controls
Implementation options
Platform responsibility
Product responsibility
Tenant responsibility
Readiness criteria
Evidence
Related patterns
Related standards
```
## Capability Groups
| Group | Intent | Initial readiness focus |
| --- | --- | --- |
| Security governance and production readiness | Make security decisions, risks, exceptions, and promotion gates explicit | ADRs, risk register, threat models, readiness gates |
| Identity and user management | Establish trusted human, service, workload, and agent identities | IAM Profile, key-cape, Keycloak, MFA, lifecycle management |
| Authorization and access control | Decide what actors may do to scoped resources | flex-auth, CARING descriptors, Topaz, tenant-aware decisions |
| Tenant isolation | Keep tenant identity, runtime, data, and control-plane boundaries explicit | tenant context propagation, data partitioning, control-plane guardrails |
| Secrets, keys, and credentials | Prevent scattered static credentials and unsafe bootstrap paths | SOPS/age bootstrap, OpenBao runtime authority, rotation, leases |
| Network and edge security | Control public entry points and lateral movement | ingress, TLS, default-deny network policy, egress control |
| Platform and Kubernetes hardening | Reduce default platform attack surface | RBAC, pod security, admission control, image provenance |
| Application and API security | Make applications safe consumers of platform security services | OIDC integration, object-level authorization, API schemas |
| Data protection and privacy | Protect sensitive and tenant data over its lifecycle | classification, encryption, retention, deletion, auditability |
| Software supply chain security | Protect source, build, dependency, and artifact integrity | SBOM, signed images, provenance, dependency review |
| Observability, detection, and audit | Make security-relevant activity visible and reviewable | central audit, identity logs, policy logs, OpenBao audit, tenant audit |
| Incident response and recovery | Contain incidents and recover platform and tenant service safely | runbooks, break-glass, restore drills, post-incident review |
## Production Readiness Baseline v0.1
The first NetKingdom production readiness baseline contains these
capabilities:
1. Central identity provider.
2. MFA for privileged access.
3. Tenant identity and isolation model.
4. Kubernetes secure baseline.
5. Secrets management and OpenBao runtime handoff.
6. Network default-deny and ingress control.
7. API authentication and object-level authorization.
8. Policy-as-code admission control.
9. Container and dependency vulnerability management.
10. Central security logging and audit trail.
11. Backup and restore verification.
12. Incident response runbooks.
## Standards Mapping Seed
| Standard or framework | Use in this infospace |
| --- | --- |
| NIST CSF 2.0 | Governance-level capability grouping: Govern, Identify, Protect, Detect, Respond, Recover |
| CIS Controls v8 | Practical control coverage and data protection mapping |
| OWASP ASVS | Verifiable application security requirements |
| OWASP API Security | API authorization and object-level access risk framing |
| SLSA | Build provenance and supply-chain integrity |
| OpenSSF Scorecard | Open-source dependency and project-risk signals |
| CNCF Cloud Native Security | Kubernetes and cloud-native platform security framing |
| NSA/CISA Kubernetes Hardening | Kubernetes hardening checklist and threat focus |
## NetKingdom-Specific Notes
- IAM Profile is the canonical identity contract.
- flex-auth is the canonical authorization decision boundary.
- OpenBao is the runtime secret authority, not an identity provider or
policy decision point.
- Railiance owns deployment layers and platform services.
- `infospace-bench` owns this catalog as a concrete infospace artifact,
not as the canonical deployment source.

@@ -0,0 +1,51 @@
# Security Readiness Levels
Status: initial readiness model extracted from the genesis exploration
## Purpose
Readiness levels make the catalog operational. They define how strong a
capability or pattern must be before it is trusted for a given deployment
stage.
## Levels
| Level | Name | Suitable for | Minimum expectations |
| --- | --- | --- | --- |
| RL0 | Experimental | local prototypes | no production data, no external users, no real secrets in code |
| RL1 | Internal alpha | internal use | central login preferred, basic access control, secrets not committed, known risks documented |
| RL2 | Private beta | selected external users | tenant model defined, isolation tested, backups configured, security logging centralized |
| RL3 | Production | paid customers | MFA for privileged users, least privilege, secrets rotation, auditable admin actions, restore tested |
| RL4 | Regulated production | high-trust or regulated customers | formal risk management, customer audit logs, artifact provenance, stronger data residency and deletion controls |
## Pattern Maturity
Pattern maturity is related to readiness, but not identical:
```text
seed -> draft -> reviewed -> canonical -> deprecated
```
- `seed`: captured from exploration, source notes, or external reading.
- `draft`: written in the pattern template with initial mapping.
- `reviewed`: checked for threat model, ownership, and verification.
- `canonical`: accepted as the recommended NetKingdom pattern.
- `deprecated`: retained for history but no longer recommended.
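
A status field is easier to keep honest when transitions are checked
mechanically. The sketch below assumes a strict one-step reading of the
arrow diagram; the source does not say whether stages may be skipped or
whether a pattern may be deprecated from an earlier stage, so treat
that rule and both function names as assumptions.

```python
LIFECYCLE = ("seed", "draft", "reviewed", "canonical", "deprecated")

def next_stage(current):
    """Return the stage that follows `current`, or None at the end."""
    index = LIFECYCLE.index(current)  # raises ValueError for unknown stages
    return LIFECYCLE[index + 1] if index + 1 < len(LIFECYCLE) else None

def is_valid_transition(current, target):
    """Accept only a single forward step along the lifecycle."""
    return target in LIFECYCLE and next_stage(current) == target
```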
## Evidence By Level
| Evidence | RL1 | RL2 | RL3 | RL4 |
| --- | --- | --- | --- | --- |
| owner named | required | required | required | required |
| threat notes | useful | required | required | required |
| verification checklist | useful | required | required | required |
| operational runbook | optional | useful | required | required |
| audit hooks | optional | useful | required | required |
| restore or failure drill | optional | useful | required | required |
| standards mapping | optional | useful | useful | required |
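
The table can be turned into a lookup so a review can list what is
still missing at a given level. The data below is copied from the
table; the function names and the idea of checking evidence by name are
illustrative assumptions.

```python
LEVELS = ("RL1", "RL2", "RL3", "RL4")

# Rows mirror the "Evidence By Level" table, one column per level.
EVIDENCE = {
    "owner named": ("required", "required", "required", "required"),
    "threat notes": ("useful", "required", "required", "required"),
    "verification checklist": ("useful", "required", "required", "required"),
    "operational runbook": ("optional", "useful", "required", "required"),
    "audit hooks": ("optional", "useful", "required", "required"),
    "restore or failure drill": ("optional", "useful", "required", "required"),
    "standards mapping": ("optional", "useful", "useful", "required"),
}

def required_evidence(level):
    """List evidence items marked 'required' at the given readiness level."""
    column = LEVELS.index(level)  # raises ValueError for levels outside the table
    return [name for name, row in EVIDENCE.items() if row[column] == "required"]

def missing_evidence(level, provided):
    """List required evidence items that are not in `provided`."""
    return [name for name in required_evidence(level) if name not in provided]
```

RL0 is not in the table, so the sketch rejects it rather than guessing
at its requirements.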
## NetKingdom Default
Patterns that touch identity, authorization, secrets, tenant isolation,
or privileged access should not be marked canonical below RL3 unless
their production limitations are explicitly documented.
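
As a rough guard, the default can be expressed as a single predicate.
This sketch encodes only the rule in this section, not the wider
canonical criteria, and the area labels and function name are
assumptions.

```python
READINESS = ("RL0", "RL1", "RL2", "RL3", "RL4")
SENSITIVE_AREAS = {
    "identity",
    "authorization",
    "secrets",
    "tenant isolation",
    "privileged access",
}

def may_mark_canonical(areas, readiness_level, limitations_documented=False):
    """Apply the NetKingdom default for sensitive patterns below RL3."""
    touches_sensitive = bool(SENSITIVE_AREAS & set(areas))
    below_rl3 = READINESS.index(readiness_level) < READINESS.index("RL3")
    if touches_sensitive and below_rl3:
        # Allowed only when production limitations are explicitly documented.
        return limitations_documented
    return True
```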

@@ -0,0 +1,120 @@
# Pattern Admission And Review Criteria
Status: review checklist refreshed for NK-WP-0010
## Purpose
This checklist controls how new patterns enter and graduate inside the
security architecture pattern infospace.
NK-WP-0010 admitted every exact pattern named in the genesis catalogue as
`seed` or stronger. The next reviews should focus on evidence quality and
maturity promotion rather than admission.
## Lifecycle
```text
seed -> draft -> reviewed -> canonical -> deprecated
```
## Admission Criteria
### Seed
A pattern can enter as `seed` when:
- it describes a recurring security architecture problem;
- it has a source, observation, workplan, incident, or external reference;
- it is clearly not just a one-off implementation note.
### Draft
A pattern can move to `draft` when it has:
- problem;
- context;
- forces and tradeoffs;
- solution sketch;
- known failure modes;
- related capabilities;
- initial NetKingdom or ecosystem mapping.
### Reviewed
A pattern can move to `reviewed` when it has:
- threat-model clarity;
- vendor-neutral framing;
- at least one open-source or self-hosted implementation option when
possible;
- commercial/provider options where relevant;
- operability notes;
- audit hooks;
- failure-mode behavior;
- readiness-level fit;
- owning repo or component named;
- evidence needed for verification.
### Canonical
A pattern can move to `canonical` when:
- NetKingdom architecture accepts it as the recommended pattern;
- implementation anchors exist or are intentionally scheduled;
- one or more workplans, ADRs, tutorials, or runbooks point to it;
- the pattern has clear prohibited alternatives or anti-patterns;
- verification evidence has been captured at the intended readiness
level.
### Deprecated
A pattern moves to `deprecated` when:
- it is replaced by a stronger pattern;
- implementation experience shows the pattern is unsafe or too costly;
- platform direction changes;
- vendor or technology assumptions no longer hold.
Deprecated patterns remain visible with their reason and replacement.
## Review Checklist
| Criterion | Question |
| --- | --- |
| Vendor neutrality | Can the pattern be understood without committing to a single product? |
| Threat model | Does it name the realistic failures or attacks it reduces? |
| Ownership | Are platform, product, tenant, and provider responsibilities clear? |
| Operability | Can an operator deploy, monitor, rotate, and recover it? |
| Auditability | Are security-relevant events and correlation ids defined? |
| Failure behavior | Does it fail closed or document controlled exceptions? |
| Readiness fit | Is RL0-RL4 applicability explicit? |
| Evidence | What proves implementation is correct? |
| Anti-patterns | What common unsafe shortcuts are prohibited? |
| Tutorial handoff | Does NK-WP-0009 need a tutorial for it? |
## Current Canonical Candidates
- STS credential vending.
- Secret zero avoidance.
- Delegated authorization.
- Break-glass access.
- Short-lived credentials.
- Policy Decision Point / Policy Enforcement Point.
These are candidates, not automatically canonical. Each still needs the
checklist evidence before the infospace marks it canonical.
## NK-WP-0010 Review Backlog
Use `artifacts/generated/research-pattern-normalization.md` as the
backlog for maturity promotion. Strong first review candidates are:
- Central Identity Provider and Identity Broker, because they shape
key-cape/Keycloak integration.
- Tenant Membership Boundary and Tenant Context Propagation, because
they protect multi-tenant correctness.
- Policy-as-Code Admission Control, Pod Security Baseline/Restricted,
and Signed Image Admission, because they form the platform deployment
gate.
- Security Event Taxonomy and Tenant Audit Log View, because they define
what can become tenant-visible evidence.

@@ -0,0 +1,91 @@
# Research Pattern Normalization
Status: complete coverage map for NK-WP-0010
## Purpose
The genesis exploration contains a broad security architecture pattern
catalogue. NK-WP-0010 promotes every exact pattern name from that
catalogue into a first-class infospace artifact while preserving the
earlier NetKingdom-specific umbrella patterns created during NK-WP-0008.
## Completion Rule
- Every exact pattern name in `genesis/InitialExploration.md` has a
discoverable `artifacts/entities/pattern-*.md` artifact.
- Umbrella NetKingdom patterns remain when they describe a canonical
platform shape that spans multiple exact genesis patterns.
- The generated index and ownership map link both exact and umbrella
artifacts, but the exact genesis list is the completion baseline for
this workplan.
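
The first rule lends itself to a mechanical check. The slug rule below
(lowercase, runs of non-alphanumeric characters collapsed to a single
hyphen) is inferred from the artifact paths in the completion matrix,
not stated by the source, and both function names are hypothetical.

```python
import re

def artifact_path(pattern_name):
    """Derive the expected artifact path for an exact pattern name."""
    slug = re.sub(r"[^a-z0-9]+", "-", pattern_name.lower()).strip("-")
    return f"artifacts/entities/pattern-{slug}.md"

def missing_artifacts(pattern_names, existing_paths):
    """Return pattern names whose expected artifact is not present."""
    existing = set(existing_paths)
    return [name for name in pattern_names if artifact_path(name) not in existing]
```

Running `missing_artifacts` over the genesis pattern list and the
repository file listing should return an empty list when the completion
rule holds.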
## Completion Matrix
| Family | Exact genesis pattern | Artifact | Current status |
| --- | --- | --- | --- |
| Identity and access | Central Identity Provider | `artifacts/entities/pattern-central-identity-provider.md` | seed |
| Identity and access | Identity Broker | `artifacts/entities/pattern-identity-broker.md` | seed |
| Identity and access | Tenant Membership Boundary | `artifacts/entities/pattern-tenant-membership-boundary.md` | seed |
| Identity and access | Role Composition | `artifacts/entities/pattern-role-composition.md` | seed |
| Identity and access | Policy Decision Point / Policy Enforcement Point | `artifacts/entities/pattern-policy-decision-point-policy-enforcement-point.md` | reviewed |
| Identity and access | Time-boxed Privilege Elevation | `artifacts/entities/pattern-time-boxed-privilege-elevation.md` | seed |
| Identity and access | Break-glass Access | `artifacts/entities/pattern-break-glass-access.md` | reviewed |
| Identity and access | Human/Agent Identity Split | `artifacts/entities/pattern-human-agent-identity-split.md` | draft |
| Tenant isolation | Namespace-per-Tenant | `artifacts/entities/pattern-namespace-per-tenant.md` | seed |
| Tenant isolation | Cluster-per-Tenant | `artifacts/entities/pattern-cluster-per-tenant.md` | seed |
| Tenant isolation | Cell-based Architecture | `artifacts/entities/pattern-cell-based-architecture.md` | seed |
| Tenant isolation | Shared Control Plane, Isolated Data Plane | `artifacts/entities/pattern-shared-control-plane-isolated-data-plane.md` | seed |
| Tenant isolation | Tenant Context Propagation | `artifacts/entities/pattern-tenant-context-propagation.md` | draft |
| Tenant isolation | Tenant Data Partitioning | `artifacts/entities/pattern-tenant-data-partitioning.md` | seed |
| Kubernetes and platform | Secure Cluster Baseline | `artifacts/entities/pattern-secure-cluster-baseline.md` | seed |
| Kubernetes and platform | Policy-as-Code Admission Control | `artifacts/entities/pattern-policy-as-code-admission-control.md` | seed |
| Kubernetes and platform | Pod Security Baseline/Restricted | `artifacts/entities/pattern-pod-security-baseline-restricted.md` | seed |
| Kubernetes and platform | Network Default Deny | `artifacts/entities/pattern-network-default-deny.md` | seed |
| Kubernetes and platform | Signed Image Admission | `artifacts/entities/pattern-signed-image-admission.md` | seed |
| Kubernetes and platform | GitOps with Guardrails | `artifacts/entities/pattern-gitops-with-guardrails.md` | seed |
| Kubernetes and platform | Runtime Threat Detection | `artifacts/entities/pattern-runtime-threat-detection.md` | seed |
| Secrets and cryptography | External Secrets Operator | `artifacts/entities/pattern-external-secrets-operator.md` | seed |
| Secrets and cryptography | Sealed Secret / Encrypted Git Secret | `artifacts/entities/pattern-sealed-secret-encrypted-git-secret.md` | seed |
| Secrets and cryptography | Short-lived Credentials | `artifacts/entities/pattern-short-lived-credentials.md` | reviewed |
| Secrets and cryptography | Key-per-Tenant | `artifacts/entities/pattern-key-per-tenant.md` | seed |
| Secrets and cryptography | Certificate Automation | `artifacts/entities/pattern-certificate-automation.md` | seed |
| Application/API security | API Gateway as Security Boundary | `artifacts/entities/pattern-api-gateway-as-security-boundary.md` | seed |
| Application/API security | Backend-for-Frontend | `artifacts/entities/pattern-backend-for-frontend.md` | seed |
| Application/API security | Object-Level Authorization Check | `artifacts/entities/pattern-object-level-authorization-check.md` | draft |
| Application/API security | Schema-First API Security | `artifacts/entities/pattern-schema-first-api-security.md` | seed |
| Application/API security | Idempotent Command API | `artifacts/entities/pattern-idempotent-command-api.md` | seed |
| Application/API security | Secure File Upload Pipeline | `artifacts/entities/pattern-secure-file-upload-pipeline.md` | seed |
| Supply chain | Protected Main Branch | `artifacts/entities/pattern-protected-main-branch.md` | seed |
| Supply chain | Dependency Update Bot | `artifacts/entities/pattern-dependency-update-bot.md` | seed |
| Supply chain | SBOM-per-Release | `artifacts/entities/pattern-sbom-per-release.md` | seed |
| Supply chain | SLSA Build Provenance | `artifacts/entities/pattern-slsa-build-provenance.md` | seed |
| Supply chain | Signed Container Images | `artifacts/entities/pattern-signed-container-images.md` | seed |
| Supply chain | Quarantined Build Runner | `artifacts/entities/pattern-quarantined-build-runner.md` | seed |
| Detection and response | Security Event Taxonomy | `artifacts/entities/pattern-security-event-taxonomy.md` | seed |
| Detection and response | Central Audit Ledger | `artifacts/entities/pattern-central-audit-ledger.md` | seed |
| Detection and response | Tenant Audit Log View | `artifacts/entities/pattern-tenant-audit-log-view.md` | seed |
| Detection and response | Incident Runbook Library | `artifacts/entities/pattern-incident-runbook-library.md` | seed |
| Detection and response | Kill Switch / Tenant Freeze | `artifacts/entities/pattern-kill-switch-tenant-freeze.md` | seed |
| Detection and response | Token Revocation Sweep | `artifacts/entities/pattern-token-revocation-sweep.md` | seed |
## NetKingdom Umbrella Patterns
These artifacts remain first-class because they capture NetKingdom
platform-specific architecture that spans multiple exact genesis patterns:
| Umbrella pattern | Artifact | Covers |
| --- | --- | --- |
| STS credential vending | `artifacts/entities/pattern-sts-credential-vending.md` | short-lived object-storage credentials, delegated authorization, OpenBao broker/audit support |
| Workload identity | `artifacts/entities/pattern-workload-identity.md` | service identities, workload secret injection, tenant context |
| Secret zero avoidance | `artifacts/entities/pattern-secret-zero-avoidance.md` | encrypted Git secrets, bootstrap, break-glass, OpenBao handoff |
| Dynamic secrets | `artifacts/entities/pattern-dynamic-secrets.md` | short-lived credentials, leases, rotation, revocation |
| Short-lived SSH certificates | `artifacts/entities/pattern-short-lived-ssh-certificates.md` | time-boxed privilege, agent/admin access, SSH audit |
| Delegated authorization | `artifacts/entities/pattern-delegated-authorization.md` | PDP/PEP, flex-auth, Topaz, decision envelopes |
| Tenant isolation | `artifacts/entities/pattern-tenant-isolation.md` | namespace, cluster, cell, data, and control-plane isolation |
| Policy-as-code admission | `artifacts/entities/pattern-policy-as-code-admission.md` | admission control, pod security, image trust, GitOps guardrails |
| Supply-chain provenance | `artifacts/entities/pattern-supply-chain-provenance.md` | SBOMs, SLSA, signed images, protected branches, trusted runners |
## Completion Result
No exact genesis pattern remains unaccounted for. Future work should
improve maturity and evidence quality, not create missing seed
placeholders.

@@ -0,0 +1,72 @@
# Security Pattern Index And Maturity Matrix
Status: generated index refreshed for NK-WP-0010
## Capability Index
| Capability group | Initial status | Primary owner | Gaps |
| --- | --- | --- | --- |
| Security governance and production readiness | draft | NetKingdom, State Hub | risk register and readiness gates need formalization |
| Identity and user management | partial | NetKingdom, key-cape, Keycloak | lifecycle and federation evidence incomplete |
| Authorization and access control | partial | flex-auth, NetKingdom | policy package lifecycle and Topaz runtime checks need implementation |
| Tenant isolation | draft | NetKingdom, Railiance, product repos | isolation patterns now exist; product conformance tests remain |
| Secrets, keys, and credentials | partial | NetKingdom, Railiance platform, OpenBao | OpenBao drills and certificate lifecycle evidence pending |
| Network and edge security | seed | Railiance, product repos | default-deny and gateway patterns need concrete manifests |
| Platform and Kubernetes hardening | seed | Railiance | baseline, pod-security, and admission patterns need implementation evidence |
| Application and API security | seed | product repos, NetKingdom | object-level, schema, BFF, command, and upload patterns need product adoption |
| Data protection and privacy | seed | product repos, platform | tenant data partitioning and key-per-tenant need storage-specific decisions |
| Software supply chain security | seed | Railiance, artifact-store, product repos | SBOM/provenance/signature pipeline needs implementation anchors |
| Observability, detection, and audit | draft | Railiance, NetKingdom, State Hub | event taxonomy and tenant audit projection need storage decision |
| Incident response and recovery | draft | Railiance, NetKingdom | runbooks, tenant freeze, and revocation sweep need drills |
## Genesis Pattern Coverage
| Family | Exact patterns | Artifact coverage | Maturity spread |
| --- | ---: | --- | --- |
| Identity and access | 8 | complete | seed to reviewed |
| Tenant isolation | 6 | complete | seed to draft |
| Kubernetes and platform | 7 | complete | seed |
| Secrets and cryptography | 5 | complete | seed to reviewed |
| Application/API security | 6 | complete | seed to draft |
| Supply chain | 6 | complete | seed |
| Detection and response | 6 | complete | seed |
The authoritative per-pattern completion matrix is
`artifacts/generated/research-pattern-normalization.md`.
## NetKingdom Umbrella Pattern Index
| Pattern | Maturity | Owner | Implementation links |
| --- | --- | --- | --- |
| STS credential vending | reviewed | NetKingdom, flex-auth, Railiance platform | NK-WP-0007, ADR-0008, artifact-store follow-up |
| Workload identity | draft | Railiance platform, NetKingdom | IAM Profile, OpenBao Kubernetes auth |
| Secret zero avoidance | reviewed | NetKingdom, Railiance platform | NK-WP-0004, NK-WP-0005, Railiance OpenBao |
| Dynamic secrets | draft | OpenBao, Railiance platform | OpenBao leases and revocation |
| Short-lived SSH certificates | draft | ops-warden, ops-bridge, NetKingdom | SSH certificate issuance and audit |
| Delegated authorization | reviewed | flex-auth, NetKingdom | flex-auth, Topaz, CARING descriptors |
| Tenant isolation | draft | NetKingdom, Railiance platform, product repos | namespace, cluster, cell, data, and control-plane isolation |
| Policy-as-code admission | seed | Railiance platform, NetKingdom | admission policy, pod security, image trust |
| Supply-chain provenance | seed | Railiance platform, artifact-store, product repos | SBOM, signatures, SLSA provenance |
## Tutorial Handoff To NK-WP-0009
High-value tutorial candidates after NK-WP-0010 completion:
1. Vend temporary S3 credentials from a NetKingdom identity token.
2. Deploy OpenBao as the canonical Railiance platform secrets manager.
3. Use short-lived SSH credentials for admins, agents, and automations.
4. Add a protected system to flex-auth using PDP/PEP boundaries.
5. Apply secret-zero avoidance from bootstrap to runtime OpenBao.
6. Build tenant audit visibility from the central audit ledger.
7. Add policy-as-code admission with pod-security and signed-image gates.
8. Produce SBOM, signature, and SLSA-style provenance for a release.
## Open Decisions And Gaps
- Decide durable audit ledger storage and tenant-visible audit boundary.
- Decide which seed patterns should graduate to reviewed before tutorial
writing starts.
- Decide how machine-readable capability and pattern status should be
represented.
- Decide which pattern families need product-specific conformance tests
before being marked canonical.

@@ -0,0 +1,49 @@
# STS Credential Vending Extraction
Status: generated extraction from NetKingdom and Railiance source docs
## Extracted Claims
1. Object-storage credential vending must be identity-backed and
policy-approved before backend exchange.
2. flex-auth is the canonical authorization decision point for bucket,
prefix, action, TTL, tenant, and assurance decisions.
3. Provider-native temporary credentials are preferred when mature.
4. OpenBao may protect parent credentials, broker configuration, leases,
and audit records, but must not decide object-storage authorization.
5. Consumers must support session tokens and expiration-aware refresh.
6. Long-lived static credentials are transitional only.
7. Tenant administrators must not receive platform-root object-store or
OpenBao authority.
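
Claim 5 is the one consumers most often get wrong, so a small sketch
may help. It assumes a hypothetical `fetch` callable that returns a
dictionary with `session_token` and a timezone-aware `expiration`; the
class name and the five-minute refresh margin are illustrative choices,
not values from the source.

```python
from datetime import datetime, timedelta, timezone

class RefreshingCredentials:
    """Cache temporary credentials and refetch shortly before expiry."""

    def __init__(self, fetch, refresh_margin=timedelta(minutes=5)):
        self._fetch = fetch            # hypothetical call to the vending service
        self._margin = refresh_margin
        self._current = None

    def get(self, now=None):
        now = now or datetime.now(timezone.utc)
        stale = (
            self._current is None
            or now >= self._current["expiration"] - self._margin
        )
        if stale:
            fresh = self._fetch()
            # Reject access-key/secret-key-only responses.
            if not fresh.get("session_token"):
                raise ValueError("temporary credentials must include a session token")
            self._current = fresh
        return self._current
```

Refreshing ahead of the expiration time, rather than on the first
failed request, avoids a window where in-flight operations use expired
credentials.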
## Extracted Anti-Patterns
- object-store root credentials in application pods;
- access-key/secret-key-only production consumers with no session token;
- application repos as canonical bucket policy owners;
- OpenBao used as a substitute for flex-auth decisions;
- local-identity tokens accepted by production object-storage backends;
- fallback to root credentials when STS or flex-auth is unavailable.
## Extracted Evidence Needs
- IAM Profile issuer/audience validation.
- flex-auth decision record with stable reason codes.
- backend credential response with session token and expiration.
- OpenBao audit event where parent material or broker config is used.
- artifact-store or consumer refresh test.
- denial tests for wrong tenant, unregistered prefix, and excessive TTL.
## Candidate Tutorial
Title: Vend temporary S3 credentials from a NetKingdom identity token.
Tutorial path:
1. issue or obtain an IAM Profile token;
2. register protected system, bucket, and prefix in flex-auth;
3. request credentials from the vending service;
4. observe backend temporary credential response;
5. configure `artifact-store` or an SDK with the session token;
6. verify refresh and audit correlation;
7. exercise deny cases.
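
Step 5 can be sketched as a mapping from the vending response to the
environment variables that AWS-compatible SDKs conventionally read. The
response field names (`access_key_id` and so on) are assumptions about
the vending service, which the source does not specify.

```python
def to_aws_env(creds):
    """Map a temporary-credential response to standard AWS SDK variables."""
    required = ("access_key_id", "secret_access_key", "session_token")
    missing = [key for key in required if not creds.get(key)]
    if missing:
        # A response without a session token is not a temporary credential.
        raise ValueError(f"credential response is missing: {', '.join(missing)}")
    return {
        "AWS_ACCESS_KEY_ID": creds["access_key_id"],
        "AWS_SECRET_ACCESS_KEY": creds["secret_access_key"],
        "AWS_SESSION_TOKEN": creds["session_token"],
    }
```

A consumer process could be started with
`env={**os.environ, **to_aws_env(creds)}`, and restarted or
reconfigured when the credentials are refreshed.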

File diff suppressed because it is too large

@@ -0,0 +1,78 @@
# NetKingdom Security Pattern Ownership Map
Status: ownership mapping refreshed for NK-WP-0010
## Purpose
This map connects capabilities and patterns to the repos, components,
workplans, and responsibilities that own implementation or governance.
## Responsibility Split
| Area | Platform responsibility | Product/application responsibility | Tenant responsibility |
| --- | --- | --- | --- |
| Identity | NetKingdom IAM Profile, key-cape/Keycloak integration | consume OIDC/IAM Profile correctly | manage tenant users and groups where delegated |
| Authorization | flex-auth model, CARING descriptors, Topaz boundary | enforce decisions locally | request/administer tenant-scoped access only |
| Secrets | SOPS/age bootstrap, OpenBao runtime authority | consume scoped secrets and rotate safely | manage tenant-scoped secrets where allowed |
| Object storage | STS vending policy and backend broker boundary | support temporary credentials and refresh | request credentials for registered resources |
| SSH/admin access | ops-warden/ops-bridge short-lived access paths | avoid direct unmanaged admin paths | no platform-root access |
| Audit | central audit taxonomy and durable sinks | emit workload events and correlation ids | review tenant-visible events where offered |
| Deployment | Railiance stack layers and readiness checks | app manifests and release posture | tenant configuration within guardrails |
## Component Mapping
| Component or repo | Role in this infospace | Related patterns |
| --- | --- | --- |
| `net-kingdom` | canonical security architecture, IAM Profile, workplans, ADRs | IAM Profile, recursive platform identity, STS vending, secret-zero avoidance |
| `key-cape` | lightweight IAM Profile implementation | central identity provider, lightweight identity, MFA integration |
| Keycloak | expanded IAM implementation and federation | central identity provider, identity broker |
| Authelia | lightweight SSO backing component | central identity provider, SSO boundary |
| LLDAP | lightweight directory backing component | user lifecycle, group mapping |
| privacyIDEA | MFA and token lifecycle | privileged MFA, break-glass controls |
| `flex-auth` | authorization control plane | delegated authorization, policy-as-code, decision envelopes |
| Topaz | delegated PDP runtime | policy decision runtime |
| OpenBao | runtime secret authority | dynamic secrets, secret zero avoidance, STS broker/audit support |
| `railiance-platform` | platform service deployment | OpenBao, object storage, databases, platform secret delivery |
| Ceph RGW | candidate object-storage backend | STS credential vending |
| MinIO-compatible stores | candidate object-storage backend | STS credential vending |
| `artifact-store` | S3 consumer and artifact integrity owner | STS credential consumer, supply-chain evidence |
| `ops-warden` | short-lived SSH certificate issuer | privileged access, short-lived credentials |
| `ops-bridge` | SSH/tunnel access consumer and audit path | human/agent access, auditability |
| State Hub | cross-repo workstream and decision read model | security readiness tracking, progress evidence |
## First-Class Pattern Ownership
| Pattern family | Primary ownership | Notes |
| --- | --- | --- |
| Identity and access | NetKingdom, key-cape, Keycloak, flex-auth | IdP, broker, membership, role, PDP/PEP, elevation, break-glass, human/agent split |
| Tenant isolation | NetKingdom, Railiance platform, product repos | namespace, cluster, cell, shared-control/isolated-data, tenant context, data partitioning |
| Kubernetes and platform | Railiance platform, NetKingdom, product repos | secure baseline, admission, pod security, network, image, GitOps, runtime detection |
| Secrets and cryptography | NetKingdom, Railiance platform, OpenBao, product repos | external secrets, encrypted Git secrets, short-lived credentials, tenant keys, certificates |
| Application/API security | product repos, NetKingdom, flex-auth, artifact-store | gateway, BFF, object auth, schema-first APIs, idempotent commands, file uploads |
| Supply chain | Railiance platform, artifact-store, product repos | protected branches, dependency updates, SBOMs, SLSA provenance, signatures, build runners |
| Detection and response | NetKingdom, Railiance platform, State Hub, product repos | event taxonomy, central audit, tenant audit, runbooks, freeze, revocation |
| NetKingdom umbrella patterns | NetKingdom plus implementing repos | STS vending, workload identity, secret-zero avoidance, dynamic secrets, SSH certificates, delegated authorization, tenant isolation, policy-as-code admission, supply-chain provenance |

The exact per-pattern artifact coverage is maintained in
`artifacts/generated/research-pattern-normalization.md`.
## Workplan Mapping
| Workplan | Relationship |
| --- | --- |
| NK-WP-0006 | platform, tenant, bootstrap, identity, authorization, and OpenBao architecture baseline |
| NK-WP-0007 | object-storage STS credential-vending pattern baseline |
| NK-WP-0008 | this infospace and pattern catalog |
| NK-WP-0010 | complete first-class artifacts for every exact genesis pattern |
| NK-WP-0009 | tutorials derived from canonical patterns |
| RAIL-PL-WP-0002 | OpenBao deployment, unseal, break-glass, audit, backup, and workload integration |
| ARTIFACT-STORE-WP-0007 | S3 compatibility and temporary credential consumer behavior |
## Open Mapping Questions
- Which patterns should become NetKingdom internal standards versus
Railiance operational runbooks?
- Which patterns require external customer-facing documentation?
- Which audit events are platform-only and which become tenant-visible?
- Which patterns need machine-readable status in State Hub or a future
capability registry?


@@ -0,0 +1,58 @@
# STS Credential Vending Relationship Map
Status: draft
## Purpose
This relation artifact spells out how the STS credential-vending pattern
connects NetKingdom, Railiance, flex-auth, OpenBao, artifact-store, and
future tutorials.
## Relationship Summary
```text
IAM Profile
  identifies caller
-> flex-auth
  authorizes tenant/resource/action/TTL
-> credential-vending service
  normalizes backend-specific temporary credential exchange
-> OpenBao
  protects parent material and audit/lease evidence where used
-> object-storage backend
  issues temporary credentials
-> artifact-store or other consumer
  refreshes and uses scoped credentials
```
## Handoffs
| From | To | Handoff | Evidence |
| --- | --- | --- | --- |
| NetKingdom IAM Profile | credential-vending service | issuer, audience, subject, tenant, assurance claims | token validation test |
| credential-vending service | flex-auth | protected system, bucket, prefix, action set, TTL, context | decision envelope |
| flex-auth | Topaz | delegated PDP evaluation where used | policy load and decision log |
| credential-vending service | OpenBao | parent credential access, broker config, lease/audit metadata | OpenBao audit event |
| credential-vending service | backend STS/API | approved temporary credential request | backend response with expiration |
| credential-vending service | artifact-store | normalized temporary credentials | `AWS_SESSION_TOKEN` and expiration-aware refresh |
| artifact-store | audit sink | workload storage event with correlation id | workload audit record |
| pattern catalog | NK-WP-0009 | tutorial candidate | tutorial backlog item |
## Required Edges In The Infospace Graph
- STS source docs seed the STS pattern.
- STS pattern implements object-storage access capability.
- STS pattern uses delegated authorization.
- STS pattern uses OpenBao runtime secret authority.
- STS pattern requires artifact-store session-token support.
- STS pattern feeds NK-WP-0009 tutorial work.
## Open Questions
- Which backend is the first live exchange target: MinIO/AIStor, Ceph
RGW, AWS, or Cloudflare R2?
- Does the first implementation use a standalone service, controller, or
CLI plus `credential_process`?
- Where does durable cross-system audit correlation land?
- Which claim shape should become the IAM Profile minimum for workload
object-storage access?


@@ -0,0 +1,82 @@
# ADR-0008 - Object Storage STS Credential Vending Boundary
**Status:** Accepted
**Date:** 2026-05-18
**Deciders:** Bernd Worsch, Codex
## Context
NetKingdom needs a canonical pattern for issuing short-lived
object-storage credentials to platform and tenant workloads. The first
known consumer is `artifact-store`, but the pattern must work for future
S3-compatible consumers without making each application repo own identity,
authorization, root object-store credentials, or backend-specific STS
differences.

The backend landscape is not uniform. AWS S3, Ceph RGW, and MinIO/AIStor
can use web-identity STS-style flows. Cloudflare R2 exposes temporary
credentials through a provider API or local signing with parent access
material. OpenBao is now part of the Railiance platform stack as runtime
secret authority, but it is not an identity provider or authorization
policy engine.
## Decision
NetKingdom will define a provider-neutral credential-vending interface
backed by provider-native temporary credential mechanisms where possible.

The trust path is:
1. IAM Profile token proves the actor or workload.
2. flex-auth decides whether the actor may receive credentials for the
requested protected system, tenant, bucket, prefix, action set, TTL,
and assurance level.
3. The credential-vending service exchanges the approved request with
the backend-specific temporary credential mechanism.
4. OpenBao stores parent credentials, broker configuration, lease
metadata, and audit evidence where useful, but it does not replace
flex-auth authorization.
5. Consumers receive normalized temporary credentials containing access
key id, secret access key, session token, and expiration.
## Consequences
- `artifact-store` needs temporary credential support, especially
`AWS_SESSION_TOKEN` and refresh behavior, before it can fully consume
the production vending pattern.
- Backend-specific differences are isolated in the vending service, not
leaked into application policy.
- OpenBao remains runtime secret infrastructure and audit support; it
does not become the object-storage policy source.
- Provider-native STS is preferred when available because it gives the
storage backend direct lease/expiration semantics.
- Cloudflare R2 requires a broker path that protects parent access
material, most likely through OpenBao custody.
## Alternatives Considered
### Give Applications Long-Lived Access Keys
This is simple but leaves applications holding durable credentials and
pushes policy into ad hoc bucket configuration. It is acceptable only as
a transitional bridge with scoped credentials and explicit rotation.
### Put Object-Storage Policy In Keycloak Or key-cape
Identity providers can assert who the actor is and coarse groups or
roles, but they should not become the canonical source of bucket,
prefix, action, TTL, and explanation semantics.
### Use OpenBao As The Credential Vending Policy Engine
OpenBao is valuable for secret custody, broker configuration, leases,
and audit records. Making it the policy decision point would duplicate
flex-auth, blur the platform/tenant boundary, and make authorization
semantics backend-specific.
### Require One Backend Everywhere
A single backend would simplify implementation but does not match the
platform direction. Railiance and NetKingdom need a stable security
interface across AWS, self-hosted S3-compatible stores, and Cloudflare
R2-like APIs.


@@ -0,0 +1,482 @@
# Object Storage STS Credential Vending
Status: architecture baseline for NK-WP-0007
Date: 2026-05-18
## Purpose
This document defines the NetKingdom pattern for vending short-lived
object-storage credentials from verified identity and policy decisions.
It is provider-neutral at the NetKingdom boundary and provider-aware at
the backend exchange boundary.

The goal is to let consumers such as `artifact-store` use S3-compatible
temporary credentials without owning identity, authorization, secret
custody, or object-storage root credentials.
## Ownership Boundary
| Capability | Owner |
| --- | --- |
| IAM Profile, issuer and claim requirements | NetKingdom |
| Resource/action vocabulary and policy decision envelope | flex-auth, governed by NetKingdom architecture |
| Delegated PDP runtime | Topaz first, behind flex-auth |
| Runtime secret custody, broker configuration, audit, leases | OpenBao, deployed by Railiance platform |
| Object-storage backend configuration | Railiance platform |
| Artifact package behavior and S3 client refresh behavior | artifact-store |
| Application deployment | Railiance apps or the owning application repo |

OpenBao may store parent credentials, broker configuration, or issued
credential metadata where appropriate. It does not replace flex-auth as
the authorization decision point and must not become the object-storage
policy model.
## Core Flow
```text
Human, service, or agent principal
  |
  v
NetKingdom IAM Profile token
  key-cape lightweight mode or Keycloak expanded mode
  |
  v
credential-vending service
  verifies issuer, audience, subject, assurance, tenant
  |
  v
flex-auth decision
  tenant, protected-system, bucket, prefix, actions, TTL, obligations
  |
  v
backend exchange
  AWS STS, Ceph RGW STS, MinIO/AIStor STS, Cloudflare R2 temp API,
  or OpenBao-assisted broker path
  |
  v
temporary S3 credentials
  access key id, secret access key, session token, expiration
  |
  v
consumer
  artifact-store, SDK, CLI, sidecar, controller, or batch job
```
## Trust Boundaries
### Platform Control Plane
`tenant:platform` administers the credential-vending service, approved
issuer list, flex-auth policy import pipeline, OpenBao mounts/auth
methods, backend parent credentials, audit retention, and emergency
recovery.
### Tenant Plane
`tenant:coulomb` and later tenants may request scoped credentials for
registered tenant resources. Tenant administrators must not receive
OpenBao root tokens, object-storage root credentials, global backend STS
configuration, or platform policy import authority.
### Backend Boundary
The credential-vending service is the only component that exchanges an
approved decision for provider-native credentials. Consumers receive only
short-lived credentials scoped to the approved bucket, prefix, actions,
and TTL.
## Token And Decision Flow
1. The caller authenticates through a NetKingdom IAM Profile
implementation.
2. The caller sends a request to the credential-vending service with a
bearer token or a workload identity binding.
3. The service validates issuer, audience, signature, expiration,
subject, tenant claim, and assurance evidence.
4. The service builds a flex-auth request with the protected-system id,
resource, action set, requested TTL, tenant, actor, and context.
5. flex-auth evaluates policy through its standalone evaluator or a
delegated PDP such as Topaz.
6. If denied, the service returns a deny envelope with a stable reason
code and audit correlation id.
7. If allowed, the service exchanges the approved request with the
backend or OpenBao-assisted broker path.
8. The service returns normalized temporary credentials and records
identity, policy, backend, lease, and audit metadata.
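
Steps 4 through 8 above can be sketched as a small orchestration function. This is an illustrative sketch only, not the service implementation: the `Decision` dataclass, the `vend` function, and the `decide` and `exchange` callables are hypothetical names standing in for the flex-auth call and the backend exchange, and token validation (steps 1 to 3) is assumed to have produced `claims` already.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Decision:
    """Hypothetical, reduced view of a flex-auth decision envelope."""
    allowed: bool
    decision_id: str
    max_ttl_seconds: int = 0
    reason_code: Optional[str] = None


def vend(request: dict, claims: dict,
         decide: Callable[[dict], Decision],
         exchange: Callable[[dict, int], dict]) -> dict:
    """Build the decision input, ask the PDP, then deny or exchange."""
    decision_input = {
        "subject": claims["sub"],
        "tenant": claims.get("tenant"),
        "protected_system_id": request["protected_system_id"],
        "bucket": request["bucket"],
        "prefix": request["prefix"],
        "actions": sorted(request["actions"]),
        "ttl_seconds": request["ttl_seconds"],
    }
    decision = decide(decision_input)
    if not decision.allowed:
        # Step 6: deny envelope with a stable reason code; no backend call.
        return {
            "error": "credential_denied",
            "reason_code": decision.reason_code,
            "decision_id": decision.decision_id,
            "audit_correlation_id": request.get("correlation_id"),
        }
    # Step 7: never request a longer lifetime than policy approved.
    ttl = min(request["ttl_seconds"], decision.max_ttl_seconds)
    credentials = exchange(decision_input, ttl)
    # Step 8: normalized response; audit recording is omitted in this sketch.
    return {
        "credentials": credentials,
        "lease": {"ttl_seconds": ttl, "renewable": False},
        "decision": {"decision_id": decision.decision_id},
    }
```

The point of the shape is that the backend exchange is only reachable through an allow decision, so a denied request cannot touch parent credentials or the backend.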
## Resource Model
Every object-storage resource belongs to a protected system and tenant.

Suggested identifiers:
```text
protected_system:object-storage:artifact-store-prod
tenant:platform
tenant:coulomb
bucket:artifact-store-prod
prefix:tenant/coulomb/packages/
object:tenant/coulomb/packages/<digest>
```
The protected-system id names the storage integration boundary, not just
the backend product. For example, a MinIO tenant and an AWS bucket used
by the same application should still be distinct protected systems if
their trust, audit, or policy lifecycle differs.
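
Prefix scoping is easy to get subtly wrong, so a minimal check may help illustrate the intent of the identifiers above. The function name `key_within_prefix` and its rules are assumptions for illustration: it requires directory-style prefixes ending in `/` so that a grant on `tenant/coulomb` cannot match `tenant/coulomb-other/...`. S3 treats keys as opaque strings, so rejecting `..` segments here is only a conservative choice, not a backend requirement.

```python
def key_within_prefix(key: str, prefix: str) -> bool:
    """Return True if an object key falls inside a granted prefix."""
    # Require a trailing slash so sibling prefixes cannot match by accident.
    if not prefix.endswith("/"):
        return False
    # Conservative hygiene: no absolute-looking keys, no ".." segments.
    if key.startswith("/") or ".." in key.split("/"):
        return False
    return key.startswith(prefix)
```

For example, `key_within_prefix("tenant/coulomb/packages/a.tar.zst", "tenant/coulomb/packages/")` is true, while a key under `tenant/other/` is rejected.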
## flex-auth Vocabulary
| Resource | Example | Notes |
| --- | --- | --- |
| protected system | `object-storage:artifact-store-prod` | Required in every decision |
| bucket | `bucket:artifact-store-prod` | Coarse storage boundary |
| prefix | `prefix:tenant/coulomb/packages/` | Preferred grant boundary for workloads |
| object | `object:tenant/coulomb/packages/a.tar.zst` | Use for exceptional single-object decisions |

Canonical action names:

| Action | Meaning |
| --- | --- |
| `s3:GetObject` | Read object data |
| `s3:PutObject` | Create or replace object data |
| `s3:DeleteObject` | Delete object data |
| `s3:ListBucket` | List bucket or prefix contents |
| `s3:GetObjectAttributes` | Read metadata, checksums, or object attributes |
| `s3:AbortMultipartUpload` | Abort multipart state |
| `s3:CreateMultipartUpload` | Start multipart upload |
| `s3:UploadPart` | Upload multipart chunk |
| `s3:CompleteMultipartUpload` | Complete multipart upload |

Required decision inputs:
- subject id, subject type, issuer, audience, and tenant;
- protected-system id;
- bucket and prefix or object;
- requested action set;
- requested TTL;
- assurance level and MFA evidence where privileged or destructive
actions are requested;
- workload identity evidence for service or agent callers;
- request purpose and audit correlation id when available.

Required decision outputs:
- allow or deny;
- maximum TTL;
- permitted actions;
- permitted bucket and prefix/object scope;
- obligations such as read-only, checksum-required, write-once, or
audit-detail-required;
- deny reason code;
- explanation/audit correlation id;
- backend exchange hint where policy deliberately restricts backend use.

TTL policy:
- default interactive TTL: 15 minutes;
- default workload TTL: 30 minutes;
- maximum normal TTL: 1 hour;
- longer TTLs require explicit policy and should not exceed backend
limits;
- destructive or platform-scoped credentials should use shorter TTLs and
MFA or dual-control obligations.
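
The TTL defaults and ceilings above could be applied with a helper like the following. It is a sketch under assumptions: the names `effective_ttl`, `DEFAULT_TTL`, and `MAX_NORMAL_TTL` are hypothetical, and the helper reduces an over-long request to the ceiling, whereas a real policy may instead deny.

```python
from typing import Optional

# Values taken from the TTL policy list, in seconds.
DEFAULT_TTL = {"interactive": 15 * 60, "workload": 30 * 60}
MAX_NORMAL_TTL = 60 * 60


def effective_ttl(caller_kind: str,
                  requested: Optional[int] = None,
                  policy_max: int = MAX_NORMAL_TTL,
                  backend_max: Optional[int] = None) -> int:
    """Pick the TTL to request from the backend, in seconds."""
    if caller_kind not in DEFAULT_TTL:
        raise ValueError(f"unknown caller kind: {caller_kind}")
    ttl = DEFAULT_TTL[caller_kind] if requested is None else requested
    if ttl <= 0:
        raise ValueError("ttl must be positive")
    # Longer TTLs need explicit policy (a larger policy_max) and must
    # still stay within whatever the backend allows.
    ceiling = policy_max if backend_max is None else min(policy_max, backend_max)
    return min(ttl, ceiling)
```

A caller that asks for two hours under normal policy gets one hour; raising `policy_max` explicitly is what models the "longer TTLs require explicit policy" rule.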
## IAM Profile Requirements
Accepted issuers:
- key-cape lightweight mode for local, sandbox, and small deployments;
- Keycloak expanded mode for production and enterprise federation;
- local-identity only for development or bootstrap contexts explicitly
marked non-production.

Required token properties:
- `iss` matches an approved NetKingdom issuer;
- `aud` targets the credential-vending service or an approved backend
exchange audience;
- `sub` is stable for the principal;
- `exp`, `nbf`, and `iat` are present and within skew tolerance;
- `tenant` or equivalent tenant mapping is present for tenant-scoped
requests;
- service accounts and agents are distinguishable from humans;
- assurance/MFA claims are present when policy needs them;
- groups or roles are mapped through IAM Profile semantics, not
provider-specific bucket policy.

Local-dev restrictions:
- local issuers must only be accepted by explicitly configured dev
vending instances;
- local issuer tokens must not be trusted by production backends;
- credentials minted from local issuers must be restricted to local or
sandbox object stores.

Emergency principals:
- break-glass use is platform-control-plane access, not tenant access;
- emergency credentials must be short-lived where possible;
- every emergency vending event requires a post-event review record.
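
The required token properties listed in this section can be checked on already-decoded claims roughly as follows. This sketch deliberately does not verify signatures, which must happen first with a real JWT library; `validate_claims`, the audience value, and all reason strings other than `tenant_scope_missing` are hypothetical names chosen for illustration.

```python
import time


def validate_claims(claims, approved_issuers, expected_audience,
                    require_tenant=True, now=None, skew_seconds=60):
    """Return a list of problems; an empty list means these checks pass.

    Assumes the token signature was verified before this is called.
    """
    now = time.time() if now is None else now
    problems = []
    if claims.get("iss") not in approved_issuers:
        problems.append("issuer_not_approved")
    aud = claims.get("aud")
    # "aud" may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        problems.append("wrong_audience")
    if not claims.get("sub"):
        problems.append("subject_missing")
    if require_tenant and not claims.get("tenant"):
        problems.append("tenant_scope_missing")
    stamps = {name: claims.get(name) for name in ("exp", "nbf", "iat")}
    missing = [n for n, v in stamps.items() if not isinstance(v, (int, float))]
    problems.extend(f"{name}_missing" for name in missing)
    if not missing:
        # Allow a small clock skew in both directions.
        if stamps["exp"] <= now - skew_seconds:
            problems.append("token_expired")
        if stamps["nbf"] > now + skew_seconds:
            problems.append("token_not_yet_valid")
        if stamps["iat"] > now + skew_seconds:
            problems.append("issued_in_future")
    return problems
```

Returning every problem rather than stopping at the first one keeps deny-path audit records more useful, though the caller-facing reason code can still be a single stable value.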
## Backend Assessment
| Backend | Temporary credential path | NetKingdom stance |
| --- | --- | --- |
| AWS S3 | AWS STS `AssumeRoleWithWebIdentity` returns access key id, secret access key, session token, and expiration | Best fit for AWS-native deployments. Use IAM OIDC provider and role trust policies, with flex-auth deciding before exchange. |
| Ceph RGW | RGW implements a subset of STS, including `AssumeRoleWithWebIdentity` for OIDC-backed temporary credentials | Good fit for self-hosted S3-compatible storage when RGW IAM/STS maturity is acceptable for the deployment. |
| MinIO/AIStor | MinIO STS supports `AssumeRoleWithWebIdentity` with OIDC JWTs and AWS-like response semantics | Strong fit for lightweight/self-hosted deployments if session-token support is wired through consumers. |
| Cloudflare R2 | R2 temporary credentials are created through the R2 Temporary Credentials API or local signing with parent access material | Use a backend-specific broker. Store parent material in OpenBao; do not expose parent credentials to workloads. |
| OpenBao | Can store parent credentials, broker dynamic material, record leases, and audit secret access | Runtime secret infrastructure and audit point, not the canonical object-storage authorization engine. |

Decision summary: prefer provider-native temporary credentials when the
backend has a mature STS or temporary-credentials API. Keep the
NetKingdom interface stable and normalize backend differences in the
credential-vending service.
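
As one illustration of normalizing backend differences, the `Credentials` element of an STS-style `AssumeRoleWithWebIdentity` response (`AccessKeyId`, `SecretAccessKey`, `SessionToken`, `Expiration`) could be mapped to the NetKingdom field names like this. The function name `normalize_sts_credentials` is hypothetical, and treating a naive datetime as UTC is an assumption of this sketch; SDKs may hand back the expiration either as a datetime object or as a string.

```python
from datetime import datetime, timezone


def normalize_sts_credentials(sts_credentials: dict) -> dict:
    """Map an STS-style Credentials dict to the normalized field names."""
    required = ("AccessKeyId", "SecretAccessKey", "SessionToken", "Expiration")
    missing = [k for k in required if not sts_credentials.get(k)]
    if missing:
        # A response without a session token is not a temporary credential.
        raise ValueError(f"backend response missing fields: {missing}")
    expiration = sts_credentials["Expiration"]
    if isinstance(expiration, datetime):
        if expiration.tzinfo is None:
            # Assumption: naive timestamps from the backend are UTC.
            expiration = expiration.replace(tzinfo=timezone.utc)
        expiration = expiration.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "access_key_id": sts_credentials["AccessKeyId"],
        "secret_access_key": sts_credentials["SecretAccessKey"],
        "session_token": sts_credentials["SessionToken"],
        "expiration": expiration,
    }
```

Backends without an STS-shaped response, such as a Cloudflare R2 broker path, would need their own adapter that produces the same four normalized fields.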
## OpenBao Role
OpenBao participates in credential vending only after flex-auth approval.

Allowed OpenBao responsibilities:
- store backend parent credentials for Cloudflare R2 or other APIs that
need privileged signing material;
- store broker configuration and backend endpoint metadata;
- issue or lease dynamic credentials where a supported backend plugin or
controlled broker path exists;
- provide audit records for parent credential access and broker
operations;
- deliver credential-vending service configuration through Kubernetes
auth, CSI, or External Secrets Operator.

Prohibited OpenBao responsibilities:
- deciding whether a tenant may access a bucket or prefix;
- storing tenant policy as the canonical object-storage authorization
model;
- exposing platform mounts, root tokens, unseal/recovery material, or
parent credentials to tenants;
- bypassing flex-auth because a backend secret path is readable.
## Interface Prototype
HTTP request:
```http
POST /v1/object-storage/credentials
Authorization: Bearer <iam-profile-token>
Content-Type: application/json
```
```json
{
  "protected_system_id": "object-storage:artifact-store-prod",
  "tenant_id": "tenant:coulomb",
  "bucket": "artifact-store-prod",
  "prefix": "tenant/coulomb/packages/",
  "actions": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
  "ttl_seconds": 1800,
  "purpose": "artifact-store package upload",
  "correlation_id": "01JYNETKINGDOMSTS000000000001"
}
```
Normalized response:
```json
{
  "credentials": {
    "access_key_id": "AKIA...",
    "secret_access_key": "redacted-by-client-logging",
    "session_token": "token...",
    "expiration": "2026-05-18T16:45:00Z"
  },
  "scope": {
    "protected_system_id": "object-storage:artifact-store-prod",
    "tenant_id": "tenant:coulomb",
    "bucket": "artifact-store-prod",
    "prefix": "tenant/coulomb/packages/",
    "actions": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
  },
  "lease": {
    "ttl_seconds": 1800,
    "renewable": false,
    "backend": "minio-assume-role-with-web-identity",
    "openbao_lease_id": null
  },
  "decision": {
    "decision_id": "dec_01JYNETKINGDOMSTS000000000001",
    "policy_package": "object-storage-artifact-store-prod@2026-05-18",
    "obligations": ["checksum-required"],
    "audit_correlation_id": "01JYNETKINGDOMSTS000000000001"
  }
}
```
Deny response:
```json
{
  "error": "credential_denied",
  "reason_code": "prefix_not_registered_for_tenant",
  "decision_id": "dec_01JYNETKINGDOMSTS000000000002",
  "audit_correlation_id": "01JYNETKINGDOMSTS000000000002"
}
```
`credential_process` output for SDK consumers:
```json
{
  "Version": 1,
  "AccessKeyId": "AKIA...",
  "SecretAccessKey": "...",
  "SessionToken": "...",
  "Expiration": "2026-05-18T16:45:00Z"
}
```
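
A CLI in `credential_process` mode would need to translate the normalized response into the JSON shape shown above. A minimal sketch of that translation follows; the function name `to_credential_process` is hypothetical, and it assumes the normalized response uses the `credentials` object from the interface prototype.

```python
import json


def to_credential_process(normalized_response: dict) -> str:
    """Render a normalized vending response as credential_process JSON."""
    creds = normalized_response["credentials"]
    return json.dumps({
        "Version": 1,
        "AccessKeyId": creds["access_key_id"],
        "SecretAccessKey": creds["secret_access_key"],
        "SessionToken": creds["session_token"],
        "Expiration": creds["expiration"],
    })
```

Only this JSON should go to standard output in that mode; diagnostics belong on standard error so the SDK can parse the result.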
CLI shape:
```bash
netkingdom-object-creds vend \
  --protected-system object-storage:artifact-store-prod \
  --tenant tenant:coulomb \
  --bucket artifact-store-prod \
  --prefix tenant/coulomb/packages/ \
  --action s3:GetObject \
  --action s3:PutObject \
  --ttl 1800 \
  --credential-process
```
## Audit Event
Each successful or denied request should emit one canonical audit event:
```json
{
  "event_type": "object_storage_credential_vending",
  "outcome": "allowed",
  "actor": {
    "subject": "service:artifact-store",
    "issuer": "https://kc.coulomb.social",
    "tenant": "tenant:coulomb",
    "assurance": "workload"
  },
  "request": {
    "protected_system_id": "object-storage:artifact-store-prod",
    "bucket": "artifact-store-prod",
    "prefix": "tenant/coulomb/packages/",
    "actions": ["s3:GetObject", "s3:PutObject"],
    "ttl_seconds": 1800
  },
  "decision": {
    "decision_id": "dec_01JYNETKINGDOMSTS000000000001",
    "policy_package": "object-storage-artifact-store-prod@2026-05-18"
  },
  "backend": {
    "type": "minio-assume-role-with-web-identity",
    "credential_expiration": "2026-05-18T16:45:00Z",
    "openbao_lease_id": null
  }
}
```
OpenBao audit events should be correlated when OpenBao parent material,
broker config, dynamic secret engines, or delivery paths are used.
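
An event of this shape could be assembled with a small builder that also guards against secret material leaking into the audit trail. This is a sketch under assumptions: `build_audit_event` and `SENSITIVE_KEYS` are hypothetical names, and `"denied"` as the second outcome value is inferred from the text rather than shown in the example.

```python
# Fields that must never reach an audit sink.
SENSITIVE_KEYS = {"secret_access_key", "session_token"}


def build_audit_event(outcome, actor, request, decision, backend=None):
    """Assemble one canonical vending audit event as a plain dict."""
    if outcome not in ("allowed", "denied"):
        raise ValueError(f"unexpected outcome: {outcome}")
    event = {
        "event_type": "object_storage_credential_vending",
        "outcome": outcome,
        "actor": dict(actor),
        "request": dict(request),
        "decision": dict(decision),
    }
    if backend is not None:
        # Keep type, expiration, and lease ids; drop credential material.
        event["backend"] = {
            key: value for key, value in backend.items()
            if key not in SENSITIVE_KEYS
        }
    return event
```

Denied requests never reach the backend, so their events would normally omit the `backend` object entirely.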
## Consumer Guidance
### artifact-store
`artifact-store` should consume temporary credentials without owning the
vending authority.

Required consumer support:
- `AWS_ACCESS_KEY_ID`;
- `AWS_SECRET_ACCESS_KEY`;
- `AWS_SESSION_TOKEN`;
- credential expiration awareness;
- refresh before expiration, preferably with jitter;
- env, file, sidecar, controller, or `credential_process` delivery.

The existing static bridge can remain transitional:
```bash
export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/s3-access-key
export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/s3-secret-key
```
Temporary credentials require either a session-token ref or a refresh
pattern that updates the access key id, secret access key, and session
token atomically, with the expiration recorded alongside them:
```bash
export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/aws-access-key-id
export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/aws-secret-access-key
export ARTIFACTSTORE_S3_SESSION_TOKEN_REF=file:/run/secrets/artifactstore/aws-session-token
export ARTIFACTSTORE_S3_CREDENTIAL_EXPIRATION_REF=file:/run/secrets/artifactstore/expiration
```
Recommended deployment patterns:
- CLI or SDK `credential_process` for developer and batch use;
- sidecar refresh process for pods that cannot call the vending API
directly;
- controller plus mounted files when platform operators need centralized
refresh and audit;
- direct vending API call only when the workload can protect its IAM
token and handle refresh safely.
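
"Refresh before expiration, preferably with jitter" can be made concrete with a small scheduling helper. The function name `next_refresh_at`, the 80 percent refresh point, and the jitter bounds are all assumptions for illustration; the jitter is subtracted so the refresh never drifts later than the chosen fraction of the credential lifetime.

```python
import random
from datetime import datetime, timedelta


def next_refresh_at(issued_at: datetime, expiration: datetime,
                    fraction: float = 0.8,
                    max_jitter_seconds: float = 30.0,
                    rng=random.random) -> datetime:
    """Return when a consumer should request fresh credentials."""
    lifetime = (expiration - issued_at).total_seconds()
    if lifetime <= 0:
        raise ValueError("credentials are already expired")
    # Cap jitter at 10% of the lifetime so short leases still refresh late
    # enough to be useful; rng() is expected to return a value in [0, 1).
    jitter = rng() * min(max_jitter_seconds, lifetime * 0.1)
    return issued_at + timedelta(seconds=lifetime * fraction - jitter)
```

With a 30 minute lease this schedules the refresh between 23.5 and 24 minutes after issuance, which spreads load when many pods received credentials at the same moment.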
### Other S3 Consumers
Consumers must support the session token. Access-key/secret-key-only
clients are limited to transitional static credentials and should not be
used for production tenant workloads.

Prohibited patterns:
- object-store root credentials in application pods;
- long-lived tenant access keys for normal workload traffic;
- bucket policy managed by application repos as the source of truth;
- storing parent R2/API credentials in tenant namespaces;
- ignoring credential expiration and retrying indefinitely with expired
credentials;
- accepting local-identity tokens in production.
## Failure Modes
| Failure | Expected behavior |
| --- | --- |
| IAM token invalid or wrong audience | Deny before policy evaluation; emit audit event |
| Tenant missing or mismatched | Deny with `tenant_scope_missing` or `tenant_mismatch` |
| Prefix not registered | Deny with `prefix_not_registered_for_tenant` |
| TTL too long | Reduce to policy maximum or deny, depending on policy |
| flex-auth or Topaz unavailable | Fail closed except for explicitly documented emergency platform workflows |
| Backend STS unavailable | Do not mint credentials; return retryable backend error |
| OpenBao unavailable | Fail if parent material or broker config requires OpenBao; otherwise continue only for backend paths that do not depend on it |
| Audit sink unavailable | Deny privileged/platform-scoped requests; allow low-risk tenant requests only if policy permits buffered audit |
| Consumer refresh fails | Stop writes before expiration; retry vending with backoff; never fall back to root credentials |
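
Two rows of this table, the fail-closed rule for an unavailable policy engine and the audit-sink rule, can be sketched as follows. The function names and the `policy_engine_unavailable` reason code are hypothetical, the emergency platform workflows mentioned in the table are out of scope here, and catching every exception is a simplification for the sketch.

```python
def decide_fail_closed(decide, decision_input):
    """Call the policy engine; treat any failure as a deny."""
    try:
        return decide(decision_input)
    except Exception:
        # Fail closed: an unreachable PDP must never turn into an allow.
        return {"allowed": False, "reason_code": "policy_engine_unavailable"}


def outcome_when_audit_unavailable(tenant: str, privileged: bool,
                                   policy_permits_buffered_audit: bool) -> str:
    """Decide how to proceed when the durable audit sink is down."""
    if privileged or tenant == "tenant:platform":
        return "deny"
    if policy_permits_buffered_audit:
        return "allow_with_buffered_audit"
    return "deny"
```

Both helpers default to deny, so a missing configuration value or an unexpected error narrows access instead of widening it.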
## Readiness Checks
- IAM Profile token validation test passes for key-cape or Keycloak.
- flex-auth has policy packages for platform and tenant scopes.
- Topaz policy load and health are verified where delegated PDP is used.
- Backend-specific STS or temporary credential path returns credentials
with session token and expiration.
- OpenBao parent credential access, lease metadata, and audit correlation
work where OpenBao is in the path.
- artifact-store or the consumer can refresh all credential fields before
expiration.
- Deny paths produce stable reason codes and audit records.
- Break-glass operation is documented and post-event review is required.
## References
- [AWS STS AssumeRoleWithWebIdentity](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html)
- [Ceph RGW STS](https://docs.ceph.com/en/latest/radosgw/STS/)
- [MinIO AssumeRoleWithWebIdentity](https://min.io/docs/minio/linux/developers/security-token-service/AssumeRoleWithWebIdentity.html)
- [Cloudflare R2 Temporary Credentials API](https://developers.cloudflare.com/api/resources/r2/subresources/temporary_credentials/)
- [Cloudflare R2 temporary credential example](https://developers.cloudflare.com/r2/examples/authenticate-r2-temp-credentials/)


@@ -0,0 +1,402 @@
# Platform Identity and Security Architecture
Status: implemented architecture baseline for NetKingdom/Railiance/Coulomb
Date: 2026-05-18
## Purpose
This document captures the production-oriented identity, authorization,
MFA, credential, and bootstrap architecture for the platform we are
building. It deliberately treats Coulomb as the first internal tenant and
reference workload, not as the platform itself.

The architecture must be recursive: the same platform that protects
future tenants also protects the services and repositories used to build
and operate the platform. That recursion is useful, but it is also where
many security designs accidentally collapse into self-administering root
power. This document exists to prevent that.
## Core Model
```text
Bootstrap plane
  establishes initial trust before normal platform services exist

Platform control plane
  operates identity, MFA, secrets, policy, audit, and authorization

Tenant planes
  run Coulomb and future customer/project/domain workloads
```
Coulomb is the first internal tenant. It is also the reference tenant that
helps validate the platform. It must not become the platform root of
trust merely because it is first.
## Planes
### Bootstrap Plane
The bootstrap plane exists before the full platform is alive. It owns the
minimal authority needed to create and recover the control plane.
Responsibilities:
- host provisioning and hardening
- root age/SOPS material and emergency bundles
- initial cluster access
- initial identity service deployment
- initial secret injection
- break-glass recovery
- transition to managed runtime authority

Owned primarily by `railiance-infra`, `railiance-cluster`, and the
credential bootstrap work in `net-kingdom`.
### Platform Control Plane
The platform control plane owns shared security services.
Responsibilities:
- NetKingdom IAM Profile
- lightweight identity mode through key-cape
- expanded identity mode through Keycloak
- MFA/token lifecycle through privacyIDEA where applicable
- canonical authorization through flex-auth
- delegated authorization runtime through Topaz first, with other PDPs as
adapters
- runtime secret authority through OpenBao
- audit and explanation records
- platform service secrets, dynamic credentials, leases, and rotation

Owned conceptually by `net-kingdom`; deployed through the Railiance stack.
### Tenant Plane
Tenant planes are where workloads live. Coulomb is tenant zero/reference
tenant; later tenants may be projects, customers, domains, sandboxes, or
isolated deployments.
Responsibilities:
- protected services and repositories
- tenant-owned resources
- tenant-specific groups, policies, and service accounts
- local enforcement of authorization decisions
- workload audit events and diagnostics

Tenant administrators may manage their tenant resources. They must not be
able to alter platform root trust, global identity configuration,
platform break-glass material, or the policy pipeline that governs the
platform itself.
## Component Responsibilities
| Component | Primary role | Must not become |
| --- | --- | --- |
| `net-kingdom` | canonical security architecture, IAM Profile, SSO/MFA, credential bootstrap decisions | a deployment repo for every stack layer |
| `key-cape` | lightweight IAM implementation of the NetKingdom IAM Profile | a general-purpose IAM platform or authorization engine |
| Keycloak | expanded-mode IAM and optional Keycloak Authorization Services adapter | the canonical model for all platform authorization |
| privacyIDEA | MFA/token authority, especially in lightweight/key-cape mode | a policy decision point for application resources |
| OpenBao | runtime platform secrets service, dynamic credential broker, lease/revocation point, and audit source for secret access | the bootstrap root of trust or an application-specific configuration store |
| `flex-auth` | authorization control plane, CARING descriptors, policy packages, decision envelopes, audit/explain | an identity provider or backend-specific wrapper |
| Topaz | first delegated authorization runtime/PDP for flex-auth | the platform control plane or identity provider |
| Railiance repos | converged infrastructure, cluster, platform services, enablement, and app deployment | the source of security policy semantics |
## Identity Path
```text
Human/service/agent principal
  |
  v
NetKingdom IAM Profile
  |
  +-- lightweight mode: key-cape
  |     Authelia + LLDAP + privacyIDEA
  |
  +-- expanded mode: Keycloak
        Keycloak + LDAP/Entra federation + MFA integration
```
Applications depend on the IAM Profile, not on the concrete provider.
key-cape is the lightweight profile implementation. Keycloak is the
expanded-mode profile implementation. privacyIDEA provides MFA/token
capabilities where the deployment mode uses it.

Identity answers: who is this actor, how was the actor authenticated,
what coarse claims are asserted, and what assurance evidence exists?

Identity does not answer final resource-specific authorization.
## Authorization Path
```text
Identity claims from IAM Profile
  |
  v
flex-auth
  resource registry
  policy packages
  CARING descriptors
  decision/audit/explain envelope
  |
  +-- standalone evaluator
  +-- Topaz delegated PDP
  +-- optional Keycloak AuthZ adapter
  +-- future OpenFGA/SpiceDB/OPA/Cedar adapters
  |
  v
Protected service enforcement
```
Authorization answers: may this actor perform this action on this
resource in this context, and what explanation/audit/CARING metadata
supports that answer?

Protected services enforce decisions locally. flex-auth is the canonical
policy and decision boundary; delegated PDPs are runtime implementations
behind it.
## Secret And Credential Path
```text
Bootstrap SOPS/age material
  |
  v
OpenBao platform secrets service
  KV v2 platform configuration
  dynamic database credentials
  Kubernetes auth / workload identity
  future object-storage credential brokering
  audit devices and lease/revocation records
  |
  +-- direct OpenBao clients
  +-- External Secrets Operator / synced Kubernetes Secrets
  +-- CSI-mounted secrets where appropriate
  |
  v
Platform and tenant workloads
```
SOPS/age remains the bootstrap and Git-at-rest protection mechanism. It
can create the initial cluster secrets and emergency recovery bundles, but
it should not become the long-lived runtime authority for every workload
secret.
OpenBao is the runtime platform secrets service once the control plane is
alive. It owns secret leases, revocation, audit, dynamic credentials, and
workload-facing secret delivery patterns. Workloads should receive scoped
secrets or short-lived credentials, not platform-root material. Tenant
administrators may manage tenant-scoped secrets through approved policy
paths; they must not gain access to OpenBao root tokens, unseal keys,
platform mounts, or global secret engine configuration.
OpenBao does not replace identity or authorization. NetKingdom IAM
identifies actors and workloads; flex-auth decides whether a credential
or secret request is allowed; OpenBao stores, issues, audits, and revokes
the resulting secret material.
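The division of labour above can be sketched as a three-step call chain. This is a minimal illustration, not the platform's actual API: `Principal`, `Decision`, `request_secret`, and the three injected callables are hypothetical stand-ins for NetKingdom IAM, flex-auth, and OpenBao respectively.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Principal:
    """Hypothetical result of identity validation (IAM Profile claims)."""
    subject: str
    tenant: str


@dataclass(frozen=True)
class Decision:
    """Hypothetical, heavily reduced authorization decision."""
    allowed: bool
    ttl_seconds: int = 0
    deny_reason: Optional[str] = None


def request_secret(
    token: str,
    path: str,
    authenticate: Callable[[str], Optional[Principal]],   # stands in for identity
    authorize: Callable[[Principal, str], Decision],       # stands in for flex-auth
    issue: Callable[[Principal, str, int], str],           # stands in for OpenBao
) -> str:
    # Step 1: identity answers "who is this actor?"
    principal = authenticate(token)
    if principal is None:
        raise PermissionError("unauthenticated caller")

    # Step 2: authorization answers "may this actor request this secret?"
    decision = authorize(principal, path)
    if not decision.allowed:
        raise PermissionError(decision.deny_reason or "denied")

    # Step 3: only now does the secrets service issue scoped material.
    return issue(principal, path, decision.ttl_seconds)
```

The point of the shape is that the issuing step never runs unless both earlier steps succeed, so the secrets service never has to make an identity or policy judgement of its own.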
## Recursive Trust Rule
Normal tenant administration must never be sufficient to alter the
platform root of trust.
This applies even when the tenant is Coulomb. Coulomb can be a tenant and
a reference workload, but platform-root actions require platform control
plane authority and appropriate bootstrap/break-glass safeguards.
Examples of platform-root actions:
- changing IAM Profile semantics
- rotating root bootstrap keys
- changing break-glass access
- changing global MFA requirements
- activating authorization policy that governs platform administration
- changing flex-auth/Topaz policy import pipelines
- changing OpenBao root tokens, unseal policy, platform mounts, or global
auth methods
- changing audit retention or tamper-evidence settings
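The recursive trust rule can be expressed as a guardrail that runs before any other policy. The sketch below is illustrative only: the action identifiers and the `platform-operator` role name are assumptions invented for the example, not names defined by flex-auth or NetKingdom.

```python
# Hypothetical action identifiers mirroring the platform-root examples above.
PLATFORM_ROOT_ACTIONS = frozenset({
    "iam-profile:change-semantics",
    "bootstrap:rotate-root-keys",
    "break-glass:change-access",
    "mfa:change-global-requirements",
    "authz:activate-platform-policy",
    "authz:change-policy-import-pipeline",
    "openbao:change-root-config",
    "audit:change-retention",
})

# Hypothetical role name for platform control-plane authority.
PLATFORM_ROLE = "platform-operator"


def passes_platform_root_guardrail(action: str, roles: frozenset) -> bool:
    """Return False when a platform-root action lacks platform authority.

    A True result only means this guardrail does not block the request;
    ordinary tenant and resource policy must still be evaluated afterwards.
    """
    if action in PLATFORM_ROOT_ACTIONS:
        return PLATFORM_ROLE in roles
    return True
```

Note that tenant roles are never consulted for platform-root actions, which is what makes the rule hold even when the tenant is Coulomb.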
## Tenant Model
Every protected resource should belong to a tenant or to the platform
control plane.
Suggested identifiers:
```text
tenant:platform # platform control plane resources
tenant:coulomb # first internal/reference tenant
tenant:sandbox:<name> # sandbox tenants
tenant:customer:<name> # future customer tenants
```
Tenant membership and platform membership are distinct. A subject may be
an administrator in `tenant:coulomb` without being a platform operator.
CARING descriptors should explicitly identify scope and tenant when the
access is tenant-scoped. Platform-scoped descriptors should be rare,
audited, and usually condition-bound.
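A small parser makes the suggested identifier shapes concrete. This is a sketch under stated assumptions: the allowed character set for `<name>` (lowercase letters, digits, hyphens) and the `TenantId` type are choices made for the example, not part of the NetKingdom standard.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed name syntax: lowercase alphanumerics and hyphens, not starting with a hyphen.
_NAME = r"[a-z0-9][a-z0-9-]*"
_TENANT_RE = re.compile(
    rf"^tenant:(?:(platform|coulomb)|(sandbox|customer):({_NAME}))$"
)


@dataclass(frozen=True)
class TenantId:
    kind: str             # "platform", "coulomb", "sandbox", or "customer"
    name: Optional[str]   # set only for sandbox and customer tenants

    @property
    def is_platform(self) -> bool:
        return self.kind == "platform"


def parse_tenant(value: str) -> TenantId:
    """Parse one of the suggested tenant identifiers or raise ValueError."""
    match = _TENANT_RE.match(value)
    if match is None:
        raise ValueError(f"not a recognised tenant identifier: {value!r}")
    if match.group(1):
        return TenantId(kind=match.group(1), name=None)
    return TenantId(kind=match.group(2), name=match.group(3))
```

Keeping `is_platform` as an explicit property reflects the distinction drawn above: being an administrator of `tenant:coulomb` says nothing about platform membership.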
## Bootstrap To Runtime Transition
Production setup should move through explicit trust states:
1. **Bare host trust** - provisioned and verified by Railiance infra.
2. **Cluster trust** - Kubernetes runtime exists and is verified.
3. **Bootstrap secret trust** - age/SOPS and emergency bundles are
established.
4. **Bootstrap identity trust** - local/bootstrap identity can operate
enough to install full identity services.
5. **Runtime secret trust** - OpenBao is deployed, initialized, unsealed,
audited, backed up, and ready to issue scoped secrets.
6. **Runtime identity trust** - key-cape or Keycloak becomes the normal
IAM Profile issuer.
7. **Runtime authorization trust** - flex-auth and Topaz are initialized
with platform and tenant policies.
8. **Tenant onboarding trust** - Coulomb and later tenants register
resources and receive scoped authority.
Each transition needs a verification check and a rollback/recovery path.
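The trust states can be modelled as an ordered enumeration in which a state is entered only after its verification check passes. This is a minimal sketch: the `TrustState` names and the `advance` helper are hypothetical, and rollback/recovery handling is deliberately left out.

```python
from enum import IntEnum
from typing import Callable, Mapping


class TrustState(IntEnum):
    BARE_HOST = 1
    CLUSTER = 2
    BOOTSTRAP_SECRET = 3
    BOOTSTRAP_IDENTITY = 4
    RUNTIME_SECRET = 5
    RUNTIME_IDENTITY = 6
    RUNTIME_AUTHORIZATION = 7
    TENANT_ONBOARDING = 8


def advance(
    current: TrustState,
    checks: Mapping[TrustState, Callable[[], bool]],
) -> TrustState:
    """Move to the next trust state only if its verification check passes.

    States cannot be skipped, and a state without a registered check is
    treated as an error rather than silently accepted.
    """
    if current is TrustState.TENANT_ONBOARDING:
        return current  # already at the final state
    target = TrustState(current + 1)
    check = checks.get(target)
    if check is None:
        raise LookupError(f"no verification check registered for {target.name}")
    return target if check() else current
```

Treating a missing check as an error keeps the state machine honest: a transition that nobody knows how to verify should not be reachable.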
## Production Topology
For an initial production-capable Coulomb deployment:
```text
railiance-infra
  host baseline, SSH, age keys, emergency material
railiance-cluster
  Kubernetes, ingress, cert-manager, network policy
railiance-platform
  OpenBao, PostgreSQL, object storage, platform service secret delivery
  key-cape or Keycloak
  privacyIDEA where used
  flex-auth
  Topaz
railiance-apps
  Coulomb services as tenant:coulomb workloads
```
`net-kingdom` owns the architecture and standards. Railiance owns the
converged deployment layers. Component repos own their implementation
contracts.
## Orchestration Implication
A future orchestration repo may be justified, but only after the state
machine is clear. It should not own resources directly. It should own
safe sequencing across repos.
Possible responsibilities:
- verify Railiance preconditions
- initialize credential bootstrap
- deploy or validate identity services
- deploy or validate flex-auth and Topaz
- run IAM Profile conformance checks
- run authorization conformance checks
- produce a platform security readiness report
This orchestration layer should build on Railiance capabilities rather
than bypassing the Railiance stack boundaries.
ADR-0007 records the current decision: keep orchestration in Railiance
playbooks for now, with NetKingdom defining the trust-state model,
readiness checks, OpenBao boundaries, and security semantics.
## flex-auth And Topaz Implications
flex-auth work must preserve the recursive boundary between platform
control-plane resources and tenant resources.
Required implications:
- CARING descriptors must include scope and tenant metadata for
tenant-scoped access, and must mark rare platform-scoped access
explicitly.
- Policy packages must distinguish `tenant:platform` policy from
tenant-local packages such as `tenant:coulomb`.
- Decision envelopes must carry subject, issuer, audience, tenant,
protected-system id, resource, action, requested TTL where relevant,
assurance evidence, obligations, deny reasons, and audit correlation
ids.
- Topaz is a delegated PDP runtime behind flex-auth. It must not become
the canonical policy model, identity provider, or platform control
plane.
- Audit and explain records must be durable enough to reconstruct why a
platform-root, secret, credential, or tenant-administration decision was
allowed or denied.
- Platform-root guardrails must deny tenant administrators the ability to
alter IAM Profile semantics, OpenBao platform mounts/auth methods,
flex-auth policy import pipelines, Topaz runtime configuration, or
platform audit retention.
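The decision envelope fields listed above could be captured in a record such as the following. The field names, types, and the two consistency rules are assumptions made for illustration; the real flex-auth envelope schema is not shown in this document.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass(frozen=True)
class DecisionEnvelope:
    """Hypothetical shape for a flex-auth decision envelope."""
    subject: str
    issuer: str
    audience: str
    tenant: str
    protected_system_id: str
    resource: str
    action: str
    allowed: bool
    requested_ttl_seconds: Optional[int] = None   # only where relevant
    assurance: Tuple[str, ...] = ()
    obligations: Tuple[str, ...] = ()
    deny_reasons: Tuple[str, ...] = ()
    correlation_id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def __post_init__(self) -> None:
        # Assumed rule: a denial must be explainable after the fact.
        if not self.allowed and not self.deny_reasons:
            raise ValueError("a deny decision must carry at least one deny reason")
        # Assumed rule: an allow decision must not carry deny reasons.
        if self.allowed and self.deny_reasons:
            raise ValueError("an allow decision must not carry deny reasons")
```

Making the record immutable and validating it at construction supports the audit requirement: an envelope that reaches the audit sink is already internally consistent.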
OpenBao secret access and dynamic credential requests follow the same
authorization rule: identity proves the actor or workload, flex-auth
decides whether the request is permitted, and OpenBao stores, issues,
leases, audits, and revokes the secret material.
## Coulomb Tenant Onboarding Path
The first Coulomb tenant onboarding path should be repeatable before it
becomes automated:
1. Register `tenant:coulomb` as a tenant distinct from
`tenant:platform`.
2. Map Coulomb human, service, and agent principals to IAM Profile claims
with issuer, audience, subject, group, tenant, and assurance evidence.
3. Register Coulomb protected systems and resources in flex-auth with
stable protected-system ids.
4. Import tenant-scoped policy packages and CARING descriptors for
Coulomb resources.
5. Initialize the delegated PDP runtime, starting with Topaz, using only
the policy packages approved for the tenant and platform boundary.
6. Provision Coulomb workload secret paths, Kubernetes auth roles, or
delivery mechanisms in OpenBao without granting access to platform
mounts, unseal/recovery material, or global auth configuration.
7. Run audit readiness checks before admitting production traffic:
identity issuance, flex-auth decision envelope, Topaz health,
OpenBao audit event, workload enforcement event, and correlation id.
The onboarding path is complete when a Coulomb workload can authenticate,
receive a scoped authorization decision, obtain only the allowed secret or
short-lived credential, enforce the decision locally, and produce an
auditable record without receiving platform-root authority.
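The correlation part of the audit readiness check in step 7 can be sketched as a function that looks for one record per audit source under a shared correlation id. The source labels and the event dictionary shape are assumptions for the example; Topaz health is a liveness probe rather than a correlated event, so it is not modelled here.

```python
from typing import Iterable, List, Mapping

# Hypothetical source labels for the correlated audit records named in step 7.
REQUIRED_SOURCES = ("identity", "flex-auth", "openbao", "workload")


def missing_audit_sources(
    events: Iterable[Mapping[str, str]],
    correlation_id: str,
) -> List[str]:
    """Return the required sources with no event for this correlation id.

    An empty list means every required source produced at least one record
    that can be joined on the correlation id.
    """
    seen = {
        event.get("source")
        for event in events
        if event.get("correlation_id") == correlation_id
    }
    return [source for source in REQUIRED_SOURCES if source not in seen]
```

An onboarding drill would run one test request end to end and fail readiness if the returned list is non-empty.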
## Production Readiness Checks
Before the security platform is production-ready, each trust state needs
an explicit check:
| Area | Readiness check |
| --- | --- |
| MFA and identity | key-cape or Keycloak issues IAM Profile-compatible tokens; privacyIDEA or the selected MFA provider enforces required assurance for privileged actions |
| Bootstrap and recovery | age/SOPS material, emergency bundle, and break-glass credentials are present, tested, and separated from tenant administration |
| OpenBao runtime secrets | OpenBao is initialized, unsealed or auto-unsealed by the approved mechanism, backed up, audited, and using scoped auth methods and mounts |
| Secret rotation | service, database, OpenBao-issued, and break-glass rotation paths have documented blast radius and verification steps |
| flex-auth policy state | platform and tenant policy packages are versioned, reviewable, imported, and explainable |
| Topaz runtime | delegated PDP health, data freshness, policy load status, and fail-closed behavior are verified |
| Tenant onboarding | `tenant:coulomb` resources, claims, policies, OpenBao paths, and audit correlation are registered and tested |
| Audit sink | identity, flex-auth, Topaz, OpenBao, Kubernetes, and workload audit records land in durable storage with restore/drill coverage |
| Break-glass | emergency access works when normal identity is unavailable and produces a post-event review record |
## Open Questions
- Where is the durable audit log stored for platform-root decisions?
- Where are OpenBao audit logs durably shipped, and how are they included
in tamper-evidence and restore drills?
- Which actions require dual control or human confirmation?
- How is break-glass use recorded when normal identity is unavailable?
- Which workloads consume OpenBao directly, via External Secrets Operator,
or via CSI-mounted secrets?
- Which tenant metadata is required before a service can register
resources with flex-auth?
- When does the platform switch from key-cape lightweight mode to
Keycloak expanded mode?
- Does Topaz run centrally for the platform, per tenant, or per service
for the first production deployment?

View File

@@ -0,0 +1,207 @@
# OpenBao - Platform Secrets Service
**Chart:** `openbao/openbao`
**Chart version:** `0.28.2`
**App version:** `v2.5.3`
**Namespace:** `openbao`
**Managed by:** `railiance-platform` (S3)
**Workplan:** `RAIL-PL-WP-0002`
**Initial target:** Railiance01 (`92.205.62.239`)
---
## Architecture
```
S5 workloads / operators
  -> openbao.openbao.svc.cluster.local:8200
    -> openbao-0
      -> integrated Raft storage on local-path PVC
      -> audit storage PVC mounted at /openbao/audit
```
- OpenBao is the canonical Railiance S3 secrets service.
- SOPS/age remains the Git-at-rest bootstrap mechanism.
- The first Railiance01 deployment is single-replica Raft, not true HA.
- Public ingress is disabled. Operators use `kubectl exec` or port-forwarding.
- TLS is disabled inside the pod listener for this internal-only bootstrap. Add
cert-manager-backed internal TLS before exposing OpenBao beyond cluster-local
traffic.
## Deployment
The official OpenBao project recommends the Helm chart for Kubernetes
deployments and advises running Helm with `--dry-run` before any install or
upgrade.
From a host with kubeconfig access:
```bash
make openbao-dry-run
make openbao-deploy
make openbao-status
```
On Railiance01 directly:
```bash
cd ~/railiance-platform
sudo env KUBECONFIG=/etc/rancher/k3s/k3s.yaml make openbao-dry-run
sudo env KUBECONFIG=/etc/rancher/k3s/k3s.yaml make openbao-deploy
sudo env KUBECONFIG=/etc/rancher/k3s/k3s.yaml make openbao-status
```
If the repo is not present on Railiance01 yet, copy only the non-secret values
file and run Helm directly:
```bash
scp helm/openbao-values.yaml tegwick@92.205.62.239:/tmp/openbao-values.yaml
ssh tegwick@92.205.62.239 \
'sudo env KUBECONFIG=/etc/rancher/k3s/k3s.yaml helm upgrade --install openbao openbao/openbao \
--version 0.28.2 \
--namespace openbao \
--create-namespace \
-f /tmp/openbao-values.yaml \
--dry-run'
```
Repeat without `--dry-run` to deploy.
## Verification
```bash
kubectl get pods,svc,pvc -n openbao -o wide
kubectl exec -n openbao openbao-0 -- bao status
```
Expected immediately after install:
- `openbao-0` is Running.
- `openbao`, `openbao-active`, `openbao-internal`, and `openbao-ui` services
exist as cluster-internal services.
- data and audit PVCs are Bound.
- `bao status` reports `Initialized: false` and `Sealed: true`.
That state is intentional until the bootstrap ceremony is completed.
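The expected post-install state can also be checked mechanically. The sketch below assumes the status command is run with JSON output (for example `bao status -format=json`) and that the payload exposes boolean `initialized` and `sealed` fields, as the Vault-compatible CLI does; the `classify_status` helper and its labels are hypothetical.

```python
import json


def classify_status(status_json: str) -> str:
    """Map a JSON status payload to a coarse lifecycle label.

    Assumes the payload contains boolean "initialized" and "sealed" keys.
    """
    status = json.loads(status_json)
    initialized = bool(status.get("initialized"))
    sealed = bool(status.get("sealed"))

    if not initialized and sealed:
        return "awaiting-bootstrap"   # the intentional state right after install
    if initialized and sealed:
        return "sealed"               # initialized, but needs unseal shares
    if initialized and not sealed:
        return "unsealed"
    return "unexpected"               # uninitialized yet unsealed should not occur
```

When wiring this into automation, keep in mind that the status command may exit non-zero while the server is sealed, so the caller should capture the output rather than rely on the exit code alone.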
## Bootstrap Ceremony
Do not initialize OpenBao in a casual shell session. Initialization emits the
unseal keys and initial root token. Treat this as a break-glass event.
Recommended ceremony:
1. Confirm the Railiance01 backup posture first.
2. Prepare three human escrow recipients for unseal shares.
3. Run initialization once:
```bash
kubectl exec -n openbao openbao-0 -- \
bao operator init -key-shares=3 -key-threshold=2
```
4. Give each unseal share to its escrow owner through an out-of-band channel.
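5. Unseal with two shares. Each run prompts for one share, so run the command
twice with different shares. The `-it` flags attach a terminal for the prompt:
```bash
kubectl exec -it -n openbao openbao-0 -- bao operator unseal
```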
6. Log in with the initial root token only long enough to create durable admin
auth, enable audit, and prepare policies.
7. Revoke or tightly escrow the initial root token.
## Initial Configuration After Unseal
Enable file audit:
```bash
kubectl exec -n openbao openbao-0 -- \
bao audit enable file file_path=/openbao/audit/openbao-audit.log
```
Enable the first KV v2 mount:
```bash
kubectl exec -n openbao openbao-0 -- \
bao secrets enable -path=platform kv-v2
```
Kubernetes auth, database dynamic credentials, PKI, CSI, and External Secrets
integration are follow-up tasks in `RAIL-PL-WP-0002`. Do not migrate live
application secrets until those policies and restore drills are documented.
## Artifact-Store Object Storage Handoff
`artifact-store` is the consumer-facing artifact preservation service for
generated outputs, evidence packages, reports, logs, snapshots, exports, and
release artifacts. It already has an S3-compatible backend with `env:NAME` and
`file:/mounted/path` credential references, plus an
`artifactstore storage verify --backend s3` smoke path.
Railiance should avoid building a parallel object-storage client or credential
vending flow in OpenBao. The ownership split is:
- `railiance-platform` / OpenBao owns bootstrap secret custody, policy, audit,
break-glass access, and workload secret delivery.
- `artifact-store` owns artifact package manifests, the S3 backend, storage
verification, and whether temporary credentials require backend refresh
support or a sidecar/controller.
- `net-kingdom` owns the identity issuer and role-claim model if object storage
adopts STS with `AssumeRoleWithWebIdentity`.
Initial static-credential bridge, before STS is proven:
1. Create a scoped object-store access key limited to the artifact-store bucket
and prefix. Do not use object-store root credentials.
2. Store the key pair in OpenBao under a platform-owned path such as
`platform/object-storage/artifact-store`.
3. Deliver the values to the artifact-store pod through CSI or External Secrets
as mounted files.
4. Configure artifact-store with file references:
```bash
export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/s3-access-key
export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/s3-secret-key
```
5. Verify from artifact-store:
```bash
artifactstore storage verify --backend s3
```
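For readers unfamiliar with the `env:NAME` and `file:/mounted/path` reference contract, the following sketch shows how such references might be resolved. It is not artifact-store's actual resolver: the `resolve_ref` name, the error types, and the stripping of a trailing newline from mounted files are all assumptions.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_ref(ref: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Resolve an `env:NAME` or `file:/path` credential reference to its value."""
    environ = os.environ if environ is None else environ
    scheme, sep, target = ref.partition(":")
    if not sep or not target:
        raise ValueError(f"malformed credential reference: {ref!r}")

    if scheme == "env":
        if target not in environ:
            raise LookupError(f"environment variable {target} is not set")
        return environ[target]

    if scheme == "file":
        # Mounted secret files often end with a newline; drop it (assumption).
        return Path(target).read_text(encoding="utf-8").rstrip("\n")

    raise ValueError(f"unsupported reference scheme: {scheme!r}")
```

Because the file is read on every call, a CSI or External Secrets rotation that rewrites the mounted file is picked up on the next resolution without restarting the process.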
STS credential vending remains linked to
`ARTIFACT-STORE-WP-0007 - MinIO Compatibility, MaxIO Fork Assessment, And STS
Credential Vending`. If that workstream chooses MinIO-compatible
`AssumeRoleWithWebIdentity`, OpenBao should not become the identity provider by
default. Use the NetKingdom OIDC issuer for workload/user identity, map object
storage roles and policies there, and keep OpenBao responsible for bootstrap,
break-glass, audit, and delivery of any controller configuration.
Current artifact-store configuration exposes access key and secret key refs,
but no session-token ref. `ARTIFACT-STORE-WP-0007-T004` must either add
temporary-session-token support to the S3 backend or choose a sidecar/secret
controller pattern that keeps refreshed credentials available through the
existing env/file reference contract.
## Upgrade And Rollback
1. Read the OpenBao chart release notes.
2. Update `OPENBAO_CHART_VERSION` in `Makefile`.
3. Run `make openbao-dry-run`.
4. Confirm current backup and audit log posture.
5. Run `make openbao-deploy`.
6. Run `make openbao-status`.
For rollback, run `helm rollback openbao <REVISION> -n openbao` on Railiance01
and re-check `bao status`.
## Scaling To Three Nodes
When Railiance02 and Railiance03 join:
1. Move storage from `local-path` to distributed storage.
2. Set `server.affinity` back to anti-affinity.
3. Set `server.ha.replicas: 3`.
4. Re-enable a PodDisruptionBudget.
5. Run an unseal, failover, backup, and restore drill before migrating secrets.

View File

@@ -0,0 +1,44 @@
slug: patterns-of-it-securita-architecture
name: Patterns Of IT Security Architecture
topic:
name: Patterns Of IT Security Architecture
domain: NetKingdom Security Architecture
sources: artifacts/sources
disciplines:
- name: Security Capability Catalog
path: artifacts/entities/security-capability-catalog.md
- name: Security Architecture Pattern Catalog
path: artifacts/entities/security-architecture-pattern-catalog.md
- name: NetKingdom Ownership Mapping
path: artifacts/relations/netkingdom-ownership-map.md
schemas: {}
workflows:
- id: initial-catalog-inspection
description: Inspect the first security capability and pattern catalog extracted from the genesis exploration.
inputs:
source:
kind: source
stages: []
expected_evaluations:
- metrics
- review
- id: pattern-admission-review
description: Review whether a proposed security pattern can move from seed to draft, reviewed, canonical, or deprecated.
inputs:
pattern:
kind: entity
stages: []
expected_evaluations:
- checklist
- evidence
viability:
redundancy_ratio:
max: 0.2
coverage_ratio:
min: 0.8
coherence_components:
max: 1
consistency_cycles:
max: 0
granularity_entropy:
min: 1

View File

@@ -0,0 +1,184 @@
- snapshot_id: f2a7ad2b
created_at: '2026-05-19T01:28:48.740586+00:00'
schema_name: default
artifact_count: 16
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 26.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 2.1266144718101816
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks
- snapshot_id: 05cb3170
created_at: '2026-05-19T01:34:03.065840+00:00'
schema_name: default
artifact_count: 16
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 2.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 2.1266144718101816
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks
- snapshot_id: 502d0933
created_at: '2026-05-19T01:34:54.938621+00:00'
schema_name: default
artifact_count: 16
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 0.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 2.1266144718101816
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks
- snapshot_id: 92eb2b29
created_at: '2026-05-19T02:01:30.942012+00:00'
schema_name: default
artifact_count: 31
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 0.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 1.653542657810587
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks
- snapshot_id: d3e79408
created_at: '2026-05-19T02:50:31.580046+00:00'
schema_name: default
artifact_count: 69
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 1.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 0.9768669019690808
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks
- snapshot_id: 252bcf09
created_at: '2026-05-19T02:58:52.655553+00:00'
schema_name: default
artifact_count: 69
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 1.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 1.2796922166870623
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks
- snapshot_id: 38101ddf
created_at: '2026-05-19T02:59:40.038605+00:00'
schema_name: default
artifact_count: 69
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 0.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 1.2796922166870623
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks
- snapshot_id: 7bf35f3b
created_at: '2026-05-19T03:01:45.733321+00:00'
schema_name: default
artifact_count: 69
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 0.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 1.2796922166870623
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,5 @@
coherence_components: 1.0
consistency_cycles: 0.0
coverage_ratio: 1.0
granularity_entropy: 1.279692
redundancy_ratio: 0.0

View File

@@ -0,0 +1,23 @@
snapshot_id: 05cb3170
created_at: '2026-05-19T01:34:03.065840+00:00'
schema_name: default
artifact_count: 16
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 2.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 2.1266144718101816
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,23 @@
snapshot_id: 252bcf09
created_at: '2026-05-19T02:58:52.655553+00:00'
schema_name: default
artifact_count: 69
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 1.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 1.2796922166870623
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,23 @@
snapshot_id: 38101ddf
created_at: '2026-05-19T02:59:40.038605+00:00'
schema_name: default
artifact_count: 69
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 0.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 1.2796922166870623
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,23 @@
snapshot_id: 502d0933
created_at: '2026-05-19T01:34:54.938621+00:00'
schema_name: default
artifact_count: 16
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 0.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 2.1266144718101816
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,23 @@
snapshot_id: 7bf35f3b
created_at: '2026-05-19T03:01:45.733321+00:00'
schema_name: default
artifact_count: 69
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 0.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 1.2796922166870623
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,23 @@
snapshot_id: 92eb2b29
created_at: '2026-05-19T02:01:30.942012+00:00'
schema_name: default
artifact_count: 31
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 0.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 1.653542657810587
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,23 @@
snapshot_id: d3e79408
created_at: '2026-05-19T02:50:31.580046+00:00'
schema_name: default
artifact_count: 69
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 1.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 0.9768669019690808
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,23 @@
snapshot_id: f2a7ad2b
created_at: '2026-05-19T01:28:48.740586+00:00'
schema_name: default
artifact_count: 16
artifact_evaluations: []
collection_metrics:
- name: coherence_components
value: 1.0
concern: C3
- name: consistency_cycles
value: 26.0
concern: C4
- name: coverage_ratio
value: 1.0
concern: C2
- name: granularity_entropy
value: 2.1266144718101816
concern: C5
- name: redundancy_ratio
value: 0.0
concern: C1
metadata:
source: collection-checks

View File

@@ -0,0 +1,32 @@
passed: true
results:
  redundancy_ratio:
    metric: redundancy_ratio
    value: 0.0
    threshold:
      max: 0.2
    passed: true
  coverage_ratio:
    metric: coverage_ratio
    value: 1.0
    threshold:
      min: 0.8
    passed: true
  coherence_components:
    metric: coherence_components
    value: 1.0
    threshold:
      max: 1.0
    passed: true
  consistency_cycles:
    metric: consistency_cycles
    value: 0.0
    threshold:
      max: 0.0
    passed: true
  granularity_entropy:
    metric: granularity_entropy
    value: 1.2796922166870623
    threshold:
      min: 1.0
    passed: true

View File

@@ -0,0 +1,62 @@
# Initial Security Pattern Infospace Report
Date: 2026-05-19
Workplans: NK-WP-0008, NK-WP-0010
## Summary
The seeded security architecture exploration has been promoted into a
valid `infospace-bench` infospace with manifest, metadata, artifact
directories, catalogs, ownership map, maturity index, review criteria,
and first-class artifacts for every exact pattern named in the genesis
security architecture pattern catalogue.
The infospace is now ready for pattern review, maturity promotion, and
NK-WP-0009 tutorial production.
## Created Artifacts
| Artifact | Purpose |
| --- | --- |
| `infospace.yaml` | Declares identity, disciplines, workflows, and viability thresholds |
| `artifacts/index.yaml` | Manifest for source, entity, relation, generated, and report artifacts |
| `artifacts/entities/security-capability-catalog.md` | Initial capability catalog |
| `artifacts/entities/security-architecture-pattern-catalog.md` | Initial pattern catalog |
| `artifacts/entities/security-readiness-levels.md` | RL0-RL4 and pattern maturity model |
| `artifacts/relations/netkingdom-ownership-map.md` | Repo/component/workplan responsibility mapping |
| `artifacts/generated/security-pattern-index.md` | Capability status, pattern maturity, and tutorial handoff index |
| `artifacts/generated/pattern-admission-review.md` | Admission and graduation checklist |
| `artifacts/generated/research-pattern-normalization.md` | Completion map from every genesis seed pattern to its first-class artifact |
| `artifacts/entities/pattern-*.md` | One artifact per exact genesis pattern plus NetKingdom umbrella patterns |
## Coverage Against NK-WP-0008
| Task | Status | Evidence |
| --- | --- | --- |
| T01 Promote seed | done | `infospace.yaml`, `artifacts/index.yaml`, standard directories |
| T02 Extract catalogs | done | capability catalog, pattern catalog, readiness levels |
| T03 Map ownership | done | NetKingdom ownership map |
| T04 Build index/report | done | security pattern index and this report |
| T05 Review criteria | done | pattern admission and review criteria |
## Coverage Against NK-WP-0010
| Task | Status | Evidence |
| --- | --- | --- |
| T01 Reconcile inventory | done | `artifacts/generated/research-pattern-normalization.md` |
| T02 Identity and access | done | eight exact identity/access pattern artifacts |
| T03 Tenant isolation | done | six exact tenant-isolation pattern artifacts |
| T04 Kubernetes and platform | done | seven exact Kubernetes/platform pattern artifacts |
| T05 Secrets and cryptography | done | five exact secrets/cryptography pattern artifacts |
| T06 Application/API security | done | six exact application/API pattern artifacts |
| T07 Supply chain | done | six exact supply-chain pattern artifacts |
| T08 Detection and response | done | six exact detection/response pattern artifacts |
| T09 Relationships and reports | done | refreshed manifest, catalog, ownership map, index, normalization, and report |
| T10 Verification and tutorial handoff | done | validation passed; metrics snapshot `7bf35f3b` passed viability with 69 artifacts, one connected component, and zero cycles; graph export succeeded; pytest passed with 181 passed and 2 skipped |
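The viability result cited in T10 follows from comparing each collection metric against the thresholds declared in `infospace.yaml`. The sketch below reproduces that comparison; it is an illustration of the rule, not the `infospace-bench` implementation, and the `evaluate_viability` function is hypothetical.

```python
from typing import Dict, Mapping, Tuple

# Thresholds as declared under `viability` in infospace.yaml.
THRESHOLDS: Dict[str, Dict[str, float]] = {
    "redundancy_ratio": {"max": 0.2},
    "coverage_ratio": {"min": 0.8},
    "coherence_components": {"max": 1},
    "consistency_cycles": {"max": 0},
    "granularity_entropy": {"min": 1},
}


def evaluate_viability(
    metrics: Mapping[str, float],
    thresholds: Mapping[str, Mapping[str, float]] = THRESHOLDS,
) -> Tuple[bool, Dict[str, bool]]:
    """Check each metric against its min/max bound; bounds are inclusive."""
    results: Dict[str, bool] = {}
    for name, bound in thresholds.items():
        value = metrics[name]
        passed = True
        if "max" in bound:
            passed = passed and value <= bound["max"]
        if "min" in bound:
            passed = passed and value >= bound["min"]
        results[name] = passed
    return all(results.values()), results
```

Applied to the recorded snapshots, `7bf35f3b` satisfies every bound, while the earlier `d3e79408` would fail on `consistency_cycles` (1.0 against a maximum of 0) and `granularity_entropy` (about 0.977 against a minimum of 1).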
## Important Next Work
- Promote seed patterns to draft or reviewed after evidence is attached.
- Feed canonical patterns into NK-WP-0009 tutorials.
- Decide whether capability and pattern status should become structured
YAML for dashboard or State Hub consumption.