Add OpenBao runtime secret authority; complete NK-WP-0006/0007/0008

Refine the recursive platform security architecture to make OpenBao the
canonical runtime secret authority, with SOPS/age, K8s Secrets, and the
emergency bundle reframed as bootstrap/delivery/break-glass mechanisms.

- credential-management standard v0.2: add OpenBao runtime authority
  section, rotation rules, and prohibited patterns (OpenBao-as-PDP,
  tenant platform-root)
- platform-identity-security-architecture: mark implemented; add
  flex-auth/Topaz implications, Coulomb onboarding path, and a
  production-readiness checklist
- NK-WP-0004/0005: document bootstrap-to-OpenBao handoff boundary
- NK-WP-0006/0007: status -> done with implementation reviews; add
  recursive platform/tenant split and OpenBao broker/audit role for
  object-storage STS vending
- NK-WP-0008: status -> done; repoint corpus to infospace-bench
- new ADR-0007 (orchestration boundary), ADR-0008 (STS vending
  boundary), and the object-storage STS credential-vending architecture

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 22:51:20 +02:00
parent b49631acef
commit 7b211acd57
10 changed files with 1150 additions and 69 deletions

@@ -0,0 +1,87 @@
# ADR-0007 - Security Orchestration Boundary
**Status:** Accepted
**Date:** 2026-05-18
**Deciders:** Bernd Worsch, Codex
## Context
The recursive platform security architecture needs careful sequencing:
host trust, cluster trust, bootstrap secrets, runtime secret authority,
runtime identity, runtime authorization, tenant onboarding, and readiness
verification.
That sequencing crosses NetKingdom and Railiance ownership boundaries.
NetKingdom owns the canonical security architecture, IAM Profile,
credential/bootstrap standards, and authorization semantics. Railiance
owns deployment layering for infrastructure, clusters, platform services,
and applications. OpenBao adds an important runtime-secret authority to
the platform control plane, but it does not change those ownership
boundaries.
Creating a dedicated orchestration repo too early would risk encoding
temporary bootstrap order and accidental stack assumptions as a permanent
interface. Leaving every sequence implicit would also be risky: platform
root actions, OpenBao initialization, policy import, and tenant onboarding
must be auditable and repeatable.
## Decision
Security orchestration will stay in Railiance playbooks for now.
NetKingdom will define the trust-state model, readiness checks, policy
semantics, OpenBao boundaries, and tenant/control-plane rules. Railiance
playbooks will own the concrete deployment sequencing across
`railiance-infra`, `railiance-cluster`, `railiance-platform`, and
`railiance-apps`.
A dedicated orchestration repo is deferred until the sequencing surface is
stable enough to justify its own product boundary. If created later, it
must coordinate safe sequencing and readiness reporting; it must not own
security policy semantics or bypass Railiance stack ownership.
## Consequences
- NK-WP-0006 is implemented as architecture, standards, ADRs, and
workplan constraints rather than a new repo.
- OpenBao bootstrap, unseal/recovery, audit, backup, and workload-secret
delivery belong in Railiance platform playbooks, governed by
NetKingdom standards.
- Cross-repo readiness should be reported as checks against explicit
trust states, not as a hidden imperative script.
- A future orchestration repo needs a new ADR before creation.
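Reporting readiness as checks against explicit trust states could look roughly like the sketch below. The state names are derived from the Context section; their ordering, the identifiers, and the `run_readiness` and `first_blocking_state` helpers are assumptions for illustration, not an existing Railiance or NetKingdom interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Trust states named in the ADR context; the ordering is an assumption.
TRUST_STATES = [
    "host_trust",
    "cluster_trust",
    "bootstrap_secrets",
    "runtime_secret_authority",
    "runtime_identity",
    "runtime_authorization",
    "tenant_onboarding",
]


@dataclass(frozen=True)
class CheckResult:
    state: str
    passed: bool
    detail: str


def run_readiness(checks: Dict[str, Callable[[], Tuple[bool, str]]]) -> List[CheckResult]:
    """Evaluate every trust state; a state without a registered check fails."""
    results = []
    for state in TRUST_STATES:
        check = checks.get(state)
        if check is None:
            results.append(CheckResult(state, False, "no check registered"))
            continue
        passed, detail = check()
        results.append(CheckResult(state, passed, detail))
    return results


def first_blocking_state(results: List[CheckResult]) -> Optional[str]:
    """Return the earliest trust state that is not satisfied, if any."""
    for result in results:
        if not result.passed:
            return result.state
    return None
```

Treating a missing check as a failure keeps the report explicit: a trust state cannot be considered reached just because nobody wrote a check for it.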
## Future Repo Trigger
Revisit a dedicated orchestration repo only if at least two of these are
true:
- multiple Railiance deployments need the same security sequencing
interface;
- readiness reporting becomes a reusable artifact consumed by operators,
agents, or CI;
- rollback and recovery workflows need a cross-repo state machine that no
single Railiance layer can own cleanly;
- tenant onboarding becomes a repeatable workflow spanning identity,
flex-auth, Topaz, OpenBao, object storage, and application repos.
## Alternatives Considered
### Create A Dedicated Orchestration Repo Now
This would give sequencing a visible home, but it would probably encode
unstable details before OpenBao runtime operations, flex-auth/Topaz
policy import, and tenant onboarding have enough implementation feedback.
### Put Orchestration In NetKingdom
NetKingdom owns the security model, but it should not become the
deployment repo for every stack layer. This would blur architecture
ownership with platform deployment ownership.
### Leave Sequencing Entirely Informal
This avoids premature structure but leaves bootstrap and runtime trust
transitions too dependent on operator memory. The accepted approach keeps
the sequence explicit while leaving concrete deployment in the Railiance
stack.

@@ -0,0 +1,82 @@
# ADR-0008 - Object Storage STS Credential Vending Boundary
**Status:** Accepted
**Date:** 2026-05-18
**Deciders:** Bernd Worsch, Codex
## Context
NetKingdom needs a canonical pattern for issuing short-lived
object-storage credentials to platform and tenant workloads. The first
known consumer is `artifact-store`, but the pattern must work for future
S3-compatible consumers without making each application repo own identity,
authorization, root object-store credentials, or backend-specific STS
differences.
The backend landscape is not uniform. AWS S3, Ceph RGW, and MinIO/AIStor
can use web-identity STS-style flows. Cloudflare R2 exposes temporary
credentials through a provider API or local signing with parent access
material. OpenBao is now part of the Railiance platform stack as runtime
secret authority, but it is not an identity provider or authorization
policy engine.
## Decision
NetKingdom will define a provider-neutral credential-vending interface
backed by provider-native temporary credential mechanisms where possible.
The trust path is:
1. IAM Profile token proves the actor or workload.
2. flex-auth decides whether the actor may receive credentials for the
requested protected system, tenant, bucket, prefix, action set, TTL,
and assurance level.
3. The credential-vending service exchanges the approved request with
the backend-specific temporary credential mechanism.
4. OpenBao stores parent credentials, broker configuration, lease
metadata, and audit evidence where useful, but it does not replace
flex-auth authorization.
5. Consumers receive normalized temporary credentials containing access
key id, secret access key, session token, and expiration.
## Consequences
- `artifact-store` needs temporary credential support, especially
`AWS_SESSION_TOKEN` and refresh behavior, before it can fully consume
the production vending pattern.
- Backend-specific differences are isolated in the vending service, not
leaked into application policy.
- OpenBao remains runtime secret infrastructure and audit support; it
does not become the object-storage policy source.
- Provider-native STS is preferred when available because it gives the
storage backend direct lease/expiration semantics.
- Cloudflare R2 requires a broker path that protects parent access
material, most likely through OpenBao custody.
## Alternatives Considered
### Give Applications Long-Lived Access Keys
This is simple but leaves applications holding durable credentials and
pushes policy into ad hoc bucket configuration. It is acceptable only as
a transitional bridge with scoped credentials and explicit rotation.
### Put Object-Storage Policy In Keycloak Or key-cape
Identity providers can assert who the actor is and coarse groups or
roles, but they should not become the canonical source of bucket,
prefix, action, TTL, and explanation semantics.
### Use OpenBao As The Credential Vending Policy Engine
OpenBao is valuable for secret custody, broker configuration, leases,
and audit records. Making it the policy decision point would duplicate
flex-auth, blur the platform/tenant boundary, and make authorization
semantics backend-specific.
### Require One Backend Everywhere
A single backend would simplify implementation but does not match the
platform direction. Railiance and NetKingdom need a stable security
interface across AWS, self-hosted S3-compatible stores, and Cloudflare
R2-like APIs.

@@ -0,0 +1,482 @@
# Object Storage STS Credential Vending
Status: architecture baseline for NK-WP-0007
Date: 2026-05-18
## Purpose
This document defines the NetKingdom pattern for vending short-lived
object-storage credentials from verified identity and policy decisions.
It is provider-neutral at the NetKingdom boundary and provider-aware at
the backend exchange boundary.
The goal is to let consumers such as `artifact-store` use S3-compatible
temporary credentials without owning identity, authorization, secret
custody, or object-storage root credentials.
## Ownership Boundary
| Capability | Owner |
| --- | --- |
| IAM Profile, issuer and claim requirements | NetKingdom |
| Resource/action vocabulary and policy decision envelope | flex-auth, governed by NetKingdom architecture |
| Delegated PDP runtime | Topaz first, behind flex-auth |
| Runtime secret custody, broker configuration, audit, leases | OpenBao, deployed by Railiance platform |
| Object-storage backend configuration | Railiance platform |
| Artifact package behavior and S3 client refresh behavior | artifact-store |
| Application deployment | Railiance apps or the owning application repo |
OpenBao may store parent credentials, broker configuration, or issued
credential metadata where appropriate. It does not replace flex-auth as
the authorization decision point and must not become the object-storage
policy model.
## Core Flow
```text
Human, service, or agent principal
  |
  v
NetKingdom IAM Profile token
  key-cape lightweight mode or Keycloak expanded mode
  |
  v
credential-vending service
  verifies issuer, audience, subject, assurance, tenant
  |
  v
flex-auth decision
  tenant, protected-system, bucket, prefix, actions, TTL, obligations
  |
  v
backend exchange
  AWS STS, Ceph RGW STS, MinIO/AIStor STS, Cloudflare R2 temp API,
  or OpenBao-assisted broker path
  |
  v
temporary S3 credentials
  access key id, secret access key, session token, expiration
  |
  v
consumer
  artifact-store, SDK, CLI, sidecar, controller, or batch job
```
## Trust Boundaries
### Platform Control Plane
`tenant:platform` administers the credential-vending service, approved
issuer list, flex-auth policy import pipeline, OpenBao mounts/auth
methods, backend parent credentials, audit retention, and emergency
recovery.
### Tenant Plane
`tenant:coulomb` and later tenants may request scoped credentials for
registered tenant resources. Tenant administrators must not receive
OpenBao root tokens, object-storage root credentials, global backend STS
configuration, or platform policy import authority.
### Backend Boundary
The credential-vending service is the only component that exchanges an
approved decision for provider-native credentials. Consumers receive only
short-lived credentials scoped to the approved bucket, prefix, actions,
and TTL.
## Token And Decision Flow
1. The caller authenticates through a NetKingdom IAM Profile
implementation.
2. The caller sends a request to the credential-vending service with a
bearer token or a workload identity binding.
3. The service validates issuer, audience, signature, expiration,
subject, tenant claim, and assurance evidence.
4. The service builds a flex-auth request with the protected-system id,
resource, action set, requested TTL, tenant, actor, and context.
5. flex-auth evaluates policy through its standalone evaluator or a
delegated PDP such as Topaz.
6. If denied, the service returns a deny envelope with a stable reason
code and audit correlation id.
7. If allowed, the service exchanges the approved request with the
backend or OpenBao-assisted broker path.
8. The service returns normalized temporary credentials and records
identity, policy, backend, lease, and audit metadata.
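The steps above can be sketched as a small pipeline. This is a minimal illustration, assuming token signature, issuer, audience, and expiry were already verified before `claims` is passed in. `VendRequest`, `Decision`, `vend`, the injected `decide`, `exchange`, and `audit` callables, and the `policy_denied` fallback code are hypothetical names, not the flex-auth or vending-service API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VendRequest:
    protected_system_id: str
    tenant_id: str
    bucket: str
    prefix: str
    actions: tuple
    ttl_seconds: int
    correlation_id: str


@dataclass(frozen=True)
class Decision:
    allow: bool
    decision_id: str
    max_ttl_seconds: int = 0
    reason_code: Optional[str] = None


def deny(reason_code, request, audit, decision_id=None):
    """Build the deny envelope and record it before returning."""
    envelope = {
        "error": "credential_denied",
        "reason_code": reason_code,
        "decision_id": decision_id,
        "audit_correlation_id": request.correlation_id,
    }
    audit({"outcome": "denied", **envelope})
    return envelope


def vend(request, claims, decide, exchange, audit):
    # Step 3 (tenant part only): deny before any policy evaluation.
    tenant = claims.get("tenant")
    if not tenant:
        return deny("tenant_scope_missing", request, audit)
    if tenant != request.tenant_id:
        return deny("tenant_mismatch", request, audit)
    # Steps 4-5: ask the policy decision function (stands in for flex-auth).
    decision = decide(request, claims)
    # Step 6: a denial never reaches the backend exchange.
    if not decision.allow:
        return deny(decision.reason_code or "policy_denied", request, audit,
                    decision.decision_id)
    # Step 7: exchange with the TTL capped by the decision.
    ttl = min(request.ttl_seconds, decision.max_ttl_seconds)
    credentials = exchange(request, ttl)
    # Step 8: record the outcome and return the normalized shape.
    audit({"outcome": "allowed", "decision_id": decision.decision_id,
           "audit_correlation_id": request.correlation_id, "ttl_seconds": ttl})
    return {
        "credentials": credentials,
        "lease": {"ttl_seconds": ttl},
        "decision": {"decision_id": decision.decision_id,
                     "audit_correlation_id": request.correlation_id},
    }
```

Passing the policy, backend, and audit steps in as callables keeps the ordering testable without a running flex-auth, Topaz, or storage backend.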
## Resource Model
Every object-storage resource belongs to a protected system and tenant.
Suggested identifiers:
```text
protected_system:object-storage:artifact-store-prod
tenant:platform
tenant:coulomb
bucket:artifact-store-prod
prefix:tenant/coulomb/packages/
object:tenant/coulomb/packages/<digest>
```
The protected-system id names the storage integration boundary, not just
the backend product. For example, a MinIO tenant and an AWS bucket used
by the same application should still be distinct protected systems if
their trust, audit, or policy lifecycle differs.
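A minimal sketch of how these identifiers might be handled, assuming the `kind:value` shape shown above. `parse_identifier` and `prefix_covers` are hypothetical helpers, not part of flex-auth.

```python
def parse_identifier(identifier: str) -> tuple:
    """Split 'kind:value' on the first colon; the value may contain colons."""
    kind, sep, value = identifier.partition(":")
    if not sep or not kind or not value:
        raise ValueError(f"not a kind:value identifier: {identifier!r}")
    return kind, value


def prefix_covers(granted_prefix: str, object_key: str) -> bool:
    """True if object_key lies under granted_prefix.

    The prefix is compared with a trailing slash so that a grant for
    'tenant/coulomb' does not also match 'tenant/coulomb-other/...'.
    """
    if not granted_prefix.endswith("/"):
        granted_prefix += "/"
    return object_key.startswith(granted_prefix)
```

For example, `parse_identifier("protected_system:object-storage:artifact-store-prod")` yields the kind `protected_system` and the value `object-storage:artifact-store-prod`.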
## flex-auth Vocabulary
| Resource | Example | Notes |
| --- | --- | --- |
| protected system | `object-storage:artifact-store-prod` | Required in every decision |
| bucket | `bucket:artifact-store-prod` | Coarse storage boundary |
| prefix | `prefix:tenant/coulomb/packages/` | Preferred grant boundary for workloads |
| object | `object:tenant/coulomb/packages/a.tar.zst` | Use for exceptional single-object decisions |
Canonical action names:
| Action | Meaning |
| --- | --- |
| `s3:GetObject` | Read object data |
| `s3:PutObject` | Create or replace object data |
| `s3:DeleteObject` | Delete object data |
| `s3:ListBucket` | List bucket or prefix contents |
| `s3:GetObjectAttributes` | Read metadata, checksums, or object attributes |
| `s3:AbortMultipartUpload` | Abort multipart state |
| `s3:CreateMultipartUpload` | Start multipart upload |
| `s3:UploadPart` | Upload multipart chunk |
| `s3:CompleteMultipartUpload` | Complete multipart upload |
Required decision inputs:
- subject id, subject type, issuer, audience, and tenant;
- protected-system id;
- bucket and prefix or object;
- requested action set;
- requested TTL;
- assurance level and MFA evidence where privileged or destructive
actions are requested;
- workload identity evidence for service or agent callers;
- request purpose and audit correlation id when available.
Required decision outputs:
- allow or deny;
- maximum TTL;
- permitted actions;
- permitted bucket and prefix/object scope;
- obligations such as read-only, checksum-required, write-once, or
audit-detail-required;
- deny reason code;
- explanation/audit correlation id;
- backend exchange hint where policy deliberately restricts backend use.
TTL policy:
- default interactive TTL: 15 minutes;
- default workload TTL: 30 minutes;
- maximum normal TTL: 1 hour;
- longer TTLs require explicit policy and should not exceed backend
limits;
- destructive or platform-scoped credentials should use shorter TTLs and
MFA or dual-control obligations.
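The TTL rules above can be expressed as a small resolver. This sketch clamps an over-long request rather than denying it, which is only one of the behaviors the policy may choose. `resolve_ttl` and its parameters are hypothetical names.

```python
from typing import Optional

# Defaults and ceiling from the TTL policy above, in seconds.
DEFAULT_TTL = {"interactive": 15 * 60, "workload": 30 * 60}
MAX_NORMAL_TTL = 60 * 60


def resolve_ttl(caller_kind: str,
                requested: Optional[int] = None,
                policy_max: Optional[int] = None,
                backend_max: Optional[int] = None) -> int:
    """Return the TTL to grant, in seconds.

    policy_max models an explicit policy that overrides the normal ceiling;
    backend_max models the backend's own session limit.
    """
    ttl = DEFAULT_TTL[caller_kind] if requested is None else requested
    if ttl <= 0:
        raise ValueError("TTL must be positive")
    ceiling = MAX_NORMAL_TTL if policy_max is None else policy_max
    if backend_max is not None:
        ceiling = min(ceiling, backend_max)
    return min(ttl, ceiling)
```

A workload asking for two hours without explicit policy gets one hour; with `policy_max=14400` and `backend_max=3600` it still gets one hour because the backend limit wins.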
## IAM Profile Requirements
Accepted issuers:
- key-cape lightweight mode for local, sandbox, and small deployments;
- Keycloak expanded mode for production and enterprise federation;
- local-identity only for development or bootstrap contexts explicitly
marked non-production.
Required token properties:
- `iss` matches an approved NetKingdom issuer;
- `aud` targets the credential-vending service or an approved backend
exchange audience;
- `sub` is stable for the principal;
- `exp`, `nbf`, and `iat` are present and within skew tolerance;
- `tenant` or equivalent tenant mapping is present for tenant-scoped
requests;
- service accounts and agents are distinguishable from humans;
- assurance/MFA claims are present when policy needs them;
- groups or roles are mapped through IAM Profile semantics, not
provider-specific bucket policy.
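The claim checks above might be sketched as follows. This covers only already-decoded claims: signature verification and key lookup are assumed to happen earlier in a JWT library and are deliberately out of scope. `validate_claims` and all reason codes except `tenant_scope_missing` are hypothetical.

```python
import time


def validate_claims(claims, approved_issuers, expected_audience,
                    now=None, skew=60, require_tenant=True):
    """Return a list of problems; an empty list means the claims pass."""
    now = time.time() if now is None else now
    errors = []
    if claims.get("iss") not in approved_issuers:
        errors.append("issuer_not_approved")
    # 'aud' may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        errors.append("wrong_audience")
    if not claims.get("sub"):
        errors.append("subject_missing")
    exp, nbf, iat = (claims.get(k) for k in ("exp", "nbf", "iat"))
    if not all(isinstance(v, (int, float)) for v in (exp, nbf, iat)):
        errors.append("time_claims_missing")
    else:
        if now > exp + skew:
            errors.append("token_expired")
        if now < nbf - skew:
            errors.append("token_not_yet_valid")
        if iat > now + skew:
            errors.append("issued_in_future")
    if require_tenant and not claims.get("tenant"):
        errors.append("tenant_scope_missing")
    return errors
```

Collecting all problems instead of stopping at the first one makes the resulting audit record more useful when a token fails for several reasons at once.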
Local-dev restrictions:
- local issuers must only be accepted by explicitly configured dev
vending instances;
- local issuer tokens must not be trusted by production backends;
- credentials minted from local issuers must be restricted to local or
sandbox object stores.
Emergency principals:
- break-glass use is platform-control-plane access, not tenant access;
- emergency credentials must be short-lived where possible;
- every emergency vending event requires a post-event review record.
## Backend Assessment
| Backend | Temporary credential path | NetKingdom stance |
| --- | --- | --- |
| AWS S3 | AWS STS `AssumeRoleWithWebIdentity` returns access key id, secret access key, session token, and expiration | Best fit for AWS-native deployments. Use IAM OIDC provider and role trust policies, with flex-auth deciding before exchange. |
| Ceph RGW | RGW implements a subset of STS, including `AssumeRoleWithWebIdentity` for OIDC-backed temporary credentials | Good fit for self-hosted S3-compatible storage when RGW IAM/STS maturity is acceptable for the deployment. |
| MinIO/AIStor | MinIO STS supports `AssumeRoleWithWebIdentity` with OIDC JWTs and AWS-like response semantics | Strong fit for lightweight/self-hosted deployments if session-token support is wired through consumers. |
| Cloudflare R2 | R2 temporary credentials are created through the R2 Temporary Credentials API or local signing with parent access material | Use a backend-specific broker. Store parent material in OpenBao; do not expose parent credentials to workloads. |
| OpenBao | Can store parent credentials, broker dynamic material, record leases, and audit secret access | Runtime secret infrastructure and audit point, not the canonical object-storage authorization engine. |
Decision summary: prefer provider-native temporary credentials when the
backend has a mature STS or temporary-credentials API. Keep the
NetKingdom interface stable and normalize backend differences in the
credential-vending service.
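Normalizing backend differences could start with a per-backend adapter. The sketch below handles the `Credentials` element of an AWS-style `AssumeRoleWithWebIdentity` response (`AccessKeyId`, `SecretAccessKey`, `SessionToken`, `Expiration`) and makes no network call. `normalize_sts_credentials` is a hypothetical name, and treating a naive datetime as UTC is an assumption.

```python
from datetime import datetime, timezone


def normalize_sts_credentials(sts_credentials: dict) -> dict:
    """Map AWS-style STS credential fields to the normalized shape."""
    token = sts_credentials.get("SessionToken")
    if not token:
        # Temporary credentials without a session token are not usable here.
        raise ValueError("backend returned no session token")
    expiration = sts_credentials["Expiration"]
    if isinstance(expiration, datetime):
        if expiration.tzinfo is None:
            # Assumption: naive timestamps are already UTC.
            expiration = expiration.replace(tzinfo=timezone.utc)
        expiration = expiration.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "access_key_id": sts_credentials["AccessKeyId"],
        "secret_access_key": sts_credentials["SecretAccessKey"],
        "session_token": token,
        "expiration": expiration,
    }
```

A Cloudflare R2 or OpenBao-assisted broker path would need its own adapter that produces the same four fields.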
## OpenBao Role
OpenBao participates in credential vending only after flex-auth approval.
Allowed OpenBao responsibilities:
- store backend parent credentials for Cloudflare R2 or other APIs that
need privileged signing material;
- store broker configuration and backend endpoint metadata;
- issue or lease dynamic credentials where a supported backend plugin or
controlled broker path exists;
- provide audit records for parent credential access and broker
operations;
- deliver credential-vending service configuration through Kubernetes
auth, CSI, or External Secrets Operator.
Prohibited OpenBao responsibilities:
- deciding whether a tenant may access a bucket or prefix;
- storing tenant policy as the canonical object-storage authorization
model;
- exposing platform mounts, root tokens, unseal/recovery material, or
parent credentials to tenants;
- bypassing flex-auth because a backend secret path is readable.
## Interface Prototype
HTTP request:
```http
POST /v1/object-storage/credentials
Authorization: Bearer <iam-profile-token>
Content-Type: application/json
```
```json
{
  "protected_system_id": "object-storage:artifact-store-prod",
  "tenant_id": "tenant:coulomb",
  "bucket": "artifact-store-prod",
  "prefix": "tenant/coulomb/packages/",
  "actions": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
  "ttl_seconds": 1800,
  "purpose": "artifact-store package upload",
  "correlation_id": "01JYNETKINGDOMSTS000000000001"
}
```
Normalized response:
```json
{
  "credentials": {
    "access_key_id": "ASIA...",
    "secret_access_key": "redacted-by-client-logging",
    "session_token": "token...",
    "expiration": "2026-05-18T16:45:00Z"
  },
  "scope": {
    "protected_system_id": "object-storage:artifact-store-prod",
    "tenant_id": "tenant:coulomb",
    "bucket": "artifact-store-prod",
    "prefix": "tenant/coulomb/packages/",
    "actions": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
  },
  "lease": {
    "ttl_seconds": 1800,
    "renewable": false,
    "backend": "minio-assume-role-with-web-identity",
    "openbao_lease_id": null
  },
  "decision": {
    "decision_id": "dec_01JYNETKINGDOMSTS000000000001",
    "policy_package": "object-storage-artifact-store-prod@2026-05-18",
    "obligations": ["checksum-required"],
    "audit_correlation_id": "01JYNETKINGDOMSTS000000000001"
  }
}
```
Deny response:
```json
{
  "error": "credential_denied",
  "reason_code": "prefix_not_registered_for_tenant",
  "decision_id": "dec_01JYNETKINGDOMSTS000000000002",
  "audit_correlation_id": "01JYNETKINGDOMSTS000000000002"
}
```
`credential_process` output for SDK consumers:
```json
{
  "Version": 1,
  "AccessKeyId": "ASIA...",
  "SecretAccessKey": "...",
  "SessionToken": "...",
  "Expiration": "2026-05-18T16:45:00Z"
}
```
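Converting the normalized response into the `credential_process` shape is a plain field mapping. A minimal sketch, assuming the normalized response layout shown earlier; `to_credential_process` is a hypothetical helper name.

```python
import json


def to_credential_process(response: dict) -> str:
    """Render normalized vending credentials as credential_process JSON."""
    creds = response["credentials"]
    return json.dumps({
        "Version": 1,
        "AccessKeyId": creds["access_key_id"],
        "SecretAccessKey": creds["secret_access_key"],
        "SessionToken": creds["session_token"],
        "Expiration": creds["expiration"],
    })
```

Only the credential fields are emitted; scope, lease, and decision metadata stay with the vending client and its logs, since SDKs read nothing else from this output.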
CLI shape:
```bash
netkingdom-object-creds vend \
  --protected-system object-storage:artifact-store-prod \
  --tenant tenant:coulomb \
  --bucket artifact-store-prod \
  --prefix tenant/coulomb/packages/ \
  --action s3:GetObject \
  --action s3:PutObject \
  --ttl 1800 \
  --credential-process
```
## Audit Event
Each successful or denied request should emit one canonical audit event:
```json
{
  "event_type": "object_storage_credential_vending",
  "outcome": "allowed",
  "actor": {
    "subject": "service:artifact-store",
    "issuer": "https://kc.coulomb.social",
    "tenant": "tenant:coulomb",
    "assurance": "workload"
  },
  "request": {
    "protected_system_id": "object-storage:artifact-store-prod",
    "bucket": "artifact-store-prod",
    "prefix": "tenant/coulomb/packages/",
    "actions": ["s3:GetObject", "s3:PutObject"],
    "ttl_seconds": 1800
  },
  "decision": {
    "decision_id": "dec_01JYNETKINGDOMSTS000000000001",
    "policy_package": "object-storage-artifact-store-prod@2026-05-18"
  },
  "backend": {
    "type": "minio-assume-role-with-web-identity",
    "credential_expiration": "2026-05-18T16:45:00Z",
    "openbao_lease_id": null
  }
}
```
OpenBao audit events should be correlated when OpenBao parent material,
broker config, dynamic secret engines, or delivery paths are used.
## Consumer Guidance
### artifact-store
`artifact-store` should consume temporary credentials without owning the
vending authority.
Required consumer support:
- `AWS_ACCESS_KEY_ID`;
- `AWS_SECRET_ACCESS_KEY`;
- `AWS_SESSION_TOKEN`;
- credential expiration awareness;
- refresh before expiration, preferably with jitter;
- env, file, sidecar, controller, or `credential_process` delivery.
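Refresh before expiration with jitter might be scheduled as in the sketch below. The 80% refresh point and 10% jitter window are illustrative assumptions, not values set by this document, and `next_refresh_at` is a hypothetical helper.

```python
import random
from datetime import datetime, timedelta


def next_refresh_at(issued_at: datetime, expiration: datetime,
                    refresh_fraction: float = 0.8,
                    jitter_fraction: float = 0.1,
                    rng=random.random) -> datetime:
    """Pick a refresh time between 70% and 80% of the credential lifetime.

    Jitter only moves the refresh earlier, so it never drifts toward expiry.
    """
    lifetime = (expiration - issued_at).total_seconds()
    if lifetime <= 0:
        raise ValueError("credentials are already expired")
    offset = lifetime * (refresh_fraction - jitter_fraction * rng())
    return issued_at + timedelta(seconds=offset)
```

With a 30-minute credential this schedules the refresh between 21 and 24 minutes after issuance, which spreads load when many pods received credentials at the same moment.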
The existing static bridge can remain transitional:
```bash
export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/s3-access-key
export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/s3-secret-key
```
Temporary credentials require either a session-token ref or a refresh
pattern that updates all three credential values atomically:
```bash
export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/aws-access-key-id
export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/aws-secret-access-key
export ARTIFACTSTORE_S3_SESSION_TOKEN_REF=file:/run/secrets/artifactstore/aws-session-token
export ARTIFACTSTORE_S3_CREDENTIAL_EXPIRATION_REF=file:/run/secrets/artifactstore/expiration
```
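One way to update all credential values together is to write a complete new set into a fresh directory and then repoint a symlink with an atomic rename, similar to how Kubernetes projects mounted secrets. This is a sketch for POSIX filesystems; the `current` symlink layout and `write_credentials_atomically` are assumptions and differ from the flat file paths in the refs above.

```python
import os
import tempfile


def write_credentials_atomically(base_dir: str, values: dict) -> str:
    """Write every credential file, then switch 'current' in one rename."""
    os.makedirs(base_dir, exist_ok=True)
    new_dir = tempfile.mkdtemp(prefix="creds-", dir=base_dir)  # mode 0700
    for name, value in values.items():
        if os.path.basename(name) != name:
            raise ValueError(f"file name must not contain a path: {name!r}")
        fd = os.open(os.path.join(new_dir, name),
                     os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        with os.fdopen(fd, "w") as handle:
            handle.write(value)
    tmp_link = os.path.join(base_dir, ".current.tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    # Relative target, so the link stays valid if base_dir is remounted.
    os.symlink(os.path.basename(new_dir), tmp_link)
    # rename(2) replaces the old symlink atomically on POSIX filesystems.
    os.replace(tmp_link, os.path.join(base_dir, "current"))
    return new_dir
```

A reader should resolve `current` once, for example with `os.path.realpath`, and read all files from the resolved directory so that a swap between reads cannot mix old and new values. Old directories are not cleaned up in this sketch.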
Recommended deployment patterns:
- CLI or SDK `credential_process` for developer and batch use;
- sidecar refresh process for pods that cannot call the vending API
directly;
- controller plus mounted files when platform operators need centralized
refresh and audit;
- direct vending API call only when the workload can protect its IAM
token and handle refresh safely.
### Other S3 Consumers
Consumers must support the session token. Access-key/secret-key-only
clients are limited to transitional static credentials and should not be
used for production tenant workloads.
Prohibited patterns:
- object-store root credentials in application pods;
- long-lived tenant access keys for normal workload traffic;
- bucket policy managed by application repos as the source of truth;
- storing parent R2/API credentials in tenant namespaces;
- ignoring credential expiration and retrying indefinitely with expired
credentials;
- accepting local-identity tokens in production.
## Failure Modes
| Failure | Expected behavior |
| --- | --- |
| IAM token invalid or wrong audience | Deny before policy evaluation; emit audit event |
| Tenant missing or mismatched | Deny with `tenant_scope_missing` or `tenant_mismatch` |
| Prefix not registered | Deny with `prefix_not_registered_for_tenant` |
| TTL too long | Reduce to policy maximum or deny, depending on policy |
| flex-auth or Topaz unavailable | Fail closed except for explicitly documented emergency platform workflows |
| Backend STS unavailable | Do not mint credentials; return retryable backend error |
| OpenBao unavailable | Fail if parent material or broker config requires OpenBao; otherwise continue only for backend paths that do not depend on it |
| Audit sink unavailable | Deny privileged/platform-scoped requests; allow low-risk tenant requests only if policy permits buffered audit |
| Consumer refresh fails | Stop writes before expiration; retry vending with backoff; never fall back to root credentials |
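
The dependency-related rows of this table can be read as a fail-closed gate evaluated before any credential is minted. The sketch below is one possible reading: `Health`, `vending_gate`, and the reason strings are hypothetical, and the documented emergency platform workflows are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Health:
    pdp: bool         # flex-auth and, where used, Topaz are reachable
    backend: bool     # backend STS or temporary-credential API is reachable
    openbao: bool
    audit_sink: bool


def vending_gate(health: Health, *, needs_openbao: bool, privileged: bool,
                 buffered_audit_allowed: bool) -> tuple:
    """Return (proceed, reason); any unmet dependency blocks vending."""
    if not health.pdp:
        return False, "pdp_unavailable"
    if not health.backend:
        return False, "backend_unavailable_retryable"
    if needs_openbao and not health.openbao:
        return False, "openbao_unavailable"
    if not health.audit_sink and (privileged or not buffered_audit_allowed):
        return False, "audit_sink_unavailable"
    return True, "ok"
```

OpenBao being down blocks only the paths that depend on it, while a missing audit sink blocks privileged requests unconditionally and tenant requests unless policy permits buffered audit.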
## Readiness Checks
- IAM Profile token validation test passes for key-cape or Keycloak.
- flex-auth has policy packages for platform and tenant scopes.
- Topaz policy load and health are verified where delegated PDP is used.
- Backend-specific STS or temporary credential path returns credentials
with session token and expiration.
- OpenBao parent credential access, lease metadata, and audit correlation
work where OpenBao is in the path.
- artifact-store or the consumer can refresh all credential fields before
expiration.
- Deny paths produce stable reason codes and audit records.
- Break-glass operation is documented and post-event review is required.
## References
- [AWS STS AssumeRoleWithWebIdentity](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html)
- [Ceph RGW STS](https://docs.ceph.com/en/latest/radosgw/STS/)
- [MinIO AssumeRoleWithWebIdentity](https://min.io/docs/minio/linux/developers/security-token-service/AssumeRoleWithWebIdentity.html)
- [Cloudflare R2 Temporary Credentials API](https://developers.cloudflare.com/api/resources/r2/subresources/temporary_credentials/)
- [Cloudflare R2 temporary credential example](https://developers.cloudflare.com/r2/examples/authenticate-r2-temp-credentials/)

@@ -1,7 +1,7 @@
# Platform Identity and Security Architecture
-Status: draft architecture baseline for NetKingdom/Railiance/Coulomb
-Date: 2026-05-17
+Status: implemented architecture baseline for NetKingdom/Railiance/Coulomb
+Date: 2026-05-18
## Purpose
@@ -305,6 +305,86 @@ Possible responsibilities:
This orchestration layer should build on Railiance capabilities rather
than bypassing the Railiance stack boundaries.
ADR-0007 records the current decision: keep orchestration in Railiance
playbooks for now, with NetKingdom defining the trust-state model,
readiness checks, OpenBao boundaries, and security semantics.
## flex-auth And Topaz Implications
flex-auth work must preserve the recursive boundary between platform
control-plane resources and tenant resources.
Required implications:
- CARING descriptors must include scope and tenant metadata for
tenant-scoped access, and must mark rare platform-scoped access
explicitly.
- Policy packages must distinguish `tenant:platform` policy from
tenant-local packages such as `tenant:coulomb`.
- Decision envelopes must carry subject, issuer, audience, tenant,
protected-system id, resource, action, requested TTL where relevant,
assurance evidence, obligations, deny reasons, and audit correlation
ids.
- Topaz is a delegated PDP runtime behind flex-auth. It must not become
the canonical policy model, identity provider, or platform control
plane.
- Audit and explain records must be durable enough to reconstruct why a
platform-root, secret, credential, or tenant-administration decision was
allowed or denied.
- Platform-root guardrails must deny tenant administrators the ability to
alter IAM Profile semantics, OpenBao platform mounts/auth methods,
flex-auth policy import pipelines, Topaz runtime configuration, or
platform audit retention.
OpenBao secret access and dynamic credential requests follow the same
authorization rule: identity proves the actor or workload, flex-auth
decides whether the request is permitted, and OpenBao stores, issues,
leases, audits, and revokes the secret material.
## Coulomb Tenant Onboarding Path
The first Coulomb tenant onboarding path should be repeatable before it
becomes automated:
1. Register `tenant:coulomb` as a tenant distinct from
`tenant:platform`.
2. Map Coulomb human, service, and agent principals to IAM Profile claims
with issuer, audience, subject, group, tenant, and assurance evidence.
3. Register Coulomb protected systems and resources in flex-auth with
stable protected-system ids.
4. Import tenant-scoped policy packages and CARING descriptors for
Coulomb resources.
5. Initialize the delegated PDP runtime, starting with Topaz, using only
the policy packages approved for the tenant and platform boundary.
6. Provision Coulomb workload secret paths, Kubernetes auth roles, or
delivery mechanisms in OpenBao without granting access to platform
mounts, unseal/recovery material, or global auth configuration.
7. Run audit readiness checks before admitting production traffic:
identity issuance, flex-auth decision envelope, Topaz health,
OpenBao audit event, workload enforcement event, and correlation id.
The onboarding path is complete when a Coulomb workload can authenticate,
receive a scoped authorization decision, obtain only the allowed secret or
short-lived credential, enforce the decision locally, and produce an
auditable record without receiving platform-root authority.
## Production Readiness Checks
Before the security platform is production-ready, each trust state needs
an explicit check:
| Area | Readiness check |
| --- | --- |
| MFA and identity | key-cape or Keycloak issues IAM Profile-compatible tokens; privacyIDEA or the selected MFA provider enforces required assurance for privileged actions |
| Bootstrap and recovery | age/SOPS material, emergency bundle, and break-glass credentials are present, tested, and separated from tenant administration |
| OpenBao runtime secrets | OpenBao is initialized, unsealed or auto-unsealed by the approved mechanism, backed up, audited, and using scoped auth methods and mounts |
| Secret rotation | service, database, OpenBao-issued, and break-glass rotation paths have documented blast radius and verification steps |
| flex-auth policy state | platform and tenant policy packages are versioned, reviewable, imported, and explainable |
| Topaz runtime | delegated PDP health, data freshness, policy load status, and fail-closed behavior are verified |
| Tenant onboarding | `tenant:coulomb` resources, claims, policies, OpenBao paths, and audit correlation are registered and tested |
| Audit sink | identity, flex-auth, Topaz, OpenBao, Kubernetes, and workload audit records land in durable storage with restore/drill coverage |
| Break-glass | emergency access works when normal identity is unavailable and produces a post-event review record |
## Open Questions
- Where is the durable audit log stored for platform-root decisions?