generated from coulomb/repo-seed
Add OpenBao runtime secret authority; complete NK-WP-0006/0007/0008
Refine the recursive platform security architecture to make OpenBao the
canonical runtime secret authority, with SOPS/age, K8s Secrets, and the
emergency bundle reframed as bootstrap/delivery/break-glass mechanisms.

- credential-management standard v0.2: add OpenBao runtime authority
  section, rotation rules, and prohibited patterns (OpenBao-as-PDP,
  tenant platform-root)
- platform-identity-security-architecture: mark implemented; add
  flex-auth/Topaz implications, Coulomb onboarding path, and a
  production-readiness checklist
- NK-WP-0004/0005: document bootstrap-to-OpenBao handoff boundary
- NK-WP-0006/0007: status -> done with implementation reviews; add
  recursive platform/tenant split and OpenBao broker/audit role for
  object-storage STS vending
- NK-WP-0008: status -> done; repoint corpus to infospace-bench
- new ADR-0007 (orchestration boundary), ADR-0008 (STS vending boundary),
  and the object-storage STS credential-vending architecture

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
87
docs/adr/ADR-0007-security-orchestration-boundary.md
Normal file
@@ -0,0 +1,87 @@
# ADR-0007 - Security Orchestration Boundary

**Status:** Accepted
**Date:** 2026-05-18
**Deciders:** Bernd Worsch, Codex

## Context

The recursive platform security architecture needs careful sequencing:
host trust, cluster trust, bootstrap secrets, runtime secret authority,
runtime identity, runtime authorization, tenant onboarding, and readiness
verification.

That sequencing crosses NetKingdom and Railiance ownership boundaries.
NetKingdom owns the canonical security architecture, IAM Profile,
credential/bootstrap standards, and authorization semantics. Railiance
owns deployment layering for infrastructure, clusters, platform services,
and applications. OpenBao adds an important runtime-secret authority to
the platform control plane, but it does not change those ownership
boundaries.

Creating a dedicated orchestration repo too early would risk encoding
temporary bootstrap order and accidental stack assumptions as a permanent
interface. Leaving every sequence implicit would also be risky: platform
root actions, OpenBao initialization, policy import, and tenant onboarding
must be auditable and repeatable.

## Decision

Security orchestration will stay in Railiance playbooks for now.

NetKingdom will define the trust-state model, readiness checks, policy
semantics, OpenBao boundaries, and tenant/control-plane rules. Railiance
playbooks will own the concrete deployment sequencing across
`railiance-infra`, `railiance-cluster`, `railiance-platform`, and
`railiance-apps`.

A dedicated orchestration repo is deferred until the sequencing surface is
stable enough to justify its own product boundary. If created later, it
must coordinate safe sequencing and readiness reporting; it must not own
security policy semantics or bypass Railiance stack ownership.

## Consequences

- NK-WP-0006 is implemented as architecture, standards, ADRs, and
  workplan constraints rather than a new repo.
- OpenBao bootstrap, unseal/recovery, audit, backup, and workload-secret
  delivery belong in Railiance platform playbooks, governed by
  NetKingdom standards.
- Cross-repo readiness should be reported as checks against explicit
  trust states, not as a hidden imperative script.
- A future orchestration repo needs a new ADR before creation.

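The readiness consequence above can be illustrated with a minimal Python sketch. The trust-state names, the `ReadinessCheck` type, and the report shape are assumptions made for illustration; this ADR does not define them.

```python
from dataclasses import dataclass
from enum import Enum


class TrustState(Enum):
    # Hypothetical states, loosely mirroring the sequencing list in Context.
    HOST_TRUST = "host-trust"
    CLUSTER_TRUST = "cluster-trust"
    BOOTSTRAP_SECRETS = "bootstrap-secrets"
    RUNTIME_SECRET_AUTHORITY = "runtime-secret-authority"


@dataclass(frozen=True)
class ReadinessCheck:
    state: TrustState
    name: str
    passed: bool


def readiness_report(checks: list[ReadinessCheck]) -> dict[str, dict]:
    """Group results by trust state; a state is ready only if all its checks pass."""
    report: dict[str, dict] = {}
    for check in checks:
        entry = report.setdefault(check.state.value, {"ready": True, "failed": []})
        if not check.passed:
            entry["ready"] = False
            entry["failed"].append(check.name)
    return report
```

A report of this kind is declarative: each layer contributes check results, and the consumer reads the state map instead of replaying a script.
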
## Future Repo Trigger

Revisit a dedicated orchestration repo only if at least two of these are
true:

- multiple Railiance deployments need the same security sequencing
  interface;
- readiness reporting becomes a reusable artifact consumed by operators,
  agents, or CI;
- rollback and recovery workflows need a cross-repo state machine that no
  single Railiance layer can own cleanly;
- tenant onboarding becomes a repeatable workflow spanning identity,
  flex-auth, Topaz, OpenBao, object storage, and application repos.

## Alternatives Considered

### Create A Dedicated Orchestration Repo Now

This would give sequencing a visible home, but it would probably encode
unstable details before OpenBao runtime operations, flex-auth/Topaz
policy import, and tenant onboarding have enough implementation feedback.

### Put Orchestration In NetKingdom

NetKingdom owns the security model, but it should not become the
deployment repo for every stack layer. This would blur architecture
ownership with platform deployment ownership.

### Leave Sequencing Entirely Informal

This avoids premature structure but leaves bootstrap and runtime trust
transitions too dependent on operator memory. The accepted approach keeps
the sequence explicit while leaving concrete deployment in the Railiance
stack.

82
docs/adr/ADR-0008-object-storage-sts-credential-vending.md
Normal file
@@ -0,0 +1,82 @@
# ADR-0008 - Object Storage STS Credential Vending Boundary

**Status:** Accepted
**Date:** 2026-05-18
**Deciders:** Bernd Worsch, Codex

## Context

NetKingdom needs a canonical pattern for issuing short-lived
object-storage credentials to platform and tenant workloads. The first
known consumer is `artifact-store`, but the pattern must work for future
S3-compatible consumers without making each application repo own identity,
authorization, root object-store credentials, or backend-specific STS
differences.

The backend landscape is not uniform. AWS S3, Ceph RGW, and MinIO/AIStor
can use web-identity STS-style flows. Cloudflare R2 exposes temporary
credentials through a provider API or local signing with parent access
material. OpenBao is now part of the Railiance platform stack as runtime
secret authority, but it is not an identity provider or authorization
policy engine.

## Decision

NetKingdom will define a provider-neutral credential-vending interface
backed by provider-native temporary credential mechanisms where possible.

The trust path is:

1. IAM Profile token proves the actor or workload.
2. flex-auth decides whether the actor may receive credentials for the
   requested protected system, tenant, bucket, prefix, action set, TTL,
   and assurance level.
3. The credential-vending service exchanges the approved request with
   the backend-specific temporary credential mechanism.
4. OpenBao stores parent credentials, broker configuration, lease
   metadata, and audit evidence where useful, but it does not replace
   flex-auth authorization.
5. Consumers receive normalized temporary credentials containing access
   key id, secret access key, session token, and expiration.

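One way to picture the provider-neutral interface is a small adapter protocol, sketched here in Python. The type names (`ApprovedGrant`, `TemporaryCredentials`, `BackendExchange`) and the `vend` helper are hypothetical; the ADR fixes only the four credential fields and the rule that exchange happens after approval.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Protocol


@dataclass(frozen=True)
class ApprovedGrant:
    # Scope already approved by flex-auth; field names are illustrative.
    bucket: str
    prefix: str
    actions: tuple[str, ...]
    ttl_seconds: int


@dataclass(frozen=True)
class TemporaryCredentials:
    # The four normalized fields named in step 5 of the trust path.
    access_key_id: str
    secret_access_key: str
    session_token: str
    expiration: datetime


class BackendExchange(Protocol):
    """One adapter per backend (AWS STS, Ceph RGW, MinIO, R2 broker, ...)."""

    def exchange(self, grant: ApprovedGrant, identity_token: str) -> TemporaryCredentials: ...


def vend(backends: dict[str, BackendExchange], backend_name: str,
         grant: ApprovedGrant, identity_token: str) -> TemporaryCredentials:
    """Dispatch an approved grant to one adapter; unknown backends are refused."""
    if backend_name not in backends:
        raise LookupError(f"no exchange adapter registered for {backend_name!r}")
    return backends[backend_name].exchange(grant, identity_token)
```

Under this assumption, backend differences stay inside the adapters, and callers only ever see `TemporaryCredentials`.
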
## Consequences

- `artifact-store` needs temporary credential support, especially
  `AWS_SESSION_TOKEN` and refresh behavior, before it can fully consume
  the production vending pattern.
- Backend-specific differences are isolated in the vending service, not
  leaked into application policy.
- OpenBao remains runtime secret infrastructure and audit support; it
  does not become the object-storage policy source.
- Provider-native STS is preferred when available because it gives the
  storage backend direct lease/expiration semantics.
- Cloudflare R2 requires a broker path that protects parent access
  material, most likely through OpenBao custody.

## Alternatives Considered

### Give Applications Long-Lived Access Keys

This is simple but leaves applications holding durable credentials and
pushes policy into ad hoc bucket configuration. It is acceptable only as
a transitional bridge with scoped credentials and explicit rotation.

### Put Object-Storage Policy In Keycloak Or key-cape

Identity providers can assert who the actor is and coarse groups or
roles, but they should not become the canonical source of bucket,
prefix, action, TTL, and explanation semantics.

### Use OpenBao As The Credential Vending Policy Engine

OpenBao is valuable for secret custody, broker configuration, leases,
and audit records. Making it the policy decision point would duplicate
flex-auth, blur the platform/tenant boundary, and make authorization
semantics backend-specific.

### Require One Backend Everywhere

A single backend would simplify implementation but does not match the
platform direction. Railiance and NetKingdom need a stable security
interface across AWS, self-hosted S3-compatible stores, and Cloudflare
R2-like APIs.

482
docs/object-storage-sts-credential-vending.md
Normal file
@@ -0,0 +1,482 @@
# Object Storage STS Credential Vending

Status: architecture baseline for NK-WP-0007
Date: 2026-05-18

## Purpose

This document defines the NetKingdom pattern for vending short-lived
object-storage credentials from verified identity and policy decisions.
It is provider-neutral at the NetKingdom boundary and provider-aware at
the backend exchange boundary.

The goal is to let consumers such as `artifact-store` use S3-compatible
temporary credentials without owning identity, authorization, secret
custody, or object-storage root credentials.

## Ownership Boundary

| Capability | Owner |
| --- | --- |
| IAM Profile, issuer and claim requirements | NetKingdom |
| Resource/action vocabulary and policy decision envelope | flex-auth, governed by NetKingdom architecture |
| Delegated PDP runtime | Topaz first, behind flex-auth |
| Runtime secret custody, broker configuration, audit, leases | OpenBao, deployed by Railiance platform |
| Object-storage backend configuration | Railiance platform |
| Artifact package behavior and S3 client refresh behavior | artifact-store |
| Application deployment | Railiance apps or the owning application repo |

OpenBao may store parent credentials, broker configuration, or issued
credential metadata where appropriate. It does not replace flex-auth as
the authorization decision point and must not become the object-storage
policy model.

## Core Flow

```text
Human, service, or agent principal
    |
    v
NetKingdom IAM Profile token
  key-cape lightweight mode or Keycloak expanded mode
    |
    v
credential-vending service
  verifies issuer, audience, subject, assurance, tenant
    |
    v
flex-auth decision
  tenant, protected-system, bucket, prefix, actions, TTL, obligations
    |
    v
backend exchange
  AWS STS, Ceph RGW STS, MinIO/AIStor STS, Cloudflare R2 temp API,
  or OpenBao-assisted broker path
    |
    v
temporary S3 credentials
  access key id, secret access key, session token, expiration
    |
    v
consumer
  artifact-store, SDK, CLI, sidecar, controller, or batch job
```

## Trust Boundaries

### Platform Control Plane

`tenant:platform` administers the credential-vending service, approved
issuer list, flex-auth policy import pipeline, OpenBao mounts/auth
methods, backend parent credentials, audit retention, and emergency
recovery.

### Tenant Plane

`tenant:coulomb` and later tenants may request scoped credentials for
registered tenant resources. Tenant administrators must not receive
OpenBao root tokens, object-storage root credentials, global backend STS
configuration, or platform policy import authority.

### Backend Boundary

The credential-vending service is the only component that exchanges an
approved decision for provider-native credentials. Consumers receive only
short-lived credentials scoped to the approved bucket, prefix, actions,
and TTL.

## Token And Decision Flow

1. The caller authenticates through a NetKingdom IAM Profile
   implementation.
2. The caller sends a request to the credential-vending service with a
   bearer token or a workload identity binding.
3. The service validates issuer, audience, signature, expiration,
   subject, tenant claim, and assurance evidence.
4. The service builds a flex-auth request with the protected-system id,
   resource, action set, requested TTL, tenant, actor, and context.
5. flex-auth evaluates policy through its standalone evaluator or a
   delegated PDP such as Topaz.
6. If denied, the service returns a deny envelope with a stable reason
   code and audit correlation id.
7. If allowed, the service exchanges the approved request with the
   backend or OpenBao-assisted broker path.
8. The service returns normalized temporary credentials and records
   identity, policy, backend, lease, and audit metadata.

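Steps 3 to 8 can be sketched as a single ordered function. This is a minimal Python illustration, not the service implementation: the `validate`, `decide`, and `exchange` callables stand in for token validation, flex-auth, and the backend exchange, and the `token_invalid` reason code is a placeholder that this document does not define.

```python
from typing import Callable


def handle_vend_request(
    claims: dict,
    request: dict,
    validate: Callable[[dict], bool],
    decide: Callable[[dict], dict],
    exchange: Callable[[dict], dict],
) -> dict:
    """Run steps 3-8 in order; nothing reaches the backend without an allow decision."""
    correlation_id = request.get("correlation_id")
    if not validate(claims):
        # Step 3 failed: deny before any policy evaluation.
        return {"error": "credential_denied", "reason_code": "token_invalid",
                "audit_correlation_id": correlation_id}
    # Step 4: token-derived actor and tenant override anything the caller supplied.
    decision = decide({**request, "actor": claims.get("sub"),
                       "tenant": claims.get("tenant")})
    if not decision.get("allow"):
        # Step 6: deny envelope with the policy's stable reason code.
        return {"error": "credential_denied",
                "reason_code": decision.get("reason_code"),
                "audit_correlation_id": correlation_id}
    # Step 7: only an approved decision is exchanged with the backend.
    credentials = exchange(decision)
    # Step 8: normalized credentials plus correlation metadata.
    return {"credentials": credentials, "audit_correlation_id": correlation_id}
```

The ordering is the point of the sketch: a failed validation or a deny decision returns before `exchange` is ever called.
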
## Resource Model

Every object-storage resource belongs to a protected system and tenant.

Suggested identifiers:

```text
protected_system:object-storage:artifact-store-prod
tenant:platform
tenant:coulomb

bucket:artifact-store-prod
prefix:tenant/coulomb/packages/
object:tenant/coulomb/packages/<digest>
```

The protected-system id names the storage integration boundary, not just
the backend product. For example, a MinIO tenant and an AWS bucket used
by the same application should still be distinct protected systems if
their trust, audit, or policy lifecycle differs.

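The identifier examples suggest a `tenant/<name>/<area>/` prefix layout. The Python sketch below assumes that layout, which is inferred from the examples and not stated as a rule; both helper names are hypothetical.

```python
def tenant_prefix(tenant_id: str, area: str) -> str:
    """Build a prefix such as 'tenant/coulomb/packages/' from 'tenant:coulomb'."""
    kind, _, name = tenant_id.partition(":")
    if kind != "tenant" or not name or "/" in name:
        raise ValueError(f"not a tenant identifier: {tenant_id!r}")
    return f"tenant/{name}/{area.strip('/')}/"


def key_in_prefix(object_key: str, prefix: str) -> bool:
    """True when the key sits under the prefix; the trailing slash blocks sibling matches."""
    return prefix.endswith("/") and object_key.startswith(prefix)
```

Requiring the trailing slash keeps `tenant/coulomb/packages-old/...` from matching a grant for `tenant/coulomb/packages/`.
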
## flex-auth Vocabulary

| Resource | Example | Notes |
| --- | --- | --- |
| protected system | `object-storage:artifact-store-prod` | Required in every decision |
| bucket | `bucket:artifact-store-prod` | Coarse storage boundary |
| prefix | `prefix:tenant/coulomb/packages/` | Preferred grant boundary for workloads |
| object | `object:tenant/coulomb/packages/a.tar.zst` | Use for exceptional single-object decisions |

Canonical action names:

| Action | Meaning |
| --- | --- |
| `s3:GetObject` | Read object data |
| `s3:PutObject` | Create or replace object data |
| `s3:DeleteObject` | Delete object data |
| `s3:ListBucket` | List bucket or prefix contents |
| `s3:GetObjectAttributes` | Read metadata, checksums, or object attributes |
| `s3:AbortMultipartUpload` | Abort multipart state |
| `s3:CreateMultipartUpload` | Start multipart upload |
| `s3:UploadPart` | Upload multipart chunk |
| `s3:CompleteMultipartUpload` | Complete multipart upload |

Required decision inputs:

- subject id, subject type, issuer, audience, and tenant;
- protected-system id;
- bucket and prefix or object;
- requested action set;
- requested TTL;
- assurance level and MFA evidence where privileged or destructive
  actions are requested;
- workload identity evidence for service or agent callers;
- request purpose and audit correlation id when available.

Required decision outputs:

- allow or deny;
- maximum TTL;
- permitted actions;
- permitted bucket and prefix/object scope;
- obligations such as read-only, checksum-required, write-once, or
  audit-detail-required;
- deny reason code;
- explanation/audit correlation id;
- backend exchange hint where policy deliberately restricts backend use.

TTL policy:

- default interactive TTL: 15 minutes;
- default workload TTL: 30 minutes;
- maximum normal TTL: 1 hour;
- longer TTLs require explicit policy and should not exceed backend
  limits;
- destructive or platform-scoped credentials should use shorter TTLs and
  MFA or dual-control obligations.

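The TTL defaults and ceilings can be expressed as a small clamp function. This Python sketch assumes the "reduce to the maximum" behavior; a deployment whose policy denies over-long requests would raise or deny instead. The function name, the `subject_type` values, and the parameters are illustrative.

```python
from typing import Optional

DEFAULT_INTERACTIVE_TTL = 15 * 60  # seconds
DEFAULT_WORKLOAD_TTL = 30 * 60
MAX_NORMAL_TTL = 60 * 60


def effective_ttl(requested: Optional[int], subject_type: str,
                  policy_max: int = MAX_NORMAL_TTL,
                  backend_max: Optional[int] = None) -> int:
    """Return the granted TTL in seconds: the default when unset, capped by policy and backend."""
    if requested is not None and requested <= 0:
        raise ValueError("requested TTL must be positive")
    # Assumption: "human" marks interactive callers; everything else is a workload.
    default = DEFAULT_INTERACTIVE_TTL if subject_type == "human" else DEFAULT_WORKLOAD_TTL
    ttl = default if requested is None else requested
    ceiling = policy_max if backend_max is None else min(policy_max, backend_max)
    return min(ttl, ceiling)
```

A longer TTL granted by explicit policy is modeled by passing a larger `policy_max`; the backend limit still applies.
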
## IAM Profile Requirements

Accepted issuers:

- key-cape lightweight mode for local, sandbox, and small deployments;
- Keycloak expanded mode for production and enterprise federation;
- local-identity only for development or bootstrap contexts explicitly
  marked non-production.

Required token properties:

- `iss` matches an approved NetKingdom issuer;
- `aud` targets the credential-vending service or an approved backend
  exchange audience;
- `sub` is stable for the principal;
- `exp`, `nbf`, and `iat` are present and within skew tolerance;
- `tenant` or equivalent tenant mapping is present for tenant-scoped
  requests;
- service accounts and agents are distinguishable from humans;
- assurance/MFA claims are present when policy needs them;
- groups or roles are mapped through IAM Profile semantics, not
  provider-specific bucket policy.

Local-dev restrictions:

- local issuers must only be accepted by explicitly configured dev
  vending instances;
- local issuer tokens must not be trusted by production backends;
- credentials minted from local issuers must be restricted to local or
  sandbox object stores.

Emergency principals:

- break-glass use is platform-control-plane access, not tenant access;
- emergency credentials must be short-lived where possible;
- every emergency vending event requires a post-event review record.

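The required token properties translate into a claim check along the following lines. This Python sketch assumes the signature has already been verified by a JWT library and works on the decoded claims only. The function name, the 60-second skew default, and every problem code other than `tenant_scope_missing` are assumptions.

```python
def claim_problems(claims: dict, approved_issuers: set[str], audience: str,
                   now: float, skew: float = 60.0,
                   tenant_scoped: bool = True) -> list[str]:
    """Return problems found in signature-verified claims; an empty list means acceptable."""
    problems = []
    if claims.get("iss") not in approved_issuers:
        problems.append("issuer_not_approved")
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]  # `aud` may be a string or a list
    if audience not in audiences:
        problems.append("audience_mismatch")
    if not claims.get("sub"):
        problems.append("subject_missing")
    if tenant_scoped and not claims.get("tenant"):
        problems.append("tenant_scope_missing")
    times_present = True
    for name in ("exp", "nbf", "iat"):
        if not isinstance(claims.get(name), (int, float)):
            problems.append(f"{name}_missing")
            times_present = False
    if times_present:
        if now > claims["exp"] + skew:
            problems.append("token_expired")
        if now < claims["nbf"] - skew:
            problems.append("token_not_yet_valid")
        if claims["iat"] > now + skew:
            problems.append("issued_in_future")
    return problems
```

Returning every problem, not only the first one, makes the audit record more useful when a token fails for several reasons at once.
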
## Backend Assessment

| Backend | Temporary credential path | NetKingdom stance |
| --- | --- | --- |
| AWS S3 | AWS STS `AssumeRoleWithWebIdentity` returns access key id, secret access key, session token, and expiration | Best fit for AWS-native deployments. Use IAM OIDC provider and role trust policies, with flex-auth deciding before exchange. |
| Ceph RGW | RGW implements a subset of STS, including `AssumeRoleWithWebIdentity` for OIDC-backed temporary credentials | Good fit for self-hosted S3-compatible storage when RGW IAM/STS maturity is acceptable for the deployment. |
| MinIO/AIStor | MinIO STS supports `AssumeRoleWithWebIdentity` with OIDC JWTs and AWS-like response semantics | Strong fit for lightweight/self-hosted deployments if session-token support is wired through consumers. |
| Cloudflare R2 | R2 temporary credentials are created through the R2 Temporary Credentials API or local signing with parent access material | Use a backend-specific broker. Store parent material in OpenBao; do not expose parent credentials to workloads. |
| OpenBao | Can store parent credentials, broker dynamic material, record leases, and audit secret access | Runtime secret infrastructure and audit point, not the canonical object-storage authorization engine. |

Decision summary: prefer provider-native temporary credentials when the
backend has a mature STS or temporary-credentials API. Keep the
NetKingdom interface stable and normalize backend differences in the
credential-vending service.

## OpenBao Role

OpenBao participates in credential vending only after flex-auth approval.
Allowed OpenBao responsibilities:

- store backend parent credentials for Cloudflare R2 or other APIs that
  need privileged signing material;
- store broker configuration and backend endpoint metadata;
- issue or lease dynamic credentials where a supported backend plugin or
  controlled broker path exists;
- provide audit records for parent credential access and broker
  operations;
- deliver credential-vending service configuration through Kubernetes
  auth, CSI, or External Secrets Operator.

Prohibited OpenBao responsibilities:

- deciding whether a tenant may access a bucket or prefix;
- storing tenant policy as the canonical object-storage authorization
  model;
- exposing platform mounts, root tokens, unseal/recovery material, or
  parent credentials to tenants;
- bypassing flex-auth because a backend secret path is readable.

## Interface Prototype

HTTP request:

```http
POST /v1/object-storage/credentials
Authorization: Bearer <iam-profile-token>
Content-Type: application/json
```

```json
{
  "protected_system_id": "object-storage:artifact-store-prod",
  "tenant_id": "tenant:coulomb",
  "bucket": "artifact-store-prod",
  "prefix": "tenant/coulomb/packages/",
  "actions": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
  "ttl_seconds": 1800,
  "purpose": "artifact-store package upload",
  "correlation_id": "01JYNETKINGDOMSTS000000000001"
}
```

Normalized response:

```json
{
  "credentials": {
    "access_key_id": "AKIA...",
    "secret_access_key": "redacted-by-client-logging",
    "session_token": "token...",
    "expiration": "2026-05-18T16:45:00Z"
  },
  "scope": {
    "protected_system_id": "object-storage:artifact-store-prod",
    "tenant_id": "tenant:coulomb",
    "bucket": "artifact-store-prod",
    "prefix": "tenant/coulomb/packages/",
    "actions": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"]
  },
  "lease": {
    "ttl_seconds": 1800,
    "renewable": false,
    "backend": "minio-assume-role-with-web-identity",
    "openbao_lease_id": null
  },
  "decision": {
    "decision_id": "dec_01JYNETKINGDOMSTS000000000001",
    "policy_package": "object-storage-artifact-store-prod@2026-05-18",
    "obligations": ["checksum-required"],
    "audit_correlation_id": "01JYNETKINGDOMSTS000000000001"
  }
}
```

Deny response:

```json
{
  "error": "credential_denied",
  "reason_code": "prefix_not_registered_for_tenant",
  "decision_id": "dec_01JYNETKINGDOMSTS000000000002",
  "audit_correlation_id": "01JYNETKINGDOMSTS000000000002"
}
```

`credential_process` output for SDK consumers:

```json
{
  "Version": 1,
  "AccessKeyId": "AKIA...",
  "SecretAccessKey": "...",
  "SessionToken": "...",
  "Expiration": "2026-05-18T16:45:00Z"
}
```

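The mapping from the normalized response to the `credential_process` document is mechanical. A minimal Python sketch, assuming the response shape shown in this section (the function name is hypothetical):

```python
import json


def to_credential_process(response: dict) -> str:
    """Map the normalized vending response onto the credential_process JSON document."""
    creds = response["credentials"]
    return json.dumps({
        "Version": 1,
        "AccessKeyId": creds["access_key_id"],
        "SecretAccessKey": creds["secret_access_key"],
        "SessionToken": creds["session_token"],
        "Expiration": creds["expiration"],
    })
```

Only the `credentials` object is forwarded; scope, lease, and decision metadata stay with the vending client and its audit trail.
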
CLI shape:

```bash
netkingdom-object-creds vend \
  --protected-system object-storage:artifact-store-prod \
  --tenant tenant:coulomb \
  --bucket artifact-store-prod \
  --prefix tenant/coulomb/packages/ \
  --action s3:GetObject \
  --action s3:PutObject \
  --ttl 1800 \
  --credential-process
```

## Audit Event

Each successful or denied request should emit one canonical audit event:

```json
{
  "event_type": "object_storage_credential_vending",
  "outcome": "allowed",
  "actor": {
    "subject": "service:artifact-store",
    "issuer": "https://kc.coulomb.social",
    "tenant": "tenant:coulomb",
    "assurance": "workload"
  },
  "request": {
    "protected_system_id": "object-storage:artifact-store-prod",
    "bucket": "artifact-store-prod",
    "prefix": "tenant/coulomb/packages/",
    "actions": ["s3:GetObject", "s3:PutObject"],
    "ttl_seconds": 1800
  },
  "decision": {
    "decision_id": "dec_01JYNETKINGDOMSTS000000000001",
    "policy_package": "object-storage-artifact-store-prod@2026-05-18"
  },
  "backend": {
    "type": "minio-assume-role-with-web-identity",
    "credential_expiration": "2026-05-18T16:45:00Z",
    "openbao_lease_id": null
  }
}
```

OpenBao audit events should be correlated when OpenBao parent material,
broker config, dynamic secret engines, or delivery paths are used.

## Consumer Guidance

### artifact-store

`artifact-store` should consume temporary credentials without owning the
vending authority.

Required consumer support:

- `AWS_ACCESS_KEY_ID`;
- `AWS_SECRET_ACCESS_KEY`;
- `AWS_SESSION_TOKEN`;
- credential expiration awareness;
- refresh before expiration, preferably with jitter;
- env, file, sidecar, controller, or `credential_process` delivery.

The existing static bridge can remain transitional:

```bash
export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/s3-access-key
export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/s3-secret-key
```

Temporary credentials require either a session-token ref or a refresh
pattern that updates all three credential values atomically:

```bash
export ARTIFACTSTORE_S3_ACCESS_KEY_REF=file:/run/secrets/artifactstore/aws-access-key-id
export ARTIFACTSTORE_S3_SECRET_KEY_REF=file:/run/secrets/artifactstore/aws-secret-access-key
export ARTIFACTSTORE_S3_SESSION_TOKEN_REF=file:/run/secrets/artifactstore/aws-session-token
export ARTIFACTSTORE_S3_CREDENTIAL_EXPIRATION_REF=file:/run/secrets/artifactstore/expiration
```

Recommended deployment patterns:

- CLI or SDK `credential_process` for developer and batch use;
- sidecar refresh process for pods that cannot call the vending API
  directly;
- controller plus mounted files when platform operators need centralized
  refresh and audit;
- direct vending API call only when the workload can protect its IAM
  token and handle refresh safely.

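Refresh with jitter and an atomic update can be sketched as follows. This Python example is an illustration under assumptions, not artifact-store behavior: it refreshes at 70 to 80 percent of the credential lifetime, and it writes one bundle file instead of the separate per-value files shown above, so that a single rename swaps every field at once. Both function names are hypothetical.

```python
import json
import os
import random
import tempfile
from datetime import datetime
from typing import Optional


def next_refresh_at(issued_at: datetime, expiration: datetime,
                    fraction: float = 0.8, jitter: float = 0.1,
                    rng: Optional[random.Random] = None) -> datetime:
    """Pick a refresh time between (fraction - jitter) and fraction of the lifetime."""
    rng = rng or random.Random()
    lifetime = expiration - issued_at
    return issued_at + lifetime * (fraction - rng.uniform(0.0, jitter))


def write_bundle_atomically(path: str, bundle: dict) -> None:
    """Write every credential field to one file, then swap it in with a single rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)  # created with mode 0600
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(bundle, handle)
        os.replace(tmp_path, path)  # readers see the old or the new bundle, never a mix
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Jitter spreads refresh calls from many pods so they do not hit the vending service at the same instant. The temporary file is created in the target directory because `os.replace` is only atomic within one filesystem.
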
### Other S3 Consumers

Consumers must support the session token. Access-key/secret-key-only
clients are limited to transitional static credentials and should not be
used for production tenant workloads.

Prohibited patterns:

- object-store root credentials in application pods;
- long-lived tenant access keys for normal workload traffic;
- bucket policy managed by application repos as the source of truth;
- storing parent R2/API credentials in tenant namespaces;
- ignoring credential expiration and retrying indefinitely with expired
  credentials;
- accepting local-identity tokens in production.

## Failure Modes

| Failure | Expected behavior |
| --- | --- |
| IAM token invalid or wrong audience | Deny before policy evaluation; emit audit event |
| Tenant missing or mismatched | Deny with `tenant_scope_missing` or `tenant_mismatch` |
| Prefix not registered | Deny with `prefix_not_registered_for_tenant` |
| TTL too long | Reduce to policy maximum or deny, depending on policy |
| flex-auth or Topaz unavailable | Fail closed except for explicitly documented emergency platform workflows |
| Backend STS unavailable | Do not mint credentials; return retryable backend error |
| OpenBao unavailable | Fail if parent material or broker config requires OpenBao; otherwise continue only for backend paths that do not depend on it |
| Audit sink unavailable | Deny privileged/platform-scoped requests; allow low-risk tenant requests only if policy permits buffered audit |
| Consumer refresh fails | Stop writes before expiration; retry vending with backoff; never fall back to root credentials |

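The dependency rows of the table (flex-auth/Topaz, OpenBao, audit sink) reduce to a fail-closed gate. The Python sketch below is one reading of those rows. The function and parameter names are assumptions, and the documented emergency platform workflows are deliberately out of scope.

```python
def may_proceed(pdp_available: bool, audit_available: bool,
                openbao_available: bool, needs_openbao: bool,
                platform_scoped: bool, buffered_audit_permitted: bool) -> bool:
    """Decide whether a request may continue toward backend exchange."""
    if not pdp_available:
        return False  # flex-auth or Topaz unavailable: fail closed
    if needs_openbao and not openbao_available:
        return False  # parent material or broker config cannot be read
    if audit_available:
        return True
    # Audit sink down: only low-risk tenant requests, and only with buffered audit.
    return (not platform_scoped) and buffered_audit_permitted
```

Every branch that lacks an explicit allowance returns `False`, so a dependency or flag added later defaults to denial.
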
## Readiness Checks

- IAM Profile token validation test passes for key-cape or Keycloak.
- flex-auth has policy packages for platform and tenant scopes.
- Topaz policy load and health are verified where delegated PDP is used.
- Backend-specific STS or temporary credential path returns credentials
  with session token and expiration.
- OpenBao parent credential access, lease metadata, and audit correlation
  work where OpenBao is in the path.
- artifact-store or the consumer can refresh all credential fields before
  expiration.
- Deny paths produce stable reason codes and audit records.
- Break-glass operation is documented and post-event review is required.

## References

- [AWS STS AssumeRoleWithWebIdentity](https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRoleWithWebIdentity.html)
- [Ceph RGW STS](https://docs.ceph.com/en/latest/radosgw/STS/)
- [MinIO AssumeRoleWithWebIdentity](https://min.io/docs/minio/linux/developers/security-token-service/AssumeRoleWithWebIdentity.html)
- [Cloudflare R2 Temporary Credentials API](https://developers.cloudflare.com/api/resources/r2/subresources/temporary_credentials/)
- [Cloudflare R2 temporary credential example](https://developers.cloudflare.com/r2/examples/authenticate-r2-temp-credentials/)

@@ -1,7 +1,7 @@
# Platform Identity and Security Architecture

Status: draft architecture baseline for NetKingdom/Railiance/Coulomb
Date: 2026-05-17
Status: implemented architecture baseline for NetKingdom/Railiance/Coulomb
Date: 2026-05-18

## Purpose

@@ -305,6 +305,86 @@ Possible responsibilities:
This orchestration layer should build on Railiance capabilities rather
than bypassing the Railiance stack boundaries.

ADR-0007 records the current decision: keep orchestration in Railiance
playbooks for now, with NetKingdom defining the trust-state model,
readiness checks, OpenBao boundaries, and security semantics.

## flex-auth And Topaz Implications

flex-auth work must preserve the recursive boundary between platform
control-plane resources and tenant resources.

Required implications:

- CARING descriptors must include scope and tenant metadata for
  tenant-scoped access, and must mark rare platform-scoped access
  explicitly.
- Policy packages must distinguish `tenant:platform` policy from
  tenant-local packages such as `tenant:coulomb`.
- Decision envelopes must carry subject, issuer, audience, tenant,
  protected-system id, resource, action, requested TTL where relevant,
  assurance evidence, obligations, deny reasons, and audit correlation
  ids.
- Topaz is a delegated PDP runtime behind flex-auth. It must not become
  the canonical policy model, identity provider, or platform control
  plane.
- Audit and explain records must be durable enough to reconstruct why a
  platform-root, secret, credential, or tenant-administration decision was
  allowed or denied.
- Platform-root guardrails must deny tenant administrators the ability to
  alter IAM Profile semantics, OpenBao platform mounts/auth methods,
  flex-auth policy import pipelines, Topaz runtime configuration, or
  platform audit retention.

OpenBao secret access and dynamic credential requests follow the same
authorization rule: identity proves the actor or workload, flex-auth
decides whether the request is permitted, and OpenBao stores, issues,
leases, audits, and revokes the secret material.

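As an illustration of the decision-envelope and platform-root guardrail requirements above, the following is a minimal Python sketch, not the flex-auth implementation. The field names, the resource labels in `PLATFORM_ROOT_RESOURCES`, and the functions `guardrail_deny_reasons` and `evaluate` are hypothetical and chosen only to mirror the wording of the bullets. The guardrail is the only rule modelled here; evaluation of tenant policy packages by the delegated PDP is left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PLATFORM_TENANT = "tenant:platform"

# Hypothetical labels for the platform-root targets named in the guardrail bullet.
PLATFORM_ROOT_RESOURCES = frozenset({
    "iam-profile-semantics",
    "openbao-platform-mounts",
    "openbao-auth-methods",
    "flex-auth-policy-import",
    "topaz-runtime-config",
    "platform-audit-retention",
})


@dataclass(frozen=True)
class DecisionEnvelope:
    """Carries the fields listed in the decision-envelope bullet (names assumed)."""
    subject: str
    issuer: str
    audience: str
    tenant: str
    protected_system_id: str
    resource: str
    action: str
    correlation_id: str
    allowed: bool
    requested_ttl_seconds: Optional[int] = None
    assurance_evidence: Tuple[str, ...] = ()
    obligations: Tuple[str, ...] = ()
    deny_reasons: Tuple[str, ...] = ()


def guardrail_deny_reasons(tenant: str, resource: str) -> Tuple[str, ...]:
    """Deny any non-platform tenant that targets a platform-root resource."""
    if tenant != PLATFORM_TENANT and resource in PLATFORM_ROOT_RESOURCES:
        return (f"{resource} is platform-root and cannot be changed from {tenant}",)
    return ()


def evaluate(*, subject, issuer, audience, tenant, protected_system_id,
             resource, action, correlation_id,
             requested_ttl_seconds=None) -> DecisionEnvelope:
    # Only the guardrail is applied in this sketch; real policy evaluation
    # would add further deny reasons and obligations.
    reasons = guardrail_deny_reasons(tenant, resource)
    return DecisionEnvelope(
        subject=subject, issuer=issuer, audience=audience, tenant=tenant,
        protected_system_id=protected_system_id, resource=resource,
        action=action, correlation_id=correlation_id,
        allowed=not reasons,
        requested_ttl_seconds=requested_ttl_seconds,
        deny_reasons=reasons,
    )
```

In this sketch the envelope always records the deny reasons and the correlation id, so an audit sink could later reconstruct why a request was refused.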
## Coulomb Tenant Onboarding Path

The first Coulomb tenant onboarding path should be repeatable before it
becomes automated:

1. Register `tenant:coulomb` as a tenant distinct from
   `tenant:platform`.
2. Map Coulomb human, service, and agent principals to IAM Profile claims
   with issuer, audience, subject, group, tenant, and assurance evidence.
3. Register Coulomb protected systems and resources in flex-auth with
   stable protected-system ids.
4. Import tenant-scoped policy packages and CARING descriptors for
   Coulomb resources.
5. Initialize the delegated PDP runtime, starting with Topaz, using only
   the policy packages approved for the tenant and platform boundary.
6. Provision Coulomb workload secret paths, Kubernetes auth roles, or
   delivery mechanisms in OpenBao without granting access to platform
   mounts, unseal/recovery material, or global auth configuration.
7. Run audit readiness checks before admitting production traffic:
   identity issuance, flex-auth decision envelope, Topaz health,
   OpenBao audit event, workload enforcement event, and correlation id.

The onboarding path is complete when a Coulomb workload can authenticate,
receive a scoped authorization decision, obtain only the allowed secret or
short-lived credential, enforce the decision locally, and produce an
auditable record without receiving platform-root authority.

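The audit readiness check in step 7 could be sketched as a correlation test over collected audit events. This is a minimal illustration under stated assumptions: the event shape (dicts with `source` and `correlation_id` keys), the source labels in `REQUIRED_SOURCES`, and the function `audit_readiness_gaps` are hypothetical, and Topaz health is treated as an ordinary event record for simplicity.

```python
from typing import Iterable, List, Mapping

# Hypothetical source labels, one per item in step 7 of the onboarding path.
REQUIRED_SOURCES = ("identity", "flex-auth", "topaz", "openbao", "workload")


def audit_readiness_gaps(events: Iterable[Mapping[str, str]],
                         correlation_id: str) -> List[str]:
    """Return the required sources with no event carrying correlation_id."""
    seen = {
        event["source"]
        for event in events
        if event.get("correlation_id") == correlation_id and "source" in event
    }
    # Preserve the declared order so the report is stable.
    return [source for source in REQUIRED_SOURCES if source not in seen]


if __name__ == "__main__":
    sample = [
        {"source": "identity", "correlation_id": "req-1"},
        {"source": "flex-auth", "correlation_id": "req-1"},
        {"source": "topaz", "correlation_id": "req-1"},
        {"source": "workload", "correlation_id": "req-1"},
        # An OpenBao event exists, but for a different request.
        {"source": "openbao", "correlation_id": "req-2"},
    ]
    print(audit_readiness_gaps(sample, "req-1"))
```

An empty result means every stage emitted a record for the same request; any non-empty result names the stages that still need wiring before production traffic is admitted.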
## Production Readiness Checks

Before the security platform is production-ready, each trust state needs
an explicit check:

| Area | Readiness check |
| --- | --- |
| MFA and identity | key-cape or Keycloak issues IAM Profile-compatible tokens; privacyIDEA or the selected MFA provider enforces required assurance for privileged actions |
| Bootstrap and recovery | age/SOPS material, emergency bundle, and break-glass credentials are present, tested, and separated from tenant administration |
| OpenBao runtime secrets | OpenBao is initialized, unsealed or auto-unsealed by the approved mechanism, backed up, audited, and using scoped auth methods and mounts |
| Secret rotation | service, database, OpenBao-issued, and break-glass rotation paths have documented blast radius and verification steps |
| flex-auth policy state | platform and tenant policy packages are versioned, reviewable, imported, and explainable |
| Topaz runtime | delegated PDP health, data freshness, policy load status, and fail-closed behavior are verified |
| Tenant onboarding | `tenant:coulomb` resources, claims, policies, OpenBao paths, and audit correlation are registered and tested |
| Audit sink | identity, flex-auth, Topaz, OpenBao, Kubernetes, and workload audit records land in durable storage with restore/drill coverage |
| Break-glass | emergency access works when normal identity is unavailable and produces a post-event review record |

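One way to read the table above is as a fail-closed gate: the platform counts as production-ready only when every area has an explicit passing result. The sketch below assumes that reading. The function `production_ready` and the shape of `results` (area name mapped to a boolean) are hypothetical; how each individual check is actually performed is not specified here.

```python
from typing import List, Mapping, Tuple

# Area names copied from the readiness table.
READINESS_AREAS = (
    "MFA and identity",
    "Bootstrap and recovery",
    "OpenBao runtime secrets",
    "Secret rotation",
    "flex-auth policy state",
    "Topaz runtime",
    "Tenant onboarding",
    "Audit sink",
    "Break-glass",
)


def production_ready(results: Mapping[str, bool]) -> Tuple[bool, List[str]]:
    """Fail closed: an area that is missing or not exactly True blocks readiness."""
    failing = [area for area in READINESS_AREAS if results.get(area) is not True]
    return (not failing, failing)
```

Treating a missing result the same as a failed one keeps an unchecked trust state from being mistaken for a verified one.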
## Open Questions

- Where is the durable audit log stored for platform-root decisions?