Add credential-change delegated applier flow

2026-07-01 20:07:26 +02:00
parent c626bfcf15
commit a95236d2e5
21 changed files with 2705 additions and 119 deletions

@@ -4,18 +4,19 @@ type: workplan
title: "Workload KV Access Lanes for ops-warden Fetch"
domain: financials
repo: railiance-platform
status: active
status: finished
owner: codex
topic_slug: railiance
planning_priority: high
planning_order: 6
created: "2026-06-27"
updated: "2026-06-28"
updated: "2026-06-29"
depends_on_workplans:
- RAIL-PL-WP-0002
- RAILIANCE-WP-0004
related_state_hub_messages:
- "551031d1-335e-4db8-9535-820fea52d0a3"
- "f76d3a9e-a98f-4081-885d-b79d94312699"
state_hub_workstream_id: "96c8a93d-7a5a-4fa9-8f7b-865119551da3"
---
@@ -268,7 +269,7 @@ groups bound-claim mismatch. `platform-root` was restored to the
```task
id: RAILIANCE-WP-0006-T06
status: progress
status: done
priority: high
state_hub_task_id: "8e84ec19-01db-4baf-a532-de87e51d4994"
```
@@ -301,6 +302,16 @@ catalog. Keep activation pending until caller verification and catalog update.
to confirm that its dedicated `whynot-design-npm-publish` catalog selector
resolves through the caller-scoped lane.
**2026-06-29:** ops-warden confirmed in State Hub message
`f76d3a9e-a98f-4081-885d-b79d94312699` that catalog selector
`whynot-design-npm-publish` is `status: active`, `resolvable: true`, and wired
to the owner-confirmed lane:
`platform/workloads/coulomb/whynot-design/npm-publish`, field
`NPM_AUTH_TOKEN`, OIDC role `whynot-design-workload-kv-read`, and policy
`workload-kv-read-whynot-design-npm-publish`. ops-warden also confirmed it
notified whynot-design with `warden access whynot-design-npm-publish --exec -- npm publish`,
and that the sibling lanes remain draft for separate planning.
## T07 - Decide whether to batch sibling workload-KV requests
```task
@@ -329,6 +340,13 @@ serviced first. The later ops-warden batch follow-up is now represented as
proposed CCRs in `RAILIANCE-WP-0007`, still unapproved and unresolvable until
human review and verification.
**2026-06-29:** Reviewed the sibling lane suggestions against `INTENT.md`.
Created follow-up workplans `RAILIANCE-WP-0009` for the issue-core runtime
ingestion credential lane and `RAILIANCE-WP-0010` for the llm-connect
OpenRouter provider key lane. Both plans keep this repo's scope limited to
shared platform secret custody, least-privilege OpenBao/External Secrets
delivery, verification, and ops-warden front-door handoff.
## Exit Criteria
- The whynot-design npm publish token has a concrete OpenBao KV path, field,

@@ -4,13 +4,13 @@ type: workplan
title: "Credential Change Proposal Review Workflow"
domain: financials
repo: railiance-platform
status: active
status: finished
owner: codex
topic_slug: railiance
planning_priority: high
planning_order: 7
created: "2026-06-27"
updated: "2026-06-28"
updated: "2026-06-30"
depends_on_workplans:
- RAIL-PL-WP-0002
- RAILIANCE-WP-0005
@@ -156,7 +156,7 @@ lives in `tests/test_credential_change.py`.
```task
id: RAILIANCE-WP-0007-T04
status: progress
status: done
priority: high
state_hub_task_id: "1b2e7752-815c-46f8-a2e2-212e8d04da80"
```
@@ -184,11 +184,17 @@ from reviewed plan to the interactive live applier.
generated role payloads after live OpenBao rejected an OIDC role without
callbacks. Unit coverage now checks the generated whynot-design role payload.
**2026-06-30:** Added source-artifact diff rendering to `plan` and delegated
`applier-dry-run` output. The generated plan now reports whether the checked-in
policy artifact matches the CCR-generated HCL and shows a unified diff when it
does not. Approved-only `apply-plan`/`operator-commands` remain gated by CCR
status and confirmed auth binding.
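The match-or-diff check described in this note can be sketched with Python's standard `difflib`. This is a minimal illustration, not the code in `scripts/credential-change.py`: the function name `render_policy_diff` and the `checked-in/` and `generated/` labels are assumptions made for the example.

```python
import difflib


def render_policy_diff(artifact_text: str, generated_text: str,
                       name: str = "policy.hcl") -> str:
    """Return '' when the checked-in artifact matches the generated HCL,
    otherwise a unified diff from the artifact to the generated body."""
    if artifact_text == generated_text:
        return ""
    diff = difflib.unified_diff(
        artifact_text.splitlines(keepends=True),
        generated_text.splitlines(keepends=True),
        fromfile=f"checked-in/{name}",   # hypothetical label
        tofile=f"generated/{name}",      # hypothetical label
    )
    return "".join(diff)
```

An empty result can then be reported as "artifact matches", while any non-empty result is printed verbatim in the plan output.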
## T05 - Add chat/CLI approval commands
```task
id: RAILIANCE-WP-0007-T05
status: progress
status: done
priority: high
state_hub_task_id: "e6d4d2d1-1881-4db7-92f8-05e3fdb846ae"
```
@@ -215,11 +221,16 @@ State Hub decision-event emission and tighter chat integration. Created a
State Hub decision for `CCR-2026-0001` and added `sync-decision` so resolved
State Hub decisions can update the file-backed CCR status.
**2026-06-30:** Added optional `--record-state-hub` emission for approve, deny,
and needs-changes commands. Review comments are checked for known secret markers
before being written, and the State Hub progress event records only non-secret
CCR id/path/policy/field/auth-role metadata plus the reviewer comment.
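A secret-marker check of this kind might look like the sketch below. The marker list is purely hypothetical; the patterns actually used by the review commands are not shown in this diff, and `find_secret_markers` / `ensure_comment_is_safe` are illustrative names.

```python
import re

# Hypothetical markers: the real list lives in the script, not in this diff.
SECRET_MARKERS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "credential assignment": re.compile(
        r"(?i)\b(?:token|password|secret|api[_-]?key)\s*[:=]\s*\S+"
    ),
}


def find_secret_markers(comment: str) -> list[str]:
    """Return the names of all markers that appear in the comment."""
    return [name for name, pattern in SECRET_MARKERS.items()
            if pattern.search(comment)]


def ensure_comment_is_safe(comment: str) -> str:
    """Raise before anything is written if the comment looks secret-bearing."""
    hits = find_secret_markers(comment)
    if hits:
        raise ValueError("review comment rejected: " + ", ".join(hits))
    return comment
```

Running the check before both the file write and the State Hub emission keeps a rejected comment out of every sink.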
## T06 - Build an interactive runbook for apply and verify
```task
id: RAILIANCE-WP-0007-T06
status: todo
status: done
priority: high
state_hub_task_id: "3c3fc38c-afa4-4367-b3e6-ba4b286ced30"
```
@@ -235,11 +246,23 @@ Acceptance:
- Positive and negative verification steps are guided.
- Non-secret evidence is recorded automatically.
**2026-06-30:** Added `scripts/credential-change.py runbook <CCR>` and Make
target `credential-change-runbook` to render the attended operator checklist,
final confirmation phrase, metadata apply guidance, secret custody instructions,
positive/negative verification steps, activation conditions, and evidence
commands. `runbook --execute-metadata` is opt-in, requires the exact `APPLY
<CCR-ID>` confirmation phrase, uses the local `bao` CLI with ambient approved
operator authority, writes only policy/auth metadata, and records a non-secret
`metadata_apply` evidence entry. Added `record-evidence` plus Make target
`credential-change-record-evidence` so operators can append apply, secret
provisioning, verification, and activation evidence to the CCR and optionally
State Hub without storing secret values.
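The exact-phrase gate can be illustrated as follows. This is a sketch under stated assumptions: the helper names are hypothetical, and the real command may normalise input differently. The `verb` parameter is included only to show that the same check would cover a differently worded phrase.

```python
def expected_phrase(ccr_id: str, verb: str = "APPLY") -> str:
    """Build the phrase the operator must type, e.g. 'APPLY CCR-2026-0001'."""
    return f"{verb} {ccr_id}"


def is_exact_confirmation(typed: str, ccr_id: str, verb: str = "APPLY") -> bool:
    """Accept only a byte-for-byte match; no case folding or trimming."""
    return typed == expected_phrase(ccr_id, verb)
```

Refusing to trim or lowercase the input is deliberate in this sketch: a near miss should stop the run rather than be interpreted generously.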
## T07 - Pilot with whynot-design and ops-warden
```task
id: RAILIANCE-WP-0007-T07
status: progress
status: done
priority: high
state_hub_task_id: "07a7d8bf-5528-41c8-a791-d6ccd0466a33"
```
@@ -302,11 +325,15 @@ groups bound-claim mismatch, `platform-root` membership was restored afterward,
and `CCR-2026-0001` is now active/ready/resolvable. ops-warden catalog
confirmation remains the external closeout step.
**2026-06-30:** Closed the pilot task based on the active/ready/resolvable CCR
state and prior ops-warden catalog confirmation that the selector is active and
resolvable. The remaining lifecycle work is now tracked separately in T08.
## T08 - Add deactivation, rotation, and compromise flows
```task
id: RAILIANCE-WP-0007-T08
status: todo
status: done
priority: medium
state_hub_task_id: "23d6ef9d-8dbc-4468-b486-5ec8ada71130"
```
@@ -322,11 +349,22 @@ Acceptance:
- Deactivation disables the relevant access front door and auth/policy path.
- Compromise flow records blast-radius notes and required follow-up tasks.
**2026-06-30:** Added `lifecycle-plan`, `lifecycle-event`, and
`import-inventory` commands plus Make targets. Lifecycle plans render
deactivation, rotation, and compromise guidance, including access-front-door
state changes and OpenBao metadata disable commands for deactivation or
compromise. Lifecycle events update CCR status/front-door readiness, append
non-secret lifecycle evidence, and optionally post State Hub progress.
Compromise events accept non-secret blast-radius and follow-up references.
`import-inventory` can create a CCR-backed inventory file and matching read
policy artifact for an existing lane without asking for or storing secret
values.
## T09 - Add decision templates and guided review actions
```task
id: RAILIANCE-WP-0007-T09
status: todo
status: done
priority: high
state_hub_task_id: "c436fd8b-cd82-4600-81b0-87ec069d7ae6"
```
@@ -348,6 +386,12 @@ Acceptance:
- Future UI work can replace prefix parsing with structured decision outcomes
without changing the CCR audit trail.
**2026-06-30:** Added `scripts/credential-change.py decision-templates <CCR>`
and Make target `credential-change-decision-templates`. The generated templates
include accepted prefixes, CCR id, KV path, policy, auth-role path, and the
linked State Hub decision. Ambiguous State Hub rationale text now fails with the
valid templates in the error message.
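The fail-with-templates behaviour might be sketched as below. The prefix strings are assumptions chosen for illustration (the accepted prefixes are generated by `decision-templates` and are not listed in this diff); only the approve, deny, and needs-changes outcomes come from the workplan.

```python
# Hypothetical prefixes mapped to the three review outcomes.
DECISION_PREFIXES = {
    "APPROVE:": "approve",
    "DENY:": "deny",
    "NEEDS-CHANGES:": "needs-changes",
}


def parse_decision(rationale: str) -> str:
    """Map a rationale to an outcome, or fail with the valid templates."""
    text = rationale.strip().upper()
    matches = [outcome for prefix, outcome in DECISION_PREFIXES.items()
               if text.startswith(prefix)]
    if len(matches) != 1:
        valid = ", ".join(f"'{prefix} <comment>'" for prefix in DECISION_PREFIXES)
        raise ValueError(f"ambiguous decision rationale; use one of: {valid}")
    return matches[0]
```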
## Exit Criteria
- A human can review and approve or deny a credential/security change without

@@ -10,7 +10,7 @@ topic_slug: railiance
planning_priority: high
planning_order: 8
created: "2026-06-28"
updated: "2026-06-28"
updated: "2026-06-30"
depends_on_workplans:
- RAIL-PL-WP-0002
- RAILIANCE-WP-0005
@@ -114,7 +114,7 @@ and the local applier script.
```task
id: RAILIANCE-WP-0008-T01
status: todo
status: done
priority: high
state_hub_task_id: "d19fdfc5-addb-4813-8086-3aca2e948cea"
```
@@ -129,11 +129,20 @@ Acceptance:
- The proposal covers both workload KV read lanes and credential broker issuer
policies.
**2026-06-29:** Added `docs/openbao-approved-automation-delegation.md` and
`openbao/policies/credential-change-prod-applier.hcl`. The document defines
build/development, test/staging, and production boundaries, the allowed
production metadata mutation surface, denied secret/admin paths, and required
non-secret evidence. The production policy candidate allows only reviewed
metadata writes for workload KV read policies, credential-broker issuer
policies, approved auth-role prefixes, and self capability checks; it does not
grant secret value reads or writes.
## T02 - Implement a CCR-aware applier dry-run
```task
id: RAILIANCE-WP-0008-T02
status: todo
status: done
priority: high
state_hub_task_id: "2613f40d-fbd9-44f3-a864-85ec1d54e8f7"
```
@@ -149,11 +158,22 @@ Acceptance:
- Dry-run refuses attempts to create `root`, `platform-admin`, wildcard, or
unrelated policy names.
**2026-06-29:** Added `scripts/credential-change.py applier-dry-run <CCR>` and
Make target `credential-change-applier-dry-run`. The dry-run validates the CCR,
requires approved/applied/verified/active status, requires confirmed auth
bindings, verifies the OpenBao mount/path/policy/role stay inside the delegated
metadata surface, compares the policy artifact to the generated CCR policy body,
and renders only policy/auth-role mutations. It explicitly leaves secret value
writes, secret reads, and front-door activation out of scope. Unit tests cover
the active whynot-design CCR success path, unapproved CCR refusal, and rejection
of `platform-admin`/out-of-scope mount and path attempts. `make
credential-tests` passed with 28 tests.
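The status, binding, and policy-name guardrails can be sketched as a pure function that collects refusal reasons. The CCR field names (`status`, `auth_binding_confirmed`, `policy`) and the single allowed prefix are assumptions for this example; the real dry-run also checks mount, path, and role, which are omitted here.

```python
APPLY_ELIGIBLE_STATUSES = frozenset({"approved", "applied", "verified", "active"})
FORBIDDEN_POLICIES = frozenset({"root", "platform-admin"})
# Assumed prefix for workload KV read policies; other delegated prefixes omitted.
DELEGATED_POLICY_PREFIXES = ("workload-kv-read-",)


def dry_run_refusals(ccr: dict) -> list[str]:
    """Return every reason the dry-run should refuse; empty means proceed."""
    reasons = []
    status = ccr.get("status")
    if status not in APPLY_ELIGIBLE_STATUSES:
        reasons.append(f"status {status!r} is not eligible for apply")
    if not ccr.get("auth_binding_confirmed", False):
        reasons.append("auth binding is not confirmed")
    policy = ccr.get("policy", "")
    if policy in FORBIDDEN_POLICIES or "*" in policy:
        reasons.append(f"policy {policy!r} is forbidden")
    elif not policy.startswith(DELEGATED_POLICY_PREFIXES):
        reasons.append(f"policy {policy!r} is outside the delegated surface")
    return reasons
```

Collecting all reasons, rather than stopping at the first, lets a reviewer fix a CCR in one pass.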
## T03 - Add non-production applier role first
```task
id: RAILIANCE-WP-0008-T03
status: todo
status: progress
priority: medium
state_hub_task_id: "ff927a19-50fb-4351-8db1-c60a0cce0995"
```
@@ -167,11 +187,32 @@ Acceptance:
- Negative checks prove unrelated policy/auth/secret paths are denied.
- Evidence is recorded without secret values.
**2026-06-30:** Added the non-production metadata-only policy candidate
`openbao/policies/credential-change-nonprod-applier.hcl` and documented that
generated test-secret paths require separate CCR-backed approval. Live non-prod
identity creation and positive/negative OpenBao evidence remain to close this
task.
**2026-06-30:** Added the guarded `applier-apply` execution path that reuses the
CCR dry-run guardrails, requires exact `DELEGATED APPLY <CCR-ID>` confirmation,
uses the local `bao` CLI with ambient delegated applier authority, writes only
policy/auth-role metadata, and records non-secret `delegated_metadata_apply`
evidence. Non-production task closure still needs a live build/test applier
identity plus positive and negative capability evidence.
**2026-06-30:** Added `scripts/openbao-apply-credential-change-appliers.py` and
Make target `openbao-credential-change-appliers-dry-run` to install/dry-run the
non-production applier policy plus bounded
`auth/token/roles/credential-change-nonprod-applier` role. The token role allows
only the matching applier policy,
disallows `root` and `platform-admin`, disables the default policy, and does not
issue tokens by itself. Live non-production apply and denial evidence remains
the closeout gate.
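As a rough sketch, the bounded token role could be expressed as the payload below. The parameter names (`allowed_policies`, `disallowed_policies`, `token_no_default_policy`) follow the token auth role API as documented for Vault-compatible servers and should be verified against the deployed OpenBao version. The policy name is assumed to match the policy file name, and `applier_token_role` is a hypothetical helper, not the setup script's code.

```python
import json

# Role path quoted in the workplan note.
ROLE_PATH = "auth/token/roles/credential-change-nonprod-applier"


def applier_token_role(policy_name: str) -> dict:
    """Build a metadata-only role payload; it issues no tokens by itself."""
    return {
        "allowed_policies": [policy_name],               # only the applier policy
        "disallowed_policies": ["root", "platform-admin"],
        "token_no_default_policy": True,                 # no implicit 'default'
    }


if __name__ == "__main__":
    # Assumed policy name, taken from the policy file name.
    payload = applier_token_role("credential-change-nonprod-applier")
    print(ROLE_PATH)
    print(json.dumps(payload, indent=2, sort_keys=True))
```

Printing the payload instead of writing it mirrors the dry-run target: the operator can review the exact role definition before any live apply.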
## T04 - Add production metadata applier with human approval gate
```task
id: RAILIANCE-WP-0008-T04
status: todo
status: progress
priority: high
state_hub_task_id: "414abd65-22d3-420f-994d-f7fdd1302db5"
```
@@ -185,6 +226,29 @@ Acceptance:
- Unapproved CCRs fail closed.
- Secret value provisioning is still not automated in production.
**2026-06-30:** Strengthened the production gate by adding source-artifact
checks to the CCR applier dry-run and documenting that unapproved CCRs fail
closed before OpenBao mutation rendering. The production policy candidate exists
and remains metadata-only; live delegated identity creation/application evidence
still needs an operator-held OpenBao step.
**2026-06-30:** Added `applier-apply` and Make targets
`credential-change-applier-apply-plan` / `credential-change-applier-apply`. The
command fails closed for unapproved CCRs, renders the dry-run payload before
mutation, requires exact confirmation, does not accept tokens in argv, leaves
secret values out of scope, and appends State Hub/file-backed non-secret apply
evidence when requested. Production closure still requires live execution using
the constrained applier identity rather than broad `platform-admin`.
**2026-06-30:** Added `scripts/openbao-apply-credential-change-appliers.py` and
Make targets `openbao-credential-change-appliers-dry-run` /
`openbao-configure-credential-change-appliers` to configure the production
`credential-change-prod-applier` policy and bounded token role. The role allows
only `credential-change-prod-applier`, disallows `root` and `platform-admin`,
uses service tokens, disables default policy attachment, and keeps token issuance
outside the setup script. Production closure still needs a live run and
capability evidence using this constrained identity.
## T05 - Close the whynot-design pilot
```task

@@ -10,7 +10,7 @@ topic_slug: railiance
planning_priority: high
planning_order: 9
created: "2026-06-29"
updated: "2026-06-29"
updated: "2026-06-30"
depends_on_workplans:
- RAIL-PL-WP-0002
- RAILIANCE-WP-0004
@@ -80,10 +80,10 @@ The plan supports these `INTENT.md` principles:
| Read policy | `workload-kv-read-issue-core-runtime` |
| Policy file | `openbao/policies/workload-kv-read-issue-core-runtime.hcl` |
| Auth method | Kubernetes auth |
-| Auth role | `issue-core-runtime-workload-kv-read` |
-| Proposed service account | `issue-core` |
-| Proposed namespace | `issue-core` |
-| Delivery surface | External Secrets into the `issue-core` namespace |
+| Auth role | `external-secrets-issue-core` |
+| OpenBao auth service account | `external-secrets` |
+| OpenBao auth namespace | `external-secrets` |
+| Delivery surface | `ExternalSecret issue-core/issue-core-runtime` to Secret `issue-core-runtime` |
| ops-warden command | `warden access issue-core-ingestion-api-key --fetch ISSUE_CORE_API_KEY` |
The `GITEA_BACKEND_TOKEN` field remains an explicit review point. Remove it
@@ -95,7 +95,7 @@ from the CCR before approval if issue-core no longer needs it in this lane.
```task
id: RAILIANCE-WP-0009-T01
status: todo
status: done
priority: high
state_hub_task_id: "64d85288-38fb-4374-b889-fd0d136d3bdf"
```
@@ -114,11 +114,19 @@ Acceptance:
- The lane remains clearly platform-owned secret custody, not issue-core
application logic.
**2026-06-30:** Live cluster metadata confirms
`ExternalSecret issue-core/issue-core-runtime` is `Ready=True` with reason
`SecretSynced` and maps both `ISSUE_CORE_API_KEY` and `GITEA_BACKEND_TOKEN` from
`platform/workloads/issue-core/issue-core/issue-core-runtime`. Retain both
fields in `CCR-2026-0002` unless the issue-core owner later removes one through
review. The CCR remains `proposed`; this task records non-secret scope review,
not approval to apply.
## T02 - Confirm Kubernetes auth and External Secrets binding
```task
id: RAILIANCE-WP-0009-T02
status: todo
status: done
priority: high
state_hub_task_id: "7f4a8317-13f0-4be3-948c-a2e2f90447cf"
```
@@ -134,6 +142,17 @@ Acceptance:
- The External Secrets target and expected field names are documented.
- No direct human or agent read path is activated unless separately approved.
**2026-06-30:** Confirmed the current delivery path uses the platform External
Secrets operator, not a workload pod service account. The `issue-core`
Deployment uses the `default` service account, and no `issue-core` service
account exists. `ClusterSecretStore/openbao` authenticates to OpenBao as
`external-secrets/external-secrets` with role `external-secrets-issue-core` and
is limited to the `issue-core` namespace. Updated `CCR-2026-0002` to this
confirmed auth subject while keeping the exact
`workload-kv-read-issue-core-runtime` policy. `credential-change.py
applier-dry-run CCR-2026-0002` now blocks only because the CCR is still
`proposed`.
## T03 - Apply or confirm least-privilege OpenBao metadata
```task

@@ -10,7 +10,7 @@ topic_slug: railiance
planning_priority: high
planning_order: 10
created: "2026-06-29"
updated: "2026-06-29"
updated: "2026-06-30"
depends_on_workplans:
- RAIL-PL-WP-0002
- RAILIANCE-WP-0004
@@ -84,10 +84,10 @@ The plan supports these `INTENT.md` principles:
| Read policy | `workload-kv-read-llm-connect-provider-secrets` |
| Policy file | `openbao/policies/workload-kv-read-llm-connect-provider-secrets.hcl` |
| Auth method | Kubernetes auth |
-| Auth role | `llm-connect-provider-secrets-read` |
-| Proposed service account | `llm-connect` |
-| Proposed namespace | `activity-core` |
-| Delivery surface | External Secrets to `llm-connect-provider-secrets` in `activity-core` |
+| Auth role | `external-secrets-activity-core` |
+| OpenBao auth service account | `external-secrets` |
+| OpenBao auth namespace | `external-secrets` |
+| Delivery surface | Future activity-core ExternalSecret to Secret `llm-connect-provider-secrets` |
| ops-warden command | `warden access llm-connect-openrouter-api-key --fetch OPENROUTER_API_KEY` |
## Tasks
@@ -96,7 +96,7 @@ The plan supports these `INTENT.md` principles:
```task
id: RAILIANCE-WP-0010-T01
status: todo
status: progress
priority: high
state_hub_task_id: "307b75a6-a3a8-473b-b171-7379d2848698"
```
@@ -116,11 +116,19 @@ Acceptance:
- The lane remains clearly platform-owned secret custody, not llm-connect model
routing or provider selection logic.
**2026-06-30:** Confirmed `activity-core` namespace exists and Kubernetes Secret
`activity-core/llm-connect-provider-secrets` exists, but no activity-core
`ExternalSecret` exists yet. Kept canonical CCR catalog id
`llm-connect-openrouter-api-key`; ops-warden previously mentioned
`openrouter-llm-connect`, so selector agreement remains open and this task stays
`progress`. OpenBao public seal status now reports `sealed=false`; the prior
sealed message is no longer the active blocker.
## T02 - Confirm Kubernetes auth and External Secrets binding
```task
id: RAILIANCE-WP-0010-T02
status: todo
status: done
priority: high
state_hub_task_id: "829192f5-4502-44e0-8020-656d74d5282a"
```
@@ -138,6 +146,19 @@ Acceptance:
`llm-connect-provider-secrets` or updated with the approved alternative.
- No direct human or agent read path is activated unless separately approved.
**2026-06-30:** Confirmed the proposed `llm-connect` service account does not
exist and the current `llm-connect` Deployment uses the namespace `default`
service account. Updated `CCR-2026-0003` to the approved platform ESO pattern:
OpenBao Kubernetes auth role `external-secrets-activity-core` bound to
`external-secrets/external-secrets`. Added
`argocd/platform-addons/openbao-secretstore/openbao-activity-core.clustersecretstore.yaml`,
limited to the `activity-core` namespace, and Make target
`openbao-configure-external-secrets-activity-core` for the matching OpenBao
role/policy apply. `kubectl kustomize argocd/platform-addons/openbao-secretstore`
renders both the existing issue-core store and the new activity-core store.
`credential-change.py applier-dry-run CCR-2026-0003` now blocks only because the
CCR is still `proposed`.
## T03 - Apply or confirm least-privilege OpenBao metadata
```task