Credential Lane Lifecycle Runbook
Status: active (RAILIANCE-WP-0009-T07 / RAILIANCE-WP-0010-T07) Date: 2026-07-02
Covers deactivation, rotation, and compromise response for the workload KV
lanes established by CCR-2026-0002 (issue-core) and CCR-2026-0003
(llm-connect). The canonical, always-current procedure is generated from
the CCR itself — this runbook adds only the lane-specific consumer facts the
generator cannot know.
scripts/credential-change.py lifecycle-plan <CCR-ID> --action {deactivate|rotate|compromise}
# then execute the rendered steps and record:
scripts/credential-change.py lifecycle-event <CCR-ID> --action <action> \
--actor <operator> --reason "<non-secret>" --detail "<non-secret>" --record-state-hub
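As an illustration, the two commands above can be wrapped so that an unsupported action is rejected before anything is rendered. This is a minimal sketch, not part of the tooling: the function name `lifecycle_commands` is hypothetical, only the flags already shown above are used, and the function prints the commands for review rather than running them.

```shell
# Hypothetical helper: print the plan and record commands for one CCR/action.
# It only echoes text; the operator still executes the rendered steps by hand.
lifecycle_commands() {
  ccr_id=$1
  action=$2
  case "$action" in
    deactivate|rotate|compromise) ;;
    *)
      echo "unknown action: $action" >&2
      return 2
      ;;
  esac
  printf '%s\n' "scripts/credential-change.py lifecycle-plan $ccr_id --action $action"
  printf '%s\n' "scripts/credential-change.py lifecycle-event $ccr_id --action $action --actor <operator> --reason \"<non-secret>\" --detail \"<non-secret>\" --record-state-hub"
}
```

For example, `lifecycle_commands CCR-2026-0002 rotate` prints both commands for the issue-core lane, with the actor, reason, and detail placeholders left for the operator to fill in.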
All three actions share the same invariants: the front door goes
non-resolvable first, OpenBao metadata changes use approved operator or
delegated-applier authority (never platform-admin handoffs), audit
evidence is preserved (never delete the audit device or its entries), and no
secret value ever appears in Git, State Hub, chat, prompts, or shell history.
Lane: issue-core runtime ingestion (CCR-2026-0002)
| Item | Value |
|---|---|
| KV path | platform/workloads/issue-core/issue-core/issue-core-runtime |
| Fields | ISSUE_CORE_API_KEY, GITEA_BACKEND_TOKEN |
| Policy / auth role | workload-kv-read-issue-core-runtime / auth/kubernetes/role/external-secrets-issue-core |
| Primary consumer | ExternalSecret issue-core/issue-core-runtime (CoulombCore cluster, 1h refresh) |
| ops-warden catalog | issue-core-ingestion-api-key |
Consumer facts the generated plan does not cover:
- Deactivating the policy/role stops the ExternalSecret from refreshing,
  but the materialized Kubernetes Secret persists with the last value. A
  real deactivation or compromise response must therefore also delete
  secret/issue-core-runtime in the issue-core namespace (ESO will not
  recreate it while the lane is down) and restart the issue-core Deployment.
- ISSUE_CORE_API_KEY has a second consumer: railiance01's
  activity-core/actcore-runtime-secret holds an operator-injected copy
  (2026-07-02, ISSUE-WP-0003-T06). Rotation and compromise response MUST
  re-inject the new value there (stdin-only pipe from OpenBao) and restart
  deploy/actcore-worker, or activity-core emission silently starts failing
  with 401s on the next run.
- GITEA_BACKEND_TOKEN is a scoped Gitea token for service user
  issue-core-svc; rotating it means minting a new token in Gitea first,
  then updating OpenBao. Order matters: otherwise ingestion breaks between
  steps.
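The consumer-side steps for this lane could be sketched as follows. This is a sketch under stated assumptions, not the rendered plan: the `run` dry-run wrapper and both function names are hypothetical, the Deployment name `deploy/issue-core` is assumed (the runbook only says "the issue-core Deployment"), and `deploy/actcore-worker` is assumed to live in the activity-core namespace alongside its Secret. Each function must be run against the right cluster (CoulombCore for issue-core, railiance01 for actcore-worker). Commands are printed unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Deactivation / compromise, CoulombCore: remove the stale materialized
# Secret, then restart the consumer so it stops running on the last value.
# ASSUMPTION: the Deployment is named "issue-core".
issue_core_cleanup() {
  run kubectl -n issue-core delete secret/issue-core-runtime &&
  run kubectl -n issue-core rollout restart deploy/issue-core
}

# Rotation / compromise, railiance01: run this only after the new
# ISSUE_CORE_API_KEY has been re-injected into
# activity-core/actcore-runtime-secret (stdin-only pipe from OpenBao; that
# step is deliberately not shown here).
# ASSUMPTION: deploy/actcore-worker is in the activity-core namespace.
actcore_pickup() {
  run kubectl -n activity-core rollout restart deploy/actcore-worker &&
  run kubectl -n activity-core rollout status deploy/actcore-worker
}
```

Defaulting to a dry run keeps the destructive `delete` from firing by accident; reviewing the printed commands first also makes it easier to confirm the kubectl context points at the intended cluster.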
Lane: llm-connect OpenRouter provider key (CCR-2026-0003)
| Item | Value |
|---|---|
| KV path | platform/workloads/activity-core/llm-connect/llm-connect-provider-secrets |
| Field | OPENROUTER_API_KEY |
| Policy / auth role | workload-kv-read-llm-connect-provider-secrets / auth/kubernetes/role/external-secrets-activity-core |
| Primary consumer | ExternalSecret activity-core/llm-connect-provider-secrets (CoulombCore cluster, 1h refresh) |
| ops-warden catalog | openrouter-llm-connect |
Consumer facts the generated plan does not cover:
- llm-connect consumes the Secret via envFrom, so a rotated value reaches
  the runtime only after
  kubectl -n activity-core rollout restart deploy/llm-connect (CoulombCore).
  Wait for the ExternalSecret refresh (or force-sync annotate) before
  restarting.
- The railiance01 llm-connect instance is out of scope of this lane: it
  uses a bootstrap-provisioned Secret from
  activity-core/k8s/railiance/bootstrap-secrets.sh. Rotating the OpenRouter
  key upstream (at OpenRouter) invalidates both copies, so a provider-side
  rotation always requires the railiance01 manual update too, or the daily
  triage runs start failing with provider auth errors.
- Compromise response for a provider key has an extra step the plan cannot
  render: revoke the key at OpenRouter itself (provider console) before or
  immediately after disabling the front door; OpenBao custody actions alone
  do not stop a leaked provider key from working.
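The CoulombCore pickup sequence from the first bullet (force a refresh, confirm it, then restart) might look like the sketch below. The `run` wrapper and the function name `llm_connect_pickup` are hypothetical; the `force-sync` annotation and the `status.refreshTime` field are used as documented by External Secrets Operator, which should be checked against the installed ESO version. The railiance01 copy and the provider-side revocation are not covered. Commands are printed unless DRY_RUN=0 is set.

```shell
# Print commands by default; set DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Rotation pickup on CoulombCore. llm-connect reads the Secret via envFrom,
# so the pod only sees the new value after a restart, and the restart is
# only useful once the ExternalSecret has refreshed.
llm_connect_pickup() {
  ns=activity-core
  es=llm-connect-provider-secrets

  # Ask ESO to refresh now instead of waiting for the 1h interval.
  run kubectl -n "$ns" annotate externalsecret "$es" "force-sync=$(date +%s)" --overwrite || return

  # Show the refresh timestamp so the operator can confirm it moved.
  run kubectl -n "$ns" get externalsecret "$es" -o "jsonpath={.status.refreshTime}" || return

  # Restart the consumer and wait for the rollout to settle.
  run kubectl -n "$ns" rollout restart deploy/llm-connect || return
  run kubectl -n "$ns" rollout status deploy/llm-connect
}
```

The sketch only displays the refresh timestamp; deciding that it is actually newer than the pre-rotation value is left to the operator.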
Verification after rotate
Return the lane to active only with fresh positive + negative evidence,
same shape as activation (2026-07-02 precedent):
- positive: ExternalSecret SecretSynced=True with a new refresh timestamp,
  consumer pod healthy after restart;
- negative: a default-policy token denied on the KV data path, matched in
  the file audit device by path and timestamp;
- record via lifecycle-event ... --record-state-hub and notify ops-warden
  to flip the catalog entry back to active.
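The positive half of this evidence can be expressed as a small check. This is a sketch only: the function name `positive_evidence_ok` is hypothetical, and it assumes that "SecretSynced=True" corresponds to the ExternalSecret's Ready condition having status True and reason SecretSynced, which is how External Secrets Operator usually reports it. The negative check and the audit-device match are not covered.

```shell
# Succeeds only if the ExternalSecret reports Ready=True with reason
# SecretSynced AND its refresh timestamp differs from the pre-rotation one.
# Inputs are plain strings, so the logic can be exercised without a cluster.
positive_evidence_ok() {
  ready_status=$1    # Ready condition status, e.g. "True"
  ready_reason=$2    # Ready condition reason, e.g. "SecretSynced"
  refresh_before=$3  # .status.refreshTime captured before the rotation
  refresh_after=$4   # .status.refreshTime read after the rotation
  [ "$ready_status" = "True" ] &&
  [ "$ready_reason" = "SecretSynced" ] &&
  [ -n "$refresh_after" ] &&
  [ "$refresh_after" != "$refresh_before" ]
}

# Gathering the inputs on CoulombCore (issue-core lane shown; $before must
# have been captured the same way prior to rotating):
#   es=externalsecret/issue-core-runtime
#   status=$(kubectl -n issue-core get "$es" -o 'jsonpath={.status.conditions[?(@.type=="Ready")].status}')
#   reason=$(kubectl -n issue-core get "$es" -o 'jsonpath={.status.conditions[?(@.type=="Ready")].reason}')
#   after=$(kubectl -n issue-core get "$es" -o 'jsonpath={.status.refreshTime}')
#   positive_evidence_ok "$status" "$reason" "$before" "$after" && echo "positive evidence OK"
```

Comparing against a timestamp captured before the rotation matters because an ExternalSecret that was already synced stays Ready=True; only a changed refresh time shows that the new value was actually pulled.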