railiance-cluster/docs/promote-rollback-onboarding.md
tegwick 87bd73b26b
Add Railiance promote rollback tooling
2026-06-27 17:01:11 +02:00


# Promote, Rollback, And Onboarding
This guide walks through a representative Railiance lifecycle for an overlay repo.
Commands default to plan mode, so each step can be rehearsed repeatably before
cluster access or operator approval exists.
## Stage 1
```bash
bin/railiance run /path/to/overlay --pretty
```
Stage 1 validates `railiance/app.toml`, local commands, and local checks. Save
the JSON result as non-secret evidence before Stage 2.
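
One way to keep the Stage 1 result as evidence is a small wrapper that captures stdout into a file. This is a minimal sketch, not part of Railiance: `save_evidence` is a hypothetical helper, and it assumes the command prints its JSON result on stdout and exits non-zero on failure.

```bash
# Hypothetical helper: run a command and keep its stdout as an evidence file.
# Assumes the JSON result is written to stdout (not confirmed by this guide).
save_evidence() {
  out_file=$1
  shift
  tmp_file="${out_file}.tmp"
  # Write to a temp file first so a failed run never leaves partial evidence.
  if "$@" > "$tmp_file"; then
    mv "$tmp_file" "$out_file"
  else
    status=$?
    rm -f "$tmp_file"
    return "$status"
  fi
}

# Example (not run here; the evidence/ directory must already exist):
#   save_evidence evidence/stage1.json bin/railiance run /path/to/overlay --pretty
```

Writing through a temporary file means an evidence file only exists when the command succeeded.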
## Stage 2
```bash
bin/railiance deploy --stage 2 /path/to/overlay --plan --pretty
bin/railiance observe --stage 2 /path/to/overlay --plan --pretty
```
When Helm, kubectl, cluster access, and approval evidence are ready:
```bash
bin/railiance deploy --stage 2 /path/to/overlay --apply --approval-id <state-hub-id>
bin/railiance observe --stage 2 /path/to/overlay --live --pretty
```
For critical workloads, Stage 2 apply must not run until the operator has
approved canary exposure and rollback context is known.
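
Before a Stage 2 apply, it can help to confirm that the required tools are on `PATH`. The sketch below is a hypothetical preflight, not a Railiance feature. It checks only that `helm` and `kubectl` can be found; it does not verify cluster access or approval evidence. `APPROVAL_ID` in the example is an assumed environment variable.

```bash
# Hypothetical preflight: report every required tool that is missing from PATH.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing tool: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example (not run here; APPROVAL_ID is an assumed variable):
#   require_tools helm kubectl &&
#     bin/railiance deploy --stage 2 /path/to/overlay --apply --approval-id "$APPROVAL_ID"
```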
## Stage 3
```bash
bin/railiance promote /path/to/overlay --plan --pretty
bin/railiance rollback /path/to/overlay --plan --pretty
```
Promotion plan mode emits a `railiance.stage3-promote-result.v1` JSON result
with stable release identity, chart and values paths, previous-stable target,
expected evidence, and approval requirements.
Rollback plan mode emits a `railiance.stage3-rollback-result.v1` JSON result
with rollback strategy, release identity, verification text, and apply-time
requirements.
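
A saved plan result can be sanity-checked against the schema identifiers named above. The helper below is a hypothetical sketch that does a plain string match on the file. It does not parse JSON and makes no assumption about which field carries the schema id.

```bash
# Hypothetical check: confirm a saved result file mentions the expected schema id.
# This is a fixed-string match, not JSON validation.
has_schema() {
  result_file=$1
  schema_id=$2
  grep -qF -- "$schema_id" "$result_file"
}

# Example (not run here; assumes the plan result is printed on stdout):
#   bin/railiance promote /path/to/overlay --plan --pretty > promote-plan.json
#   has_schema promote-plan.json railiance.stage3-promote-result.v1
```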
When approval evidence and Helm access are ready:
```bash
bin/railiance promote /path/to/overlay --apply --approval-id <state-hub-id>
bin/railiance rollback /path/to/overlay --apply --approval-id <state-hub-id> --revision <helm-revision>
```
Stage 3 apply fails closed if the chart or values are missing, previous stable
is not recorded, Helm is unavailable, or approval evidence is missing. Rollback
apply fails closed if the rollback strategy is missing, Helm is unavailable,
approval evidence is missing, or a Helm revision is required but absent.
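
The fail-closed idea can be mirrored on the operator side with a wrapper that refuses to invoke rollback apply when inputs are missing. This is a sketch under stated assumptions: `rollback_apply` and the `RAILIANCE_BIN` variable are hypothetical. The wrapper also always requires a revision, which is stricter than the CLI behaviour described above.

```bash
# Hypothetical guard: refuse to run rollback apply unless both the approval id
# and the Helm revision are non-empty. Stricter than the CLI, which only needs
# a revision in some cases.
rollback_apply() {
  overlay=${1:-}
  approval_id=${2:-}
  revision=${3:-}
  if [ -z "$approval_id" ] || [ -z "$revision" ]; then
    echo "refusing rollback apply: approval id and Helm revision are both required" >&2
    return 1
  fi
  # RAILIANCE_BIN is an assumed override, handy for dry runs with `echo`.
  "${RAILIANCE_BIN:-bin/railiance}" rollback "$overlay" --apply \
    --approval-id "$approval_id" --revision "$revision"
}
```

Setting `RAILIANCE_BIN=echo` prints the command that would run, which is a cheap way to review the arguments before a real apply.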
## Human Approval Points
Critical infrastructure workloads require explicit operator approval before:
- Stage 2 canary exposure;
- Stage 3 stable promotion;
- rollback apply, unless an incident runbook defines a narrower break-glass
  process and records the evidence id.

Progress notes should include only non-secret result summaries: schema version,
status, release, namespace, approval id, check counts, and command byte counts.
Do not paste command logs, kubeconfigs, tokens, or private service output.
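
A rough lint can catch obvious secret material before a progress note is shared. The function below is a hypothetical sketch. Its patterns are illustrative assumptions that will miss many real secrets, so it supplements manual review rather than replacing it.

```bash
# Hypothetical lint: fail if a progress note contains text that looks like
# secret material. The patterns are illustrative, not exhaustive.
note_looks_safe() {
  note_file=$1
  if grep -Eiq -- '-----BEGIN [A-Z ]*PRIVATE KEY-----|client-key-data|client-certificate-data|bearer [a-z0-9._-]+|token[=:]' "$note_file"; then
    echo "progress note may contain secret material: $note_file" >&2
    return 1
  fi
  return 0
}

# Example (not run here):
#   note_looks_safe progress-note.md && echo "ok to share"
```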