Add Railiance promote rollback tooling
Some checks failed
railiance-tests / smoke (push) Has been cancelled

2026-06-27 17:01:11 +02:00
parent 6d862e68be
commit 87bd73b26b
9 changed files with 484 additions and 15 deletions

@@ -78,6 +78,7 @@ From two bare Linux servers, a Git repo, and valid credentials, you can rebuild
- [Railiance overlay repo pattern](overlay-repo-pattern.md)
- [Canary Helm template](canary-helm-template.md)
- [Stage 2 deploy and observe](stage2-deploy-observe.md)
- [Promote, rollback, and onboarding](promote-rollback-onboarding.md)
- [Railiance run command](railiance-run-command.md)
## 👥 Contributing

@@ -186,17 +186,17 @@ records only the route, target object, and pass/fail state.
## Command Semantics
-Commands in `app.toml` are declarations for Railiance tooling. Stage 1 and
-Stage 2 commands now have local CLI support; Stage 3 commands may still point
-to existing scripts or runbook commands until T07 lands.
+Commands in `app.toml` are declarations for Railiance tooling. Stage 1, Stage
+2, and Stage 3 commands now have local CLI support; workload scripts may still
+wrap them for service-specific checks.
Expected mapping:
- Stage 1 commands are consumed by `bin/railiance run <overlay-dir>`.
- Stage 2 commands are consumed by `bin/railiance deploy --stage 2 <overlay-dir>`
and `bin/railiance observe --stage 2 <overlay-dir>`.
-- Stage 3 commands are consumed by future `bin/railiance promote <overlay-dir>`
-and `bin/railiance rollback <overlay-dir>` commands.
+- Stage 3 commands are consumed by `bin/railiance promote <overlay-dir>` and
+`bin/railiance rollback <overlay-dir>`.
Tooling must emit machine-readable results with workload identity, candidate
revision, checks run, pass/fail status, non-secret evidence, rollback target,

@@ -317,14 +317,14 @@ must not cut over to Stage 3.
## Minimum Command Contract
-Future CLI tasks should make these lifecycle operations repeatable:
+The Railiance CLI makes these lifecycle operations repeatable:
```text
-bin/railiance run <overlay-dir> # Stage 1 local validation
-bin/railiance deploy --stage 2 <overlay-dir> --plan # Stage 2 canary plan
-bin/railiance observe --stage 2 <overlay-dir> --plan # Stage 2 evidence targets
-bin/railiance promote <overlay-dir> # Stage 3 production promotion
-bin/railiance rollback <overlay-dir> # rollback to previous stable
+bin/railiance run <overlay-dir> # Stage 1 local validation
+bin/railiance deploy --stage 2 <overlay-dir> --plan # Stage 2 canary plan
+bin/railiance observe --stage 2 <overlay-dir> --plan # Stage 2 evidence targets
+bin/railiance promote <overlay-dir> --plan # Stage 3 production promotion
+bin/railiance rollback <overlay-dir> --plan # rollback to previous stable
```
The exact command names may change as implementation lands, but the behavior

@@ -0,0 +1,71 @@
# Promote, Rollback, And Onboarding
This guide shows the representative Railiance lifecycle for an overlay repo.
Commands default to plan mode so the path is repeatable before cluster access or
operator approval exists.
## Stage 1
```bash
bin/railiance run /path/to/overlay --pretty
```
Stage 1 validates `railiance/app.toml`, local commands, and local checks. Save
the JSON result as non-secret evidence before Stage 2.
## Stage 2
```bash
bin/railiance deploy --stage 2 /path/to/overlay --plan --pretty
bin/railiance observe --stage 2 /path/to/overlay --plan --pretty
```
When Helm, kubectl, cluster access, and approval evidence are ready:
```bash
bin/railiance deploy --stage 2 /path/to/overlay --apply --approval-id <state-hub-id>
bin/railiance observe --stage 2 /path/to/overlay --live --pretty
```
For critical workloads, Stage 2 apply must not run until the operator has
approved canary exposure and rollback context is known.
## Stage 3
```bash
bin/railiance promote /path/to/overlay --plan --pretty
bin/railiance rollback /path/to/overlay --plan --pretty
```
Promotion plan mode emits a `railiance.stage3-promote-result.v1` JSON result
with stable release identity, chart and values paths, previous-stable target,
expected evidence, and approval requirements.
Rollback plan mode emits a `railiance.stage3-rollback-result.v1` JSON result
with rollback strategy, release identity, verification text, and apply-time
requirements.
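A consumer of these plan results might confirm the schema identifier before trusting any other field. The sketch below assumes the identifier is carried in a top-level `schema` key; that key name and the `check_schema` helper are illustrative and are not taken from the Railiance CLI.

```python
import json

# Schema identifiers quoted in this guide, keyed by subcommand.
EXPECTED_SCHEMAS = {
    "promote": "railiance.stage3-promote-result.v1",
    "rollback": "railiance.stage3-rollback-result.v1",
}


def check_schema(command: str, raw_json: str) -> dict:
    """Parse a plan result and reject it unless the schema matches.

    Assumption: the identifier lives in a top-level "schema" key.
    """
    result = json.loads(raw_json)
    if not isinstance(result, dict):
        raise ValueError("result must be a JSON object")
    expected = EXPECTED_SCHEMAS[command]
    found = result.get("schema")
    if found != expected:
        raise ValueError(f"expected schema {expected!r}, got {found!r}")
    return result
```

Rejecting an unexpected identifier up front keeps a rollback result from being read as a promotion result, or the reverse.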
When approval evidence and Helm access are ready:
```bash
bin/railiance promote /path/to/overlay --apply --approval-id <state-hub-id>
bin/railiance rollback /path/to/overlay --apply --approval-id <state-hub-id> --revision <helm-revision>
```
Stage 3 apply fails closed if the chart or values are missing, previous stable
is not recorded, Helm is unavailable, or approval evidence is missing. Rollback
apply fails closed if the rollback strategy is missing, Helm is unavailable,
approval evidence is missing, or a Helm revision is required but absent.
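The fail-closed rule for promotion can be pictured as a preflight that collects every blocker and lets apply proceed only when none remain. This is a minimal sketch, not the CLI's implementation: the function name, its parameters, and the blocker messages are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def promote_apply_blockers(
    chart: Path,
    values: Path,
    previous_stable: Optional[str],
    approval_id: Optional[str],
    helm_available: bool,
) -> List[str]:
    """Return every reason a Stage 3 promote apply must refuse to run.

    An empty list means all preconditions named in this guide are met.
    """
    blockers = []
    if not chart.exists():
        blockers.append("chart missing")
    if not values.exists():
        blockers.append("values missing")
    if not previous_stable:
        blockers.append("previous stable not recorded")
    if not helm_available:
        blockers.append("helm unavailable")
    if not approval_id:
        blockers.append("approval evidence missing")
    return blockers
```

A caller could pass `shutil.which("helm") is not None` as `helm_available`. Reporting all blockers at once, rather than stopping at the first, lets an operator fix the whole list in one pass.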
## Human Approval Points
Critical infrastructure workloads require explicit operator approval before:
- Stage 2 canary exposure;
- Stage 3 stable promotion;
- rollback apply, unless an incident runbook defines a narrower break-glass
process and records the evidence id.
Progress notes should include only non-secret result summaries: schema version,
status, release, namespace, approval id, check counts, and command byte counts.
Do not paste command logs, kubeconfigs, tokens, or private service output.
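One way to follow this rule is to build progress notes from an allowlist, so anything not explicitly named is dropped. The sketch below assumes a result shaped as a JSON object with `checks` and `commands` lists; every key name in it is hypothetical, since this guide does not define the result fields.

```python
# Top-level keys copied verbatim into a progress note (hypothetical names).
SAFE_KEYS = ("schema", "status", "release", "namespace", "approval_id")


def summarize_result(result: dict) -> dict:
    """Reduce a result to allowlisted fields, check counts, and byte sizes."""
    summary = {key: result[key] for key in SAFE_KEYS if key in result}

    # Count checks per status instead of copying their details.
    check_counts = {}
    for check in result.get("checks", []):
        status = check.get("status", "unknown")
        check_counts[status] = check_counts.get(status, 0) + 1
    summary["check_counts"] = check_counts

    # Record how many bytes each command produced, never the output itself.
    summary["command_bytes"] = {
        cmd.get("name", "unnamed"): len(cmd.get("output", "").encode("utf-8"))
        for cmd in result.get("commands", [])
    }
    return summary
```

An allowlist is safer here than a denylist: a new field that happens to carry a token or private output stays out of the note until someone adds it deliberately.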