railiance-cluster/docs/promote-rollback-onboarding.md
tegwick 87bd73b26b
Add Railiance promote rollback tooling
2026-06-27 17:01:11 +02:00

Promote, Rollback, And Onboarding

This guide walks through a representative Railiance lifecycle for an overlay repo. Commands default to plan mode, so the whole path can be rehearsed repeatedly before cluster access or operator approval exists.

Stage 1

bin/railiance run /path/to/overlay --pretty

Stage 1 validates railiance/app.toml, local commands, and local checks. Save the JSON result as non-secret evidence before Stage 2.
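
One way to keep the Stage 1 result as evidence is to capture it in a timestamped file. The sketch below assumes the JSON result is written to standard output, which this guide does not state. The save_evidence helper, the EVIDENCE_DIR variable, and the file naming are hypothetical and not part of bin/railiance.

```shell
#!/bin/sh
# Sketch: store a command's stdout as a timestamped evidence file.
# Assumption: the railiance JSON result is written to stdout.
# save_evidence, EVIDENCE_DIR, and the file naming are hypothetical.
save_evidence() {
    label="$1"; shift
    dir="${EVIDENCE_DIR:-evidence}"
    mkdir -p "$dir" || return 1
    out="$dir/$label-$(date -u +%Y%m%dT%H%M%SZ).json"
    tmp="$out.tmp"
    # Write to a temporary file first so a failed run never leaves
    # a partial result that could be mistaken for evidence.
    if "$@" > "$tmp"; then
        mv "$tmp" "$out" && printf '%s\n' "$out"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage (not run here):
#   save_evidence stage1 bin/railiance run /path/to/overlay --pretty
```

The helper prints the path of the saved file, so that path can be quoted in a progress note instead of the result body.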

Stage 2

bin/railiance deploy --stage 2 /path/to/overlay --plan --pretty
bin/railiance observe --stage 2 /path/to/overlay --plan --pretty

When Helm, kubectl, cluster access, and approval evidence are ready:

bin/railiance deploy --stage 2 /path/to/overlay --apply --approval-id <state-hub-id>
bin/railiance observe --stage 2 /path/to/overlay --live --pretty

For critical workloads, Stage 2 apply must not run until the operator has approved canary exposure and rollback context is known.

Stage 3

bin/railiance promote /path/to/overlay --plan --pretty
bin/railiance rollback /path/to/overlay --plan --pretty

Promotion plan mode emits a railiance.stage3-promote-result.v1 JSON result with stable release identity, chart and values paths, previous-stable target, expected evidence, and approval requirements.

Rollback plan mode emits a railiance.stage3-rollback-result.v1 JSON result with rollback strategy, release identity, verification text, and apply-time requirements.
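
Before treating a saved plan result as evidence, it can help to confirm that it carries the expected schema identifier. The following is a minimal sketch that assumes the identifier appears verbatim in the JSON text; the guide does not name the field that holds it, so the check is a plain text match. The has_schema helper and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: confirm a saved plan result carries the expected schema identifier.
# Assumption: the identifier appears verbatim somewhere in the JSON text;
# the field that holds it is not specified in this guide.
has_schema() {
    file="$1"; schema="$2"
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 2; }
    # -F matches the identifier literally, so its dots are not wildcards.
    grep -Fq -- "$schema" "$file"
}

# Usage (not run here; file names are placeholders):
#   has_schema promote-plan.json railiance.stage3-promote-result.v1
#   has_schema rollback-plan.json railiance.stage3-rollback-result.v1
```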

When approval evidence and Helm access are ready:

bin/railiance promote /path/to/overlay --apply --approval-id <state-hub-id>
bin/railiance rollback /path/to/overlay --apply --approval-id <state-hub-id> --revision <helm-revision>

Stage 3 apply fails closed if the chart or values are missing, previous stable is not recorded, Helm is unavailable, or approval evidence is missing. Rollback apply fails closed if the rollback strategy is missing, Helm is unavailable, approval evidence is missing, or a Helm revision is required but absent.
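
The same fail-closed pattern can be mirrored in a wrapper script that runs before promote apply is attempted. The sketch below is illustrative only: preflight_promote and HELM_BIN are hypothetical names, and the previous-stable check is left out because this guide does not say where that record lives. It does not replace the checks that bin/railiance performs itself.

```shell
#!/bin/sh
# Sketch of a fail-closed preflight: every check must pass, otherwise the
# function returns non-zero and the caller should not proceed to apply.
# preflight_promote and HELM_BIN are hypothetical, not part of bin/railiance.
preflight_promote() {
    chart="$1"; values="$2"; approval_id="$3"
    [ -e "$chart" ]       || { echo "missing chart: $chart" >&2; return 1; }
    [ -f "$values" ]      || { echo "missing values: $values" >&2; return 1; }
    [ -n "$approval_id" ] || { echo "missing approval id" >&2; return 1; }
    command -v "${HELM_BIN:-helm}" >/dev/null 2>&1 \
        || { echo "helm not found on PATH" >&2; return 1; }
    return 0
}

# Usage (not run here; chart and values paths are placeholders):
#   preflight_promote path/to/chart path/to/values.yaml "$APPROVAL_ID" \
#     && bin/railiance promote /path/to/overlay --apply --approval-id "$APPROVAL_ID"
```

Each check returns early with a message on standard error, so an unmet precondition stops the run instead of being silently skipped.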

Human Approval Points

Critical infrastructure workloads require explicit operator approval before:

  • Stage 2 canary exposure;
  • Stage 3 stable promotion;
  • rollback apply, unless an incident runbook defines a narrower break-glass process and records the evidence id.

Progress notes should include only non-secret result summaries: schema version, status, release, namespace, approval id, check counts, and command byte counts. Do not paste command logs, kubeconfigs, tokens, or private service output.
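
A draft progress note can be screened for obvious secret markers before it is posted. The sketch below is a rough guard, not a complete secret scanner: the note_is_safe helper is hypothetical, and the pattern list is illustrative and will miss many real secrets.

```shell
#!/bin/sh
# Sketch: refuse a progress note that looks like it contains secrets.
# note_is_safe is hypothetical; the patterns are illustrative and incomplete.
note_is_safe() {
    note="$1"
    [ -r "$note" ] || { echo "cannot read: $note" >&2; return 2; }
    if grep -Eiq \
        -e 'BEGIN [A-Z ]*PRIVATE KEY' \
        -e 'client-key-data' \
        -e 'client-certificate-data' \
        -e '(token|password|secret)[[:space:]]*[:=]' \
        "$note"; then
        echo "possible secret material in: $note" >&2
        return 1
    fi
    return 0
}

# Usage (not run here; the file name is a placeholder):
#   note_is_safe progress-note.txt && echo "note passed the basic screen"
```

A passing result only means none of the listed markers were found; the note should still be limited to the summary fields named above.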