Railiance app.toml Contract
This document defines the repository-local railiance/app.toml contract used by
Railiance staged promotion tooling. The file tells Railiance how a workload
moves through Stage 1 local validation, Stage 2 production canary, and Stage 3
production promotion without relying on bespoke operator notes.
The contract is intentionally declarative. Commands, health checks, platform
dependencies, and secret references are described by stable names. Plaintext
secrets, bearer tokens, kubeconfigs, and private key material must never appear
in railiance/app.toml.
The machine-readable schema lives at schemas/railiance-app.schema.json. A
minimal example lives at examples/railiance/app.toml.
File Location
Participating workload repositories declare the contract at:
railiance/app.toml
Overlay repositories for third-party applications use the same path in the overlay repo, not in the upstream source repository.
Versioning
Every file must include:
schema_version = "railiance.app.v1"
Breaking contract changes require a new schema version. Tooling must fail closed
when it sees an unsupported schema_version.
Top-Level Sections
app
Identifies the workload and its ownership boundary.
Required fields:
- id: stable lowercase id using letters, numbers, and hyphens.
- name: human-readable workload name.
- repo: owning source or overlay repository slug.
- owner: owning team, domain, or operator group.
- criticality: one of low, medium, high, or critical.
- description: short purpose statement.
Production-critical workloads include source forge, identity, State Hub,
Inter-Hub, databases, object stores, backup systems, ingress, and cluster-wide
policy controllers. For those workloads, criticality = "critical" requires
explicit human approval before Stage 2 traffic exposure and Stage 3 promotion.
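As an illustration of the fields above, an app table might look like the following sketch. The [app] table name is assumed from the section name, and every value is a hypothetical placeholder rather than a real workload.

```toml
# Hypothetical workload; all values are placeholders.
[app]
id = "example-service"                 # lowercase letters, numbers, hyphens
name = "Example Service"
repo = "example-org/example-service"   # assumed slug format
owner = "platform-team"
criticality = "medium"                 # low | medium | high | critical
description = "Internal API used to illustrate the contract."
```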
source
Identifies the candidate under promotion.
Required fields:
- revision: commit id, tag, or immutable source revision expression.
- artifact: artifact kind, normally image, helm-chart, or bundle.
- digest_policy: one of required, preferred, or not-applicable.
If an image is promoted, Stage 2 and Stage 3 tooling should prefer immutable image digests over mutable tags.
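A minimal sketch of a source table for an image candidate, assuming the table is named [source] after the section. The revision value is a made-up tag.

```toml
# Hypothetical candidate; the revision is a placeholder tag.
[source]
revision = "v1.4.2"
artifact = "image"            # image | helm-chart | bundle
digest_policy = "required"    # required | preferred | not-applicable
```

With digest_policy = "required", the intent is that Stage 2 and Stage 3 resolve the image by digest rather than by a mutable tag.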
platform.dependencies
Declares platform services required before canary or production promotion.
Each dependency has:
- name: stable service name.
- kind: dependency kind such as postgres, redis, object-store, identity, state-hub, inter-hub, network, or other.
- required: boolean.
- stage: earliest stage that needs it, one of stage1, stage2, stage3.
- evidence: non-secret evidence expected before promotion, such as a health endpoint result, Kubernetes Ready condition, or State Hub progress id.
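A single dependency could be declared roughly as below. This sketch assumes dependencies are written as an array of tables at [[platform.dependencies]] and that evidence is free text; the service name is hypothetical.

```toml
# Hypothetical dependency entry; repeat the table for each dependency.
[[platform.dependencies]]
name = "example-postgres"
kind = "postgres"
required = true
stage = "stage2"              # earliest stage that needs the dependency
evidence = "Kubernetes Ready condition on the database workload"
```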
secrets.references
Declares required secret references without secret values.
Each reference has:
- name: workload-local secret name.
- route: approved credential route id, for example openbao-api-key, key-cape-oidc-login, or activity-core-issue-sink.
- target: non-secret target reference such as a Kubernetes Secret name, ExternalSecret name, OpenBao path, or environment variable name.
- stage: earliest stage that needs the secret.
- required: boolean.
Secret reference objects must not contain plaintext values, tokens, passwords,
kubeconfigs, or private keys. Tooling must reject suspicious field names such
as value, token, password, secret, private_key, or kubeconfig inside secret
reference objects. These words are acceptable only when they appear within the
approved non-secret target text.
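The following sketch shows a reference that carries only a route and a target, never a value. It assumes references are an array of tables at [[secrets.references]]; the secret and target names are hypothetical, while the route id is one of the examples named above.

```toml
# Hypothetical secret reference; no secret material appears here.
[[secrets.references]]
name = "api-key"
route = "openbao-api-key"              # approved credential route id
target = "example-service-api-key"     # e.g. a Kubernetes Secret name
stage = "stage2"
required = true
# Keys such as value, token, or password are forbidden in this table.
```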
observability
Defines how promotion tooling proves the workload is alive and observable.
Fields (health_endpoints is required; metrics and logs are optional):
- health_endpoints: one or more HTTP health endpoint declarations.
- metrics: optional metrics endpoint or query references.
- logs: optional log selectors or query references.
Health endpoint declarations include name, url, stage, and expected
status code. URLs may be internal service URLs for Stage 2/3; they must not
embed credentials.
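A health endpoint declaration might be written as in this sketch. The table path [[observability.health_endpoints]] and the key expected_status (borrowed from the http check type) are assumptions, and the URL is a made-up internal service address.

```toml
# Hypothetical health endpoint; key names beyond name, url, and stage
# are assumed, and the URL carries no credentials.
[[observability.health_endpoints]]
name = "readiness"
url = "http://example-service.example-canary.svc.cluster.local:8080/healthz"
stage = "stage2"
expected_status = 200
```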
rollback
Defines how the workload returns to a previous stable state.
Required fields:
- strategy: one of helm-revision, image-digest, traffic-shift, manual-runbook, or none.
- command: command name or runbook path. This may be a placeholder before T07 implements automation, but it must tell the operator where rollback lives.
- verification: non-secret check to confirm rollback succeeded.
strategy = "none" is allowed only for Stage 1-only workloads and must not be
used for production-critical workloads.
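For illustration, a rollback table could look like this, assuming the table is named [rollback]. The runbook path and verification text are hypothetical.

```toml
# Hypothetical rollback declaration.
[rollback]
strategy = "helm-revision"    # helm-revision | image-digest | traffic-shift | manual-runbook | none
command = "docs/runbooks/rollback.md"   # placeholder runbook path
verification = "readiness endpoint returns 200 on the previous release"
```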
Stage Sections
The contract has one table for each stage:
[stages.stage1]
[stages.stage2]
[stages.stage3]
Each stage includes:
- enabled: boolean.
- namespace: target Kubernetes namespace, or a local namespace for Stage 1.
- release: release identity.
- commands: ordered command aliases or shell commands that tooling may run.
- checks: ordered check ids to evaluate.
- evidence: expected non-secret evidence outputs.
- requires_approval: boolean.
Stage 2 additionally includes canary_mode, one of weighted, header,
path, shadow, or isolated, plus observation_minutes and optional
traffic_percent when weighted routing is used.
Stage 3 additionally includes promotion_mode, one of traffic-shift,
release-replace, selector-switch, or workflow, plus previous_stable.
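The stage fields above can be sketched for Stage 2 and Stage 3 as follows. This assumes commands, checks, and evidence are arrays of strings and that previous_stable is a string; all namespaces, release names, commands, and check ids are hypothetical.

```toml
# Hypothetical Stage 2 canary with weighted routing.
[stages.stage2]
enabled = true
namespace = "example-canary"
release = "example-service-canary"
commands = ["deploy-canary"]            # placeholder command alias
checks = ["canary-health"]              # ids defined under [[checks]]
evidence = ["canary-health result"]
requires_approval = false
canary_mode = "weighted"                # weighted | header | path | shadow | isolated
observation_minutes = 30
traffic_percent = 10                    # only meaningful for weighted routing

# Hypothetical Stage 3 promotion.
[stages.stage3]
enabled = true
namespace = "example-prod"
release = "example-service"
commands = ["promote-release"]          # placeholder command alias
checks = ["operator-approval"]
evidence = ["approval record"]
requires_approval = true
promotion_mode = "traffic-shift"        # traffic-shift | release-replace | selector-switch | workflow
previous_stable = "example-service-v1.4.1"   # placeholder identifier
```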
Check Definitions
Checks live under [[checks]] entries and are referenced by stage checks.
Required fields:
- id: stable check id.
- type: one of command, http, kubernetes, helm, metric, log, or manual.
- stage: earliest stage that may run the check.
- description: human-readable purpose.
- required: boolean.
Type-specific fields:
- command: run command string and optional timeout_seconds.
- http: url, expected_status, and optional timeout_seconds.
- kubernetes: namespace, resource, and condition.
- helm: chart, values, and mode such as template or server-dry-run.
- metric: query, window_minutes, and threshold.
- log: selector, window_minutes, and forbidden_patterns.
- manual: evidence_required text.
Checks must not print secrets. If a check needs secret-backed access, the result records only the route, target object, and pass/fail state.
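Three check entries, one each for the helm, http, and manual types, might look like the sketch below. It assumes type-specific fields sit directly in the check table alongside the required fields and that values is a single path string; ids, paths, and URLs are hypothetical.

```toml
# Hypothetical Helm render check for Stage 1.
[[checks]]
id = "helm-template"
type = "helm"
stage = "stage1"
description = "Chart renders with the candidate values."
required = true
chart = "charts/example-service"
values = "values/stage1.yaml"
mode = "template"

# Hypothetical HTTP health check for the canary.
[[checks]]
id = "canary-health"
type = "http"
stage = "stage2"
description = "Canary readiness endpoint responds."
required = true
url = "http://example-service.example-canary.svc.cluster.local:8080/healthz"
expected_status = 200
timeout_seconds = 5

# Hypothetical manual approval gate for promotion.
[[checks]]
id = "operator-approval"
type = "manual"
stage = "stage3"
description = "Operator confirms promotion approval."
required = true
evidence_required = "Reference to the recorded approval."
```

A stage then refers to these entries by id in its checks list, for example checks = ["canary-health"].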
Command Semantics
Commands in app.toml are declarations for Railiance tooling. Stage 1 and
Stage 2 commands now have local CLI support; Stage 3 commands may still point
to existing scripts or runbook commands until T07 lands.
Expected mapping:
- Stage 1 commands are consumed by bin/railiance run <overlay-dir>.
- Stage 2 commands are consumed by bin/railiance deploy --stage 2 <overlay-dir> and bin/railiance observe --stage 2 <overlay-dir>.
- Stage 3 commands are consumed by future bin/railiance promote <overlay-dir> and bin/railiance rollback <overlay-dir> commands.
Tooling must emit machine-readable results with workload identity, candidate revision, checks run, pass/fail status, non-secret evidence, rollback target, and approval state.
Minimal Example
See examples/railiance/app.toml. It declares a critical internal service with:
- immutable image digest requirement;
- Stage 1 local validation;
- Stage 2 isolated canary;
- Stage 3 release replacement;
- OpenBao-routed secret references without values;
- HTTP, Helm, Kubernetes, and manual approval checks.
Adoption Rules
A workload can enter Stage 1 when app.toml passes schema validation and all
Stage 1 required checks are declared.
A workload can enter Stage 2 only when:
- Stage 1 passed for the same candidate artifact;
- Stage 2 namespace, release, canary mode, health checks, dependencies, and rollback target are declared;
- secret references use approved routes and contain no values;
- production-critical workloads have explicit approval.
A workload can enter Stage 3 only when:
- Stage 2 acceptance gates passed for the same candidate artifact;
- previous_stable and rollback verification are recorded;
- backup/restore posture is current for stateful workloads;
- production-critical workloads have explicit human approval.