Backup, Restore, And Secret Handoff
Last reviewed: 2026-06-05
Status: contract v1. This document defines ownership, evidence gates, and allowed references. It does not authorize a live backup job, restore drill, secret rotation, OpenBao policy change, or credential migration.
Purpose
Forge data is operationally important, but the mechanisms that make data
durable belong mostly below the forge layer. This contract states what
railiance-forge owns, what railiance-platform implements, and what evidence
S5 application releases can trust without taking custody of forge secrets.
Boundary Summary
- railiance-forge owns the inventory of forge data, artifact restore requirements, retention posture, operator runbooks, and non-secret evidence that downstream consumers cite.
- railiance-platform owns shared database, object-storage, backup/restore, OpenBao, policy, audit, and runtime secret-delivery mechanisms.
- railiance-cluster owns cluster-level recovery primitives such as etcd, kubeconfig, node/runtime recovery, and cluster add-ons.
- railiance-apps consumes published artifacts and restore evidence in app runbooks; it does not own package blobs, registry credentials, runner tokens, or forge database backups.
- Source repos own source code, build definitions, package metadata, and image build definitions.
Protected Asset Inventory
| Asset | Current anchor | Forge responsibility | Platform/lower-layer responsibility | Trust gate |
|---|---|---|---|---|
| Gitea application database | CNPG cluster databases/gitea-db, checked by make gitea-status | State what must be restorable and what forge checks prove after restore | CNPG backup/restore implementation and database recovery mechanisms | Restored database must support login, repo list, package metadata, and registry metadata checks |
| Gitea shared storage | PVC default/gitea-shared-storage, mounted at /data; package blobs under /data/packages | Track package/blob growth, retention posture, and restore requirements | Durable volume backup, object-storage export, or future storage replication | Restored storage must support Git clone, package download/install, and container pull checks |
| Source repositories and forge app state | Gitea database plus /data storage | Define restore drill scope and consumer evidence | Database/PVC/object-storage restore tooling | Non-production restore drill proves a known repo can be cloned after restore |
| Container and Python package registry data | Package blobs under /data/packages; metadata in Gitea database | Define retention, cleanup, package evidence, and consumer verification gates | Durable backup of blobs and metadata | Known image/package can be pulled or installed after restore |
| Runner registration and labels | Forge-owned runner substrate | Inventory labels, runner purpose, and replacement expectations | Secret delivery for runner tokens where OpenBao or platform policies apply | Replacement runner can run a sample job with the same semantic labels |
| SOPS-encrypted Gitea values | helm/gitea-values.sops.yaml | Keep encrypted deploy input and sentinel check | SOPS/age bootstrap custody remains outside runtime secret delivery | make check-sops proves authorized decryption without storing plaintext |
| Runtime secrets | Kubernetes Secrets, OpenBao paths, operator custody paths | Reference names and required purposes only | OpenBao paths, policy, audit, break-glass, and workload secret delivery | Platform OpenBao restore/audit evidence exists before production-trust use |
| Artifact evidence | Forge docs, future artifact-store package, State Hub notes | Define required evidence fields and consumer references | Object-storage backend and credential delivery where evidence packages become durable artifacts | Evidence is retained without embedding secret material |
Known current caveat: the Gitea package data is on a 10 GiB local-path PVC.
On 2026-05-19 /data/packages was about 798.5 MiB, and no Kubernetes CronJob
backup resources were observed. That posture is acceptable for smoke and
development artifacts, but production-critical package reliance needs recorded
backup and restore evidence first.
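As a rough worked example of the figures in the caveat, the sketch below computes how much of the 10 GiB volume /data/packages occupied on 2026-05-19. The function name and the 80% warning threshold are illustrative assumptions; this contract does not define a threshold.

```python
MIB_PER_GIB = 1024


def pvc_usage_fraction(used_mib: float, capacity_gib: float) -> float:
    """Return used space as a fraction of the PVC capacity."""
    if capacity_gib <= 0:
        raise ValueError("capacity_gib must be positive")
    return used_mib / (capacity_gib * MIB_PER_GIB)


# Figures recorded in this document: about 798.5 MiB used on a 10 GiB PVC.
fraction = pvc_usage_fraction(798.5, 10)
print(f"{fraction:.1%}")  # roughly 7.8% of capacity

# Hypothetical threshold, chosen only for illustration.
WARN_FRACTION = 0.80
needs_attention = fraction >= WARN_FRACTION
```

A check like this only describes headroom. It says nothing about whether the data is backed up, which is the gap the caveat records.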
Backup Ownership
Forge owns the question: "What forge data must be recoverable, and what does a successful recovery prove?"
Platform owns the mechanism for:
- CNPG database backup and restore;
- S3-compatible/object-storage backup targets;
- OpenBao runtime secret custody, audit, backup, and restore;
- workload secret delivery through External Secrets, CSI, or another approved platform mechanism;
- future object-storage credential vending and policy shape.
Cluster owns the mechanism for:
- etcd and kubeconfig backup;
- Kubernetes runtime recovery;
- cluster add-ons needed before platform services can recover.
Until platform backup coverage is explicitly available for a forge asset, forge docs must treat that asset as not production-trustworthy. Operators may still use it for smoke, development, and migration evidence if the risk is recorded.
Restore Drill Requirements
A forge restore drill should be non-production first and should record only non-secret evidence.
Minimum drills:
- Source forge restore:
  - restore the Gitea database and shared storage into an isolated namespace or host;
  - verify Gitea starts;
  - verify a known repository can be listed and cloned;
  - verify a known user/org/repo permission path still exists.
- Package/blob restore:
  - restore package metadata and package blobs together;
  - verify a known Python package version can be installed;
  - verify a known container image tag or digest can be pulled;
  - verify registry authentication behavior without recording the token.
- Runner substrate restore:
  - replace a runner without reusing old registration tokens;
  - verify semantic labels still match the published label contract;
  - run a non-production sample workflow;
  - record runner identity and label evidence, not runner tokens.
- Secret delivery restore:
  - cite platform OpenBao restore evidence before relying on OpenBao-delivered forge credentials;
  - verify a non-production secret reaches the intended workload path;
  - verify no secret value appears in Git, State Hub notes, logs, screenshots, or drill artifacts.
Successful evidence should include:
- date and operator;
- source backup reference or encrypted snapshot reference;
- restored environment name;
- commands run, with secret values redacted before recording;
- post-restore checks and results;
- explicit no_secret_material_recorded assertion;
- rollback or cleanup note for the restored environment.
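The evidence fields listed here could be captured in a small structured record. The following is a minimal sketch, assuming a Python dataclass; every field name except no_secret_material_recorded is an illustrative choice, and all example values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RestoreDrillEvidence:
    """Non-secret record of one restore drill (field names are illustrative)."""
    drill_date: date
    operator: str
    backup_reference: str        # backup id or encrypted snapshot reference
    restored_environment: str
    commands_run: list[str]      # must already be redacted by the operator
    checks: dict[str, bool]      # post-restore check name -> passed
    cleanup_note: str            # rollback or cleanup of the restored environment
    no_secret_material_recorded: bool = False

    def problems(self) -> list[str]:
        """Return reasons this record should not count as successful evidence."""
        found = []
        for name in ("operator", "backup_reference",
                     "restored_environment", "cleanup_note"):
            if not getattr(self, name).strip():
                found.append(f"missing {name}")
        if not self.commands_run:
            found.append("no commands recorded")
        if not self.checks:
            found.append("no post-restore checks recorded")
        found.extend(f"check failed: {c}" for c, ok in self.checks.items() if not ok)
        if not self.no_secret_material_recorded:
            found.append("no_secret_material_recorded not asserted")
        return found


# Hypothetical values, for illustration only.
example = RestoreDrillEvidence(
    drill_date=date(2026, 1, 1),
    operator="operator-a",
    backup_reference="snapshot-ref-0001",
    restored_environment="forge-restore-drill",
    commands_run=["git clone <redacted-url>"],
    checks={"gitea_starts": True, "known_repo_cloned": True},
    cleanup_note="restored namespace deleted after drill",
    no_secret_material_recorded=True,
)
print(example.problems())  # an empty list means every field check passed
```

The assertion defaults to False so that a record cannot pass by omission; the operator has to set it deliberately.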
Secret Custody Boundaries
SOPS/age remains the Git-at-rest bootstrap mechanism for encrypted deploy
inputs such as helm/gitea-values.sops.yaml. This repo may keep encrypted SOPS
files and may provide make check-sops as a sentinel, but it must not commit or
log decrypted values.
OpenBao is the platform runtime secret service. The platform docs define paths such as:
platform/workloads/<namespace>/<service-account>/<secret-name>
platform/object-storage/<consumer>
platform/databases/<consumer>
platform/operators/<purpose>
Forge may request or reference OpenBao paths for forge workloads, package tokens, runner registration, object-storage credentials, and database access. Forge does not define OpenBao mounts, audit devices, root/unseal custody, break-glass policy, or global secret-delivery mechanisms.
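Because forge references path names rather than values, a helper that only formats path names stays within the allowed scope. The sketch below assumes the four path shapes listed above; the function name, the dictionary keys, and the segment rule (lowercase letters, digits, and hyphens) are assumptions and are not taken from the platform docs.

```python
import re

# Path shapes copied from this section; the dictionary keys are illustrative.
PATH_TEMPLATES = {
    "workload": "platform/workloads/{namespace}/{service_account}/{secret_name}",
    "object-storage": "platform/object-storage/{consumer}",
    "database": "platform/databases/{consumer}",
    "operator": "platform/operators/{purpose}",
}

# Assumed segment rule: lowercase alphanumerics and hyphens, no slashes.
_SEGMENT = re.compile(r"^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$")


def openbao_path(kind: str, **parts: str) -> str:
    """Build a path name for reference purposes; never reads a secret."""
    template = PATH_TEMPLATES[kind]  # KeyError for an unknown kind
    for name, value in parts.items():
        if not _SEGMENT.match(value):
            raise ValueError(f"invalid path segment for {name}: {value!r}")
    return template.format(**parts)  # KeyError if a required part is missing


# Hypothetical consumer names.
print(openbao_path("workload", namespace="default",
                   service_account="gitea", secret_name="runner-token"))
print(openbao_path("database", consumer="gitea"))
```

Rejecting slashes in segments keeps a caller from escaping into a different subtree, such as a platform-root path, through a crafted name.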
Do not store in forge docs, State Hub notes, screenshots, logs, or workplans:
- decrypted SOPS values;
- OpenBao tokens, root tokens, unseal shares, or recovery codes;
- database passwords or connection strings with passwords;
- package tokens or tokenized package index URLs;
- runner registration tokens;
- object-storage access keys or secret keys;
- kubeconfigs or bearer tokens.
Allowed references:
- Kubernetes namespace and Secret names;
- SOPS file paths;
- OpenBao path names and policy names;
- credential purpose and scope;
- non-secret command names;
- redacted command examples;
- timestamps, backup ids, encrypted snapshot locations, and evidence file names that do not reveal secret material.
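One way to support the lists above is a pre-commit style scan of evidence text for strings that look like secret material. The following is a heuristic sketch only: the patterns, their names, and the `<redacted>` placeholder convention are assumptions, and a clean result does not prove that a file is free of secrets.

```python
import re

# Heuristic patterns (assumptions, not an exhaustive or authoritative list).
SUSPECT_PATTERNS = {
    # scheme://user:password@host, as in tokenized package index URLs
    "url_with_credentials": re.compile(
        r"[a-z][a-z0-9+.-]*://[^/\s:@]+:[^/\s@]+@", re.IGNORECASE),
    # key=value or key: value where the value is not the redaction placeholder
    "secret_assignment": re.compile(
        r"\b(?:password|passwd|token|secret[_-]?key|access[_-]?key)"
        r"\s*[=:]\s*(?!<redacted>)\S+", re.IGNORECASE),
    "pem_private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_suspect_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) for each suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


# Fabricated sample text; none of these values are real.
sample = "\n".join([
    "pip install --index-url https://user:fake-pass@forge.invalid/simple pkg",
    "password=<redacted>",
    "make check-sops",
    "TOKEN=abc123",
])
print(find_suspect_lines(sample))
```

In this sample the first and fourth lines are flagged, while the redacted assignment and the non-secret command name pass, which matches the allowed-reference list.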
S5 Artifact Verification Without Registry Credentials
S5 application runbooks can trust forge artifacts only through evidence, not by owning forge credentials.
For a consuming app release, S5 may cite:
- source repo and commit SHA;
- package name and version;
- container image repository, tag, and digest when available;
- forge publish job id or evidence reference;
- package/blob restore drill reference when the artifact is production-critical;
- namespace-local pull Secret name if private registry access is required;
- app deployment dry-run and smoke-test result.
S5 should not store:
- package publish credentials;
- registry write tokens;
- package index URLs containing credentials;
- forge backup snapshots;
- OpenBao tokens or platform-root paths;
- package blob cleanup procedures as app-owned operations.
If an S5 release depends on a private package or image, the app runbook should name the consuming Kubernetes Secret or OpenBao-delivered workload path and cite forge/platform evidence that the artifact can be restored. The app repo should not copy the credential or the forge backup recipe.
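The citation fields above can be sketched as a record that holds references only. This is an illustrative shape, not a defined schema: the class and field names are assumptions, as are the format checks (a full hexadecimal commit id and a sha256 image digest).

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed formats: full Git commit id (SHA-1 or SHA-256) and sha256 digests.
_COMMIT_SHA = re.compile(r"^(?:[0-9a-f]{40}|[0-9a-f]{64})$")
_IMAGE_DIGEST = re.compile(r"^sha256:[0-9a-f]{64}$")


@dataclass
class ArtifactCitation:
    """What an S5 runbook cites about a forge artifact; no credentials."""
    source_repo: str
    commit_sha: str
    package_name: str
    package_version: str
    publish_evidence_ref: str                # forge publish job id or evidence reference
    production_critical: bool = False
    image_digest: Optional[str] = None
    restore_drill_ref: Optional[str] = None
    pull_secret_name: Optional[str] = None   # Secret name only, never its value

    def problems(self) -> list[str]:
        """Return reasons the citation is incomplete for its stated use."""
        found = []
        if not _COMMIT_SHA.match(self.commit_sha):
            found.append("commit_sha is not a full hex commit id")
        if self.image_digest is not None and not _IMAGE_DIGEST.match(self.image_digest):
            found.append("image_digest is not a sha256 digest")
        if self.production_critical and not self.restore_drill_ref:
            found.append("production-critical artifact lacks a restore drill reference")
        return found


# Hypothetical values, for illustration only.
citation = ArtifactCitation(
    source_repo="example-org/example-app",
    commit_sha="a" * 40,
    package_name="example-pkg",
    package_version="1.0.0",
    publish_evidence_ref="publish-job-0001",
    production_critical=True,
)
print(citation.problems())
```

The record has no field for a token or a credentialed URL, so the easiest way to fill it in is also the compliant one.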
Production-Trust Gates
Before treating forge packages, images, or source state as production-critical, the relevant asset must have:
- backup mechanism identified;
- restore drill completed in an isolated environment;
- consumer verification command recorded;
- secret custody path documented without live values;
- rollback or disable path documented;
- storage growth inspection procedure documented;
- owner named for the next restore drill.
If one of these gates is missing, consumers may still use forge artifacts for smoke, development, or migration work, but production promotion should record a follow-up against the owning layer before relying on the artifact.
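The gate rule can be expressed as a small check over per-asset status. This is a minimal sketch, assuming each gate is tracked as a boolean; the gate identifiers and function names are illustrative, not part of this contract.

```python
# Illustrative identifiers for the seven gates listed above.
PRODUCTION_TRUST_GATES = (
    "backup_mechanism_identified",
    "restore_drill_completed_in_isolation",
    "consumer_verification_command_recorded",
    "secret_custody_path_documented",
    "rollback_or_disable_path_documented",
    "storage_growth_inspection_procedure",
    "next_restore_drill_owner",
)

NON_PRODUCTION_USES = frozenset({"smoke", "development", "migration"})


def missing_gates(status: dict[str, bool]) -> list[str]:
    """Return gates that are absent or false, in the order listed above."""
    return [gate for gate in PRODUCTION_TRUST_GATES if not status.get(gate, False)]


def allowed_uses(status: dict[str, bool]) -> set[str]:
    """Non-production uses are always allowed; production needs every gate."""
    uses = set(NON_PRODUCTION_USES)
    if not missing_gates(status):
        uses.add("production")
    return uses


# Hypothetical status for one asset: only the backup mechanism is identified.
status = {"backup_mechanism_identified": True}
print(missing_gates(status))
print(sorted(allowed_uses(status)))
```

The list returned by missing_gates is what a production promotion would record as follow-ups against the owning layer.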
Follow-Ups
- docs/observability-operating-evidence.md defines the inspectable storage growth, restore-evidence, and runner-status signals for this contract.
- WP-0006-T09 should model forge backup/restore and secret-delivery edges in Railiance Fabric.
- RAILIANCE-WP-0005-T04 should use this contract when documenting S5 app data restore readiness and app runbook evidence requirements.