# Initial Forge Operating Contracts
Last reviewed: 2026-06-05

These contracts are the first explicit boundary for the Railiance forge layer.
They are intentionally operational enough to guide the next file moves, while
leaving live deploy and secret custody changes behind separate review gates.
## Status
- Contract maturity: draft v1.
- Live impact: none; this document does not authorize a deploy or cutover.
- Current forge runtime: Gitea at `gitea.coulomb.social`.
- Future migration target: Forgejo, under a separate cutover workplan.
## Artifact Lifecycle And Provenance
- Source repositories own build definitions, package metadata, and release
versioning.
- `railiance-forge` owns the registry endpoints, registry operating docs,
retention posture, and artifact evidence model.
- `railiance-apps` consumes already-published artifacts in S5 release values and
runbooks.
- Container images should be published with immutable commit-SHA tags for
release evidence. Mutable tags such as `latest` are allowed only as convenience
pointers and must not be the sole production reference.
- Python packages should use versioned releases. Internal consumers should pin
compatible ranges in source repos and regenerate locks after the package is
published.
- Release evidence should capture source repo, commit SHA, artifact name,
version/tag, smoke result, and consuming deployment value change.
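
The evidence fields and the tag rule above can be sketched as a small record plus a check. This is a minimal illustration, not an agreed schema: the class name, field names, and helper functions are hypothetical, and it assumes the immutable tag is the full 40-character Git commit SHA.

```python
import re
from dataclasses import dataclass

# Assumption: the immutable tag is the full 40-character hex commit SHA.
SHA_TAG = re.compile(r"[0-9a-f]{40}")


@dataclass(frozen=True)
class ReleaseEvidence:
    """Hypothetical record holding the evidence fields listed in this section."""
    source_repo: str
    commit_sha: str
    artifact_name: str
    version_or_tag: str
    smoke_result: str
    deployment_value_change: str


def is_immutable_tag(tag: str) -> bool:
    """Return True when the tag looks like a full commit SHA."""
    return SHA_TAG.fullmatch(tag) is not None


def production_reference_ok(tags: list[str]) -> bool:
    """A mutable tag such as `latest` must not be the sole production reference."""
    return any(is_immutable_tag(tag) for tag in tags)
```

With this sketch, `production_reference_ok(["latest"])` is false, while adding a commit-SHA tag to the list makes it true.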
## Retention And Cleanup
- Smoke-test image tags can be pruned after a newer validated smoke tag exists
and no rollback or diagnosis still references them.
- Production image tags and Python package versions should be retained for at
least the active rollback window of every consuming deployment.
- Deleting a package or image is an operator action, not an automated default,
until package restore has been drilled.
- The current Gitea package data lives under `/data/packages` on the
`default/gitea-shared-storage` PVC. On 2026-05-19 it used about 798.5 MiB of a
10 GiB `local-path` volume (roughly 7.8%).
- Growth inspection belongs in forge operations; durable backup implementation
belongs with the platform storage/database layer.
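
The smoke-tag pruning rule can be expressed as a read-only candidate list. The sketch below is an assumption about how the rule might be evaluated; the function name and inputs are hypothetical. It only reports candidates, because deletion stays an operator action.

```python
def prunable_smoke_tags(smoke_tags, validated, referenced):
    """Return smoke tags that are candidates for pruning.

    smoke_tags: tags ordered oldest to newest (hypothetical input).
    validated:  set of tags whose smoke test passed.
    referenced: set of tags still cited by a rollback or diagnosis.
    """
    candidates = []
    for index, tag in enumerate(smoke_tags):
        # A tag qualifies only if some newer smoke tag has been validated.
        newer_validated = any(t in validated for t in smoke_tags[index + 1:])
        if newer_validated and tag not in referenced:
            candidates.append(tag)
    return candidates
```

The newest tag never qualifies, since no newer validated tag exists for it.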
## Runner Substrate Ownership
- `railiance-forge` owns Gitea/Forgejo Actions runner deployment, runner labels,
runner placement, runner credentials, and runner health evidence.
- `railiance-enablement` owns reusable workflow templates, paved paths, and
developer-facing automation conventions.
- Application repos own app-specific workflows and build scripts.
- `railiance-apps` owns S5 release checks for app manifests and deployment
values, but not the runner substrate those checks execute on.
- Runner secret access must be explicit by label, repository, and workflow
purpose. Broad package or cluster credentials should not be shared across
unrelated jobs.
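
One way to make runner secret access explicit is an allowlist keyed on all three dimensions at once. This is a sketch under assumptions: the `SecretGrant` type, the secret name, the label, and the repository below are hypothetical and do not describe existing forge configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecretGrant:
    """Hypothetical grant: one secret for one label, repository, and purpose."""
    secret_name: str
    runner_label: str
    repository: str
    purpose: str


def secret_allowed(grants, secret_name, runner_label, repository, purpose):
    """Allow access only on an exact match; there are no wildcards by design."""
    request = SecretGrant(secret_name, runner_label, repository, purpose)
    return request in grants


# Illustrative values only.
GRANTS = frozenset({
    SecretGrant("package-publish-token", "publish", "example/app", "publish-package"),
})
```

Because every field must match, a job in an unrelated repository is denied even when it runs on the same runner label.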
## Backup And Restore Handoff
- `railiance-forge` defines what must be restorable: Git repositories, package
blobs, registry metadata, runner configuration, and source-forge application
state.
- `railiance-platform` owns the reusable database, object-storage, backup, and
restore mechanisms used by forge workloads.
- Forge data should not become production-critical without a recorded restore
drill for the relevant storage path.
- S5 app releases may consume forge artifacts, but they should cite forge
evidence rather than owning package blob backup procedures themselves.
## Secret Custody
- This repo may reference secret names, SOPS file paths, OpenBao paths, and
operator procedures.
- This repo must not commit decrypted secret values, package tokens, runner
tokens, tokenized package index URLs, or generated credential material.
- Deploy-capable files that reference encrypted values move only after review of
the SOPS/OpenBao handoff and compatibility pointers.
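
Tokenized package index URLs are one of the easier leaks to detect mechanically. The heuristic below is a sketch, not a substitute for a dedicated secret scanner: it flags any HTTP(S) URL that embeds a username or password. The function name is hypothetical.

```python
import re
from urllib.parse import urlsplit

# Rough pattern for HTTP(S) URLs embedded in text or config files.
URL_PATTERN = re.compile(r"https?://[^\s'\"]+")


def tokenized_urls(text: str) -> list[str]:
    """Return URLs in the text that carry embedded credentials."""
    hits = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0)
        try:
            parts = urlsplit(url)
        except ValueError:
            # Malformed URL (for example, broken IPv6 brackets); skip it.
            continue
        if parts.username or parts.password:
            hits.append(url)
    return hits
```

A check like this could run before commit against changed files; where it runs is left open here.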
## Observability And Evidence
- `make gitea-status` is the first read-only operator check in this repo.
- Forge health should cover web, Git SSH, container registry, Python package
registry, database, package storage, and runner status.
- Downstream app release evidence should cite forge artifact evidence rather
than repeating registry implementation details.
- Future monitoring should turn the manual status checks into durable signals
once the Railiance observability layer is ready.
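
The health coverage listed above can be summarized as a single pass/fail with the failing components named. This sketch assumes each check already yields a boolean; the component keys and the `summarize_health` function are hypothetical and do not describe what `make gitea-status` does today.

```python
# Components named in this section; the key spellings are assumptions.
REQUIRED_COMPONENTS = (
    "web",
    "git_ssh",
    "container_registry",
    "python_package_registry",
    "database",
    "package_storage",
    "runner_status",
)


def summarize_health(results: dict) -> tuple[bool, list[str]]:
    """Return (healthy, failing_components).

    A component that is missing from the results counts as failing,
    so an incomplete check run cannot report a healthy forge.
    """
    failing = [name for name in REQUIRED_COMPONENTS if not results.get(name, False)]
    return (not failing, failing)
```

Treating missing results as failures keeps the summary conservative while the checks are still manual.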