feat: implement WP-0002 — Goss test suite, verify playbook, and ADR-002

- goss/baseline.yaml: assertions for all spec/server-baseline.yaml items
  (packages, services, SSH config, UFW rules, admin user, fail2ban, HISTCONTROL)
- goss/vars/baseline-vars.yaml: parameterised ports and paths
- ansible/roles/goss/: installs Goss binary (v0.4.9), deploys tests,
  runs assertions in TAP format, fetches report to reports/
- ansible/playbooks/verify.yaml: playbook wrapping the goss role
- Makefile: add 'make verify' target; update 'make status' with hint
- docs/adr/ADR-002: formal repo boundary — railiance-hosts vs railiance-bootstrap
- workplans/RAIL-HO-WP-0002: registered workstream 8fed53c2, T03–T06 done

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-09 12:38:48 +01:00
parent 2be5de2a3a
commit 8f5799553e
7 changed files with 242 additions and 5 deletions


@@ -0,0 +1,73 @@
# ADR-002 — Repository Boundary: railiance-hosts vs railiance-bootstrap
**Status:** Accepted
**Date:** 2026-03-09
**Deciders:** Bernd Worsch
---
## Context
Two repositories exist in the Railiance domain that both touch server
configuration:
- **`railiance-hosts`** — manages the OS baseline, security hardening,
inventory, secrets, and test suite for every managed node.
- **`railiance-bootstrap`** — installs Kubernetes (k3s), Helm, GitOps
tooling, and platform services on top of an already-converged base node.

Prior to this ADR, `railiance-bootstrap` contained Ansible playbooks
(`harden.yml`, `bootstrap.yml`) that overlapped with OS-level tasks now
owned by `railiance-hosts`. This created a split responsibility that could
cause drift and conflicting configuration.
---
## Decision
### Ownership table
| Concern | Owner | Notes |
|---------|-------|-------|
| SSH hardening (PermitRootLogin, PasswordAuthentication) | `railiance-hosts` | Defined in `spec/server-baseline.yaml` |
| UFW firewall rules (including k3s/Flannel ports) | `railiance-hosts` | Spec section: `firewall.rules` |
| fail2ban installation and SSH jail | `railiance-hosts` | Spec section: `security.fail2ban_jails` |
| Required OS packages (ufw, fail2ban, git, curl, age, sops) | `railiance-hosts` | Spec section: `packages.installed` |
| Admin user + sudo config | `railiance-hosts` | Spec section: `users` |
| HISTCONTROL and shell security defaults | `railiance-hosts` | Spec section: `security` |
| SOPS/age key agent | `railiance-hosts` | `roles/sops_agent` |
| k3s installation | `railiance-bootstrap` | Consumes a converged base node |
| Helm + GitOps tooling | `railiance-bootstrap` | |
| Application-layer Kubernetes resources | `railiance-bootstrap` | |
### Rule
> **Any item present in `spec/server-baseline.yaml` MUST NOT be managed
> by `railiance-bootstrap`.**

`railiance-bootstrap` may add UFW rules for Kubernetes components (e.g.
NodePort ranges, cluster-internal ports) but must not remove or override
the base rules defined in this repo's spec.
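The add-but-never-override constraint on UFW rules can be checked mechanically. The following is a minimal sketch, not part of this repo: the `UfwRule` fields, the `find_overrides` helper, and the example ports are all assumptions made for illustration and do not reflect the actual schema of `spec/server-baseline.yaml`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UfwRule:
    """One firewall rule. Field names are illustrative, not the spec's schema."""
    port: str    # e.g. "22" or "30000:32767"
    proto: str   # "tcp" or "udp"
    action: str  # "allow" or "deny"


def find_overrides(base: list[UfwRule], extra: list[UfwRule]) -> list[UfwRule]:
    """Return rules in `extra` that hit a base (port, proto) with a different action."""
    base_actions = {(r.port, r.proto): r.action for r in base}
    return [
        r for r in extra
        if (r.port, r.proto) in base_actions
        and base_actions[(r.port, r.proto)] != r.action
    ]


# Hypothetical base rules, standing in for the spec's firewall rules.
base_rules = [UfwRule("22", "tcp", "allow"), UfwRule("6443", "tcp", "allow")]

# Hypothetical rules that a bootstrap playbook might want to apply.
bootstrap_rules = [
    UfwRule("30000:32767", "tcp", "allow"),  # additive: permitted
    UfwRule("22", "tcp", "deny"),            # changes a base rule: violation
]

violations = find_overrides(base_rules, bootstrap_rules)
for rule in violations:
    print(f"override of base rule: {rule.port}/{rule.proto} -> {rule.action}")
```

This sketch only detects overrides. Detecting removal of a base rule would require comparing against the live firewall state, which is what the Goss assertions cover.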
### Superseded files in `railiance-bootstrap`
The following files in `railiance-bootstrap` are superseded by the roles
and spec in `railiance-hosts` and should not be used for new work:
- `ansible/harden.yml`
- `ansible/bootstrap.yml` (the OS-hardening portions)

An ecosystem todo (`[repo:railiance-bootstrap]`) should be filed to
formally retire these files or scope them down to k3s-only tasks.
---
## Consequences
- The `railiance-hosts` converge step (`make converge`) must run and pass
before `railiance-bootstrap` deploys anything.
- Changes to the OS security baseline (new packages, firewall rules,
SSH settings) are made entirely in this repo, in this order: edit
`spec/server-baseline.yaml`, update the Ansible role, then update
`goss/baseline.yaml`.
- `make verify` provides a machine-readable assertion that the converge
step produced the expected state, suitable for CI gating.
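Since the goss role emits its report in TAP format, a CI gate can be as simple as scanning that report for `not ok` lines. The sketch below assumes standard TAP result lines (`ok N - description` / `not ok N - description`). The function names and the sample report text are hypothetical and not taken from an actual run.

```python
import re
import sys

# Matches TAP result lines such as "ok 1 - desc" or "not ok 2 - desc".
TAP_LINE = re.compile(r"^(ok|not ok)\s+(\d+)\s*(?:-\s*)?(.*)$")


def failed_assertions(tap_text: str) -> list[str]:
    """Return the descriptions of all failing ("not ok") TAP result lines."""
    failures = []
    for line in tap_text.splitlines():
        match = TAP_LINE.match(line.strip())
        if match and match.group(1) == "not ok":
            failures.append(match.group(3))
    return failures


def gate(tap_text: str) -> int:
    """Return 0 if every assertion passed, 1 otherwise (usable as an exit code)."""
    failures = failed_assertions(tap_text)
    for description in failures:
        print(f"FAILED: {description}", file=sys.stderr)
    return 1 if failures else 0


# Illustrative report text only; real Goss descriptions will differ.
sample_report = """1..3
ok 1 - Package: ufw: installed
not ok 2 - Service: fail2ban: running
ok 3 - Port: tcp:22: listening
"""

exit_code = gate(sample_report)
```

A CI wrapper would read the fetched report from `reports/` and pass the result of `gate(...)` to `sys.exit`. Plan lines (`1..N`) and comment lines are ignored by this sketch.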