docs: add verification guide, close WP-0002
- docs/verification.md: explains spec/server-baseline.yaml, goss/baseline.yaml,
  make verify workflow, assertion mapping table, and how to add new checks
- docs/convergence.md: replace manual spot-check snippet with make verify reference
- workplans/RAIL-HO-WP-0002: mark completed (all tasks done, workstream closed)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -26,19 +26,20 @@ This will:
 
 ## Verifying
 
-Once convergence completes, you can test:
+After convergence, run the automated test suite to assert the node matches the
+baseline spec:
 
 ```bash
-ssh admin@<server-ip>
+make verify
+```
 
-# Check sudo access without password
-sudo -n true && echo "✔ sudo OK"
+This runs Goss assertions against all hosts and exits non-zero on failure.
+TAP reports are written to `reports/`. See `docs/verification.md` for details.
 
-# Firewall status
-sudo ufw status
+For a quick human-readable summary without assertions:
 
-# Installed tools
-htop --version
+```bash
+make status
 ```
 
 ## Notes
79
docs/verification.md
Normal file
@@ -0,0 +1,79 @@
# Server Verification

RailianceHosts ships a declarative baseline spec and a Goss test suite that
asserts every managed node matches it. This replaces manual spot-checks with
a reproducible, CI-friendly pass/fail verdict.

## The spec

`spec/server-baseline.yaml` is the single source of truth for the target state
of every managed node. It covers:

- **Firewall** — UFW active, default deny inbound, required ports allowed
  (SSH 22/tcp, k3s API 6443/tcp, Flannel VXLAN 8472/udp)
- **SSH daemon** — root login disabled, password auth disabled, pubkey auth enabled
- **Services** — ufw, fail2ban, ssh.socket enabled and running
- **Packages** — ufw, fail2ban, git, curl, vim, htop (age and sops installed as binaries)
- **Users** — admin user with bash shell and passwordless sudo
- **Security** — fail2ban sshd jail active, HISTCONTROL=ignorespace in /etc/profile.d/

When you change the desired state of a node, update this file first. Then
update the Ansible role **and** the Goss tests to match.
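
To illustrate what one spec item means in practice, the sketch below checks a config file for the three SSH daemon settings listed above. The function name `ssh_hardening_ok` and the exact directive lines it looks for are assumptions made for this example; the authoritative checks remain the Goss assertions in `goss/baseline.yaml`.

```bash
# Hypothetical helper (not part of the repository): succeeds only if the given
# sshd config file contains all three directives the spec lists for SSH.
# The directive spellings below are assumed, not copied from the real role.
ssh_hardening_ok() {
    conf="$1"
    for want in "PermitRootLogin no" "PasswordAuthentication no" "PubkeyAuthentication yes"; do
        # -i: sshd keywords are case-insensitive; -x: the pattern must match a whole line
        if ! grep -qix "[[:space:]]*${want}[[:space:]]*" "$conf"; then
            echo "missing: ${want}" >&2
            return 1
        fi
    done
    return 0
}
```

On a converged node this could be called as `ssh_hardening_ok /etc/ssh/sshd_config.d/10-hardening.conf`. It only matches directives separated by a single space, so treat it as a spot-check rather than a replacement for `make verify`.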

## Running verification

```bash
make verify
```

This runs `ansible/playbooks/verify.yaml` against all hosts. The playbook:

1. Downloads the Goss binary (pinned version) to `/usr/local/bin/goss`
2. Copies `goss/baseline.yaml` to `/etc/goss/baseline.yaml` on each host
3. Runs `goss validate --format tap`
4. Fails the play (non-zero exit) if any assertion fails
5. Fetches the TAP report to `reports/goss-<host>-<timestamp>.tap`
6. Auto-commits the report to git

**All assertions passed** → exit 0
**One or more assertions FAILED** → exit non-zero, TAP report in `reports/`
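
The pass/fail convention above can also be applied to a fetched report after the fact. The following is a minimal sketch, assuming the report uses standard TAP result lines (`ok ...` and `not ok ...`); `tap_summary` is a hypothetical helper name, not something the playbook provides.

```bash
# Hypothetical helper: summarize one TAP report and mirror the exit-code
# convention described above (zero only when nothing failed).
tap_summary() {
    report="$1"
    # TAP result lines start with "ok" or "not ok"; the plan line ("1..N") is ignored.
    # grep -c exits non-zero on a zero count, so "|| true" keeps the count usable.
    passed=$(grep -E -c '^ok( |$)' "$report" || true)
    failed=$(grep -c '^not ok' "$report" || true)
    echo "passed=${passed} failed=${failed}"
    [ "$failed" -eq 0 ]
}
```

For example, `tap_summary reports/goss-<host>-<timestamp>.tap` prints the two counts and returns non-zero if any assertion failed, which makes it usable as a CI gate on an already-fetched report.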

## After convergence

The standard workflow after converging a new or updated node:

```bash
make converge   # bring the node to the desired state
make verify     # assert it got there
```

Run `make status` for a quick human-readable summary; run `make verify` when
you need a structured, automatable check.

## Goss test file

`goss/baseline.yaml` contains one Goss assertion per spec item. The mapping is:

| spec section | Goss resource |
|---|---|
| `firewall` | `command: ufw status` stdout patterns |
| `ssh` | `file: /etc/ssh/sshd_config.d/10-hardening.conf` contains |
| `services` | `service:` blocks |
| `packages` | `package:` blocks |
| `users` | `user:` block + `command: grep NOPASSWD` |
| `security.histcontrol` | `command: grep -r HISTCONTROL /etc/profile.d/` |
| `security.fail2ban_jails` | `command: fail2ban-client status sshd` |
| `age`, `sops` (binary installs) | `command: test -x /usr/local/bin/{age,sops}` |
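
Several rows in the table wrap plain shell commands, so the same checks can be reproduced by hand when debugging a failing assertion. As a sketch of the last row, the hypothetical helper `missing_binaries` below applies `test -x` to each named binary in a directory; the helper itself is an assumption for illustration and is not part of the Goss file.

```bash
# Hypothetical helper: print the names of binaries that are missing or not
# executable in a directory, using the same `test -x` check as the mapping table.
missing_binaries() {
    dir="$1"; shift
    rc=0
    for name in "$@"; do
        if ! test -x "${dir}/${name}"; then
            echo "$name"
            rc=1
        fi
    done
    # Non-zero when at least one binary failed the check
    return "$rc"
}
```

On a managed node, `missing_binaries /usr/local/bin age sops` should print nothing and return zero once convergence has installed both tools.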

## Adding new assertions

1. Add the desired state to `spec/server-baseline.yaml`
2. Add the Ansible task to `ansible/roles/base/tasks/main.yml`
3. Add the Goss assertion to `goss/baseline.yaml`
4. Run `make converge && make verify` to confirm
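
Because the first three steps touch three separate files, it is easy to forget one. The sketch below is a rough pre-flight check, assuming the new item can be identified by a single keyword (for example a package name) that appears in all three files; `assertion_wired` is a hypothetical helper, and a keyword match does not prove the entries are correct.

```bash
# Hypothetical helper: check that a keyword appears in each of the three files
# named in the steps above. The second argument is the repository root
# (defaults to the current directory).
assertion_wired() {
    keyword="$1"
    root="${2:-.}"
    rc=0
    for f in spec/server-baseline.yaml ansible/roles/base/tasks/main.yml goss/baseline.yaml; do
        # -F: treat the keyword as a fixed string, not a regular expression
        if ! grep -qF -- "$keyword" "${root}/${f}" 2>/dev/null; then
            echo "not found in ${f}"
            rc=1
        fi
    done
    return "$rc"
}
```

Running `assertion_wired tmux` from the repository root before step 4 would list any file that does not yet mention the (illustrative) `tmux` package.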

## Reports

TAP reports are committed to `reports/` after each `make verify` run.
They are machine-readable and suitable for CI pipelines. A cleanup policy
for old reports is tracked as extension point EP `78ef4879`.
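
Since reports accumulate per host, a small lookup can help when only the most recent result matters. This is a sketch under two assumptions that the guide does not state: that `<timestamp>` sorts lexicographically (as an ISO 8601-style stamp would), and that no host name is a prefix of another followed by a hyphen. `latest_report` is a hypothetical helper name.

```bash
# Hypothetical helper: print the newest TAP report for one host.
# Relies on shell glob expansion being sorted, so the last match is the
# latest report when timestamps sort lexicographically (an assumption).
latest_report() {
    host="$1"
    dir="${2:-reports}"
    latest=""
    for f in "${dir}/goss-${host}-"*.tap; do
        # An unmatched glob stays literal, so confirm the file really exists
        if [ -e "$f" ]; then
            latest="$f"
        fi
    done
    # Return non-zero when the host has no reports at all
    [ -n "$latest" ] && echo "$latest"
}
```

For example, `latest_report <host>` run from the repository root prints the path of that host's newest report, which can then be inspected or passed to other tooling.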
@@ -4,12 +4,13 @@ type: workplan
 title: "Server Specification and Automated Test Suite"
 domain: railiance
 repo: railiance-hosts
-status: active
+status: completed
 owner: railiance
 topic_slug: railiance
 state_hub_workstream_id: "8fed53c2-4c39-4471-8bb9-61f58771fe0c"
 created: "2026-03-09"
 updated: "2026-03-09"
+completed: "2026-03-09"
 ---
 
 # Server Specification and Automated Test Suite