Files
ops-warden/workplans/WARDEN-WP-0001-initial-implementation.md
2026-03-28 00:45:43 +00:00

127 lines
4.9 KiB
Markdown

---
id: WARDEN-WP-0001
type: workplan
title: "OpsWarden Initial Implementation"
domain: custodian
repo: ops-warden
status: draft
owner: Bernd
topic_slug: custodian
created: "2026-03-28"
updated: "2026-03-28"
---
# WARDEN-WP-0001 — OpsWarden Initial Implementation
**Scope:** Deliver a working `warden` CLI that implements the SSH CA and certificate
lifecycle defined in `wiki/AccessManagementDirective.md`. Scaffolding (models, config,
CA backends, inventory, scorecard, CLI) is already present in the repo; this workplan
tracks the remaining implementation, testing, and integration work.
**Out of scope:** Vault HA/cluster setup, Ansible playbooks for host principal deployment
(those live in `railiance-infra`), session recording, and SSO integration (trigger §6.2 of
the directive when scale requires it).
---
## Goal
After this workplan:
1. `warden sign agt-test --pubkey /tmp/test.pub` outputs a valid cert (local backend).
2. `warden status agt-test` shows correct identity, principals, and time-to-expiry.
3. `warden scorecard` returns 4/4 on a clean test inventory.
4. `warden sign` called from ops-bridge `cert_command` works end-to-end in an integration
test tunnel.
5. All tests pass (`uv run pytest`) and lints pass (`uv run ruff check .`).
---
## Reference Documents
| Document | Location |
|---|---|
| AccessManagementDirective | `wiki/AccessManagementDirective.md` |
| cert_command interface | `wiki/CertCommandInterface.md` |
| Config reference | `wiki/OpsWardenConfig.md` |
| ops-bridge alignment workplan | `../ops-bridge/workplans/BRIDGE-WP-0004-directive-alignment.md` |
---
## Architecture Summary
```
~/.config/warden/warden.yaml # backend, ca_key, inventory_path, state_dir
~/.config/warden/inventory.yaml # actor registry (name → type, principals, ttl_hours)
~/.local/state/warden/ # signed certs (*-cert.pub); keypairs (keys/)
```
Two swappable CA backends — both expose the same `sign(spec) -> CertRecord` interface:
- `LocalCA``ssh-keygen -s`; no Vault dependency; default for dev/lab
- `VaultCA` — Vault SSH engine via httpx
cert_command interface (consumed by ops-bridge):
```
warden sign <actor-name> --pubkey <path> # → cert text to stdout
```
---
## Tasks
### T1 — Repository registration
- [ ] Register repo with state-hub (`register_repo`); assign Repo ID; update
`.claude/rules/repo-identity.md`
- [ ] Create state-hub workstream for this workplan
### T2 — LocalCA integration test
- [ ] Generate a test CA key: `ssh-keygen -t ed25519 -f /tmp/test-ca -N ""`
- [ ] Run `warden sign` against a real pubkey with the test CA (requires `ssh-keygen` in PATH)
- [ ] Verify cert parses correctly with `ssh-keygen -L`
- [ ] Add to `tests/test_ca.py` as an integration test (skipped if `ssh-keygen` not in PATH)
### T3 — VaultCA integration test
- [ ] Set up a local Vault dev server (`vault server -dev`)
- [ ] Enable SSH secrets engine: `vault secrets enable ssh`
- [ ] Configure a signing role for `agt`
- [ ] Run `warden sign` with `backend: vault` config
- [ ] Add to `tests/test_vault.py` as an integration test (skipped if Vault not reachable)
### T4 — CLI end-to-end smoke tests
- [ ] `warden inventory add agt-test --type agt --principal agt-task-test`
- [ ] `warden inventory list` shows the actor
- [ ] `warden issue agt-test` (local backend) produces keypair + cert
- [ ] `warden status agt-test` shows valid cert
- [ ] `warden scorecard` returns 4/4
- [ ] `warden inventory remove agt-test` removes actor
### T5 — ops-bridge cert_command integration
- [ ] Add `agt-state-hub-bridge` to inventory (or use existing from ops-bridge config)
- [ ] Set `cert_command: "warden sign agt-state-hub-bridge --pubkey ~/.ssh/agt-state-hub-bridge_ed25519.pub"`
in a test `tunnels.yaml`
- [ ] Run `bridge up state-hub-coulombcore`; confirm cert is present in
`~/.local/state/bridge/` and `cert_identity` appears in the audit log
- [ ] Document result in a progress event
### T6 — CI/CD setup
- [ ] Add `.github/workflows/ci.yml` (or equivalent) running `uv run pytest` and
`uv run ruff check .` on push
- [ ] Tests must pass without Vault (VaultCA integration tests skipped via pytest marker)
### T7 — Documentation
- [ ] `wiki/OpsWardenConfig.md` — annotated `warden.yaml` reference (already stubbed)
- [ ] `wiki/CertCommandInterface.md` — contract for `cert_command` callers (already stubbed)
- [ ] Ensure `wiki/AccessManagementDirective.md` is in sync with `ops-bridge/wiki/`
---
## Acceptance Criteria
- [ ] `warden sign agt-test --pubkey /tmp/test.pub` → valid cert on stdout (local backend)
- [ ] `warden status agt-test` → identity, principals, time-to-expiry shown correctly
- [ ] `warden scorecard` → 4/4 on clean inventory
- [ ] `warden sign` works as `cert_command` in ops-bridge tunnel config
- [ ] All unit tests pass: `uv run pytest`
- [ ] All lints pass: `uv run ruff check .`
- [ ] No secrets (CA private key, certs) committed to repo