| id | type | title | domain | repo | status | owner | topic_slug | created | updated | state_hub_workstream_id |
|---|---|---|---|---|---|---|---|---|---|---|
| WARDEN-WP-0001 | workplan | OpsWarden Initial Implementation | custodian | ops-warden | active | Bernd | custodian | 2026-03-28 | 2026-03-28 | c3118cc6-adfb-428c-a9c6-edd0ee152ae6 |
WARDEN-WP-0001 — OpsWarden Initial Implementation
Note: This workplan is authored in `ops-bridge` because `ops-warden` does not yet exist. Move it to `workplans/WARDEN-WP-0001-initial-implementation.md` in the new repo as the first commit action.
Scope: Bootstrap the ops-warden repository and deliver a working warden CLI that
implements the SSH CA and certificate lifecycle defined in wiki/AccessManagementDirective.md.
Out of scope: Vault HA/cluster setup, Ansible playbooks for host principal deployment
(those live in railiance-infra), session recording, and SSO integration (trigger §6.2 of
the directive when scale requires it).
Goal
Create a new ops-warden repository that owns credential issuance only — the CA,
certificate signing, actor identity registry, and scorecard tooling. Its sole public surface
to sibling repos is a well-defined cert_command interface that any tool (principally
ops-bridge) can call to obtain a short-lived, CA-signed SSH certificate for a named actor.
Reference Documents
| Document | Location |
|---|---|
| AccessManagementDirective | ops-bridge/wiki/AccessManagementDirective.md |
| ops-bridge SCOPE.md | ops-bridge/SCOPE.md |
Architecture
ops-warden/
├── SCOPE.md
├── CLAUDE.md
├── pyproject.toml
├── src/warden/
│ ├── cli.py # Typer CLI: sign / issue / status / inventory / scorecard
│ ├── models.py # ActorType enum, CertSpec, CertRecord, PrincipalsInventory
│ ├── ca.py # LocalCA backend (file-based, for dev / non-Vault)
│ ├── vault.py # VaultCA backend (Vault SSH engine, for production)
│ ├── inventory.py # YAML principals inventory read/write
│ ├── scorecard.py # §5 compliance checks
│ └── config.py # ~/.config/warden/warden.yaml loader
├── tests/
└── wiki/ # (symlink or copy of AccessManagementDirective.md)
Backends are swappable. The config key `backend: local | vault` selects which CA
implementation is used. The tool is therefore fully functional without Vault for local lab
use and production-grade with Vault, with the same CLI surface, the same cert_command
interface, and the same principals inventory format in both cases.
cert_command interface contract:
warden sign <actor-name> --pubkey <path>
Writes the signed certificate to stdout (the cert text). Exits non-zero on failure.
ops-bridge calls this verbatim via cert_command in tunnels.yaml.
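From the caller's side, the contract above might be consumed roughly as follows. This is a minimal sketch, not ops-bridge's actual code: the helper names `build_sign_argv` and `fetch_cert` and the timeout value are hypothetical, and the only behaviour assumed is what the contract states (certificate text on stdout, non-zero exit on failure).

```python
import subprocess
from pathlib import Path


def build_sign_argv(actor_name: str, pubkey_path: Path) -> list[str]:
    """Build the argv for the contract: warden sign <actor-name> --pubkey <path>."""
    return ["warden", "sign", actor_name, "--pubkey", str(pubkey_path)]


def fetch_cert(actor_name: str, pubkey_path: Path, timeout_s: float = 30.0) -> str:
    """Run the cert_command and return the certificate text read from stdout.

    Raises subprocess.CalledProcessError when warden exits non-zero.
    The 30 s default timeout is an illustrative assumption.
    """
    result = subprocess.run(
        build_sign_argv(actor_name, pubkey_path),
        capture_output=True,
        text=True,
        check=True,  # a non-zero exit becomes an exception, per the contract
        timeout=timeout_s,
    )
    return result.stdout.strip()
```

Keeping the argv construction separate from the subprocess call lets a caller unit-test the command line without `warden` being installed.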
Stack
- Language: Python 3.11+
- CLI framework: Typer
- Dependencies: typer, pyyaml, httpx, cryptography (for cert parsing / TTL reading)
- Vault SDK: `hvac` (optional; only required for the vault backend)
- Packaging: `uv tool install`
Tasks
T1 — Repository bootstrap
id: WARDEN-WP-0001-T1
state_hub_task_id: 6d643e9d-5e97-4224-9d82-87267b5ba6bc
status: todo
priority: high
- Create `ops-warden` repo; copy CLAUDE.md template from `ops-bridge`; add `workplans/WARDEN-WP-0001-initial-implementation.md` (this file)
- Write `SCOPE.md` (see template in §SCOPE below)
- `pyproject.toml`: `[project.scripts] warden = "warden.cli:app"`
- Register repo with state-hub (`register_repo`)
- Create state-hub workstream for this workplan
T2 — Models and config
id: WARDEN-WP-0001-T2
state_hub_task_id: c66fc65a-0b16-4ba2-9e70-a83d875572ec
status: todo
priority: high
- `models.py`: `ActorType` enum (adm | agt | atm); `CertSpec` (actor_name, pubkey_path, ttl_hours, principals); `CertRecord` (identity, valid_before, cert_path, signed_at)
- `config.py`: load `~/.config/warden/warden.yaml`; required fields: `backend`, `ca_key` (local) or `vault_addr` + `vault_role_map` (vault); optional: `inventory_path`, `state_dir`
- Validate actor name prefix matches `ActorType` (`adm-*`, `agt-*`, `atm-*`)
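The models and the prefix rule in T2 could look roughly like this. It is a sketch under assumptions: the field names come from the task description, but the helper `actor_type_from_name`, the use of frozen dataclasses, and the positive-TTL check are illustrative choices rather than decided design.

```python
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path


class ActorType(str, Enum):
    ADM = "adm"
    AGT = "agt"
    ATM = "atm"


def actor_type_from_name(actor_name: str) -> ActorType:
    """Derive the ActorType from the name prefix (adm-*, agt-*, atm-*)."""
    prefix, sep, rest = actor_name.partition("-")
    if not sep or not rest:
        raise ValueError(f"actor name {actor_name!r} must look like <type>-<name>")
    try:
        return ActorType(prefix)
    except ValueError:
        raise ValueError(
            f"unknown actor type prefix {prefix!r} in {actor_name!r}"
        ) from None


@dataclass(frozen=True)
class CertSpec:
    actor_name: str
    pubkey_path: Path
    ttl_hours: int
    principals: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        actor_type_from_name(self.actor_name)  # raises on a bad prefix
        if self.ttl_hours <= 0:  # assumption: a zero or negative TTL is never valid
            raise ValueError("ttl_hours must be positive")
```

Validating in `__post_init__` means a malformed actor name is rejected before any backend is asked to sign.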
T3 — LocalCA backend
id: WARDEN-WP-0001-T3
state_hub_task_id: a5a41e58-1c6d-42a9-9b11-2088f17c29b5
status: todo
priority: high
- `ca.py`: `LocalCA.sign(spec: CertSpec) -> CertRecord`
  - Calls `ssh-keygen -s <ca_key> -I <identity> -n <principals> -V +<ttl>h <pubkey>`
  - Parses `ssh-keygen -L -f <cert>` output to extract the expiry (the `to` timestamp on the `Valid:` line, stored as `valid_before`), `Key ID`, and `Principals`
  - Returns `CertRecord`; writes cert to `~/.local/state/warden/<actor>.cert.pub`
- Default TTLs enforced per `ActorType`: adm → 48 h, agt → 24 h, atm → 8 h (overridable per actor in inventory)
- `LocalCA.generate_keypair(actor_name) -> (privkey_path, pubkey_path)`: for agt/atm actors that do not bring their own key
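The two mechanical parts of T3, building the `ssh-keygen -s` command line and reading back `ssh-keygen -L` output, can be sketched without touching the filesystem. The function names below are hypothetical, and the parser assumes the usual `Key ID: "..."`, `Valid: from <ts> to <ts>` and indented `Principals:` layout; certificates listed as `Valid: forever` are deliberately rejected here.

```python
import re
from datetime import datetime

# Default TTLs per actor type, as stated in T3.
DEFAULT_TTL_HOURS = {"adm": 48, "agt": 24, "atm": 8}


def build_keygen_argv(
    ca_key: str, identity: str, principals: list[str], ttl_hours: int, pubkey: str
) -> list[str]:
    """Build the ssh-keygen signing command described in T3."""
    return [
        "ssh-keygen",
        "-s", ca_key,
        "-I", identity,
        "-n", ",".join(principals),
        "-V", f"+{ttl_hours}h",
        pubkey,
    ]


def parse_cert_listing(listing: str) -> dict:
    """Extract key ID, expiry, and principals from `ssh-keygen -L` text."""
    key_id = re.search(r'^\s*Key ID:\s*"(.*)"\s*$', listing, re.MULTILINE)
    valid = re.search(
        r"^\s*Valid:\s*from\s+(\S+)\s+to\s+(\S+)\s*$", listing, re.MULTILINE
    )
    if key_id is None or valid is None:
        raise ValueError("listing lacks a Key ID or a bounded Valid interval")

    principals: list[str] = []
    in_principals = False
    for line in listing.splitlines():
        stripped = line.strip()
        if stripped.startswith("Principals:"):
            in_principals = True
            continue
        if in_principals:
            # The principals block ends at the next section or a blank line.
            if not stripped or stripped.startswith("Critical Options:"):
                break
            principals.append(stripped)

    return {
        "key_id": key_id.group(1),
        "valid_before": datetime.fromisoformat(valid.group(2)),
        "principals": principals,
    }
```

Because both helpers are pure functions, the unit tests in T9 can cover them with canned strings, leaving only the subprocess call itself to be mocked.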
T4 — VaultCA backend
id: WARDEN-WP-0001-T4
state_hub_task_id: b2067ee6-c9ce-423b-9d60-0d28069fb304
status: todo
priority: medium
- `vault.py`: `VaultCA.sign(spec: CertSpec) -> CertRecord`
  - `POST /v1/ssh/sign/<role>` with `public_key`, `valid_principals`, `ttl`
  - Parse response `signed_key` field; write to state dir; extract metadata via `ssh-keygen -L`
- Role map in config: `vault_role_map: {adm: adm-role, agt: agt-role, atm: atm-role}`
- Graceful error message when Vault is unreachable (with `--backend local` fallback hint)
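The request and response handling for the Vault backend might be shaped as below. This sketch covers only payload construction and response parsing; sending the request (for example with httpx and a Vault token header) is left out. The function names are hypothetical, the SSH engine is assumed to be mounted at `ssh/` as the path in T4 implies, and `signed_key` is assumed to sit under the response's `data` object.

```python
import json
from urllib.parse import quote


def build_sign_request(
    vault_addr: str,
    role: str,
    public_key: str,
    principals: list[str],
    ttl_hours: int,
) -> tuple[str, dict]:
    """Return (url, json_payload) for POST /v1/ssh/sign/<role>."""
    url = f"{vault_addr.rstrip('/')}/v1/ssh/sign/{quote(role, safe='')}"
    payload = {
        "public_key": public_key,
        "valid_principals": ",".join(principals),  # comma-separated string
        "ttl": f"{ttl_hours}h",
    }
    return url, payload


def extract_signed_key(response_body: str) -> str:
    """Pull the signed certificate text out of a sign response body."""
    body = json.loads(response_body)
    try:
        return body["data"]["signed_key"].strip()
    except (KeyError, TypeError, AttributeError):
        raise ValueError("Vault response has no data.signed_key") from None
```

A role name would come from `vault_role_map`, for example `agt-role` for an `agt-*` actor.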
T5 — Principals inventory
id: WARDEN-WP-0001-T5
state_hub_task_id: 6d13f8cd-1850-44c9-b769-b21250348319
status: todo
priority: high
- `inventory.py`: load/save `inventory.yaml` (format mirrors §4.1 of directive):

  ```yaml
  actors:
    agt-state-hub-bridge:
      type: agt
      principals: [agt-task-bridge]
      ttl_hours: 24
      description: "ops-bridge tunnel actor"
  hosts:
    coulombcore:
      allowed_principals:
        agt: [agt-task-bridge]
        atm: [atm-backup-daily]
  ```
- `warden inventory list`: print table
- `warden inventory add <actor-name> --type <adm|agt|atm> --principals <...>`
- `warden inventory remove <actor-name>`
T6 — CLI commands
id: WARDEN-WP-0001-T6
state_hub_task_id: 656a4615-92bb-4b5d-9406-e86d24fa15d0
status: todo
priority: high
- `warden sign <actor-name> --pubkey <path>`: sign existing pubkey; write cert to stdout (the `cert_command` interface for ops-bridge)
- `warden issue <actor-name>`: generate keypair + sign; output JSON with `privkey`, `cert`, `valid_before`, `identity`
- `warden status [actor-name]`: show cert validity, identity, principals, TTL remaining; `--all` flag to show all actors in state dir
- `warden scorecard`: run §5 checks (see T7)
- `warden inventory <subcommand>` (list / add / remove)
T7 — Scorecard runner
id: WARDEN-WP-0001-T7
state_hub_task_id: 7818bcc5-f40e-4793-b117-d36f653ffeed
status: todo
priority: medium
- `scorecard.py`: implement each §5 row as a named check function returning `CheckResult(name, passed, detail)`
- Checks in scope for `ops-warden` (local checks, not host-side):
  - All certs in state dir respect TTL policy for their `ActorType`
  - No actor in inventory lacks a `principals` entry
  - Actor name prefix matches declared type
  - No cert expired by more than 5 min still present in state dir (stale cleanup)
- Host-side checks (password auth disabled, root login disabled, etc.) are out of scope; those live in the Ansible `ssh-access-audit.yml` playbook in `railiance-infra`
- `warden scorecard --json` for machine-readable output
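Three of the local checks listed in T7 could be written as small named functions like these. The sketch assumes the inventory's `actors` mapping is already loaded as a plain dict and that cert expiries are available as datetimes; the check names, the function names, and the `detail` wording are illustrative, and only the `CheckResult(name, passed, detail)` shape comes from the task.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

STALE_GRACE = timedelta(minutes=5)  # "expired by more than 5 min" from T7


@dataclass(frozen=True)
class CheckResult:
    name: str
    passed: bool
    detail: str


def _result(name: str, offenders: list[str]) -> CheckResult:
    """Pass when there are no offenders; otherwise list them in the detail."""
    detail = "ok" if not offenders else "offenders: " + ", ".join(offenders)
    return CheckResult(name, not offenders, detail)


def check_principals_present(actors: dict[str, dict]) -> CheckResult:
    """No actor in the inventory lacks a principals entry."""
    missing = sorted(n for n, a in actors.items() if not a.get("principals"))
    return _result("principals-present", missing)


def check_prefix_matches_type(actors: dict[str, dict]) -> CheckResult:
    """Actor name prefix matches the declared type."""
    bad = sorted(
        n for n, a in actors.items() if not n.startswith(f"{a.get('type')}-")
    )
    return _result("prefix-matches-type", bad)


def check_no_stale_certs(expiries: dict[str, datetime], now: datetime) -> CheckResult:
    """No cert expired by more than the grace period is still present."""
    stale = sorted(n for n, exp in expiries.items() if now - exp > STALE_GRACE)
    return _result("no-stale-certs", stale)
```

Passing `now` explicitly keeps the stale-cert check deterministic in tests, and a list of `CheckResult` values can be serialized directly for `--json` output.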
T8 — ops-ssh-wrapper script
id: WARDEN-WP-0001-T8
state_hub_task_id: e9c28152-5785-4995-83a5-439985ed3db9
status: todo
priority: medium
- Ship `scripts/ops-ssh-wrapper` (the Python snippet from §4.1, hardened):
  - Reads `WARDEN_ACTOR` and `SSH_PUBKEY` env vars
  - Calls `warden sign $WARDEN_ACTOR --pubkey $SSH_PUBKEY`
  - Loads cert via `ssh-add`; execs the given command
- Install as part of `uv tool install` entry points
T9 — Tests
id: WARDEN-WP-0001-T9
state_hub_task_id: 950139ab-cc17-4f1d-9a17-d5744e402ddf
status: todo
priority: high
- Unit tests for `LocalCA` (mock `ssh-keygen` subprocess)
- Unit tests for inventory YAML round-trip
- Unit tests for actor name prefix validation
- Integration test: `LocalCA.sign` on a real test keypair (requires `ssh-keygen` in PATH)
- Scorecard unit tests (mock cert records)
T10 — Documentation
id: WARDEN-WP-0001-T10
state_hub_task_id: 271d6759-e359-41ce-80e4-76c574634a87
status: todo
priority: medium
- `SCOPE.md` (see below)
- `wiki/AccessManagementDirective.md`: copy from `ops-bridge/wiki/`
- `wiki/OpsWardenConfig.md`: annotated `warden.yaml` reference
- `wiki/CertCommandInterface.md`: contract for `cert_command` callers (ops-bridge etc.)
SCOPE.md Template
# SCOPE
## One-liner
SSH Certificate Authority and credential issuance for the ops fleet —
signs short-lived certs for adm/agt/atm actors; provides the cert_command
interface consumed by ops-bridge and other tooling.
## Core Idea
Implements AccessManagementDirective §§1–5. Owns the CA key, actor inventory,
signing logic, and scorecard. Does not own tunnel lifecycle, host provisioning,
or SSH key generation for humans.
## In Scope
- Local CA backend (ssh-keygen -s) for lab / non-Vault use
- Vault SSH engine backend for production
- Actor identity registry (inventory.yaml)
- cert_command CLI interface: `warden sign <actor> --pubkey <path>`
- TTL policy enforcement per ActorType (adm/agt/atm)
- Certificate status and stale-cert cleanup
- Scorecard checks (local / cert-side only)
- ops-ssh-wrapper script for agt/atm startup automation
## Out of Scope
- Host-side principal deployment (railiance-infra Ansible)
- SSH key generation for human admins (self-service: ssh-keygen)
- Vault cluster setup / HA
- Session recording, audit forwarding to SIEM (host-side)
- Tunnel lifecycle (ops-bridge)
- SSO / Teleport (trigger when §6.2 scale thresholds are hit)
## Relevant When
- Issuing or refreshing a cert for any adm/agt/atm actor
- Checking cert validity / scorecard compliance
- ops-bridge needs cert_command to be defined
- Adding a new actor to the principals inventory
## Not Relevant When
- Managing tunnel lifecycle (ops-bridge)
- Deploying SSH config to hosts (railiance-infra)
- All access is via static keys with no TTL (legacy mode)
## Current State
Status: planned (WARDEN-WP-0001 not yet started)
## Related Repositories
- ops-bridge — primary consumer of cert_command interface
- railiance-infra — owns host-side principal deployment
- the-custodian/state-hub — registers domain/workstreams
Acceptance Criteria
- `warden sign agt-test-actor --pubkey /tmp/test.pub` outputs a valid cert (local backend)
- `warden status agt-test-actor` shows correct identity, principals, and time-to-expiry
- `warden scorecard` returns 5/5 on a clean test inventory
- `warden sign` called from ops-bridge `cert_command` in an integration test tunnel
- All tests pass: `uv run pytest`
- All lints pass: `uv run ruff check .`