---
id: NET-WP-0019
type: workplan
title: "T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements"
domain: netkingdom
repo: net-kingdom
status: finished
owner: codex
topic_slug: netkingdom
created: "2026-06-03"
updated: "2026-06-05"
depends_on:
- NET-WP-0017
- NET-WP-0018
state_hub_workstream_id: "75d388b6-7ec1-4e1b-8c87-6ff44f953210"
related:
- docs/user-engine-netkingdom-integration-assessment.md (broader user-engine vs net-kingdom fit, gaps, and recommendations)
---

# NET-WP-0019 - T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements

## Goal

Polish and automate the non-root user lifecycle dry-run experience (the T06 gate from NET-WP-0017) so that it is repeatable, safe, console-driven, and aligned with the bootstrap automation goals of NET-WP-0018. Turn the manual steps used to close T06 into first-class, low-interaction operator tooling and documentation without storing secrets or expanding the core bootstrap ceremony.

This addresses the "adjacent" rough edges discovered while closing NET-WP-0017 T06: manual secret extraction and cleanup, hand-crafted evidence, no orchestrator for the full create/verify/lock/offboard cycle, limited exposure in the control surface, and no easy, repeatable dry-run for testing and rebuilds.

## Strategy

Build directly on the T05 lifecycle flow (lifecycle-guide + templates) and the T06 dry-run execution that proved it:

- Add a safe, self-contained dry-run orchestrator script that can be invoked from the console or make.
- Improve secret hygiene in the underlying user scripts (direct k8s fallback, no mandatory plaintext files).
- Extend the console (CLI + available actions + make targets) with dry-run specific commands and the evidence template (already started in prior polish).
- Add a cleanup helper for test users.
- Expose more of the dry-run flow in the web-ui where that is easy.
- Provide better OIDC claims verification hooks for dry-runs.
- Document the repeatable process and tie it explicitly to 0018's control surface / runbook / validation tasks.

Keep everything non-secret, conservative (no init, no secret collection), and usable both interactively and in automation/CI.

Prefer extending existing patterns (the security-bootstrap-console.py templates/guides, the k8s/ scripts, the inventory helpers in .local) rather than adding large new components.

## Tasks

### T01 - Add Dedicated Dry-Run Orchestrator Script

```task
id: NET-WP-0019-T01
status: done
priority: high
state_hub_task_id: "03e03868-a07d-478c-9808-f9decaeab2e8"
```

Create `sso-mfa/k8s/lldap/dry-run-nonroot-user.sh` (or an equivalent in tools/) that:

- Takes username, email, and display name, plus optional actor/scope flags.
- Safely extracts the LLDAP admin password from the k8s secret into a /tmp file with strict permissions and trap cleanup (never touches the git-ignored persistent secrets/ tree unless explicitly allowed).
- Runs create-user.sh --test (or equivalent) for a non-root user (enforcing no --admin for normal users).
- Runs the standard verification commands (check-mfa-state, keycape verify, LLDAP inventory for groups).
- Exercises lock (remove from the net-kingdom-users group via GraphQL) and offboard (deleteUser) with previews.
- Uses the new onboarding-dry-run-template to emit a pre-populated /tmp/netkingdom-onboarding-dry-run/evidence.json with actual data from queries/outputs.
- Cleans up temp artifacts and optionally removes the test user at the end unless --keep is given.
- Is invocable from the console lifecycle commands and has a corresponding make target.

Done when the script exists, is executable, is documented in the lifecycle-guide, and a full dry-run can be performed with one or two commands that produce valid evidence.

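
The temp-file handling described in the second bullet can be sketched as follows. The orchestrator itself is a shell script that relies on `trap`; this Python version only illustrates the same pattern (a private directory, a file created with mode 0600 from the start, and cleanup on every exit path). The helper names `private_workspace` and `write_private_file` and the file name `admin-pass` are hypothetical and do not come from the repository.

```python
import os
import shutil
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def private_workspace(prefix="netkingdom-dry-run-"):
    """Yield a private temp directory that is removed on exit, even on error."""
    # mkdtemp creates the directory readable/writable only by the current user.
    workdir = tempfile.mkdtemp(prefix=prefix)
    try:
        yield workdir
    finally:
        shutil.rmtree(workdir, ignore_errors=True)


def write_private_file(workdir, name, content):
    """Create a new file with mode 0600 and write content to it."""
    path = os.path.join(workdir, name)
    # O_EXCL refuses to reuse an existing file; 0o600 applies at creation time.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    return path


if __name__ == "__main__":
    with private_workspace() as workdir:
        # Placeholder value only; a real run would write the extracted password.
        path = write_private_file(workdir, "admin-pass", "placeholder")
        print(path, oct(stat.S_IMODE(os.stat(path).st_mode)))
```

Creating the file with restrictive permissions, rather than writing it and then running `chmod`, avoids a short window in which the content is readable by other users.
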
**Prior notes from T06 closure:** The exact manual sequence (temp secrets, create, GraphQL lock/offboard, evidence) is captured in the NET-WP-0017 T06 workplan note and the T06 section of the lifecycle-guide. This task automates that sequence.

**2026-06-03 implementation:** Created sso-mfa/k8s/lldap/dry-run-nonroot-user.sh (executable). It uses a /tmp workspace with a trap, extracts the k8s secret safely, runs create-user via a temp secrets dir, performs the verifications, locks and offboards via GraphQL, calls the Python template to emit a populated evidence.json, and cleans up. It reuses the same patterns as netkingdom-lifecycle-inventory.sh. Ready for testing.

### T02 - Safer Secret Handling In User Lifecycle Scripts

```task
id: NET-WP-0019-T02
status: done
priority: high
state_hub_task_id: "564631a6-9b28-4e23-a852-5d85ade94a76"
```

Update `sso-mfa/k8s/lldap/create-user.sh` (and related scripts such as break-glass.sh, if applicable) to support a direct k8s secret fallback that does not require a local secrets.env file on disk:

- Make LLDAP_ADMIN_PASS overridable via an env var.
- If there is no local LLDAP_ENV and KUBECTL is available, extract the password from the in-cluster secret (sso/lldap-secrets) using the same pattern as netkingdom-lifecycle-inventory.sh.
- Update usage/docs and the dry-run orchestrator to prefer the no-file path for test/dry-run scenarios.
- Ensure the password-set port-forward + ldap3 path still works.
- Add a --from-k8s or similar flag if explicitness is needed.
- Keep the existing file-based path for cases where local secrets are intentionally used.

This eliminates the "create temp secrets.env, then rm" step that the original T06 dry-run required, improving taint hygiene and repeatability.

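
The lookup order implied by the bullets above (env var, then local file, then in-cluster secret) might look roughly like this. It is a Python sketch, not the shell logic in create-user.sh. The namespace and secret name (`sso`, `lldap-secrets`) come from this workplan; the key name in `SECRET_KEY`, the `KEY=VALUE` layout of the env file, and the function names are assumptions made for illustration.

```python
import base64
import os
import subprocess

# Assumption: the key inside sso/lldap-secrets that holds the admin password.
SECRET_KEY = "LLDAP_ADMIN_PASS"


def run_kubectl(args):
    """Run kubectl with the given arguments and return its stdout."""
    result = subprocess.run(
        ["kubectl", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def read_env_file(path, name):
    """Return the value of NAME from a KEY=VALUE file, or None if absent."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            key, sep, value = line.strip().partition("=")
            if sep and key == name:
                return value.strip("'\"")
    return None


def resolve_admin_pass(environ, env_file=None, runner=run_kubectl):
    """Return (password, source), trying env var, then file, then cluster."""
    value = environ.get("LLDAP_ADMIN_PASS")
    if value:
        return value, "env"
    if env_file and os.path.isfile(env_file):
        value = read_env_file(env_file, "LLDAP_ADMIN_PASS")
        if value:
            return value, "file"
    # Secret data is base64-encoded in the API response, so decode it here.
    encoded = runner([
        "-n", "sso", "get", "secret", "lldap-secrets",
        "-o", f"jsonpath={{.data.{SECRET_KEY}}}",
    ])
    return base64.b64decode(encoded).decode("utf-8"), "k8s"
```

A caller would use `password, source = resolve_admin_pass(os.environ)` and log only `source`, never the password. Passing `runner` as a parameter keeps the cluster call replaceable in tests.
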
**2026-06-03 implementation:** Updated create-user.sh to fall back to k8s secret extraction (using the same pattern as the inventory scripts) when no local LLDAP_ENV is present and LLDAP_ADMIN_PASS is not already set in the environment. The dry-run orchestrator uses the temp /tmp path plus the new fallback. Updated usage comments and error messages. The safer path is now preferred for automation and dry-runs.

Also update the lifecycle-guide and the new orchestrator to document and use the safer path.

### T03 - Console And Make Integration For Dry-Run

```task
id: NET-WP-0019-T03
status: done
priority: medium
state_hub_task_id: "7a264b8a-1b71-4a3e-835b-3c27676d28ef"
```

Extend the security-bootstrap-console:

- Add `print_onboarding_dry_run_guide()` (or extend the existing lifecycle one) and a `lifecycle-dry-run` or `onboarding-dry-run` CLI subcommand that prints the full guided sequence and invokes the orchestrator script if present.
- Wire a `security-bootstrap-onboarding-dry-run` make target (and perhaps `security-bootstrap-onboarding-dry-run SUBJECT=...`) that runs the orchestrator plus validation.
- Ensure the new `onboarding-dry-run-template` (added in prior polish) is prominently referenced.
- Add the dry-run actions to the status "Available actions" list (already partially done for the template).
- Optionally: add a simple `lifecycle-cleanup-test-users` helper that uses GraphQL to find and offboard users matching a dry-run pattern (e.g. `t06-*`, `dryrun-*`).

**2026-06-03 implementation:** Added the `onboarding-dry-run` subcommand to the console (prints guidance and points at the orchestrator script). Added the `make security-bootstrap-onboarding-dry-run` target (with SUBJECT/EMAIL/DISPLAY support; it invokes the script). Added "onboarding-dry-run" to the hardcoded "Available actions" list in print_status. The template was already wired previously. (The T04 cleanup helper and the full web-ui card are left as follow-up.)

Update the status print and any relevant payloads.

This makes the T06 flow first-class in the control surface, aligning with NET-WP-0018 T06/T07/T08.

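
As a rough illustration of the subcommand wiring, the sketch below registers an `onboarding-dry-run` subcommand and builds the orchestrator command line only when the script is present. The script path comes from T01; the `--subject`, `--email`, and `--display` options mirror the SUBJECT/EMAIL/DISPLAY make variables, but how the real console and orchestrator accept them is an assumption, as are the function names.

```python
import argparse
import os

ORCHESTRATOR = "sso-mfa/k8s/lldap/dry-run-nonroot-user.sh"


def build_parser():
    """Build a parser with an onboarding-dry-run subcommand."""
    parser = argparse.ArgumentParser(prog="security-bootstrap-console")
    commands = parser.add_subparsers(dest="command", required=True)
    dry_run = commands.add_parser(
        "onboarding-dry-run",
        help="print dry-run guidance and locate the orchestrator",
    )
    dry_run.add_argument("--subject", required=True, help="test username")
    dry_run.add_argument("--email", help="test user email")
    dry_run.add_argument("--display", help="test user display name")
    return parser


def orchestrator_argv(args, script_exists=os.path.isfile):
    """Return the orchestrator command line, or None if the script is absent."""
    if not script_exists(ORCHESTRATOR):
        return None
    argv = [ORCHESTRATOR, args.subject]
    # Assumption: the orchestrator accepts --email/--display; its real CLI may differ.
    if args.email:
        argv += ["--email", args.email]
    if args.display:
        argv += ["--display", args.display]
    return argv
```

For example, `build_parser().parse_args(["onboarding-dry-run", "--subject", "dryrun-alice"])` yields a namespace that `orchestrator_argv` turns into a command list, which the console could print as guidance or hand to `subprocess.run`.
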
### T04 - Add Test User Cleanup Helper And Repeatable Dry-Run Support

```task
id: NET-WP-0019-T04
status: done
priority: medium
state_hub_task_id: "e0053d13-bc7a-41e8-900b-4a18a76e19d0"
```

Add a helper (script + console command + make target) for cleaning up after dry-runs:

- `lifecycle-cleanup-dryrun-users [PATTERN]`, which queries LLDAP for matching users, shows a preview, removes them from groups, deletes them, and records a non-secret audit entry.
- Integrate with the orchestrator (e.g. a --cleanup flag).
- Update the T06 section of the guide and the orchestrator docs.
- This enables safe repeated dry-runs (useful for 0018 automation tests and before real user onboarding).

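
The selection and preview steps of such a helper could be sketched as below. The default patterns `t06-*` and `dryrun-*` are the examples given in T03; the function names and the guard against wildcard-only patterns are illustrative assumptions, and the LLDAP GraphQL query that would supply `user_ids` is left out.

```python
from fnmatch import fnmatchcase

DEFAULT_PATTERNS = ("t06-*", "dryrun-*")


def select_dry_run_users(user_ids, patterns=DEFAULT_PATTERNS):
    """Return the sorted user ids that match at least one dry-run pattern."""
    for pattern in patterns:
        # Refuse patterns made only of wildcards, which would match real users.
        if not pattern.rstrip("*?"):
            raise ValueError(f"pattern too broad: {pattern!r}")
    return sorted(
        uid for uid in user_ids
        if any(fnmatchcase(uid, pattern) for pattern in patterns)
    )


def preview_lines(selected):
    """Describe what cleanup would do, without performing any change."""
    return [f"would remove from groups and delete: {uid}" for uid in selected]
```

Printing `preview_lines(select_dry_run_users(user_ids))` before any mutation keeps the destructive step behind an explicit preview, in line with the acceptance criteria.
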
**2026-06-03 implementation:** Enhanced dry-run-nonroot-user.sh with real --cleanup-only support (GraphQL query, removal from group, delete). Wired the `lifecycle-cleanup-dryrun-users` CLI command into the console (with --pattern) and added `make security-bootstrap-lifecycle-cleanup-dryrun-users PATTERN=...`. The orchestrator itself now supports repeatable, safe dry-runs. Updated the T06 section of the lifecycle-guide to reference the cleanup step.

### T05 - Better OIDC Claims And Verification Hooks For Dry-Runs

```task
id: NET-WP-0019-T05
status: done
priority: low
state_hub_task_id: "33f88f24-98bd-4a4d-b70e-f5811816f196"
```

Provide a non-secret way to exercise and verify the actual KeyCape OIDC claims for a dry-run subject (beyond inferring them from LLDAP groups + client verify):

- Add a helper in the orchestrator, or a new console action, that can obtain a short-lived token for the test user (if possible without a browser) or at least dump the expected claims structure.
- Document in the guide how the claims will look for the "user" vs "tenant-admin" actor classes.
- If full token issuance for a test user is too involved, add a static example plus validation that the LLDAP group membership would produce the correct bound_claims in OpenBao/KeyCape.
- Ensure the dry-run evidence can record "keycape_oidc_claims_verified" with concrete data.

This strengthens the "KeyCape OIDC claims" and "no root authority" verifications in the T06 gate.

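
A static check of this kind, which derives the expected claims from group membership and confirms that the platform-admin binding would not match, might be sketched as follows. This is not a live token check. The claim names (`preferred_username`, `groups`), the group name in `PLATFORM_ADMIN_GROUP`, and both function names are assumptions; only `net-kingdom-users` and the evidence field `keycape_oidc_claims_verified` appear in this workplan.

```python
# Assumption: hypothetical name of the group bound to platform-admin authority.
PLATFORM_ADMIN_GROUP = "net-kingdom-platform-admins"


def expected_claims(username, groups):
    """Build the claims a token is expected to carry for this user."""
    return {"preferred_username": username, "groups": sorted(groups)}


def verify_no_root_authority(claims, bound_group=PLATFORM_ADMIN_GROUP):
    """Return an evidence fragment stating whether the admin binding would match."""
    has_root = bound_group in claims["groups"]
    return {
        "keycape_oidc_claims_verified": not has_root,
        "expected_claims": claims,
        "platform_admin_binding_matches": has_root,
    }
```

The returned dictionary contains only group names and booleans, so it can be merged into the dry-run evidence without exposing any secret.
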
**2026-06-03 implementation:** Added print_dry_run_oidc_claims_verification() to the console (called from the 'onboarding-dry-run-claims' subcommand and from the orchestrator script after the verifications). It dumps the expected claims derived from groups (no secrets) and checks them against the platform-admin binding. It is integrated into the dry-run script, so the orchestrator calls it during runs. Updated the guide section. (Full live token claims would require a browserless OIDC test flow, left as future work if needed.)

### T06 - Expose Dry-Run In Web UI And Cross-Link To 0018

```task
id: NET-WP-0019-T06
status: done
priority: low
state_hub_task_id: "aa8ddc00-e77e-4153-aaba-c4e464d4d1a4"
```

In the web-ui portion of security_bootstrap_console.py:

- Add "dry-run" related records to the appropriate payloads (e.g. the lifecycle or runbooks section).
- Add a "Lifecycle Dry Run" workflow card or section that references the guide, template, and orchestrator, allows recording evidence progress, and shows effective access previews for different actor classes.
- Keep it conservative (no secret input).

Update the 0018 workplan notes (or this workplan's coordination notes) to explicitly call out that the dry-run tooling and validations should be referenced from 0018's "Align The Control Surface...", "Add Automated Tests...", and "Integrate Validations..." tasks.

Add any simple tests (e.g. the template produces valid JSON, validate-dry-run accepts the skeleton).

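
A pair of simple tests in that spirit could look like the sketch below. The real template and validate-dry-run define their own schema; the two fields in `REQUIRED_FIELDS` and the helper names here are hypothetical stand-ins used only to show the shape of a "template output parses and validates" test.

```python
import json

# Assumption: a minimal set of evidence fields; the real template defines its own.
REQUIRED_FIELDS = ("subject", "keycape_oidc_claims_verified")


def evidence_skeleton(subject):
    """Return an unpopulated evidence record for one dry-run subject."""
    return {"subject": subject, "keycape_oidc_claims_verified": None}


def missing_fields(evidence_text):
    """Parse evidence JSON and list the required fields that are absent."""
    data = json.loads(evidence_text)
    if not isinstance(data, dict):
        raise ValueError("evidence must be a JSON object")
    return [name for name in REQUIRED_FIELDS if name not in data]


def test_skeleton_round_trips_and_validates():
    text = json.dumps(evidence_skeleton("dryrun-alice"), indent=2)
    assert json.loads(text)["subject"] == "dryrun-alice"
    assert missing_fields(text) == []


def test_incomplete_evidence_is_reported():
    text = '{"subject": "dryrun-alice"}'
    assert missing_fields(text) == ["keycape_oidc_claims_verified"]
```

Both functions follow the pytest naming convention, so a test runner would collect them without extra configuration.
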
**2026-06-03 implementation:** Added a "User lifecycle dry-run (T06)" record to runbook_payloads() (it appears in the runbooks section of the web-ui and in status). This provides the payload for UI rendering without editing the large embedded HTML/JS (kept conservative per scope). Updated NET-WP-0018 T07 to explicitly reference the 0019 dry-run tooling/tests as the cross-link. The CLI exposure was already done in T03. A full interactive card in the web-ui HTML can be a follow-up if more UI work is needed.

## Acceptance Criteria

- A full non-root dry-run (onboard + verify LLDAP/groups/MFA/KeyCape/no-root + lock + offboard + evidence + cleanup) can be performed with minimal manual steps and no persistent plaintext secrets.
- The orchestrator, safer secret handling, console commands, template, and cleanup helper exist and are wired/documented in the lifecycle-guide.
- `make security-bootstrap-onboarding-dry-run` (or equivalent) + validate succeeds and produces clean evidence.
- The web-ui (if extended) and CLI status surface the dry-run capabilities.
- Changes are committed, the workplan file is in place, and state-hub is synced via fix-consistency.
- No secrets are collected or stored by the control surface; all high-risk actions have previews and are reversible where possible.
- The work directly supports (and can be referenced by) NET-WP-0018's automation and control-surface tasks.

## Notes

- Builds on prior polish work that added `onboarding-dry-run-template`, the T06 section to the lifecycle guide, and the template wiring.
- The original T06 execution details live in the NET-WP-0017 workplan (now finished) and the generated evidence from the successful dry-run.
- Prefer using the existing .local/ inventory scripts and k8s/ helpers as building blocks.
- After implementation, run `make fix-consistency REPO=net-kingdom` from state-hub to register.