---
id: NET-WP-0019
type: workplan
title: "T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements"
domain: netkingdom
repo: net-kingdom
status: finished
owner: codex
topic_slug: netkingdom
created: "2026-06-03"
updated: "2026-06-05"
depends_on:
  - NET-WP-0017
  - NET-WP-0018
state_hub_workstream_id: "75d388b6-7ec1-4e1b-8c87-6ff44f953210"
related:
  - docs/user-engine-netkingdom-integration-assessment.md (broader user-engine vs net-kingdom fit, gaps, and recommendations)
---

# NET-WP-0019 - T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements

## Goal

Polish and automate the non-root user lifecycle dry-run experience (the T06 gate from NET-WP-0017) to make it repeatable, safe, console-driven, and aligned with the bootstrap automation goals of NET-WP-0018. Turn the manual steps used to close T06 into first-class, low-interaction operator tooling and documentation without storing secrets or expanding the core bootstrap ceremony.

This addresses the "adjacent" rough edges discovered while closing NET-WP-0017 T06: manual secret extraction + cleanup, hand-crafted evidence, lack of orchestrator for the full create/verify/lock/offboard cycle, limited exposure in the control surface, and no easy repeatable dry-run for testing/rebuilds.

## Strategy

Build directly on the T05 lifecycle flow (lifecycle-guide + templates) and the T06 dry-run execution that proved it:

- Add a safe, self-contained dry-run orchestrator script that can be invoked from console or make.
- Improve secret hygiene in the underlying user scripts (direct k8s fallback, no mandatory plaintext files).
- Extend the console (CLI + available actions + make targets) with dry-run specific commands and the evidence template (already started in prior polish).
- Add a cleanup helper for test users.
- Expose more in web-ui where easy.
- Provide better OIDC claims verification hooks for dry-runs.
- Document the repeatable process and tie it explicitly to 0018's control surface / runbook / validation tasks.

Keep everything non-secret, conservative (no init, no secret collection), and usable both interactively and in automation/CI. Prefer extending existing patterns (the security-bootstrap-console.py templates/guides, the k8s/ scripts, the inventory helpers in .local) rather than adding large new components.

## Tasks

### T01 - Add Dedicated Dry-Run Orchestrator Script

```task
id: NET-WP-0019-T01
status: done
priority: high
state_hub_task_id: "03e03868-a07d-478c-9808-f9decaeab2e8"
```

Create `sso-mfa/k8s/lldap/dry-run-nonroot-user.sh` (or equivalent in tools/) that:

- Takes username, email, display name, and optional actor/scope flags.
- Safely extracts the LLDAP admin password from the k8s secret into a /tmp file with strict permissions and trap cleanup (never touches the git-ignored persistent secrets/ tree unless explicitly allowed).
- Runs create-user.sh --test (or equivalent) for non-root (enforces no --admin for normal users).
- Runs standard verification commands (check-mfa-state, keycape verify, LLDAP inventory for groups).
- Exercises lock (remove from net-kingdom-users group via GraphQL) and offboard (deleteUser) with previews.
- Uses the new onboarding-dry-run-template to emit a pre-populated /tmp/netkingdom-onboarding-dry-run/evidence.json with actual data from queries/outputs.
- Cleans up temp artifacts and optionally removes the test user at the end unless --keep is given.
- Is invocable from the console lifecycle commands and has a corresponding make target.

Done when the script exists, is executable, is documented in the lifecycle-guide, and a full dry-run can be performed with one or two commands producing valid evidence.

**Prior notes from T06 closure:** The exact manual sequence (temp secrets, create, GraphQL lock/offboard, evidence) is captured in the NET-WP-0017 T06 workplan note and the T06 section of the lifecycle-guide. This task automates that sequence.
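
The "strict permissions plus guaranteed cleanup" idea behind the orchestrator's temp handling can be illustrated with a minimal Python sketch. The orchestrator itself is a shell script that uses `trap`; the function names, the workspace prefix, and the file name below are hypothetical and only show the pattern, not the script's actual code.

```python
import os
import shutil
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def private_workspace(prefix="netkingdom-dry-run-"):
    """Yield a private temp directory and remove it on exit, even after errors."""
    # mkdtemp creates the directory readable/writable only by the current user.
    path = tempfile.mkdtemp(prefix=prefix)
    try:
        yield path
    finally:
        # Plays the role of the shell `trap ... EXIT` cleanup.
        shutil.rmtree(path, ignore_errors=True)


def write_secret_file(workspace, name, value):
    """Write a secret to a file that is created with mode 0600 from the start."""
    target = os.path.join(workspace, name)
    # O_EXCL refuses to reuse an existing file; the mode is applied at creation,
    # so the content is never briefly world-readable.
    fd = os.open(target, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(value)
    return target


if __name__ == "__main__":
    with private_workspace() as workspace:
        secret_path = write_secret_file(workspace, "lldap-admin-pass", "example-only")
        mode = stat.S_IMODE(os.stat(secret_path).st_mode)
        print(f"{secret_path} mode={oct(mode)}")
    print("workspace removed:", not os.path.exists(workspace))
```

Creating the file with its final mode, rather than writing first and running `chmod` afterwards, avoids a window in which another local user could read the secret.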

**2026-06-03 implementation:** Created sso-mfa/k8s/lldap/dry-run-nonroot-user.sh (executable). It uses a /tmp workspace + trap, extracts the k8s secret safely, runs create-user via a temp secrets dir, performs the verifications, runs lock/offboard via GraphQL, calls the python template to emit a populated evidence.json, and cleans up. It follows the same patterns as netkingdom-lifecycle-inventory.sh. Ready for testing.

### T02 - Safer Secret Handling In User Lifecycle Scripts

```task
id: NET-WP-0019-T02
status: done
priority: high
state_hub_task_id: "564631a6-9b28-4e23-a852-5d85ade94a76"
```

Update `sso-mfa/k8s/lldap/create-user.sh` (and related scripts like break-glass.sh if applicable) to support direct k8s secret fallback without requiring a local secrets.env file on disk:

- Make LLDAP_ADMIN_PASS overridable via env var.
- If there is no local LLDAP_ENV and KUBECTL is available, extract the password from the in-cluster secret (sso/lldap-secrets) using the same pattern as netkingdom-lifecycle-inventory.sh.
- Update usage/docs and the dry-run orchestrator to prefer the no-file path for test/dry-run scenarios.
- Ensure the password-set port-forward + ldap3 path still works.
- Add a --from-k8s or similar flag if needed for explicitness.
- Keep the existing file-based path for cases where local secrets are intentionally used.

This eliminates the "create temp secrets.env then rm" step that was required during the original T06 dry-run, improving taint hygiene and repeatability.

**2026-06-03 implementation:** Updated create-user.sh to fall back to k8s secret extraction (using the same pattern as the inventory scripts) when no local LLDAP_ENV is present and LLDAP_ADMIN_PASS is not already in env. The dry-run orchestrator uses the temp /tmp path + the new fallback. Updated usage comments and error messages. The safer path is now preferred for automation/dry-runs.

Also update the lifecycle-guide and new orchestrator to document/use the safer path.
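
The resolution order described for T02 (environment variable first, then a local env file, then the in-cluster secret) can be sketched in Python as follows. This is an illustration under stated assumptions, not the logic of create-user.sh itself: the function names are hypothetical, the env-file key is assumed to be `LLDAP_ADMIN_PASS`, and the key inside the `sso/lldap-secrets` secret is not named in this workplan, so it is passed in by the caller.

```python
import base64
import os
import subprocess


def read_env_file(path, key):
    """Return the value of KEY=value from a simple env file, or None."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == key:
                return value.strip().strip('"').strip("'")
    return None


def kubectl_secret_value(namespace, secret, key, run=subprocess.run):
    """Read one key of a Kubernetes secret via kubectl and base64-decode it."""
    cmd = [
        "kubectl", "get", "secret", secret, "-n", namespace,
        "-o", f"jsonpath={{.data.{key}}}",
    ]
    result = run(cmd, check=True, capture_output=True, text=True)
    return base64.b64decode(result.stdout).decode("utf-8")


def resolve_admin_pass(env=None, env_file=None, fetch_from_cluster=None):
    """Return (password, source) using env -> file -> k8s precedence."""
    env = os.environ if env is None else env
    if env.get("LLDAP_ADMIN_PASS"):
        return env["LLDAP_ADMIN_PASS"], "env"
    if env_file and os.path.isfile(env_file):
        value = read_env_file(env_file, "LLDAP_ADMIN_PASS")
        if value:
            return value, "file"
    if fetch_from_cluster is not None:
        # Only reached when neither the env var nor a local file is available.
        return fetch_from_cluster(), "k8s"
    raise RuntimeError("no LLDAP admin password source available")
```

A caller could pass `fetch_from_cluster=lambda: kubectl_secret_value("sso", "lldap-secrets", key)` with whatever key the deployment uses. Returning the source alongside the value lets the caller log which path was taken without logging the secret.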
### T03 - Console And Make Integration For Dry-Run

```task
id: NET-WP-0019-T03
status: done
priority: medium
state_hub_task_id: "7a264b8a-1b71-4a3e-835b-3c27676d28ef"
```

Extend the security-bootstrap-console:

- Add `print_onboarding_dry_run_guide()` (or extend the existing lifecycle one) and a `lifecycle-dry-run` or `onboarding-dry-run` CLI subcommand that prints the full guided sequence + invokes the orchestrator script if present.
- Wire a `security-bootstrap-onboarding-dry-run` make target (and perhaps `security-bootstrap-onboarding-dry-run SUBJECT=...`) that runs the orchestrator + validate.
- Ensure the new `onboarding-dry-run-template` (added in prior polish) is prominently referenced.
- Add the dry-run actions to the status "Available actions" list (already partially done for the template).
- Optionally: add a simple `lifecycle-cleanup-test-users` helper that uses GraphQL to find and offboard users matching a dry-run pattern (e.g. t06-*, dryrun-*).

**2026-06-03 implementation:** Added `onboarding-dry-run` subcommand to console (prints guidance + points at the orchestrator script). Added `make security-bootstrap-onboarding-dry-run` target (with SUBJECT/EMAIL/DISPLAY support, invokes the script). Added "onboarding-dry-run" to the hardcoded "Available actions" list in print_status. The template was already wired previously. (T04 cleanup helper and full web-ui card left as follow-up.)

Update the status print and any relevant payloads.

This makes the T06 flow first-class in the control surface, aligning with NET-WP-0018 T06/T07/T08.

### T04 - Add Test User Cleanup Helper And Repeatable Dry-Run Support

```task
id: NET-WP-0019-T04
status: done
priority: medium
state_hub_task_id: "e0053d13-bc7a-41e8-900b-4a18a76e19d0"
```

Add a helper (script + console command + make target) for cleaning up after dry-runs:

- `lifecycle-cleanup-dryrun-users [PATTERN]` that queries LLDAP for matching users, shows preview, removes from groups, deletes users, records non-secret audit.
- Integrate with the orchestrator (e.g. --cleanup flag).
- Update the T06 section of the guide and the orchestrator docs.
- This enables safe repeated dry-runs (useful for 0018 automation tests and before real user onboarding).

**2026-06-03 implementation:** Enhanced dry-run-nonroot-user.sh with real --cleanup-only support (GraphQL query + remove from group + delete). Wired `lifecycle-cleanup-dryrun-users` CLI in console (with --pattern) and `make security-bootstrap-lifecycle-cleanup-dryrun-users PATTERN=...`. The orchestrator itself now supports repeatable safe dry-runs. Updated T06 section of lifecycle-guide to reference the cleanup step.

### T05 - Better OIDC Claims And Verification Hooks For Dry-Runs

```task
id: NET-WP-0019-T05
status: done
priority: low
state_hub_task_id: "33f88f24-98bd-4a4d-b70e-f5811816f196"
```

Provide a non-secret way to exercise/verify actual KeyCape OIDC claims for a dry-run subject (beyond inferring from LLDAP groups + client verify):

- Add a helper in the orchestrator or a new console action that can obtain a short-lived token for the test user (if possible without browser) or at least dump the expected claims structure.
- Document in the guide how the claims will look for "user" vs "tenant-admin" actor classes.
- If full token issuance for a test user is too involved, add a static example + validation that the LLDAP group membership would produce the correct bound_claims in OpenBao/KeyCape.
- Ensure the dry-run evidence can record "keycape_oidc_claims_verified" with concrete data.

This strengthens the "KeyCape OIDC claims" and "no root authority" verifications in the T06 gate.

**2026-06-03 implementation:** Added print_dry_run_oidc_claims_verification() to console (called from the 'onboarding-dry-run-claims' subcommand and from the orchestrator script after the verification steps). It dumps the expected claims derived from groups (no secrets) and checks them against the platform-admin binding. Integrated into the dry-run script, so the orchestrator now calls it during runs.
Updated guide section. (Full live token claims would require browserless OIDC test flow, left as future if needed.)

### T06 - Expose Dry-Run In Web UI And Cross-Link To 0018

```task
id: NET-WP-0019-T06
status: done
priority: low
state_hub_task_id: "aa8ddc00-e77e-4153-aaba-c4e464d4d1a4"
```

In the web-ui portion of security_bootstrap_console.py:

- Add "dry-run" related records to the appropriate payloads (e.g. lifecycle or runbooks section).
- Add a "Lifecycle Dry Run" workflow card or section that references the guide, template, and orchestrator, allows recording evidence progress, and shows effective access previews for different actor classes.
- Keep it conservative (no secret input).

Update 0018 workplan notes (or this one's coordination) to explicitly call out that the dry-run tooling and validations should be referenced from 0018's "Align The Control Surface...", "Add Automated Tests...", and "Integrate Validations..." tasks.

Add any simple tests (e.g. template produces valid JSON, validate-dry-run accepts the skeleton).

**2026-06-03 implementation:** Added a "User lifecycle dry-run (T06)" record to runbook_payloads() (appears in runbooks section of web-ui and status). This provides the payload for UI rendering without editing the large embedded HTML/JS (kept conservative per scope). Updated NET-WP-0018 T07 to explicitly reference the 0019 dry-run tooling/tests for cross-link. The CLI exposure was already done in T03. Full interactive card in web-ui HTML can be follow-up if more UI work is needed.

## Acceptance Criteria

- A full non-root dry-run (onboard + verify LLDAP/groups/MFA/KeyCape/no-root + lock + offboard + evidence + cleanup) can be performed with minimal manual steps and no persistent plaintext secrets.
- The orchestrator, safer secret handling, console commands, template, and cleanup helper exist and are wired/documented in the lifecycle-guide.
- `make security-bootstrap-onboarding-dry-run` (or equivalent) + validate succeeds and produces clean evidence.
- The web-ui (if extended) and CLI status surface the dry-run capabilities.
- Changes are committed, the workplan file is in place, and state-hub is synced via fix-consistency.
- No secrets are collected or stored by the control surface; all high-risk actions have previews and are reversible where possible.
- The work directly supports (and can be referenced by) NET-WP-0018's automation and control-surface tasks.

## Notes

- Builds on prior polish work that added `onboarding-dry-run-template`, the T06 section to the lifecycle guide, and the template wiring.
- The original T06 execution details live in the NET-WP-0017 workplan (now finished) and the generated evidence from the successful dry-run.
- Prefer using the existing .local/ inventory scripts and k8s/ helpers as building blocks.
- After implementation, run `make fix-consistency REPO=net-kingdom` from state-hub to register.
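
The "clean evidence" and "no secrets stored" acceptance criteria could be spot-checked with something like the following Python sketch. It is not the repository's validator: the function names and the deny-list of secret-like key names are hypothetical, and the assumption that `keycape_oidc_claims_verified` sits at the top level of evidence.json is made only for illustration.

```python
import json

# Hypothetical deny-list; the real validator's rules are not shown in this workplan.
SECRET_HINTS = ("password", "passwd", "secret", "token", "private_key")


def find_secret_like_keys(node, path=""):
    """Return dotted paths of keys whose names look like they hold secrets."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else str(key)
            if any(hint in str(key).lower() for hint in SECRET_HINTS):
                hits.append(here)
            hits.extend(find_secret_like_keys(value, here))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_secret_like_keys(item, f"{path}[{index}]"))
    return hits


def check_evidence(text):
    """Parse evidence JSON and return a list of problems (empty means clean)."""
    try:
        evidence = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(evidence, dict):
        return ["top level must be a JSON object"]
    problems = [f"secret-like key: {p}" for p in find_secret_like_keys(evidence)]
    # Assumed location of the field named in T05.
    if "keycape_oidc_claims_verified" not in evidence:
        problems.append("missing keycape_oidc_claims_verified")
    return problems


if __name__ == "__main__":
    sample = '{"subject": "dryrun-example", "keycape_oidc_claims_verified": true}'
    print(check_evidence(sample) or "evidence looks clean")
```

A key-name check like this catches only obvious mistakes; it complements, rather than replaces, keeping secrets out of the evidence path in the first place.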