net-kingdom/workplans/NET-WP-0019-t06-adjacent-user-lifecycle-dry-run-polish.md

id: NET-WP-0019
type: workplan
title: "T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements"
domain: netkingdom
repo: net-kingdom
status: ready
owner: codex
topic_slug: netkingdom
created: 2026-06-03
updated: 2026-06-03
depends_on: NET-WP-0017, NET-WP-0018
state_hub_workstream_id: "75d388b6-7ec1-4e1b-8c87-6ff44f953210"
related: docs/user-engine-netkingdom-integration-assessment.md (broader user-engine vs net-kingdom fit, gaps, and recommendations)

NET-WP-0019 - T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements

Goal

Polish and automate the non-root user lifecycle dry-run experience (the T06 gate from NET-WP-0017) to make it repeatable, safe, console-driven, and aligned with the bootstrap automation goals of NET-WP-0018. Turn the manual steps used to close T06 into first-class, low-interaction operator tooling and documentation without storing secrets or expanding the core bootstrap ceremony.

This addresses the "adjacent" rough edges discovered while closing NET-WP-0017 T06: manual secret extraction and cleanup, hand-crafted evidence, no orchestrator for the full create/verify/lock/offboard cycle, limited exposure in the control surface, and no easily repeatable dry-run for testing and rebuilds.

Strategy

Build directly on the T05 lifecycle flow (lifecycle-guide + templates) and the T06 dry-run execution that proved it:

  • Add a safe, self-contained dry-run orchestrator script that can be invoked from console or make.
  • Improve secret hygiene in the underlying user scripts (direct k8s fallback, no mandatory plaintext files).
  • Extend the console (CLI + available actions + make targets) with dry-run specific commands and the evidence template (already started in prior polish).
  • Add a cleanup helper for test users.
  • Expose more in web-ui where easy.
  • Provide better OIDC claims verification hooks for dry-runs.
  • Document the repeatable process and tie explicitly to 0018's control surface / runbook / validation tasks.

Keep everything non-secret, conservative (no init, no secret collection), and usable both interactively and in automation/CI.

Prefer extending existing patterns (the security-bootstrap-console.py templates/guides, the k8s/ scripts, the inventory helpers in .local) rather than new big components.

Tasks

T01 - Add Dedicated Dry-Run Orchestrator Script

id: NET-WP-0019-T01
status: done
priority: high
state_hub_task_id: "03e03868-a07d-478c-9808-f9decaeab2e8"

Create sso-mfa/k8s/lldap/dry-run-nonroot-user.sh (or equivalent in tools/) that:

  • Takes username, email, display, optional actor/scope flags.
  • Safely extracts the LLDAP admin password from the k8s secret into a /tmp file with strict permissions and trap cleanup (never touches the git-ignored persistent secrets/ tree unless explicitly allowed).
  • Runs create-user.sh --test (or equivalent) for non-root (enforces no --admin for normal users).
  • Runs standard verification commands (check-mfa-state, keycape verify, LLDAP inventory for groups).
  • Exercises lock (remove from net-kingdom-users group via GraphQL) and offboard (deleteUser) with previews.
  • Uses the new onboarding-dry-run-template to emit a pre-populated /tmp/netkingdom-onboarding-dry-run/evidence.json with actual data from queries/outputs.
  • Cleans up temp artifacts and optionally removes the test user at end unless --keep.
  • Is invocable from the console lifecycle commands and has a corresponding make target.
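
The temp-file handling described in the list above can be sketched in Python. This is a minimal illustration under stated assumptions, not the orchestrator itself (which is a shell script using trap). The helper names `dry_run_workspace` and `write_secret_file` are hypothetical.

```python
import os
import shutil
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def dry_run_workspace(prefix="netkingdom-dry-run-"):
    """Yield a private temp directory and remove it on exit, even on error.

    Hypothetical helper: the Python analogue of `mktemp -d` plus a shell trap.
    """
    # mkdtemp creates the directory accessible only by the creating user.
    workdir = tempfile.mkdtemp(prefix=prefix)
    try:
        yield workdir
    finally:
        shutil.rmtree(workdir, ignore_errors=True)


def write_secret_file(workdir, name, value):
    """Write a secret to a new file with owner-only permissions."""
    path = os.path.join(workdir, name)
    # O_EXCL refuses to reuse an existing file; the 0o600 mode is applied
    # at creation, so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(value)
    return path
```

A caller would wrap the whole dry-run in `with dry_run_workspace() as workdir:` so the secret file disappears whether the run succeeds or fails, which mirrors the trap-based cleanup the task asks for.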

Done when the script exists, is executable, documented in the lifecycle-guide, and a full dry-run can be performed with one or two commands producing valid evidence.

Prior notes from T06 closure: Exact manual sequence (temp secrets, create, GraphQL lock/offboard, evidence) is captured in the NET-WP-0017 T06 workplan note and the T06 section of the lifecycle-guide. This task automates that sequence.

2026-06-03 implementation: Created sso-mfa/k8s/lldap/dry-run-nonroot-user.sh (executable). It uses a /tmp workspace with trap cleanup, extracts the k8s secret safely, runs create-user via a temp secrets dir, performs the verifications, locks and offboards via GraphQL, calls the Python template to emit a populated evidence.json, and cleans up. It reuses the same patterns as netkingdom-lifecycle-inventory.sh. Ready for testing.

T02 - Safer Secret Handling In User Lifecycle Scripts

id: NET-WP-0019-T02
status: done
priority: high
state_hub_task_id: "564631a6-9b28-4e23-a852-5d85ade94a76"

Update sso-mfa/k8s/lldap/create-user.sh (and related scripts like break-glass.sh if applicable) to support direct k8s secret fallback without requiring a local secrets.env file on disk:

  • Make LLDAP_ADMIN_PASS overridable via env var.
  • If no local LLDAP_ENV is present and KUBECTL is available, extract the password from the in-cluster secret (sso/lldap-secrets) using the same pattern as netkingdom-lifecycle-inventory.sh.
  • Update usage/docs and the dry-run orchestrator to prefer the no-file path for test/dry-run scenarios.
  • Ensure the password-set port-forward + ldap3 path still works.
  • Add a --from-k8s or similar flag if needed for explicitness.
  • Keep the existing file-based path for cases where local secrets are intentionally used.
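
The resolution order implied by the list above (environment variable, then local file, then in-cluster secret) might look like the following Python sketch. It is illustrative only: the real logic lives in create-user.sh, the precedence between the env var and the file is an assumption, and the env-file key and secret data key passed to `kubectl_secret` are hypothetical. Only `LLDAP_ADMIN_PASS`, the `sso` namespace, and the `lldap-secrets` name come from this workplan.

```python
import base64
import os
import subprocess


def read_env_file(path, key):
    """Return the value of a KEY=value line from a simple env file, or None."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(key + "="):
                return line.split("=", 1)[1].strip().strip('"').strip("'")
    return None


def kubectl_secret(namespace, name, key, runner=subprocess.run):
    """Read one data key of a Kubernetes secret and base64-decode it."""
    result = runner(
        ["kubectl", "get", "secret", name, "-n", namespace,
         "-o", f"jsonpath={{.data.{key}}}"],
        check=True, capture_output=True, text=True,
    )
    return base64.b64decode(result.stdout).decode("utf-8")


def resolve_admin_password(env, env_file, fetch_from_cluster):
    """Return (password, source) without writing anything to disk.

    Assumed precedence: explicit env var, then local file, then cluster.
    """
    if env.get("LLDAP_ADMIN_PASS"):
        return env["LLDAP_ADMIN_PASS"], "env"
    if env_file and os.path.isfile(env_file):
        value = read_env_file(env_file, "LLDAP_ADMIN_PASS")
        if value:
            return value, "file"
    if fetch_from_cluster is not None:
        return fetch_from_cluster(), "k8s"
    raise RuntimeError("no LLDAP admin password source available")
```

Passing the cluster lookup as a callable keeps the function testable without a cluster, and returning the source alongside the value lets a caller log which path was used without logging the secret.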

This eliminates the "create temp secrets.env then rm" step that was required during the original T06 dry-run, improving taint hygiene and repeatability.

2026-06-03 implementation: Updated create-user.sh to fall back to k8s secret extraction (using the same pattern as the inventory scripts) when no local LLDAP_ENV is present and LLDAP_ADMIN_PASS is not already set in the environment. The dry-run orchestrator uses the temp /tmp path plus the new fallback. Updated usage comments and error messages. The safer path is now preferred for automation and dry-runs.

Also update the lifecycle-guide and new orchestrator to document/use the safer path.

T03 - Console And Make Integration For Dry-Run

id: NET-WP-0019-T03
status: done
priority: medium
state_hub_task_id: "7a264b8a-1b71-4a3e-835b-3c27676d28ef"

Extend the security-bootstrap-console:

  • Add print_onboarding_dry_run_guide() (or extend the existing lifecycle one) and a lifecycle-dry-run or onboarding-dry-run CLI subcommand that prints the full guided sequence + invokes the orchestrator script if present.
  • Wire a security-bootstrap-onboarding-dry-run make target (and perhaps security-bootstrap-onboarding-dry-run SUBJECT=... ) that runs the orchestrator + validate.
  • Ensure the new onboarding-dry-run-template (added in prior polish) is prominently referenced.
  • Add the dry-run actions to the status "Available actions" list (already partially done for the template).
  • Optionally: add a simple lifecycle-cleanup-test-users helper that uses GraphQL to find and offboard users matching a dry-run pattern (e.g. t06-, dryrun-).
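
As a rough illustration of the subcommand wiring listed above, the sketch below uses argparse to print a guided sequence and invoke the orchestrator only when it is present and executable. The `--subject` and `--run` flags, the guide wording, and the positional argument passed to the script are assumptions. Only the subcommand name and the script path come from this workplan.

```python
import argparse
import os
import subprocess

# Path taken from this workplan; the argument convention below is assumed.
ORCHESTRATOR = "sso-mfa/k8s/lldap/dry-run-nonroot-user.sh"

# Illustrative wording; the real guide text lives in the lifecycle-guide.
GUIDE_STEPS = (
    "Create the non-root test user",
    "Verify LLDAP groups, MFA state, and KeyCape",
    "Lock the user (remove from net-kingdom-users)",
    "Offboard the user (deleteUser)",
    "Emit and validate evidence.json",
)


def build_parser():
    parser = argparse.ArgumentParser(prog="security-bootstrap-console")
    commands = parser.add_subparsers(dest="command", required=True)
    dry_run = commands.add_parser(
        "onboarding-dry-run", help="print the guided dry-run sequence")
    dry_run.add_argument("--subject", help="test username (hypothetical flag)")
    dry_run.add_argument("--run", action="store_true",
                         help="invoke the orchestrator (hypothetical flag)")
    return parser


def orchestrator_argv(args, script=ORCHESTRATOR):
    """Return the command to run, or None when only the guide should print."""
    if not (args.run and args.subject):
        return None
    if not (os.path.isfile(script) and os.access(script, os.X_OK)):
        return None
    return [script, args.subject]


def main(argv=None):
    args = build_parser().parse_args(argv)
    for number, step in enumerate(GUIDE_STEPS, start=1):
        print(f"{number}. {step}")
    command = orchestrator_argv(args)
    if command is None:
        print("Orchestrator not invoked (guide only).")
        return 0
    return subprocess.run(command, check=False).returncode
```

Calling `main(["onboarding-dry-run"])` prints the steps and returns 0 without touching the cluster, which keeps the default behaviour conservative.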

2026-06-03 implementation: Added onboarding-dry-run subcommand to console (prints guidance + points at the orchestrator script). Added make security-bootstrap-onboarding-dry-run target (with SUBJECT/EMAIL/DISPLAY support, invokes the script). Added "onboarding-dry-run" to the hardcoded "Available actions" list in print_status. The template was already wired previously. (T04 cleanup helper and full web-ui card left as follow-up.)

Update the status print and any relevant payloads.

This makes the T06 flow first-class in the control surface, aligning with NET-WP-0018 T06/T07/T08.

T04 - Add Test User Cleanup Helper And Repeatable Dry-Run Support

id: NET-WP-0019-T04
status: done
priority: medium
state_hub_task_id: "e0053d13-bc7a-41e8-900b-4a18a76e19d0"

Add a helper (script + console command + make target) for cleaning up after dry-runs:

  • lifecycle-cleanup-dryrun-users [PATTERN] that queries LLDAP for matching users, shows preview, removes from groups, deletes users, records non-secret audit.
  • Integrate with the orchestrator (e.g. --cleanup flag).
  • Update the T06 section of the guide and the orchestrator docs.
  • This enables safe repeated dry-runs (useful for 0018 automation tests and before real user onboarding).
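
The preview-then-delete flow described above could be planned as data before anything is sent to LLDAP. The sketch below builds the GraphQL operations for users whose IDs match a dry-run prefix. The mutation shapes follow my understanding of LLDAP's schema (`removeUserFromGroup`, `deleteUser`, each returning `ok`) and should be checked against the deployed version. The function names and the input shape are hypothetical.

```python
# Prefixes mentioned in this workplan as examples of dry-run user patterns.
DEFAULT_PREFIXES = ("t06-", "dryrun-")

# Assumed LLDAP mutation shapes; verify against the running schema.
REMOVE_FROM_GROUP = (
    "mutation($userId: String!, $groupId: Int!) "
    "{ removeUserFromGroup(userId: $userId, groupId: $groupId) { ok } }"
)
DELETE_USER = (
    "mutation($userId: String!) { deleteUser(userId: $userId) { ok } }"
)


def plan_cleanup(users, prefixes=DEFAULT_PREFIXES):
    """Return the ordered GraphQL operations needed to remove dry-run users.

    `users` is a list like [{"id": "t06-alice", "groups": [{"id": 3}]}].
    Nothing is executed here; the plan is meant to be shown as a preview.
    """
    plan = []
    for user in users:
        user_id = user["id"]
        if not user_id.startswith(tuple(prefixes)):
            continue  # never touch users outside the dry-run pattern
        for group in user.get("groups", []):
            plan.append({
                "query": REMOVE_FROM_GROUP,
                "variables": {"userId": user_id, "groupId": group["id"]},
            })
        plan.append({"query": DELETE_USER, "variables": {"userId": user_id}})
    return plan


def format_preview(plan):
    """Render a non-secret, human-readable preview of the planned operations."""
    lines = []
    for step in plan:
        variables = step["variables"]
        if "groupId" in variables:
            lines.append(
                f"remove {variables['userId']} from group {variables['groupId']}")
        else:
            lines.append(f"delete {variables['userId']}")
    return "\n".join(lines)
```

Separating planning from execution means the same plan can feed both the preview and the audit record, and a run that matches no users produces an empty plan rather than an error.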

2026-06-03 implementation: Enhanced dry-run-nonroot-user.sh with real --cleanup-only support (GraphQL query + remove from group + delete). Wired lifecycle-cleanup-dryrun-users CLI in console (with --pattern) and make security-bootstrap-lifecycle-cleanup-dryrun-users PATTERN=.... The orchestrator itself now supports repeatable safe dry-runs. Updated T06 section of lifecycle-guide to reference the cleanup step.

T05 - Better OIDC Claims And Verification Hooks For Dry-Runs

id: NET-WP-0019-T05
status: done
priority: low
state_hub_task_id: "33f88f24-98bd-4a4d-b70e-f5811816f196"

Provide a non-secret way to exercise/verify actual KeyCape OIDC claims for a dry-run subject (beyond inferring from LLDAP groups + client verify):

  • Add a helper in the orchestrator or a new console action that can obtain a short-lived token for the test user (if possible without browser) or at least dump the expected claims structure.
  • Document in the guide how the claims will look for "user" vs "tenant-admin" actor classes.
  • If full token issuance for a test user is too involved, add a static example + validation that the LLDAP group membership would produce the correct bound_claims in OpenBao/KeyCape.
  • Ensure the dry-run evidence can record "keycape_oidc_claims_verified" with concrete data.
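
One non-secret way to approximate the check described above is to derive the expected claims from LLDAP group membership and confirm that no privileged group is present. The sketch below is an assumption-heavy illustration: the claim names (`preferred_username`, `groups`) and the privileged group name `platform-admin` are not confirmed by this workplan. Only `net-kingdom-users` and the evidence key `keycape_oidc_claims_verified` are taken from it.

```python
# Assumed name of the group whose membership would grant root-like authority.
PRIVILEGED_GROUPS = ("platform-admin",)

# Group named in this workplan as the one removed when a user is locked.
BASELINE_GROUP = "net-kingdom-users"


def expected_claims(username, groups):
    """Build the claims a token would be expected to carry (assumed shape)."""
    return {"preferred_username": username, "groups": sorted(groups)}


def claims_evidence(username, groups, privileged=PRIVILEGED_GROUPS):
    """Return a non-secret evidence fragment for the dry-run record."""
    claims = expected_claims(username, groups)
    found = sorted(set(claims["groups"]) & set(privileged))
    verified = (not found) and (BASELINE_GROUP in claims["groups"])
    return {
        "keycape_oidc_claims_verified": verified,
        "expected_claims": claims,
        "privileged_groups_found": found,
    }
```

This only validates what the group membership should produce. It does not inspect a real token, so it complements rather than replaces a live claims check.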

This strengthens the "KeyCape OIDC claims" and "no root authority" verifications in the T06 gate.

2026-06-03 implementation: Added print_dry_run_oidc_claims_verification() to console (called from the 'onboarding-dry-run-claims' subcommand and from the orchestrator script after the verifications). It dumps the expected claims derived from groups (no secrets) and checks them against the platform-admin binding. The orchestrator now calls it during runs. Updated the guide section. (Verifying full live token claims would require a browserless OIDC test flow, left as future work if needed.)

T06 - Web-UI Exposure And NET-WP-0018 Cross-Linking

id: NET-WP-0019-T06
status: done
priority: low
state_hub_task_id: "aa8ddc00-e77e-4153-aaba-c4e464d4d1a4"

In the web-ui portion of security_bootstrap_console.py:

  • Add "dry-run" related records to the appropriate payloads (e.g. lifecycle or runbooks section).
  • Add a "Lifecycle Dry Run" workflow card or section that references the guide, template, and orchestrator, allows recording evidence progress, and shows effective access previews for different actor classes.
  • Keep it conservative (no secret input).

Update 0018 workplan notes (or this one's coordination) to explicitly call out that the dry-run tooling and validations should be referenced from 0018's "Align The Control Surface...", "Add Automated Tests...", and "Integrate Validations..." tasks.

Add any simple tests (e.g. template produces valid JSON, validate-dry-run accepts the skeleton).
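
The two simple tests mentioned above might be shaped like the following sketch. The skeleton fields and both helper functions are hypothetical stand-ins for the real onboarding-dry-run-template and validate-dry-run. Only the `keycape_oidc_claims_verified` key comes from this workplan.

```python
import json

# Hypothetical required fields and their expected types.
REQUIRED_FIELDS = {
    "subject": str,
    "actor_class": str,
    "keycape_oidc_claims_verified": bool,
    "steps": dict,
}


def evidence_skeleton(subject, actor_class="user"):
    """Stand-in for the template: a non-secret, pre-populated skeleton."""
    return {
        "subject": subject,
        "actor_class": actor_class,
        "keycape_oidc_claims_verified": False,
        "steps": {name: "pending"
                  for name in ("create", "verify", "lock", "offboard")},
    }


def validate_evidence(evidence):
    """Stand-in for the validator: return a list of problems (empty if OK)."""
    errors = []
    for field, kind in REQUIRED_FIELDS.items():
        if field not in evidence:
            errors.append(f"missing field: {field}")
        elif not isinstance(evidence[field], kind):
            errors.append(f"wrong type for field: {field}")
    return errors


def test_template_produces_valid_json():
    skeleton = evidence_skeleton("t06-alice")
    assert json.loads(json.dumps(skeleton)) == skeleton


def test_validator_accepts_skeleton():
    assert validate_evidence(evidence_skeleton("t06-alice")) == []


def test_validator_rejects_missing_field():
    broken = evidence_skeleton("t06-alice")
    del broken["steps"]
    assert validate_evidence(broken) == ["missing field: steps"]
```

The test functions follow the pytest naming convention, so they would be collected automatically if the real template and validator were imported in place of the stand-ins.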

2026-06-03 implementation: Added a "User lifecycle dry-run (T06)" record to runbook_payloads() (appears in the runbooks section of web-ui and status). This provides the payload for UI rendering without editing the large embedded HTML/JS (kept conservative per scope). Updated NET-WP-0018 T07 to explicitly reference the 0019 dry-run tooling and tests as a cross-link. The CLI exposure was already done in T03. A full interactive card in the web-ui HTML can be a follow-up if more UI work is needed.

Acceptance Criteria

  • A full non-root dry-run (onboard + verify LLDAP/groups/MFA/KeyCape/no-root + lock + offboard + evidence + cleanup) can be performed with minimal manual steps and no persistent plaintext secrets.
  • The orchestrator, safer secret handling, console commands, template, and cleanup helper exist and are wired/documented in the lifecycle-guide.
  • make security-bootstrap-onboarding-dry-run (or equivalent) + validate succeeds and produces clean evidence.
  • The web-ui (if extended) and CLI status surface the dry-run capabilities.
  • Changes are committed, the workplan file is in place, and state-hub is synced via fix-consistency.
  • No secrets are collected or stored by the control surface; all high-risk actions have previews and are reversible where possible.
  • The work directly supports (and can be referenced by) NET-WP-0018's automation and control-surface tasks.

Notes

  • Builds on prior polish work that added onboarding-dry-run-template, the T06 section to the lifecycle guide, and the template wiring.
  • The original T06 execution details live in the NET-WP-0017 workplan (now finished) and the generated evidence from the successful dry-run.
  • Prefer using the existing .local/ inventory scripts and k8s/ helpers as building blocks.
  • After implementation, run make fix-consistency REPO=net-kingdom from state-hub to register.