| id | type | title | domain | repo | status | owner | topic_slug | created | updated | depends_on | state_hub_workstream_id | related |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NET-WP-0019 | workplan | T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements | netkingdom | net-kingdom | finished | codex | netkingdom | 2026-06-03 | 2026-06-05 | | 75d388b6-7ec1-4e1b-8c87-6ff44f953210 | |
NET-WP-0019 - T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements
Goal
Polish and automate the non-root user lifecycle dry-run experience (the T06 gate from NET-WP-0017) to make it repeatable, safe, console-driven, and aligned with the bootstrap automation goals of NET-WP-0018. Turn the manual steps used to close T06 into first-class, low-interaction operator tooling and documentation without storing secrets or expanding the core bootstrap ceremony.
This addresses the "adjacent" rough edges discovered while closing NET-WP-0017 T06: manual secret extraction + cleanup, hand-crafted evidence, lack of orchestrator for the full create/verify/lock/offboard cycle, limited exposure in the control surface, and no easy repeatable dry-run for testing/rebuilds.
Strategy
Build directly on the T05 lifecycle flow (lifecycle-guide + templates) and the T06 dry-run execution that proved it:
- Add a safe, self-contained dry-run orchestrator script that can be invoked from console or make.
- Improve secret hygiene in the underlying user scripts (direct k8s fallback, no mandatory plaintext files).
- Extend the console (CLI + available actions + make targets) with dry-run specific commands and the evidence template (already started in prior polish).
- Add a cleanup helper for test users.
- Expose more in web-ui where easy.
- Provide better OIDC claims verification hooks for dry-runs.
- Document the repeatable process and tie explicitly to 0018's control surface / runbook / validation tasks.
Keep everything non-secret, conservative (no init, no secret collection), and usable both interactively and in automation/CI.
Prefer extending existing patterns (the security-bootstrap-console.py templates/guides, the k8s/ scripts, the inventory helpers in .local) rather than new big components.
Tasks
T01 - Add Dedicated Dry-Run Orchestrator Script
id: NET-WP-0019-T01
status: done
priority: high
state_hub_task_id: "03e03868-a07d-478c-9808-f9decaeab2e8"
Create sso-mfa/k8s/lldap/dry-run-nonroot-user.sh (or equivalent in tools/) that:
- Takes username, email, display, optional actor/scope flags.
- Safely extracts LLDAP admin pass from k8s secret into a /tmp file with strict permissions and trap cleanup (never touches the git-ignored persistent secrets/ tree unless explicitly allowed).
- Runs create-user.sh --test (or equivalent) for non-root (enforces no --admin for normal users).
- Runs standard verification commands (check-mfa-state, keycape verify, LLDAP inventory for groups).
- Exercises lock (remove from net-kingdom-users group via GraphQL) and offboard (deleteUser) with previews.
- Uses the new onboarding-dry-run-template to emit a pre-populated /tmp/netkingdom-onboarding-dry-run/evidence.json with actual data from queries/outputs.
- Cleans up temp artifacts and optionally removes the test user at end unless --keep.
- Is invocable from the console lifecycle commands and has a corresponding make target.
Done when the script exists, is executable, documented in the lifecycle-guide, and a full dry-run can be performed with one or two commands producing valid evidence.
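The temp-file handling described above (strict permissions, guaranteed cleanup) can be illustrated in Python. This is only a sketch of the pattern, not the orchestrator itself, which is a shell script using trap; the names `secret_workspace` and `write_secret` and the directory prefix are hypothetical.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


@contextmanager
def secret_workspace(prefix="netkingdom-dry-run-"):
    """Yield a private temp directory that is removed on exit, even on error."""
    # mkdtemp creates the directory readable/writable/searchable by the owner only.
    workdir = tempfile.mkdtemp(prefix=prefix)
    try:
        yield workdir
    finally:
        shutil.rmtree(workdir, ignore_errors=True)


def write_secret(workdir, name, value):
    """Write a secret to an owner-only (0600) file in the workspace; return its path."""
    path = os.path.join(workdir, name)
    # O_EXCL refuses to reuse an existing file, so a pre-planted file is never written to.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(value)
    return path
```

Note that `finally` runs on normal exit, exceptions, and Ctrl-C, but not on SIGKILL, and not on SIGTERM unless a handler is installed. A shell `trap ... EXIT` has similar limits for SIGKILL.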
Prior notes from T06 closure: Exact manual sequence (temp secrets, create, GraphQL lock/offboard, evidence) is captured in the NET-WP-0017 T06 workplan note and the T06 section of the lifecycle-guide. This task automates that sequence.
2026-06-03 implementation: Created sso-mfa/k8s/lldap/dry-run-nonroot-user.sh (executable). It uses a /tmp workspace with trap cleanup, extracts the k8s secret safely, runs create-user via a temp secrets dir, performs the verifications, locks and offboards the user via GraphQL, calls the Python template to emit a populated evidence.json, and cleans up. It follows the same patterns as netkingdom-lifecycle-inventory.sh. Ready for testing.
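To make the "lock and offboard with previews" step concrete, here is a hedged sketch that only builds and prints the GraphQL request bodies without sending them. `deleteUser` is named in this workplan; the `removeUserFromGroup` mutation, the argument names, and the `ok` result field are assumptions about the LLDAP GraphQL schema and should be checked against the deployed version. The helper names are hypothetical.

```python
import json

# Assumed mutation shape for removing a user from a group (the "lock" step).
REMOVE_FROM_GROUP = """
mutation Lock($userId: String!, $groupId: Int!) {
  removeUserFromGroup(userId: $userId, groupId: $groupId) { ok }
}
"""

# deleteUser is named in the workplan; the argument name is an assumption.
DELETE_USER = """
mutation Offboard($userId: String!) {
  deleteUser(userId: $userId) { ok }
}
"""


def _require_user(user_id):
    if not user_id or not user_id.strip():
        raise ValueError("user_id must not be empty")
    return user_id


def build_lock_request(user_id, group_id):
    """Request body that would remove the user from one group."""
    return {
        "query": REMOVE_FROM_GROUP,
        "variables": {"userId": _require_user(user_id), "groupId": int(group_id)},
    }


def build_offboard_request(user_id):
    """Request body that would delete the user."""
    return {"query": DELETE_USER, "variables": {"userId": _require_user(user_id)}}


def preview(request):
    """Render a request as stable, human-readable JSON for operator review."""
    return json.dumps(request, indent=2, sort_keys=True)
```

Keeping request construction separate from sending means the same payload the operator reviewed is the one that gets submitted.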
T02 - Safer Secret Handling In User Lifecycle Scripts
id: NET-WP-0019-T02
status: done
priority: high
state_hub_task_id: "564631a6-9b28-4e23-a852-5d85ade94a76"
Update sso-mfa/k8s/lldap/create-user.sh (and related scripts like break-glass.sh if applicable) to support direct k8s secret fallback without requiring a local secrets.env file on disk:
- Make LLDAP_ADMIN_PASS overridable via env var.
- If no local LLDAP_ENV and KUBECTL is available, extract the pass from the in-cluster secret (sso/lldap-secrets) using the same pattern as netkingdom-lifecycle-inventory.sh.
- Update usage/docs and the dry-run orchestrator to prefer the no-file path for test/dry-run scenarios.
- Ensure the password-set port-forward + ldap3 path still works.
- Add a --from-k8s or similar flag if needed for explicitness.
- Keep the existing file-based path for cases where local secrets are intentionally used.
This eliminates the "create temp secrets.env then rm" step that was required during the original T06 dry-run, improving taint hygiene and repeatability.
2026-06-03 implementation: Updated create-user.sh to fall back to k8s secret extraction (using the same pattern as the inventory scripts) when no local LLDAP_ENV is present and LLDAP_ADMIN_PASS is not already set in the environment. The dry-run orchestrator uses the temp /tmp path plus the new fallback. Updated usage comments and error messages. The safer path is now preferred for automation and dry-runs.
Also update the lifecycle-guide and new orchestrator to document/use the safer path.
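The resolution order described in this task (environment variable, then local env file, then in-cluster secret) might look like the following in Python. It is a sketch of the logic only; create-user.sh is a shell script. The function names, the KEY=VALUE file format, and the assumption that the secret's data key is also called `LLDAP_ADMIN_PASS` are all hypothetical. Only the `sso/lldap-secrets` location comes from this workplan.

```python
import base64
import os
import subprocess


def read_env_file(path, key):
    """Return the value for `key` from a KEY=VALUE file, or None if absent."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == key:
                return value.strip().strip('"').strip("'")
    return None


def fetch_from_k8s(namespace="sso", secret="lldap-secrets", data_key="LLDAP_ADMIN_PASS"):
    """Read one key of a Kubernetes secret via kubectl and base64-decode it."""
    out = subprocess.run(
        ["kubectl", "get", "secret", secret, "-n", namespace,
         "-o", f"jsonpath={{.data.{data_key}}}"],  # data key name is an assumption
        check=True, capture_output=True, text=True,
    ).stdout
    return base64.b64decode(out).decode("utf-8")


def resolve_admin_pass(env=None, env_file=None, k8s_fetch=fetch_from_k8s,
                       key="LLDAP_ADMIN_PASS"):
    """Return (password, source), trying env, then local file, then the cluster."""
    env = os.environ if env is None else env
    if env.get(key):
        return env[key], "env"
    if env_file and os.path.isfile(env_file):
        value = read_env_file(env_file, key)
        if value:
            return value, "file"
    return k8s_fetch(), "k8s"
```

Returning the source alongside the value lets a caller log which path was taken without ever logging the secret itself.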
T03 - Console And Make Integration For Dry-Run
id: NET-WP-0019-T03
status: done
priority: medium
state_hub_task_id: "7a264b8a-1b71-4a3e-835b-3c27676d28ef"
Extend the security-bootstrap-console:
- Add `print_onboarding_dry_run_guide()` (or extend the existing lifecycle one) and a `lifecycle-dry-run` or `onboarding-dry-run` CLI subcommand that prints the full guided sequence + invokes the orchestrator script if present.
- Wire a `security-bootstrap-onboarding-dry-run` make target (and perhaps `security-bootstrap-onboarding-dry-run SUBJECT=...`) that runs the orchestrator + validate.
- Ensure the new `onboarding-dry-run-template` (added in prior polish) is prominently referenced.
- Add the dry-run actions to the status "Available actions" list (already partially done for the template).
- Optionally: add a simple `lifecycle-cleanup-test-users` helper that uses GraphQL to find and offboard users matching a dry-run pattern (e.g. t06-, dryrun-).
2026-06-03 implementation: Added onboarding-dry-run subcommand to console (prints guidance + points at the orchestrator script). Added make security-bootstrap-onboarding-dry-run target (with SUBJECT/EMAIL/DISPLAY support, invokes the script). Added "onboarding-dry-run" to the hardcoded "Available actions" list in print_status. The template was already wired previously. (T04 cleanup helper and full web-ui card left as follow-up.)
Update the status print and any relevant payloads.
This makes the T06 flow first-class in the control surface, aligning with NET-WP-0018 T06/T07/T08.
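A minimal sketch of how an `onboarding-dry-run` subcommand could be wired with argparse, printing guidance and pointing at the orchestrator rather than executing it (matching the implementation note above). The flag names `--subject`, `--email`, and `--display`, the positional order passed to the script, and the guidance wording are assumptions. Only the script path, the subcommand name, and `--keep` come from this workplan.

```python
import argparse
import shlex

ORCHESTRATOR = "sso-mfa/k8s/lldap/dry-run-nonroot-user.sh"


def build_parser():
    parser = argparse.ArgumentParser(prog="security-bootstrap-console")
    sub = parser.add_subparsers(dest="command", required=True)
    dry = sub.add_parser("onboarding-dry-run",
                         help="print the guided non-root dry-run sequence")
    dry.add_argument("--subject", required=True, help="test username")
    dry.add_argument("--email", required=True)
    dry.add_argument("--display", default="", help="display name (optional)")
    dry.add_argument("--keep", action="store_true",
                     help="do not remove the test user at the end")
    return parser


def orchestrator_command(args, script=ORCHESTRATOR):
    """Build the argv for the orchestrator (argument order is an assumption)."""
    cmd = [script, args.subject, args.email]
    if args.display:
        cmd.append(args.display)
    if args.keep:
        cmd.append("--keep")
    return cmd


def render_guidance(args):
    """Return the guidance lines; nothing is executed here."""
    return [
        "Sequence: create, verify, lock, offboard, evidence, cleanup",
        "Run: " + shlex.join(orchestrator_command(args)),
    ]


if __name__ == "__main__":
    print("\n".join(render_guidance(build_parser().parse_args())))
```

`shlex.join` quotes arguments such as display names containing spaces, so the printed command can be pasted into a shell as-is.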
T04 - Add Test User Cleanup Helper And Repeatable Dry-Run Support
id: NET-WP-0019-T04
status: done
priority: medium
state_hub_task_id: "e0053d13-bc7a-41e8-900b-4a18a76e19d0"
Add a helper (script + console command + make target) for cleaning up after dry-runs:
- `lifecycle-cleanup-dryrun-users [PATTERN]` that queries LLDAP for matching users, shows preview, removes from groups, deletes users, records non-secret audit.
- Integrate with the orchestrator (e.g. --cleanup flag).
- Update the T06 section of the guide and the orchestrator docs.
- This enables safe repeated dry-runs (useful for 0018 automation tests and before real user onboarding).
2026-06-03 implementation: Enhanced dry-run-nonroot-user.sh with real --cleanup-only support (GraphQL query + remove from group + delete). Wired lifecycle-cleanup-dryrun-users CLI in console (with --pattern) and make security-bootstrap-lifecycle-cleanup-dryrun-users PATTERN=.... The orchestrator itself now supports repeatable safe dry-runs. Updated T06 section of lifecycle-guide to reference the cleanup step.
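The selection-and-preview part of the cleanup can be sketched as below. It assumes the user IDs have already been fetched from LLDAP and only decides which ones match a dry-run pattern. The glob forms `t06-*` and `dryrun-*` are derived from the prefixes mentioned in T03; the function names, the `protected` list, and the preview wording are hypothetical.

```python
from fnmatch import fnmatchcase

DEFAULT_PATTERNS = ("t06-*", "dryrun-*")


def select_dry_run_users(user_ids, patterns=DEFAULT_PATTERNS, protected=("admin",)):
    """Return the sorted user IDs that match a dry-run pattern.

    Empty and match-all patterns are rejected so a typo cannot select every user.
    """
    if not patterns or any(not p.strip("*?") for p in patterns):
        raise ValueError("refusing empty or match-all cleanup pattern")
    return sorted(
        uid for uid in user_ids
        if uid not in protected and any(fnmatchcase(uid, p) for p in patterns)
    )


def preview_lines(selected, group="net-kingdom-users"):
    """Describe what cleanup would do, without doing it."""
    lines = []
    for uid in selected:
        lines.append(f"would remove {uid} from {group}")
        lines.append(f"would delete {uid}")
    return lines
```

For example, `select_dry_run_users(["alice", "t06-demo", "dryrun-42"])` returns `["dryrun-42", "t06-demo"]`, and `alice` is never considered for deletion.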
T05 - Better OIDC Claims And Verification Hooks For Dry-Runs
id: NET-WP-0019-T05
status: done
priority: low
state_hub_task_id: "33f88f24-98bd-4a4d-b70e-f5811816f196"
Provide a non-secret way to exercise/verify actual KeyCape OIDC claims for a dry-run subject (beyond inferring from LLDAP groups + client verify):
- Add a helper in the orchestrator or a new console action that can obtain a short-lived token for the test user (if possible without browser) or at least dump the expected claims structure.
- Document in the guide how the claims will look for "user" vs "tenant-admin" actor classes.
- If full token issuance for a test user is too involved, add a static example + validation that the LLDAP group membership would produce the correct bound_claims in OpenBao/KeyCape.
- Ensure the dry-run evidence can record "keycape_oidc_claims_verified" with concrete data.
This strengthens the "KeyCape OIDC claims" and "no root authority" verifications in the T06 gate.
2026-06-03 implementation: Added print_dry_run_oidc_claims_verification() to the console (called from the 'onboarding-dry-run-claims' subcommand and from the orchestrator script after the verifications). It dumps the expected claims derived from groups (no secrets) and checks them against the platform-admin binding. The orchestrator now calls it during runs. Updated the guide section. (Verifying full live token claims would require a browserless OIDC test flow; left as future work if needed.)
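As a rough illustration of "dump expected claims from groups and check against the platform-admin binding", the sketch below derives a claims structure from LLDAP group membership and tests it against a bound-claims mapping. The claim names (`sub`, `groups`), the binding shape, and the `platform-admin` group value are assumptions for illustration, and the matching rule is a simplification of how a JWT/OIDC role binding is typically evaluated. It is not the console's actual function.

```python
def expected_claims(subject, groups):
    """Claims a token would be expected to carry, inferred from group membership."""
    return {"sub": subject, "groups": sorted(set(groups))}


def _as_set(value):
    if isinstance(value, (list, tuple, set, frozenset)):
        return set(value)
    return {value}


def satisfies_binding(claims, bound_claims):
    """True when every bound claim has at least one allowed value in the claims."""
    for name, allowed in bound_claims.items():
        if name not in claims:
            return False
        if not _as_set(allowed) & _as_set(claims[name]):
            return False
    return True


def check_dry_run_claims(subject, groups, admin_binding):
    """Summarise the expected claims and whether they would grant admin access."""
    claims = expected_claims(subject, groups)
    return {
        "expected_claims": claims,
        "matches_platform_admin_binding": satisfies_binding(claims, admin_binding),
    }
```

For a normal dry-run user the result should report `matches_platform_admin_binding` as false, which is the concrete data the evidence field `keycape_oidc_claims_verified` could be based on.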
T06 - Expose Dry-Run In Web UI And Cross-Link To 0018
id: NET-WP-0019-T06
status: done
priority: low
state_hub_task_id: "aa8ddc00-e77e-4153-aaba-c4e464d4d1a4"
In the web-ui portion of security_bootstrap_console.py:
- Add "dry-run" related records to the appropriate payloads (e.g. lifecycle or runbooks section).
- Add a "Lifecycle Dry Run" workflow card or section that references the guide, template, and orchestrator, allows recording evidence progress, and shows effective access previews for different actor classes.
- Keep it conservative (no secret input).
Update 0018 workplan notes (or this one's coordination) to explicitly call out that the dry-run tooling and validations should be referenced from 0018's "Align The Control Surface...", "Add Automated Tests...", and "Integrate Validations..." tasks.
Add any simple tests (e.g. template produces valid JSON, validate-dry-run accepts the skeleton).
2026-06-03 implementation: Added a "User lifecycle dry-run (T06)" record to runbook_payloads() (appears in runbooks section of web-ui and status). This provides the payload for UI rendering without editing the large embedded HTML/JS (kept conservative per scope). Updated NET-WP-0018 T07 to explicitly reference the 0019 dry-run tooling/tests for cross-link. The CLI exposure was already done in T03. Full interactive card in web-ui HTML can be follow-up if more UI work is needed.
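The "simple tests" mentioned above (template produces valid JSON, validation accepts the skeleton) could take roughly this form. The skeleton layout, step names, and status values are hypothetical stand-ins for whatever onboarding-dry-run-template actually emits. Only the `keycape_oidc_claims_verified` field and the "user" / "tenant-admin" actor classes are taken from this workplan.

```python
import json

STEP_NAMES = ("create", "verify", "lock", "offboard", "cleanup")
STATUSES = ("pending", "done", "failed")
REQUIRED_KEYS = ("subject", "actor_class", "steps", "keycape_oidc_claims_verified")


def evidence_skeleton(subject, actor_class="user"):
    """Build an empty, non-secret evidence record for one dry-run subject."""
    return {
        "subject": subject,
        "actor_class": actor_class,
        "steps": {name: {"status": "pending", "notes": ""} for name in STEP_NAMES},
        "keycape_oidc_claims_verified": False,
    }


def validate_evidence(evidence):
    """Return a list of problems; an empty list means the record is well-formed."""
    errors = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in evidence]
    if evidence.get("actor_class") not in ("user", "tenant-admin"):
        errors.append("actor_class must be 'user' or 'tenant-admin'")
    steps = evidence.get("steps", {})
    for name in STEP_NAMES:
        status = steps.get(name, {}).get("status")
        if status not in STATUSES:
            errors.append(f"step {name}: invalid status {status!r}")
    return errors


def test_skeleton_round_trips_and_validates():
    skeleton = evidence_skeleton("t06-demo")
    assert json.loads(json.dumps(skeleton)) == skeleton
    assert validate_evidence(skeleton) == []
```

The test function is written so it can be collected by pytest or called directly.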
Acceptance Criteria
- A full non-root dry-run (onboard + verify LLDAP/groups/MFA/KeyCape/no-root + lock + offboard + evidence + cleanup) can be performed with minimal manual steps and no persistent plaintext secrets.
- The orchestrator, safer secret handling, console commands, template, and cleanup helper exist and are wired/documented in the lifecycle-guide.
- `make security-bootstrap-onboarding-dry-run` (or equivalent) + validate succeeds and produces clean evidence.
- The web-ui (if extended) and CLI status surface the dry-run capabilities.
- Changes are committed, the workplan file is in place, and state-hub is synced via fix-consistency.
- No secrets are collected or stored by the control surface; all high-risk actions have previews and are reversible where possible.
- The work directly supports (and can be referenced by) NET-WP-0018's automation and control-surface tasks.
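One way to spot-check the "no secrets are collected or stored" criterion is to scan an evidence or payload structure for key names that look secret-bearing. The sketch below is a heuristic under stated assumptions: the key-name pattern and the function name are hypothetical, and a name-based scan can produce false positives (for example a key called `passed`) and cannot detect a secret stored under an innocuous name.

```python
import re

# Heuristic: key names that usually indicate secret material.
SECRET_KEY_RE = re.compile(r"(pass(word)?|secret|token|private[_-]?key)", re.IGNORECASE)


def find_secret_like_keys(node, path=""):
    """Return dotted paths of dict keys whose names look secret-bearing."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else str(key)
            if SECRET_KEY_RE.search(str(key)):
                hits.append(child)
            hits.extend(find_secret_like_keys(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_secret_like_keys(item, f"{path}[{index}]"))
    return hits
```

A validation step could fail the run whenever this returns a non-empty list for evidence.json, prompting a manual review rather than silently accepting the file.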
Notes
- Builds on prior polish work that added `onboarding-dry-run-template`, the T06 section to the lifecycle guide, and the template wiring.
- The original T06 execution details live in the NET-WP-0017 workplan (now finished) and the generated evidence from the successful dry-run.
- Prefer using the existing .local/ inventory scripts and k8s/ helpers as building blocks.
- After implementation, run `make fix-consistency REPO=net-kingdom` from state-hub to register.