tegwick 1721226427 docs: persist user-engine vs net-kingdom integration assessment (new doc + cross-references in SCOPE, boundary contract, guidance, responsibility map, 0018/0019 workplans). Also updated user-engine integration doc to reference it.

2026-06-03 10:33:31 +02:00

id: NET-WP-0019
type: workplan
title: T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements
domain: netkingdom
repo: net-kingdom
status: ready
owner: codex
topic_slug: netkingdom
created: 2026-06-03
updated:
depends_on: NET-WP-0017, NET-WP-0018
state_hub_workstream_id: 75d388b6-7ec1-4e1b-8c87-6ff44f953210
related: docs/user-engine-netkingdom-integration-assessment.md (broader user-engine vs net-kingdom fit, gaps, and recommendations)

NET-WP-0019 - T06-adjacent Polish: Non-Root User Lifecycle Dry-Run Automation And Control Surface Improvements

Goal

Polish and automate the non-root user lifecycle dry-run experience (the T06 gate from NET-WP-0017) to make it repeatable, safe, console-driven, and aligned with the bootstrap automation goals of NET-WP-0018. Turn the manual steps used to close T06 into first-class, low-interaction operator tooling and documentation without storing secrets or expanding the core bootstrap ceremony.

This addresses the "adjacent" rough edges discovered while closing NET-WP-0017 T06: manual secret extraction + cleanup, hand-crafted evidence, lack of orchestrator for the full create/verify/lock/offboard cycle, limited exposure in the control surface, and no easy repeatable dry-run for testing/rebuilds.

Strategy

Build directly on the T05 lifecycle flow (lifecycle-guide + templates) and the T06 dry-run execution that proved it:

Add a safe, self-contained dry-run orchestrator script that can be invoked from console or make.
Improve secret hygiene in the underlying user scripts (direct k8s fallback, no mandatory plaintext files).
Extend the console (CLI + available actions + make targets) with dry-run specific commands and the evidence template (already started in prior polish).
Add a cleanup helper for test users.
Expose more in web-ui where easy.
Provide better OIDC claims verification hooks for dry-runs.
Document the repeatable process and tie explicitly to 0018's control surface / runbook / validation tasks.

Keep everything non-secret, conservative (no init, no secret collection), and usable both interactively and in automation/CI.

Prefer extending existing patterns (the security-bootstrap-console.py templates/guides, the k8s/ scripts, the inventory helpers in .local) rather than new big components.

Tasks

T01 - Add Dedicated Dry-Run Orchestrator Script

id: NET-WP-0019-T01
status: done
priority: high
state_hub_task_id: "03e03868-a07d-478c-9808-f9decaeab2e8"

Create sso-mfa/k8s/lldap/dry-run-nonroot-user.sh (or equivalent in tools/) that:

Takes username, email, display, optional actor/scope flags.
Safely extracts the LLDAP admin password from the k8s secret into a /tmp file with strict permissions and trap cleanup (never touches the git-ignored persistent secrets/ tree unless explicitly allowed).
Runs create-user.sh --test (or equivalent) for non-root (enforces no --admin for normal users).
Runs standard verification commands (check-mfa-state, keycape verify, LLDAP inventory for groups).
Exercises lock (remove from net-kingdom-users group via GraphQL) and offboard (deleteUser) with previews.
Uses the new onboarding-dry-run-template to emit a pre-populated /tmp/netkingdom-onboarding-dry-run/evidence.json with actual data from queries/outputs.
Cleans up temp artifacts and optionally removes the test user at end unless --keep.
Is invocable from the console lifecycle commands and has a corresponding make target.

Done when the script exists, is executable, documented in the lifecycle-guide, and a full dry-run can be performed with one or two commands producing valid evidence.

Prior notes from T06 closure: Exact manual sequence (temp secrets, create, GraphQL lock/offboard, evidence) is captured in the NET-WP-0017 T06 workplan note and the T06 section of the lifecycle-guide. This task automates that sequence.

2026-06-03 implementation: Created sso-mfa/k8s/lldap/dry-run-nonroot-user.sh (executable). It works in a /tmp workspace guarded by a trap, extracts the k8s secret safely, runs create-user via a temporary secrets dir, performs the verifications, exercises lock/offboard via GraphQL, calls the Python template to emit a populated evidence.json, and cleans up. It reuses the same patterns as netkingdom-lifecycle-inventory.sh. Ready for testing.
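
The "strict permissions plus trap cleanup" workspace can be illustrated in Python. This is only a sketch of the pattern, not the orchestrator's actual code (which is a shell script). The helper names private_workspace, write_private_file and file_mode are hypothetical.

```python
import os
import shutil
import stat
import tempfile
from contextlib import contextmanager


@contextmanager
def private_workspace(prefix="netkingdom-dry-run-"):
    """Yield a temp directory only the current user can access, then delete it."""
    workdir = tempfile.mkdtemp(prefix=prefix)  # mkdtemp creates the directory with mode 0700
    try:
        yield workdir
    finally:
        # Runs on normal exit and on exceptions, similar to `trap ... EXIT` in shell.
        shutil.rmtree(workdir, ignore_errors=True)


def write_private_file(path, text):
    """Create `path` with mode 0600 from the start; refuse to overwrite an existing file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(text)


def file_mode(path):
    """Return only the permission bits of `path`."""
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with private_workspace() as workdir:
        secret_path = os.path.join(workdir, "admin-pass")
        write_private_file(secret_path, "placeholder-not-a-real-secret")
        print(oct(file_mode(secret_path)))
    print(os.path.exists(workdir))
```

Creating the file with restrictive permissions in the same call that creates it avoids the short window in which a file written first and chmod-ed afterwards would be readable by others.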

T02 - Safer Secret Handling In User Lifecycle Scripts

id: NET-WP-0019-T02
status: done
priority: high
state_hub_task_id: "564631a6-9b28-4e23-a852-5d85ade94a76"

Update sso-mfa/k8s/lldap/create-user.sh (and related scripts like break-glass.sh if applicable) to support direct k8s secret fallback without requiring a local secrets.env file on disk:

Make LLDAP_ADMIN_PASS overridable via env var.
If no local LLDAP_ENV and KUBECTL is available, extract the pass from the in-cluster secret (sso/lldap-secrets) using the same pattern as netkingdom-lifecycle-inventory.sh.
Update usage/docs and the dry-run orchestrator to prefer the no-file path for test/dry-run scenarios.
Ensure the password-set port-forward + ldap3 path still works.
Add a --from-k8s or similar flag if needed for explicitness.
Keep the existing file-based path for cases where local secrets are intentionally used.

This eliminates the "create temp secrets.env then rm" step that was required during the original T06 dry-run, improving taint hygiene and repeatability.

2026-06-03 implementation: Updated create-user.sh to fall back to k8s secret extraction (using the same pattern as the inventory scripts) when no local LLDAP_ENV is present and LLDAP_ADMIN_PASS is not already set in the environment. The dry-run orchestrator uses the temporary /tmp path plus the new fallback. Updated usage comments and error messages. The safer path is now preferred for automation and dry-runs.

Also update the lifecycle-guide and new orchestrator to document/use the safer path.
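
The lookup order described in this task (environment variable, then local env file, then in-cluster secret) might look roughly like the following Python sketch. It is not the code in create-user.sh. The function names are hypothetical, and the secret key name LLDAP_LDAP_USER_PASS and the LLDAP_ADMIN_PASS entry inside the env file are assumptions; the source only names the secret sso/lldap-secrets and the LLDAP_ADMIN_PASS variable.

```python
import base64
import os
import subprocess


def read_env_file(path):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def fetch_from_k8s(namespace="sso", secret="lldap-secrets", key="LLDAP_LDAP_USER_PASS"):
    """Read one key of a Kubernetes secret via kubectl and base64-decode it.

    The key name is an assumption; adjust it to the real secret layout.
    """
    encoded = subprocess.run(
        ["kubectl", "-n", namespace, "get", "secret", secret,
         "-o", f"jsonpath={{.data.{key}}}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return base64.b64decode(encoded).decode("utf-8")


def resolve_admin_pass(environ=os.environ, env_file=None, k8s_fetch=fetch_from_k8s):
    """Return (password, source): env var first, then local file, then the cluster."""
    if environ.get("LLDAP_ADMIN_PASS"):
        return environ["LLDAP_ADMIN_PASS"], "env"
    if env_file and os.path.isfile(env_file):
        value = read_env_file(env_file).get("LLDAP_ADMIN_PASS")
        if value:
            return value, "file"
    # Nothing local: fall back to the cluster so no plaintext file has to exist on disk.
    return k8s_fetch(), "k8s"
```

Returning the source alongside the value lets the caller log which path was used without ever logging the password itself.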

T03 - Console And Make Integration For Dry-Run

id: NET-WP-0019-T03
status: done
priority: medium
state_hub_task_id: "7a264b8a-1b71-4a3e-835b-3c27676d28ef"

Extend the security-bootstrap-console:

Add print_onboarding_dry_run_guide() (or extend the existing lifecycle one) and a lifecycle-dry-run or onboarding-dry-run CLI subcommand that prints the full guided sequence + invokes the orchestrator script if present.
Wire a security-bootstrap-onboarding-dry-run make target (optionally parameterized, e.g. security-bootstrap-onboarding-dry-run SUBJECT=...) that runs the orchestrator and then validates the evidence.
Ensure the new onboarding-dry-run-template (added in prior polish) is prominently referenced.
Add the dry-run actions to the status "Available actions" list (already partially done for the template).
Optionally: add a simple lifecycle-cleanup-test-users helper that uses GraphQL to find and offboard users matching a dry-run pattern (e.g. t06-, dryrun-).

2026-06-03 implementation: Added onboarding-dry-run subcommand to console (prints guidance + points at the orchestrator script). Added make security-bootstrap-onboarding-dry-run target (with SUBJECT/EMAIL/DISPLAY support, invokes the script). Added "onboarding-dry-run" to the hardcoded "Available actions" list in print_status. The template was already wired previously. (T04 cleanup helper and full web-ui card left as follow-up.)

Update the status print and any relevant payloads.

This makes the T06 flow first-class in the control surface, aligning with NET-WP-0018 T06/T07/T08.
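
As an illustration of the subcommand wiring, here is a minimal argparse sketch that prints guidance and points at the orchestrator without executing it. It is not the console's real code. The option names --subject, --email and --display mirror the make variables SUBJECT/EMAIL/DISPLAY, and the positional argument order passed to the script is an assumption.

```python
import argparse
import os
import shlex

ORCHESTRATOR = "sso-mfa/k8s/lldap/dry-run-nonroot-user.sh"


def build_parser():
    """Build a parser with a single onboarding-dry-run subcommand."""
    parser = argparse.ArgumentParser(prog="security-bootstrap-console")
    sub = parser.add_subparsers(dest="command", required=True)
    dry = sub.add_parser("onboarding-dry-run", help="print the guided dry-run sequence")
    dry.add_argument("--subject", required=True, help="test username, e.g. t06-alice")
    dry.add_argument("--email", required=True)
    dry.add_argument("--display", required=True, help="display name")
    return parser


def orchestrator_command(args):
    """Return the argv for the orchestrator (argument order is hypothetical)."""
    return [ORCHESTRATOR, args.subject, args.email, args.display]


def main(argv=None):
    args = build_parser().parse_args(argv)
    cmd = orchestrator_command(args)
    print("Guided sequence: create, verify, lock, offboard, emit evidence, clean up.")
    if os.path.exists(ORCHESTRATOR):
        # Only point at the script; the operator decides whether to run it.
        print("Run:", shlex.join(cmd))
    else:
        print("Orchestrator not found at", ORCHESTRATOR)
    return cmd


if __name__ == "__main__":
    main()
```

Printing a shell-quoted command instead of running it keeps the console conservative: the operator copies the line and executes it deliberately.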

T04 - Add Test User Cleanup Helper And Repeatable Dry-Run Support

id: NET-WP-0019-T04
status: done
priority: medium
state_hub_task_id: "e0053d13-bc7a-41e8-900b-4a18a76e19d0"

Add a helper (script + console command + make target) for cleaning up after dry-runs:

lifecycle-cleanup-dryrun-users [PATTERN] that queries LLDAP for matching users, shows a preview, removes them from groups, deletes them, and records a non-secret audit entry.
Integrate with the orchestrator (e.g. --cleanup flag).
Update the T06 section of the guide and the orchestrator docs.
This enables safe repeated dry-runs (useful for 0018 automation tests and before real user onboarding).

2026-06-03 implementation: Enhanced dry-run-nonroot-user.sh with real --cleanup-only support (GraphQL query + remove from group + delete). Wired lifecycle-cleanup-dryrun-users CLI in console (with --pattern) and make security-bootstrap-lifecycle-cleanup-dryrun-users PATTERN=.... The orchestrator itself now supports repeatable safe dry-runs. Updated T06 section of lifecycle-guide to reference the cleanup step.
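
The preview-before-delete step could be sketched as below. This is an illustration only: the function names are hypothetical, the plan entries are abstract labels rather than real GraphQL calls, and the rule that a pattern needs at least three literal leading characters is an added safety assumption, not something the workplan specifies.

```python
import re
from fnmatch import fnmatchcase

DEFAULT_PATTERNS = ("t06-*", "dryrun-*")


def select_dry_run_users(user_ids, patterns=DEFAULT_PATTERNS):
    """Return the sorted user ids matching a dry-run pattern (case-sensitive)."""
    for pattern in patterns:
        # Literal text before the first wildcard; refuse patterns like "*" or "a*".
        literal_prefix = re.split(r"[*?\[]", pattern, maxsplit=1)[0]
        if len(literal_prefix) < 3:
            raise ValueError(f"pattern too broad for cleanup: {pattern!r}")
    return sorted(
        user for user in user_ids
        if any(fnmatchcase(user, pattern) for pattern in patterns)
    )


def cleanup_plan(user_ids, group="net-kingdom-users", patterns=DEFAULT_PATTERNS):
    """Build a non-secret preview of the actions cleanup would take, in order."""
    plan = []
    for user in select_dry_run_users(user_ids, patterns):
        plan.append(("remove_from_group", user, group))
        plan.append(("delete_user", user))
    return plan


if __name__ == "__main__":
    for step in cleanup_plan(["admin", "t06-alice", "dryrun-bob", "carol"]):
        print(step)
```

Because the plan contains only usernames and action labels, it can double as the non-secret audit record once the actions have been carried out.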

T05 - Better OIDC Claims And Verification Hooks For Dry-Runs

id: NET-WP-0019-T05
status: done
priority: low
state_hub_task_id: "33f88f24-98bd-4a4d-b70e-f5811816f196"

Provide a non-secret way to exercise/verify actual KeyCape OIDC claims for a dry-run subject (beyond inferring from LLDAP groups + client verify):

Add a helper in the orchestrator or a new console action that can obtain a short-lived token for the test user (if possible without browser) or at least dump the expected claims structure.
Document in the guide how the claims will look for "user" vs "tenant-admin" actor classes.
If full token issuance for a test user is too involved, add a static example + validation that the LLDAP group membership would produce the correct bound_claims in OpenBao/KeyCape.
Ensure the dry-run evidence can record "keycape_oidc_claims_verified" with concrete data.

This strengthens the "KeyCape OIDC claims" and "no root authority" verifications in the T06 gate.

2026-06-03 implementation: Added print_dry_run_oidc_claims_verification() to the console (called from the 'onboarding-dry-run-claims' subcommand and from the orchestrator script after the verification steps). It dumps the expected claims derived from group membership (no secrets) and checks them against the platform-admin binding. Updated the guide section. (Verifying claims from a live token would require a browserless OIDC test flow and is left as future work if needed.)
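
Deriving expected claims from group membership and checking for the absence of admin authority could be sketched as follows. This does not reproduce print_dry_run_oidc_claims_verification(). The claim names (preferred_username, email, groups) and the group name platform-admins are assumptions; only net-kingdom-users and the evidence field keycape_oidc_claims_verified come from this workplan.

```python
USER_GROUP = "net-kingdom-users"
PLATFORM_ADMIN_GROUP = "platform-admins"  # hypothetical name for the admin-bound group


def expected_claims(username, email, groups):
    """Return the claims a token would be expected to carry (no secrets involved)."""
    return {
        "preferred_username": username,
        "email": email,
        "groups": sorted(groups),
    }


def verify_non_root(claims):
    """Check that the subject is an ordinary user and carries no platform-admin group."""
    groups = set(claims["groups"])
    in_user_group = USER_GROUP in groups
    has_platform_admin = PLATFORM_ADMIN_GROUP in groups
    return {
        "keycape_oidc_claims_verified": in_user_group and not has_platform_admin,
        "in_user_group": in_user_group,
        "has_platform_admin": has_platform_admin,
        "expected_claims": claims,
    }


if __name__ == "__main__":
    claims = expected_claims("t06-alice", "alice@example.test", ["net-kingdom-users"])
    print(verify_non_root(claims))
```

The result dictionary is shaped so it can be copied into the dry-run evidence as concrete data rather than a bare boolean.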

T06 - Expose Dry-Run In Web UI And Cross-Link To 0018

id: NET-WP-0019-T06
status: done
priority: low
state_hub_task_id: "aa8ddc00-e77e-4153-aaba-c4e464d4d1a4"

In the web-ui portion of security_bootstrap_console.py:

Add "dry-run" related records to the appropriate payloads (e.g. lifecycle or runbooks section).
Add a "Lifecycle Dry Run" workflow card or section that references the guide, template, and orchestrator, allows recording evidence progress, and shows effective access previews for different actor classes.
Keep it conservative (no secret input).

Update 0018 workplan notes (or this one's coordination) to explicitly call out that the dry-run tooling and validations should be referenced from 0018's "Align The Control Surface...", "Add Automated Tests...", and "Integrate Validations..." tasks.

Add any simple tests (e.g. template produces valid JSON, validate-dry-run accepts the skeleton).

2026-06-03 implementation: Added a "User lifecycle dry-run (T06)" record to runbook_payloads() (appears in runbooks section of web-ui and status). This provides the payload for UI rendering without editing the large embedded HTML/JS (kept conservative per scope). Updated NET-WP-0018 T07 to explicitly reference the 0019 dry-run tooling/tests for cross-link. The CLI exposure was already done in T03. Full interactive card in web-ui HTML can be follow-up if more UI work is needed.
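
The simple tests mentioned above (the template produces valid JSON, the validator accepts the skeleton) might look like the sketch below. The evidence schema here is hypothetical apart from the keycape_oidc_claims_verified field; the real onboarding-dry-run-template and validate-dry-run may use different keys.

```python
import json

REQUIRED_CHECKS = (
    "created", "verified", "locked", "offboarded", "keycape_oidc_claims_verified",
)
FORBIDDEN_KEY_PARTS = ("password", "secret", "token")


def evidence_skeleton(subject):
    """Return an unfilled evidence document; None means 'not yet recorded'."""
    return {
        "workplan": "NET-WP-0019",
        "subject": subject,
        "checks": {name: None for name in REQUIRED_CHECKS},
        "notes": [],
    }


def _all_keys(value):
    """Yield every dictionary key found anywhere in a nested JSON-like value."""
    if isinstance(value, dict):
        for key, inner in value.items():
            yield key
            yield from _all_keys(inner)
    elif isinstance(value, list):
        for inner in value:
            yield from _all_keys(inner)


def validate_evidence(doc, require_complete=False):
    """Return a list of problems; an empty list means the document is acceptable."""
    errors = []
    if not isinstance(doc.get("subject"), str) or not doc["subject"]:
        errors.append("subject must be a non-empty string")
    checks = doc.get("checks")
    if not isinstance(checks, dict):
        return errors + ["checks must be an object"]
    for name in REQUIRED_CHECKS:
        if name not in checks:
            errors.append(f"missing check: {name}")
        elif require_complete and checks[name] is not True:
            errors.append(f"check not passed: {name}")
    for key in _all_keys(doc):
        # Heuristic guard so evidence never grows a field that looks like a secret.
        if any(part in key.lower() for part in FORBIDDEN_KEY_PARTS):
            errors.append(f"suspicious key: {key}")
    return errors


if __name__ == "__main__":
    text = json.dumps(evidence_skeleton("t06-alice"), indent=2)
    print(validate_evidence(json.loads(text)))
```

Round-tripping the skeleton through json.dumps and json.loads before validating covers both checks in one step: the template serializes cleanly, and the validator accepts what comes back.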

Acceptance Criteria

A full non-root dry-run (onboard + verify LLDAP/groups/MFA/KeyCape/no-root + lock + offboard + evidence + cleanup) can be performed with minimal manual steps and no persistent plaintext secrets.
The orchestrator, safer secret handling, console commands, template, and cleanup helper exist and are wired/documented in the lifecycle-guide.
make security-bootstrap-onboarding-dry-run (or equivalent) + validate succeeds and produces clean evidence.
The web-ui (if extended) and CLI status surface the dry-run capabilities.
Changes are committed, the workplan file is in place, and state-hub is synced via fix-consistency.
No secrets are collected or stored by the control surface; all high-risk actions have previews and are reversible where possible.
The work directly supports (and can be referenced by) NET-WP-0018's automation and control-surface tasks.

Notes

Builds on prior polish work that added onboarding-dry-run-template, the T06 section to the lifecycle guide, and the template wiring.
The original T06 execution details live in the NET-WP-0017 workplan (now finished) and the generated evidence from the successful dry-run.
Prefer using the existing .local/ inventory scripts and k8s/ helpers as building blocks.
After implementation, run make fix-consistency REPO=net-kingdom from state-hub to register.
