---
id: NET-WP-0018
type: workplan
title: "Bootstrap Automation And Rebuild Readiness"
domain: netkingdom
repo: net-kingdom
status: active
owner: codex
topic_slug: netkingdom
created: "2026-06-01"
updated: "2026-06-03"
depends_on:
  - NET-WP-0015
  - NET-WP-0017
state_hub_workstream_id: "800f9f16-bc44-4bbf-a771-58a630a3b698"
---

# NET-WP-0018 - Bootstrap Automation And Rebuild Readiness

## Goal

Turn the first successful NetKingdom security bootstrap into a repeatable,
well-bounded, highly automated setup path that can survive an infrastructure
reset with minimal interactive diagnosis.

The first run proved that the stack can work: LLDAP, Authelia, privacyIDEA,
KeyCape, OpenBao, the local bootstrap control surface, and State Hub now form a
working identity and security bootstrap path. It also proved that the system is
still too easy to derail: realm drift, callback bridging, LLDAP lookup
assumptions, OpenBao claim shape, token expiry, and operator-state persistence
all required interactive repair. This workplan converts those lessons into
architecture documentation, bootstrap sequencing, validation coverage, UI
automation, and a clear scratch-rebuild risk assessment.

## Strategy

Proceed in layers:

1. close or explicitly hand off the remaining `NET-WP-0015` bootstrap gates;
2. document the runtime architecture that now actually exists;
3. write down the bootstrap retrospective and automation gaps;
4. clarify repository boundaries so future fixes land in the right place;
5. produce a sequence guide for a smooth rebuild;
6. improve the control-surface UI so it follows that guide;
7. add tests and validations for every guided bootstrap section; and
8. assess the residual risk of rebuilding NetKingdom from scratch.

This is not a request to immediately destroy and rebuild the live stack. A
scratch rebuild should come only after the guide, validations, and risk review
say which interactions remain genuinely unavoidable.

## Coordination Notes

- Avoid duplicating `NET-WP-0017`: audit durability, escrow, user onboarding,
  and hardening remain there unless this workplan explicitly turns them into
  bootstrap-guide or validation work.
- Keep the bootstrap UI a control surface, not a secret collector. It may run
  safe checks, generate commands, and store non-secret evidence, but it must not
  store passwords, OTP seeds, OpenBao tokens, unseal shares, or recovery codes.
- Prefer validation helpers that are usable both by the UI and by CI or
  operator command lines.
- Treat interactive prompts as an explicit design boundary: automate everything
  that can be automated safely, and document why each remaining human action is
  required.
- Pragmatic auditing and tracking for implementing *this workplan*: use State Hub
  /progress/ (and /decisions/ for key choices, e.g. during T02/T04), dated notes
  plus task status in this file (source of truth per ADR-001), descriptive git
  commits, console evidence/validators plus .local/security-bootstrap.json when
  exercising paths, /tmp evidence, and runbooks. These artifacts (plus bumps
  encountered while doing T02–T08) directly feed the T03 retrospective and gap
  matrix (which explicitly covers "audit" among other items). This enables
  post-implementation review for optimization potential without requiring a
  production Audit Core first. See the audit_core_* fields in metadata (bootstrap
  risk accepted=true; production sink ready=false; temporary exception with
  owner/review 2026-07-02 per .local and console gates). Proper cross-system
  audit correlation (UE + flex-auth + platform sinks per contract/assessment
  gap 7) remains a follow-up; document the current pragmatic paths
  (local-identity/audit.py TSV, OpenBao PVC + mock, State Hub/console evidence,
  separate bootstrap audit) in the T02 arch doc and T03 matrix. Do not block the
  start of 0018 on a full Audit Core.

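The "control surface, not a secret collector" rule can be checked mechanically before any state is persisted. The following is a minimal sketch only: `find_secret_keys`, `assert_no_secret_keys`, and the deny-list of key fragments are hypothetical names chosen for illustration, not taken from the console code. It flags only string values, so boolean evidence flags with similar names still pass.

```python
from typing import Any, Iterator

# Illustrative deny-list (an assumption, not the console's real list).
SECRET_KEY_FRAGMENTS = ("password", "otp_seed", "token", "unseal", "recovery_code")


def find_secret_keys(state: Any, path: str = "") -> Iterator[str]:
    """Yield dotted paths of string values whose key names look secret."""
    if isinstance(state, dict):
        for key, value in state.items():
            child = f"{path}.{key}" if path else str(key)
            looks_secret = any(f in str(key).lower() for f in SECRET_KEY_FRAGMENTS)
            # Only non-empty strings are flagged; booleans are evidence flags.
            if looks_secret and isinstance(value, str) and value:
                yield child
            yield from find_secret_keys(value, child)
    elif isinstance(state, list):
        for index, item in enumerate(state):
            yield from find_secret_keys(item, f"{path}[{index}]")


def assert_no_secret_keys(state: dict) -> None:
    """Raise before persisting if any key suggests stored secret material."""
    offenders = sorted(find_secret_keys(state))
    if offenders:
        raise ValueError(f"refusing to persist secret-like keys: {offenders}")
```

A caller would run `assert_no_secret_keys(state)` immediately before writing non-secret state such as `.local/security-bootstrap.json`. A key-name check like this is a guard rail, not proof that no secret material was recorded.
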
## Related (post-0019 + assessment)

- NET-WP-0019 (T06-adjacent user lifecycle dry-run polish; advanced control surface, evidence, claims for T06/T07/T08)
- docs/user-engine-netkingdom-integration-assessment.md (detailed T04 boundary/intent/scope review for user-engine integration + 7 gaps; cross-referenced from SCOPE etc.)

## Tasks

### T01 - Close Or Hand Off NET-WP-0015 Remaining Gates

```task
id: NET-WP-0018-T01
status: done
priority: high
state_hub_task_id: "7ff22629-838b-41df-9feb-bb36c5d57cc1"
```

Review `NET-WP-0015` now that `platform-root` can obtain OpenBao
`platform-admin` through KeyCape/MFA. Close any gates that are truly complete,
and explicitly move unfinished production-readiness work to `NET-WP-0017` or
this workplan when it no longer belongs in the bootstrap ceremony plan.

Done when `NET-WP-0015` is either finished and ready to archive, or its
remaining tasks have precise owners, target workplans, and non-duplicative
acceptance criteria.

**2026-06-01:** Completed. `NET-WP-0015` was scope-closed as finished after
the OpenBao admin bridge was proven through KeyCape/MFA. Its remaining
production-readiness concerns were reconciled into `NET-WP-0017`: T02 owns
audit, restore, emergency drill evidence, and escrow; T03/T04 own bootstrap
path retirement and credential reset/rotation; T07 owns final archive review.
`NET-WP-0018` now continues with architecture documentation, retrospective,
guide, UI automation, validations, and rebuild-risk assessment.

**2026-06-03:** The 0019 polish (dry-run orchestrator, console subcommands/make targets/claims/validators/runbook) and the user-engine/net-kingdom assessment (see T04) are cross-cutting enablers. See the per-task notes (T02–T09) for specifics; 0019 advances T06/T07/T08 for lifecycle automation; the assessment fulfills the UE boundary review portion of T04. Related: NET-WP-0019, docs/user-engine-netkingdom-integration-assessment.md.

### T02 - Document The Runtime Architecture

```task
id: NET-WP-0018-T02
status: done
priority: high
state_hub_task_id: "121ee797-e3f5-4d3e-9baa-cfa8c92f8a66"
```

Create `docs/NetkingdomRuntimeArchitecture.md` documenting what now exists:
identity stores, MFA realms, KeyCape OIDC flow, Authelia handoff, OpenBao OIDC
admin path, bootstrap UI state, State Hub relation, live DNS/routes, trust
boundaries, token flows, and operational assumptions.

The document should explain the working system as deployed, not an idealized
future architecture. It should be specific enough to guide a scratch rebuild
without requiring the operator to rediscover the same integration details.

**2026-06-03 (post 0017/0019 + assessment):** The runtime now includes the
T06-adjacent dry-run tooling (orchestrator + console/make exposure + evidence
discipline) as part of the control surface. Per the persisted assessment, the
arch doc must capture: current direct LLDAP/KeyCape paths for bootstrap users
(vs. future UE claims_enrichment adapter), membership facts in LLDAP groups
vs. UE Membership (owning_system etc.), bootstrap local-identity vs. UE local
mode, and the boundary contract as the governance layer. Include refs to
canon/standards/user-engine-boundary-contract_v0.1.md and the assessment.

**2026-06-03:** Started T02. Using pragmatic tracking (this note, plus a POST to /progress/ with the task). Gathering deployed components from existing docs, code, and configs to produce a specific, as-deployed doc (not idealized). Will cover all listed items + pragmatic audit paths, dry-run 0019 additions, UE integration points/gaps per assessment.

**2026-06-03:** T02 complete. Created docs/NetkingdomRuntimeArchitecture.md with comprehensive sections on: the planes model; identity stores/MFA/OIDC flows (lightweight key-cape: LLDAP at lldap.coulomb.social + Authelia + privacyIDEA + KeyCape issuer https://kc.coulomb.social with bootstrap clients); Authelia handoff; OpenBao OIDC admin + secrets/credential path (SOPS/age bootstrap -> runtime with K8s auth, ESO, leases); bootstrap console/UI state (S6 Reopen, full gates incl. audit_core_posture, 0019 dry-run orchestrator/console subcmds/make targets/evidence/validators/runbook entry); State Hub relation (progress/decisions for tracking); k8s/DNS/routes/ingress/trust boundaries (sso/openbao ns, recursive rule, concrete hosts); and operational assumptions + rebuild notes. Explicitly includes current pragmatic audit paths (local-identity/audit.py TSV, OpenBao PVC+mock, State Hub/console evidence) and UE integration points + 7 gaps (from assessment + contract refs). Specific as-deployed for rebuild guidance. This doc now feeds T03 retrospective, T05 guide, T09 risk, and T02/T08 validation targets.

### T03 - Produce A Bootstrap Retrospective And Automation Gap Matrix

```task
id: NET-WP-0018-T03
status: done
priority: high
state_hub_task_id: "1a3c4261-4133-4021-bd53-ea3dc77021a0"
```

Assess how the first bootstrap went. Capture each bump encountered, the root
cause, how it was diagnosed, whether it is now automated, and what remains as a
manual step or fragile assumption.

Recommended output: `docs/security-bootstrap-retrospective.md` with a gap
matrix covering state persistence, privacyIDEA realm repair, KeyCape image
delivery, OIDC callbacks, OpenBao claim mapping, token revocation, audit,
escrow, and rebuild verification.

**2026-06-03 (post 0017 close + 0019 polish):** Retrospective should now
incorporate: successful S6 reopen + platform_reopened flag + cleanup_complete
in .local/security-bootstrap.json; T06 dry-run evidence discipline (12+ bools
incl. effective_access_before_save, no_secret_material_recorded, lldap_identity_verified,
keycape_oidc_claims_verified, actor_class != king, !net-kingdom-admins for non-root);
safer secret handling via /tmp WORKSPACE + trap + k8s fallback (never write
sso-mfa/bootstrap/secrets for dry-runs); console as non-secret control surface
with runbooks + templates + validators; 0019 make targets and orchestrator as
repeatable automation. Gaps remaining: UE adapter integration (see assessment).
The first bootstrap's interactive repairs (realm drift, callbacks, claim shape,
token expiry, operator-state) are now partially automated via console/evidence.

**2026-06-03:** Started T03 (after T02 arch doc complete). Using pragmatic tracking (progress + file notes). Compiling bumps from 0015-0017/0019 history + T02 doc + console/metadata/evidence examples. Will produce docs/security-bootstrap-retrospective.md + gap matrix (state persistence, privacyIDEA repair, KeyCape delivery, OIDC callbacks, OpenBao claims, token revocation, **audit**, escrow, rebuild verification + new: 0019 dry-run hygiene/automation, console evidence, UE gaps), recording what is now automated vs. what remains manual or fragile.

**2026-06-03:** T03 initial substantial progress. Created docs/security-bootstrap-retrospective.md (exec summary, 9 detailed bumps with "now automated?" status, full gap matrix table covering audit + UE + 0019 items, recommendations for T05/T07/T08/T09, references to T02 doc + pragmatic records + evidence). Uses 0019 dry-run/evidence as the model. Still in_progress (expand with any new findings from later T0x tasks).

**2026-06-03:** T03 complete. Finalized retrospective draft with comprehensive bumps analysis, gap matrix (explicitly including audit, UE integration, 0019 polish as enablers), and actionable recs. No further expansion needed at this stage (will reference in later tasks). Used pragmatic tracking throughout (progress events with task_id, workplan notes, git). The doc + T02 now provide a strong foundation for T05 (guide), T07/T08 (tests/validations), T09 (risk). Marked done in file and will sync via fix.

### T04 - Review Repository Intent And Scope Boundaries

```task
id: NET-WP-0018-T04
status: done
priority: medium
state_hub_task_id: "9c286579-b7bc-46ae-9789-801b2b27b26d"
```

Review `INTENT.md`, `SCOPE.md`, and equivalent boundary documents across the
associated repositories involved in the bootstrap. At minimum consider
`net-kingdom`, `key-cape`, `railiance-platform`, `state-hub`/custodian, and any
repo that owns OpenBao deployment, image delivery, identity runtime, or
bootstrap automation.

Update the boundary documents or create follow-up workplans where ownership is
unclear. The result should answer: where should a bug fix live, where should a
runbook live, where should validation live, and which repo owns live
deployment state.

**2026-06-03:** The user-engine/net-kingdom integration assessment (persisted in
`docs/user-engine-netkingdom-integration-assessment.md`, cross-referenced from
SCOPE.md Getting Oriented, canon/standards/user-engine-boundary-contract_v0.1.md,
docs/responsibility-map.md, user-engine-interface-guidance.md, and this/0019
workplans) provides a comprehensive review of intent, implemented scope (UE:
headless domain models + in-mem MVP + ports/adapters for claims/audit/projections;
NK: IAM orchestration + contracts + bootstrap), architectural fit (no intent
conflicts; UE owns user-domain facts/projections, NK orchestrates boundaries per
ADR-0007/0010/contract), and 7 specific gaps/risks (1. Missing Platform Integration
Adapters -- biggest; 2. Bootstrap/Platform Users vs. Governed UE Lifecycle;
3. App Onboarding "Application" concept overload; 4. Membership/Group overlap;
5. Governance/Workplan/Brief split (UE brief stale); 6. Claims Enrichment Path
drift (current direct LLDAP in NK/keycape paths); 7. Audit correlation). NK
bootstrap (0015-0017/0019) is allowed for local/non-prod per contract. This
largely fulfills the UE + boundary review portion of T04. Recommend follow-up
reviews or work items for key-cape (OIDC client vs UE Application Binding),
railiance-platform (deployment refs), and explicit transition rules for seeding
externally_provisioned memberships from IAM groups. The assessment recommends
using 0018's T07/T08 to drive integration tests/dry-runs once adapters exist.

**2026-06-03:** T04 complete (no dedicated review session needed). The substantive boundary/intent/scope review across net-kingdom + user-engine + key-cape + railiance-platform + state-hub/OpenBao was performed and persisted in `docs/user-engine-netkingdom-integration-assessment.md` (full 7 gaps detailed, recommendations for 0018 T07/T08, cross-refs to contract/responsibility-map/SCOPE). This was explicitly noted in T04's 2026-06-03 entry at creation time. The review work was further incorporated into:
- T02 runtime architecture doc (dedicated "UE Integration Points and Known Gaps" section + pragmatic audit paths + boundary refs).
- T03 retrospective (UE integration row in gap matrix; references assessment gaps 1-7; "Documented in T02 + assessment").
- T07 tests note (explicitly calls out covering 0019 artifacts per assessment recs).
- Multiple cross-refs in SCOPE.md, workplan Related sections, etc.

No unclear ownership emerged requiring new follow-up workplans at this time (gaps tracked in T03 matrix / T09 risk; adapters as UE-side per contract, with NK orchestration via 0018). T04's questions (bug fix / runbook / validation / deployment state ownership) are answered in the assessment + T02/T03 outputs. Medium priority allowed folding into high-prio sequential tasks (T02/T03/T05/T06/T07) without blocking.

### T05 - Create The Smooth Bootstrap Guide

```task
id: NET-WP-0018-T05
status: done
priority: high
state_hub_task_id: "e7b45fc8-8ee7-4914-ac4b-d0c8a35fad13"
```

Create or update the NetKingdom bootstrap guide so an operator knows what to
do, in what order, and what evidence proves each step is complete.

The guide should cover prerequisites, credential bundle creation, cluster
foundation checks, privacyIDEA bootstrap, LLDAP/bootstrap user creation,
KeyCape deployment and client registration, OpenBao init/unseal/configuration,
OIDC admin binding, token cleanup, State Hub sync, and handoff to production
readiness.

**2026-06-03:** Base material exists in piecemeal form: docs/security-bootstrap-*.md
(user-lifecycle.md, operator-journey.md, king-credential-kit.md, openbao-ceremony-ux.md,
handover-cleanup.md, etc.), console lifecycle-guide (T05/T06 flows with previews),
and security-bootstrap-user-lifecycle.md (UX contract for show-effective-before-save,
actor classes, blocked conditions). The 0019 polish extended the console lifecycle-guide
with a T06 DRY-RUN EXECUTION section (though that section still lists pre-orchestrator
manual secret steps; update it to prefer make security-bootstrap-onboarding-dry-run +
dry-run-nonroot-user.sh + k8s fallback). T05 should consolidate these into one
"NET-WP-0018 smooth bootstrap guide" (or update operator-journey) with explicit
evidence per step (linking the validate-* make targets and templates). 0019's
dry-run + evidence is the model for the user-lifecycle portion of the guide.

**2026-06-03:** Started T05 (after T03 complete). Per retrospective recs (T05 high priority now that T02 arch + T03 retrospective exist). Using pragmatic tracking. Will consolidate piecemeal materials (T02, T03 retrospective, console lifecycle-guide + 0019 extensions, security-bootstrap-operator-journey.md, user-lifecycle.md, other *-ux.md, evidence templates/validators from console/0019) into a single operator guide with clear sequence, prerequisites, evidence per step (links to validate-*, 0019 dry-run, etc.), and the "next safe action" / blocked gates model from the UX contract. Update console guide section as needed. Produce docs/smooth-bootstrap-guide.md or update main journey doc.

**2026-06-03:** T05 complete. Created docs/smooth-bootstrap-guide.md (the consolidated NET-WP-0018 smooth bootstrap guide): covers the full sequence from prereqs to reopen + user lifecycle (using 0019 polish), per-step evidence + validator/make links, blocked conditions, next safe action / blocked gates from the UX contracts (operator-journey + user-lifecycle), and references to T02 arch, T03 retrospective, console, 0019 artifacts. Also notes the need to update the console lifecycle-guide for 0019 polish. Pragmatic tracking used (progress, file notes). This fulfills T05 + feeds T06 alignment.

### T06 - Align The Control Surface With The Bootstrap Guide

```task
id: NET-WP-0018-T06
status: done
priority: high
state_hub_task_id: "9bba26b3-b1be-4e58-a18b-a0533683d63b"
```

Review the local security bootstrap UI against the guide. Improve the
automation grade where safe: replace passive checkboxes with safe validators,
convert fragile copy-paste sequences into scripts, persist non-secret progress
durably, expose repair routines for known drift cases, and keep manual steps
clear when human custody or secret handling is required.

Done when the UI guides the same sequence as the bootstrap guide and makes
wrong-order execution visibly hard.

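One way to make wrong-order execution visibly hard is to derive a single "next safe action" from an ordered step list and mark every later step as blocked. A minimal sketch follows; the step names are illustrative placeholders loosely based on the guide's topics, and `next_safe_action` and `step_states` are hypothetical helpers, not functions from the console.

```python
from typing import Dict, List, Optional, Tuple

# Illustrative ordering only; the authoritative sequence lives in the guide.
STEPS: List[str] = [
    "prerequisites",
    "credential_bundle",
    "privacyidea_bootstrap",
    "lldap_bootstrap_user",
    "keycape_client",
    "openbao_init",
    "oidc_admin_binding",
    "token_cleanup",
]


def next_safe_action(done: Dict[str, bool]) -> Tuple[Optional[str], List[str]]:
    """Return (first incomplete step, steps marked done after that step)."""
    next_step: Optional[str] = None
    out_of_order: List[str] = []
    for step in STEPS:
        if done.get(step, False):
            if next_step is not None:
                out_of_order.append(step)  # done, but an earlier step is open
        elif next_step is None:
            next_step = step
    return next_step, out_of_order


def step_states(done: Dict[str, bool]) -> Dict[str, str]:
    """Label every step as done, next, blocked, or out-of-order."""
    next_step, out_of_order = next_safe_action(done)
    states: Dict[str, str] = {}
    for step in STEPS:
        if step in out_of_order:
            states[step] = "out-of-order"
        elif done.get(step, False):
            states[step] = "done"
        elif step == next_step:
            states[step] = "next"
        else:
            states[step] = "blocked"
    return states
```

Rendering only the "next" step as actionable, and showing "out-of-order" steps as warnings, keeps the operator on the guide's sequence without hiding what has already happened.
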
**2026-06-03 (0019 polish delivered):** Control surface now includes (in status,
available actions, parser, dispatch, runbook_payloads, web-ui capable):
- onboarding-dry-run-template / validate-onboarding-dry-run
- onboarding-dry-run (delegates to sso-mfa/k8s/lldap/dry-run-nonroot-user.sh)
- onboarding-dry-run-claims (uses print_dry_run_oidc_claims_verification, warns on
  platform-root/admins groups)
- lifecycle-cleanup-dryrun-users (pattern offboard)
- lifecycle-guide (with T06 section)
- make targets: security-bootstrap-onboarding-dry-run (SUBJECT/EMAIL/DISPLAY),
  security-bootstrap-lifecycle-cleanup-dryrun-users PATTERN=...,
  security-bootstrap-*-validate-onboarding-dry-run etc.

The orchestrator (dry-run-nonroot-user.sh) uses a /tmp workspace + EXIT trap,
prefers env/k8s for LLDAP_ADMIN_PASS (k8s fallback added to create-user.sh),
runs create --test, runs verifications (check-user-mfa, verify-openbao-client),
optionally performs a GraphQL lock/offboard, populates /tmp/.../evidence.json
from the template + live jq data, then runs validate. Non-secret only. This
fulfills much of the "convert fragile copy-paste into scripts", "persist
non-secret progress", and "expose repair" goals for the user-lifecycle slice.
Full alignment awaits the T05 guide + more validators in T08 (e.g. for OIDC
client, OpenBao config). See the 0019 workplan for details; the lifecycle_guide
T06 section needs a refresh to deprecate the old secret-mkdir path.

**2026-06-03:** Started T06 (after T05 guide complete). Per T05 recs and plan. Review console/make against the new smooth-bootstrap-guide.md + T02/T03. Will refresh the console lifecycle_guide T06 DRY-RUN section to prefer the 0019 orchestrator/make (deprecate old manual secret path); ensure status/actions reference the new guide; leverage existing 0019 validators for "replace passive with validators"; make wrong-order hard via next-safe/blocked in guide + console. Using pragmatic tracking. Small targeted updates to console.py (print_lifecycle_guide) and perhaps Makefile/docs refs.

**2026-06-03:** T06 complete. Aligned control surface to T05 smooth-bootstrap-guide.md:
- Refreshed print_lifecycle_guide T06 DRY-RUN section in console.py to use 0019 orchestrator + make + script + new guide (no more old manual secret steps).
- Enhanced print_status: added "Follow the NET-WP-0018 Smooth Bootstrap Guide" section with doc ref + entrypoint (lifecycle-guide / make); updated available actions list to note guide for #9.
- Updated workplan T06 description note and added completion. Status done.
- UI now explicitly guides to the sequence in the doc and makes the path clear (status points to guide for full flows; blocked/evidence from prior + 0019 validators help prevent wrong-order execution).
- Uses pragmatic tracking throughout.

This fulfills "UI guides same sequence as the bootstrap guide and makes wrong-order visibly hard" for the current control surface (console + make + runbooks + evidence). Further work (T07 tests, T08 more validators) will strengthen it.

### T07 - Add Automated Tests For Bootstrap UI Sections And Runbooks

```task
id: NET-WP-0018-T07
status: done
priority: high
state_hub_task_id: "c412d9e0-a2ca-4849-b6ee-bd4450b5a4a5"
```

For each task section and runbook exposed in the control surface, add automated
tests that validate the implementation contract.

Use a layered approach:

- static/unit tests for UI payload generation and command card presence;
- shell/Python syntax tests for generated helper scripts;
- dry-run or fixture tests for validators and state transitions; and
- live-cluster checks gated behind explicit operator environment variables.

Done when every visible bootstrap section has at least one automated test that
would fail if the section disappears, emits the wrong command, or reports an
impossible state.

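The first and last layers can be sketched with the standard library alone. This is an illustration under stated assumptions: the `runbook_payloads` function here is a local stand-in with made-up content (the real builder lives in the console), and the `NETKINGDOM_LIVE_CHECKS` environment variable is a hypothetical name for the operator opt-in.

```python
import os
import unittest


def runbook_payloads() -> dict:
    """Stand-in for the console's payload builder (hypothetical content)."""
    return {
        "onboarding-dry-run": {
            "command": "make security-bootstrap-onboarding-dry-run",
        },
    }


class StaticPayloadTests(unittest.TestCase):
    """Static layer: fails if a section disappears or its command changes."""

    def test_dry_run_runbook_present(self):
        self.assertIn("onboarding-dry-run", runbook_payloads())

    def test_dry_run_emits_expected_command(self):
        command = runbook_payloads()["onboarding-dry-run"]["command"]
        self.assertEqual(command, "make security-bootstrap-onboarding-dry-run")


@unittest.skipUnless(
    os.environ.get("NETKINGDOM_LIVE_CHECKS") == "1",
    "live-cluster checks need NETKINGDOM_LIVE_CHECKS=1",
)
class LiveClusterTests(unittest.TestCase):
    """Live layer: skipped unless the operator opts in explicitly."""

    def test_placeholder_live_check(self):
        # A real test would invoke one of the verify-*.sh scripts here.
        self.assertTrue(True)


def run_suite() -> unittest.TestResult:
    """Run both layers and return the aggregated result."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(StaticPayloadTests),
        loader.loadTestsFromTestCase(LiveClusterTests),
    ])
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Gating the live layer at class level keeps CI runs hermetic while letting an operator run the same file against a cluster by setting one variable.
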
**Note (NET-WP-0019 polish):** Include tests for the user-lifecycle dry-run (T06 from 0017/0019): the orchestrator script, onboarding-dry-run console command, claims verification (T05), cleanup helper, and evidence validators. See the NET-WP-0019 workplan and sso-mfa/k8s/lldap/dry-run-nonroot-user.sh. This cross-links the T06-adjacent polish into 0018's automation goals.

See also `docs/user-engine-netkingdom-integration-assessment.md` for the broader intent/scope fit, gaps (esp. adapters), and recommendations. (The 0019 artifacts -- script, console subcmds, make targets, runbook entry, templates/validators -- are now the concrete implementation to cover with the layered tests in T07.)

**2026-06-03:** Started T07. Using pragmatic tracking. Adding layered tests per spec: Python pytest for console (templates, runbooks incl. dry-run T06, posture validators), shell syntax for scripts, fixture-style for evidence validators. Will create tests/ dir + Makefile target. Include 0019 items as noted.

**2026-06-03:** T07 complete. Added:
- tools/security-bootstrap-console/tests/test_security_bootstrap_console.py (pytest, 8 tests: templates have fields esp. 0019 dry-run, runbook_payloads has T06 entry, audit_core_posture_ready with samples, etc.)
- Makefile: security-bootstrap-console-test (pytest), security-bootstrap-scripts-syntax (bash -n for dry-run, create-user, etc.), added to .PHONY and lists.
- Tests cover console logic for UI sections, runbooks, validators per T07 spec + 0019 note.
- Ran: pytest passes.
- Pragmatic tracking: progress, workplan notes, commit.

This ensures tests would fail if sections disappear or are wrong (e.g. no dry-run in runbooks, missing template fields).

### T08 - Integrate Validations Into The UI State Model

```task
id: NET-WP-0018-T08
status: done
priority: high
state_hub_task_id: "32f05fb1-269c-421c-ae34-57d2ceb7e47a"
```

Make the current setup prove itself through the same validations the UI shows.
Where possible, compute `ok`, `fail`, `err`, or `nil` from validators rather
than relying only on manual confirmation.

Important targets include KeyCape client config, privacyIDEA realm/resolver,
LLDAP user/group membership, Authelia/KeyCape route health, OpenBao OIDC auth
config, token policy proof, audit status, restore evidence, and State Hub sync.

Done when the UI can distinguish success, failure, error, and unknown states
for the critical bootstrap gates and the live setup satisfies those checks.

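The four-state result can be sketched as a thin wrapper around a validator script. The mapping below is an assumption made for illustration: exit code 0 means `ok`, any other exit code means `fail`, a timeout or launch error means `err`, and a missing script means `nil` (`None` in Python). `run_validator` and `gate_status` are hypothetical names, not the console's functions.

```python
import subprocess
from pathlib import Path
from typing import Optional, Sequence


def run_validator(command: Sequence[str], timeout: float = 30.0) -> Optional[str]:
    """Map a validator command to 'ok', 'fail', 'err', or None (unknown).

    command[0] is expected to be a path to a script or executable.
    """
    if not command or not Path(command[0]).exists():
        return None  # validator unavailable: the state is unknown, not failed
    try:
        completed = subprocess.run(
            list(command), capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired):
        return "err"  # the check itself could not run to completion
    return "ok" if completed.returncode == 0 else "fail"


def gate_status(metadata_flag: bool, validator_result: Optional[str]) -> str:
    """Gate is done if the manual flag or the validator says so."""
    if metadata_flag or validator_result == "ok":
        return "done"
    if validator_result in ("fail", "err"):
        return validator_result  # surface why the gate is not done
    return "unknown"
```

Keeping "script missing" distinct from "script failed" matters on a fresh machine: an operator without cluster access should see an unknown gate, not a red one.
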
**2026-06-03 (0019 contribution):** Dry-run specific validators now exist:
onboarding_dry_run_template() + require_evidence_fields match + make
security-bootstrap-validate-onboarding-dry-run (calls console which runs
print_validate_onboarding_dry_run or equivalent, checks all *_true bools,
actor_class, groups, no secret markers, effective_access_summary etc.).
Console status/metadata shows many gates as "done" from prior evidence-driven
flags (e.g. platform_reopened, cleanup_complete, oidc_login_verified). The
evidence_validator_gate and build_gates logic support computing ok/fail from
live evidence rather than manual confirmation. Extend this pattern to other
T08 targets (KeyCape client, privacyIDEA realm, LLDAP membership, OpenBao OIDC,
Authelia routes, State Hub sync). 0019 also added claims verification as a hook
that can feed validation (infers from LLDAP groups + T01 role binding, surfaces
warnings). Use the dry-run orchestrator + /tmp evidence as a repeatable
fixture for these validators. See assessment for UE-side validation targets
once adapters land (e.g. claims_enrichment projection).

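The evidence checks described in this note could look roughly like the sketch below. The boolean field names come from the workplan notes; the `groups` key, the example actor class, and the `validate_dry_run_evidence` helper are assumptions for illustration and may not match the real template.

```python
from typing import Any, Dict, List

# Boolean evidence fields named in the workplan notes.
REQUIRED_TRUE = (
    "effective_access_before_save",
    "no_secret_material_recorded",
    "lldap_identity_verified",
    "keycape_oidc_claims_verified",
)
# Group a non-root dry-run user must not hold.
FORBIDDEN_GROUP = "net-kingdom-admins"


def validate_dry_run_evidence(evidence: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the evidence passes."""
    problems: List[str] = []
    for field in REQUIRED_TRUE:
        # Require a literal True so "yes" or 1 cannot pass by accident.
        if evidence.get(field) is not True:
            problems.append(f"{field} must be true")
    if evidence.get("actor_class") in (None, "king"):
        problems.append("actor_class must be set and must not be 'king'")
    if FORBIDDEN_GROUP in evidence.get("groups", []):
        problems.append(f"non-root user must not be in {FORBIDDEN_GROUP}")
    if not evidence.get("effective_access_summary"):
        problems.append("effective_access_summary is missing")
    return problems
```

Returning a problem list instead of a bare boolean lets the same function feed both a gate state (empty list means ok) and an actionable message for the operator.
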
**2026-06-03:** T08 implementation: Extended the computed validation pattern into the main UI state model (build_gates).
- Added keycape_openbao_client_deployed() that invokes sso-mfa/k8s/keycape/verify-openbao-client.sh (live check) when possible.
- Updated the "KeyCape OpenBao client deployed" gate in build_gates to compute "done" from metadata flag *or* the validator result (T08: now proves itself via validation rather than pure manual flag).
- Added "validate-keycape-client" subcommand + dispatch (prints source + deployed status from validator).
- Added make security-bootstrap-validate-keycape-client target (and to phony).
- T07 tests + console-test cover the related code paths.
- This makes the status "Gates" section reflect validator output for a key target (KeyCape client); the pattern can be extended to LLDAP/privacyIDEA/Authelia/OpenBao config checks using similar kubectl/verify scripts (see sso-mfa/k8s/verify-t*.sh and keycape/verify-*.sh).
- Console status now shows more "proof" from validations. Updated workplan note.
- See also smooth-bootstrap-guide.md for how UI validations fit the sequence.

### T09 - Assess Scratch-Rebuild Risk And Define A Rehearsal Plan

```task
id: NET-WP-0018-T09
status: todo
priority: high
state_hub_task_id: "a9e60fd5-fac6-46e9-bc63-b2979cca548e"
```

Review the resulting architecture, guide, automation, tests, and live
validation coverage. Produce a risk assessment for restarting the NetKingdom
infrastructure from scratch.

The assessment should classify each risk by likelihood, impact, detection
method, mitigation, and remaining human interaction. It should also recommend
whether the next rebuild should be a full teardown, an isolated parallel
cluster rehearsal, a namespace-level rehearsal, or a scripted dry run.

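The five classification fields map naturally onto a small record type. The sketch below is one possible shape, not the assessment format: `RebuildRisk`, the three-level scale, and the likelihood-times-impact score are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List

# Assumed three-level ordinal scale for likelihood and impact.
LEVELS = {"low": 1, "medium": 2, "high": 3}


@dataclass(frozen=True)
class RebuildRisk:
    name: str
    likelihood: str         # "low" | "medium" | "high"
    impact: str             # "low" | "medium" | "high"
    detection: str          # how the problem would be noticed
    mitigation: str         # what reduces likelihood or impact
    human_interaction: str  # what still needs an operator

    @property
    def score(self) -> int:
        """Simple likelihood x impact product used only for ordering."""
        return LEVELS[self.likelihood] * LEVELS[self.impact]


def rank(risks: List[RebuildRisk]) -> List[RebuildRisk]:
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)
```

Entries would be filled from the T03 gap matrix once T09 starts. A product score is only an ordering aid; the written assessment still has to justify each rating.
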
**2026-06-03 (post 0017/0019 + assessment):** Rebuild risk assessment (T09) will
be informed by: T02 arch (incl. UE integration points/gaps), T03 retrospective
(capturing what was fragile vs. now automated via console/evidence/orchestrator),
T05 guide + evidence per step, T07 tests, T08 live validations (current metadata
shows S6 reopen with many flags true, but adapter gaps remain). From the assessment:
- IAM-orchestration bootstrap (creds via creds-init skill, LLDAP/Keycloak direct,
  OpenBao via KeyCape OIDC) is repeatable and rehearsable today with 0019 tooling.
- Full UE-backed user facts in rebuild: blocked until net-kingdom-specific
  adapters (IdentityClaimsAdapter from KeyCape claims, AuthorizationCheckPort to
  flex-auth, SecretProvider OpenBao, EventOutbox, AuditWriter, MembershipFactExporter)
  are implemented (primarily in user-engine per contract; NK orchestrates).
- Other: direct LLDAP in paths (create-user, keycape) must route via the
  claims_enrichment adapter once adapters exist, to avoid drift. Bootstrap users
  (platform-root etc.) stay IAM-side or seed externally_provisioned in UE.
  Recommend: T09 classify "UE integration" as a separate risk item with
  mitigation "implement adapters + NK wiring + update dry-run to exercise UE
  projection"; the current 0019 dry-run proves the IAM-lifecycle contract. The
  creds-init skill (in .claude/commands) provides an automated cred bootstrap
  entrypoint for rehearsal. A live destructive rebuild remains a non-goal.

## Acceptance Criteria

- `NET-WP-0015` is closed, archived, or explicitly reconciled with remaining
  work owned elsewhere.
- `docs/NetkingdomRuntimeArchitecture.md` documents the real deployed runtime.
- A bootstrap retrospective and automation gap matrix exists.
- Associated repository boundaries are reviewed and updated or tracked with
  follow-up work.
- A smooth bootstrap guide describes the intended sequence and evidence.
- The control surface follows the guide and uses safe automation wherever
  appropriate.
- Every bootstrap UI section and runbook has automated coverage.
- The live setup passes the integrated validations or reports actionable
  failures.
- A scratch-rebuild risk assessment recommends the next rehearsal strategy.

## Non-Goals

- Do not perform a destructive live rebuild as part of this workplan.
- Do not move secret material into Git, State Hub, or the bootstrap UI.
- Do not hide remaining human custody decisions behind automation.
- Do not collapse repository ownership boundaries merely for convenience.