---
id: NET-WP-0017
type: workplan
title: "IT Security Readiness For User Onboarding"
domain: netkingdom
repo: net-kingdom
status: active
owner: codex
topic_slug: netkingdom
created: "2026-05-26"
updated: "2026-06-01"
depends_on:
- NET-WP-0015
- NET-WP-0016
- RAIL-PL-WP-0002
state_hub_workstream_id: "385de708-fd59-4bab-a4f4-28c1c476b3ea"
---

# NET-WP-0017 - IT Security Readiness For User Onboarding

## Goal

Finish the remaining NetKingdom and Railiance security setup needed before
ordinary platform users, tenant admins, or fabric admins are onboarded.

`NET-WP-0015` established the king credential, OpenBao bootstrap ceremony, and
guided control surface. This workplan is the narrower finish-line plan: routine
admin access must use NetKingdom identity, bootstrap-era material must be
retired or explicitly accepted, audit/recovery posture must be credible, and a
first non-root onboarding dry run must prove the lifecycle model.

## Current Evidence

- `platform-root` exists in LLDAP, belongs to `net-kingdom-admins`, has MFA,
and completed KeyCape OIDC login.
- Railiance OpenBao is initialized, unsealed, and post-unseal verified.
- OpenBao initial configuration was applied; `platform/` KV and Kubernetes auth
exist.
- The initial OpenBao root token is recorded as revoked.
- Trial unseal shares were rotated.
- The KeyCape `openbao-admin` client is live and verified, including the public
`https://kc.coulomb.social` route and certificate.
- OpenBao OIDC auth configuration is applied; MFA-backed OpenBao admin login
completed successfully and the resulting token lookup showed the
`platform-admin` policy for `platform-root`.
- Declarative local OpenBao audit and authenticated audit visibility are
complete; enterprise durable tenant-aware audit retention has been split into
the standalone `audit-core` product. Residual taint closeout,
cleanup/rotation, and the first ordinary-user onboarding dry run are still
pending.

## Tasks

### T01 - Finish OIDC-Backed OpenBao Admin Login

```task
id: NET-WP-0017-T01
status: done
priority: high
state_hub_task_id: "9b087bbd-631b-4316-b94d-a8265a05b065"
```

Run the fixed OpenBao OIDC helper, record the non-secret completion flag, then
verify `platform-root` can complete:

```bash
bao login -method=oidc -path=keycape role=platform-admin
```

The verification must prove the resulting OpenBao token has the intended
`platform-admin` policy without relying on the initial root token or a manually
minted temporary operator token.

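
The policy check can be scripted so the proof does not depend on reading
terminal output by eye. The following is a minimal sketch, not the project's
verifier: `has_policy` is a hypothetical helper name, and it assumes the token
lookup is requested with `-format=json` so that policies appear in a `policies`
array. It uses plain text tools to stay dependency-free; a JSON parser such as
`jq` would be more robust.

```bash
# has_policy POLICY: reads token-lookup JSON on stdin and succeeds only if
# POLICY appears in the "policies" array. Hypothetical helper for illustration.
has_policy() {
  policy="$1"
  # Strip whitespace so the array sits on one line, isolate the array,
  # then look for the exact quoted policy name inside it.
  tr -d ' \n\t' \
    | grep -o '"policies":\[[^]]*]' \
    | grep -qF -- "\"${policy}\""
}

# Intended use after an MFA-backed OIDC login (not executed here):
#   bao token lookup -format=json | has_policy platform-admin && echo "policy present"
```

Restricting the match to the `policies` array keeps a `platform-admin` value in
token metadata from being mistaken for an attached policy.
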
**2026-05-29:** DNS and ACME issuance for `kc.coulomb.social` are healthy:
cert-manager issued `kc-tls`, and `sso-mfa/k8s/keycape/verify-openbao-client.sh`
passes against the live KeyCape route. `configure-openbao-oidc.sh` has applied
the OpenBao `auth/keycape` OIDC configuration and `platform-admin` role. The
remaining T01 gate is the human browser login with MFA and a token lookup that
shows the expected OpenBao `platform-admin` policy.

**2026-06-01:** Added a guided console recovery action for the observed
privacyIDEA state-loss blocker: if the live instance lacks the `coulomb` realm,
LLDAP resolver, or self-service policies, the operator can run **Repair
privacyIDEA realm and self-service** from **Usecases & Runbooks**. The action
does not store secrets; it calls `repair-realm-live.sh`, prompts live, creates
temporary env files for `bootstrap-realm.sh`, removes them on exit, and then
runs `verify-t06.sh`. After repair, `platform-root` TOTP
enrollment/re-enrollment and the MFA-backed `bao login` proof are still
required.

**2026-06-01:** Fixed the follow-up OpenBao OIDC token exchange
`user not found` error caused by live `keycape-config` drift: the Secret had
lost the non-secret LLDAP lookup fields `userOU: ou=people` and
`groupOU: ou=groups`. The KeyCape live patch helper now enforces those fields
alongside the `openbao-admin` client, the live Secret was patched, KeyCape was
restarted, and `verify-openbao-client.sh` passes again.

**2026-06-01:** Deployed a KeyCape runtime lookup fix for the remaining
`user not found` token-exchange failure after config drift was ruled out. The
LDAP adapter now treats provisioning metadata validation failures as runtime
warnings instead of blocking token issuance for an otherwise resolved LLDAP
user. The patched image `main-runtime-lookup-0601` is live and
`verify-openbao-client.sh` passes after rollout.

**2026-06-01:** Deployed the follow-up KeyCape OIDC nonce fix after OpenBao
rejected the exchanged ID token with `invalid id_token nonce`. KeyCape now
persists the original authorization `nonce` through pending state and the
authorization-code session, then emits it in the ID token. The patched image
`main-nonce-0601` is live, reports 1/1 ready, and `verify-openbao-client.sh`
passes after rollout.

**2026-06-01:** Fixed the next OpenBao role configuration failure,
`error converting claim 'groups' to string`. KeyCape correctly emits `groups`
as an array for `groups_claim`; OpenBao only failed because the role also copied
that array through scalar `claim_mappings`. The helper now leaves groups in
`groups_claim`/`bound_claims` and maps only scalar `email` and
`preferred_username` metadata.

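
To make the shape of that fix concrete, here is a hedged sketch of a role
payload that keeps the array-valued `groups` claim out of `claim_mappings`. It
is not the contents of `configure-openbao-oidc.sh`. The function name
`render_platform_admin_role`, the `user_claim` value, the redirect URI, the
bound group `net-kingdom-admins`, and the metadata key names on the right-hand
side of `claim_mappings` are all assumptions chosen for illustration.

```bash
# render_platform_admin_role: prints a JSON role payload on stdout.
# Assumptions (not taken from the real helper): user_claim, the redirect URI,
# the bound group value, and the metadata key names in claim_mappings.
render_platform_admin_role() {
  cat <<'JSON'
{
  "role_type": "oidc",
  "user_claim": "sub",
  "groups_claim": "groups",
  "bound_claims": { "groups": ["net-kingdom-admins"] },
  "claim_mappings": {
    "email": "email",
    "preferred_username": "username"
  },
  "allowed_redirect_uris": ["http://localhost:8250/oidc/callback"],
  "token_policies": ["platform-admin"]
}
JSON
}

# One possible way to apply it, assuming the CLI accepts a JSON body on stdin:
#   render_platform_admin_role | bao write auth/keycape/role/platform-admin -
```

The point of the layout is that `groups` appears only where a list is allowed
(`groups_claim` and `bound_claims`), while `claim_mappings` carries scalar
claims only.
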
**2026-06-01:** The operator reached the OpenBao success page, "Signed in via
your OIDC provider", after reapplying the corrected role. The follow-up
terminal proof showed `token_policies`/`policies` containing `platform-admin`,
`token_meta_role: platform-admin`, and `token_meta_username: platform-root`.
T01 is closed; the pasted short-lived token should be treated as disclosed and
revoked or allowed to expire after the check.

### T02 - Close OpenBao Audit And Recovery Production Gates

```task
id: NET-WP-0017-T02
status: in_progress
priority: high
state_hub_task_id: "909944bd-843a-4a63-8c87-536cea052a88"
```

Resolve the remaining OpenBao production-trust gates:

- configure audit declaratively if API-managed audit remains rejected;
- record the interim Audit Core interface used before enterprise durable audit
retention is implemented;
- hand off durable tenant-aware audit shipping beyond the audit PVC to
`audit-core`;
- retain non-secret restore-drill evidence and repeat the drill if any
material changed;
- record emergency seal/unseal drill evidence; and
- identify the next independent escrow holder for moving beyond temporary
single-king custody.

**2026-06-01:** Started the OpenBao audit/recovery closeout. Railiance source
now has a declarative OpenBao file-audit stanza in
`helm/openbao-values.yaml`, and its initial-config helper now verifies
`bao audit list` instead of trying to create audit devices through the API.
The Railiance post-unseal verifier also warns when
`/openbao/audit/openbao-audit.log` is missing or empty. Live non-secret
checks still show OpenBao healthy and unsealed with Bound data/audit PVCs, but
the live Helm values do not yet include the declarative audit stanza and the
audit directory is empty. Do not move production secrets into OpenBao until a
planned Helm rollout is performed with unseal shares available, `file/` audit
is visible, an audit log is written, durable audit shipping beyond the PVC is
selected, and restore/emergency drill evidence plus a next escrow holder are
recorded.

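
The missing-or-empty warning can be illustrated with a small shell check. This
is a sketch of the idea only, not the Railiance post-unseal verifier;
`check_audit_log` and its return codes are hypothetical.

```bash
# check_audit_log PATH: prints a status line and returns non-zero when the
# audit log is missing (1) or empty (2). Hypothetical helper for illustration.
check_audit_log() {
  log_path="$1"
  if [ ! -e "$log_path" ]; then
    echo "WARN: audit log missing: $log_path"
    return 1
  elif [ ! -s "$log_path" ]; then
    echo "WARN: audit log empty: $log_path"
    return 2
  fi
  echo "OK: audit log present and non-empty: $log_path"
}

# Intended use inside the OpenBao pod (not executed here):
#   check_audit_log /openbao/audit/openbao-audit.log
```

Separate return codes let a caller tell "audit device never wrote" apart from
"audit path not mounted", which call for different follow-up.
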
**2026-06-01:** Completed the attended live rollout of the Railiance
declarative file-audit configuration. The Helm release was upgraded, the
`OnDelete` StatefulSet pod was deliberately recycled, the operator unsealed the
new pod, and `make openbao-verify-post-unseal` now reports OpenBao `2.5.4`,
`Sealed: false`, an audit directory, and a non-empty
`/openbao/audit/openbao-audit.log`. The Railiance source now pins the live
OpenBao image tag to `2.5.4` after the chart upgrade advanced the runtime from
`2.5.3`; a follow-up Helm revision 3 applied the explicit tag while the pod
remained ready. T02 remains open for the authenticated `bao audit list` proof,
durable audit shipping beyond the audit PVC, restore-drill evidence, emergency
seal/unseal drill evidence, and the next independent escrow holder.

**2026-06-01:** Added a Railiance evidence-only helper for the authenticated
OpenBao proof: `make openbao-verify-authenticated` prompts for an approved
OpenBao token without echoing it and verifies `file/` audit visibility,
`platform/` secrets, `kubernetes/` auth, `keycape/` auth, and a non-empty audit
log without mutating OpenBao configuration. The helper can also reuse a
still-valid pod token helper with
`OPENBAO_VERIFY_AUTH_ARGS=--use-token-helper`, avoiding token movement through
the local shell. It is ready to run with the MFA-backed
`platform-root`/`platform-admin` path. Durable audit shipping remains open; the
audit PVC is not a durable sink and non-secret evidence hashes or State Hub
notes are not substitutes for retained audit log custody.

**2026-06-01:** Completed the authenticated OpenBao proof through the
MFA-backed KeyCape path without printing token material. A fresh
`bao login -no-print -method=oidc -path=keycape role=platform-admin` browser
flow cached the pod token helper, then `make openbao-verify-authenticated
OPENBAO_VERIFY_AUTH_ARGS=--use-token-helper` passed. Evidence: OpenBao is
unsealed on `2.5.4`, `file/` audit is visible, `platform/` secrets are visible,
`kubernetes/` and `keycape/` auth methods are visible, and the audit log grew
from 7969 bytes to 23330 bytes during the check. The cached verifier token was
then revoked with `bao token revoke -self`. T02 remains open for durable audit
shipping beyond the audit PVC, restore-drill evidence, emergency seal/unseal
drill evidence, and the next independent escrow holder.

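
The audit-log growth figure is a simple before/after size comparison. A minimal
sketch of how such evidence could be captured follows; `file_size_bytes` and
`log_grew` are hypothetical helper names, not part of
`make openbao-verify-authenticated`.

```bash
# file_size_bytes PATH: prints the size of PATH in bytes.
file_size_bytes() {
  wc -c < "$1" | tr -d ' '
}

# log_grew BEFORE AFTER: succeeds only when AFTER is strictly larger than
# BEFORE, and prints the growth in bytes.
log_grew() {
  before="$1"
  after="$2"
  [ "$after" -gt "$before" ] || return 1
  echo $((after - before))
}

# With the sizes recorded above:
#   log_grew 7969 23330   # prints 15361
```

Recording only byte counts keeps the evidence non-secret while still showing
that the authenticated requests were audited.
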
**2026-06-01:** Split enterprise audit retention out of this task and into the
new standalone `/home/worsch/audit-core` repo. `audit-core` now has
`INTENT.md`, a product requirements definition, and a minimal replaceable mock
backend that writes JSONL audit events to
`/tmp/audit-core/audit-YYYYMMDDTHH.jsonl` and cleans up files older than seven
days. A smoke event for the OpenBao authenticated readiness proof was written
through the mock interface, and `audit-core` tests pass. This mock backend is
acceptable for bootstrap/development wiring and NetKingdom UI integration, but
it is not durable audit custody and must not be presented as enterprise
retention. NET-WP-0017-T02 now treats the full tenant-aware durable audit
fabric as an `audit-core` follow-up rather than an OpenBao bootstrap subtask.
Remaining T02 gates are restore-drill evidence, emergency seal/unseal drill
evidence, the next independent escrow holder, and an explicit risk note if
ordinary onboarding proceeds before the production Audit Core sink exists.

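
The behaviour described for the mock backend (hourly JSONL files plus
seven-day cleanup) can be sketched in a few lines of shell. This is a
hypothetical stand-in, not the `audit-core` implementation: the function names,
the `AUDIT_DIR` variable, and the use of UTC for the hour bucket are
assumptions.

```bash
# Hypothetical stand-in for the mock backend; the real audit-core interface
# is not shown in this workplan.
AUDIT_DIR="${AUDIT_DIR:-/tmp/audit-core}"

# write_audit_event JSON_LINE: appends one event to the current hourly file.
write_audit_event() {
  mkdir -p "$AUDIT_DIR"
  # Hour bucket such as audit-20260601T14.jsonl (UTC is an assumption).
  bucket="$AUDIT_DIR/audit-$(date -u +%Y%m%dT%H).jsonl"
  printf '%s\n' "$1" >> "$bucket"
}

# cleanup_audit_files: deletes hourly files last modified seven or more days ago.
cleanup_audit_files() {
  [ -d "$AUDIT_DIR" ] || return 0
  # -mtime +6 matches files whose age in whole days is at least 7.
  find "$AUDIT_DIR" -type f -name 'audit-*.jsonl' -mtime +6 -exec rm -f {} +
}
```

A sink like this loses everything when `/tmp` is cleared, which is why it suits
wiring and UI integration but not audit custody.
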
**2026-06-01:** Tightened the restore-drill evidence gate. The local bootstrap
metadata currently says `restore_drill_passed: true`, but that checkbox alone
does not preserve enough non-secret evidence for review. Railiance now has a
restore evidence JSON template and `make openbao-validate-restore-evidence`
validator that checks for snapshot hashes, encrypted-snapshot hash/location,
isolated restore completion, unseal/status/test-secret verification, isolated
environment destruction, and `no_secret_material_recorded`. The NetKingdom
control surface now includes a **Validate restore drill evidence** runbook
card. T02 should not count the restore gate closed until a real non-secret
evidence file from the prior or repeated drill passes that validator.

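
As an illustration of what such a validator checks, the sketch below tests an
evidence file for required keys. It is not the Railiance validator: apart from
`no_secret_material_recorded`, every key name here is hypothetical and stands in
for the corresponding check listed above, and the function only tests for key
presence rather than value quality.

```bash
# validate_restore_evidence FILE: lists problems and returns 1 when a required
# key is absent. Key names other than no_secret_material_recorded are
# hypothetical placeholders for the checks named in this workplan.
validate_restore_evidence() {
  evidence="$1"
  missing=0
  for key in snapshot_sha256 encrypted_snapshot_sha256 \
             encrypted_snapshot_location isolated_restore_completed \
             unseal_verified status_verified test_secret_verified \
             isolated_environment_destroyed no_secret_material_recorded; do
    if ! grep -q "\"$key\"[[:space:]]*:" "$evidence"; then
      echo "missing: $key"
      missing=1
    fi
  done
  # The no-secret attestation must be explicitly true, not merely present.
  if ! grep -Eq '"no_secret_material_recorded"[[:space:]]*:[[:space:]]*true' "$evidence"; then
    echo "not attested: no_secret_material_recorded"
    missing=1
  fi
  return "$missing"
}
```

Listing every missing key in one pass, rather than stopping at the first,
makes a failed review easier to act on.
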
### T03 - Close Trial Taint And Retire Bootstrap Admin Paths

```task
id: NET-WP-0017-T03
status: todo
priority: high
state_hub_task_id: "a6cd4325-8f3b-46bb-b810-ca816c35cb29"
```

Review all access paths created during the trial exposure, and record the
compromise response as complete only after the operator has rotated, revoked,
reset, or explicitly accepted residual risk for each of the following:

- temporary OpenBao `platform-admin` tokens;
- bootstrap/root-token-derived paths;
- early LLDAP/Authelia/KeyCape admin credentials;
- local plaintext secret workspaces;
- bootstrap service tokens; and
- any copied command output or local shell history that may contain secret
values.

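
For the last item, a scan that reports where token-like strings occur without
echoing them can help the review. The following is a rough sketch under stated
assumptions: `scan_for_tokens` is a hypothetical helper, and the default
pattern (the `hvs.` and `s.` prefixes used by Vault-style service tokens) is an
assumption that should be adjusted to whatever the deployment actually issues.
A clean result from a pattern scan is not proof that no secret was copied.

```bash
# Pattern for token-like strings. The prefixes are an assumption; override
# TOKEN_PATTERN to match what your deployment issues.
if [ -z "${TOKEN_PATTERN:-}" ]; then
  TOKEN_PATTERN='(^|[^A-Za-z0-9])(hvs|s)\.[A-Za-z0-9_-]{20,}'
fi

# scan_for_tokens FILE...: prints "file:line" for each suspicious line and
# never prints the matched text. Returns 0 if anything was found, 1 otherwise.
scan_for_tokens() {
  hits=$(grep -nHE -- "$TOKEN_PATTERN" "$@" 2>/dev/null | cut -d: -f1,2)
  [ -n "$hits" ] || return 1
  printf '%s\n' "$hits"
}

# Intended use (not executed here):
#   scan_for_tokens ~/.bash_history
```

Printing only file and line number avoids copying the secret into yet another
terminal buffer or log during the cleanup itself.
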
### T04 - Harden Bootstrap Infrastructure Before User Onboarding

```task
id: NET-WP-0017-T04
status: todo
priority: high
state_hub_task_id: "12c31f76-68f4-4d2b-853a-f3185cfc761c"
```

Complete the minimum hardening before ordinary users are onboarded:

- restrict direct administrative access to LLDAP and privacyIDEA to approved
operator networks or tunnels;
- verify no privileged login path bypasses MFA for platform-admin authority;
- rotate or reset bootstrap-era database, admin, and service credentials that
were created before custody was established;
- confirm host/workload checks and vulnerability scans are run or explicitly
deferred with owner/date; and
- update the bootstrap console state to `cleanup_complete` only when these
checks are recorded.

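
One input to the MFA-bypass check is the list of enabled auth mounts: any mount
beyond the expected ones is a login path that needs review. The sketch below
flags unexpected mounts. `unexpected_auth_paths` and `EXPECTED_AUTH_PATHS` are
hypothetical names; `kubernetes/` and `keycape/` come from the evidence in this
workplan, and `token/` is the built-in token backend. It does not by itself
prove that the expected paths enforce MFA.

```bash
# Auth mounts expected on this OpenBao instance (see lead-in for sources).
EXPECTED_AUTH_PATHS="token/ kubernetes/ keycape/"

# unexpected_auth_paths: reads one mount path per line on stdin and prints
# every path not in EXPECTED_AUTH_PATHS. Returns 1 if any were printed.
unexpected_auth_paths() {
  found=0
  while IFS= read -r mount; do
    [ -n "$mount" ] || continue
    case " $EXPECTED_AUTH_PATHS " in
      *" $mount "*) ;;
      *) echo "$mount"; found=1 ;;
    esac
  done
  return "$found"
}

# Intended use, assuming jq is available (not executed here):
#   bao auth list -format=json | jq -r 'keys[]' | unexpected_auth_paths
```

An allowlist fails closed: a newly enabled auth method shows up as unexpected
until someone deliberately adds it.
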
### T05 - Implement First User Lifecycle Operator Flow

```task
id: NET-WP-0017-T05
status: todo
priority: high
state_hub_task_id: "aec3ac45-18be-4b04-a863-0c8c70693739"
```

Turn the documented user lifecycle UX into the first practical operator flow
for:

- onboarding a scoped non-root user;
- temporarily locking that user;
- permanently offboarding that user;
- reviewing credentials and MFA state; and
- creating a fabric/tenant admin without platform-root authority.

The flow can begin as console/UI action cards, but it must show effective
access before saving and must not expose secrets.

### T06 - Run A Non-Root Onboarding Dry Run

```task
id: NET-WP-0017-T06
status: todo
priority: high
state_hub_task_id: "c149b2f0-c9ee-4c95-a1df-b25ed0d20579"
```

Create a test or first real non-root user using the new lifecycle flow. Verify:

- LLDAP identity and groups;
- MFA enrollment through privacyIDEA;
- KeyCape OIDC claims;
- expected application or platform scope;
- no platform-root or OpenBao root authority;
- lock/offboard path can be exercised or simulated; and
- non-secret audit/progress evidence is recorded.

This is the final gate before declaring the platform ready for normal user
onboarding.

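
The "no platform-root or OpenBao root authority" check in the dry run can be
expressed as a deny-list test on the policies of the new user's token. This is
a minimal sketch with hypothetical helper names (`list_policies`,
`assert_unprivileged`); it assumes token-lookup JSON with a `policies` array is
supplied on stdin, and it covers only OpenBao policies, not LLDAP groups or
KeyCape claims.

```bash
# Policies a scoped non-root user must never hold. root is the built-in
# superuser policy; platform-admin is the admin policy named in this workplan.
DENIED_POLICIES="root platform-admin"

# list_policies: reads token-lookup JSON on stdin and prints one policy per
# line. Returns non-zero when no "policies" array is found.
list_policies() {
  array=$(tr -d ' \n\t' | grep -o '"policies":\[[^]]*]') || return 1
  printf '%s\n' "$array" \
    | sed -e 's/^"policies":\[//' -e 's/]$//' \
    | tr ',' '\n' \
    | tr -d '"'
}

# assert_unprivileged: succeeds only if the policies could be read and none
# of them is on the deny list.
assert_unprivileged() {
  policies=$(list_policies) || { echo "FAIL: no policies array in input"; return 2; }
  for denied in $DENIED_POLICIES; do
    if printf '%s\n' "$policies" | grep -qxF -- "$denied"; then
      echo "FAIL: token holds denied policy: $denied"
      return 1
    fi
  done
  echo "OK: no denied policy present"
}
```

Treating unreadable input as a failure keeps the check from passing simply
because the lookup returned nothing.
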
### T07 - Review And Retire Superseded Bootstrap Workplans

```task
id: NET-WP-0017-T07
status: todo
priority: medium
state_hub_task_id: "e9ceafb2-14c0-4352-9ac7-e31628feb045"
```

After T01-T06 complete, review `NET-WP-0015`, `NET-WP-0016`,
`RAIL-PL-WP-0002`, and older NetKingdom credential/bootstrap workplans.
Mark completed work finished or archived, and leave only longer-horizon items
such as multi-custodian upgrade, enterprise federation, dynamic database
credentials, object-storage STS vending, and application onboarding contracts.

## Acceptance Criteria

- Routine OpenBao administration works through NetKingdom/KeyCape OIDC and MFA.
- The initial root token and temporary OpenBao admin tokens are not normal
operating paths.
- Audit, recovery, emergency seal, and restore evidence are recorded without
secret values.
- Bootstrap-era privileged credentials have been rotated, reset, revoked, or
explicitly accepted as residual risk.
- A non-root user onboarding dry run succeeds and proves lock/offboard/review
paths.
- The bootstrap console can honestly move beyond Admin Identity Integration
into cleanup and reopening.