---
id: CUST-WP-0012
type: workplan
title: "Multi-User Onboarding and Environment Bootstrap"
domain: custodian
repo: state-hub
status: finished
owner: custodian
topic_slug: custodian
state_hub_workstream_id: "a28d9e29-4119-4b73-9469-f921920253ef"
created: "2026-03-11"
updated: "2026-05-23"
---
# Multi-User Onboarding and Environment Bootstrap
## Goal
Make the Custodian system accessible to collaborators beyond the primary
operator. A new user (or a new machine for the existing operator) should
be able to go from zero to a productive Claude Code session with full
State Hub MCP connectivity in a single session, without manual steps or
undocumented tribal knowledge.
## Context
Several friction points surfaced during the 2026-03-11 session:
- No SSH key for Railiance01 on WSL2 → blocked `make tunnel-loop`
- No `~/.railiance_gitea.conf` → blocked repo creation script
- Token missing `read:user` scope → blocked org repo creation
- No `git credential.helper` → credentials required on every push
- MCP registration is manual and documented only in `CLAUDE.md`

Each of these is a solved problem in isolation. This workstream collects
them into a repeatable, documented bootstrap experience.
## Scope
Two personas:
| Persona | Access level | Typical machine |
|---------|-------------|-----------------|
| Primary operator | Full access, all domains | WSL2 workstation |
| Domain collaborator | Read + write to one domain | COULOMBCORE, remote laptop |
## Tasks
### T01 — Git credential.helper for Gitea access
```task
id: CUST-WP-0012-T01
state_hub_task_id: 71628269-9a75-4dae-a347-e64a86040322
status: done
priority: medium
```
Document and automate `git credential.helper` setup for Gitea
(`http://92.205.130.254:32166`). Recommend `libsecret` (keyring-backed)
on machines that support it; fall back to `credential.helper=store`
(persistent, plaintext `~/.git-credentials`) on headless servers.
Include in bootstrap script (T04) and onboarding guide (T05).
```bash
# Choose ONE of the helpers below; each `git config` call overwrites the previous one.

# Preferred: libsecret (GNOME keyring, WSL2 with keyring daemon).
# Building the helper also needs make and a C compiler (e.g. build-essential).
sudo apt-get install -y libsecret-1-0 libsecret-1-dev
sudo make -C /usr/share/doc/git/contrib/credential/libsecret
git config --global credential.helper \
  /usr/share/doc/git/contrib/credential/libsecret/git-credential-libsecret

# Fallback: store (persistent, plaintext ~/.git-credentials; headless servers only)
git config --global credential.helper store

# Headless server alternative: cache (in-memory, 1h timeout)
git config --global credential.helper 'cache --timeout=3600'
```
**Done when:** included in bootstrap script; push to Gitea works without
re-entering credentials on second attempt.

**Implemented 2026-05-23:** `scripts/bootstrap-env.sh` configures a global
credential helper when one is not already present. It prefers `libsecret`, uses
`cache --timeout=3600` as the safe automatic fallback, and supports explicit
headless plaintext storage via `--git-helper store --allow-plaintext-store`.
`docs/onboarding.md` documents the tradeoffs.

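
The selection order described in the implementation note can be sketched as follows. This is a minimal illustration, not the contents of `scripts/bootstrap-env.sh`: the function names, the `LIBSECRET_HELPER` variable, and the argument convention are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the helper-selection order; names are illustrative.
LIBSECRET_HELPER="${LIBSECRET_HELPER:-/usr/share/doc/git/contrib/credential/libsecret/git-credential-libsecret}"

# Prints the value to pass to `git config credential.helper`.
# $1: requested helper ("auto", "libsecret", "cache", or "store")
# $2: "yes" only if the caller explicitly accepted plaintext storage
choose_credential_helper() {
  local requested="${1:-auto}" allow_plaintext="${2:-no}"
  case "$requested" in
    store)
      if [ "$allow_plaintext" != "yes" ]; then
        echo "refusing plaintext store without explicit opt-in" >&2
        return 1
      fi
      echo "store"
      ;;
    cache)
      echo "cache --timeout=3600"
      ;;
    libsecret)
      if [ ! -x "$LIBSECRET_HELPER" ]; then
        echo "libsecret helper not built at $LIBSECRET_HELPER" >&2
        return 1
      fi
      echo "$LIBSECRET_HELPER"
      ;;
    auto)
      # Prefer libsecret when the helper binary exists; otherwise use the cache.
      if [ -x "$LIBSECRET_HELPER" ]; then
        echo "$LIBSECRET_HELPER"
      else
        echo "cache --timeout=3600"
      fi
      ;;
    *)
      echo "unknown helper: $requested" >&2
      return 2
      ;;
  esac
}

# Applies the choice only when no global helper is set, so re-runs change nothing.
configure_credential_helper() {
  if git config --global --get credential.helper >/dev/null 2>&1; then
    return 0
  fi
  local helper
  helper="$(choose_credential_helper "$@")" || return 1
  git config --global credential.helper "$helper"
}
```

Calling `configure_credential_helper auto` on a machine without the libsecret helper would set the in-memory cache, while `configure_credential_helper store yes` corresponds to the explicit plaintext opt-in.
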
---
### T02 — SSH key generation and authorization automation
```task
id: CUST-WP-0012-T02
state_hub_task_id: fea965e9-8a8f-439c-9096-8f7756eb71ed
status: done
priority: medium
```
Script or Ansible task that:
1. Generates an `ed25519` key pair on the new machine if none exists
2. Displays the public key with copy instructions
3. Authorizes it on all managed hosts (Railiance01, COULOMBCORE) via
   `ssh-copy-id` or Ansible `authorized_key` module

Surfaced by: RAIL-PL-WP-0001 T01, where the missing SSH key on WSL2 blocked
`make tunnel-loop HOST=tegwick@92.205.62.239`.
```bash
# Generate a key pair if none exists (empty passphrase; add one later with `ssh-keygen -p`)
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[[ -f ~/.ssh/id_ed25519 ]] || ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -N ""
# Authorize on each target host (needs a working password or key login to that host once)
ssh-copy-id -i ~/.ssh/id_ed25519.pub tegwick@92.205.62.239
ssh-copy-id -i ~/.ssh/id_ed25519.pub tegwick@92.205.130.254
```
**Done when:** included in bootstrap script; documented in onboarding guide.

**Implemented 2026-05-23:** `scripts/bootstrap-env.sh` generates
`~/.ssh/id_ed25519` if missing, prints the public key, and can run
`ssh-copy-id` for Railiance01 and CoulombCore with `--authorize-ssh`.
`docs/onboarding.md` documents the operator and collaborator path.

---
### T03 — Claude Code MCP registration automation
```task
id: CUST-WP-0012-T03
state_hub_task_id: 60318e9a-972e-45c8-afde-82ed0625f594
status: done
priority: medium
```
Automate the state-hub MCP server registration on a new machine.
Currently this is a multi-step manual process documented in
`~/.claude/CLAUDE.md`. It should be a single `make` target or script:
```bash
# In /home/worsch/state-hub/
make register-mcp # idempotent; safe to re-run
```
The script should:
1. Detect whether `state-hub` is already in `~/.claude.json`
2. Use the current SSE MCP config (`http://127.0.0.1:8001/sse` locally or
`http://127.0.0.1:18001/sse` through ops-bridge)
3. Run `claude mcp add-json -s user state-hub <config>`
4. Print instructions to restart Claude Code

The script should also detect whether the state hub is reachable directly
(`http://127.0.0.1:8000`) or only through a tunnel (via ops-bridge), and emit
a warning if neither is available.

**Done when:** `make register-mcp` works on a clean machine; documented
in onboarding guide.

**Implemented 2026-05-23:** `scripts/register-mcp.sh` and the
`make register-mcp` target register the current SSE MCP transport
idempotently. The script detects local/tunnel reachability, supports
`MCP_URL`, `API_BASE`, and `DRY_RUN=1`, and documents the old `.mcp.json` cwd
patch path as legacy.

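
The registration flow listed above might look roughly like the sketch below. It is not the actual `scripts/register-mcp.sh`. The function names, the `{"type":"sse","url":...}` payload shape, probing the MCP ports (8001 and 18001) rather than the API port, and falling back to the local URL with a warning are all assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of an idempotent MCP registration; names are illustrative.
MCP_NAME="state-hub"
LOCAL_URL="http://127.0.0.1:8001/sse"    # direct, from the task description
TUNNEL_URL="http://127.0.0.1:18001/sse"  # through ops-bridge

# Succeeds if the server name already appears in the Claude config file.
is_registered() {
  local config_file="${1:-$HOME/.claude.json}"
  [ -f "$config_file" ] && grep -q "\"$MCP_NAME\"" "$config_file"
}

# Builds the JSON payload for an SSE transport (payload shape is an assumption).
mcp_config_json() {
  printf '{"type":"sse","url":"%s"}' "$1"
}

# Succeeds if a TCP connection to the given localhost port opens within 2 s.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/127.0.0.1/$1" 2>/dev/null
}

# Prints the URL to register: explicit MCP_URL override, then local, then tunnel.
pick_mcp_url() {
  if [ -n "${MCP_URL:-}" ]; then
    echo "$MCP_URL"
  elif port_open 8001; then
    echo "$LOCAL_URL"
  elif port_open 18001; then
    echo "$TUNNEL_URL"
  else
    echo "warning: state hub not reachable locally or via tunnel" >&2
    echo "$LOCAL_URL"
  fi
}

# Registers the server unless it is already present; DRY_RUN=1 only prints.
register_mcp() {
  if is_registered; then
    echo "$MCP_NAME already registered; nothing to do"
    return 0
  fi
  local url config
  url="$(pick_mcp_url)"
  config="$(mcp_config_json "$url")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: claude mcp add-json -s user $MCP_NAME '$config'"
  else
    claude mcp add-json -s user "$MCP_NAME" "$config"
    echo "Restart Claude Code to pick up the new MCP server."
  fi
}
```

Running `DRY_RUN=1 register_mcp` prints the command that would be executed without touching `~/.claude.json`, which is a cheap way to check the chosen URL before registering.
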
---
### T04 — Environment bootstrap script
```task
id: CUST-WP-0012-T04
state_hub_task_id: 84a94761-e424-4470-a9a2-64d9cabadb7f
status: done
priority: high
```
Single idempotent script: `scripts/bootstrap-env.sh`
Checks/installs prerequisites and configures the environment:
| Step | What |
|------|------|
| Prerequisites | git, sops, age, helm, kubectl, uv, claude CLI |
| Git credential | `credential.helper` (libsecret, `cache` fallback, or opt-in `store`) |
| SSH key | Generate ed25519 if missing; display public key |
| MCP registration | `make register-mcp` (T03) |
| Gitea config | Prompt for token; write `~/.railiance_gitea.conf` |
| Health check | `curl /state/health`; warn if tunnel needed |

Design constraints:

- Idempotent: safe to run on an already-configured machine
- No silent failures: each step prints ✓ / ✗ / ⚠
- Minimal dependencies: bash + curl only to get started

**Done when:** running the script on a clean Ubuntu 24.04 machine
produces a working Custodian environment with no additional manual steps.

**Implemented 2026-05-23:** `scripts/bootstrap-env.sh` and
`make bootstrap-env` provide the idempotent entrypoint. It supports dry-run,
non-interactive mode, optional apt package installation, SSH authorization,
Gitea token prompting, MCP registration, and State Hub health checks.

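
The two design constraints "idempotent" and "no silent failures" can be illustrated with a small step runner. This is a sketch under assumed names (`ok`, `warn`, `fail`, `check_prereq`, `run_step`, `FAILED`), not an excerpt from `scripts/bootstrap-env.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the ✓ / ✗ / ⚠ reporting convention; names are illustrative.
FAILED=0

ok()   { printf '✓ %s\n' "$1"; }
warn() { printf '⚠ %s\n' "$1"; }
fail() { printf '✗ %s\n' "$1"; FAILED=$((FAILED + 1)); }

# Reports whether a required command is available on PATH.
check_prereq() {
  if command -v "$1" >/dev/null 2>&1; then
    ok "$1 found"
  else
    fail "$1 missing"
  fi
}

# Runs a step only when its guard says the work is still needed.
# $1: label
# $2: guard command string (succeeds if the step is already done)
# $3: action command string (performs the step)
run_step() {
  local label="$1" guard="$2" action="$3"
  if eval "$guard" >/dev/null 2>&1; then
    ok "$label (already configured)"
  elif eval "$action" >/dev/null 2>&1; then
    ok "$label"
  else
    fail "$label"
  fi
}
```

A caller might loop `check_prereq` over `git sops age helm kubectl uv claude`, wrap each configuration step in `run_step`, and exit non-zero at the end when `FAILED` is greater than zero. The guard and action strings are passed through `eval`, so they should come only from the script itself, never from user input.
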
---
### T05 — Onboarding guide and user journey documentation
```task
id: CUST-WP-0012-T05
state_hub_task_id: b0839802-659a-475b-8b84-ab7341ea3d15
status: done
priority: medium
```
Write `docs/onboarding.md` in this repository covering the full journey
for both personas:

**Primary operator (new machine):**

1. Prerequisites (git, SSH client)
2. Clone `state-hub` and the relevant domain repository
3. Run `make bootstrap-env` (T04)
4. Restart Claude Code → verify MCP is active
5. First session: `get_state_summary()` → orient → work

**Domain collaborator (new person):**

1. Prerequisites + Gitea account
2. `ssh-copy-id` to get access to Railiance01 (or just COULOMBCORE)
3. Set up ops-bridge tunnel to reach state hub
4. Clone domain repo
5. First Claude Code session with MCP via tunnel
6. Contributing a workplan (ADR-001 convention)

**Done when:** a new collaborator can follow the guide without
clarification from the primary operator.

**Implemented 2026-05-23:** `docs/onboarding.md` covers primary operator and
domain collaborator journeys, including SSH, Gitea token file, credential
helper choices, MCP registration, tunnel setup, and verification checks.

---
### T06 — State Hub multi-user model (deferred)
```task
id: CUST-WP-0012-T06
state_hub_task_id: d5df3302-67b9-4765-a8d8-ea2df53dff6e
status: done
priority: low
```
Design a lightweight user/role model for the state hub:
| Role | Permissions |
|------|-------------|
| Primary operator | Full read/write, all domains |
| Domain collaborator | Read all; write to own domain only |
| Observer | Read-only |

Decision needed: enforce at the API layer (HTTP Basic / token auth per
domain) or rely on Gitea repo permissions as the authoritative boundary
(simpler, since the hub is a read model anyway).

**Deferred until:** first external collaborator is actively onboarding.
Implement T01 through T05 first; multi-user access control is only needed when
there is more than one user.

**Implemented 2026-05-23:** `docs/multi-user-access-model.md` records the
current decision: repo permissions, SSH access, tunnels, and OpenBao remain the
authoritative boundaries for this phase; State Hub API auth is deferred until a
real second-user or exposed-deployment trigger exists.

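
Should API-layer enforcement be picked up later, the role table reduces to a small predicate. The sketch below only illustrates that table. The role identifiers (`operator`, `collaborator`, `observer`) and the `is_allowed` function are hypothetical and are not part of State Hub.

```bash
#!/usr/bin/env bash
# Hypothetical permission check mirroring the role table; not implemented in State Hub.
# $1: role (operator | collaborator | observer)
# $2: action (read | write)
# $3: the user's own domain (used only for collaborator writes)
# $4: the domain being written to
is_allowed() {
  local role="$1" action="$2" own_domain="${3:-}" target_domain="${4:-}"
  case "$role:$action" in
    operator:read|operator:write)
      return 0 ;;                      # full read/write, all domains
    collaborator:read|observer:read)
      return 0 ;;                      # everyone may read
    collaborator:write)
      # Writes are limited to the collaborator's own domain.
      [ -n "$own_domain" ] && [ "$own_domain" = "$target_domain" ] ;;
    *)
      return 1 ;;                      # observer writes and unknown roles are denied
  esac
}
```

For example, `is_allowed collaborator write custodian custodian` succeeds, while the same call with a different target domain fails.
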
---
## References
- ops-bridge repo: `ops-bridge` (tunnel lifecycle management)
- MCP registration: `~/.claude/CLAUDE.md` (current manual procedure)
- Gitea repo creation: `railiance-cluster/tools/create_railiance_repo.sh`
- ADR-001: workplans as repo artefacts
- Surfaced by: RAIL-PL-WP-0001 T01 execution, 2026-03-11