diff --git a/workplans/CUST-WP-0021-multi-host-repo-paths.md b/workplans/CUST-WP-0021-multi-host-repo-paths.md
new file mode 100644
index 0000000..ba509d4
--- /dev/null
+++ b/workplans/CUST-WP-0021-multi-host-repo-paths.md
@@ -0,0 +1,180 @@
+---
+id: CUST-WP-0021
+type: workplan
+title: "State Hub — Multi-Host Repo Path Hardening"
+domain: custodian
+status: active
+owner: custodian
+topic_slug: custodian
+created: "2026-03-18"
+updated: "2026-03-18"
+---
+
+# State Hub — Multi-Host Repo Path Hardening
+
+## Summary
+
+When a kaizen-agentic worker on COULOMBCORE calls file-system-touching MCP
+tools (validate_repo_adr, check_repo_consistency, ingest_sbom), the MCP server
+runs those scripts on its own machine (bnt-lap001) against its own copy of the
+repo. Two problems emerge:
+
+1. **Wrong path**: `validate_repo_adr` and `ingest_sbom_tool` take raw
+   filesystem paths as input. A remote agent passes their local path
+   (e.g. `/home/tegwick/the-custodian`) which does not exist on the server.
+
+2. **Stale state**: Even when path resolution works, the server runs against
+   its own checkout. A remote agent ahead on a branch gets misleading results.
+
+## Design Boundary (documented, not fixed)
+
+The MCP server is a subprocess on bnt-lap001. It can only read files from
+bnt-lap001's filesystem. File-system tools will always operate against the
+server's copy. The correct workflow for remote agents working on an ahead
+branch is: push/sync first, or run consistency_check.py locally with
+`--api-base http://127.0.0.1:18000` rather than via the MCP tool.
+This rule is documented in TOOLS.md.
+
+## Resolves
+
+- validate_repo_adr raw-path input (broken for remote callers)
+- ingest_sbom_tool raw lockfile path (same problem)
+- host_paths empty for all repos except kaizen-agentic
+- No error when server lacks the repo — silent wrong results
+
+---
+
+## Tasks
+
+### T01 — Register COULOMBCORE host_paths for all repos it has
+
+```task
+id: CUST-WP-0021-T01
+status: todo
+priority: high
+```
+
+COULOMBCORE hostname: `254.130.205.92.host.secureserver.net`
+Repos present at `/home/tegwick/`:
+
+| repo slug | COULOMBCORE path |
+|-----------|-----------------|
+| the-custodian | /home/tegwick/the-custodian |
+| kaizen-agentic | /home/tegwick/kaizen-agentic |
+| ops-bridge | /home/tegwick/ops-bridge |
+| marki-docx | /home/tegwick/marki-docx |
+| railiance-cluster | /home/tegwick/railiance-cluster |
+| railiance-infra | /home/tegwick/railiance-infra |
+
+Also register bnt-lap001 paths for all repos currently using only `local_path`
+(migrate them into `host_paths` so the map is the canonical source):
+
+| repo slug | bnt-lap001 path |
+|-----------|----------------|
+| the-custodian | /home/worsch/the-custodian |
+| kaizen-agentic | /home/worsch/kaizen-agentic |
+| ops-bridge | /home/worsch/ops-bridge |
+| activity-core | /home/worsch/activity-core |
+| markitect-project | /home/worsch/markitect_project |
+| railiance-apps | /home/worsch/railiance-apps |
+| railiance-cluster | /home/worsch/railiance-cluster |
+| railiance-bootstrap | /home/worsch/railiance-cluster |
+| railiance-enablement | /home/worsch/railiance-enablement |
+| railiance-hosts | /home/worsch/railiance-infra |
+| railiance-infra | /home/worsch/railiance-infra |
+| railiance-platform | /home/worsch/railiance-platform |
+| key-cape | /home/worsch/key-cape |
+| net-kingdom | /home/worsch/net-kingdom |
+
+Use `POST /repos/{slug}/paths/` with `{"host": "", "path": ""}`.
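The T01 registration step could be scripted roughly as follows. This is a minimal sketch: the endpoint shape and the `host`/`path` payload keys come from the task text, while the base URL (borrowed from the `--api-base` example), the helper names `build_path_request` and `register_paths`, and the absence of any authentication are assumptions.

```python
from __future__ import annotations

import json
import urllib.request

# Assumption: the State Hub API listens where the --api-base example points.
API_BASE = "http://127.0.0.1:18000"


def build_path_request(slug: str, host: str, path: str) -> urllib.request.Request:
    """Build (but do not send) the POST that registers one host path."""
    body = json.dumps({"host": host, "path": path}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/repos/{slug}/paths/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def register_paths(host: str, paths: dict[str, str]) -> None:
    """Send one registration request per repo slug (hypothetical helper)."""
    for slug, path in paths.items():
        with urllib.request.urlopen(build_path_request(slug, host, path)) as resp:
            print(slug, resp.status)


# Two of the COULOMBCORE rows from the T01 table, as example input.
COULOMBCORE_PATHS = {
    "the-custodian": "/home/tegwick/the-custodian",
    "ops-bridge": "/home/tegwick/ops-bridge",
}
```

Keeping request construction separate from sending makes it possible to inspect the payloads before anything is written to the State Hub.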
+
+---
+
+### T02 — Fix validate_repo_adr: accept repo_slug, resolve path from DB
+
+```task
+id: CUST-WP-0021-T02
+status: todo
+priority: high
+```
+
+Change `validate_repo_adr(repo_path: str, ...)` to
+`validate_repo_adr(repo_slug: str, ...)` in `mcp_server/server.py`.
+
+Resolution logic (same as `_kaizen_agents_dir()`):
+1. Fetch repo record via `_get(f"/repos/{repo_slug}")`
+2. `hostname = socket.gethostname()`
+3. `path = host_paths.get(hostname) or repo.get("local_path") or ""`
+4. If path is empty or not a directory: return a clear error message with
+   instructions for remote agents to run the script locally.
+5. Pass the resolved path to the validate_repo_adr.py subprocess as before.
+
+Update the tool docstring to document the new parameter and the design
+boundary (tool always runs against the server's copy).
+
+---
+
+### T03 — Fix ingest_sbom_tool: resolve lockfile via repo_slug + relative path
+
+```task
+id: CUST-WP-0021-T03
+status: todo
+priority: medium
+```
+
+Change `ingest_sbom_tool(repo_slug, lockfile_path: str)` so `lockfile_path`
+becomes optional and is interpreted as **relative to the repo root** when
+provided (not absolute). When omitted, the script auto-detects the lockfile
+(existing behaviour with `--repo-path`).
+
+Resolution logic:
+1. Fetch repo record, resolve path via `host_paths[hostname]` / `local_path`
+2. If path empty/missing: return clear error
+3. If `lockfile_path` provided and is relative: join with resolved repo root
+4. If `lockfile_path` is absolute: use as-is (backward compat), but emit a
+   deprecation warning in the result string
+5. Pass `--repo-path ` and optionally `--lockfile ` to script
+
+---
+
+### T04 — Add host-path guard to check_repo_consistency
+
+```task
+id: CUST-WP-0021-T04
+status: todo
+priority: medium
+```
+
+`check_repo_consistency` already resolves paths correctly via the script.
+But when `host_paths` is empty and `local_path` is absent the script silently
+skips file checks.
+Add a pre-flight guard in the MCP tool:
+
+1. Fetch the repo record before spawning the subprocess
+2. Run `resolve_repo_path` logic: `host_paths[hostname]` → `local_path`
+3. If empty: return an early error message:
+   ```
+   ⚠ No path registered for this host (bnt-lap001).
+   Register with: update_repo_path(repo_slug, "/path/to/repo")
+   Remote agents: run consistency_check.py locally with --api-base http://127.0.0.1:18000
+   ```
+4. If path is set but directory doesn't exist: same error (not just empty string)
+
+---
+
+### T05 — Document design boundary in TOOLS.md
+
+```task
+id: CUST-WP-0021-T05
+status: todo
+priority: low
+```
+
+Add a section to `state-hub/mcp_server/TOOLS.md` under a new heading
+"## Multi-Host & Remote Agent Usage" that explains:
+
+- File-sys tools (validate_repo_adr, check_repo_consistency, ingest_sbom)
+  always execute on the MCP server machine against its registered path
+- Remote agents on a different branch/ahead-of-server should sync first
+  OR run the scripts locally with `--api-base http://127.0.0.1:18000`
+- How to register a new host's path: `update_repo_path(slug, path, host)`
+- Pure-API tools (get_state_summary, create_task, etc.) work from any host
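The path resolution shared by T02, T03 and T04 can be sketched as a small set of helpers. This is an illustration under assumptions, not the code in `mcp_server/server.py`: the helpers take an already-fetched repo record as a plain dict, the `host_paths` and `local_path` keys follow the task descriptions, and `preflight_error`, `resolve_lockfile`, and the exact message wording are hypothetical.

```python
from __future__ import annotations

import os
import socket


def resolve_repo_path(repo: dict, hostname: str | None = None) -> str:
    """Return this host's checkout path for a repo record, or "" if none."""
    hostname = hostname or socket.gethostname()
    host_paths = repo.get("host_paths") or {}
    # Prefer the per-host entry, then fall back to the legacy local_path.
    return host_paths.get(hostname) or repo.get("local_path") or ""


def preflight_error(repo: dict, hostname: str | None = None) -> str | None:
    """Return an error message if the repo is unusable on this host, else None."""
    hostname = hostname or socket.gethostname()
    path = resolve_repo_path(repo, hostname)
    # An unset path and a path that is not a directory are treated alike (T04 step 4).
    if not path or not os.path.isdir(path):
        return (
            f"No usable path registered for this host ({hostname}). "
            "Remote agents: run the script locally with --api-base."
        )
    return None


def resolve_lockfile(
    repo_root: str, lockfile_path: str | None
) -> tuple[str | None, str | None]:
    """Return (lockfile path or None, deprecation warning or None) per T03."""
    if not lockfile_path:
        return None, None  # omitted: let the script auto-detect the lockfile
    if os.path.isabs(lockfile_path):
        return lockfile_path, "absolute lockfile_path is deprecated"
    return os.path.join(repo_root, lockfile_path), None
```

Passing `hostname` explicitly keeps the helpers testable on any machine; in the tools themselves it would default to `socket.gethostname()` as described in T02.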