the-custodian/memory/working/260224-wishlist-custodian.md
2026-05-02 00:46:07 +02:00


Custodian Wishlist — Project Registration UX

Date: 2026-02-24 | Context: first project registration (markitect_project)


Baseline: Token Cost of This Session

Registering markitect with the custodian required significant exploratory work. Rough token estimates (input + output combined):

| Activity | ~Tokens | Avoidable? |
|----------|---------|------------|
| Failed settings.json edit → full schema dump in error | 12,000 | Yes |
| Reading settings.local.json (irrelevant historical allow rules) | 4,000 | Yes |
| Reading mcp_server/server.py in full to understand tools | 3,000 | Partially |
| Exploring API via OpenAPI spec + state/summary | 2,500 | Partially |
| Discovering ~/.claude.json as user-scope MCP store | 1,500 | Yes |
| Manual cwd fix after claude mcp add-json dropped it | 800 | Yes |
| Small bash outputs, listings, failed mcp list/get calls | 2,000 | Yes |
| Actual registration work (CLAUDE.md write, progress event) | 1,500 | |
| Total | ~27,300 | |
| Irreducible minimum (if everything was documented) | ~1,500 | |
| Potential reduction | ~18× | |

The dominant waste was a single failed tool call (wrong config file) that returned ~12K tokens of JSON schema. Second largest: reading files that were not relevant to the registration task.


Wishlist Items

W1 — Document user-scope MCP config location in global CLAUDE.md (quick win, ~5 min)

Problem: The global CLAUDE.md instructs agents to call get_state_summary() but says nothing about where the MCP server is registered or how an agent should verify or bootstrap it. An agent unfamiliar with Claude Code internals will try settings.json first (wrong), triggering a full schema dump.

Fix: Add one paragraph to ~/.claude/CLAUDE.md:

The state-hub MCP server is registered at user scope in ~/.claude.json (key: mcpServers). To check: python3 -c "import json; d=json.load(open('$HOME/.claude.json')); print(d.get('mcpServers', {}).keys())". To register from scratch: claude mcp add-json -s user state-hub '<json>'. The JSON config is in /home/worsch/the-custodian/.mcp.json. The cwd field must be added manually after registration (known claude bug).

Impact: Avoiding the schema dump alone saves ~12K tokens.
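The one-line check in the paragraph above can also be written as a small helper, which is easier to reuse from a registration script. This is a minimal sketch: the function name `registered_mcp_servers` is hypothetical, and the only thing assumed about ~/.claude.json is the top-level `mcpServers` key named above.

```python
import json
from pathlib import Path


def registered_mcp_servers(config_path: Path) -> list[str]:
    """Return the server names found under the top-level "mcpServers" key.

    Returns an empty list when the file is missing or the key is absent.
    """
    if not config_path.exists():
        return []
    config = json.loads(config_path.read_text(encoding="utf-8"))
    return sorted(config.get("mcpServers", {}))


if __name__ == "__main__":
    names = registered_mcp_servers(Path.home() / ".claude.json")
    print("state-hub registered:", "state-hub" in names)
```

Returning a plain list rather than printing `dict_keys(...)` keeps the result easy to test and to branch on.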


W2 — make register-project in state-hub (high value, ~1 hr)

Problem: Registration requires: check API → get topic ID → verify MCP → write CLAUDE.md → log progress event. These are always the same steps.

Fix: A Makefile target (or a small shell script, scripts/register_project.sh):

make register-project PROJECT_PATH=/home/worsch/railiance DOMAIN=railiance
# or:
scripts/register_project.sh railiance /home/worsch/railiance

The script should:

  1. Verify the API is reachable (fail fast with a hint to run make api)
  2. Fetch the topic ID for DOMAIN from /topics/?status=active
  3. Check if state-hub is in ~/.claude.json; register if missing (include cwd fix)
  4. Write $PROJECT_PATH/CLAUDE.md from the template (W3)
  5. POST a progress event marking the registration

Token impact: Reduces agent-driven registration to a single Bash call. An agent simply invokes the script — no discovery needed.
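Step 2 is the only step whose logic is not obvious from the list, so here is a minimal sketch of it in Python. The endpoint path comes from the list above and the base URL from the env block in W6; the response shape (a JSON array of objects with `id` and `domain` keys) is an assumption, as are the function names.

```python
import json
import urllib.request

API_BASE = "http://127.0.0.1:8000"  # same value as API_BASE in the W6 env block


def fetch_active_topics(api_base: str = API_BASE) -> list[dict]:
    """GET /topics/?status=active and decode the JSON body."""
    url = f"{api_base}/topics/?status=active"
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def find_topic_id(topics: list[dict], domain: str) -> str:
    """Return the ID of the single topic for `domain`.

    Assumes each topic is an object with "id" and "domain" keys
    (hypothetical shape, not confirmed by the API).
    """
    matches = [t["id"] for t in topics if t.get("domain") == domain]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one active topic for {domain!r}, found {len(matches)}"
        )
    return matches[0]
```

Raising on zero or multiple matches keeps the script fail-fast, in the same spirit as step 1.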


W3 — CLAUDE.md template for new projects (quick win, ~15 min)

Problem: The project CLAUDE.md was written from scratch. Its content is identical for every project except for the project name, domain name, and topic ID.

Fix: Store a template at state-hub/scripts/project_claude_md.template:

# {PROJECT_NAME} — Claude Code Instructions

## Custodian State Hub Integration

This project is tracked as the **{DOMAIN}** domain in the Custodian State Hub.
Hub topic ID: `{TOPIC_ID}`

### Session Protocol
[... static content ...]

Used by register_project.sh (W2) with sed substitution.
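The note above names sed as the substitution mechanism. For comparison, the same fill can be sketched in Python, with the added check that no placeholder is left behind. The function name `render_template` is hypothetical; the placeholder names are the ones in the template.

```python
import re


def render_template(template: str, values: dict[str, str]) -> str:
    """Replace each {KEY} placeholder with its value; fail if any remain."""
    rendered = template
    for key, value in values.items():
        rendered = rendered.replace("{" + key + "}", value)
    # Any remaining {UPPER_CASE} token means a value was not supplied.
    leftover = re.findall(r"\{[A-Z_]+\}", rendered)
    if leftover:
        raise ValueError(f"unfilled placeholders: {sorted(set(leftover))}")
    return rendered


if __name__ == "__main__":
    template = "# {PROJECT_NAME}\nDomain: **{DOMAIN}**\nHub topic ID: `{TOPIC_ID}`\n"
    print(render_template(template, {
        "PROJECT_NAME": "markitect",
        "DOMAIN": "markitect",
        "TOPIC_ID": "5571d954-0d30-4950-980d-7bcaaad8e3e2",
    }))
```

The leftover check is the main reason to prefer this over a bare sed call: a missing value fails the registration instead of writing a CLAUDE.md with a literal `{TOPIC_ID}` in it.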


W4 — List topic IDs in canon project charters (low effort, future-proof)

Problem: An agent needs a live API call to get the topic ID, which requires the API to be running and adds latency/tokens.

Fix: Add a frontmatter line to each canon/projects/<domain>/project_charter_v0.1.md:

---
custodian_topic_id: 5571d954-0d30-4950-980d-7bcaaad8e3e2
domain: markitect
---

An agent can then grep for the topic ID without touching the API. The frontmatter also acts as a canonical cross-reference between the charter and the live state.
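Where grep is not convenient (for example inside the registration script), the frontmatter can be read with a few lines of Python. This is a minimal sketch that handles only flat `key: value` lines like the ones shown above; it is not a YAML parser, and the function name is hypothetical.

```python
def read_frontmatter(text: str) -> dict[str, str]:
    """Parse flat `key: value` lines between the leading `---` markers."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # no closing marker: treat the file as having no frontmatter


if __name__ == "__main__":
    charter = (
        "---\n"
        "custodian_topic_id: 5571d954-0d30-4950-980d-7bcaaad8e3e2\n"
        "domain: markitect\n"
        "---\n"
        "# Project charter\n"
    )
    print(read_frontmatter(charter)["custodian_topic_id"])
```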


W5 — MCP server tool reference card (medium effort, ~30 min)

Problem: An agent reads the full server.py to understand what tools are available. The file is ~350 lines and includes HTTP plumbing, not just tool signatures.

Fix: Add state-hub/mcp_server/TOOLS.md:

## Available Tools

| Tool | Key Args | When to use |
|------|----------|-------------|
| get_state_summary() | — | Session start; orientation |
| add_progress_event(summary, event_type, topic_id, detail) | summary required | Session end; any notable event |
| create_workstream(topic_id, title, owner) | | New workstream for a domain |
| create_task(workstream_id, title, priority) | | New task under a workstream |
| update_task_status(task_id, status) | status: todo/in_progress/blocked/done | |
| record_decision(title, decision_type, topic_id) | | Decisions made or pending |
| resolve_decision(decision_id, rationale, decided_by) | | Close a decision |

An agent can read this 20-line card instead of the full server file. Saves ~2,500 tokens per session that needs to discover the tools.


W6 — Note cwd drop as known issue in .mcp.json or Makefile (quick fix, 5 min)

Problem: claude mcp add-json silently drops the cwd field, causing the server to fail if run from a different working directory. This required a manual JSON patch to ~/.claude.json.

Fix (option A): Add a comment/warning in the state-hub Makefile or README:

claude mcp add-json drops cwd. After running, patch ~/.claude.json manually: python3 scripts/patch_mcp_cwd.py
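The patch script named in option A is not shown in this note. A minimal sketch of what it could look like follows; the function names are hypothetical, and the caller supplies the cwd value (presumably the state-hub directory, which is an assumption).

```python
import json
from pathlib import Path


def patch_cwd(config: dict, server: str, cwd: str) -> bool:
    """Set mcpServers[server]["cwd"]; return True if the config changed."""
    entry = config.get("mcpServers", {}).get(server)
    if entry is None:
        raise KeyError(f"{server!r} is not registered under mcpServers")
    if entry.get("cwd") == cwd:
        return False  # already patched, nothing to do
    entry["cwd"] = cwd
    return True


def patch_file(path: Path, server: str, cwd: str) -> bool:
    """Apply patch_cwd to a JSON config file, rewriting it only on change."""
    config = json.loads(path.read_text(encoding="utf-8"))
    changed = patch_cwd(config, server, cwd)
    if changed:
        path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return changed
```

Because the patch is idempotent, it is safe to run after every `claude mcp add-json` call, for instance as `patch_file(Path.home() / ".claude.json", "state-hub", "/home/worsch/the-custodian/state-hub")`.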

Fix (option B): Avoid -m mcp_server.server altogether — use the full absolute path to server.py and add the state-hub dir to PYTHONPATH in env:

"command": "/home/worsch/the-custodian/state-hub/.venv/bin/python",
"args": ["/home/worsch/the-custodian/state-hub/mcp_server/server.py"],
"env": { "PYTHONPATH": "/home/worsch/the-custodian/state-hub", "API_BASE": "http://127.0.0.1:8000" }

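This works without cwd. The custodian's .mcp.json currently launches the server with -m mcp_server.server, which depends on the working directory; switching to the absolute path plus a PYTHONPATH entry removes that dependency, so the dropped cwd no longer matters.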


W7 — Idempotent check-custodian hook or command in project sessions (longer term)

Problem: The global CLAUDE.md says "start every session with get_state_summary()" but there's no enforcement. An agent that hasn't read the global CLAUDE.md won't know.

Fix: Add a SessionStart hook to ~/.claude/settings.json:

"hooks": {
  "SessionStart": [{
    "hooks": [{
      "type": "command",
      "command": "curl -sf http://127.0.0.1:8000/state/health && echo 'State Hub: OK' || echo 'State Hub: OFFLINE — run: cd ~/the-custodian/state-hub && make api'",
      "statusMessage": "Checking Custodian State Hub…"
    }]
  }]
}

This doesn't call get_state_summary() (which requires MCP), but at least surfaces whether the API is up, so the agent knows before trying to use the tools.


W8 — First-class extension point entity in state hub schema (EP-CUST-001)

Problem: Extension points are currently recorded as progress_events with event_type=extension_point and structured detail JSON. This works for now but lacks dedicated querying, filtering, dashboard views, and cross-project aggregation.

Convention (current): EP-{PROJECT}-{NNN} IDs, inline markers in spec docs, ## Extension Points summary section at doc end, recorded via add_progress_event with event_type=extension_point.

Fix: When the number of tracked EPs across projects grows enough to warrant it, add a first-class extension_points table to the state-hub schema with fields: ep_id, topic_id, workstream_id, location, trigger, constraint, current_approach, status (open/resolved/superseded).

Constraint: Must be backward-compatible — existing progress_event records with event_type=extension_point should remain queryable or be migrated.

Trigger: When cross-project EP aggregation is needed in the dashboard, or when EPs need to be surfaced in get_state_summary().
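To make the proposed field list concrete, here is a minimal sketch of the table using Python's built-in sqlite3 module. The column names come from the Fix paragraph above; the column types, the choice of SQLite, and the helper name `open_db` are assumptions for illustration only, since this note does not say which database the state hub uses.

```python
import sqlite3

# Column names follow the field list above; types are assumed.
# "trigger" and "constraint" are quoted because both are SQL keywords.
SCHEMA = """
CREATE TABLE extension_points (
    ep_id            TEXT PRIMARY KEY,
    topic_id         TEXT NOT NULL,
    workstream_id    TEXT,
    location         TEXT,
    "trigger"        TEXT,
    "constraint"     TEXT,
    current_approach TEXT,
    status           TEXT NOT NULL DEFAULT 'open'
                     CHECK (status IN ('open', 'resolved', 'superseded'))
);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database holding the sketched table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = open_db()
    conn.execute(
        "INSERT INTO extension_points (ep_id, topic_id) VALUES (?, ?)",
        ("EP-CUST-001", "5571d954-0d30-4950-980d-7bcaaad8e3e2"),
    )
    print(conn.execute("SELECT ep_id, status FROM extension_points").fetchall())
```

The CHECK constraint enforces the three status values at the storage layer, so a malformed migration of existing progress_event records would fail loudly rather than store an unknown status.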


Priority Order

| # | Item | Effort | Token Saving | Do When |
|---|------|--------|--------------|---------|
| W1 | Document MCP config location in global CLAUDE.md | 5 min | 12K / registration | Now |
| W6 | Fix cwd drop (use absolute path + PYTHONPATH) | 10 min | 800 / registration | Now |
| W3 | CLAUDE.md template | 15 min | 500 / registration | Next custodian session |
| W2 | register_project.sh script + make register-project | 1 hr | 25K / registration | Before next project |
| W5 | TOOLS.md reference card | 30 min | 2,500 / session | Next custodian session |
| W4 | Topic IDs in canon charters | 20 min | 200 / registration | Low urgency |
| W7 | SessionStart hook for API health | 30 min | Indirect | After W2 |
| W8 | First-class EP entity in state hub (EP-CUST-001) | 24 hr | Indirect | When EP count warrants it |

2026-05-02 Update — Task-Flow Engine Lifecycle Model

CUST-WP-0035 replaced the old assumption that State Hub lifecycle movement is only a fixed set of status enums and hardcoded transition tables.

Current model:

  • Information objects expose a stored workstation label, still surfaced as status for API compatibility.
  • get_flow_state(entity_type, entity_id) reports reachable workstations, unreachable workstations, and blocking assertions.
  • advance_workstation(entity_type, entity_id, target_workstation) is the preferred lifecycle movement tool when a flow definition exists.
  • Direct status update tools remain useful for bootstrap, legacy workflows, and file-backed consistency sync, but they are no longer the conceptual center of lifecycle management.
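The check-then-advance pattern implied by the list above can be sketched as follows. The shape of the flow-state result is not documented in this note, so the dictionary keys used here (`reachable`, `blocking`, `workstation`, `assertion`) and the helper name `plan_move` are hypothetical; only the tool names and their arguments come from the text.

```python
def plan_move(flow_state: dict, target: str) -> tuple[bool, list[str]]:
    """Decide whether advancing to `target` should be attempted.

    `flow_state` stands in for the result of get_flow_state(entity_type,
    entity_id); its key names are assumed, not taken from the spec.
    Returns (ok, reasons): ok is True when `target` is listed as reachable,
    otherwise `reasons` holds the blocking assertions for that workstation.
    """
    if target in flow_state.get("reachable", []):
        return True, []
    reasons = [
        b["assertion"]
        for b in flow_state.get("blocking", [])
        if b.get("workstation") == target
    ]
    return False, reasons or [f"{target!r} is not listed as reachable"]


if __name__ == "__main__":
    state = {
        "reachable": ["in_progress"],
        "unreachable": ["done"],
        "blocking": [{"workstation": "done", "assertion": "all subtasks closed"}],
    }
    print(plan_move(state, "in_progress"))
    print(plan_move(state, "done"))
```

An agent would call advance_workstation(entity_type, entity_id, target_workstation) only when the first element is True, and otherwise report the reasons instead of falling back to a direct status update.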

Reference spec: state-hub/docs/task-flow-engine-spec.md