generated from coulomb/repo-seed
Plan to make ops-bridge fully usable by worker agents:

- T01: SSE transport mode + make mcp-http target
- T02: register in ~/.claude.json at user scope
- T03: /bridge global slash command skill
- T04: worker agent bridge protocol in global CLAUDE.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
217 lines
6.2 KiB
Markdown
---
id: OPS-WP-0002
type: workplan
title: "Agent Usability — MCP Registration, Skill, and Worker Orientation"
domain: custodian
repo: ops-bridge
status: active
owner: custodian
topic_slug: custodian
created: "2026-03-21"
updated: "2026-03-21"
depends_on: OPS-WP-0001
---

# OPS-WP-0002 — Agent Usability: MCP Registration, Skill, and Worker Orientation

## Problem

The ops-bridge MCP server (`src/bridge/mcp_server/server.py`) is fully
implemented with tools for `bridge_up/down/restart/status/check/logs` and
catalog operations. But no agent can use it because:

1. **Not registered**: the server isn't in `~/.claude.json` and has no
   persistent transport mode. It only runs on stdio today.
2. **No slash command**: agents working ad-hoc (not via MCP) have no
   quick way to check or restore tunnels.
3. **No worker orientation**: agents on remote machines (CoulombCore,
   Railiance) don't know that bridge is available or how to use it when
   their state-hub connection drops.

## Goal

Any agent, whether on the workstation or a remote machine, can:
- Check tunnel health in one call
- Bring up a dropped tunnel without manual intervention
- Recover the state-hub connection if it goes down mid-session

## Design

### MCP server (workstation, persistent)

Run as an SSE service on port 8002 (same pattern as state-hub on 8001).
Register it at user scope in `~/.claude.json` so it's available to all
Claude Code sessions.

FastMCP already supports the SSE transport, so the only change is in the
`server.py` entry point: select the transport passed to `mcp.run()` based
on an `--http` flag or a `BRIDGE_MCP_PORT` env var.

### Slash command skill (all machines)

A `/bridge` skill at `~/.claude/commands/bridge.md` (global scope) that:
- Reads `bridge status` output
- Surfaces any tunnel that is down or stale
- Offers to bring it up

The skill is useful on machines that don't have the MCP server registered.

### Worker agent orientation (remote machines)

Update the global `CLAUDE.md` and the `ops-bridge` session protocol to tell
worker agents:
- Check `bridge status` at session start when on a machine with
  ops-bridge installed
- If the state-hub tunnel is down: run `bridge up state-hub-<machine>` to
  restore it before making any state-hub API calls
- If the `bridge` command is not available: fall back to the direct
  state-hub API URL if it is reachable

---

## Tasks

### T01 — SSE transport mode for MCP server

```task
id: OPS-WP-0002-T01
status: todo
priority: high
```

Add `--http` flag and `BRIDGE_MCP_PORT` env var to `server.py` entry
point. When `--http` is set, run `mcp.run(transport="sse", port=PORT)`
instead of stdio.

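One way to wire the flag and the env var together is sketched below. `resolve_run_kwargs` is a hypothetical helper, not something that exists in `server.py` today. It assumes the installed FastMCP accepts `port=` in `run()`, as the task text implies; some FastMCP versions take the port as a server setting instead, so this is worth checking against the package the repo actually imports. The helper avoids importing FastMCP so it can be unit-tested on its own, and it leaves the stdio path unchanged when `--http` is absent.

```python
import argparse

DEFAULT_PORT = 8002  # default port named in this workplan


def resolve_run_kwargs(argv, env):
    """Map CLI args and environment to keyword arguments for mcp.run().

    Hypothetical helper: `argv` is a list of CLI arguments and `env` is a
    mapping such as os.environ.
    """
    parser = argparse.ArgumentParser(description="ops-bridge MCP server")
    parser.add_argument(
        "--http",
        action="store_true",
        help="serve over SSE instead of stdio",
    )
    args = parser.parse_args(argv)

    if not args.http:
        # Default path: keep today's stdio behaviour.
        return {"transport": "stdio"}

    # Treat an unset or empty BRIDGE_MCP_PORT as "use the default".
    port = int(env.get("BRIDGE_MCP_PORT") or DEFAULT_PORT)
    return {"transport": "sse", "port": port}


# In server.py the result would be passed on as: mcp.run(**kwargs)
print(resolve_run_kwargs([], {}))
print(resolve_run_kwargs(["--http"], {"BRIDGE_MCP_PORT": "9000"}))
```
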
Add `make mcp-http` target to `Makefile`:
```makefile
mcp-http: ## Start MCP server in SSE mode (default port 8002)
	BRIDGE_MCP_PORT=$${BRIDGE_MCP_PORT:-8002} uv run python src/bridge/mcp_server/server.py --http
```

Add `make mcp-stop` target that kills any running MCP server on port
8002.

Gate: `bridge_status()` tool callable via SSE on localhost:8002 after
`make mcp-http`.

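A cheap precondition for this gate, and for confirming that `make mcp-stop` worked, is checking whether anything is listening on the port. The sketch below uses only the standard library; `port_is_listening` is a hypothetical name. It confirms a TCP listener exists, not that `bridge_status()` responds over SSE, so it complements the gate rather than replacing it.

```python
import socket


def port_is_listening(host="127.0.0.1", port=8002, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (e.g. connection refused or
        # timeout) when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if port_is_listening():
    print("something is listening on 127.0.0.1:8002")
else:
    print("nothing is listening on 127.0.0.1:8002")
```
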
---

### T02 — Register MCP server in ~/.claude.json

```task
id: OPS-WP-0002-T02
status: todo
priority: high
```

Register the ops-bridge MCP server at user scope:
```bash
claude mcp add-json -s user ops-bridge \
  '{"type":"sse","url":"http://127.0.0.1:8002/sse"}'
```

Document in `ops-bridge` CLAUDE.md:
```
To start the MCP server:
  cd ~/ops-bridge && make mcp-http

To verify registration:
  python3 -c "import json,os; d=json.load(open(os.path.expanduser('~/.claude.json'))); print(list(d.get('mcpServers',{}).keys()))"
```

Update global `~/.claude/CLAUDE.md` to list `ops-bridge` MCP server
alongside `state-hub`.

Gate: `ops-bridge` appears in Claude Code MCP tool list after
`make mcp-http`.

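If the registration check needs to be scripted rather than pasted as a one-liner, a small function reads more easily. The sketch below looks under the same top-level `mcpServers` key that the one-liner above reads; `find_mcp_server` is a hypothetical name, and a missing config file is treated as "not registered".

```python
import json
import os


def find_mcp_server(name, config_path="~/.claude.json"):
    """Return the config entry registered under `name`, or None if absent."""
    path = os.path.expanduser(config_path)
    try:
        with open(path, encoding="utf-8") as fh:
            config = json.load(fh)
    except FileNotFoundError:
        # No config file at all means nothing is registered.
        return None
    return config.get("mcpServers", {}).get(name)


# Example use:
#   entry = find_mcp_server("ops-bridge")
#   print("registered" if entry else "not registered", entry)
```
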
---

### T03 — `/bridge` slash command skill

```task
id: OPS-WP-0002-T03
status: todo
priority: medium
```

Create `~/.claude/commands/bridge.md`, a global Claude Code skill for
tunnel management.

**Behaviour:**
1. Run `bridge status` and parse output
2. Report each tunnel: name, state, LIVE column
3. For any tunnel that is `stopped`, `reconnecting`, or `[STALE]`:
   - Offer to run `bridge up <tunnel-name>`
   - After `bridge up`, re-check with `bridge check <tunnel-name>`
4. If all tunnels are `connected` and LIVE: report green and exit

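The decision in steps 3 and 4 can be stated as a small pure function, which may help when testing the skill's logic. This is only a sketch: the real `bridge status` output format is not specified here, so `TunnelRow` is a hypothetical parsed form, and the assumption that staleness shows up as a literal `[STALE]` marker in the LIVE column is taken from the wording above.

```python
from dataclasses import dataclass


@dataclass
class TunnelRow:
    """Hypothetical parsed form of one `bridge status` row."""
    name: str
    state: str  # e.g. "connected", "stopped", "reconnecting"
    live: str   # raw LIVE column text; assumed to contain "[STALE]" when stale


def needs_recovery(row):
    """True when the tunnel is stopped, reconnecting, or marked stale."""
    return row.state in {"stopped", "reconnecting"} or "[STALE]" in row.live


def recovery_commands(rows, only=None):
    """List the commands to offer, optionally scoped to one tunnel name."""
    commands = []
    for row in rows:
        if only and row.name != only:
            continue
        if needs_recovery(row):
            # Bring the tunnel up, then re-check it, as in step 3.
            commands.append(f"bridge up {row.name}")
            commands.append(f"bridge check {row.name}")
    return commands  # an empty list means "report green and exit"
```
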
**Skill definition:**
```yaml
---
description: >
  Check ops-bridge tunnel health and restore any dropped tunnels.
  Reports status of all configured tunnels and offers to bring up
  any that are stopped or stale.
argument-hint: "[tunnel-name]"
allowed-tools:
  - Bash(bridge status)
  - Bash(bridge up*)
  - Bash(bridge down*)
  - Bash(bridge check*)
  - Bash(bridge logs*)
---
```

If an optional tunnel name is passed as `$ARGUMENTS`, scope all
operations to that tunnel only.

Gate: `/bridge` skill runs cleanly when all tunnels are up; correctly
identifies and recovers a manually-stopped tunnel.

---

### T04 — Worker agent orientation in CLAUDE.md

```task
id: OPS-WP-0002-T04
status: todo
priority: medium
```

Update global `~/.claude/CLAUDE.md`: add a **Worker Agent — Bridge
Protocol** section:

```markdown
## Worker Agent — Bridge Protocol

When working on a remote machine (CoulombCore, Railiance nodes):

1. At session start, check if `bridge` is installed:
   `which bridge && bridge status`
2. If state-hub tunnel is down: `bridge up state-hub-<machine-slug>`
   Wait for state `connected` before making state-hub API calls.
3. If `bridge` is not installed, check if the state-hub API is directly
   reachable: `curl -s http://127.0.0.1:8000/state/health`
4. Only proceed without state-hub if absolutely necessary, and log a
   progress note about the outage when connectivity is restored.
```

Also add a one-liner reminder to the ops-bridge session protocol in
`.claude/rules/session-protocol.md`:
> At session start: run `bridge status` and bring up any stopped tunnels
> before accessing remote services.

Gate: `~/.claude/CLAUDE.md` contains the Worker Agent section; ops-bridge
session protocol references bridge status check.

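The four protocol steps amount to a small decision table, sketched below for anyone who wants to test the wording against concrete cases. `next_step` and its return strings are hypothetical; the inputs would come from checks such as `shutil.which("bridge")`, the `bridge status` output, and the health URL in step 3.

```python
def next_step(bridge_installed, tunnel_connected, api_reachable, machine_slug):
    """Pick the next action for a worker agent at session start."""
    if bridge_installed:
        if tunnel_connected:
            return "proceed"
        # Step 2: restore the tunnel, then wait for state `connected`.
        return f"bridge up state-hub-{machine_slug}"
    if api_reachable:
        # Step 3: no bridge, but the state-hub API answers directly.
        return "proceed"
    # Step 4: last resort; log a progress note once connectivity returns.
    return "proceed without state-hub"


# "worker-1" is a placeholder slug, not a real machine name.
print(next_step(True, False, False, "worker-1"))
```
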
---

## Done Criteria

- [ ] `make mcp-http` starts the MCP server on port 8002 (SSE)
- [ ] `bridge_status` and `bridge_check` callable as MCP tools from Claude Code
- [ ] `ops-bridge` registered in `~/.claude.json` at user scope
- [ ] `/bridge` skill surfaces tunnel states and recovers a stopped tunnel
- [ ] Global CLAUDE.md has worker agent bridge protocol
- [ ] All existing tests pass after T01 changes (`make test`)