ops-bridge/workplans/OPS-WP-0002-agent-usability.md
tegwick d73b7be45d docs(workplan): OPS-WP-0002 — agent usability via MCP registration and /bridge skill
Plan to make ops-bridge fully usable by worker agents:
- T01: SSE transport mode + make mcp-http target
- T02: register in ~/.claude.json at user scope
- T03: /bridge global slash command skill
- T04: worker agent bridge protocol in global CLAUDE.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-21 15:15:42 +01:00

id: OPS-WP-0002
type: workplan
title: "Agent Usability: MCP Registration, Skill, and Worker Orientation"
domain: custodian
repo: ops-bridge
status: active
owner: custodian
topic_slug: custodian
created: 2026-03-21
updated: 2026-03-21
depends_on: OPS-WP-0001

OPS-WP-0002 — Agent Usability: MCP Registration, Skill, and Worker Orientation

Problem

The ops-bridge MCP server (src/bridge/mcp_server/server.py) is fully implemented, with tools for bridge_up/down/restart/status/check/logs and catalog operations. However, no agent can use it yet, for three reasons:

  1. Not registered — the server isn't in ~/.claude.json and has no persistent transport mode. It only runs on stdio today.
  2. No slash command — agents working ad-hoc (not via MCP) have no quick way to check or restore tunnels.
  3. No worker orientation — agents on remote machines (CoulombCore, Railiance) don't know that bridge is available or how to use it when their state-hub connection drops.

Goal

Any agent — on the workstation or a remote machine — can:

  • Check tunnel health in one call
  • Bring up a dropped tunnel without manual intervention
  • Recover the state-hub connection if it goes down mid-session

Design

MCP server (workstation, persistent)

Run as an SSE service on port 8002 (same pattern as state-hub on 8001). Registered at user scope in ~/.claude.json so it's available to all Claude Code sessions.

FastMCP already supports the SSE transport, so the change is confined to the server.py entry point: parse an --http flag, read the BRIDGE_MCP_PORT env var, and pass the selected transport to mcp.run().

Slash command skill (all machines)

A /bridge skill at ~/.claude/commands/bridge.md (global scope) that:

  • Reads bridge status output
  • Surfaces any tunnel that is down or stale
  • Offers to bring it up
  • Useful on machines that don't have the MCP server registered

Worker agent orientation (remote machines)

Update CLAUDE.md (global) and ops-bridge session protocol to tell worker agents:

  • Check bridge status at session start when on a machine with ops-bridge installed
  • If state-hub tunnel is down: run bridge up state-hub-<machine> to restore it before making any state-hub API calls
  • If the bridge command is not installed: fall back to the direct state-hub API URL, if it is reachable
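
The orientation rules above amount to a small decision table. A minimal sketch, assuming a hypothetical helper named next_action and hypothetical return labels; only the state name connected and the bridge up state-hub-<machine> command come from this workplan:

```python
from typing import Optional


def next_action(bridge_installed: bool,
                tunnel_state: Optional[str],
                api_reachable: bool,
                machine_slug: str) -> str:
    """Return the step a worker agent should take before calling state-hub.

    The function name and the plain-text return labels are illustrative only.
    """
    if bridge_installed:
        if tunnel_state == "connected":
            return "proceed"
        # Tunnel is down (or in any other state): restore it first.
        return f"bridge up state-hub-{machine_slug}"
    # No bridge command on this machine: fall back to the direct API URL.
    if api_reachable:
        return "use direct API"
    return "state-hub unavailable"
```

Keeping the decision separate from the shell calls (which bridge, bridge status, curl) makes the rule easy to review before it is written into CLAUDE.md as prose.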

Tasks

T01 — SSE transport mode for MCP server

id: OPS-WP-0002-T01
status: todo
priority: high

Add an --http flag and a BRIDGE_MCP_PORT env var to the server.py entry point. When --http is set, call mcp.run(transport="sse", port=PORT), with PORT taken from BRIDGE_MCP_PORT (default 8002), instead of running on stdio.
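
One possible shape for the flag and env-var handling is sketched below. It is not the existing server.py code: resolve_transport and DEFAULT_PORT are hypothetical names, and the function only decides which transport and port to use, so it can be tested without FastMCP installed.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence, Tuple

DEFAULT_PORT = 8002  # default port named in this workplan


def resolve_transport(argv: Optional[Sequence[str]] = None,
                      env: Optional[Mapping[str, str]] = None) -> Tuple[str, int]:
    """Return (transport, port) from CLI arguments and environment.

    Hypothetical helper: stdio stays the default, --http selects SSE.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="ops-bridge MCP server")
    parser.add_argument("--http", action="store_true",
                        help="serve over SSE instead of stdio")
    args = parser.parse_args(argv)
    # An unset or empty BRIDGE_MCP_PORT falls back to the default port.
    port = int(env.get("BRIDGE_MCP_PORT") or DEFAULT_PORT)
    return ("sse" if args.http else "stdio", port)
```

The entry point would then hand the result to mcp.run(). How the port is passed may depend on the FastMCP version in use, so the call signature given in this task should be checked against the installed package.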

Add make mcp-http target to Makefile:

mcp-http: ## Start MCP server in SSE mode (default port 8002)
    BRIDGE_MCP_PORT=$${BRIDGE_MCP_PORT:-8002} uv run python src/bridge/mcp_server/server.py --http

Add a make mcp-stop target that kills any MCP server process listening on port 8002.

Gate: bridge_status() tool callable via SSE on localhost:8002 after make mcp-http.
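
Before exercising the gate, it can help to confirm that something is listening on the port at all. The sketch below is a plain TCP check using only the standard library; port_is_listening is a hypothetical helper, and a True result does not prove that bridge_status() is callable, only that the port accepts connections.

```python
import socket


def port_is_listening(host: str = "127.0.0.1",
                      port: int = 8002,
                      timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

The same check could also back make mcp-stop, by confirming the port is free after the process has been killed.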


T02 — Register MCP server in ~/.claude.json

id: OPS-WP-0002-T02
status: todo
priority: high

Register the ops-bridge MCP server at user scope:

claude mcp add-json -s user ops-bridge \
  '{"type":"sse","url":"http://127.0.0.1:8002/sse"}'

Document in ops-bridge CLAUDE.md:

To start the MCP server:
    cd ~/ops-bridge && make mcp-http

To verify registration:
    python3 -c "import json,os; d=json.load(open(os.path.expanduser('~/.claude.json'))); print(list(d.get('mcpServers',{}).keys()))"

Update global ~/.claude/CLAUDE.md to list ops-bridge MCP server alongside state-hub.

Gate: ops-bridge appears in Claude Code MCP tool list after make mcp-http.
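
The verification one-liner above only lists server names. A slightly longer sketch, assuming the same mcpServers key and a hypothetical helper named registered_url, also returns the registered URL so it can be compared with the expected http://127.0.0.1:8002/sse:

```python
import json
from pathlib import Path
from typing import Optional, Union


def registered_url(server_name: str = "ops-bridge",
                   config_path: Union[str, Path] = "~/.claude.json") -> Optional[str]:
    """Return the URL registered for server_name, or None if it is absent."""
    path = Path(config_path).expanduser()
    if not path.exists():
        return None
    config = json.loads(path.read_text(encoding="utf-8"))
    entry = config.get("mcpServers", {}).get(server_name)
    # Entries without a "url" field (for example stdio servers) yield None.
    return entry.get("url") if isinstance(entry, dict) else None
```

This confirms what is written in the config file only. Whether the tools show up in a session still depends on the server running, which is what the gate checks.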


T03 — /bridge slash command skill

id: OPS-WP-0002-T03
status: todo
priority: medium

Create ~/.claude/commands/bridge.md — a global Claude Code skill for tunnel management.

Behaviour:

  1. Run bridge status and parse output
  2. Report each tunnel: name, state, LIVE column
  3. For any tunnel that is stopped, reconnecting, or [STALE]:
    • Offer to run bridge up <tunnel-name>
    • After bridge up, re-check with bridge check <tunnel-name>
  4. If all tunnels are connected and LIVE: report green and exit
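
The selection rule in step 3 can be sketched as a small parser. The layout of bridge status output is not specified in this workplan, so the sample below is an assumption: a header row followed by whitespace-separated name, state, and LIVE columns. The tunnel names and the function name are hypothetical as well.

```python
from typing import List

# States that step 3 treats as needing recovery.
UNHEALTHY_STATES = {"stopped", "reconnecting"}

# Hypothetical output layout; the real `bridge status` format may differ.
SAMPLE = """\
NAME               STATE       LIVE
state-hub-node1    connected   yes
metrics            stopped     -
db-tunnel          connected   [STALE]
"""


def tunnels_needing_recovery(status_text: str) -> List[str]:
    """Return names of tunnels that are stopped, reconnecting, or [STALE]."""
    names = []
    lines = [ln for ln in status_text.splitlines() if ln.strip()]
    for line in lines[1:]:  # skip the assumed header row
        fields = line.split()
        if len(fields) < 2:
            continue
        name, state = fields[0], fields[1].lower()
        if state in UNHEALTHY_STATES or "[STALE]" in line:
            names.append(name)
    return names
```

An empty result corresponds to step 4 (report green and exit). Each returned name would be offered to bridge up and then re-checked with bridge check.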

Skill definition:

---
description: >
  Check ops-bridge tunnel health and restore any dropped tunnels.
  Reports status of all configured tunnels and offers to bring up
  any that are stopped or stale.
argument-hint: "[tunnel-name]"
allowed-tools:
  - Bash(bridge status)
  - Bash(bridge up*)
  - Bash(bridge down*)
  - Bash(bridge check*)
  - Bash(bridge logs*)
---

If an optional tunnel name is passed as $ARGUMENTS, scope all operations to that tunnel only.

Gate: /bridge skill runs cleanly when all tunnels are up; correctly identifies and recovers a manually-stopped tunnel.


T04 — Worker agent orientation in CLAUDE.md

id: OPS-WP-0002-T04
status: todo
priority: medium

Update global ~/.claude/CLAUDE.md — add a Worker Agent — Bridge Protocol section:

## Worker Agent — Bridge Protocol

When working on a remote machine (CoulombCore, Railiance nodes):

1. At session start, check if `bridge` is installed:
   `which bridge && bridge status`
2. If state-hub tunnel is down: `bridge up state-hub-<machine-slug>`
   Wait for state `connected` before making state-hub API calls.
3. If `bridge` is not installed, check if the state-hub API is directly
   reachable: `curl -s http://127.0.0.1:8000/state/health`
4. Only proceed without state-hub if absolutely necessary — log a
   progress note about the outage when connectivity is restored.

Also add a one-liner reminder to the ops-bridge session protocol in .claude/rules/session-protocol.md:

At session start: bridge status — bring up any stopped tunnels before accessing remote services.

Gate: ~/.claude/CLAUDE.md contains the Worker Agent section; ops-bridge session protocol references bridge status check.


Done Criteria

  • make mcp-http starts the MCP server on port 8002 (SSE)
  • bridge_status and bridge_check callable as MCP tools from Claude Code
  • ops-bridge registered in ~/.claude.json at user scope
  • /bridge skill surfaces tunnel states and recovers a stopped tunnel
  • Global CLAUDE.md has worker agent bridge protocol
  • All existing tests pass after T01 changes (make test)