generated from coulomb/repo-seed
feat(diagnostics): end-to-end tunnel check, stale state detection, MCP extensions

- diagnostics.py: TunnelCheckResult with SSH process liveness, port probe, and optional API health check; check_tunnel / check_all_tunnels
- cli.py: bridge status shows LIVE column and [STALE] marker when state says connected but PID is dead; bridge check wired to diagnostics
- state.py: read_raw_pid helper; _pid_alive exported for reuse
- capabilities.py: capabilities registry stubs
- mcp_server/server.py: expose check_tunnel and tunnel capabilities over MCP
- SCOPE.md: rapid orientation document
- workplans/OPS-WP-0001-diagnostics.md: workplan backing this feature
- tests: 207 passing (test_cli, test_mcp, test_diagnostics)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
164
workplans/OPS-WP-0001-diagnostics.md
Normal file
@@ -0,0 +1,164 @@
---
id: OPS-WP-0001
type: workplan
title: "ops-bridge diagnostics and flow improvements"
domain: custodian
repo: ops-bridge
status: done
owner: claude
topic_slug: custodian
created: "2026-03-20"
updated: "2026-03-20"
state_hub_workstream_id: "6726cea2-447a-40b2-b0a0-edf495f07942"
---

# OPS-WP-0001 — ops-bridge diagnostics and flow improvements

**Scope:** Add a `bridge check` end-to-end diagnostics command, fix `bridge status` so
it surfaces PID liveness and flags stale state, add a `bridge_check` MCP tool, and
wire Makefile convenience targets in state-hub.

**Context:** During a session, `bridge status` reported "connected" although reverse
port forwarding was not active; the cause was stale `.state` files written by the
daemon. The status command does not verify that the SSH process is alive or that the
remote port is actually listening.

---

## Task: Add `read_raw_pid()` to StateManager

```task
id: OPS-WP-0001-T01
status: done
priority: high
state_hub_task_id: "05e98e85-699a-4982-bb3e-8f2538cde2c7"
```

Add `read_raw_pid(name)` to `src/bridge/state.py`. It reads the PID from the file
without a liveness check. The existing `read_pid()` (which also checks liveness) stays
unchanged.
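
The split between the two readers might look roughly like the sketch below. It is not the code in `state.py`: the `<name>.pid` file layout, the constructor, and the `pid_alive` helper are assumptions made for illustration, and the liveness probe is POSIX-only.

```python
import os
from pathlib import Path
from typing import Optional


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    if pid <= 0:
        return False  # 0 and negative values address process groups, not a PID
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


class StateManager:
    def __init__(self, state_dir: Path) -> None:
        self.state_dir = state_dir

    def _pid_path(self, name: str) -> Path:
        # Hypothetical layout: one "<name>.pid" file per tunnel.
        return self.state_dir / f"{name}.pid"

    def read_raw_pid(self, name: str) -> Optional[int]:
        """Return the recorded PID, or None if the file is missing or malformed."""
        try:
            return int(self._pid_path(name).read_text().strip())
        except (FileNotFoundError, ValueError):
            return None

    def read_pid(self, name: str) -> Optional[int]:
        """Return the recorded PID only if that process is still alive."""
        pid = self.read_raw_pid(name)
        return pid if pid is not None and pid_alive(pid) else None
```

With both readers available, a caller can tell "no PID was ever recorded" (`read_raw_pid` returns `None`) apart from "a PID was recorded but the process is gone" (`read_raw_pid` returns a number while `read_pid` returns `None`), which is the distinction stale-state detection needs.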

---

## Task: Create `src/bridge/diagnostics.py`

```task
id: OPS-WP-0001-T02
status: done
priority: high
state_hub_task_id: "b68d7b1e-850b-469a-9de2-8b5d3d1f1c05"
```

New module with a `TunnelCheckResult` dataclass (fields `ssh_process`, `pid`,
`remote_port`, `local_api`, `latency_ms`, `stale_state`, plus an `ok` property) and the
`check_tunnel()` / `check_all_tunnels()` functions. The SSH probe runs via `subprocess`;
the httpx health check is optional.
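
A minimal sketch of what such a result type could look like follows. Only the field names come from the task description; the field types, the use of `None` for "not probed", the rule behind `ok`, and the `make_result` helper are assumptions. The stale-state rule (state says connected but the PID is dead) follows the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TunnelCheckResult:
    ssh_process: bool                    # is the recorded SSH PID alive?
    pid: Optional[int] = None            # PID from the state file, if any
    remote_port: Optional[bool] = None   # None means the port was not probed
    local_api: Optional[bool] = None     # None means the health check was skipped
    latency_ms: Optional[float] = None
    stale_state: bool = False            # state says connected, process is gone

    @property
    def ok(self) -> bool:
        # Assumed rule: the process is alive, the port is confirmed listening,
        # and the optional API check did not fail.
        return (
            self.ssh_process
            and self.remote_port is True
            and self.local_api is not False
        )


def make_result(
    state_status: str,
    pid: Optional[int],
    pid_is_alive: bool,
    remote_port: Optional[bool] = None,
    local_api: Optional[bool] = None,
    latency_ms: Optional[float] = None,
) -> TunnelCheckResult:
    """Hypothetical helper that combines probe outcomes into one result."""
    return TunnelCheckResult(
        ssh_process=pid_is_alive,
        pid=pid,
        remote_port=remote_port,
        local_api=local_api,
        latency_ms=latency_ms,
        stale_state=(state_status == "connected" and not pid_is_alive),
    )
```

Keeping `ok` as a derived property rather than a stored field means it cannot drift out of sync with the individual probe results.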

---

## Task: Fix `bridge status` and add `bridge check` to CLI

```task
id: OPS-WP-0001-T03
status: done
priority: high
state_hub_task_id: "e87c6c5d-170c-4af3-905c-a48fae2edbe5"
```

Fix `status` to show PID liveness (LIVE column) and flag stale state.
Add a `check` command with a `--json` flag that exits with status 1 if any tunnel is not ok.
Add a `_print_check_table` helper.
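
The exit-code contract of `check` can be illustrated with a small standard-library sketch. The workplan does not say which CLI framework `cli.py` uses, so `argparse`, the optional positional tunnel argument, the injected `results` mapping, and the return value 2 for an unknown tunnel are all assumptions.

```python
import argparse
import json
from typing import Dict, List


def run_check(argv: List[str], results: Dict[str, bool]) -> int:
    """Print check results and return the process exit code.

    `results` maps tunnel name to its ok flag; in the real command these
    values would come from the diagnostics module.
    """
    parser = argparse.ArgumentParser(prog="bridge check")
    parser.add_argument("tunnel", nargs="?", help="check only this tunnel")
    parser.add_argument("--json", action="store_true", dest="as_json")
    args = parser.parse_args(argv)

    selected = results
    if args.tunnel is not None:
        if args.tunnel not in results:
            print(f"unknown tunnel: {args.tunnel}")
            return 2  # assumption: the source does not define this case
        selected = {args.tunnel: results[args.tunnel]}

    if args.as_json:
        print(json.dumps(selected, sort_keys=True))
    else:
        for name, ok in sorted(selected.items()):
            print(f"{name:<16} {'ok' if ok else 'FAIL'}")

    # Exit 1 if any selected tunnel is not ok, otherwise 0.
    return 0 if all(selected.values()) else 1
```

An entry point would pass the return value to `sys.exit`, so a shell script or Makefile target can branch on the result without parsing the output.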

---

## Task: Add `bridge_check` MCP tool and `bridge://check` resource

```task
id: OPS-WP-0001-T04
status: done
priority: high
state_hub_task_id: "7e97c112-20e2-4e2e-b853-53b10998392b"
```

Add a `bridge_check(tunnel?)` tool and a `bridge://check` resource to
`src/bridge/mcp_server/server.py`.
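
The optional `tunnel` argument suggests a handler that checks either one tunnel or all of them. The sketch below shows only that dispatch logic as a plain function; registration with an MCP server is deliberately left out, and the factory name, the `check_one` callback, and the shape of the returned payload are hypothetical.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def make_bridge_check(
    check_one: Callable[[str], Dict[str, Any]],
    tunnel_names: Iterable[str],
) -> Callable[[Optional[str]], Dict[str, Any]]:
    """Build a bridge_check handler around an injected per-tunnel check."""
    known = list(tunnel_names)

    def bridge_check(tunnel: Optional[str] = None) -> Dict[str, Any]:
        # No argument means "check everything"; otherwise check one tunnel.
        names = known if tunnel is None else [tunnel]
        results = {name: check_one(name) for name in names}
        return {
            "ok": all(r["ok"] for r in results.values()),
            "tunnels": results,
        }

    return bridge_check
```

Injecting `check_one` keeps the handler testable without a running SSH process, which is also how a test such as `test_bridge_check_tool` could exercise it.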

---

## Task: Register `bridge_check` capability

```task
id: OPS-WP-0001-T05
status: done
priority: high
state_hub_task_id: "c69fc748-a706-46db-a4d5-30d60222452b"
```

Add a `bridge_check` entry to `src/bridge/capabilities.py` with
`required_access_modes=frozenset({"cli", "mcp"})`.
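
One way such a registry entry could be structured is sketched below. Only the `bridge_check` key and the `required_access_modes` value come from the task; the `Capability` dataclass, the `CAPABILITIES` dict, the description text, and the `missing_modes` helper are assumptions about how the registry might be organised.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class Capability:
    name: str
    description: str
    required_access_modes: FrozenSet[str]


# Hypothetical registry keyed by capability name.
CAPABILITIES: Dict[str, Capability] = {
    "bridge_check": Capability(
        name="bridge_check",
        description="End-to-end tunnel diagnostics",
        required_access_modes=frozenset({"cli", "mcp"}),
    ),
}


def missing_modes(name: str, covered: Iterable[str]) -> FrozenSet[str]:
    """Return the required access modes that `covered` does not include."""
    return CAPABILITIES[name].required_access_modes - frozenset(covered)
```

A helper like `missing_modes` could let a test suite report capabilities that are registered for an access mode but have no test marked for that mode, which would fit the capability and access-mode markers used in the CLI and MCP tests.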

---

## Task: Write `tests/test_diagnostics.py`

```task
id: OPS-WP-0001-T06
status: done
priority: high
state_hub_task_id: "070ed088-74a6-48d3-81cf-739c2a2fd21b"
```

Unit tests: `test_no_pid`, `test_pid_dead`, `test_pid_alive_port_listening`,
`test_pid_alive_port_closed`, `test_ssh_timeout`.

---

## Task: Add `TestCheckCommand` to `tests/test_cli.py`

```task
id: OPS-WP-0001-T07
status: done
priority: high
state_hub_task_id: "aae5ddc5-f823-4647-a536-8604ddb97946"
```

Tests: `test_check_help`, `test_check_all_pass` (marked capability+mode),
`test_check_any_fail`, `test_check_json_flag`, `test_check_specific_tunnel`.

---

## Task: Add `TestMcpBridgeCheck` to `tests/test_mcp.py`

```task
id: OPS-WP-0001-T08
status: done
priority: high
state_hub_task_id: "ed492a3d-7a5f-465e-8cc3-d2f992f5462c"
```

Test: `test_bridge_check_tool`, marked `capability("bridge_check")` + `access_mode("mcp")`.

---

## Task: Add tunnels targets to state-hub Makefile

```task
id: OPS-WP-0001-T09
status: done
priority: medium
state_hub_task_id: "a3c77062-cff5-40e3-936c-b210b05f8839"
```

Add `tunnels-up`, `tunnels-status`, and `tunnels-check` targets that delegate to `bridge`.
Add them to the `.PHONY` line.

---

## Task: Run test suite and verify

```task
id: OPS-WP-0001-T10
status: done
priority: high
state_hub_task_id: "e42de76c-fab7-4924-8929-38fa9eaca478"
```

`cd /home/worsch/ops-bridge && uv run pytest tests/ -v`: all tests green.