Add bridge maintenance cleanup to detect reverse tunnels whose remote
port is bound but no longer forwards (zombie sshd sessions), kill the
stale listeners on the remote host, and optionally restart the tunnel.
Includes install-cron/uninstall-cron/show-cron helpers and README notes
for the actcore-state-hub-bridge failure mode we hit on railiance01.
- ActorType enum (adm/agt/atm) replaces actor_class string; config validates
naming convention (adm-*/agt-*/atm-*) with hard ConfigError on mismatch;
legacy 'human'/'automation' values accepted with DeprecationWarning
- cert_command: pluggable shell string run before each SSH launch; cert written
to state dir; -i cert appended to SSH command alongside -i key
- TTL-aware cert refresh: parses Valid-to via ssh-keygen -L; pre-emptive restart
5 min before expiry (no backoff, no attempt increment); CERT_EXPIRING logged
- CertAcquisitionError: cert failures trigger normal backoff/retry loop
- cert_identity: Key ID parsed from cert and recorded in BRIDGE_CONNECTED event
- bridge cert-status: new CLI command; exit 1 on expired cert; --json flag
- 233 tests passing, ruff clean
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- diagnostics.py: TunnelCheckResult with SSH process liveness, port
probe, and optional API health check; check_tunnel / check_all_tunnels
- cli.py: bridge status shows LIVE column and [STALE] marker when state
says connected but PID is dead; bridge check wired to diagnostics
- state.py: read_raw_pid helper; _pid_alive exported for reuse
- capabilities.py: capabilities registry stubs
- mcp_server/server.py: expose check_tunnel and tunnel capabilities
over MCP
- SCOPE.md: rapid orientation document
- workplans/OPS-WP-0001-diagnostics.md: workplan backing this feature
- tests: 207 passing (test_cli, test_mcp, test_diagnostics)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the full BRIDGE-WP-0003 workplan: 188 tests passing, 0 lint errors.
## What's added
**Capability registry** (`src/bridge/capabilities.py`):
- 10 capabilities with required_access_modes (cli/mcp/skill)
- Single source of truth for what OpsBridge does and where
**MCP server** (`src/bridge/mcp_server/server.py`):
- 10 FastMCP tools: bridge_up/down/restart/status/logs + 5 catalog_* tools
- 3 resources: bridge://status, catalog://domains, catalog://targets
- `.mcp.json` for project-scope auto-registration
- `scripts/register_mcp.py` for user-scope machine-global registration
**Skill** (`~/.claude/plugins/ops-bridge/bridge-status.md`):
- /bridge-status: health table with emoji indicators + remediation advice
**Cross-mode test coverage enforcement**:
- `tests/conftest.py`: capability/access_mode marks + collect_capability_coverage()
- `tests/test_mcp.py`: 31 FastMCP in-process client tests (Client(mcp) pattern)
- `tests/test_skill.py`: static skill lint against capability registry
- `tests/test_coverage_completeness.py`: meta-test that fails if any required
(capability × mode) pair lacks a test; also validates CLI commands and MCP
tools are registered in the capability registry
**ADR** (`architecture/adr-001-cross-mode-capability-registry.md`):
- Documents the registry pattern and FastMCP 3.x testing approach
Key implementation note: FastMCP 3.x in-process results are in
result.content[0].text (JSON string), not result.data directly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the OpsCatalog subsystem: a Git-backed YAML catalog of operations
domains, targets, bridges, and actor classes. Includes catalog loader,
cross-reference validator, bridge resolver (inline-first, catalog
fallback), and new CLI commands: `bridge targets`, `bridge targets show`,
`bridge catalog list/validate/show`. Updates `up/down/restart` to resolve
bridge names from the catalog when not defined inline. 142 tests, all green.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Full TDD implementation of the `bridge` CLI tool covering all phases
from BRIDGE-WP-0001: project scaffolding, config loading, state
management, audit logging, health checks, tunnel lifecycle manager, and
all CLI commands (up/down/restart/status/logs). 77 tests, all green.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>