generated from coulomb/repo-seed
docs: add README.txt with usage guide and configuration reference
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
README.txt (new file, 224 lines)

ops-bridge
==========

SSH reverse tunnel lifecycle manager. Keeps remote execution environments
(COULOMBCORE, Railiance nodes) connected to the local Custodian State Hub
so Claude Code sessions on those machines have full MCP connectivity.


WHAT IT DOES
------------

`bridge` is a CLI tool that manages named SSH reverse tunnels. Each tunnel:

- Is identified by a human-readable name (e.g. state-hub-coulombcore)
- Runs as an SSH reverse port-forward: ssh -R remote:127.0.0.1:local host
- Auto-reconnects on drop using exponential backoff
- Optionally runs an HTTP health check to confirm the forwarded service
  is actually reachable (not just the SSH process alive)
- Records structured audit events (bridge_started, bridge_connected,
  health_check_failed, etc.) to a JSON log per tunnel
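
The reconnect behaviour in the list above can be pictured as a capped
exponential delay schedule. The sketch below is illustrative only: the
function name `backoff_delays` and the doubling factor are assumptions, and
the defaults borrow the `backoff_initial` and `backoff_max` values from the
CONFIGURATION example. The actual reconnect loop may differ.

```python
from itertools import islice
from typing import Iterator


def backoff_delays(initial: float = 5, maximum: float = 60, factor: float = 2) -> Iterator[float]:
    """Yield reconnect delays in seconds: start at initial, grow by factor, cap at maximum."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


# First six delays with the example values 5 and 60.
print(list(islice(backoff_delays(5, 60), 6)))  # [5, 10, 20, 40, 60, 60]
```

Capping the delay keeps a long outage from pushing retries arbitrarily far
apart.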

Bridge states: stopped -> starting -> connected <-> degraded -> reconnecting
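
One way to read the state line above is as a small transition table. The
following is a sketch, not the contents of models.py: the names `BridgeState`,
`TRANSITIONS`, and `can_transition` are hypothetical, and only the arrows
drawn above are encoded. The real manager presumably allows further
transitions (for example back to stopped) that the diagram does not show.

```python
from enum import Enum


class BridgeState(str, Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    CONNECTED = "connected"
    DEGRADED = "degraded"
    RECONNECTING = "reconnecting"


# Only the arrows from the state line; anything else is treated as not allowed here.
TRANSITIONS = {
    BridgeState.STOPPED: {BridgeState.STARTING},
    BridgeState.STARTING: {BridgeState.CONNECTED},
    BridgeState.CONNECTED: {BridgeState.DEGRADED},
    BridgeState.DEGRADED: {BridgeState.CONNECTED, BridgeState.RECONNECTING},
    BridgeState.RECONNECTING: set(),
}


def can_transition(src: BridgeState, dst: BridgeState) -> bool:
    """Return True if the diagram draws an arrow from src to dst."""
    return dst in TRANSITIONS[src]
```

Because the enum values are the lowercase state names, `BridgeState("connected")`
maps a string such as the content of a state file back to a member.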


INSTALL
-------

Requires Python 3.11+ and uv (https://docs.astral.sh/uv/).

    uv tool install /path/to/ops-bridge

This registers the `bridge` command globally. For development:

    cd /path/to/ops-bridge
    uv tool install -e .

Verify:

    bridge --help


CONFIGURATION
-------------

Config file: ~/.config/bridge/tunnels.yaml
Override with: BRIDGE_CONFIG=/path/to/config.yaml

Minimal example:

    tunnels:
      state-hub-coulombcore:
        host: coulombcore.local
        remote_port: 18000
        local_port: 8000
        ssh_user: ubuntu
        ssh_key: ~/.ssh/id_ops
        actor: agent.claude-coulombcore

    actors:
      agent.claude-coulombcore:
        class: automation
        description: Claude Code agent on CoulombCore

With health check and reconnect policy:

    tunnels:
      state-hub-coulombcore:
        host: coulombcore.local
        remote_port: 18000
        local_port: 8000
        ssh_user: ubuntu
        ssh_key: ~/.ssh/id_ops
        actor: agent.claude-coulombcore

        health_check:
          url: http://127.0.0.1:18000/health   # checked from the REMOTE host
          interval_seconds: 30
          timeout_seconds: 5

        reconnect:
          max_attempts: 0        # 0 = retry forever
          backoff_initial: 5
          backoff_max: 60

    actors:
      agent.claude-coulombcore:
        class: automation        # "human" or "automation"
        description: Claude Code agent on CoulombCore
      operator.bernd:
        class: human
        description: Bernd Worsch
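
To make the `health_check` settings concrete, here is a minimal probe that
treats any 2xx answer within `timeout_seconds` as healthy. It is a sketch
under assumptions: the function name `is_healthy` is hypothetical, and it uses
the standard library's urllib instead of the async httpx checker the project
lists, so that it runs without extra dependencies. Proxies are bypassed
because the URL in the example is a loopback address.

```python
import urllib.error
import urllib.request

# Opener with no proxy handlers, so loopback URLs are always contacted directly.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_healthy(url: str, timeout_seconds: float = 5) -> bool:
    """Return True if a GET to url answers with a 2xx status within the timeout."""
    try:
        with _OPENER.open(url, timeout=timeout_seconds) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

A caller would run this every `interval_seconds` and record a
health_check_failed event when it returns False.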

Required tunnel fields: host, remote_port, local_port, ssh_user, ssh_key, actor
Required actor fields: class (must be "human" or "automation")
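
The required-field rules above can be checked mechanically once the YAML has
been parsed into a dict. This is a sketch only: `validate_config` is a
hypothetical name, and the rule that a tunnel's actor must be declared under
`actors` is an assumption not stated in this README.

```python
REQUIRED_TUNNEL_FIELDS = ("host", "remote_port", "local_port", "ssh_user", "ssh_key", "actor")
ACTOR_CLASSES = {"human", "automation"}


def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config passed."""
    errors = []
    actors = config.get("actors") or {}
    for name, tunnel in (config.get("tunnels") or {}).items():
        for field in REQUIRED_TUNNEL_FIELDS:
            if field not in tunnel:
                errors.append(f"tunnel {name}: missing required field '{field}'")
        # Assumption: every referenced actor must be declared under `actors`.
        actor = tunnel.get("actor")
        if actor is not None and actor not in actors:
            errors.append(f"tunnel {name}: unknown actor '{actor}'")
    for name, actor in actors.items():
        if (actor or {}).get("class") not in ACTOR_CLASSES:
            errors.append(f"actor {name}: class must be 'human' or 'automation'")
    return errors


# Parsed form of the minimal example.
example = {
    "tunnels": {
        "state-hub-coulombcore": {
            "host": "coulombcore.local",
            "remote_port": 18000,
            "local_port": 8000,
            "ssh_user": "ubuntu",
            "ssh_key": "~/.ssh/id_ops",
            "actor": "agent.claude-coulombcore",
        }
    },
    "actors": {"agent.claude-coulombcore": {"class": "automation"}},
}
print(validate_config(example))  # []
```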


CLI COMMANDS
------------

Lifecycle:

    bridge up [TUNNEL]         Start one tunnel, or all if no name given
    bridge down [TUNNEL]       Stop one tunnel, or all
    bridge restart [TUNNEL]    Restart one tunnel, or all

Observation:

    bridge status              Show all tunnels: state, uptime, last event
    bridge status --json       Machine-readable JSON output
    bridge logs TUNNEL         Tail the audit log for a tunnel
    bridge logs TUNNEL --lines 100 --follow

Examples:

    bridge up state-hub-coulombcore
    bridge status
    bridge logs state-hub-coulombcore --follow
    bridge down state-hub-coulombcore
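
Scripts can consume the machine-readable status rather than parsing the
table. The wrapper below is a sketch: `bridge_status` and its `run` parameter
are hypothetical helpers, and because the JSON layout is not documented here,
the decoded value is returned as-is without assuming any keys.

```python
import json
import subprocess


def bridge_status(run=subprocess.run):
    """Run `bridge status --json` and return the decoded JSON value unchanged."""
    result = run(
        ["bridge", "status", "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)
```

Passing a different `run` callable makes the wrapper testable on machines
where `bridge` is not installed.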


OPSCATALOG EXTENSION (optional)
-------------------------------

If you maintain a Git-backed YAML catalog of your infrastructure, point
bridge at it in your config:

    catalog_path: ~/ops-infra/opscatalog/

Catalog layout:

    opscatalog/
      domains/
        <domain-id>/
          domain.yaml
          targets/
            <target-id>.yaml
          bridges/
            <bridge-id>.yaml
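
As a rough illustration of how the layout could be traversed, the sketch
below counts target and bridge files per domain. It assumes that `targets/`
and `bridges/` sit inside each domain directory; `count_catalog` is a
hypothetical helper, not part of the bridge CLI, and it does not parse any
YAML.

```python
from pathlib import Path


def _count_yaml(directory: Path) -> int:
    """Number of *.yaml files directly inside directory (0 if it does not exist)."""
    if not directory.is_dir():
        return 0
    return sum(1 for _ in directory.glob("*.yaml"))


def count_catalog(catalog_path: str) -> dict[str, dict[str, int]]:
    """Map each domain id to its number of target and bridge files."""
    domains_dir = Path(catalog_path).expanduser() / "domains"
    counts: dict[str, dict[str, int]] = {}
    if not domains_dir.is_dir():
        return counts
    for domain_dir in sorted(p for p in domains_dir.iterdir() if p.is_dir()):
        counts[domain_dir.name] = {
            "targets": _count_yaml(domain_dir / "targets"),
            "bridges": _count_yaml(domain_dir / "bridges"),
        }
    return counts
```

Usage would look like `count_catalog("~/ops-infra/opscatalog/")`.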

Then you can use:

    bridge targets [--domain DOMAIN]    List all targets (optionally filtered)
    bridge targets show TARGET_ID       Show full target metadata
    bridge catalog list                 List domains with counts
    bridge catalog validate             Check catalog for consistency errors
    bridge catalog show BRIDGE_ID       Show a catalog bridge's full metadata

Bridges defined in the catalog are resolved the same way as inline tunnels.
Inline tunnels (in tunnels.yaml) take precedence over catalog bridges when
both define the same name.
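
The precedence rule above amounts to a dictionary merge in which inline
entries overwrite catalog entries of the same name. A minimal sketch, with
`resolve_bridges` as a hypothetical function name and made-up host values:

```python
def resolve_bridges(inline_tunnels: dict, catalog_bridges: dict) -> dict:
    """Merge both sources by name; inline tunnels win when a name appears in both."""
    return {**catalog_bridges, **inline_tunnels}


inline = {"state-hub-coulombcore": {"host": "coulombcore.local"}}
catalog = {
    "state-hub-coulombcore": {"host": "catalog-host.example"},
    "other-bridge": {"host": "other.example"},
}
merged = resolve_bridges(inline, catalog)
print(merged["state-hub-coulombcore"]["host"])  # coulombcore.local
print(sorted(merged))  # ['other-bridge', 'state-hub-coulombcore']
```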


STATE FILES
-----------

Runtime state is stored in ~/.local/state/bridge/:

    {name}.pid      Manager process ID
    {name}.state    Current bridge state (e.g. "connected")
    {name}.log      Audit log, one JSON object per line

Override the state directory with: BRIDGE_STATE_DIR=/path/to/dir


AUDIT LOG FORMAT
----------------

Each event is written as one JSON object per line. The example below is
pretty-printed for readability:

    {
      "ts": "2026-03-12T14:23:01.456789",
      "tunnel": "state-hub-coulombcore",
      "event": "bridge_connected",
      "actor": "agent.claude-coulombcore",
      "actor_class": "automation",
      "detail": ""
    }

Event types: bridge_started, bridge_connected, bridge_disconnected,
bridge_reconnecting, health_check_failed, health_check_recovered,
bridge_stopped
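
Because the log is line-delimited JSON, it can be processed one line at a
time. The sketch below tallies events by type; `read_events` and
`count_events` are hypothetical helpers, and only the `event` field shown
above is relied on.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def read_events(lines: Iterable[str]) -> Iterator[dict]:
    """Decode one audit event per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_events(lines: Iterable[str]) -> Counter:
    """Count how often each event type occurs."""
    return Counter(event["event"] for event in read_events(lines))


sample = [
    '{"ts": "2026-03-12T14:23:00.000000", "tunnel": "state-hub-coulombcore", "event": "bridge_started"}',
    '{"ts": "2026-03-12T14:23:01.456789", "tunnel": "state-hub-coulombcore", "event": "bridge_connected"}',
    "",
    '{"ts": "2026-03-12T14:30:00.000000", "tunnel": "state-hub-coulombcore", "event": "bridge_connected"}',
]
print(count_events(sample)["bridge_connected"])  # 2
```

An open file object can be passed in place of the list, since iterating a
file yields its lines.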


DEVELOPMENT
-----------

    uv run pytest                         Run all tests
    uv run pytest tests/test_cli.py -v    Run a specific test file
    uv run ruff check .                   Lint

Source layout:

    src/bridge/
      cli.py        Typer CLI (entry point)
      models.py     Core dataclasses and enums
      config.py     Config loading from tunnels.yaml
      manager.py    Tunnel lifecycle (subprocess, reconnect loop)
      state.py      PID and state file management
      audit.py      Audit event logging
      health.py     HTTP health checker (async, httpx)
      catalog/      OpsCatalog extension


DESIGN NOTES
------------

- No system daemons. Tunnel processes are managed as subprocesses; PIDs
  are tracked in ~/.local/state/bridge/.
- Graceful shutdown: SIGTERM to the manager process allows a clean exit;
  SIGKILL follows after 5 seconds if it is unresponsive.
- Actor attribution on every log event (human vs. automation) supports
  audit traceability (FRS §5.7).
- SSH command invoked: ssh -N -R remote_port:127.0.0.1:local_port
  -i ssh_key ssh_user@host
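
The SSH invocation and the shutdown sequence from the notes above can be
sketched as two small functions. This is not the code in manager.py: the
names `build_ssh_command` and `stop_process` are hypothetical, and expanding
`~` in the key path before handing it to ssh is an assumption.

```python
import os
import subprocess


def build_ssh_command(host: str, remote_port: int, local_port: int,
                      ssh_user: str, ssh_key: str) -> list[str]:
    """Build the argv for: ssh -N -R remote_port:127.0.0.1:local_port -i ssh_key ssh_user@host."""
    return [
        "ssh", "-N",
        "-R", f"{remote_port}:127.0.0.1:{local_port}",
        "-i", os.path.expanduser(ssh_key),  # assumption: expand ~ ourselves
        f"{ssh_user}@{host}",
    ]


def stop_process(proc: subprocess.Popen, grace_seconds: float = 5) -> int:
    """Send SIGTERM, wait up to grace_seconds, then SIGKILL if the process is still running."""
    proc.terminate()
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()
```

A manager would start the tunnel with `subprocess.Popen(build_ssh_command(...))`
and call `stop_process` on that handle when asked to shut down.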


REPO STRUCTURE
--------------

    src/bridge/       Main source
    tests/            Test suite
    wiki/             PRD, FRS, OpsCatalog specification
    workplans/        Custodian State Hub workplan files (BRIDGE-WP-*)
    pyproject.toml    Build config and dependencies