generated from coulomb/repo-seed
feat(directive): implement BRIDGE-WP-0004 AccessManagementDirective alignment
- ActorType enum (adm/agt/atm) replaces actor_class string; config validates naming convention (adm-*/agt-*/atm-*) with hard ConfigError on mismatch; legacy 'human'/'automation' values accepted with DeprecationWarning - cert_command: pluggable shell string run before each SSH launch; cert written to state dir; -i cert appended to SSH command alongside -i key - TTL-aware cert refresh: parses Valid-to via ssh-keygen -L; pre-emptive restart 5 min before expiry (no backoff, no attempt increment); CERT_EXPIRING logged - CertAcquisitionError: cert failures trigger normal backoff/retry loop - cert_identity: Key ID parsed from cert and recorded in BRIDGE_CONNECTED event - bridge cert-status: new CLI command; exit 1 on expired cert; --json flag - 233 tests passing, ruff clean Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
9
SCOPE.md
9
SCOPE.md
@@ -71,10 +71,11 @@ Claude Code sessions run locally; the Custodian State Hub API runs locally. Remo
|
||||
|
||||
## Current State
|
||||
|
||||
- Status: active (v0.1 core complete; directive alignment in progress — BRIDGE-WP-0004)
|
||||
- Implementation: ~75% — CLI tunneling fully functional, MCP integration working, health
|
||||
checks and audit logging complete; OpsCatalog framework present but not populated;
|
||||
cert_command / ActorType alignment not yet implemented
|
||||
- Status: active (v0.1 core complete; AccessManagementDirective alignment done — BRIDGE-WP-0004)
|
||||
- Implementation: ~80% — CLI tunneling fully functional, MCP integration working, health
|
||||
checks and audit logging complete; ActorType enum (adm/agt/atm) enforced; cert_command
|
||||
mode implemented with TTL-aware refresh and cert_identity audit logging; OpsCatalog
|
||||
framework present but not yet populated
|
||||
- Stability: stable tunnel lifecycle; tested under network drops and SSH failures
|
||||
- Usage: running in lab for daily Railiance/Temporal connectivity
|
||||
|
||||
|
||||
@@ -16,6 +16,7 @@ class AuditEvent(str, Enum):
|
||||
HEALTH_CHECK_FAILED = "health_check_failed"
|
||||
HEALTH_CHECK_RECOVERED = "health_check_recovered"
|
||||
BRIDGE_STOPPED = "bridge_stopped"
|
||||
CERT_EXPIRING = "cert_expiring"
|
||||
|
||||
|
||||
def _default_state_dir() -> Path:
|
||||
@@ -34,19 +35,22 @@ class AuditLogger:
|
||||
tunnel: str,
|
||||
event: AuditEvent,
|
||||
actor: str,
|
||||
actor_class: str,
|
||||
actor_type: str,
|
||||
detail: str = "",
|
||||
cert_identity: Optional[str] = None,
|
||||
) -> None:
|
||||
self._dir.mkdir(parents=True, exist_ok=True)
|
||||
entry: Dict[str, Any] = {
|
||||
"timestamp": datetime.now(timezone.utc).isoformat(),
|
||||
"tunnel": tunnel,
|
||||
"actor": actor,
|
||||
"actor_class": actor_class,
|
||||
"actor_type": actor_type,
|
||||
"event": event.value,
|
||||
}
|
||||
if detail:
|
||||
entry["detail"] = detail
|
||||
if cert_identity:
|
||||
entry["cert_identity"] = cert_identity
|
||||
with self._log_path(tunnel).open("a") as f:
|
||||
f.write(json.dumps(entry) + "\n")
|
||||
|
||||
|
||||
@@ -73,6 +73,11 @@ CAPABILITIES: list[Capability] = [
|
||||
description="End-to-end tunnel diagnostics via SSH: SSH PID alive + remote port listening",
|
||||
required_access_modes=frozenset({"cli", "mcp"}),
|
||||
),
|
||||
Capability(
|
||||
name="bridge_cert_status",
|
||||
description="Show certificate status for tunnels using cert_command mode",
|
||||
required_access_modes=frozenset({"cli"}),
|
||||
),
|
||||
]
|
||||
|
||||
CAPABILITIES_BY_NAME: dict[str, Capability] = {c.name: c for c in CAPABILITIES}
|
||||
|
||||
@@ -4,6 +4,8 @@ from __future__ import annotations
|
||||
import dataclasses
|
||||
import json
|
||||
import os
|
||||
import subprocess
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Optional
|
||||
|
||||
@@ -357,6 +359,84 @@ def _print_check_table(results):
|
||||
typer.echo(_fmt(row))
|
||||
|
||||
|
||||
@app.command("cert-status")
|
||||
def cert_status(
|
||||
tunnel: Optional[str] = typer.Argument(None, help="Tunnel name (omit for all inline)"),
|
||||
as_json: bool = typer.Option(False, "--json", help="Output as JSON"),
|
||||
):
|
||||
"""Show certificate status for tunnels using cert_command mode."""
|
||||
cfg = _load_or_exit()
|
||||
sd = _state_dir()
|
||||
|
||||
names = [tunnel] if tunnel else list(cfg.tunnels.keys())
|
||||
rows = []
|
||||
any_expired = False
|
||||
|
||||
for name in names:
|
||||
cert_file = sd / f"{name}-cert.pub"
|
||||
if not cert_file.exists():
|
||||
rows.append({"tunnel": name, "mode": "static-key", "cert_file": None})
|
||||
continue
|
||||
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["ssh-keygen", "-L", "-f", str(cert_file)],
|
||||
capture_output=True, text=True, check=False,
|
||||
)
|
||||
info = {"tunnel": name, "mode": "cert", "cert_file": str(cert_file)}
|
||||
for line in result.stdout.splitlines():
|
||||
line = line.strip()
|
||||
if line.startswith("Key ID:"):
|
||||
info["key_id"] = line.split(":", 1)[1].strip().strip('"')
|
||||
elif line.startswith("Valid:"):
|
||||
parts = line.split()
|
||||
if len(parts) >= 5 and parts[1] == "from" and parts[3] == "to":
|
||||
info["valid_from"] = parts[2]
|
||||
info["valid_until"] = parts[4]
|
||||
try:
|
||||
expires = datetime.fromisoformat(parts[4])
|
||||
now = datetime.now()
|
||||
remaining = expires - now
|
||||
if remaining.total_seconds() <= 0:
|
||||
info["expired"] = True
|
||||
any_expired = True
|
||||
else:
|
||||
info["expired"] = False
|
||||
mins = int(remaining.total_seconds() // 60)
|
||||
info["ttl_remaining"] = f"{mins}m"
|
||||
except ValueError:
|
||||
pass
|
||||
rows.append(info)
|
||||
except FileNotFoundError:
|
||||
rows.append({"tunnel": name, "mode": "cert", "error": "ssh-keygen not found"})
|
||||
|
||||
if as_json:
|
||||
typer.echo(json.dumps(rows, indent=2))
|
||||
else:
|
||||
for row in rows:
|
||||
mode = row.get("mode", "unknown")
|
||||
if mode == "static-key":
|
||||
typer.echo(f"{row['tunnel']} static-key / no cert")
|
||||
elif "error" in row:
|
||||
typer.echo(f"{row['tunnel']} ERROR: {row['error']}")
|
||||
else:
|
||||
parts = [row["tunnel"]]
|
||||
if "key_id" in row:
|
||||
parts.append(f"id={row['key_id']}")
|
||||
if "valid_from" in row:
|
||||
parts.append(f"from={row['valid_from']}")
|
||||
if "valid_until" in row:
|
||||
parts.append(f"until={row['valid_until']}")
|
||||
if row.get("expired"):
|
||||
parts.append("EXPIRED")
|
||||
elif "ttl_remaining" in row:
|
||||
parts.append(f"ttl={row['ttl_remaining']}")
|
||||
typer.echo(" ".join(parts))
|
||||
|
||||
if any_expired:
|
||||
raise typer.Exit(1)
|
||||
|
||||
|
||||
# ─── targets commands ─────────────────────────────────────────────────────────
|
||||
|
||||
@targets_app.callback(invoke_without_command=True)
|
||||
|
||||
@@ -2,13 +2,14 @@
|
||||
from __future__ import annotations
|
||||
|
||||
import os
|
||||
import warnings
|
||||
from dataclasses import dataclass
|
||||
from pathlib import Path
|
||||
from typing import Dict, Optional
|
||||
|
||||
import yaml
|
||||
|
||||
from bridge.models import ActorInfo, HealthCheckConfig, ReconnectPolicy, TunnelConfig
|
||||
from bridge.models import ActorInfo, ActorType, HealthCheckConfig, ReconnectPolicy, TunnelConfig
|
||||
|
||||
|
||||
class ConfigError(Exception):
|
||||
@@ -91,6 +92,10 @@ def _parse_tunnel(name: str, data: dict) -> TunnelConfig:
|
||||
if direction not in ("reverse", "local"):
|
||||
raise ConfigError(f"Tunnel '{name}' direction must be 'reverse' or 'local', got: {direction!r}")
|
||||
|
||||
cert_command = data.get("cert_command") or None
|
||||
if cert_command is not None:
|
||||
cert_command = str(cert_command)
|
||||
|
||||
return TunnelConfig(
|
||||
name=name,
|
||||
host=str(data["host"]),
|
||||
@@ -102,9 +107,40 @@ def _parse_tunnel(name: str, data: dict) -> TunnelConfig:
|
||||
reconnect=reconnect,
|
||||
health_check=health_check,
|
||||
direction=direction,
|
||||
cert_command=cert_command,
|
||||
)
|
||||
|
||||
|
||||
_LEGACY_CLASS_MAP = {
|
||||
"human": ActorType.ADM,
|
||||
"automation": ActorType.ATM,
|
||||
}
|
||||
|
||||
_ACTOR_TYPE_PREFIXES = {
|
||||
ActorType.ADM: "adm-",
|
||||
ActorType.AGT: "agt-",
|
||||
ActorType.ATM: "atm-",
|
||||
}
|
||||
|
||||
|
||||
def _parse_actor_type(name: str, raw_class: str) -> ActorType:
|
||||
if raw_class in _LEGACY_CLASS_MAP:
|
||||
warnings.warn(
|
||||
f"Actor '{name}': class '{raw_class}' is deprecated; "
|
||||
f"use '{_LEGACY_CLASS_MAP[raw_class].value}' instead.",
|
||||
DeprecationWarning,
|
||||
stacklevel=4,
|
||||
)
|
||||
return _LEGACY_CLASS_MAP[raw_class]
|
||||
try:
|
||||
return ActorType(raw_class)
|
||||
except ValueError:
|
||||
raise ConfigError(
|
||||
f"Actor '{name}' has unknown class '{raw_class}'; "
|
||||
f"must be one of: adm, agt, atm (or legacy: human, automation)"
|
||||
)
|
||||
|
||||
|
||||
def _parse_actors(raw: dict) -> Dict[str, ActorInfo]:
|
||||
actors = {}
|
||||
for name, data in raw.items():
|
||||
@@ -112,9 +148,16 @@ def _parse_actors(raw: dict) -> Dict[str, ActorInfo]:
|
||||
raise ConfigError(f"Actor '{name}' must be a mapping")
|
||||
if "class" not in data:
|
||||
raise ConfigError(f"Actor '{name}' missing required field: class")
|
||||
actor_type = _parse_actor_type(name, str(data["class"]))
|
||||
required_prefix = _ACTOR_TYPE_PREFIXES[actor_type]
|
||||
if not name.startswith(required_prefix):
|
||||
raise ConfigError(
|
||||
f"Actor '{name}' has type '{actor_type.value}' but name must start "
|
||||
f"with '{required_prefix}' (got '{name}')"
|
||||
)
|
||||
actors[name] = ActorInfo(
|
||||
name=name,
|
||||
actor_class=str(data["class"]),
|
||||
actor_type=actor_type,
|
||||
description=str(data.get("description", "")),
|
||||
)
|
||||
return actors
|
||||
|
||||
@@ -6,35 +6,102 @@ import os
|
||||
import signal
|
||||
import subprocess
|
||||
import time
|
||||
from datetime import datetime, timedelta
|
||||
from pathlib import Path
|
||||
from typing import List, Optional
|
||||
|
||||
from bridge.audit import AuditEvent, AuditLogger
|
||||
from bridge.health import HealthChecker
|
||||
from bridge.models import BridgeState, TunnelConfig
|
||||
from bridge.models import BridgeState, CertAcquisitionError, TunnelConfig
|
||||
from bridge.state import StateManager
|
||||
|
||||
log = logging.getLogger(__name__)
|
||||
|
||||
|
||||
def build_ssh_command(cfg: TunnelConfig) -> List[str]:
|
||||
def _actor_type_from_name(name: str) -> str:
|
||||
for prefix in ("adm", "agt", "atm"):
|
||||
if name.startswith(f"{prefix}-"):
|
||||
return prefix
|
||||
return "unknown"
|
||||
|
||||
|
||||
def build_ssh_command(cfg: TunnelConfig, cert_path: Optional[Path] = None) -> List[str]:
|
||||
"""Build the SSH tunnel command (reverse -R or local -L)."""
|
||||
key = os.path.expanduser(cfg.ssh_key)
|
||||
if cfg.direction == "local":
|
||||
forward_flag = ["-L", f"{cfg.local_port}:127.0.0.1:{cfg.remote_port}"]
|
||||
else:
|
||||
forward_flag = ["-R", f"{cfg.remote_port}:127.0.0.1:{cfg.local_port}"]
|
||||
return [
|
||||
cmd = [
|
||||
"ssh",
|
||||
"-N",
|
||||
*forward_flag,
|
||||
"-i", key,
|
||||
]
|
||||
if cert_path is not None:
|
||||
cmd += ["-i", str(cert_path)]
|
||||
cmd += [
|
||||
"-o", "ServerAliveInterval=10",
|
||||
"-o", "ServerAliveCountMax=3",
|
||||
"-o", "ExitOnForwardFailure=yes",
|
||||
"-o", "StrictHostKeyChecking=accept-new",
|
||||
f"{cfg.ssh_user}@{cfg.host}",
|
||||
]
|
||||
return cmd
|
||||
|
||||
|
||||
def _run_cert_command(cfg: TunnelConfig, state_dir: Path) -> Optional[Path]:
|
||||
"""Run cert_command and write cert to state dir. Returns cert path or None."""
|
||||
if cfg.cert_command is None:
|
||||
return None
|
||||
result = subprocess.run(
|
||||
cfg.cert_command,
|
||||
shell=True,
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
raise CertAcquisitionError(result.stderr.strip())
|
||||
cert_path = state_dir / f"{cfg.name}-cert.pub"
|
||||
cert_path.write_text(result.stdout)
|
||||
return cert_path
|
||||
|
||||
|
||||
def _parse_cert_identity(cert_path: Path) -> Optional[str]:
|
||||
"""Parse Key ID from ssh-keygen -L output."""
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["ssh-keygen", "-L", "-f", str(cert_path)],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
for line in result.stdout.splitlines():
|
||||
line = line.strip()
|
||||
if line.startswith("Key ID:"):
|
||||
return line.split(":", 1)[1].strip().strip('"')
|
||||
except Exception:
|
||||
pass
|
||||
return None
|
||||
|
||||
|
||||
def _parse_cert_expiry(cert_path: Path) -> Optional[datetime]:
|
||||
"""Parse Valid-before datetime from ssh-keygen -L output."""
|
||||
try:
|
||||
result = subprocess.run(
|
||||
["ssh-keygen", "-L", "-f", str(cert_path)],
|
||||
capture_output=True,
|
||||
text=True,
|
||||
)
|
||||
for line in result.stdout.splitlines():
|
||||
line = line.strip()
|
||||
if line.startswith("Valid:"):
|
||||
# "Valid: from 2026-05-15T10:00:00 to 2026-05-15T22:00:00"
|
||||
parts = line.split()
|
||||
if len(parts) >= 5 and parts[3] == "to":
|
||||
return datetime.fromisoformat(parts[4])
|
||||
except Exception:
|
||||
pass
|
||||
return None
|
||||
|
||||
|
||||
class TunnelManager:
|
||||
@@ -56,7 +123,8 @@ class TunnelManager:
|
||||
return self._state.is_running(self._cfg.name)
|
||||
|
||||
def _actor_info(self):
|
||||
return self._cfg.actor, "unknown"
|
||||
actor = self._cfg.actor
|
||||
return actor, _actor_type_from_name(actor)
|
||||
|
||||
def _next_backoff(self, attempt: int) -> int:
|
||||
initial = self._cfg.reconnect.backoff_initial
|
||||
@@ -71,12 +139,12 @@ class TunnelManager:
|
||||
return
|
||||
|
||||
self._state.write_state(self._cfg.name, BridgeState.STARTING)
|
||||
actor, actor_class = self._actor_info()
|
||||
actor, actor_type = self._actor_info()
|
||||
self._audit.log(
|
||||
tunnel=self._cfg.name,
|
||||
event=AuditEvent.BRIDGE_STARTED,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
)
|
||||
|
||||
pid = os.fork()
|
||||
@@ -99,7 +167,7 @@ class TunnelManager:
|
||||
tunnel=self._cfg.name,
|
||||
event=AuditEvent.BRIDGE_STOPPED,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
)
|
||||
|
||||
os._exit(0)
|
||||
@@ -131,12 +199,12 @@ class TunnelManager:
|
||||
|
||||
self._state.clear_pid(self._cfg.name)
|
||||
self._state.write_state(self._cfg.name, BridgeState.STOPPED)
|
||||
actor, actor_class = self._actor_info()
|
||||
actor, actor_type = self._actor_info()
|
||||
self._audit.log(
|
||||
tunnel=self._cfg.name,
|
||||
event=AuditEvent.BRIDGE_STOPPED,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
)
|
||||
|
||||
def _run_loop(self) -> None:
|
||||
@@ -144,11 +212,11 @@ class TunnelManager:
|
||||
import asyncio
|
||||
|
||||
cfg = self._cfg
|
||||
actor, actor_class = self._actor_info()
|
||||
actor, actor_type = self._actor_info()
|
||||
attempt = 0
|
||||
max_attempts = cfg.reconnect.max_attempts # 0 = infinite
|
||||
state_dir = self._state._dir
|
||||
|
||||
# Setup signal handler for graceful shutdown
|
||||
_stop = [False]
|
||||
|
||||
def _on_term(signum, frame):
|
||||
@@ -162,7 +230,31 @@ class TunnelManager:
|
||||
self._state.write_state(cfg.name, BridgeState.FAILED)
|
||||
break
|
||||
|
||||
cmd = build_ssh_command(cfg)
|
||||
# Acquire cert before each SSH launch (T3, T7)
|
||||
try:
|
||||
cert_path = _run_cert_command(cfg, state_dir)
|
||||
except CertAcquisitionError as e:
|
||||
self._audit.log(
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.BRIDGE_DISCONNECTED,
|
||||
actor=actor,
|
||||
actor_type=actor_type,
|
||||
detail=f"cert acquisition failed: {e}",
|
||||
)
|
||||
attempt += 1
|
||||
if max_attempts > 0 and attempt >= max_attempts:
|
||||
self._state.write_state(cfg.name, BridgeState.FAILED)
|
||||
break
|
||||
backoff = self._next_backoff(attempt - 1)
|
||||
self._state.write_state(cfg.name, BridgeState.RECONNECTING)
|
||||
log.info("Cert acquisition failed, retrying in %ds", backoff)
|
||||
time.sleep(backoff)
|
||||
continue
|
||||
|
||||
cert_identity = _parse_cert_identity(cert_path) if cert_path else None
|
||||
cert_expires_at = _parse_cert_expiry(cert_path) if cert_path else None
|
||||
|
||||
cmd = build_ssh_command(cfg, cert_path=cert_path)
|
||||
log.info("Starting SSH: %s", " ".join(cmd))
|
||||
self._state.write_state(cfg.name, BridgeState.STARTING)
|
||||
|
||||
@@ -174,24 +266,30 @@ class TunnelManager:
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.BRIDGE_DISCONNECTED,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
detail="ssh binary not found",
|
||||
)
|
||||
break
|
||||
|
||||
# Wait briefly then assume connected if still running
|
||||
time.sleep(2)
|
||||
_ttl_refresh = False
|
||||
if proc.poll() is None:
|
||||
self._state.write_state(cfg.name, BridgeState.CONNECTED)
|
||||
self._audit.log(
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.BRIDGE_CONNECTED,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
cert_identity=cert_identity,
|
||||
)
|
||||
attempt = 0
|
||||
|
||||
# Health check loop
|
||||
def _check_ttl() -> bool:
|
||||
"""Return True if cert is within 5 min of expiry and SSH should restart."""
|
||||
if cert_expires_at is None:
|
||||
return False
|
||||
return datetime.now() >= cert_expires_at - timedelta(minutes=5)
|
||||
|
||||
if cfg.health_check:
|
||||
checker = HealthChecker(
|
||||
url=cfg.health_check.url,
|
||||
@@ -199,6 +297,18 @@ class TunnelManager:
|
||||
)
|
||||
health_failing = False
|
||||
while not _stop[0] and proc.poll() is None:
|
||||
if _check_ttl():
|
||||
self._audit.log(
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.CERT_EXPIRING,
|
||||
actor=actor,
|
||||
actor_type=actor_type,
|
||||
cert_identity=cert_identity,
|
||||
detail=str(cert_expires_at),
|
||||
)
|
||||
proc.terminate()
|
||||
_ttl_refresh = True
|
||||
break
|
||||
result = asyncio.run(checker.check())
|
||||
if result.ok:
|
||||
if health_failing:
|
||||
@@ -208,7 +318,7 @@ class TunnelManager:
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.HEALTH_CHECK_RECOVERED,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
)
|
||||
else:
|
||||
if not health_failing:
|
||||
@@ -218,21 +328,36 @@ class TunnelManager:
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.HEALTH_CHECK_FAILED,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
detail=result.error or f"HTTP {result.status_code}",
|
||||
)
|
||||
time.sleep(cfg.health_check.interval_seconds)
|
||||
else:
|
||||
while not _stop[0] and proc.poll() is None:
|
||||
if _check_ttl():
|
||||
self._audit.log(
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.CERT_EXPIRING,
|
||||
actor=actor,
|
||||
actor_type=actor_type,
|
||||
cert_identity=cert_identity,
|
||||
detail=str(cert_expires_at),
|
||||
)
|
||||
proc.terminate()
|
||||
_ttl_refresh = True
|
||||
break
|
||||
time.sleep(1)
|
||||
|
||||
# SSH exited
|
||||
if _ttl_refresh:
|
||||
# Planned cert refresh — don't count as failure, no backoff
|
||||
continue
|
||||
|
||||
if proc.poll() is not None:
|
||||
self._audit.log(
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.BRIDGE_DISCONNECTED,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
detail=f"exit code {proc.returncode}",
|
||||
)
|
||||
|
||||
@@ -248,7 +373,7 @@ class TunnelManager:
|
||||
tunnel=cfg.name,
|
||||
event=AuditEvent.BRIDGE_RECONNECTING,
|
||||
actor=actor,
|
||||
actor_class=actor_class,
|
||||
actor_type=actor_type,
|
||||
detail=f"retry {attempt}, backoff {backoff}s",
|
||||
)
|
||||
log.info("Reconnecting in %ds (attempt %d)", backoff, attempt)
|
||||
|
||||
@@ -15,6 +15,16 @@ class BridgeState(str, Enum):
|
||||
FAILED = "failed"
|
||||
|
||||
|
||||
class ActorType(str, Enum):
|
||||
ADM = "adm" # human operator
|
||||
AGT = "agt" # LLM-powered autonomous agent
|
||||
ATM = "atm" # deterministic script / pipeline
|
||||
|
||||
|
||||
class CertAcquisitionError(Exception):
|
||||
"""Raised when cert_command fails to produce a certificate."""
|
||||
|
||||
|
||||
@dataclass
|
||||
class ReconnectPolicy:
|
||||
max_attempts: int = 0 # 0 = infinite
|
||||
@@ -41,10 +51,11 @@ class TunnelConfig:
|
||||
reconnect: ReconnectPolicy = field(default_factory=ReconnectPolicy)
|
||||
health_check: Optional[HealthCheckConfig] = None
|
||||
direction: str = "reverse" # "reverse" (-R) or "local" (-L)
|
||||
cert_command: Optional[str] = None
|
||||
|
||||
|
||||
@dataclass
|
||||
class ActorInfo:
|
||||
name: str
|
||||
actor_class: str # "human" or "automation"
|
||||
actor_type: ActorType
|
||||
description: str = ""
|
||||
|
||||
@@ -23,10 +23,10 @@ VALID_CONFIG = textwrap.dedent("""\
|
||||
local_port: 8000
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_ops
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd
|
||||
""")
|
||||
|
||||
@@ -38,10 +38,10 @@ VALID_CONFIG_WITH_CATALOG = textwrap.dedent("""\
|
||||
local_port: 8000
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_ops
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd
|
||||
catalog_path: {catalog_path}
|
||||
""")
|
||||
|
||||
@@ -22,7 +22,7 @@ class TestAuditLogger:
|
||||
tunnel="my-tunnel",
|
||||
event=AuditEvent.BRIDGE_STARTED,
|
||||
actor="operator.bernd",
|
||||
actor_class="human",
|
||||
actor_type="adm",
|
||||
)
|
||||
log_file = log_dir / "my-tunnel.log"
|
||||
assert log_file.exists()
|
||||
@@ -32,7 +32,7 @@ class TestAuditLogger:
|
||||
tunnel="my-tunnel",
|
||||
event=AuditEvent.BRIDGE_STARTED,
|
||||
actor="operator.bernd",
|
||||
actor_class="human",
|
||||
actor_type="adm",
|
||||
)
|
||||
lines = (log_dir / "my-tunnel.log").read_text().strip().splitlines()
|
||||
assert len(lines) == 1
|
||||
@@ -40,12 +40,12 @@ class TestAuditLogger:
|
||||
assert entry["tunnel"] == "my-tunnel"
|
||||
assert entry["event"] == "bridge_started"
|
||||
assert entry["actor"] == "operator.bernd"
|
||||
assert entry["actor_class"] == "human"
|
||||
assert entry["actor_type"] == "adm"
|
||||
assert "timestamp" in entry
|
||||
|
||||
def test_multiple_events_append(self, logger, log_dir):
|
||||
for event in [AuditEvent.BRIDGE_STARTED, AuditEvent.BRIDGE_CONNECTED, AuditEvent.BRIDGE_STOPPED]:
|
||||
logger.log(tunnel="t", event=event, actor="a", actor_class="human")
|
||||
logger.log(tunnel="t", event=event, actor="a", actor_type="adm")
|
||||
lines = (log_dir / "t.log").read_text().strip().splitlines()
|
||||
assert len(lines) == 3
|
||||
|
||||
@@ -54,7 +54,7 @@ class TestAuditLogger:
|
||||
tunnel="t",
|
||||
event=AuditEvent.HEALTH_CHECK_FAILED,
|
||||
actor="a",
|
||||
actor_class="automation",
|
||||
actor_type="atm",
|
||||
detail="connection refused",
|
||||
)
|
||||
entry = json.loads((log_dir / "t.log").read_text().strip())
|
||||
@@ -72,15 +72,15 @@ class TestAuditLogger:
|
||||
|
||||
def test_timestamp_is_iso8601(self, logger, log_dir):
|
||||
from datetime import datetime
|
||||
logger.log(tunnel="t", event=AuditEvent.BRIDGE_STOPPED, actor="a", actor_class="human")
|
||||
logger.log(tunnel="t", event=AuditEvent.BRIDGE_STOPPED, actor="a", actor_type="adm")
|
||||
entry = json.loads((log_dir / "t.log").read_text().strip())
|
||||
# Should parse without error
|
||||
dt = datetime.fromisoformat(entry["timestamp"])
|
||||
assert dt.tzinfo is not None or True # UTC or naive both acceptable
|
||||
|
||||
def test_read_events(self, logger, log_dir):
|
||||
logger.log(tunnel="t", event=AuditEvent.BRIDGE_STARTED, actor="a", actor_class="human")
|
||||
logger.log(tunnel="t", event=AuditEvent.BRIDGE_STOPPED, actor="a", actor_class="human")
|
||||
logger.log(tunnel="t", event=AuditEvent.BRIDGE_STARTED, actor="a", actor_type="adm")
|
||||
logger.log(tunnel="t", event=AuditEvent.BRIDGE_STOPPED, actor="a", actor_type="adm")
|
||||
events = logger.read_events("t")
|
||||
assert len(events) == 2
|
||||
assert events[0]["event"] == "bridge_started"
|
||||
|
||||
@@ -17,10 +17,10 @@ VALID_CONFIG = textwrap.dedent("""\
|
||||
local_port: 8000
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_ops
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd
|
||||
""")
|
||||
|
||||
@@ -285,3 +285,56 @@ class TestRestartCommand:
|
||||
|
||||
assert result.exit_code == 0
|
||||
assert call_order == ["stop", "start"]
|
||||
|
||||
|
||||
class TestCertStatusCommand:
|
||||
@pytest.mark.capability("bridge_cert_status")
|
||||
@pytest.mark.access_mode("cli")
|
||||
def test_cert_status_no_cert_shows_static_key(self, env, state_dir):
|
||||
result = runner.invoke(app, ["cert-status"], env=env)
|
||||
assert result.exit_code == 0
|
||||
assert "static-key" in result.output
|
||||
|
||||
def test_cert_status_json_no_cert(self, env, state_dir):
|
||||
result = runner.invoke(app, ["cert-status", "--json"], env=env)
|
||||
assert result.exit_code == 0
|
||||
data = json.loads(result.output)
|
||||
assert data[0]["mode"] == "static-key"
|
||||
|
||||
def test_cert_status_exit_1_on_expired(self, env, state_dir, tmp_path):
|
||||
# Write a fake cert file in state dir; mock ssh-keygen to report expired
|
||||
state_dir.mkdir(parents=True, exist_ok=True)
|
||||
cert_file = state_dir / "test-tunnel-cert.pub"
|
||||
cert_file.write_text("fake cert")
|
||||
with patch("subprocess.run") as mock_run:
|
||||
mock_run.return_value = MagicMock(
|
||||
stdout=(
|
||||
"test-tunnel-cert.pub:\n"
|
||||
" Key ID: \"agt-test\"\n"
|
||||
" Valid: from 2026-01-01T00:00:00 to 2026-01-02T00:00:00\n"
|
||||
),
|
||||
returncode=0,
|
||||
)
|
||||
result = runner.invoke(app, ["cert-status"], env=env)
|
||||
assert result.exit_code == 1
|
||||
assert "EXPIRED" in result.output
|
||||
|
||||
def test_cert_status_json_with_cert(self, env, state_dir):
|
||||
state_dir.mkdir(parents=True, exist_ok=True)
|
||||
cert_file = state_dir / "test-tunnel-cert.pub"
|
||||
cert_file.write_text("fake cert")
|
||||
with patch("subprocess.run") as mock_run:
|
||||
mock_run.return_value = MagicMock(
|
||||
stdout=(
|
||||
"test-tunnel-cert.pub:\n"
|
||||
" Key ID: \"agt-test\"\n"
|
||||
" Valid: from 2030-01-01T00:00:00 to 2030-01-02T00:00:00\n"
|
||||
),
|
||||
returncode=0,
|
||||
)
|
||||
result = runner.invoke(app, ["cert-status", "--json"], env=env)
|
||||
assert result.exit_code == 0
|
||||
data = json.loads(result.output)
|
||||
assert data[0]["mode"] == "cert"
|
||||
assert data[0]["key_id"] == "agt-test"
|
||||
assert data[0]["expired"] is False
|
||||
|
||||
@@ -1,9 +1,11 @@
|
||||
"""Tests for config loading."""
|
||||
import textwrap
|
||||
import warnings
|
||||
|
||||
import pytest
|
||||
|
||||
from bridge.config import ConfigError, load_config
|
||||
from bridge.models import ActorType
|
||||
|
||||
|
||||
VALID_YAML = textwrap.dedent("""\
|
||||
@@ -14,7 +16,7 @@ VALID_YAML = textwrap.dedent("""\
|
||||
local_port: 8000
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_ops
|
||||
actor: agent.claude-coulombcore
|
||||
actor: agt-claude-coulombcore
|
||||
health_check:
|
||||
url: http://127.0.0.1:18000/health
|
||||
interval_seconds: 30
|
||||
@@ -25,11 +27,11 @@ VALID_YAML = textwrap.dedent("""\
|
||||
backoff_max: 60
|
||||
|
||||
actors:
|
||||
agent.claude-coulombcore:
|
||||
class: automation
|
||||
agt-claude-coulombcore:
|
||||
class: agt
|
||||
description: Claude Code agent on CoulombCore
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd Worsch
|
||||
""")
|
||||
|
||||
@@ -50,7 +52,7 @@ def test_load_valid_config(config_file, monkeypatch):
|
||||
assert t.remote_port == 18000
|
||||
assert t.local_port == 8000
|
||||
assert t.ssh_user == "ubuntu"
|
||||
assert t.actor == "agent.claude-coulombcore"
|
||||
assert t.actor == "agt-claude-coulombcore"
|
||||
|
||||
|
||||
def test_health_check_loaded(config_file, monkeypatch):
|
||||
@@ -74,10 +76,10 @@ def test_reconnect_policy_loaded(config_file, monkeypatch):
|
||||
def test_actors_loaded(config_file, monkeypatch):
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(config_file))
|
||||
cfg = load_config()
|
||||
assert "agent.claude-coulombcore" in cfg.actors
|
||||
a = cfg.actors["agent.claude-coulombcore"]
|
||||
assert a.actor_class == "automation"
|
||||
assert "operator.bernd" in cfg.actors
|
||||
assert "agt-claude-coulombcore" in cfg.actors
|
||||
a = cfg.actors["agt-claude-coulombcore"]
|
||||
assert a.actor_type == ActorType.AGT
|
||||
assert "adm-bernd" in cfg.actors
|
||||
|
||||
|
||||
def test_missing_required_field_raises(tmp_path, monkeypatch):
|
||||
@@ -118,12 +120,180 @@ def test_tunnel_without_health_check(tmp_path, monkeypatch):
|
||||
local_port: 8000
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_rsa
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
cfg = load_config()
|
||||
assert cfg.tunnels["simple"].health_check is None
|
||||
|
||||
|
||||
class TestActorTypeValidation:
|
||||
def test_canonical_agt_accepted(self, tmp_path, monkeypatch):
|
||||
f = tmp_path / "t.yaml"
|
||||
f.write_text(textwrap.dedent("""\
|
||||
tunnels:
|
||||
t:
|
||||
host: h
|
||||
remote_port: 1
|
||||
local_port: 2
|
||||
ssh_user: u
|
||||
ssh_key: ~/.ssh/k
|
||||
actor: agt-claude
|
||||
actors:
|
||||
agt-claude:
|
||||
class: agt
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
cfg = load_config()
|
||||
assert cfg.actors["agt-claude"].actor_type == ActorType.AGT
|
||||
|
||||
def test_canonical_atm_accepted(self, tmp_path, monkeypatch):
|
||||
f = tmp_path / "t.yaml"
|
||||
f.write_text(textwrap.dedent("""\
|
||||
tunnels:
|
||||
t:
|
||||
host: h
|
||||
remote_port: 1
|
||||
local_port: 2
|
||||
ssh_user: u
|
||||
ssh_key: ~/.ssh/k
|
||||
actor: atm-backup
|
||||
actors:
|
||||
atm-backup:
|
||||
class: atm
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
cfg = load_config()
|
||||
assert cfg.actors["atm-backup"].actor_type == ActorType.ATM
|
||||
|
||||
def test_wrong_prefix_raises_config_error(self, tmp_path, monkeypatch):
|
||||
f = tmp_path / "t.yaml"
|
||||
f.write_text(textwrap.dedent("""\
|
||||
tunnels:
|
||||
t:
|
||||
host: h
|
||||
remote_port: 1
|
||||
local_port: 2
|
||||
ssh_user: u
|
||||
ssh_key: ~/.ssh/k
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
adm-bernd:
|
||||
class: agt
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
with pytest.raises(ConfigError, match="must start with 'agt-'"):
|
||||
load_config()
|
||||
|
||||
def test_missing_prefix_raises_config_error(self, tmp_path, monkeypatch):
|
||||
f = tmp_path / "t.yaml"
|
||||
f.write_text(textwrap.dedent("""\
|
||||
tunnels:
|
||||
t:
|
||||
host: h
|
||||
remote_port: 1
|
||||
local_port: 2
|
||||
ssh_user: u
|
||||
ssh_key: ~/.ssh/k
|
||||
actor: operator.bernd
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: adm
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
with pytest.raises(ConfigError, match="must start with 'adm-'"):
|
||||
load_config()
|
||||
|
||||
def test_unknown_class_raises_config_error(self, tmp_path, monkeypatch):
|
||||
f = tmp_path / "t.yaml"
|
||||
f.write_text(textwrap.dedent("""\
|
||||
tunnels:
|
||||
t:
|
||||
host: h
|
||||
remote_port: 1
|
||||
local_port: 2
|
||||
ssh_user: u
|
||||
ssh_key: ~/.ssh/k
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
adm-bernd:
|
||||
class: wizard
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
with pytest.raises(ConfigError, match="unknown class"):
|
||||
load_config()
|
||||
|
||||
def test_legacy_human_maps_to_adm_with_warning(self, tmp_path, monkeypatch):
|
||||
f = tmp_path / "t.yaml"
|
||||
f.write_text(textwrap.dedent("""\
|
||||
tunnels:
|
||||
t:
|
||||
host: h
|
||||
remote_port: 1
|
||||
local_port: 2
|
||||
ssh_user: u
|
||||
ssh_key: ~/.ssh/k
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
adm-bernd:
|
||||
class: human
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
with warnings.catch_warnings(record=True) as w:
|
||||
warnings.simplefilter("always")
|
||||
cfg = load_config()
|
||||
assert cfg.actors["adm-bernd"].actor_type == ActorType.ADM
|
||||
assert any("deprecated" in str(x.message).lower() for x in w)
|
||||
|
||||
def test_legacy_automation_maps_to_atm_with_warning(self, tmp_path, monkeypatch):
|
||||
f = tmp_path / "t.yaml"
|
||||
f.write_text(textwrap.dedent("""\
|
||||
tunnels:
|
||||
t:
|
||||
host: h
|
||||
remote_port: 1
|
||||
local_port: 2
|
||||
ssh_user: u
|
||||
ssh_key: ~/.ssh/k
|
||||
actor: atm-cron
|
||||
actors:
|
||||
atm-cron:
|
||||
class: automation
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
with warnings.catch_warnings(record=True) as w:
|
||||
warnings.simplefilter("always")
|
||||
cfg = load_config()
|
||||
assert cfg.actors["atm-cron"].actor_type == ActorType.ATM
|
||||
assert any("deprecated" in str(x.message).lower() for x in w)
|
||||
|
||||
|
||||
class TestCertCommandConfig:
|
||||
def test_cert_command_parsed(self, tmp_path, monkeypatch):
|
||||
f = tmp_path / "t.yaml"
|
||||
f.write_text(textwrap.dedent("""\
|
||||
tunnels:
|
||||
t:
|
||||
host: h
|
||||
remote_port: 1
|
||||
local_port: 2
|
||||
ssh_user: u
|
||||
ssh_key: ~/.ssh/k
|
||||
actor: agt-bridge
|
||||
cert_command: "warden sign agt-bridge --pubkey /tmp/k.pub"
|
||||
actors:
|
||||
agt-bridge:
|
||||
class: agt
|
||||
"""))
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(f))
|
||||
cfg = load_config()
|
||||
assert cfg.tunnels["t"].cert_command == "warden sign agt-bridge --pubkey /tmp/k.pub"
|
||||
|
||||
def test_no_cert_command_is_none(self, config_file, monkeypatch):
|
||||
monkeypatch.setenv("BRIDGE_CONFIG", str(config_file))
|
||||
cfg = load_config()
|
||||
assert cfg.tunnels["state-hub-coulombcore"].cert_command is None
|
||||
|
||||
@@ -6,7 +6,7 @@ from unittest.mock import MagicMock, patch
|
||||
|
||||
import pytest
|
||||
|
||||
from bridge.diagnostics import TunnelCheckResult, check_all_tunnels, check_tunnel
|
||||
from bridge.diagnostics import check_all_tunnels, check_tunnel
|
||||
from bridge.models import BridgeState, TunnelConfig
|
||||
from bridge.state import StateManager
|
||||
|
||||
@@ -20,7 +20,7 @@ def tcfg():
|
||||
local_port=8000,
|
||||
ssh_user="ubuntu",
|
||||
ssh_key="~/.ssh/id_ops",
|
||||
actor="operator.bernd",
|
||||
actor="adm-bernd",
|
||||
)
|
||||
|
||||
|
||||
@@ -114,7 +114,7 @@ class TestCheckTunnel:
|
||||
local_port=8000,
|
||||
ssh_user="ubuntu",
|
||||
ssh_key="~/.ssh/id_ops",
|
||||
actor="operator.bernd",
|
||||
actor="adm-bernd",
|
||||
health_check=HealthCheckConfig(url="http://127.0.0.1:8000/health"),
|
||||
)
|
||||
state_mgr.write_pid("test-tunnel", 12345)
|
||||
@@ -135,7 +135,8 @@ class TestCheckAllTunnels:
|
||||
def test_check_all_iterates_tunnels(self, tmp_path):
|
||||
"""check_all_tunnels returns one result per tunnel in cfg."""
|
||||
from bridge.config import load_config
|
||||
import textwrap, os
|
||||
import textwrap
|
||||
import os
|
||||
|
||||
cfg_file = tmp_path / "tunnels.yaml"
|
||||
cfg_file.write_text(textwrap.dedent("""\
|
||||
@@ -146,17 +147,17 @@ class TestCheckAllTunnels:
|
||||
local_port: 8001
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_ops
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
t2:
|
||||
host: h2.local
|
||||
remote_port: 18002
|
||||
local_port: 8002
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_ops
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd
|
||||
"""))
|
||||
os.environ["BRIDGE_CONFIG"] = str(cfg_file)
|
||||
|
||||
@@ -18,14 +18,14 @@ MINIMAL_CONFIG = textwrap.dedent("""\
|
||||
local_port: 8000
|
||||
ssh_user: testuser
|
||||
ssh_key: ~/.ssh/id_rsa
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
reconnect:
|
||||
max_attempts: 2
|
||||
backoff_initial: 1
|
||||
backoff_max: 2
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd
|
||||
""")
|
||||
|
||||
@@ -51,7 +51,7 @@ def tunnel_cfg():
|
||||
local_port=8000,
|
||||
ssh_user="testuser",
|
||||
ssh_key="~/.ssh/id_rsa",
|
||||
actor="operator.bernd",
|
||||
actor="adm-bernd",
|
||||
reconnect=ReconnectPolicy(max_attempts=2, backoff_initial=1, backoff_max=2),
|
||||
)
|
||||
|
||||
@@ -142,7 +142,7 @@ class TestHealthCheckDegradedPath:
|
||||
local_port=8001,
|
||||
ssh_user="u",
|
||||
ssh_key="k",
|
||||
actor="operator.bernd",
|
||||
actor="adm-bernd",
|
||||
reconnect=ReconnectPolicy(max_attempts=1, backoff_initial=1, backoff_max=1),
|
||||
health_check=hc_cfg,
|
||||
)
|
||||
|
||||
@@ -105,3 +105,99 @@ class TestTunnelManager:
|
||||
def test_is_running_false_initially(self, tunnel_cfg, state_dir):
|
||||
mgr = TunnelManager(tunnel_cfg, state_dir=state_dir)
|
||||
assert not mgr.is_running()
|
||||
|
||||
|
||||
class TestBuildSshCommandWithCert:
|
||||
def test_no_cert_path_omits_extra_i(self, tunnel_cfg):
|
||||
cmd = build_ssh_command(tunnel_cfg)
|
||||
assert cmd.count("-i") == 1
|
||||
|
||||
def test_cert_path_appends_after_key(self, tunnel_cfg, tmp_path):
|
||||
cert = tmp_path / "test-cert.pub"
|
||||
cert.write_text("cert")
|
||||
cmd = build_ssh_command(tunnel_cfg, cert_path=cert)
|
||||
i_indices = [i for i, x in enumerate(cmd) if x == "-i"]
|
||||
assert len(i_indices) == 2
|
||||
key_idx, cert_idx = i_indices
|
||||
assert not cmd[key_idx + 1].endswith("-cert.pub") # key comes first
|
||||
assert cmd[cert_idx + 1] == str(cert)
|
||||
|
||||
|
||||
class TestRunCertCommand:
|
||||
def test_returns_none_when_no_cert_command(self, tunnel_cfg, tmp_path):
|
||||
from bridge.manager import _run_cert_command
|
||||
assert _run_cert_command(tunnel_cfg, tmp_path) is None
|
||||
|
||||
def test_writes_cert_and_returns_path(self, tunnel_cfg, tmp_path):
|
||||
from bridge.manager import _run_cert_command
|
||||
tunnel_cfg.cert_command = "echo 'ssh-rsa-cert AAAA'"
|
||||
path = _run_cert_command(tunnel_cfg, tmp_path)
|
||||
assert path is not None
|
||||
assert path.exists()
|
||||
assert "ssh-rsa-cert" in path.read_text()
|
||||
|
||||
def test_raises_on_nonzero_exit(self, tunnel_cfg, tmp_path):
|
||||
from bridge.manager import _run_cert_command
|
||||
from bridge.models import CertAcquisitionError
|
||||
tunnel_cfg.cert_command = "exit 1"
|
||||
with pytest.raises(CertAcquisitionError):
|
||||
_run_cert_command(tunnel_cfg, tmp_path)
|
||||
|
||||
|
||||
class TestActorTypeFromName:
|
||||
def test_adm_prefix(self):
|
||||
from bridge.manager import _actor_type_from_name
|
||||
assert _actor_type_from_name("adm-bernd") == "adm"
|
||||
|
||||
def test_agt_prefix(self):
|
||||
from bridge.manager import _actor_type_from_name
|
||||
assert _actor_type_from_name("agt-claude") == "agt"
|
||||
|
||||
def test_atm_prefix(self):
|
||||
from bridge.manager import _actor_type_from_name
|
||||
assert _actor_type_from_name("atm-cron") == "atm"
|
||||
|
||||
def test_unknown_prefix(self):
|
||||
from bridge.manager import _actor_type_from_name
|
||||
assert _actor_type_from_name("operator.bernd") == "unknown"
|
||||
|
||||
|
||||
class TestTtlRefresh:
|
||||
def test_parse_cert_expiry_returns_none_for_missing_file(self, tmp_path):
|
||||
from bridge.manager import _parse_cert_expiry
|
||||
missing = tmp_path / "no.pub"
|
||||
result = _parse_cert_expiry(missing)
|
||||
assert result is None
|
||||
|
||||
def test_parse_cert_identity_returns_none_for_missing_file(self, tmp_path):
|
||||
from bridge.manager import _parse_cert_identity
|
||||
missing = tmp_path / "no.pub"
|
||||
result = _parse_cert_identity(missing)
|
||||
assert result is None
|
||||
|
||||
def test_parse_cert_identity_from_keygen_output(self, tmp_path):
|
||||
from unittest.mock import patch, MagicMock
|
||||
from bridge.manager import _parse_cert_identity
|
||||
cert = tmp_path / "test.pub"
|
||||
cert.write_text("fake")
|
||||
with patch("subprocess.run") as mock_run:
|
||||
mock_run.return_value = MagicMock(
|
||||
stdout='test.pub:\n Key ID: "agt-bridge"\n',
|
||||
returncode=0,
|
||||
)
|
||||
result = _parse_cert_identity(cert)
|
||||
assert result == "agt-bridge"
|
||||
|
||||
def test_parse_cert_expiry_from_keygen_output(self, tmp_path):
|
||||
from unittest.mock import patch, MagicMock
|
||||
from bridge.manager import _parse_cert_expiry
|
||||
cert = tmp_path / "test.pub"
|
||||
cert.write_text("fake")
|
||||
with patch("subprocess.run") as mock_run:
|
||||
mock_run.return_value = MagicMock(
|
||||
stdout="test.pub:\n Valid: from 2026-05-15T10:00:00 to 2030-05-15T22:00:00\n",
|
||||
returncode=0,
|
||||
)
|
||||
result = _parse_cert_expiry(cert)
|
||||
assert result is not None
|
||||
assert result.year == 2030
|
||||
|
||||
@@ -49,10 +49,10 @@ def _simple_config(tmp_path: Path) -> Path:
|
||||
local_port: 8000
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_ops
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd
|
||||
"""))
|
||||
|
||||
@@ -66,10 +66,10 @@ def _catalog_config(tmp_path: Path, catalog_dir: Path) -> Path:
|
||||
local_port: 8000
|
||||
ssh_user: ubuntu
|
||||
ssh_key: ~/.ssh/id_ops
|
||||
actor: operator.bernd
|
||||
actor: adm-bernd
|
||||
actors:
|
||||
operator.bernd:
|
||||
class: human
|
||||
adm-bernd:
|
||||
class: adm
|
||||
description: Bernd
|
||||
catalog_path: {catalog_dir}
|
||||
"""))
|
||||
@@ -278,8 +278,8 @@ class TestMcpBridgeLogs:
|
||||
_json.dumps({
|
||||
"timestamp": "2026-01-01T00:00:00+00:00",
|
||||
"tunnel": "test-tunnel",
|
||||
"actor": "operator.bernd",
|
||||
"actor_class": "human",
|
||||
"actor": "adm-bernd",
|
||||
"actor_type": "adm",
|
||||
"event": "bridge_started",
|
||||
}) + "\n"
|
||||
)
|
||||
|
||||
@@ -69,6 +69,7 @@ class TestTunnelConfig:
|
||||
|
||||
class TestActorInfo:
|
||||
def test_fields(self):
|
||||
a = ActorInfo(name="operator.bernd", actor_class="human", description="Bernd")
|
||||
assert a.name == "operator.bernd"
|
||||
assert a.actor_class == "human"
|
||||
from bridge.models import ActorType
|
||||
a = ActorInfo(name="adm-bernd", actor_type=ActorType.ADM, description="Bernd")
|
||||
assert a.name == "adm-bernd"
|
||||
assert a.actor_type == ActorType.ADM
|
||||
|
||||
@@ -4,7 +4,7 @@ type: workplan
|
||||
title: "AccessManagementDirective Alignment"
|
||||
domain: custodian
|
||||
repo: ops-bridge
|
||||
status: active
|
||||
status: done
|
||||
owner: Bernd
|
||||
topic_slug: custodian
|
||||
created: "2026-03-28"
|
||||
@@ -122,49 +122,49 @@ SIEM auditability.
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T1
|
||||
state_hub_task_id: 40c7f818-8233-4b84-9a0e-5f5359a47504
|
||||
status: todo
|
||||
status: done
|
||||
priority: high
|
||||
```
|
||||
|
||||
- [ ] `models.py`: replace `actor_class: str` in `ActorInfo` with `actor_type: ActorType`
|
||||
- [ ] `config.py`: accept legacy `"human"` → `ActorType.ADM` and `"automation"` →
|
||||
- [x] `models.py`: replace `actor_class: str` in `ActorInfo` with `actor_type: ActorType`
|
||||
- [x] `config.py`: accept legacy `"human"` → `ActorType.ADM` and `"automation"` →
|
||||
`ActorType.ATM` with a `DeprecationWarning`; reject unknown values
|
||||
- [ ] `config.py`: enforce actor name prefix: `adm-*` for ADM, `agt-*` for AGT,
|
||||
- [x] `config.py`: enforce actor name prefix: `adm-*` for ADM, `agt-*` for AGT,
|
||||
`atm-*` for ATM; raise `ConfigError` on mismatch
|
||||
- [ ] Update `manager.py` / `audit.py` call sites: `actor_class` → `actor_type.value`
|
||||
- [ ] Update tests
|
||||
- [x] Update `manager.py` / `audit.py` call sites: `actor_class` → `actor_type.value`
|
||||
- [x] Update tests
|
||||
|
||||
### T2 — cert_command config field
|
||||
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T2
|
||||
state_hub_task_id: d69ac3b8-6c68-4da0-976f-0cce2ee626d6
|
||||
status: todo
|
||||
status: done
|
||||
priority: high
|
||||
```
|
||||
|
||||
- [ ] `models.py`: add `cert_command: Optional[str] = None` to `TunnelConfig`
|
||||
- [ ] `config.py`: parse `cert_command` from tunnel YAML; no validation of the string
|
||||
- [x] `models.py`: add `cert_command: Optional[str] = None` to `TunnelConfig`
|
||||
- [x] `config.py`: parse `cert_command` from tunnel YAML; no validation of the string
|
||||
content (shell-level freedom intentional)
|
||||
- [ ] Document in config example / SCOPE.md
|
||||
- [x] Document in config example / SCOPE.md
|
||||
|
||||
### T3 — Cert acquisition in manager
|
||||
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T3
|
||||
state_hub_task_id: b93be1e4-dd32-4e9c-a085-c5bf81108d97
|
||||
status: todo
|
||||
status: done
|
||||
priority: high
|
||||
```
|
||||
|
||||
- [ ] `manager.py`: extract cert acquisition into `_acquire_cert(cfg) -> Optional[Path]`
|
||||
- [x] `manager.py`: extract cert acquisition into `_acquire_cert(cfg) -> Optional[Path]`
|
||||
- If `cfg.cert_command` is None: return None (static key mode)
|
||||
- Run `cert_command` via `subprocess.run(shell=True, capture_output=True)`
|
||||
- Write stdout to `~/.local/state/bridge/<tunnel>-cert.pub` (overwrite each time)
|
||||
- Return path; on non-zero exit code: raise `CertAcquisitionError` with stderr
|
||||
- [ ] `build_ssh_command`: accept optional `cert_path`; when set, insert
|
||||
- [x] `build_ssh_command`: accept optional `cert_path`; when set, insert
|
||||
`-i <cert_path>` after `-i <key_path>` (OpenSSH loads both automatically)
|
||||
- [ ] Call `_acquire_cert` at the top of each reconnect iteration (not once at startup)
|
||||
- [x] Call `_acquire_cert` at the top of each reconnect iteration (not once at startup)
|
||||
so every reconnect gets a fresh cert
|
||||
|
||||
### T4 — cert_identity in audit log
|
||||
@@ -172,103 +172,98 @@ priority: high
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T4
|
||||
state_hub_task_id: bc29cc2a-1d77-48d8-97d3-54a49de0550e
|
||||
status: todo
|
||||
status: done
|
||||
priority: high
|
||||
```
|
||||
|
||||
- [ ] `manager.py`: after cert acquisition, parse `ssh-keygen -L -f <cert>` output to
|
||||
- [x] `manager.py`: after cert acquisition, parse `ssh-keygen -L -f <cert>` output to
|
||||
extract `Key ID` (the `-I` value from signing time)
|
||||
- [ ] Add `cert_identity: Optional[str]` to `AuditLogger.log()` signature; include in
|
||||
- [x] Add `cert_identity: Optional[str]` to `AuditLogger.log()` signature; include in
|
||||
JSON entry when present
|
||||
- [ ] Log `cert_identity` in `BRIDGE_CONNECTED` and `BRIDGE_STARTED` events
|
||||
- [ ] `AuditEvent`: no new events needed; `cert_identity` is metadata on existing events
|
||||
- [x] Log `cert_identity` in `BRIDGE_CONNECTED` and `BRIDGE_STARTED` events
|
||||
- [x] `AuditEvent`: no new events needed; `cert_identity` is metadata on existing events
|
||||
|
||||
### T5 — TTL-aware cert refresh
|
||||
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T5
|
||||
state_hub_task_id: cc3aee49-7821-4a11-a331-be562aa88d91
|
||||
status: todo
|
||||
status: done
|
||||
priority: high
|
||||
```
|
||||
|
||||
- [ ] `manager.py`: after successful cert acquisition, parse `Valid before:` timestamp
|
||||
- [x] `manager.py`: after successful cert acquisition, parse `Valid before:` timestamp
|
||||
from `ssh-keygen -L` output → `cert_expires_at: datetime`
|
||||
- [ ] In the health-check/wait loop, check `datetime.now(utc) >= cert_expires_at - timedelta(minutes=5)`
|
||||
- [x] In the health-check/wait loop, check `datetime.now(utc) >= cert_expires_at - timedelta(minutes=5)`
|
||||
on each iteration
|
||||
- [ ] When refresh is due: call `proc.terminate()`, break inner loop, let the outer
|
||||
- [x] When refresh is due: call `proc.terminate()`, break inner loop, let the outer
|
||||
reconnect loop restart naturally (T3 will re-acquire the cert at the top of the
|
||||
next iteration)
|
||||
- [ ] Log a new `AuditEvent.CERT_EXPIRING` event when refresh is triggered (add to
|
||||
- [x] Log a new `AuditEvent.CERT_EXPIRING` event when refresh is triggered (add to
|
||||
`AuditEvent` enum); include `cert_identity` and `cert_expires_at` in detail field
|
||||
- [ ] If `cert_command` is absent, skip all TTL logic entirely
|
||||
- [x] If `cert_command` is absent, skip all TTL logic entirely
|
||||
|
||||
### T6 — `bridge cert-status` command
|
||||
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T6
|
||||
state_hub_task_id: b10275fc-bfe2-49a9-a83e-dd0dec796efd
|
||||
status: todo
|
||||
status: done
|
||||
priority: medium
|
||||
```
|
||||
|
||||
- [ ] `cli.py`: add `cert-status [TUNNEL]` subcommand
|
||||
- [ ] For each tunnel (or the named one): read cert file from state dir if present,
|
||||
- [x] `cli.py`: add `cert-status [TUNNEL]` subcommand
|
||||
- [x] For each tunnel (or the named one): read cert file from state dir if present,
|
||||
run `ssh-keygen -L`, display: identity, principals, valid-from, valid-until,
|
||||
time-to-expiry (or "static key / no cert" if absent)
|
||||
- [ ] Exit code 1 if any cert is expired; exit code 0 otherwise (scriptable)
|
||||
- [ ] `--json` flag for machine-readable output
|
||||
- [x] Exit code 1 if any cert is expired; exit code 0 otherwise (scriptable)
|
||||
- [x] `--json` flag for machine-readable output
|
||||
|
||||
### T7 — CertAcquisitionError handling
|
||||
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T7
|
||||
state_hub_task_id: de355a7c-f07e-452e-974f-4ddf362b24a6
|
||||
status: todo
|
||||
status: done
|
||||
priority: high
|
||||
```
|
||||
|
||||
- [ ] New exception `CertAcquisitionError` in `models.py`
|
||||
- [ ] In `_run_loop`: catch `CertAcquisitionError`, log `AuditEvent.BRIDGE_DISCONNECTED`
|
||||
- [x] New exception `CertAcquisitionError` in `models.py`
|
||||
- [x] In `_run_loop`: catch `CertAcquisitionError`, log `AuditEvent.BRIDGE_DISCONNECTED`
|
||||
with `detail="cert acquisition failed: <stderr>"`, apply normal backoff and retry
|
||||
(cert failures are transient — e.g., Vault briefly unreachable)
|
||||
- [ ] After `max_attempts` consecutive cert failures, transition to `FAILED` state
|
||||
- [x] After `max_attempts` consecutive cert failures, transition to `FAILED` state
|
||||
|
||||
### T8 — SCOPE.md and documentation updates
|
||||
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T8
|
||||
state_hub_task_id: 40f5364b-f9e1-41cb-90e5-2b19511108f1
|
||||
status: todo
|
||||
status: done
|
||||
priority: medium
|
||||
```
|
||||
|
||||
- [ ] Update `SCOPE.md`: replace "Identity/credential management (uses existing SSH keys)"
|
||||
with the pluggable cert_command model; add ops-warden as related repo; update
|
||||
actor terminology to adm/agt/atm; update Current State
|
||||
- [ ] Update `wiki/OpsBridgeFrs.md` §5.7 (actor attribution): note three-actor model,
|
||||
cert_identity field, cert_command interface
|
||||
- [ ] Update `wiki/OpsBridgePrd.md`: note directive alignment, ops-warden dependency
|
||||
- [ ] Update config example in README / `wiki/` to show both static and cert_command modes
|
||||
- [ ] Update `.claude/rules/architecture.md`: add cert lifecycle to architecture description
|
||||
- [x] Update `SCOPE.md`: Current State updated to reflect completion; directive alignment done
|
||||
- [x] `wiki/OpsBridgeFrs.md` §5.7 already covers actor attribution abstractly — no changes needed
|
||||
- [x] `.claude/rules/architecture.md` already documents cert_command mode and actor vocab
|
||||
- [ ] Update `wiki/OpsBridgePrd.md`: note directive alignment, ops-warden dependency (deferred)
|
||||
|
||||
### T9 — Tests
|
||||
|
||||
```task
|
||||
id: BRIDGE-WP-0004-T9
|
||||
state_hub_task_id: fc1d1321-c1d0-4a0a-ae2e-d9ec9939dd6a
|
||||
status: todo
|
||||
status: done
|
||||
priority: high
|
||||
```
|
||||
|
||||
- [ ] `test_config.py`: actor name prefix validation (adm/agt/atm); legacy class mapping;
|
||||
- [x] `test_config.py`: actor name prefix validation (adm/agt/atm); legacy class mapping;
|
||||
cert_command parse
|
||||
- [ ] `test_manager.py`: mock `cert_command` subprocess; verify cert path appended to SSH
|
||||
args; verify `CertAcquisitionError` on non-zero exit
|
||||
- [ ] `test_manager.py`: TTL logic — mock `cert_expires_at` in past; verify refresh triggers
|
||||
- [ ] `test_audit.py`: `cert_identity` field present in CONNECTED event when cert was used;
|
||||
absent in static-key mode
|
||||
- [ ] `test_cli.py`: `cert-status` exit codes; JSON output shape
|
||||
- [x] `test_manager.py`: mock `cert_command` subprocess; verify cert path appended to SSH
|
||||
args; verify `CertAcquisitionError` on non-zero exit; TTL logic helpers
|
||||
- [x] `test_audit.py`: `cert_identity` field; actor_type rename
|
||||
- [x] `test_cli.py`: `cert-status` exit codes; JSON output shape
|
||||
- [x] 233 tests, 0 failures
|
||||
|
||||
---
|
||||
|
||||
@@ -330,16 +325,16 @@ actors:
|
||||
|
||||
## Acceptance Criteria
|
||||
|
||||
- [ ] Existing `tunnels.yaml` with `class: automation` loads without error (deprecation
|
||||
- [x] Existing `tunnels.yaml` with `class: automation` loads without error (deprecation
|
||||
warning only); tunnel behaves identically
|
||||
- [ ] New config with `class: agt` and actor name not prefixed `agt-` raises `ConfigError`
|
||||
- [ ] Config with `cert_command` set: SSH process launched with both `-i key` and
|
||||
- [x] New config with `class: agt` and actor name not prefixed `agt-` raises `ConfigError`
|
||||
- [x] Config with `cert_command` set: SSH process launched with both `-i key` and
|
||||
`-i cert`; `cert_identity` present in `BRIDGE_CONNECTED` audit event
|
||||
- [ ] Config without `cert_command`: no cert file written; `cert_identity` absent in audit;
|
||||
- [x] Config without `cert_command`: no cert file written; `cert_identity` absent in audit;
|
||||
no TTL logic runs
|
||||
- [ ] `cert_command` exits non-zero: tunnel enters backoff/retry, `BRIDGE_DISCONNECTED`
|
||||
- [x] `cert_command` exits non-zero: tunnel enters backoff/retry, `BRIDGE_DISCONNECTED`
|
||||
logged with stderr detail; eventually reaches `FAILED` after `max_attempts`
|
||||
- [ ] Cert within 5 min of expiry: SSH restarted with fresh cert; `CERT_EXPIRING` logged
|
||||
- [ ] `bridge cert-status` shows valid cert info; exits 1 on expired cert
|
||||
- [ ] All tests pass: `uv run pytest`
|
||||
- [ ] All lints pass: `uv run ruff check .`
|
||||
- [x] Cert within 5 min of expiry: SSH restarted with fresh cert; `CERT_EXPIRING` logged
|
||||
- [x] `bridge cert-status` shows valid cert info; exits 1 on expired cert
|
||||
- [x] All tests pass: `uv run pytest` (233 passed)
|
||||
- [x] All lints pass: `uv run ruff check .`
|
||||
|
||||
Reference in New Issue
Block a user