docs: add server prerequisites and health check gotchas
Document ClientAliveInterval/ClientAliveCountMax requirement on remote sshd to prevent stale sessions holding ports after reconnect. Document fail2ban ignoreip setup. Clarify that health_check.url must be a local port (not the remote forwarded port), and that SSE endpoints block the health checker.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
41
README.txt
@@ -229,6 +229,44 @@ Source layout:
catalog/ OpsCatalog extension


SERVER PREREQUISITES
--------------------

For reliable auto-reconnect after reboots or network drops, the remote sshd
needs two settings in /etc/ssh/sshd_config:

    ClientAliveInterval 30
    ClientAliveCountMax 3

Without these, a dead SSH session keeps its remote port forward bound (the
remote side has not yet detected that the client is gone, so the listening
socket is not released), and the next reconnect attempt hits
"remote port forwarding failed" and exits with code 255. With ClientAlive
enabled, sshd drops a session after 3 unanswered probes sent 30 seconds
apart, which evicts stale sessions within ~90 seconds and frees the port.

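The ~90 second figure is ClientAliveInterval multiplied by
ClientAliveCountMax. As a rough sketch, the window for a given config file
can be computed as shown here. The eviction_window helper is hypothetical
and not part of this repo; it assumes sshd's defaults (0 and 3) when a
keyword is absent, takes the first uncommented occurrence, and ignores
Include and Match blocks.

```shell
# Hypothetical helper: print the worst-case stale-session eviction time in
# seconds (ClientAliveInterval * ClientAliveCountMax) for an sshd_config file.
# $1: path to the config file
eviction_window() {
    awk '
        # Keywords are case-insensitive; commented lines never match because
        # their first field starts with "#".
        tolower($1) == "clientaliveinterval" && interval == "" { interval = $2 }
        tolower($1) == "clientalivecountmax" && count == ""    { count = $2 }
        END {
            if (interval == "") interval = 0   # sshd default: probes disabled
            if (count == "")    count = 3      # sshd default
            print interval * count
        }
    ' "$1"
}
```

Usage: eviction_window /etc/ssh/sshd_config. A result of 0 means probes
are disabled and stale sessions are not evicted by this mechanism.
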
Apply and reload (existing sessions are not disconnected). The sed patterns
match only the stock commented-out defaults; if your sshd_config differs,
edit the two lines by hand instead:

    sudo sed -i 's/#ClientAliveInterval 0/ClientAliveInterval 30/' /etc/ssh/sshd_config
    sudo sed -i 's/#ClientAliveCountMax 3/ClientAliveCountMax 3/' /etc/ssh/sshd_config
    sudo kill -HUP "$(cat /run/sshd.pid)"

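Where the stock defaults have already been edited or removed, the sed
one-liners above silently change nothing. One possible workaround, sketched
here under assumptions, is a helper that replaces a commented or
uncommented keyword line and appends the setting when no such line exists.
set_sshd_option is a hypothetical name, and the sketch relies on GNU sed
(-i, -E); try it on a copy of the file before touching the real one.

```shell
# Hypothetical helper: set "keyword value" in an sshd_config-style file.
# $1: config file, $2: keyword, $3: value
set_sshd_option() {
    file=$1
    key=$2
    value=$3
    # Matches "Keyword ..." and "#Keyword ..." at the start of a line.
    pattern="^[[:space:]]*#?${key}[[:space:]]"
    if grep -Eq "$pattern" "$file"; then
        # Rewrite every matching line to the desired setting.
        sed -i -E "s/${pattern}.*/${key} ${value}/" "$file"
    else
        # Keyword not present at all: append it.
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}
```

Usage, run as root against a copy first:
set_sshd_option sshd_config.copy ClientAliveInterval 30
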
If fail2ban is running on the remote, whitelist the bridge host's IP so that
rapid reconnect storms (e.g. after a key auth failure) do not trigger a ban.
Add that IP to ignoreip in /etc/fail2ban/jail.local:

    [DEFAULT]
    ignoreip = 127.0.0.1/8 ::1 <your-bridge-host-ip>

Then reload: sudo systemctl reload fail2ban

Note: health_check.url must point to a LOCAL port (the local side of the
tunnel), not the remote forwarded port. For a reverse tunnel
(remote_port=18000, local_port=8000), the correct health check URL is
http://127.0.0.1:8000/... and NOT http://127.0.0.1:18000/...
For SSE endpoints (MCP), point the check at a non-streaming endpoint of the
same service (e.g. the state-hub /state/health): the health checker waits
for the response to complete, and an SSE stream stays open.

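To check a candidate URL by hand before putting it in health_check.url,
something like the following may help. It is only an approximation of the
health checker using curl, not the checker itself; both function names and
the default /health path are hypothetical, so substitute a non-streaming
path your service actually exposes.

```shell
# Build the health check URL for the LOCAL side of the tunnel.
# $1: local_port, $2: endpoint path (hypothetical default: /health)
local_health_url() {
    printf 'http://127.0.0.1:%s%s\n' "$1" "${2:-/health}"
}

# Probe the URL with curl. --max-time bounds the wait, so a streaming
# endpoint shows up as a failure instead of hanging the check.
check_local_health() {
    curl --fail --silent --output /dev/null --max-time 5 \
        "$(local_health_url "$@")"
}
```

Usage: check_local_health 8000 /state/health && echo healthy. A timeout
against an SSE endpoint is the symptom described in the note above.
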

DESIGN NOTES
------------

@@ -240,6 +278,9 @@ DESIGN NOTES
  audit traceability (FRS §5.7).
- SSH command invoked: ssh -N -R remote_port:127.0.0.1:local_port
  -i ssh_key ssh_user@host
- ExitOnForwardFailure=yes is set, so SSH exits immediately if the remote
  port is already in use. This is intentional: it forces a clean reconnect
  rather than silently running without the port forward active.

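Putting the two SSH-related notes above together, the full command line
might look like the output of the sketch below. The tunnel_command helper
is hypothetical, and passing the option as -o ExitOnForwardFailure=yes on
the command line is an assumption; the notes only state that the option is
set, not how.

```shell
# Hypothetical helper: print the reverse-tunnel command for the given values.
# $1: remote_port, $2: local_port, $3: ssh_key, $4: ssh_user, $5: host
tunnel_command() {
    printf 'ssh -N -o ExitOnForwardFailure=yes -R %s:127.0.0.1:%s -i %s %s@%s\n' \
        "$1" "$2" "$3" "$4" "$5"
}
```

Usage: tunnel_command 18000 8000 ~/.ssh/bridge_key tunnel example.com
(the key path, user, and host here are placeholders).
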

REPO STRUCTURE