ops: establish ops/ directory with Gitea runbook and INC-001 incident report

- Create ops/runbooks/gitea-coulombcore.md — recovery checklist for Gitea
  on COULOMBCORE, documents containerd StartError pattern and CPU budget issue
- Create ops/incidents/2026-03-25-gitea-pgpool-crashloop.md — INC-001 post-mortem
  for 13-day Gitea outage (PGPool CrashLoopBackOff + rolling update CPU deadlock)
- Create ops/README.md — index for runbooks and incidents
- state-hub/dashboard/src/docs/connecting.md: add railiance01 tunnel config
  (was previously unsaved)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-25 11:30:44 +01:00
parent b3a44fb4f3
commit dff9806bb6

@@ -135,6 +135,38 @@ tunnels:
      max_attempts: 0
      backoff_initial: 5
      backoff_max: 60
  state-hub-railiance01:  # API tunnel
    host: 92.205.62.239
    remote_port: 18000
    local_port: 8000
    ssh_user: tegwick
    ssh_key: ~/.ssh/id_ops
    actor: agent.claude-railiance01
    health_check:
      url: http://127.0.0.1:8000/state/health
      interval_seconds: 30
      timeout_seconds: 5
    reconnect:
      max_attempts: 0
      backoff_initial: 5
      backoff_max: 60
  state-hub-mcp-railiance01:  # MCP SSE tunnel
    host: 92.205.62.239
    remote_port: 18001
    local_port: 8001
    ssh_user: tegwick
    ssh_key: ~/.ssh/id_ops
    actor: agent.claude-railiance01
    health_check:
      url: http://127.0.0.1:18001/sse
      interval_seconds: 30
      timeout_seconds: 5
    reconnect:
      max_attempts: 0
      backoff_initial: 5
      backoff_max: 60
```
ops-bridge source: `~/ops-bridge` · SSH key: `~/.ssh/id_ops`
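
The two railiance01 entries differ in one detail that may be worth checking. The API tunnel's health check targets its `local_port` (8000), while the MCP SSE tunnel's health check targets port 18001, which matches its `remote_port` rather than its `local_port` (8001). This commit does not say whether that is intentional. Below is a minimal sketch of a consistency check, assuming health checks are meant to probe the local end of each tunnel. The `TUNNELS` dict and the `mismatched_health_ports` helper are hypothetical illustrations and not part of ops-bridge.

```python
from urllib.parse import urlsplit

# Hand-copied subset of the two tunnel entries from the config above.
# Hypothetical in-memory form; the real config is stored as YAML.
TUNNELS = {
    "state-hub-railiance01": {
        "remote_port": 18000,
        "local_port": 8000,
        "health_check": {"url": "http://127.0.0.1:8000/state/health"},
    },
    "state-hub-mcp-railiance01": {
        "remote_port": 18001,
        "local_port": 8001,
        "health_check": {"url": "http://127.0.0.1:18001/sse"},
    },
}


def mismatched_health_ports(tunnels):
    """Return {name: (health_port, local_port)} for entries where they differ.

    Assumption: a health check should hit the tunnel's local_port.
    """
    mismatches = {}
    for name, cfg in tunnels.items():
        health_port = urlsplit(cfg["health_check"]["url"]).port
        if health_port != cfg["local_port"]:
            mismatches[name] = (health_port, cfg["local_port"])
    return mismatches


if __name__ == "__main__":
    for name, (health, local) in mismatched_health_ports(TUNNELS).items():
        print(f"{name}: health check uses port {health}, local_port is {local}")
```

If the mismatch turns out to be deliberate, the check can be skipped for that entry. Otherwise the health-check URL for the MCP tunnel is the line to revisit.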