feat(statehub): add offline write buffer relay
95
docs/offline-write-buffer.md
Normal file
@@ -0,0 +1,95 @@
# State Hub Offline Write Buffer

## Decision

State Hub supports outage buffering through an edge relay with a durable local
outbox, plus central idempotency on replayed writes.

The central service cannot buffer requests that never reach it. Agents should
therefore send writes to a local `statehub-edge` relay when buffering is enabled.
The relay forwards immediately while the upstream API is reachable. If the
upstream is offline, the relay persists queueable write envelopes in a local
SQLite outbox and returns an explicit queued receipt.

Queued receipts are pending evidence, not successful central commits. Operators
must inspect and replay the outbox after recovery.
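
The forward-or-queue behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the relay's actual code: the `outbox` table layout, the `relay_write` and `open_outbox` names, the receipt fields, and the injected `send` callable are all assumptions, and the sketch assumes the route has already been judged queueable.

```python
import json
import sqlite3
import uuid

# Hypothetical schema; the real outbox layout is not specified in this document.
SCHEMA = """
CREATE TABLE IF NOT EXISTS outbox (
    envelope_id     TEXT PRIMARY KEY,
    method          TEXT NOT NULL,
    path            TEXT NOT NULL,
    body            TEXT NOT NULL,
    idempotency_key TEXT NOT NULL,
    status          TEXT NOT NULL DEFAULT 'queued'
)
"""


def open_outbox(path=":memory:"):
    """Open (or create) the local SQLite outbox."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def relay_write(conn, method, path, body, send):
    """Forward a write upstream; if the upstream is unreachable, persist it.

    `send(method, path, body, key)` is a stand-in for the real HTTP call and is
    assumed to raise ConnectionError when the upstream is offline.
    """
    key = str(uuid.uuid4())  # reused later as Idempotency-Key on replay
    try:
        response = send(method, path, body, key)
    except ConnectionError:
        envelope_id = str(uuid.uuid4())
        with conn:  # commits the insert, or rolls back on error
            conn.execute(
                "INSERT INTO outbox (envelope_id, method, path, body, idempotency_key)"
                " VALUES (?, ?, ?, ?, ?)",
                (envelope_id, method, path, json.dumps(body), key),
            )
        # A queued receipt is pending evidence, not a central commit.
        return {"state": "queued", "envelope_id": envelope_id, "idempotency_key": key}
    return {"state": "forwarded", "response": response}
```

Passing `send` in as a parameter keeps the sketch testable without a network; a real relay would also scrub the body and check the route class before queueing.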

## Defaults

- Relay listen target: operator-selected, recommended `127.0.0.1:18080`.
- Upstream API: `STATEHUB_UPSTREAM_URL`, then `API_BASE`, then
  `http://127.0.0.1:8000`.
- Outbox path: `STATEHUB_OUTBOX_PATH`, default
  `~/.statehub/edge-outbox.sqlite3`.
- Central idempotency retention: 14 days.
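
As an illustration of the fallback order above, the lookups might be resolved like this. The function names are hypothetical, and treating an empty variable as unset is an assumption rather than documented behaviour.

```python
from pathlib import Path
from typing import Mapping


def resolve_upstream_url(env: Mapping[str, str]) -> str:
    """STATEHUB_UPSTREAM_URL, then API_BASE, then the local default."""
    # Assumption: an empty string counts as "not set".
    return (
        env.get("STATEHUB_UPSTREAM_URL")
        or env.get("API_BASE")
        or "http://127.0.0.1:8000"
    )


def resolve_outbox_path(env: Mapping[str, str]) -> Path:
    """STATEHUB_OUTBOX_PATH, else the default under the user's home directory."""
    raw = env.get("STATEHUB_OUTBOX_PATH") or "~/.statehub/edge-outbox.sqlite3"
    return Path(raw).expanduser()
```

In practice both functions would be called with `os.environ`; taking the mapping as a parameter makes the precedence easy to check in isolation.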

## Route Classes

### Append-Only, Queueable

| Method | Path | Notes |
| --- | --- | --- |
| POST | `/progress/` | Session-close progress events. |
| POST | `/messages/` | Agent coordination messages. |
| PATCH | `/messages/{id}/read` | Safe only when the message id is already known. |
| POST | `/token-events/` | Token accounting events. |
| POST | `/token-events/upsert` | Source-id based token upsert. |
| POST | `/decisions/` | Queue only when the caller does not need the generated id immediately. |

Append-only writes replay with `Idempotency-Key`. Exact duplicate retries return
the original central response. Same key with a different request returns HTTP
409.
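
The duplicate and mismatch rules above could be implemented centrally along these lines. This is a sketch under stated assumptions: the `IdempotencyStore` class, the in-memory dictionary, the SHA-256 request fingerprint, and the choice to treat an expired key as a fresh request are illustrative and not taken from the State Hub code.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=14)  # central idempotency retention from Defaults


class IdempotencyStore:
    """In-memory stand-in for the central idempotency table."""

    def __init__(self):
        self._records = {}  # key -> (fingerprint, response, stored_at)

    @staticmethod
    def fingerprint(method, path, body):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps([method, path, body], sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, key, method, path, body, commit, now=None):
        """Return (status, payload); `commit` performs the real write once."""
        now = now or datetime.now(timezone.utc)
        fp = self.fingerprint(method, path, body)
        record = self._records.get(key)
        if record is not None and now - record[2] <= RETENTION:
            if record[0] != fp:
                # Same key, different request.
                return 409, {"detail": "idempotency key reused with a different request"}
            return record[1]  # exact duplicate: original response, no second write
        response = commit(method, path, body)
        self._records[key] = (fp, response, now)
        return response
```

Fingerprinting the method, path, and body together is one way to decide whether a retry is an "exact duplicate"; the real service may compare requests differently.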

### Replace-Style, Queueable With Conflict Checks

| Method | Path | Notes |
| --- | --- | --- |
| PATCH | `/tasks/{id}` | Task status and metadata updates. |
| POST | `/tasks/bulk-status-sync` | Ordered batch; future coalescing may decompose by task. |
| PATCH | `/decisions/{id}` | Decision field update. |
| POST | `/decisions/{id}/resolve` | Decision resolution. |
| PATCH | `/workplans/{id}` | Workplan lifecycle/status updates. |
| PATCH | `/workstreams/{id}` | Legacy alias for workplan update. |

In v1 the relay does not silently overwrite newer central state after a replay
conflict. A 409 response marks the envelope as a conflict and leaves it available
for operator review.
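
A replay pass that honours this rule might look like the sketch below. The envelope dictionaries, the `replayed` status name, and the `send` callable (assumed to return the upstream HTTP status code) are hypothetical; only `queued` and `conflict` appear in this document.

```python
from collections import Counter


def replay(envelopes, send):
    """Replay queued envelopes in order and return a count per status."""
    for envelope in envelopes:
        if envelope["status"] != "queued":
            continue  # conflicts and finished envelopes are never retried here
        try:
            status_code = send(envelope)
        except ConnectionError:
            break  # upstream went away again; leave the rest queued
        if 200 <= status_code < 300:
            envelope["status"] = "replayed"  # hypothetical status name
        elif status_code == 409:
            # Do not overwrite newer central state; park for operator review.
            envelope["status"] = "conflict"
        else:
            # Assumption: any other status stays queued with the error recorded.
            envelope["last_error"] = f"HTTP {status_code}"
    return Counter(envelope["status"] for envelope in envelopes)
```

The returned counts contain no payload data, so they are the kind of non-secret summary that could go into a progress note after recovery.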

### Online-Only In V1

The relay forwards these while the upstream is reachable and returns a clear
503 during an outage:

- DELETE endpoints.
- Repository sync/import/ingest endpoints.
- Consistency sweep mutation endpoints.
- Fabric graph exports and external pulls.
- Schema/bootstrap/admin operations.
- Requests with credentials, authorization tokens, attachments, or large opaque
  payloads.
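
The three route classes can be expressed as a small lookup, sketched here. The class labels (`append_only`, `replace_style`, `online_only`) and the default-to-online-only rule for unlisted routes are assumptions; the method and path pairs are copied from the tables above.

```python
import re

APPEND_ONLY = [
    ("POST", "/progress/"),
    ("POST", "/messages/"),
    ("PATCH", "/messages/{id}/read"),
    ("POST", "/token-events/"),
    ("POST", "/token-events/upsert"),
    ("POST", "/decisions/"),
]
REPLACE_STYLE = [
    ("PATCH", "/tasks/{id}"),
    ("POST", "/tasks/bulk-status-sync"),
    ("PATCH", "/decisions/{id}"),
    ("POST", "/decisions/{id}/resolve"),
    ("PATCH", "/workplans/{id}"),
    ("PATCH", "/workstreams/{id}"),
]


def _compile(template):
    """Turn '/tasks/{id}' into a regex where each {placeholder} is one segment."""
    parts = re.split(r"(\{[^}]+\})", template)
    pattern = "".join("[^/]+" if p.startswith("{") else re.escape(p) for p in parts)
    return re.compile(pattern)


_RULES = [(m, _compile(t), "append_only") for m, t in APPEND_ONLY] + [
    (m, _compile(t), "replace_style") for m, t in REPLACE_STYLE
]


def classify(method, path):
    """Return the route class; anything not listed is treated as online-only."""
    for rule_method, pattern, route_class in _RULES:
        if method.upper() == rule_method and pattern.fullmatch(path):
            return route_class
    return "online_only"  # includes every DELETE endpoint
```

Using `fullmatch` keeps `/decisions/` (append-only) distinct from `/decisions/{id}/resolve` (replace-style) without depending on rule order.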

## Non-Secret Outbox Contract

The outbox stores method, path, scrubbed JSON body, route class, source metadata,
idempotency key, retry status, last error, and central response summaries. It
never stores authorization headers, bearer tokens, cookies, API keys, passwords,
or secret-looking JSON fields. Payloads over 64 KiB are rejected.
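
One way to enforce this contract before a body reaches the outbox is sketched below. The key pattern, the choice to drop matching fields rather than redact them, and measuring the limit on the scrubbed UTF-8 encoding are all assumptions; the actual scrubbing rules are not given in this document.

```python
import json
import re

MAX_BODY_BYTES = 64 * 1024  # 64 KiB limit from the contract above

# Hypothetical heuristic for "secret-looking" field names. Keys such as
# "input_tokens" or "token_count" are deliberately NOT matched, because token
# accounting events are queueable.
SECRET_KEY_PATTERN = re.compile(
    r"(authorization|secret|password|passwd|api[_-]?key|cookie|credential|(^|[_-])token$)",
    re.IGNORECASE,
)


def scrub(value):
    """Recursively drop dictionary entries whose key looks secret."""
    if isinstance(value, dict):
        return {k: scrub(v) for k, v in value.items() if not SECRET_KEY_PATTERN.search(k)}
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def prepare_body(body):
    """Scrub a JSON body and reject it if it is still over the size limit."""
    encoded = json.dumps(scrub(body), separators=(",", ":"))
    if len(encoded.encode("utf-8")) > MAX_BODY_BYTES:
        raise ValueError("payload exceeds 64 KiB and cannot be queued")
    return encoded
```

A name-based filter like this is only a heuristic; it complements, rather than replaces, the rule that requests carrying credentials are online-only.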

## Operator Commands

    statehub outbox status
    statehub outbox list --status queued
    statehub outbox replay --upstream-url http://127.0.0.1:8000
    statehub outbox export --output /tmp/statehub-outbox.json
    statehub outbox retry ENVELOPE_ID
    statehub outbox cancel ENVELOPE_ID

## Recovery Checklist

1. Confirm the central State Hub API is reachable.
2. Run `statehub outbox status` on each host that may have queued writes.
3. Run `statehub outbox replay` until no due queued envelopes remain.
4. Review conflict envelopes manually.
5. Run `make fix-consistency REPO=state-hub` so file-backed workplan/task state
   remains canonical after replay.
6. Record a progress note with non-secret replay counts.