State Hub Offline Write Buffer

Decision

State Hub supports outage buffering through an edge relay with a durable local outbox, plus central idempotency on replayed writes.

The central service cannot buffer requests that never reach it. Agents should therefore send writes to a local statehub-edge relay when buffering is enabled. The relay forwards immediately while the upstream API is reachable. If the upstream is offline, the relay persists queueable write envelopes in a local SQLite outbox and returns an explicit queued receipt.

A queued receipt is evidence that a write is pending, not confirmation of a successful central commit. Operators must inspect and replay the outbox after recovery.
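The forward-or-queue behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the relay's actual code: the table layout, the `relay_write` function, the receipt fields, and the `send` callable (assumed to raise `ConnectionError` when the upstream is offline) are all hypothetical. Generating the idempotency key inside the relay is likewise an assumption.

```python
import json
import sqlite3
import uuid

# Hypothetical outbox schema; the real relay's table layout may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS outbox (
    envelope_id     TEXT PRIMARY KEY,
    method          TEXT NOT NULL,
    path            TEXT NOT NULL,
    body            TEXT NOT NULL,
    idempotency_key TEXT NOT NULL,
    status          TEXT NOT NULL DEFAULT 'queued'
)
"""


def open_outbox(path=":memory:"):
    """Open (or create) the local SQLite outbox."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def relay_write(conn, send, method, path, body):
    """Forward a write; if the upstream is unreachable, persist it and
    return an explicit queued receipt instead of a central response."""
    key = str(uuid.uuid4())  # assumption: the relay mints the key
    try:
        response = send(method, path, body, key)
        return {"status": "forwarded", "response": response}
    except ConnectionError:
        envelope_id = str(uuid.uuid4())
        with conn:  # commit the envelope before acknowledging the caller
            conn.execute(
                "INSERT INTO outbox (envelope_id, method, path, body, idempotency_key) "
                "VALUES (?, ?, ?, ?, ?)",
                (envelope_id, method, path, json.dumps(body), key),
            )
        # The receipt says "pending", never "committed".
        return {"status": "queued", "envelope_id": envelope_id, "idempotency_key": key}
```

Committing the row before returning the receipt is what makes the receipt trustworthy: a caller never holds a receipt for an envelope that was lost in a crash.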

Defaults

  • Relay listen target: operator-selected, recommended 127.0.0.1:18080.
  • Upstream API: STATEHUB_UPSTREAM_URL, then API_BASE, then http://127.0.0.1:8000.
  • Outbox path: STATEHUB_OUTBOX_PATH, default ~/.statehub/edge-outbox.sqlite3.
  • Central idempotency retention: 14 days.
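The fallback order for the upstream URL and outbox path could be resolved along these lines. The function names are hypothetical, and treating an empty environment variable as unset is an assumption, not documented behaviour.

```python
import os
from pathlib import Path

DEFAULT_UPSTREAM = "http://127.0.0.1:8000"
DEFAULT_OUTBOX = "~/.statehub/edge-outbox.sqlite3"


def resolve_upstream_url(env=os.environ):
    """STATEHUB_UPSTREAM_URL wins, then API_BASE, then the built-in default."""
    # Assumption: an empty string counts as "not set".
    return env.get("STATEHUB_UPSTREAM_URL") or env.get("API_BASE") or DEFAULT_UPSTREAM


def resolve_outbox_path(env=os.environ):
    """STATEHUB_OUTBOX_PATH if set, else the default under the user's home."""
    return Path(env.get("STATEHUB_OUTBOX_PATH") or DEFAULT_OUTBOX).expanduser()
```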

Route Classes

Append-Only, Queueable

Method  Path                   Notes
POST    /progress/             Session-close progress events.
POST    /messages/             Agent coordination messages.
PATCH   /messages/{id}/read    Safe only when the message id is already known.
POST    /token-events/         Token accounting events.
POST    /token-events/upsert   Source-id based token upsert.
POST    /decisions/            Queue only when the caller does not need the generated id immediately.

Append-only writes are replayed with an Idempotency-Key. An exact duplicate retry returns the original central response. Reusing a key with a different request returns HTTP 409.
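One way the central side might implement these semantics is to store a fingerprint of the request next to each key. The sketch below is an in-memory stand-in under that assumption. The `IdempotencyStore` class, the fingerprint scheme, and the `commit` callable are hypothetical, and the 14-day retention is omitted for brevity.

```python
import hashlib
import json


class IdempotencyStore:
    """In-memory stand-in for a central idempotency table (hypothetical)."""

    def __init__(self):
        self._seen = {}  # key -> (request fingerprint, stored response)

    @staticmethod
    def fingerprint(method, path, body):
        # Canonical JSON so that key order in the body does not matter.
        canonical = json.dumps([method, path, body], sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, key, method, path, body, commit):
        fp = self.fingerprint(method, path, body)
        if key in self._seen:
            stored_fp, stored_response = self._seen[key]
            if stored_fp == fp:
                return stored_response  # exact duplicate: replay the original response
            return (409, {"detail": "idempotency key reused with a different request"})
        response = commit(method, path, body)  # first time: perform the write
        self._seen[key] = (fp, response)
        return response
```

Because the duplicate path returns the stored response without calling `commit` again, a relay can safely replay the same envelope more than once.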

Replace-Style, Queueable With Conflict Checks

Method  Path                       Notes
PATCH   /tasks/{id}                Task status and metadata updates.
POST    /tasks/bulk-status-sync    Ordered batch; future coalescing may decompose by task.
PATCH   /decisions/{id}            Decision field update.
POST    /decisions/{id}/resolve    Decision resolution.
PATCH   /workplans/{id}            Workplan lifecycle/status updates.
PATCH   /workstreams/{id}          Legacy alias for workplan update.

In v1 the relay does not silently overwrite newer central state after a replay conflict. A 409 response marks the envelope as conflict and leaves it available for operator review.
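The conflict rule can be illustrated with a small status transition. This is a sketch only: the `replay_envelope` function, the envelope dictionary shape, the `delivered` status name, and the `send` callable (assumed to return a status code and a response summary, or raise `ConnectionError`) are hypothetical. Only the `queued` and `conflict` states come from this document.

```python
def replay_envelope(envelope, send):
    """Replay one envelope and return an updated copy.

    A 409 is never retried or forced through; the envelope is parked as
    'conflict' so an operator can decide what to do with it.
    """
    updated = dict(envelope)  # leave the caller's envelope untouched
    try:
        status_code, summary = send(envelope)
    except ConnectionError as exc:
        updated["last_error"] = str(exc)  # still queued; try again later
        return updated
    updated["response_summary"] = summary
    if status_code == 409:
        updated["status"] = "conflict"  # held for operator review
    elif 200 <= status_code < 300:
        updated["status"] = "delivered"  # hypothetical status name
    else:
        updated["last_error"] = f"HTTP {status_code}"
    return updated
```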

Online-Only In V1

The relay forwards these while the upstream is reachable and returns a clear 503 during outage:

  • DELETE endpoints.
  • Repository sync/import/ingest endpoints.
  • Consistency sweep mutation endpoints.
  • Fabric graph exports and external pulls.
  • Schema/bootstrap/admin operations.
  • Requests with credentials, authorization tokens, attachments, or large opaque payloads.
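The three route classes could be encoded as a lookup table, as in the sketch below. The class names and the `classify` and `is_queueable` functions are hypothetical, and treating every unlisted route as online-only (failing closed) is an assumption. The content-based rules above (credentials, attachments, large payloads) are not modelled here.

```python
import re

APPEND_ONLY = "append_only"
REPLACE_STYLE = "replace_style"
ONLINE_ONLY = "online_only"

# (method, path pattern, class); {id} segments become [^/]+.
_ROUTES = [
    ("POST", r"/progress/", APPEND_ONLY),
    ("POST", r"/messages/", APPEND_ONLY),
    ("PATCH", r"/messages/[^/]+/read", APPEND_ONLY),
    ("POST", r"/token-events/", APPEND_ONLY),
    ("POST", r"/token-events/upsert", APPEND_ONLY),
    ("POST", r"/decisions/", APPEND_ONLY),
    ("PATCH", r"/tasks/[^/]+", REPLACE_STYLE),
    ("POST", r"/tasks/bulk-status-sync", REPLACE_STYLE),
    ("PATCH", r"/decisions/[^/]+", REPLACE_STYLE),
    ("POST", r"/decisions/[^/]+/resolve", REPLACE_STYLE),
    ("PATCH", r"/workplans/[^/]+", REPLACE_STYLE),
    ("PATCH", r"/workstreams/[^/]+", REPLACE_STYLE),
]


def classify(method, path):
    """Return the route class; anything not listed is online-only."""
    for route_method, pattern, route_class in _ROUTES:
        if method.upper() == route_method and re.fullmatch(pattern, path):
            return route_class
    return ONLINE_ONLY  # assumption: fail closed for unknown routes


def is_queueable(method, path):
    """True if the relay may buffer this write during an outage."""
    return classify(method, path) != ONLINE_ONLY
```

Under this sketch a `DELETE /tasks/42` during an outage is not queueable, which corresponds to the 503 case described above.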

Non-Secret Outbox Contract

The outbox stores method, path, scrubbed JSON body, route class, source metadata, idempotency key, retry status, last error, and central response summaries. It never stores authorization headers, bearer tokens, cookies, API keys, passwords, or secret-looking JSON fields. Payloads over 64 KiB are rejected.
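A scrubbing step consistent with this contract might look like the sketch below. The set of secret-looking field names, the `scrub` and `prepare_body` functions, and the choice to measure the 64 KiB limit on the scrubbed, serialized body are all assumptions. Field names are matched exactly rather than by substring so that token accounting fields (for example a hypothetical `tokens_in`) are not dropped by accident.

```python
import json

MAX_PAYLOAD_BYTES = 64 * 1024

# Hypothetical list of "secret-looking" field names (normalized form).
SECRET_FIELDS = {
    "authorization", "cookie", "set_cookie", "password", "passwd",
    "secret", "api_key", "apikey", "access_token", "refresh_token",
    "bearer_token",
}


class PayloadTooLarge(ValueError):
    """Raised when a body exceeds the outbox size limit."""


def _is_secret(name):
    return name.lower().replace("-", "_") in SECRET_FIELDS


def scrub(value):
    """Recursively drop secret-looking fields from a JSON-like value."""
    if isinstance(value, dict):
        return {k: scrub(v) for k, v in value.items() if not _is_secret(k)}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    return value


def prepare_body(body):
    """Scrub and serialize a body for the outbox, enforcing the size limit."""
    encoded = json.dumps(scrub(body), separators=(",", ":"))
    if len(encoded.encode("utf-8")) > MAX_PAYLOAD_BYTES:
        raise PayloadTooLarge("payload exceeds 64 KiB")
    return encoded
```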

Operator Commands

statehub outbox status
statehub outbox list --status queued
statehub outbox replay --upstream-url http://127.0.0.1:8000
statehub outbox export --output /tmp/statehub-outbox.json
statehub outbox retry ENVELOPE_ID
statehub outbox cancel ENVELOPE_ID

Recovery Checklist

  1. Confirm the central State Hub API is reachable.
  2. Run statehub outbox status on each host that may have queued writes.
  3. Run statehub outbox replay until no queued envelopes remain due for replay.
  4. Review conflict envelopes manually.
  5. Run make fix-consistency REPO=state-hub so file-backed workplan/task state remains canonical after replay.
  6. Record a progress note with non-secret replay counts.