tegwick 41ce4ede2c feat(dashboard): poll optimisation — T4, T5, T6

T4: workstreams.md and dependencies.md now call /state/deps instead of the
    full /state/summary — removes 2 heavy 10-table queries per 60 s cycle.

T5: index.md's 4 independent polling loops (summaryState, sbomSnapState,
    regsState, wsChartState) consolidated into a single pageState generator
    with one Promise.all batch and a shared backoff counter.

T6: config.js gains waitForVisible(ms) — pauses polling entirely while the
    tab is hidden and fires immediately on visibilitychange.  pollDelay()
    simplified (hidden-tab POLL_HIDDEN logic removed).  All 16 polling pages
    migrated from await sleep(pollDelay(...)) to await waitForVisible(pollDelay(...)).

CUST-WP-0039 complete — all 6 tasks done.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-05-11 17:58:18 +02:00

id	type	title	domain	status	owner	topic_slug	created	updated	state_hub_workstream_id
CUST-WP-0039	workplan	Dashboard Poll Optimization	custodian	done	custodian	custodian	2026-05-11	2026-05-11	d5ffb008-a517-4b8b-86ce-093fcc285fb3

Dashboard Poll Optimization

Problem

Now that the uvicorn --reload issue of watching .venv/ is fixed (CUST-WP-0039 precursor), the remaining sustained load on the API worker comes from the dashboard polling pattern:

24 pages, 14 with active polling loops (POLL_HEAVY = 60 s, POLL = 15 s)
index.md alone runs 4 independent polling loops firing 11 API calls per cycle across 8 endpoints: /state/summary, /sbom/snapshots/, /progress/, /workstreams/, /tasks/?limit=2000, /topics/, /repos/, /workstreams/workplan-index
workstreams.md and dependencies.md each call /state/summary (the most expensive endpoint — queries 10+ tables) every 60 s just to extract dependency edges from open_workstreams[].depends_on
Reference data (/topics/, /repos/) is fetched independently by 10+ pages every 60 s with no caching; these datasets change rarely
Background tabs still poll at 120 s (POLL_HIDDEN) — they could pause entirely

Goals

Reduce API request rate and per-request cost when the dashboard is open, without degrading UX or data freshness for the pages the user is actively viewing.

Out of scope

SSE / WebSocket push (would require significant API rework)
Observable data loaders / static build mode (different deployment model)
BroadcastChannel cross-tab sharing (nice-to-have, not in this workplan)

Tasks

T1 — Add Cache-Control headers to reference endpoints

id: CUST-WP-0039-T1
status: done
priority: high
state_hub_task_id: "b36713d8-d1d5-43c5-86c3-e22f72b68d62"

Add Cache-Control: max-age=60, stale-while-revalidate=30 to the list responses for /topics/, /repos/, and /domains/. These datasets change only when a human explicitly creates/renames a domain or registers a repo — never on their own.

Browser-level caching means that when 10 pages all fetch /topics/ within a 60 s window, only the first request hits the API; the rest are served from cache.

Implementation: Set the header on the list endpoints in api/routers/topics.py, repos.py, and domains.py, either through a FastAPI middleware or through a shared cache_headers dependency. Inside a route this means declaring a Response parameter (from fastapi.responses import Response) and assigning response.headers["Cache-Control"].
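The header rule above can be sketched without any framework. This is a minimal illustration, not the code in api/routers: the names cache_control_value, add_cache_headers, and REFERENCE_LIST_PATHS are hypothetical, and the exact-path match is an assumption about how the list endpoints are identified.

```python
from typing import Dict

# Hypothetical: the reference list endpoints named in this task.
REFERENCE_LIST_PATHS = {"/topics/", "/repos/", "/domains/"}


def cache_control_value(max_age: int = 60, stale_while_revalidate: int = 30) -> str:
    """Build the Cache-Control value described in T1."""
    return f"max-age={max_age}, stale-while-revalidate={stale_while_revalidate}"


def add_cache_headers(method: str, path: str, headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `headers`, adding Cache-Control only for
    GET requests to a reference list endpoint."""
    out = dict(headers)
    if method.upper() == "GET" and path in REFERENCE_LIST_PATHS:
        out["Cache-Control"] = cache_control_value()
    return out
```

In a FastAPI route or dependency, the same string would be assigned to response.headers["Cache-Control"]; write requests and non-reference endpoints are left untouched so they are never served from the browser cache.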

T2 — Add ETag support to high-frequency list endpoints

id: CUST-WP-0039-T2
status: done
priority: high
state_hub_task_id: "75f1c2cd-0baf-4747-8c67-1dbfa81bde41"

Add ETag (content hash of the response body) and handle If-None-Match for /workstreams/, /tasks/, and /state/summary. When the data hasn't changed the API returns 304 Not Modified with no body — roughly 95% smaller than a full response.

Implementation:

Add a FastAPI middleware (in api/main.py) that intercepts JSON list responses, computes md5(body), sets ETag: "<hash>", and returns 304 if the request carries a matching If-None-Match header.
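No client changes needed: revalidation is handled by the browser's HTTP cache rather than by fetch() itself. When the response includes Cache-Control: no-cache (which forces revalidation but allows 304) and fetch() is called with its default cache mode, the browser sends If-None-Match on the next request and, on a 304, hands the cached body back to the page as a normal 200 response.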
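The hash-and-compare step can be sketched as a pure function, independent of FastAPI. This is an illustration under assumptions: compute_etag and conditional_response are hypothetical names, and the real middleware in api/main.py may handle header parsing differently.

```python
import hashlib
from typing import Dict, Optional, Tuple


def compute_etag(body: bytes) -> str:
    """Quoted ETag derived from the MD5 hex digest of the response body."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def conditional_response(
    body: bytes, if_none_match: Optional[str]
) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, body) for a GET, honouring If-None-Match.

    A matching validator yields 304 with an empty body; anything else
    yields 200 with the full body. Both carry the current ETag.
    """
    etag = compute_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    if if_none_match is not None:
        # If-None-Match may list several validators separated by commas.
        candidates = {tag.strip() for tag in if_none_match.split(",")}
        # If-None-Match uses weak comparison, so ignore a W/ prefix.
        candidates = {t[2:] if t.startswith("W/") else t for t in candidates}
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body
```

Hashing requires the full response body to be built first, so this saves bandwidth and client parsing but not the database query itself.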

T3 — Add lightweight `/state/deps` endpoint

id: CUST-WP-0039-T3
status: done
priority: high
state_hub_task_id: "cb7608d3-5dad-4b51-9b91-080539f7aa65"

workstreams.md and dependencies.md call /state/summary (a ~10-table query) only to extract open_workstreams[].{id, depends_on, blocks}. Add a dedicated endpoint that returns just this:

GET /state/deps
→ [{"id": "...", "title": "...", "status": "...", "depends_on": [...], "blocks": [...]}]

Query: SELECT id, title, status FROM workstreams WHERE status IN ('active','blocked') plus the dependency join — roughly 1/10th the work of the full summary.

Implementation: New route in api/routers/state.py (or a new deps.py). Schema: WorkstreamDepStub already exists in api/schemas/workstream_dependency.py — reuse or extend it.
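As a rough sketch of the response shaping, assume the query yields workstream rows plus a list of dependency edges. The function name build_dep_stubs, the dict row shape, and the edge direction (workstream, dependency) are assumptions for illustration; the actual route would use the ORM models and WorkstreamDepStub.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Statuses treated as open, matching the WHERE clause in the task text.
OPEN_STATUSES = {"active", "blocked"}


def build_dep_stubs(
    workstreams: Iterable[Dict[str, str]],
    edges: Iterable[Tuple[str, str]],
) -> List[Dict[str, object]]:
    """Shape rows into the /state/deps payload.

    Each edge (ws_id, dep_id) is assumed to mean "ws_id depends on dep_id",
    so dep_id in turn blocks ws_id.
    """
    depends_on: Dict[str, List[str]] = defaultdict(list)
    blocks: Dict[str, List[str]] = defaultdict(list)
    for ws_id, dep_id in edges:
        depends_on[ws_id].append(dep_id)
        blocks[dep_id].append(ws_id)

    return [
        {
            "id": ws["id"],
            "title": ws["title"],
            "status": ws["status"],
            "depends_on": list(depends_on.get(ws["id"], [])),
            "blocks": list(blocks.get(ws["id"], [])),
        }
        for ws in workstreams
        if ws["status"] in OPEN_STATUSES
    ]
```

For example, with workstreams a (active), b (blocked), and c (done) and edges b→a and b→c, the result contains only a and b; a lists b under blocks, and b lists a and c under depends_on.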

T4 — Replace `/state/summary` in workstreams.md and dependencies.md

id: CUST-WP-0039-T4
status: done
priority: medium
depends_on: [CUST-WP-0039-T3]
state_hub_task_id: "b80dce9c-b1ef-4606-9460-5100d6f58bce"

Switch workstreams.md and dependencies.md to use the new /state/deps endpoint instead of the full /state/summary. Both pages construct a dep-edge map from open_workstreams[].depends_on; /state/deps provides exactly that.

Changes:

dashboard/src/workstreams.md: replace apiFetch("/state/summary", ...) with apiFetch("/state/deps"), update the variable extraction (openWs = depsData)
dashboard/src/dependencies.md: same substitution, update edge-building loop

T5 — Consolidate index.md's 4 polling loops into 1

id: CUST-WP-0039-T5
status: done
priority: medium
state_hub_task_id: "7c2d5e01-9de5-48ad-aa0b-a37cf5332ad9"

index.md runs 4 independent while(true) generators (summaryState, sbomSnapState, regsState, wsChartState) that each sleep 60 s independently. They were split because different sections needed different data, but they all use POLL_HEAVY and can be unified into a single loop with one Promise.all that fetches all 8 endpoints together.

Benefits:

4 timers → 1: simpler, predictable, backoff applies uniformly
Fetch batching: all 8 requests fire simultaneously, most finish within the same server round-trip window
Simpler failure handling: one failures counter, one backoff

Approach: single pageState generator that yields a flat object with all fields (summary, snapshots, milestones, wsAll). Destructure at the use sites.

T6 — Full visibility-based polling pause in config.js

id: CUST-WP-0039-T6
status: done
priority: low
state_hub_task_id: "31b6a353-040a-4f87-b2f1-1deab5cf6191"

pollDelay() currently extends the interval to POLL_HIDDEN = 120 s when the tab is hidden. Change this to pause polling entirely while hidden and resume immediately on visibilitychange.