the-custodian/workplans/CUST-WP-0039-dashboard-poll-optimization.md

---
id: CUST-WP-0039
type: workplan
title: "Dashboard Poll Optimization"
domain: custodian
status: done
owner: custodian
topic_slug: custodian
created: "2026-05-11"
updated: "2026-05-11"
state_hub_workstream_id: "d5ffb008-a517-4b8b-86ce-093fcc285fb3"
---

# Dashboard Poll Optimization

## Problem

With `uvicorn --reload` watching `.venv/` now fixed (CUST-WP-0039 precursor), the
remaining sustained load on the API worker comes from the dashboard polling pattern:

- **24 pages**, 14 with active polling loops (POLL_HEAVY = 60 s, POLL = 15 s)
- **`index.md` alone** runs 4 independent polling loops that fire 11 API calls per cycle across 8 endpoints:
  `/state/summary`, `/sbom/snapshots/`, `/progress/`, `/workstreams/`, `/tasks/?limit=2000`,
  `/topics/`, `/repos/`, `/workstreams/workplan-index`
- **`workstreams.md` and `dependencies.md`** each call `/state/summary` (the most
  expensive endpoint — queries 10+ tables) every 60 s just to extract dependency
  edges from `open_workstreams[].depends_on`
- **Reference data** (`/topics/`, `/repos/`) is fetched independently by 10+ pages
  every 60 s with no caching; these datasets change rarely
- **Background tabs** still poll at 120 s (`POLL_HIDDEN`) — they could pause entirely

## Goals

Reduce API request rate and per-request cost when the dashboard is open, without
degrading UX or data freshness for the pages the user is actively viewing.

## Out of scope

- SSE / WebSocket push (would require significant API rework)
- Observable data loaders / static build mode (different deployment model)
- BroadcastChannel cross-tab sharing (nice-to-have, not in this workplan)

---

## Tasks

### T1 — Add Cache-Control headers to reference endpoints

```task
id: CUST-WP-0039-T1
status: done
priority: high
state_hub_task_id: "b36713d8-d1d5-43c5-86c3-e22f72b68d62"
```

Add `Cache-Control: max-age=60, stale-while-revalidate=30` to the list responses
for `/topics/`, `/repos/`, and `/domains/`. These datasets change only when a human
explicitly creates/renames a domain or registers a repo — never on their own.

Browser-level caching means that when 10 pages all fetch `/topics/` within a 60 s
window, only the first request hits the API; the rest are served from cache.

**Implementation:** Add a FastAPI middleware, or a response-header dependency on the
list endpoints in `api/routers/topics.py`, `repos.py`, and `domains.py`. Either declare
a `Response` parameter (`from fastapi.responses import Response`) and set
`response.headers["Cache-Control"]`, or use a shared `cache_headers` dependency.

---

### T2 — Add ETag support to high-frequency list endpoints

```task
id: CUST-WP-0039-T2
status: done
priority: high
state_hub_task_id: "75f1c2cd-0baf-4747-8c67-1dbfa81bde41"
```

Add `ETag` (content hash of the response body) and handle `If-None-Match` for
`/workstreams/`, `/tasks/`, and `/state/summary`. When the data hasn't changed the
API returns `304 Not Modified` with no body — roughly 95% smaller than a full
response.

**Implementation:**
- Add a FastAPI middleware (in `api/main.py`) that intercepts JSON list responses,
  computes `md5(body)`, sets `ETag: "<hash>"`, and returns 304 if the request
  carries a matching `If-None-Match` header.
- No client changes needed: when the response includes `Cache-Control: no-cache`
  (which forces revalidation but allows 304), the browser's HTTP cache sends
  `If-None-Match` on the next `fetch()` and serves the cached body on a 304.
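
The matching step can be sketched independently of FastAPI. The snippet below is written in JavaScript only to match the dashboard code; the real middleware would live in `api/main.py`. The helper name `isNotModified` is hypothetical, and the comma split assumes tags are plain hex digests (as `md5(body)` would produce), so no commas appear inside a tag.

```js
// Hypothetical helper: decides whether a request can be answered with 304.
// `ifNoneMatch` is the raw request header value (or undefined); `etag` is the
// quoted tag computed for the current response body, e.g. '"9a0364b9"'.
function isNotModified(ifNoneMatch, etag) {
  if (!ifNoneMatch) return false;
  // "*" matches any current representation of the resource.
  if (ifNoneMatch.trim() === "*") return true;
  // If-None-Match uses weak comparison, so drop any W/ prefix before comparing.
  const strip = tag => tag.trim().replace(/^W\//, "");
  const current = strip(etag);
  // The header may carry several candidate tags separated by commas.
  return ifNoneMatch.split(",").some(candidate => strip(candidate) === current);
}
```

A matching header means the middleware can send `304 Not Modified` with an empty body; anything else falls through to the normal `200` response carrying the fresh `ETag`.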

---

### T3 — Add lightweight `/state/deps` endpoint

```task
id: CUST-WP-0039-T3
status: done
priority: high
state_hub_task_id: "cb7608d3-5dad-4b51-9b91-080539f7aa65"
```

`workstreams.md` and `dependencies.md` call `/state/summary` (a ~10-table query)
only to extract `open_workstreams[].{id, depends_on, blocks}`. Add a dedicated
endpoint that returns just this:

```text
GET /state/deps
→ [{"id": "...", "title": "...", "status": "...", "depends_on": [...], "blocks": [...]}]
```

Query: `SELECT id, title, status FROM workstreams WHERE status IN ('active','blocked')`
plus the dependency join — roughly 1/10th the work of the full summary.

**Implementation:** New route in `api/routers/state.py` (or a new `deps.py`).
Schema: `WorkstreamDepStub` already exists in `api/schemas/workstream_dependency.py`
— reuse or extend it.

---

### T4 — Replace `/state/summary` in workstreams.md and dependencies.md

```task
id: CUST-WP-0039-T4
status: done
priority: medium
depends_on: [CUST-WP-0039-T3]
state_hub_task_id: "b80dce9c-b1ef-4606-9460-5100d6f58bce"
```

Switch `workstreams.md` and `dependencies.md` to use the new `/state/deps` endpoint
instead of the full `/state/summary`. Both pages construct a dep-edge map from
`open_workstreams[].depends_on`; `/state/deps` provides exactly that.

Changes:
- `dashboard/src/workstreams.md`: replace `apiFetch("/state/summary", ...)` with
  `apiFetch("/state/deps")`, update the variable extraction (`openWs = depsData`)
- `dashboard/src/dependencies.md`: same substitution, update edge-building loop
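
As an illustration of the edge-building step, the sketch below turns a `/state/deps` response into a list of edges. The helper name `buildDepEdges`, the `{source, target}` edge shape, and the assumption that `depends_on` holds workstream id strings are all hypothetical; the actual pages may structure the map differently.

```js
// Hypothetical helper: flatten /state/deps stubs into directed edges.
// An edge {source, target} means "source depends on target".
function buildDepEdges(depsData) {
  const known = new Set(depsData.map(ws => ws.id));
  const edges = [];
  for (const ws of depsData) {
    for (const dep of ws.depends_on ?? []) {
      // Assumption: skip dependencies on workstreams missing from the
      // response (e.g. already completed), since there is no node to link to.
      if (known.has(dep)) edges.push({ source: ws.id, target: dep });
    }
  }
  return edges;
}
```

Because `/state/deps` returns the stubs as a top-level array, the helper takes `depsData` directly rather than reaching into `open_workstreams`.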

---

### T5 — Consolidate index.md's 4 polling loops into 1

```task
id: CUST-WP-0039-T5
status: done
priority: medium
state_hub_task_id: "7c2d5e01-9de5-48ad-aa0b-a37cf5332ad9"
```

`index.md` runs 4 independent `while(true)` generators (`summaryState`,
`sbomSnapState`, `regsState`, `wsChartState`) that each sleep 60 s independently.
They were split because different sections needed different data, but they all use
POLL_HEAVY and can be unified into a single loop with one `Promise.all` that fetches
all 8 endpoints together.

Benefits:
- 4 timers → 1: simpler, predictable, backoff applies uniformly
- Fetch batching: all 8 requests fire simultaneously, and most complete within the
  same server round-trip window
- Simpler failure handling: one `failures` counter, one backoff

Approach: single `pageState` generator that yields a flat object with all fields
(summary, snapshots, milestones, wsAll). Destructure at the use sites.
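
A minimal sketch of the unified loop, assuming `apiFetch(path)` resolves to parsed JSON and `wait(ms)` is the page's sleep helper (both are passed in so the snippet runs standalone). The field-to-endpoint mapping shows only three of the 8 endpoints, and the `maxCycles` parameter and doubling backoff are illustrative assumptions, not the dashboard's actual values.

```js
// Illustrative mapping of yielded field -> endpoint; the real page fetches 8.
const ENDPOINTS = {
  summary: "/state/summary",
  snapshots: "/sbom/snapshots/",
  wsAll: "/workstreams/",
};

async function* pageState(apiFetch, wait, base = 60_000, maxCycles = Infinity) {
  let failures = 0;
  for (let cycle = 0; cycle < maxCycles; cycle++) {
    try {
      const names = Object.keys(ENDPOINTS);
      // Fire every request at once; Promise.all rejects if any one fails.
      const results = await Promise.all(names.map(n => apiFetch(ENDPOINTS[n])));
      failures = 0;
      // Yield one flat object so use sites can destructure the fields they need.
      yield Object.fromEntries(names.map((n, i) => [n, results[i]]));
    } catch (err) {
      failures += 1; // one counter for the whole page
    }
    // One shared backoff: double the delay per consecutive failure, capped at 8x.
    await wait(base * Math.min(2 ** failures, 8));
  }
}
```

A use site would then read `const {summary, wsAll} = pageState;`. On a failed cycle nothing is yielded, so consumers keep the last good value while the loop backs off.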

---

### T6 — Full visibility-based polling pause in config.js

```task
id: CUST-WP-0039-T6
status: done
priority: low
state_hub_task_id: "31b6a353-040a-4f87-b2f1-1deab5cf6191"
```

`pollDelay()` currently extends the interval to `POLL_HIDDEN = 120 s` when the tab
is hidden. Change this to pause polling entirely while hidden and resume immediately
on `visibilitychange`.

**Implementation:**

```js
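// config.js: replaces pollDelay(). Assumes sleep(ms) is already defined in this module.
export async function waitForVisible(base) {
  // Non-browser context or visible tab: wait the normal interval.
  if (typeof document === "undefined") return sleep(base);
  if (document.visibilityState === "visible") return sleep(base);
  // Hidden tab: block until the tab is visible again, then resolve immediately.
  return new Promise(resolve => {
    const handler = () => {
      if (document.visibilityState !== "visible") return; // ignore other transitions
      document.removeEventListener("visibilitychange", handler);
      resolve();
    };
    document.addEventListener("visibilitychange", handler);
  });
}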
```

Pages replace `await sleep(pollDelay(...))` with `await waitForVisible(base)`.
When the user switches back to the tab, the next poll fires immediately rather
than waiting up to 120 s for the `POLL_HIDDEN` interval to expire.

---

## Expected impact

| Change | Expected effect |
|--------|-----------------|
| T1 (cache headers) | ~70% drop in `/topics/`, `/repos/`, `/domains/` hits |
| T2 (ETags) | ~80% payload reduction for unchanged list responses |
| T3+T4 (deps endpoint) | 2 full summary calls removed per 60 s cycle |
| T5 (consolidate index) | 4 loops → 1; less timer jitter and staggered load |
| T6 (visibility pause) | Eliminates background-tab traffic |