citation-evidence/docs/decisions/ADR-0005-persistence.md
tegwick d54daf2e61 Implement CE-WP-0002 T03-T09: ingest, anchor resolution, engine, UI, persistence, e2e
Completes the PDF review slice end-to-end. After this commit a user can
open a fixture, select text, save an evidence item with commentary, see
it in the sidebar, reload the page, click the item, and the viewer
scrolls to the passage.

- T03 src/source/pdf/{fingerprint,extract,ingest}.ts + 39 fixture tests
  - SHA-256 fingerprint over a fresh ArrayBuffer (TS BufferSource-safe)
  - PDF.js text extract; per-page normalize then join with "\n\n"
  - PageMap + OffsetMap (gap-free coverage); pageLength = end - start
  - Updated manifest's Betriebskosten quote to one PDF.js extracts cleanly
- T04 src/anchor/selectors/{create,resolve}.ts + 25 unit + 7 fixture tests
  - createSelectors emits the maximal redundant set (TextQuote +
    TextPosition + PdfRect + PdfPageText when available)
  - resolveSelectors implements the SharedContracts §7 ladder; confidence
    1.0 (pos+quote) → 0.7 (rect-only) → 0 (unresolved)
  - Cross-module integration test moved to tests/integration/ to honor
    the anchor↛source boundary lint rule
- T05 engine: sync event bus over the closed §4 vocabulary, Map-backed
  repos, services, createEngine() composition root, 12 tests
- T06 work + app: three-pane shell (CollectionList | ViewerShell |
  EvidenceSidebar) wired through EngineProvider; EngineContext lives in
  src/work/ to respect the work↛app boundary; SpikeApp deleted
- T07 AnnotationToolbar: pendingSelection in context; Save runs
  createSelectors → engine.annotations.create → engine.evidence.create
- T08 click-to-reopen + localStorage persistence
  - scrollToAnnotation state in context with a version counter so a
    second click on the same item re-fires the viewer scroll
  - captureSnapshot/restoreSnapshot/attachPersister/restoreFromStorage;
    restore bypasses services to avoid event-loops
  - active-document id persisted alongside the snapshot so reload lands
    on the same fixture; ADR-0005 written
  - 9 persistence tests
- T09 tests/integration/app-prd-scenario.dom.test.tsx
  - end-to-end happy-dom test of PRD scenario steps 1-8 through the real
    React tree; viewer + ingest mocked per ADR-0004's headless-Chromium
    limitation. Fixed memo-deps bug in EvidenceSidebar/ViewerShell where
    useEngineEventTick values were not included in the useMemo deps,
    leaving stale memoization across event-driven re-renders
- vitest.config.ts: happy-dom for *.dom.test.{ts,tsx} files
- noEmit added to tsconfig so tsc -b doesn't litter src/ with .js outputs

Gates: typecheck ✓ lint ✓ test 109/109 across 11 files ✓ build ✓

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 10:58:11 +02:00


# ADR-0005 — Persistence for the MVP slice
- Status: accepted (provisional — durable storage owned by a later workplan)
- Date: 2026-05-25
- Workplan: CE-WP-0002-T08 (click-to-reopen requires reload-survival)
## Context
CE-WP-0002 needs the click-to-reopen flow to survive a page reload (PRD
scenario step 4 → "even after a full page reload"). The full persistence
design (SQLite local-first vs Postgres server-first) is too large to land
inside this slice — `wiki/ArchitectureOverview.md` §10 lays out the bigger
picture but the workplan explicitly defers the decision.
The engine already runs `Map`-backed in-memory repositories
(`src/engine/repos/in-memory.ts`). To survive reloads we need *some*
persistence boundary now, without committing to the long-term store.
## Options
- **A. localStorage snapshot (this ADR).** The SPA serializes the entire
engine state into a single JSON blob on every mutation and restores it
on mount. No new dependencies; no schema migrations; no networking.
Scoped to one origin in one browser profile; no cross-tab or cross-device
sync.
- **B. IndexedDB-backed store.** More headroom, more API surface, async
reads. Needed eventually for binary blobs (PDF bytes) but overkill for
the few hundred annotations the MVP produces.
- **C. SQLite via `sql.js` or `wa-sqlite`.** Brings query semantics into
the browser. Heavy for the MVP and entangles us with a database we may
not keep.
- **D. Server-backed persistence from day one.** Requires shipping a
backend. Premature.
## Decision
Adopt **A: localStorage snapshot**, deliberately temporary.
Implementation lives in `src/engine/persistence.ts`:
- `captureSnapshot(engine)` returns
`{ documents, representations, annotations, evidenceItems }`.
- `attachPersister(engine, { key })` subscribes to every mutating engine
event and writes a fresh snapshot to `localStorage` after each.
- `restoreFromStorage(engine, { key })` reads the snapshot on app mount
and hydrates the repos *directly* (bypassing service `create()` calls)
so no spurious `*Created` events fire — the persister would otherwise
loop on its own writes, and other UI listeners would see "the same
annotation was created again" on every reload.
- Snapshot is versioned (`SNAPSHOT_VERSION = 1`); a version mismatch
throws on restore so a future schema bump is loud.
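
The interplay of the three functions above can be sketched as follows. This is a minimal illustration, not the contents of `src/engine/persistence.ts`: `ToyEngine`, `Entity`, `KeyValueStore`, the `{ version, snapshot }` envelope, and the `storage` option are all assumptions made for the example, and the toy persister listens to every event rather than only the mutating ones.

```typescript
// Hypothetical shapes: the real entity and engine types are not shown in this ADR.
type Entity = { id: string };
type RepoName = "documents" | "representations" | "annotations" | "evidenceItems";
type Snapshot = Record<RepoName, Entity[]>;

const REPO_NAMES: RepoName[] = ["documents", "representations", "annotations", "evidenceItems"];
const SNAPSHOT_VERSION = 1;

// The subset of the Web Storage API this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Toy stand-in for the engine: Map-backed repos plus a synchronous event bus.
class ToyEngine {
  readonly repos: Record<RepoName, Map<string, Entity>> = {
    documents: new Map(),
    representations: new Map(),
    annotations: new Map(),
    evidenceItems: new Map(),
  };
  private listeners: Array<(eventName: string) => void> = [];

  subscribe(listener: (eventName: string) => void): void {
    this.listeners.push(listener);
  }

  // Service-level create: writes to the repo and then emits an event.
  createAnnotation(annotation: Entity): void {
    this.repos.annotations.set(annotation.id, annotation);
    this.listeners.forEach((listener) => listener("AnnotationCreated"));
  }
}

function captureSnapshot(engine: ToyEngine): Snapshot {
  const snapshot = {} as Snapshot;
  for (const name of REPO_NAMES) {
    snapshot[name] = Array.from(engine.repos[name].values());
  }
  return snapshot;
}

// Writes a fresh snapshot after every event the toy engine emits.
function attachPersister(engine: ToyEngine, opts: { key: string; storage: KeyValueStore }): void {
  engine.subscribe(() => {
    const envelope = { version: SNAPSHOT_VERSION, snapshot: captureSnapshot(engine) };
    opts.storage.setItem(opts.key, JSON.stringify(envelope));
  });
}

// Returns false when nothing is stored; throws when the stored version differs.
function restoreFromStorage(engine: ToyEngine, opts: { key: string; storage: KeyValueStore }): boolean {
  const raw = opts.storage.getItem(opts.key);
  if (raw === null) return false;
  const envelope = JSON.parse(raw) as { version: number; snapshot: Snapshot };
  if (envelope.version !== SNAPSHOT_VERSION) {
    throw new Error(`Snapshot version ${envelope.version} does not match ${SNAPSHOT_VERSION}`);
  }
  for (const name of REPO_NAMES) {
    // Direct repo write: no service call, so no event fires and the persister stays idle.
    for (const entity of envelope.snapshot[name]) {
      engine.repos[name].set(entity.id, entity);
    }
  }
  return true;
}
```

Because hydration writes to the `Map`s directly, a listener attached before the restore sees no events, which is the property the bullet on `restoreFromStorage` depends on.
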

`src/work/EngineContext.tsx`'s `EngineProvider` wires this on first mount.
A sibling localStorage key holds the last-active `documentId` so reload
lands the user back on the same fixture.
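
The sibling key can be handled with two small helpers. The sketch below is an assumption about how this might look: the key name `ce.activeDocumentId`, both function names, and the guard against ids that no longer exist are hypothetical and not taken from `EngineContext.tsx`.

```typescript
// The subset of the Web Storage API this sketch relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical key name; the real key is not given in this ADR.
const ACTIVE_DOCUMENT_KEY = "ce.activeDocumentId";

// Called whenever the user switches to another document.
function rememberActiveDocument(storage: KeyValueStore, documentId: string): void {
  storage.setItem(ACTIVE_DOCUMENT_KEY, documentId);
}

// Called on mount. Returns null when nothing is stored or the stored id
// no longer refers to a known document (for example, a removed fixture).
function recallActiveDocument(
  storage: KeyValueStore,
  knownDocumentIds: ReadonlySet<string>,
): string | null {
  const stored = storage.getItem(ACTIVE_DOCUMENT_KEY);
  return stored !== null && knownDocumentIds.has(stored) ? stored : null;
}
```

Keeping the id outside the snapshot means a UI-only change such as switching documents does not need to rewrite the whole engine blob.
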
## Why this is acceptable for the MVP
- The engine never holds PDF bytes — only metadata + selectors + commentary.
A typical session is well under 1 MB even with hundreds of annotations,
comfortably within the roughly 5 MB per-origin localStorage quota that
browsers typically allow.
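
A back-of-the-envelope check of the size claim, as a sketch only. The `SampleAnnotation` shape and the field lengths (200-character quote, 300-character commentary) are invented for illustration, and the estimate pessimistically counts two bytes per UTF-16 code unit, since how browsers account for stored strings is implementation-specific.

```typescript
// Hypothetical annotation shape; the real selector and commentary fields
// are not shown in this ADR.
interface SampleAnnotation {
  id: string;
  quote: string;
  commentary: string;
  position: { start: number; end: number };
  page: number;
}

// The "~5 MB" figure from the text, expressed in bytes.
const LOCAL_STORAGE_BUDGET_BYTES = 5 * 1024 * 1024;

// Builds `count` synthetic annotations with fixed-length text fields.
function makeSampleAnnotations(count: number): SampleAnnotation[] {
  const annotations: SampleAnnotation[] = [];
  for (let i = 0; i < count; i += 1) {
    annotations.push({
      id: `ann-${i}`,
      quote: "q".repeat(200),
      commentary: "c".repeat(300),
      position: { start: i * 200, end: i * 200 + 200 },
      page: 1 + (i % 20),
    });
  }
  return annotations;
}

// Pessimistic estimate: two bytes per UTF-16 code unit of the serialized JSON.
function estimateStoredBytes(value: unknown): number {
  return JSON.stringify(value).length * 2;
}

const estimatedBytes = estimateStoredBytes({
  version: 1,
  annotations: makeSampleAnnotations(500),
});
const budgetFraction = estimatedBytes / LOCAL_STORAGE_BUDGET_BYTES;
```

Under these assumptions, 500 annotations stay below 1 MB, consistent with the estimate above; real selectors may be larger, so the numbers are indicative only.
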
- The repositories' `create()` signatures already match the shape an
eventual durable repo would expose; swapping the implementation is a
localised change.
- "Survives reload" is the only persistence requirement of CE-WP-0002.
Cross-device sync, multi-user access, query-by-tag, history — none are
in scope yet.
## What this defers
- A real persistence ADR (SQLite local-first vs Postgres server-first vs
IndexedDB) for CE-WP-0005+ work.
- PDF byte persistence. Today the SPA re-fetches `/fixtures/pdfs/*` on
load; bytes do not enter the snapshot.
- Multi-tab consistency. Tabs see each other's writes only on reload.
- Migrations beyond the version check.
## Consequences
- `src/engine/persistence.ts` is the single point of contact for storage.
When the real durable-store ADR lands, that module is what changes.
- Tests inject a memory-Storage shim into `attachPersister` /
`restoreFromStorage` so they don't depend on a browser environment
(see `src/engine/persistence.test.ts`).
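
A memory-backed shim of the kind described above might look like the following. This is a sketch rather than the helper used in `persistence.test.ts`; the class name `MemoryStorage` is an assumption, and it mirrors the method set of the Web Storage `Storage` interface without depending on DOM typings.

```typescript
// In-memory stand-in for the Web Storage API, suitable for injecting into
// code that would otherwise receive window.localStorage.
class MemoryStorage {
  private readonly entries = new Map<string, string>();

  get length(): number {
    return this.entries.size;
  }

  // Returns the name of the nth key in insertion order, or null if out of range.
  key(index: number): string | null {
    return Array.from(this.entries.keys())[index] ?? null;
  }

  // Returns null for missing keys; an empty string is a valid stored value.
  getItem(key: string): string | null {
    return this.entries.get(key) ?? null;
  }

  // Web Storage stores strings only, so coerce defensively.
  setItem(key: string, value: string): void {
    this.entries.set(key, String(value));
  }

  removeItem(key: string): void {
    this.entries.delete(key);
  }

  clear(): void {
    this.entries.clear();
  }
}
```

Each test can construct its own `MemoryStorage`, so state never leaks between tests and no browser-like environment is required.
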
- Clearing the user's browser storage destroys all annotations — call
this out in the README once the MVP ships.