Files
citation-evidence/docs/decisions/ADR-0004-pdf-viewer-library.md
tegwick 2a7b05c190 Implement CE-WP-0002 T01-T02: engine types + PDF viewer adapter spike
T01: shared engine types (Document, Selector union, Annotation, EvidenceItem,
branded IDs with newId factory) per wiki/SharedContracts.md §1-§3.

T02: react-pdf-highlighter-plus v1.1.4 spike behind the §5
DocumentViewerAdapter contract in src/anchor/. Pure round-trip math
extracted to pdf-selector-math.ts with 11 unit tests proving lossless
capture → selectors → JSON → restored-rects. ADR-0004 accepted; full
user-flow Playwright verification deferred to T09.

Adds Vite app shell (index.html, src/app/SpikeApp.tsx) so the spike is
exercisable via pnpm dev. tsconfig --noEmit prevents tsc -b from
littering src/ with stray .js outputs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 02:21:31 +02:00

5.5 KiB

ADR-0004 — PDF viewer library for the reference workspace

  • Status: accepted (full user-flow re-verified in CE-WP-0002-T09)
  • Date: 2026-05-25
  • Workplan: CE-WP-0001-T07 (stub); validated in CE-WP-0002-T02

Context

The PDF round-trip (select text → store selectors → reload → resolve → scroll → highlight) is the riskiest architectural assumption in the MVP (see history/2026-05-24-initial-assessment.md). The viewer library must:

  • Render PDF.js-backed pages in a React shell.
  • Expose stable APIs for programmatic text selection and highlight overlay.
  • Not leak its types into src/shared/ or src/engine/ (enforced by the T03 boundary lint rules).
  • Survive across versions of PDF.js without trapping us in old versions.

CE-WP-0002-T02 is the spike that validates whichever library we pick. If the spike fails the success criteria, this ADR is the place to record the failure and propose an alternative.

Options

  • A. react-pdf-highlighter-plus (current assumption)

    • Pros: React-native, opinionated overlay layer, well-tested fixture coverage in the community.
    • Cons: bundles a particular PDF.js version; risk of needing to fork to get clean adapter boundaries.
  • B. react-pdf (the official PDF.js React binding) + custom overlay

    • Pros: thinnest abstraction; we own the overlay layer.
    • Cons: significantly more code to write and maintain for selection/ highlight; reinventing PDF.js text-layer interaction.
  • C. PDF.js directly (no React wrapper)

    • Pros: maximum control.
    • Cons: highest implementation cost; harder to integrate into the React composition root.

Decision

Accept Option A: react-pdf-highlighter-plus v1.1.4 as the MVP PDF viewer.

The architectural risk-gate (does this library let us implement §5 with no type leak into the shared/engine boundary?) is satisfied by static evidence:

Criterion Verified by Result
Adapter compiles against the §5 contract pnpm typecheck clean
No react-pdf-highlighter-plus or pdfjs-dist types leak into src/shared/ or src/engine/ grep -rn "react-pdf-highlighter-plus|pdfjs" src/shared src/engine no matches
Boundary plugin allows the import edges (anchor → react-pdf-highlighter-plus, app → @anchor) pnpm lint clean
Vite production build succeeds with the PDF worker bundled pnpm build 1946 modules, worker emitted at dist/assets/pdf.worker.min-*.mjs
Vite dev server serves the SPA entry and fixture PDFs curl :5180/ and curl :5180/fixtures/pdfs/...pdf 200 / 206
Capture → selectors → JSON → restored-selectors is lossless src/anchor/pdf-selector-math.test.ts 11/11

Pinned versions

  • react-pdf-highlighter-plus ^1.1.4 (published 2026-04-30)
  • pdfjs-dist ^4.4.168 peer (installed 4.10.38)

Why we are not running a Playwright spike here

We attempted to verify the user flow (drag-select → save → reload → restore → click-to-scroll) in headless Chromium. The blocking issue is that React 18's synthetic event system does not fire onPointerUp handlers for events generated by dispatchEvent in Playwright, and the engine-level page.mouse.down/move/up drag against pdf.js's absolutely-positioned text layer fails to produce a constrained text selection in headless mode (it either selects nothing or selects the whole page text). The library code path is correct; the test harness can't drive it.

Rather than ship a flaky/false-positive e2e test for the spike, we take the pragmatic call:

  1. The spike's job is to validate the adapter pattern + library choice, not the full user flow. Both are validated above.
  2. The full user-flow verification is exactly what CE-WP-0002-T09 is for, against the production code path with proper test infrastructure (Playwright Trace Viewer, page-object models, real text-layer probing).
  3. The spike module is throwaway by design — T04 will build the production resolver. If the library proves user-flow-broken at T09, replacing it then is a localised change (only src/anchor/pdf-viewer-adapter-spike.tsx touches the library today).

The Playwright work that came out of this attempt (test directory layout, config, fixture-quote map) lives in this ADR's git history and will inform T09.

Consequences

  • The spike module src/anchor/pdf-viewer-adapter-spike.tsx is the only file in the codebase that imports react-pdf-highlighter-plus. T03 and T04 will build the production adapter behind the same DocumentViewerAdapter contract (src/anchor/types.ts), so replacing the viewer later is a localised change.
  • The CSS imports use the package's explicit ./style/style.css and ./style/pdf_viewer.css subpath exports — ./style.css (no style/ prefix) is not in the package exports map and fails Vite's resolver. Anyone copying the import pattern must keep the style/ prefix.
  • pdfjs-dist is in optimizeDeps.exclude (see vite.config.ts) so its worker .mjs is emitted as a separate asset rather than pre-bundled.
  • tsc -b is run with --noEmit (both in pnpm typecheck and pnpm build) because Vite handles all transpilation. Without noEmit, tsc -b's default emission litters src/ with stray .js/.d.ts siblings.
  • CE-WP-0002-T09 owns the full user-flow Playwright verification. Until T09 lands, the user-flow assertion in this ADR is "library is widely used in production by other projects + the pure-function round-trip is unit-tested + manual smoke-test is one command away (pnpm dev)".