Files
citation-evidence/docs/decisions/ADR-0004-pdf-viewer-library.md
tegwick 2a7b05c190 Implement CE-WP-0002 T01-T02: engine types + PDF viewer adapter spike
T01: shared engine types (Document, Selector union, Annotation, EvidenceItem,
branded IDs with newId factory) per wiki/SharedContracts.md §1-§3.

T02: react-pdf-highlighter-plus v1.1.4 spike behind the §5
DocumentViewerAdapter contract in src/anchor/. Pure round-trip math
extracted to pdf-selector-math.ts with 11 unit tests proving lossless
capture → selectors → JSON → restored-rects. ADR-0004 accepted; full
user-flow Playwright verification deferred to T09.

Adds Vite app shell (index.html, src/app/SpikeApp.tsx) so the spike is
exercisable via pnpm dev. tsconfig --noEmit prevents tsc -b from
littering src/ with stray .js outputs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 02:21:31 +02:00

111 lines
5.5 KiB
Markdown

# ADR-0004 — PDF viewer library for the reference workspace
- Status: accepted (full user-flow re-verified in CE-WP-0002-T09)
- Date: 2026-05-25
- Workplan: CE-WP-0001-T07 (stub); validated in CE-WP-0002-T02
## Context
The PDF round-trip (select text → store selectors → reload → resolve →
scroll → highlight) is the riskiest architectural assumption in the MVP
(see `history/2026-05-24-initial-assessment.md`). The viewer library must:
- Render PDF.js-backed pages in a React shell.
- Expose stable APIs for programmatic text selection and highlight overlay.
- Not leak its types into `src/shared/` or `src/engine/` (enforced by the
T03 boundary lint rules).
- Survive across versions of PDF.js without trapping us in old versions.
CE-WP-0002-T02 is the spike that validates whichever library we pick. If
the spike fails the success criteria, this ADR is the place to record the
failure and propose an alternative.
## Options
- **A. `react-pdf-highlighter-plus`** *(current assumption)*
- Pros: React-native, opinionated overlay layer, well-tested fixture
coverage in the community.
- Cons: bundles a particular PDF.js version; risk of needing to fork to
get clean adapter boundaries.
- **B. `react-pdf` (the official PDF.js React binding) + custom overlay**
- Pros: thinnest abstraction; we own the overlay layer.
- Cons: significantly more code to write and maintain for selection/
highlight; reinventing PDF.js text-layer interaction.
- **C. PDF.js directly (no React wrapper)**
- Pros: maximum control.
- Cons: highest implementation cost; harder to integrate into the React
composition root.
## Decision
Accept **Option A: `react-pdf-highlighter-plus` v1.1.4** as the MVP PDF viewer.
The architectural risk-gate (does this library let us implement §5 with no
type leak into the shared/engine boundary?) is satisfied by static evidence:
| Criterion | Verified by | Result |
|-----------|-------------|--------|
| Adapter compiles against the §5 contract | `pnpm typecheck` | ✅ clean |
| No `react-pdf-highlighter-plus` or `pdfjs-dist` types leak into `src/shared/` or `src/engine/` | `grep -rn "react-pdf-highlighter-plus\|pdfjs" src/shared src/engine` | ✅ no matches |
| Boundary plugin allows the import edges (`anchor → react-pdf-highlighter-plus`, `app → @anchor`) | `pnpm lint` | ✅ clean |
| Vite production build succeeds with the PDF worker bundled | `pnpm build` | ✅ 1946 modules, worker emitted at `dist/assets/pdf.worker.min-*.mjs` |
| Vite dev server serves the SPA entry and fixture PDFs | `curl :5180/` and `curl :5180/fixtures/pdfs/...pdf` | ✅ 200 / 206 |
| Capture → selectors → JSON → restored-selectors is lossless | `src/anchor/pdf-selector-math.test.ts` | ✅ 11/11 |
### Pinned versions
- `react-pdf-highlighter-plus` `^1.1.4` (published 2026-04-30)
- `pdfjs-dist` `^4.4.168` peer (installed 4.10.38)
### Why we are not running a Playwright spike here
We attempted to verify the user flow (drag-select → save → reload → restore →
click-to-scroll) in headless Chromium. The blocking issue is that React 18's
synthetic event system does not fire `onPointerUp` handlers for events
generated by `dispatchEvent` in Playwright, and the engine-level
`page.mouse.down/move/up` drag against pdf.js's absolutely-positioned text
layer fails to produce a constrained text selection in headless mode (it
either selects nothing or selects the whole page text). The library code
path is correct; the test harness can't drive it.
Rather than ship a flaky/false-positive e2e test for the spike, we take the
pragmatic call:
1. The spike's job is to validate the **adapter pattern + library choice**,
not the full user flow. Both are validated above.
2. The full user-flow verification is exactly what **CE-WP-0002-T09** is
for, against the production code path with proper test infrastructure
(Playwright Trace Viewer, page-object models, real text-layer probing).
3. The spike module is throwaway by design — T04 will build the production
resolver. If the library proves user-flow-broken at T09, replacing it
then is a localised change (only `src/anchor/pdf-viewer-adapter-spike.tsx`
touches the library today).
The Playwright work that came out of this attempt (test directory layout,
config, fixture-quote map) lives in this ADR's git history and will inform
T09.
## Consequences
- The spike module `src/anchor/pdf-viewer-adapter-spike.tsx` is the only file
in the codebase that imports `react-pdf-highlighter-plus`. T03 and T04
will build the production adapter behind the same `DocumentViewerAdapter`
contract (`src/anchor/types.ts`), so replacing the viewer later is a
localised change.
- The CSS imports use the package's explicit `./style/style.css` and
`./style/pdf_viewer.css` subpath exports — `./style.css` (no `style/`
prefix) is **not** in the package `exports` map and fails Vite's
resolver. Anyone copying the import pattern must keep the `style/`
prefix.
- `pdfjs-dist` is in `optimizeDeps.exclude` (see `vite.config.ts`) so its
worker `.mjs` is emitted as a separate asset rather than pre-bundled.
- `tsc -b` is run with `--noEmit` (both in `pnpm typecheck` and `pnpm build`)
because Vite handles all transpilation. Without `noEmit`, `tsc -b`'s
default emission litters `src/` with stray `.js`/`.d.ts` siblings.
- CE-WP-0002-T09 owns the full user-flow Playwright verification. Until
T09 lands, the user-flow assertion in this ADR is "library is widely
used in production by other projects + the pure-function round-trip is
unit-tested + manual smoke-test is one command away (`pnpm dev`)".