generated from coulomb/repo-seed
Completes the PDF review slice end-to-end. After this commit a user can
open a fixture, select text, save an evidence item with commentary, see
it in the sidebar, reload the page, click the item, and have the viewer
scroll to the passage.

- T03 src/source/pdf/{fingerprint,extract,ingest}.ts + 39 fixture tests
  - SHA-256 fingerprint over a fresh ArrayBuffer (TS BufferSource-safe)
  - PDF.js text extract; per-page normalize then join with "\n\n"
  - PageMap + OffsetMap (gap-free coverage); pageLength = end - start
  - Updated manifest's Betriebskosten quote to one PDF.js extracts cleanly
- T04 src/anchor/selectors/{create,resolve}.ts + 25 unit + 7 fixture tests
  - createSelectors emits the maximal redundant set (TextQuote +
    TextPosition + PdfRect + PdfPageText when available)
  - resolveSelectors implements the SharedContracts §7 ladder; confidence
    1.0 (pos+quote) → 0.7 (rect-only) → 0 (unresolved)
  - Cross-module integration test moved to tests/integration/ to honor
    the anchor↛source boundary lint rule
- T05 engine: sync event bus over the closed §4 vocabulary, Map-backed
  repos, services, createEngine() composition root, 12 tests
- T06 work + app: three-pane shell (CollectionList | ViewerShell |
  EvidenceSidebar) wired through EngineProvider; EngineContext lives in
  src/work/ to respect the work↛app boundary; SpikeApp deleted
- T07 AnnotationToolbar: pendingSelection in context; Save runs
  createSelectors → engine.annotations.create → engine.evidence.create
- T08 click-to-reopen + localStorage persistence
  - scrollToAnnotation state in context with a version counter so a
    second click on the same item re-fires the viewer scroll
  - captureSnapshot/restoreSnapshot/attachPersister/restoreFromStorage;
    restore bypasses services to avoid event-loops
  - active-document id persisted alongside the snapshot so reload lands
    on the same fixture; ADR-0005 written
  - 9 persistence tests
- T09 tests/integration/app-prd-scenario.dom.test.tsx
  - end-to-end happy-dom test of PRD scenario steps 1-8 through the real
    React tree; viewer + ingest mocked per ADR-0004's headless-Chromium
    limitation. Fixed memo-deps bug in EvidenceSidebar/ViewerShell where
    useEngineEventTick values were not included in the useMemo deps,
    leaving stale memoization across event-driven re-renders
- vitest.config.ts: happy-dom for *.dom.test.{ts,tsx} files
- noEmit added to tsconfig so tsc -b doesn't litter src/ with .js outputs

Gates: typecheck ✓ lint ✓ test 109/109 across 11 files ✓ build ✓

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
296 lines
9.4 KiB
Markdown
---
id: CE-WP-0002
type: workplan
title: "PDF review slice — engine types, anchor, source, viewer, sidebar, click-to-reopen"
domain: citation_evidence
repo: citation-evidence
repo_id: a677c189-b4e2-4f2a-9e48-faa482c277e6
topic_slug: citation_evidence_mvp
topic_id: 96fa8e80-9f74-40f2-84cd-644e9747b9ec
state_hub_workstream_id: 19cb420b-c262-4c0e-afab-e85946b2cfce
status: done
owner: Bernd
created: 2026-05-24
updated: 2026-05-24
depends_on_workplan: CE-WP-0001
spec_refs:
- wiki/ProductRequirementsDocument.md
- wiki/ArchitectureOverview.md
- wiki/SharedContracts.md
---

# CE-WP-0002 — PDF Review Slice

The first vertical product slice. After this workplan, a user can:

1. Open the app, see a collection of fixture PDFs.
2. Open one PDF in a viewer.
3. Select text, add a one-line comment, save as an evidence item.
4. See the evidence item appear in a sidebar.
5. Click the evidence item and have the PDF jump to and highlight the
   passage — even after a full page reload.

No forms, no Markdown/HTML, no recovery, no export. Those come later.

This workplan exercises the riskiest architectural assumption (PDF selector
round-trip with viewer independence) on the simplest possible feature set.

## Risk-driven order

T01 and T02 are the spike from the assessment: prove the
`react-pdf-highlighter-plus` integration can store and reload selectors
without leaking viewer types into engine code. If that breaks, the rest of
the workplan stops and ADR-0004 (PDF viewer choice) must be revised before
any further work.

## Dependency Order

```
T01 (engine types: Document, Representation, Annotation, Selector, EvidenceItem)
└─ T02 (PDF viewer adapter spike — store + reload selectors as JSON)
└─ T03 (evidence-source: PDF ingest, fingerprint, canonical text)
└─ T04 (evidence-anchor: TextQuote + TextPosition resolution against representation)
└─ T05 (in-memory repositories + engine services)
└─ T06 (citation-work UI: collection list + viewer shell + sidebar)
└─ T07 (annotation create flow)
└─ T08 (click-to-reopen flow)
└─ T09 (end-to-end test of PRD scenario steps 1-4)
```

---

## T01 — Engine types in `src/shared/`

```task
id: CE-WP-0002-T01
state_hub_task_id: b015c082-4272-407d-b6e4-9e1bd97f0193
priority: critical
status: done
```

Translate the type definitions in `wiki/SharedContracts.md` §1 and §3 into
TypeScript under `src/shared/`:

- `src/shared/document.ts` — `Document`, `DocumentRepresentation`, `PageMap`,
  `OffsetMap`
- `src/shared/selector.ts` — `Selector` discriminated union with at minimum
  `TextQuoteSelector`, `TextPositionSelector`, `PdfRectSelector`,
  `PdfPageTextSelector`. Other selector kinds defined as `never`-typed stubs
  for now.
- `src/shared/annotation.ts` — `Annotation` with `selectors`, `quote`,
  `note`, `normalizeVersion`
- `src/shared/evidence.ts` — `EvidenceItem`, `EvidenceItem.status` enum per
  §2.2
- `src/shared/ids.ts` — branded ID types and a `newId(prefix)` helper

No services, no behavior. Pure data shapes + the ID helper.
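
As a rough illustration of the shapes listed above, here is a minimal sketch of branded IDs and a `Selector` discriminated union. The field names (`exact`, `prefix`, `suffix`, `start`, `end`, `page`, `rect`), the `kind` discriminant, the `doc` prefix, and the random suffix in `newId` are assumptions made for this example; the authoritative definitions are the ones in `wiki/SharedContracts.md`.

```typescript
// Branded ID: a plain string at runtime, a distinct type at compile time.
type Brand<T, B extends string> = T & { readonly __brand: B };

type DocumentId = Brand<string, "DocumentId">;
type AnnotationId = Brand<string, "AnnotationId">;

// Illustrative helper: a random base-36 suffix, not a collision-safe scheme.
function newId<T extends string>(prefix: string): T {
  const suffix = Math.random().toString(36).slice(2, 10);
  return `${prefix}_${suffix}` as T;
}

// Assumed selector shapes; only the four kinds named in the task are shown.
interface TextQuoteSelector {
  kind: "TextQuote";
  exact: string;
  prefix?: string;
  suffix?: string;
}
interface TextPositionSelector {
  kind: "TextPosition";
  start: number;
  end: number;
}
interface PdfRectSelector {
  kind: "PdfRect";
  page: number;
  rect: [number, number, number, number];
}
interface PdfPageTextSelector {
  kind: "PdfPageText";
  page: number;
  start: number;
  end: number;
}

type Selector =
  | TextQuoteSelector
  | TextPositionSelector
  | PdfRectSelector
  | PdfPageTextSelector;

// Exhaustive switch: a new selector kind that is not handled here
// fails to compile at the `never` assignment.
function describeSelector(s: Selector): string {
  switch (s.kind) {
    case "TextQuote":
      return `quote "${s.exact}"`;
    case "TextPosition":
      return `offsets ${s.start}-${s.end}`;
    case "PdfRect":
      return `rect on page ${s.page}`;
    case "PdfPageText":
      return `page ${s.page} offsets ${s.start}-${s.end}`;
    default: {
      const unreachable: never = s;
      return unreachable;
    }
  }
}
```

The brand exists only in the type system, so IDs still serialize as ordinary strings.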

Add JSDoc on each type pointing at the §-reference in
`wiki/SharedContracts.md` it implements.

---

## T02 — PDF viewer adapter spike

```task
id: CE-WP-0002-T02
state_hub_task_id: 59846d9e-7ac1-4306-b02e-0980a52f44c8
priority: critical
status: done
depends_on: [T01]
```

**This is the architectural spike.** Build a throwaway
`src/anchor/pdf-viewer-adapter-spike.tsx` that:

1. Loads `fixtures/pdfs/simple.pdf` using `react-pdf-highlighter-plus`
   (assumed; if a better library appears, document it in ADR-0004 before
   committing).
2. Lets the user select text and produces selectors per `T01` shapes.
3. Serializes the selectors to a JSON blob in `localStorage`.
4. On reload, reads the blob, asks the adapter to resolve, scrolls to the
   passage, and renders a highlight.
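
Steps 3 and 4 above amount to a JSON round trip through storage. The sketch below shows one way that could look, kept free of any viewer types. The storage key, the `KeyValueStore` interface, the selector shapes, and all function names are hypothetical; the in-memory store is only there so the example runs outside a browser.

```typescript
// The subset of the Web Storage API used here; the browser's
// `localStorage` object satisfies this interface structurally.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Assumed viewer-independent selector shapes (plain JSON data only).
type StoredSelector =
  | { kind: "TextQuote"; exact: string; prefix?: string; suffix?: string }
  | { kind: "TextPosition"; start: number; end: number };

const STORAGE_KEY = "spike.selectors"; // assumed key name

function saveSelectors(store: KeyValueStore, selectors: StoredSelector[]): void {
  store.setItem(STORAGE_KEY, JSON.stringify(selectors));
}

// Narrow untrusted JSON back to the selector union.
function isStoredSelector(value: unknown): value is StoredSelector {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (v["kind"] === "TextQuote") return typeof v["exact"] === "string";
  if (v["kind"] === "TextPosition") {
    return typeof v["start"] === "number" && typeof v["end"] === "number";
  }
  return false;
}

function loadSelectors(store: KeyValueStore): StoredSelector[] {
  const raw = store.getItem(STORAGE_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed.filter(isStoredSelector) : [];
  } catch {
    return []; // corrupt blob: treat as nothing stored
  }
}

// In-memory stand-in for localStorage.
function createMemoryStore(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (key) => data.get(key) ?? null,
    setItem: (key, value) => {
      data.set(key, value);
    },
  };
}
```

In the browser, `saveSelectors(window.localStorage, selectors)` would be the equivalent call; validating on read keeps a stale or hand-edited blob from reaching the adapter.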

Success criteria:
- Reload-and-resolve works for all fixture PDFs.
- No PDF.js or `react-pdf-highlighter-plus` types appear in any file under
  `src/shared/` or `src/engine/`.
- The adapter's public surface matches the contract in
  `wiki/SharedContracts.md` §5.

If success criteria fail: stop. Write a short note in
`docs/decisions/ADR-0004-pdf-viewer-library.md` describing the failure mode
and proposed alternative. Do not proceed with T03+.

---

## T03 — `src/source/`: PDF ingest, fingerprint, canonical text

```task
id: CE-WP-0002-T03
state_hub_task_id: 01dad096-3521-42b9-aed9-ce0b2f5d3450
priority: high
status: done
depends_on: [T02]
```

Implement under `src/source/pdf/`:

- `ingest.ts` — `ingestPdf(file: File | Buffer): Promise<{ document: Document; representation: DocumentRepresentation }>`
- `fingerprint.ts` — stable SHA-256 of bytes
- `extract.ts` — uses PDF.js to extract page text; runs `normalize()` from
  T04 of WP-0001 over the canonical text; builds the `PageMap` and
  `OffsetMap` per `Document.DocumentRepresentation`
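
To make the page-map idea concrete, the sketch below joins per-page text into one canonical string and records each page's offset range. It is an illustration under stated assumptions: `normalizeText` is a placeholder that only collapses whitespace (the real `normalize()` comes from WP-0001), the `PageRange` shape is hypothetical, and counting the `"\n\n"` separator as part of the preceding page is one possible way to keep ranges gap-free. The `OffsetMap` is not shown.

```typescript
// Hypothetical page-range entry: [start, end) offsets into the canonical text.
interface PageRange {
  page: number; // 1-based page number
  start: number;
  end: number;
}

const PAGE_SEPARATOR = "\n\n";

// Placeholder for the real normalize(); this one only collapses whitespace.
function normalizeText(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

function buildCanonicalText(pageTexts: string[]): { text: string; pages: PageRange[] } {
  const pages: PageRange[] = [];
  let text = "";
  pageTexts.forEach((raw, index) => {
    const isLast = index === pageTexts.length - 1;
    const start = text.length;
    // Assumption: the separator belongs to the page before it, so
    // consecutive ranges touch and every offset maps to exactly one page.
    text += normalizeText(raw) + (isLast ? "" : PAGE_SEPARATOR);
    pages.push({ page: index + 1, start, end: text.length });
  });
  return { text, pages };
}

// Find the page containing a canonical-text offset, if any.
function pageForOffset(pages: PageRange[], offset: number): number | undefined {
  return pages.find((p) => offset >= p.start && offset < p.end)?.page;
}
```

With ranges built this way, a page's length is simply `end - start`, and an offset from a `TextPositionSelector` can be mapped back to a page with a single lookup.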

Tests use the fixture corpus from `CE-WP-0001-T05`. For each fixture,
extracted canonical text must contain the manifest's known-good quote.

---

## T04 — `src/anchor/`: TextQuote and TextPosition resolution

```task
id: CE-WP-0002-T04
state_hub_task_id: 62e4839a-8026-4e15-b4cc-6685e56b3584
priority: high
status: done
depends_on: [T01, T03]
```

Implement under `src/anchor/`:

- `selectors/create.ts` — given a `SelectionCapture` from the adapter, build
  the maximal set of available selectors (always `TextQuoteSelector` with
  prefix/suffix; `TextPositionSelector` when the representation provides
  offsets; PDF rect/text selectors when on PDF)
- `selectors/resolve.ts` — implements the resolution strategy from
  `wiki/ArchitectureOverview.md` §7 (try position, verify quote, fall back
  through quote+prefix/suffix, return `AnchorResolution`)
- `selectors/types.ts` — `AnchorResolution`, `SelectionCapture`,
  `ResolvedAnchorTarget`

Fuzzy matching is out of scope here — return `unresolved` if exact+prefix/suffix
fails. Fuzzy is a later workplan.
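
The resolution order described above (try position, verify quote, fall back to quote plus prefix/suffix, otherwise `unresolved`) could look roughly like the sketch below. The type shapes and the function name are hypothetical, PDF rect selectors are left out, and the `0.9` confidence for a context match is a placeholder chosen for this example rather than a value taken from the contracts. Treating an ambiguous match as `unresolved` is also an assumption.

```typescript
interface TextQuote {
  exact: string;
  prefix?: string;
  suffix?: string;
}
interface TextPosition {
  start: number;
  end: number;
}

type AnchorResolution =
  | { status: "resolved"; start: number; end: number; confidence: number }
  | { status: "unresolved"; confidence: 0 };

const UNRESOLVED: AnchorResolution = { status: "unresolved", confidence: 0 };

function resolveAnchor(
  text: string,
  quote: TextQuote,
  position?: TextPosition,
): AnchorResolution {
  if (quote.exact === "") return UNRESOLVED;

  // Step 1: try the stored offsets and verify they still hold the quote.
  if (position && text.slice(position.start, position.end) === quote.exact) {
    return { status: "resolved", start: position.start, end: position.end, confidence: 1.0 };
  }

  // Step 2: collect every exact occurrence of the quote.
  const matches: number[] = [];
  let from = text.indexOf(quote.exact);
  while (from !== -1) {
    matches.push(from);
    from = text.indexOf(quote.exact, from + 1);
  }

  // Step 3: keep only occurrences whose surrounding text matches the
  // stored prefix and suffix (an empty prefix or suffix matches anything).
  const prefix = quote.prefix ?? "";
  const suffix = quote.suffix ?? "";
  const candidates = matches.filter((start) => {
    const end = start + quote.exact.length;
    return text.slice(0, start).endsWith(prefix) && text.slice(end).startsWith(suffix);
  });

  // No fuzzy matching: anything other than a single candidate is unresolved.
  const only = candidates.length === 1 ? candidates[0] : undefined;
  if (only === undefined) return UNRESOLVED;
  return {
    status: "resolved",
    start: only,
    end: only + quote.exact.length,
    confidence: 0.9, // placeholder value
  };
}
```

The prefix and suffix matter most when the quote repeats in the document: without them, a stale position on a repeated phrase cannot be disambiguated.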

Unit tests using fixtures: for each fixture+known-quote pair, create
selectors then immediately resolve them; resolution must succeed with
confidence ≥ 0.9.

---

## T05 — In-memory repositories + engine services

```task
id: CE-WP-0002-T05
state_hub_task_id: b339a73a-6b58-471c-a01d-e769ea414ee7
priority: high
status: done
depends_on: [T01]
```

Under `src/engine/`:

- `repos/in-memory.ts` — `Map`-backed implementations of
  `DocumentRepository`, `AnnotationRepository`, `EvidenceItemRepository`
- `services/documents.ts`, `services/annotations.ts`, `services/evidence.ts`
  — thin orchestration layer that creates IDs, calls repos, and emits the
  events from `wiki/SharedContracts.md` §4
- `events/bus.ts` — minimal pub/sub. Synchronous for MVP.
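
The three pieces above fit together roughly as in the following sketch: a synchronous typed bus, a `Map`-backed repository, and a thin service that creates an ID, saves, and emits. The event names, payload shapes, class names, and the counter-based ID are all assumptions for illustration; the real event vocabulary is the one defined in `wiki/SharedContracts.md` §4.

```typescript
// Hypothetical event vocabulary, keyed by event name.
interface EngineEvents {
  "annotation.created": { annotationId: string };
  "evidence.created": { evidenceId: string; annotationId: string };
}

type Handler<T> = (payload: T) => void;

class EventBus<E> {
  private readonly handlers = new Map<keyof E, Set<Handler<unknown>>>();

  // Subscribe; the returned function unsubscribes.
  on<K extends keyof E>(type: K, handler: Handler<E[K]>): () => void {
    const set = this.handlers.get(type) ?? new Set<Handler<unknown>>();
    this.handlers.set(type, set);
    const stored = handler as Handler<unknown>;
    set.add(stored);
    return () => {
      set.delete(stored);
    };
  }

  // Synchronous delivery: every handler has run before emit() returns.
  emit<K extends keyof E>(type: K, payload: E[K]): void {
    this.handlers.get(type)?.forEach((handler) => handler(payload));
  }
}

interface Annotation {
  id: string;
  quote: string;
}

class InMemoryAnnotationRepository {
  private readonly items = new Map<string, Annotation>();

  save(annotation: Annotation): void {
    this.items.set(annotation.id, annotation);
  }
  get(id: string): Annotation | undefined {
    return this.items.get(id);
  }
  list(): Annotation[] {
    return Array.from(this.items.values());
  }
}

// Thin service: creates the ID, calls the repo, then emits the event.
function createAnnotationService(
  repo: InMemoryAnnotationRepository,
  bus: EventBus<EngineEvents>,
) {
  let counter = 0; // stand-in for a real ID helper
  return {
    create(quote: string): Annotation {
      counter += 1;
      const annotation: Annotation = { id: `ann_${counter}`, quote };
      repo.save(annotation);
      bus.emit("annotation.created", { annotationId: annotation.id });
      return annotation;
    },
  };
}
```

Because delivery is synchronous, a subscriber such as a sidebar sees the new item in the repository as soon as its handler runs.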

No persistence to disk yet. ADR-0005 (persistence) is still pending.

---

## T06 — `src/work/`: collection list + viewer shell + sidebar

```task
id: CE-WP-0002-T06
state_hub_task_id: f400e133-6ec6-4d5a-98a0-a6408ca4125e
priority: high
status: done
depends_on: [T02, T05]
```

Under `src/work/` and `src/app/`:

- `src/app/App.tsx` — three-pane layout per Architecture §12.1: collection
  list (left), viewer (centre), evidence sidebar (right)
- `src/work/CollectionList.tsx` — lists `fixtures/pdfs/manifest.json`
  entries; click to load
- `src/work/ViewerShell.tsx` — hosts the viewer adapter from T02 wrapped
  cleanly; viewer adapter API is the only surface `work/` uses
- `src/work/EvidenceSidebar.tsx` — lists evidence items for the current
  document, shows quote + commentary + status

No styling beyond minimum legibility. Use Tailwind or vanilla CSS: pick one
and record the choice in ADR-0001 if it is not already noted there.

---

## T07 — Annotation create flow

```task
id: CE-WP-0002-T07
state_hub_task_id: 26346a07-bf98-4d43-8b30-de2038ab72f8
priority: high
status: done
depends_on: [T04, T05, T06]
```

Wire selection → annotation → evidence item:

1. User selects text in the viewer.
2. A small toolbar appears with a comment input + Save button.
3. On Save: adapter produces `SelectionCapture` → anchor creates `Selector[]`
   → engine creates `Annotation` → engine creates `EvidenceItem` with the
   commentary → sidebar updates.
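
The Save step above is a short pipeline, which the sketch below spells out as one function with its collaborators injected. Every type shape and name here (`SaveDeps`, `saveEvidence`, the fields on `SelectionCapture`, `Annotation`, and `EvidenceItem`) is hypothetical, and the sidebar update is assumed to happen through engine events rather than inside this function.

```typescript
interface SelectionCapture {
  text: string;
  start?: number;
  end?: number;
}

type Selector =
  | { kind: "TextQuote"; exact: string }
  | { kind: "TextPosition"; start: number; end: number };

interface Annotation {
  id: string;
  documentId: string;
  selectors: Selector[];
  quote: string;
}

interface EvidenceItem {
  id: string;
  annotationId: string;
  commentary: string;
}

// Collaborators are passed in so the flow can be tested without a viewer.
interface SaveDeps {
  createSelectors(capture: SelectionCapture): Selector[];
  annotations: { create(input: Omit<Annotation, "id">): Annotation };
  evidence: { create(input: Omit<EvidenceItem, "id">): EvidenceItem };
}

function saveEvidence(
  deps: SaveDeps,
  documentId: string,
  capture: SelectionCapture,
  commentary: string,
): EvidenceItem {
  // capture -> selectors
  const selectors = deps.createSelectors(capture);
  if (selectors.length === 0) {
    throw new Error("selection produced no selectors");
  }
  // selectors -> annotation
  const annotation = deps.annotations.create({
    documentId,
    selectors,
    quote: capture.text,
  });
  // annotation + commentary -> evidence item
  return deps.evidence.create({
    annotationId: annotation.id,
    commentary: commentary.trim(),
  });
}
```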

Active state lives in a single React context for now; no Redux/Zustand.

---

## T08 — Click-to-reopen flow

```task
id: CE-WP-0002-T08
state_hub_task_id: 469e3fb4-1b42-49a7-88dc-29a6d5055ef5
priority: critical
status: done
depends_on: [T04, T06, T07]
```

Implement the round trip:

1. User clicks an evidence item in the sidebar.
2. Engine loads the annotation → anchor resolves selectors against the
   current representation → adapter scrolls to and highlights the target.
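
The two steps above can be sketched as a single handler that looks up the annotation, resolves it, and only then touches the viewer. The names and shapes here (`ReopenDeps`, `reopenEvidence`, `scrollToAndHighlight`, the `Resolution` type) are hypothetical and stand in for the real engine, anchor, and adapter contracts.

```typescript
interface Annotation {
  id: string;
  quote: string;
}

interface EvidenceItem {
  id: string;
  annotationId: string;
}

type Resolution =
  | { status: "resolved"; start: number; end: number }
  | { status: "unresolved" };

// Assumed adapter surface: the only thing the handler knows about the viewer.
interface ViewerAdapter {
  scrollToAndHighlight(range: { start: number; end: number }): void;
}

interface ReopenDeps {
  getEvidence(id: string): EvidenceItem | undefined;
  getAnnotation(id: string): Annotation | undefined;
  resolve(annotation: Annotation, canonicalText: string): Resolution;
  viewer: ViewerAdapter;
}

function reopenEvidence(
  deps: ReopenDeps,
  evidenceId: string,
  canonicalText: string,
): Resolution {
  // evidence item -> annotation
  const item = deps.getEvidence(evidenceId);
  const annotation = item ? deps.getAnnotation(item.annotationId) : undefined;
  if (!annotation) return { status: "unresolved" };

  // annotation -> resolved range in the current representation
  const resolution = deps.resolve(annotation, canonicalText);

  // resolved range -> viewer; an unresolved anchor leaves the viewer alone
  if (resolution.status === "resolved") {
    deps.viewer.scrollToAndHighlight({ start: resolution.start, end: resolution.end });
  }
  return resolution;
}
```

Returning the resolution lets the caller show a "passage not found" state when the anchor cannot be resolved.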

Critically, this must also work **after a page reload**. Persistence to
`localStorage` is acceptable for MVP (decide explicitly in
`ADR-0005-persistence.md` that we are deferring real persistence).

---

## T09 — End-to-end test of PRD scenario steps 1-4

```task
id: CE-WP-0002-T09
state_hub_task_id: 77423e57-f2c5-42e1-9e6c-c9b6fa35dfcf
priority: high
status: done
depends_on: [T07, T08]
```

Write a Playwright (or similar) E2E test that:

1. Opens the app.
2. Picks `simple.pdf`.
3. Programmatically selects the known-good quote from the manifest.
4. Saves an evidence item with a comment.
5. Verifies the item appears in the sidebar.
6. Reloads the page.
7. Clicks the evidence item.
8. Verifies the highlight is rendered on the expected page.

This is the contract for "MVP slice 1 works". If it passes, CE-WP-0003 may
begin.