Capture mode state lived only in React memory and was lost when
reopening a session or remounting EngineProvider.
- Add per-session localStorage capture snapshot (schema, values, links)
- Restore on session mount; persist on field/schema/link changes
- Seed binder links from storage without spurious bus events
- Clean up capture key when session is deleted
- Integration test for reload persistence
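A minimal sketch of the per-session snapshot round-trip described
above. The key layout, the `CaptureSnapshot` field types, and every
function name here are assumptions for illustration; the commit only
states that schema, values, and links are stored per session.

```typescript
// Minimal storage surface so the sketch runs outside a browser;
// window.localStorage satisfies it structurally.
interface KeyValueStorage {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Hypothetical snapshot shape (the commit names only schema/values/links).
interface CaptureSnapshot {
  schema: unknown;
  values: Record<string, string>;
  links: Array<{ fieldId: string; evidenceId: string }>;
}

// Hypothetical key layout; the real key name is not given in the commit.
const captureKey = (sessionId: string): string =>
  `citation-evidence:capture:${sessionId}:v1`;

function saveCapture(storage: KeyValueStorage, sessionId: string, snap: CaptureSnapshot): void {
  storage.setItem(captureKey(sessionId), JSON.stringify(snap));
}

function loadCapture(storage: KeyValueStorage, sessionId: string): CaptureSnapshot | null {
  const raw = storage.getItem(captureKey(sessionId));
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as CaptureSnapshot;
  } catch {
    return null; // corrupt entry: treat as "nothing to restore"
  }
}

// Called when a session is deleted so its capture key does not leak.
function deleteCapture(storage: KeyValueStorage, sessionId: string): void {
  storage.removeItem(captureKey(sessionId));
}

// In-memory stand-in for localStorage, handy in tests.
function createMemoryStorage(): KeyValueStorage {
  const map = new Map<string, string>();
  return {
    getItem: (k) => map.get(k) ?? null,
    setItem: (k, v) => { map.set(k, v); },
    removeItem: (k) => { map.delete(k); },
  };
}
```

Restoring by writing straight into state, rather than replaying
field/link mutations, is one way to avoid the spurious bus events
mentioned above.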
PdfLoader reloads the PDF when its document prop is a new object each
render. Memoize the loader config on pdfUrl only.
Also stabilize SpikeHighlightContainer via context (no remount on focus
change) and narrow scroll-effect deps to highlight id signature.
- Wire fieldValues state in FormsApp so controlled inputs persist typed data
- Add runScrollToHighlightJob with rAF retries when utils/highlights not ready
- Re-trigger scroll when highlights update after PDF load
- Tests: scroll-job unit test, forms-field-values integration tests
- Workplan CE-WP-0008 marked done
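The retry idea behind runScrollToHighlightJob can be sketched as
follows. This is not the real signature: `runScrollJobSketch`, its
options, and the injected `schedule` callback are assumptions chosen so
the loop can be exercised without a browser. In the app, `schedule`
would presumably wrap requestAnimationFrame.

```typescript
// Scheduler abstraction; in a browser: (cb) => requestAnimationFrame(() => cb()).
type Schedule = (cb: () => void) => void;

interface ScrollJobOptions {
  // Returns true once the viewer utils/highlights were ready and the
  // scroll was actually issued; false means "not ready yet, retry".
  tryScroll: () => boolean;
  schedule: Schedule;
  maxAttempts: number;
  onDone?: (ok: boolean) => void;
}

// Starts the job immediately and returns a cancel function.
function runScrollJobSketch(opts: ScrollJobOptions): () => void {
  let cancelled = false;
  let attempts = 0;

  const tick = (): void => {
    if (cancelled) return;
    attempts += 1;
    if (opts.tryScroll()) {
      opts.onDone?.(true);
      return;
    }
    if (attempts >= opts.maxAttempts) {
      opts.onDone?.(false); // give up; caller may re-trigger on highlight updates
      return;
    }
    opts.schedule(tick);
  };

  tick();
  return () => { cancelled = true; };
}
```

Returning a cancel function lets an effect cleanup stop a pending job
when the target highlight changes before the viewer becomes ready.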
The version of pdf_viewer.css bundled with react-pdf-highlighter-plus
is only a minimal *override* (≈40 lines: opacity, z-index, blend
mode). It's missing the foundational rules that PDF.js's TextLayer
relies on — `position: absolute`, `inset: 0`, and the entire
`--scale-factor` CSS-variable machinery that PDF.js 4.x uses to
position each glyph.
Without those rules, each text-layer span gets rendered with default
positioning context and `font-size: calc(<base> * var(--scale-factor))`
collapses to 0 → spans either pile up at the top-left of the page or
land at wrong y-coordinates regardless of where the glyph actually
sits on the canvas. The reported symptom ("origin seems to be the top
of the page always") matches exactly.
Importing `pdfjs-dist/web/pdf_viewer.css` first, then the library's
overrides on top, gives PDF.js the CSS it expects.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The "text layer is misplaced / body text not selectable" symptoms
across multiple PDFs most likely come from PDF.js falling back to
substitute font metrics. Without the cmaps directory (CID
character maps for non-Latin fonts) and the standard_fonts directory
(Helvetica/Times/Courier metrics for unembedded standard fonts), the
canvas glyphs use embedded font data while the text-layer span
positions are computed from fallback metrics. The two diverge — text
spans land in the wrong place, or text content can't be decoded at
all, leaving the body unselectable.
Both directories are now copied into the served root by
vite-plugin-static-copy and passed to pdfjs.getDocument() as
`cMapUrl: "/cmaps/"` + `cMapPacked: true` + `standardFontDataUrl:
"/standard_fonts/"` via PdfLoader's `document` prop (which accepts a
full DocumentInitParameters object).
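A rough sketch of the config object described above. The three option
names and values come from this commit; `PdfDocumentConfig`,
`buildPdfDocumentConfig`, and `createConfigMemo` are illustrative names
only. The memo stands in for a React `useMemo` keyed on the URL, since
PdfLoader reloads when it receives a new object.

```typescript
// Subset of pdfjs-dist's DocumentInitParameters used here.
interface PdfDocumentConfig {
  url: string;
  cMapUrl: string;
  cMapPacked: boolean;
  standardFontDataUrl: string;
}

function buildPdfDocumentConfig(pdfUrl: string): PdfDocumentConfig {
  return {
    url: pdfUrl,
    cMapUrl: "/cmaps/",                       // served copy of pdfjs-dist cmaps
    cMapPacked: true,                         // the bundled cmaps are .bcmap files
    standardFontDataUrl: "/standard_fonts/",  // metrics for unembedded standard fonts
  };
}

// Returns the same object identity while pdfUrl is unchanged, so a
// consumer comparing props by reference does not reload the document.
function createConfigMemo(): (pdfUrl: string) => PdfDocumentConfig {
  let last: PdfDocumentConfig | null = null;
  return (pdfUrl) => {
    if (last === null || last.url !== pdfUrl) {
      last = buildPdfDocumentConfig(pdfUrl);
    }
    return last;
  };
}
```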
If this is the right diagnosis, the textLayer overlay should now line
up with the visible glyphs on the same PDFs that were producing
fragmented captures. If the body text is still unselectable, the PDF
genuinely lacks a text layer for those glyphs (image-only content)
and OCR would be the only path forward.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The previous iteration left inactive document cards on a
white-with-grey-border style and only flipped to light-blue on
activation. The intent (matching the evidence-card pattern of
always-yellow with a thicker border when active) was always-blue
cards, with a thin dark-blue border when inactive and a thick one
when active.
Inactive: 1px #0050b3 on #e8f0ff
Active: 3px #0050b3 on #e8f0ff
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three UX iterations rolled into one:
1. Unified evidence form
- New EvidenceFormBody is the single source for "citation +
commentary" editing. Both InlineCaptureForm (creating fresh
evidence from a selection) and the EvidenceCard edit mode render
this body with their own save/cancel labels + badge/helper text.
- The capture form now exposes the citation as an editable
textarea — pre-filled with the selection text — so the user can
refine a partial capture before saving without re-selecting.
- Old testid prefixes are unchanged for the inline-capture flow
(`inline-capture-quote/commentary/save/cancel`); edit-mode
testids are now `evidence-edit-<id>-{quote,commentary,save,cancel}`.
2. Active document card
- The blue background was the only "this is open" cue. Added
a 3px #0050b3 border (matching the evidence-card thick-border
pattern, but in the documents-are-blue palette) plus a
`data-active` attribute.
3. PDF layer-hide diagnostics
- New debug flags `hideCanvas`, `hideTextLayer`, `hideAnnotationLayer`,
`hideXfaLayer` — applied as `.ce-hide-<layer>` classes on the viewer
wrapper, each `display: none`-ing the matching PDF.js layer.
- SessionMenu groups the toggles under a "PDF diagnostics" header
with a new shared DebugCheckbox helper. The existing "Debug text
layer" highlight toggle now lives in the same group.
- Lets the user isolate stacking issues by elimination — e.g.
"hide text layer, can I now see the canvas content underneath?".
Tests
- citation-card-export-e2e + session-export-reimport switched from
placeholder/role-name lookups to the inline-capture testids so
they survive form-copy changes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Clicking "Create session" with the input empty now generates a name of
the form `YYMMDD-session-NNN`: today's date as two-digit
year/month/day, then a zero-padded counter that starts at 000 and
increments past the highest existing match for the same day.
Added:
- `computeNextDefaultName(existing, now?)` pure helper exported from
`@engine/services/sessions`.
- `SessionService.nextDefaultName(now?)` method that wraps it
against the current repo.
- Both create call sites (CreateFirstSession empty state +
SessionMenu's New session form) fall back to `service.nextDefaultName()`
when the trimmed input is empty.
- 5 new unit tests covering today-only counting, max-not-count
increment, and trimmed/wrong-shape filtering.
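The naming rule above can be sketched like this. It is an illustration,
not the shipped helper: the name `computeNextDefaultNameSketch`, the use
of local time, trimming names before matching, and accepting counters
longer than three digits are all assumptions.

```typescript
function computeNextDefaultNameSketch(
  existing: readonly string[],
  now: Date = new Date(),
): string {
  const two = (n: number): string => String(n).padStart(2, "0");
  // Assumption: "today" is taken in local time.
  const prefix =
    `${two(now.getFullYear() % 100)}${two(now.getMonth() + 1)}${two(now.getDate())}-session-`;

  let highest = -1; // so the first generated counter is 000
  for (const raw of existing) {
    const name = raw.trim();
    if (!name.startsWith(prefix)) continue;   // other days, other shapes
    const suffix = name.slice(prefix.length);
    if (!/^\d{3,}$/.test(suffix)) continue;   // wrong-shape counter
    highest = Math.max(highest, Number(suffix));
  }
  // Increment past the highest match, not past the number of matches.
  return `${prefix}${String(highest + 1).padStart(3, "0")}`;
}
```

Using the maximum rather than the count means deleting
`...-session-001` while `...-session-004` still exists yields
`...-session-005`, never a name that collides with a survivor.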
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Significant UX iteration:
Visual palette
- Debug text-layer overlay flips from yellow to light grey so it no
longer collides with the evidence highlight colour.
- New highlight-styles.css matches the sidebar's #fff8d6/#e0c050
palette so a passage marked in the document and its sidebar card
speak the same visual language.
- Active (focused) evidence: same fill, thick #b78b1c outline on both
the highlight and the sidebar card. Library's red --scrolledTo
box-shadow is suppressed.
Activation model
- Click an evidence card in the sidebar → activates that item +
scrolls the viewer to the passage + thickens the borders (existing
behaviour, now visually clearer).
- Click a highlight in the document → activates the evidence that
owns that annotation. New `findByAnnotationId()` on EvidenceService
is the reverse lookup. Wired through a new `onHighlightClicked`
prop on PdfSpikeViewer + `activeAnnotationId` prop that drives the
data-ce-active attribute on the highlight wrapper.
Inline edit
- Each evidence card has a ✎ button that flips the card into an
inline form with the citation (quote) and commentary fields.
- Saving calls a new `AnnotationService.updateQuote()` +
existing `EvidenceService.updateCommentary()`. The selectors are
untouched, so the marked passage in the document stays put — the
inline hint says so explicitly.
- New `AnnotationUpdated` event added to the engine event vocabulary
(SharedContracts.md §4 updated).
Capture form placement
- The yellow "New annotation" toolbar that lived above the viewer is
gone. A new InlineCaptureForm component is now slotted into the
sidebar between the cards that bracket the new selection in
document flow (sorted by page + y of the first PdfRectSelector).
If the new selection is before all existing evidence it appears at
the top; if after all of them, at the bottom.
- The legacy AnnotationToolbar.tsx is removed; the public surface
re-exports `InlineCaptureForm` instead.
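The slotting rule for the capture form can be sketched as an insertion
index over cards already sorted in document flow. `FlowPosition`,
`compareFlow`, and `captureFormIndex` are hypothetical names, and the
sketch assumes y grows downward from the top of the page.

```typescript
// Position of the first PdfRectSelector of a card or of the new selection.
interface FlowPosition {
  page: number;
  y: number; // assumption: smaller y is higher on the page
}

// Orders by page first, then by vertical position within the page.
function compareFlow(a: FlowPosition, b: FlowPosition): number {
  return a.page - b.page || a.y - b.y;
}

// Index at which the inline capture form should be rendered so that it
// sits between the cards that bracket the selection. `cards` must
// already be sorted with compareFlow.
function captureFormIndex(
  cards: readonly FlowPosition[],
  selection: FlowPosition,
): number {
  const firstAfter = cards.findIndex((card) => compareFlow(card, selection) > 0);
  return firstAfter === -1 ? cards.length : firstAfter;
}
```

An index of 0 corresponds to "before all existing evidence" and
`cards.length` to "after all of them".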
Test updates
- tests/integration/citation-card-export-e2e.dom.test.tsx: switched
to the seed-session helper (matches the other E2Es) since the
fixture-button click path is gone.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The first cut of "Debug text layer" only painted direct `<span>`
children of `.textLayer`. PDF.js 4.x wraps marked content in nested
spans/divs, so the entire selectable area wasn't visible — making it
hard to tell whether a region is "no text layer at all" vs. "text
layer present but small/dense".
Changes:
- CSS now targets every descendant of `.textLayer`, dims the canvas
underneath, and outlines the `.textLayer` container itself so its
full extent is obvious.
- TextHighlight rectangles flip to green in debug mode so saved
highlights don't get washed out by the debug yellow.
- The viewer now logs:
[ce] viewer highlights — which annotations rendered, which
were skipped, with rects + page
[ce] scrollToAnnotation — whether the target was found in
the highlights array when an
activation arrived
This is the diagnostic loop for the "viewport scrolls but the
highlight doesn't appear" report — if highlight count is > 0 in the
first log but the green rectangle is off-screen, the saved rects
inherited the same text-layer misalignment that caused the partial
selection captures in the first place.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
PDF text selection misbehaviour (some glyphs unselectable, selections
jumping to other positions) is almost always caused by misalignment
between the visible canvas-rendered glyphs and the invisible text
layer that PDF.js overlays for selection. There's no way to see this
without devtools — which makes it hard for end users to tell whether
a specific PDF is OCR-noisy, has bad font fallbacks, or has no text
layer at all.
This adds a developer-facing toggle in the SessionMenu ("Debug text
layer") that:
- paints every text-layer span yellow with a blue outline so it's
obvious where text is selectable and where it isn't, and
- logs every onSelection event to the browser console with the
captured text, page, normalized rects, and the selectors the
pipeline derived from it.
Preference persists in localStorage under
`citation-evidence:debug:textLayer`. Surfaced via a new
`useDebugFlag()` hook in @work so the SessionMenu (app layer) and the
ViewerShell (work layer) can both subscribe without breaching the
boundary plugin.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
UX gaps that surfaced while running the demo:
- ViewerShell hardcoded `/fixtures/pdfs/<title>` for the PDF URL,
ignoring the `document.uri` blob URL that uploaded PDFs carry. The
viewer either 404'd or — worse — silently served a fixture whose
filename happened to collide. Prefer document.uri when present.
- SessionMenu only let you delete the *active* session. Added a small
per-row "✕" button next to every session in the Switch-to list so a
user can drop a session's data without first switching to it. Same
click-to-confirm pattern as the existing Delete action.
- Added a "Reset all data…" affordance in both the SessionMenu and the
empty-state landing. Calls a new `clearAllSessionData()` helper that
wipes every `citation-evidence:*` key from localStorage, then forces
a reload so all in-memory caches start fresh.
- `attachSessionPersister.writeOnDelete` was leaking the per-session
`active-document-id:v1` key on every session delete. Now removed
alongside the engine snapshot key.
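A minimal sketch of the prefix wipe behind the reset action. The
`citation-evidence:` prefix comes from this commit; the function name
`clearKeysWithPrefix` and the narrowed storage interface are
assumptions, and the forced reload is left to the caller.

```typescript
// The part of the Web Storage API this sketch needs; window.localStorage
// satisfies it structurally.
interface EnumerableStorage {
  readonly length: number;
  key(index: number): string | null;
  removeItem(key: string): void;
}

// Removes every key starting with `prefix` and returns how many were removed.
function clearKeysWithPrefix(
  storage: EnumerableStorage,
  prefix: string = "citation-evidence:",
): number {
  // Collect first: removing while iterating by index would shift the
  // remaining keys and skip entries.
  const doomed: string[] = [];
  for (let i = 0; i < storage.length; i += 1) {
    const key = storage.key(i);
    if (key !== null && key.startsWith(prefix)) doomed.push(key);
  }
  for (const key of doomed) storage.removeItem(key);
  return doomed.length;
}
```

In the browser this would be followed by something like
`window.location.reload()` so in-memory caches are rebuilt from the
now-empty storage.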
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The PDF.js library refuses to open documents without a worker URL.
Production builds were throwing "No GlobalWorkerOptions.workerSrc
specified" on any upload because neither the source-layer ingest
(extract.ts) nor the viewer adapter ever set one — they relied on the
host application to do it, and the browser bootstrap didn't.
main.tsx now imports the worker via Vite's `?url` suffix so the file
is bundled into the build, and sets GlobalWorkerOptions.workerSrc
once before any PDF code runs. Added src/vite-env.d.ts so TypeScript
knows about the `?url` import suffix.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`make preview` runs `pnpm build && pnpm preview` so the demo can be
served from the production bundle at http://localhost:4173/. Plus
convenience targets for dev, build, test, typecheck, lint.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Turn the MVP into a self-contained demo. Users now:
1. Land on an empty-state and create a named session.
2. Drag-drop or pick arbitrary PDFs into that session.
3. Annotate, build evidence, link to form fields — all session-scoped.
4. Export the whole session as a single .zip archive (manifest +
per-document PDFs).
5. Import a .zip back — into a new session, or merged into an
existing one (documents deduped by SHA-256 fingerprint;
annotations/evidence/links added additively).
Architecture:
- New shared types: SessionId, Session, SessionArchiveManifest +
parseSessionArchiveManifest with schema-version validation.
- SessionService (engine/services/sessions.ts) handles lifecycle
(create/rename/delete/setActive) + emits 4 new events through its
own bus; SharedContracts.md §4 lists the additions.
- SessionProvider (work/SessionContext.tsx) owns the cross-session
state: service, per-session PdfByteStore registry, per-session
version counter that drives EngineProvider remounts after imports.
- EngineProvider becomes session-aware (sessionId prop drives per-
session localStorage keys). Bumping engineRevision after
restoreFromStorage forces consumers to re-render so restored repos
show up immediately.
- PdfByteStore (source/pdf/byte-store.ts) holds Uint8Array bytes per
document and mints blob URLs; ingestPdfFromFile is the upload
entry-point that wraps the existing ingestPdf pipeline.
- ADR-0008 locks the ZIP layout (manifest.json + documents/<id>.pdf),
the manifest schema (schemaVersion 1), and the merge-on-collision
policy. JSZip is the only new dependency.
- App.tsx restructured: SessionProvider at the root, EngineProvider
keyed by ${sessionId}:${version}, hash routing #/s/<id>[/forms/demo],
SessionMenu top-bar, CreateFirstSession empty state.
- New DocumentRemoved event for per-document delete cleanup in
CollectionList; engine.documents.remove() is the new service method.
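The hash-route shape `#/s/<id>[/forms/demo]` could be parsed roughly as
below. The `Route` type, the view labels, and `parseHashRoute` are
illustrative names rather than the actual routing parser.

```typescript
type Route =
  | { kind: "none" } // no session in the hash: show the empty state
  | { kind: "session"; sessionId: string; view: "review" | "forms-demo" };

function parseHashRoute(hash: string): Route {
  // #/s/<id> with an optional /forms/demo suffix and optional trailing slash.
  const match = /^#\/s\/([^/]+)(\/forms\/demo)?\/?$/.exec(hash);
  if (match === null) return { kind: "none" };
  const sessionId = match[1];
  if (sessionId === undefined) return { kind: "none" };
  return {
    kind: "session",
    sessionId,
    view: match[2] !== undefined ? "forms-demo" : "review",
  };
}
```

Keeping the session id in the hash lets EngineProvider be keyed by
`${sessionId}:${version}` directly from the parsed route.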
Tests:
- Unit: 16 SessionService lifecycle + persistence tests;
per-session snapshot round-trip; PdfByteStore + ingestPdfFromFile;
SessionArchive parser; exportSessionZip + importSessionZip with
create + merge + corrupt-archive paths.
- DOM: UploadDropzone, session-scoped CollectionList delete,
SessionMenu create/switch/rename, routing parser.
- E2E: tests/integration/session-export-reimport.dom.test.tsx walks
the full create → annotate → export → reimport flow and asserts
the additive merge (deduped doc + doubled evidence rows).
- Legacy E2Es updated to use a seed-session helper instead of the
removed fixture-button flow.
Known limitation (documented in ADR-0008): re-importing your own
freshly-exported ZIP creates duplicate annotations. Forward pointer
left for an importBundleId follow-up.
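The merge policy and its known limitation can be illustrated with a
reduced model. All types and names below are hypothetical (the real
manifest schema lives in ADR-0008), and minting fresh ids for imported
annotations is an assumption made to keep the sketch collision-free.

```typescript
interface ArchivedDocument { id: string; sha256: string; title: string; }
interface ArchivedAnnotation { id: string; documentId: string; quote: string; }
interface SessionData {
  documents: ArchivedDocument[];
  annotations: ArchivedAnnotation[];
}

function mergeImport(
  target: SessionData,
  incoming: SessionData,
  newId: () => string,
): SessionData {
  const idByHash = new Map<string, string>();
  for (const doc of target.documents) idByHash.set(doc.sha256, doc.id);

  const documents = [...target.documents];
  const remap = new Map<string, string>(); // incoming doc id -> merged doc id
  for (const doc of incoming.documents) {
    const existingId = idByHash.get(doc.sha256);
    if (existingId !== undefined) {
      remap.set(doc.id, existingId); // same bytes: dedupe, keep the existing doc
      continue;
    }
    documents.push(doc);
    idByHash.set(doc.sha256, doc.id);
    remap.set(doc.id, doc.id);
  }

  // Annotations are always added, never matched against existing ones.
  const annotations = [...target.annotations];
  for (const ann of incoming.annotations) {
    annotations.push({
      ...ann,
      id: newId(),
      documentId: remap.get(ann.documentId) ?? ann.documentId,
    });
  }
  return { documents, annotations };
}
```

Merging a session's own export into itself therefore yields one
document but two copies of each annotation, which is the duplication
described above.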
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Per-evidence-item export: click Export → Copy as Markdown / Copy as HTML
writes a portable citation card to the clipboard. Cmd/Ctrl+Shift+C
exports the active evidence as Markdown.
- ADR-0007 locks the Markdown + HTML output formats.
- New shared types: CitationCard, openContextUrl(), resolveSourceLabel().
- Engine renderers under src/engine/rendering/: renderCitationCardMarkdown,
renderCitationCardHtml — snapshot-tested, escape-safe, BEM classes for HTML.
- src/work/useExportEvidence.ts wires engine + renderers + clipboard.
- EvidenceSidebar gains an Export popover per row + auto-dismissing toast.
- E2E test (tests/integration/citation-card-export-e2e.dom.test.tsx)
walks PRD scenario steps 10-11 and asserts the clipboard payload.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T01 EvidenceLink/EvidenceSet types
- src/shared/evidence-link.ts: status (§2.4), relation (§2.5), target
- src/shared/evidence-set.ts: ordered group + activeEvidenceItemId
- enum-conformance test parses SharedContracts.md and asserts the
runtime lists match exactly
T02 Binding service + in-memory link repo + active-state machine
- src/binder/repos/in-memory-links.ts: Map-backed EvidenceLinkRepository
- src/binder/services/bindings.ts: link/unlink/list/update/setActive
emitting §4 EvidenceLinkCreated / EvidenceLinkUpdated /
EvidenceItemActivated
- src/binder/state/active.ts: (target, evidence, annotation) reducer
+ ActiveStateProvider + useActiveState hook
- extended engine/events/types.ts with EvidenceLinkCreated,
EvidenceLinkUpdated, FormFieldActivated payloads
T03 Rect registry (SharedContracts §7 — contract FROZEN)
- src/binder/visual-guide/rect-registry.ts: register/getRect/subscribe
+ invalidate + getVersion for useSyncExternalStore
- events.ts: scroll/resize/focus pumps via window + ResizeObserver +
IntersectionObserver, rAF-throttled
- react-hooks.ts: RectRegistryProvider, useRegisterRect(kind,id,ref),
useRectRegistryVersion
T04 Form schema + renderer
- src/app/forms/demo-schema.ts: text/textarea/date minimal schema
- src/binder/FormRenderer.tsx: renders schema, each field registers
as rect kind="field"; active field gets aria-current="true"
- placed in binder/ (not work/) because work cannot import binder per
DependencyMap.md §2 and the renderer needs the rect-registry hook;
workplan T04 was amended in-place to document this
T05 Side-by-side Forms layout + click-to-link
- src/app/forms/FormsApp.tsx + src/app/App.tsx top-bar router with
hash route #/forms/demo
- BinderProvider mounted at app root so links survive tab switching
- stage-evidence-then-click-field linking interaction with banner
+ per-field link-count chip
T06 Active-evidence cycling
- src/app/forms/ActiveEvidenceChips.tsx: chips per active target,
Tab cycles natively, first chip auto-activates on field focus,
each chip registers as rect kind="evidence-card"
- ScrollBridge in FormsApp wires activeAnnotationId to viewer scroll
- EvidenceSidebar + EvidenceStrip highlight the active item via the
new useLastActivatedEvidence hook in work/EngineContext
T07 SVG visual-guide overlay
- src/binder/visual-guide/Overlay.tsx: single fixed-positioned SVG,
draws field→card and card→highlight bezier curves for the active
triple, rAF-throttled via the registry
- src/anchor exposes getHighlightClientRects(annotationId); the
spike viewer wraps highlights in [data-highlight-id] so the helper
can locate them
- src/app/forms/HighlightRectBridge.tsx: registers the active
annotation's rect via that helper
T08 End-to-end test (PRD scenario steps 5-9)
- tests/integration/forms-overlay-e2e.dom.test.tsx: full path from
Review-mode capture through Forms-mode link to active triple +
aria-current assertions + 2 SVG paths in the overlay
- additional integration coverage: forms-link-flow + forms-active-cycling
Gates: typecheck ✓ · lint ✓ · build ✓ · 152/152 tests across 21 files.
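For orientation, the T03 rect registry surface (register / getRect /
subscribe / invalidate / getVersion) could look roughly like the sketch
below. The frozen contract is in SharedContracts §7 and is not
reproduced here; the exact signatures, the "highlight" kind, and the
measure-callback design are assumptions.

```typescript
// "field" and "evidence-card" appear in this commit; "highlight" is assumed.
type RectKind = "field" | "evidence-card" | "highlight";
interface Rect { x: number; y: number; width: number; height: number; }
type Measure = () => Rect | null;

interface RectRegistrySketch {
  register(kind: RectKind, id: string, measure: Measure): () => void;
  getRect(kind: RectKind, id: string): Rect | null;
  subscribe(listener: () => void): () => void;
  invalidate(): void;
  getVersion(): number;
}

function createRectRegistrySketch(): RectRegistrySketch {
  const measures = new Map<string, Measure>();
  const listeners = new Set<() => void>();
  let version = 0;
  const keyOf = (kind: RectKind, id: string): string => `${kind}:${id}`;

  // Bumps the version and notifies subscribers; scroll/resize pumps
  // would call this (rAF-throttled) when geometry may have changed.
  const invalidate = (): void => {
    version += 1;
    listeners.forEach((listener) => listener());
  };

  return {
    register(kind, id, measure) {
      measures.set(keyOf(kind, id), measure);
      invalidate();
      return () => {
        measures.delete(keyOf(kind, id));
        invalidate();
      };
    },
    getRect(kind, id) {
      const measure = measures.get(keyOf(kind, id));
      return measure ? measure() : null; // measured lazily, at read time
    },
    subscribe(listener) {
      listeners.add(listener);
      return () => { listeners.delete(listener); };
    },
    invalidate,
    getVersion: () => version,
  };
}
```

The `subscribe` / `getVersion` pair matches the shape React's
`useSyncExternalStore(subscribe, getSnapshot)` expects, which is why a
monotonically increasing version works as the snapshot.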
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Completes the PDF review slice end-to-end. After this commit a user can
open a fixture, select text, save an evidence item with commentary, see
it in the sidebar, reload the page, click the item, and the viewer
scrolls to the passage.
- T03 src/source/pdf/{fingerprint,extract,ingest}.ts + 39 fixture tests
- SHA-256 fingerprint over a fresh ArrayBuffer (TS BufferSource-safe)
- PDF.js text extract; per-page normalize then join with "\n\n"
- PageMap + OffsetMap (gap-free coverage); pageLength = end - start
- Updated manifest's Betriebskosten quote to one PDF.js extracts cleanly
- T04 src/anchor/selectors/{create,resolve}.ts + 25 unit + 7 fixture tests
- createSelectors emits the maximal redundant set (TextQuote +
TextPosition + PdfRect + PdfPageText when available)
- resolveSelectors implements the SharedContracts §7 ladder; confidence
1.0 (pos+quote) → 0.7 (rect-only) → 0 (unresolved)
- Cross-module integration test moved to tests/integration/ to honor
the anchor↛source boundary lint rule
- T05 engine: sync event bus over the closed §4 vocabulary, Map-backed
repos, services, createEngine() composition root, 12 tests
- T06 work + app: three-pane shell (CollectionList | ViewerShell |
EvidenceSidebar) wired through EngineProvider; EngineContext lives in
src/work/ to respect the work↛app boundary; SpikeApp deleted
- T07 AnnotationToolbar: pendingSelection in context; Save runs
createSelectors → engine.annotations.create → engine.evidence.create
- T08 click-to-reopen + localStorage persistence
- scrollToAnnotation state in context with a version counter so a
second click on the same item re-fires the viewer scroll
- captureSnapshot/restoreSnapshot/attachPersister/restoreFromStorage;
restore bypasses services to avoid event-loops
- active-document id persisted alongside the snapshot so reload lands
on the same fixture; ADR-0005 written
- 9 persistence tests
- T09 tests/integration/app-prd-scenario.dom.test.tsx
- end-to-end happy-dom test of PRD scenario steps 1-8 through the real
React tree; viewer + ingest mocked per ADR-0004's headless-Chromium
limitation. Fixed memo-deps bug in EvidenceSidebar/ViewerShell where
useEngineEventTick values were not included in the useMemo deps,
leaving stale memoization across event-driven re-renders
- vitest.config.ts: happy-dom for *.dom.test.{ts,tsx} files
- noEmit added to tsconfig so tsc -b doesn't litter src/ with .js outputs
Gates: typecheck ✓ lint ✓ test 109/109 across 11 files ✓ build ✓
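The confidence ladder from T04 can be illustrated with the three rungs
this commit names. The selector field names (`exact`, `start`, `end`)
follow the W3C Web Annotation vocabulary and are assumptions here, as
is `resolveSketch` itself; the real resolveSelectors follows
SharedContracts §7 and may have additional steps.

```typescript
interface TextQuoteSelector { type: "TextQuote"; exact: string; }
interface TextPositionSelector { type: "TextPosition"; start: number; end: number; }
interface PdfRectSelector {
  type: "PdfRect";
  page: number;
  rects: Array<{ x: number; y: number; width: number; height: number }>;
}
type Selector = TextQuoteSelector | TextPositionSelector | PdfRectSelector;

interface Resolution {
  confidence: number;
  via: "position+quote" | "rect" | "none";
}

function resolveSketch(text: string, selectors: readonly Selector[]): Resolution {
  let quote: TextQuoteSelector | undefined;
  let position: TextPositionSelector | undefined;
  let rect: PdfRectSelector | undefined;
  for (const s of selectors) {
    if (s.type === "TextQuote") quote = s;
    else if (s.type === "TextPosition") position = s;
    else rect = s;
  }

  // Rung 1: the stored offsets still point at exactly the quoted text.
  if (quote && position && text.slice(position.start, position.end) === quote.exact) {
    return { confidence: 1.0, via: "position+quote" };
  }
  // Rung 2: text no longer verifies, but a page rectangle was saved.
  if (rect && rect.rects.length > 0) {
    return { confidence: 0.7, via: "rect" };
  }
  // Rung 3: nothing usable.
  return { confidence: 0, via: "none" };
}
```

Emitting the maximal redundant selector set at capture time is what
makes the lower rungs available when the extracted text later drifts.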
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T01: shared engine types (Document, Selector union, Annotation, EvidenceItem,
branded IDs with newId factory) per wiki/SharedContracts.md §1-§3.
T02: react-pdf-highlighter-plus v1.1.4 spike behind the §5
DocumentViewerAdapter contract in src/anchor/. Pure round-trip math
extracted to pdf-selector-math.ts with 11 unit tests proving lossless
capture → selectors → JSON → restored-rects. ADR-0004 accepted; full
user-flow Playwright verification deferred to T09.
Adds Vite app shell (index.html, src/app/SpikeApp.tsx) so the spike is
exercisable via pnpm dev. Setting noEmit in tsconfig prevents tsc -b
from littering src/ with stray .js outputs.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
T01 Toolchain — vite + pnpm 9.15 + React 18 + strict TS (ADR-0001).
T02 Folder layout — src/{shared,engine,anchor,source,binder,work,app}/
mirroring the future subsystem split, with path aliases.
T03 Boundary lint — eslint-plugin-boundaries enforcing the dependency
edges from wiki/DependencyMap.md §4; verified by a violating fixture.
T04 Canonical normalization v1 — src/shared/text/normalize.ts with
NORMALIZE_VERSION=1; 10/10 vitest cases covering ligatures, CRLF, soft
hyphens (including line-break reassembly), and mixed whitespace.
T05 PDF fixture corpus — 7 user-supplied German PDFs in fixtures/pdfs/
(gitignored binaries) plus a manifest with verbatim known-good
quotes and page counts, ready for CE-WP-0002 selector tests.
T06 README upgrade — umbrella README points at wiki/docs/workplans
and documents the dev workflow.
T07 ADR-0002..0006 stubs in docs/decisions/.
Toolchain end-to-end: pnpm install + lint + typecheck + test all green.
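To make the T04 normalization cases concrete, here is a hedged sketch
covering the four categories listed (ligatures, CRLF, soft hyphens with
line-break reassembly, mixed whitespace). The actual rules and their
order in src/shared/text/normalize.ts are not shown in this log, so
`normalizeSketch` and its ligature table are assumptions.

```typescript
// Mirrors the idea of NORMALIZE_VERSION: bump when any rule changes so
// stored offsets can be tied to the rules that produced them.
const NORMALIZE_VERSION_SKETCH = 1;

// Latin presentation-form ligatures commonly found in PDF text.
const LIGATURES: Record<string, string> = {
  "\uFB00": "ff",
  "\uFB01": "fi",
  "\uFB02": "fl",
  "\uFB03": "ffi",
  "\uFB04": "ffl",
};

function normalizeSketch(input: string): string {
  return input
    .replace(/\r\n?/g, "\n")                       // CRLF and lone CR become LF
    .replace(/\u00AD[ \t]*\n[ \t]*/g, "")          // soft hyphen at a line break: rejoin the word
    .replace(/\u00AD/g, "")                        // remaining soft hyphens are invisible
    .replace(/[\uFB00-\uFB04]/g, (ch) => LIGATURES[ch] ?? ch)
    .replace(/[ \t\u00A0]+/g, " ")                 // collapse spaces, tabs, NBSP
    .trim();
}
```

Line breaks that are not preceded by a soft hyphen are kept, so
paragraph structure survives normalization.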
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Created topic citation_evidence_mvp (96fa8e80-…) and one workstream per
workplan under the citation-evidence repo. Created one state-hub task per
task block. All UUIDs now recorded in workplan frontmatter
(state_hub_workstream_id) and per task block (state_hub_task_id) so ralph
can call update_task_status against them on first run.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>