Commit Graph

16 Commits

Author SHA1 Message Date
f0af8887d1 Strengthen text-layer debug + log highlight render path
The first cut of "Debug text layer" only painted direct `<span>`
children of `.textLayer`. PDF.js 4.x wraps marked content in nested
spans/divs, so parts of the selectable area went unpainted, making it
hard to tell whether a region is "no text layer at all" vs. "text
layer present but small/dense".

Changes:
- CSS now targets every descendant of `.textLayer`, dims the canvas
  underneath, and outlines the `.textLayer` container itself so its
  full extent is obvious.
- TextHighlight rectangles flip to green in debug mode so saved
  highlights don't get washed out by the debug yellow.
- The viewer now logs:
    [ce] viewer highlights        — which annotations rendered, which
                                    were skipped, with rects + page
    [ce] scrollToAnnotation       — whether the target was found in
                                    the highlights array when an
                                    activation arrived

This is the diagnostic loop for the "viewport scrolls but the
highlight doesn't appear" report — if highlight count is > 0 in the
first log but the green rectangle is off-screen, the saved rects
inherited the same text-layer misalignment that caused the partial
selection captures in the first place.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 22:05:13 +02:00
0638c441c4 Add Debug text layer toggle for diagnosing PDF selection issues
PDF text selection misbehaviour (some glyphs unselectable, selections
jumping to other positions) is almost always caused by misalignment
between the visible canvas-rendered glyphs and the invisible text
layer that PDF.js overlays for selection. There's no way to see this
without devtools — which makes it hard for end users to tell whether
a specific PDF is OCR-noisy, has bad font fallbacks, or has no text
layer at all.

This adds a developer-facing toggle in the SessionMenu ("Debug text
layer") that:

- paints every text-layer span yellow with a blue outline so it's
  obvious where text is selectable and where it isn't, and
- logs every onSelection event to the browser console with the
  captured text, page, normalized rects, and the selectors the
  pipeline derived from it.

Preference persists in localStorage under
`citation-evidence:debug:textLayer`. Surfaced via a new
`useDebugFlag()` hook in @work so the SessionMenu (app layer) and the
ViewerShell (work layer) can both subscribe without breaching the
boundary plugin.
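
A minimal sketch of how such a persisted flag could be shared between
layers, assuming a small external store that a hook like `useDebugFlag()`
might wrap. The store shape, the `"1"` stored value, and the names
`createDebugFlagStore` and `createMemoryStorage` are hypothetical; only
the storage key comes from this commit.

```typescript
// Subset of the Web Storage API that the store needs (localStorage fits it).
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// Key named in the commit message.
const DEBUG_TEXT_LAYER_KEY = "citation-evidence:debug:textLayer";

// Hypothetical store: reads the flag from storage and notifies subscribers on change.
function createDebugFlagStore(storage: KeyValueStore, key: string) {
  const listeners = new Set<() => void>();
  return {
    // Assumption: the flag is stored as the string "1" when enabled.
    getSnapshot(): boolean {
      return storage.getItem(key) === "1";
    },
    set(enabled: boolean): void {
      if (enabled) storage.setItem(key, "1");
      else storage.removeItem(key);
      listeners.forEach((notify) => notify());
    },
    // Returns an unsubscribe function, the shape useSyncExternalStore expects.
    subscribe(listener: () => void): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
  };
}

// In-memory stand-in for localStorage, useful in tests.
function createMemoryStorage(): KeyValueStore {
  const data = new Map<string, string>();
  return {
    getItem: (k) => data.get(k) ?? null,
    setItem: (k, v) => {
      data.set(k, v);
    },
    removeItem: (k) => {
      data.delete(k);
    },
  };
}
```

In a React component the store's `subscribe` and `getSnapshot` could be
passed to `useSyncExternalStore`, so both layers observe the same value
without importing each other.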

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 21:43:15 +02:00
67bcc2423c Add per-row session delete + Reset all data; fix viewer URL fallback
UX gaps that surfaced while running the demo:

- ViewerShell hardcoded `/fixtures/pdfs/<title>` for the PDF URL,
  ignoring the `document.uri` blob URL that uploaded PDFs carry. The
  viewer either 404'd or — worse — silently served a fixture whose
  filename happened to collide. Prefer document.uri when present.

- SessionMenu only let you delete the *active* session. Added a small
  per-row "✕" button next to every session in the Switch-to list so a
  user can drop a session's data without first switching to it. Same
  click-to-confirm pattern as the existing Delete action.

- Added a "Reset all data…" affordance in both the SessionMenu and the
  empty-state landing. Calls a new `clearAllSessionData()` helper that
  wipes every `citation-evidence:*` key from localStorage, then forces
  a reload so all in-memory caches start fresh.

- `attachSessionPersister.writeOnDelete` was leaking the per-session
  `active-document-id:v1` key on every session delete. Now removed
  alongside the engine snapshot key.
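
The prefix wipe described above can be sketched as follows. This is an
illustration under stated assumptions, not the actual
`clearAllSessionData()` body: `clearKeysWithPrefix`, `EnumerableStore`,
and `createMemoryStore` are hypothetical names, and the interface only
mirrors the parts of the Web Storage API the sketch uses.

```typescript
// Subset of the Web Storage API needed to enumerate and delete keys.
interface EnumerableStore {
  readonly length: number;
  key(index: number): string | null;
  removeItem(key: string): void;
}

// Prefix taken from the commit message.
const STORAGE_PREFIX = "citation-evidence:";

// Removes every key that starts with `prefix` and returns the removed keys.
function clearKeysWithPrefix(storage: EnumerableStore, prefix: string): string[] {
  // Collect first: removing while iterating by index would shift later keys.
  const doomed: string[] = [];
  for (let i = 0; i < storage.length; i += 1) {
    const k = storage.key(i);
    if (k !== null && k.startsWith(prefix)) doomed.push(k);
  }
  for (const k of doomed) storage.removeItem(k);
  return doomed;
}

// Array-backed stand-in for localStorage, useful in tests.
function createMemoryStore(keys: string[]): EnumerableStore {
  const live = keys.slice();
  return {
    get length() {
      return live.length;
    },
    key: (index) => live[index] ?? null,
    removeItem: (k) => {
      const at = live.indexOf(k);
      if (at >= 0) live.splice(at, 1);
    },
  };
}
```

In the browser the same function could be called with
`window.localStorage` and `STORAGE_PREFIX`, followed by a page reload.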

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 20:49:37 +02:00
d5474a1bd9 Set pdfjs GlobalWorkerOptions.workerSrc at app bootstrap
The PDF.js library refuses to open documents without a worker URL.
Production builds were throwing "No GlobalWorkerOptions.workerSrc
specified" on any upload because neither the source-layer ingest
(extract.ts) nor the viewer adapter ever set one — they relied on the
host application to do it, and the browser bootstrap didn't.

main.tsx now imports the worker via Vite's `?url` suffix so the file
is bundled into the build, and sets GlobalWorkerOptions.workerSrc
once before any PDF code runs. Added src/vite-env.d.ts so TypeScript
knows about the `?url` import suffix.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 15:28:51 +02:00
a5f5c7d8a8 Add Makefile with preview target
`make preview` runs `pnpm build && pnpm preview` so the demo can be
served from the production bundle at http://localhost:4173/. Plus
convenience targets for dev, build, test, typecheck, lint.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 15:17:33 +02:00
779ae0d317 Implement CE-WP-0005 T01-T08: demo app — sessions, uploads, ZIP archive
Turn the MVP into a self-contained demo. Users now:
  1. Land on an empty-state and create a named session.
  2. Drag-drop or pick arbitrary PDFs into that session.
  3. Annotate, build evidence, link to form fields — all session-scoped.
  4. Export the whole session as a single .zip archive (manifest +
     per-document PDFs).
  5. Import a .zip back — into a new session, or merged into an
     existing one (documents deduped by SHA-256 fingerprint;
     annotations/evidence/links added additively).
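
The merge policy in step 5 can be sketched like this. The record shapes,
the `idRemap` table, and the function names are assumptions made for
illustration; the commit only states that documents are deduped by
SHA-256 fingerprint and that other records are added additively.

```typescript
// Hypothetical document record; `fingerprint` stands for the SHA-256 hex digest.
interface ArchivedDocument {
  id: string;
  fingerprint: string;
}

interface DocumentMergeResult {
  documents: ArchivedDocument[];
  // Maps each imported document id to the id it resolves to after the merge.
  idRemap: Map<string, string>;
}

// Keeps existing documents and adds only incoming ones with an unseen fingerprint.
function mergeDocuments(
  existing: ArchivedDocument[],
  incoming: ArchivedDocument[],
): DocumentMergeResult {
  const byFingerprint = new Map<string, ArchivedDocument>();
  for (const doc of existing) byFingerprint.set(doc.fingerprint, doc);

  const documents = existing.slice();
  const idRemap = new Map<string, string>();
  for (const doc of incoming) {
    const match = byFingerprint.get(doc.fingerprint);
    if (match) {
      // Same bytes already present: point the imported id at the existing document.
      idRemap.set(doc.id, match.id);
    } else {
      byFingerprint.set(doc.fingerprint, doc);
      documents.push(doc);
      idRemap.set(doc.id, doc.id);
    }
  }
  return { documents, idRemap };
}

// Additive merge: imported records are appended, never matched against existing ones.
function mergeAdditively<T>(existing: T[], incoming: T[]): T[] {
  return existing.concat(incoming);
}
```

Because annotations, evidence, and links are appended rather than
matched, importing the same archive twice yields one document but
doubled records, which is consistent with the known limitation noted
in this commit.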

Architecture:
- New shared types: SessionId, Session, SessionArchiveManifest +
  parseSessionArchiveManifest with schema-version validation.
- SessionService (engine/services/sessions.ts) handles lifecycle
  (create/rename/delete/setActive) + emits 4 new events through its
  own bus; SharedContracts.md §4 lists the additions.
- SessionProvider (work/SessionContext.tsx) owns the cross-session
  state: service, per-session PdfByteStore registry, per-session
  version counter that drives EngineProvider remounts after imports.
- EngineProvider becomes session-aware (sessionId prop drives per-
  session localStorage keys). Bumping engineRevision after
  restoreFromStorage forces consumers to re-render so restored repos
  show up immediately.
- PdfByteStore (source/pdf/byte-store.ts) holds Uint8Array bytes per
  document and mints blob URLs; ingestPdfFromFile is the upload
  entry-point that wraps the existing ingestPdf pipeline.
- ADR-0008 locks the ZIP layout (manifest.json + documents/<id>.pdf),
  the manifest schema (schemaVersion 1), and the merge-on-collision
  policy. JSZip is the only new dependency.
- App.tsx restructured: SessionProvider at the root, EngineProvider
  keyed by ${sessionId}:${version}, hash routing #/s/<id>[/forms/demo],
  SessionMenu top-bar, CreateFirstSession empty state.
- New DocumentRemoved event for per-document delete cleanup in
  CollectionList; engine.documents.remove() is the new service method.
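
A minimal sketch of a parser for the hash routes named above
(`#/s/<id>[/forms/demo]`). The `Route` type, the `"review"` and
`"forms-demo"` view names, and `parseHashRoute` are hypothetical; the
fallback to a home route for anything unrecognized is also an assumption.

```typescript
// Hypothetical route model: either the landing page or a session-scoped view.
type Route =
  | { kind: "home" }
  | { kind: "session"; sessionId: string; view: "review" | "forms-demo" };

// Parses a location hash such as "#/s/abc" or "#/s/abc/forms/demo".
function parseHashRoute(hash: string): Route {
  const match = /^#\/s\/([^/]+)(\/forms\/demo)?\/?$/.exec(hash);
  const sessionId = match?.[1];
  if (!sessionId) return { kind: "home" };
  const formsSuffix = match?.[2];
  return {
    kind: "session",
    sessionId,
    view: formsSuffix ? "forms-demo" : "review",
  };
}
```

A caller could combine the parsed `sessionId` with the per-session
version counter to build the `${sessionId}:${version}` key that forces
an EngineProvider remount.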

Tests:
- Unit: 16 SessionService lifecycle + persistence tests;
  per-session snapshot round-trip; PdfByteStore + ingestPdfFromFile;
  SessionArchive parser; exportSessionZip + importSessionZip with
  create + merge + corrupt-archive paths.
- DOM: UploadDropzone, session-scoped CollectionList delete,
  SessionMenu create/switch/rename, routing parser.
- E2E: tests/integration/session-export-reimport.dom.test.tsx walks
  the full create → annotate → export → reimport flow and asserts
  the additive merge (deduped doc + doubled evidence rows).
- Legacy E2Es updated to use a seed-session helper instead of the
  removed fixture-button flow.

Known limitation (documented in ADR-0008): re-importing your own
freshly-exported ZIP creates duplicate annotations. Forward pointer
left for an importBundleId follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 14:57:28 +02:00
8632f7b04a Implement CE-WP-0004 T01-T05: citation card export (Markdown + HTML)
Per-evidence-item export: click Export → Copy as Markdown / Copy as HTML
writes a portable citation card to the clipboard. Cmd/Ctrl+Shift+C
exports the active evidence as Markdown.

- ADR-0007 locks the Markdown + HTML output formats.
- New shared types: CitationCard, openContextUrl(), resolveSourceLabel().
- Engine renderers under src/engine/rendering/: renderCitationCardMarkdown,
  renderCitationCardHtml — snapshot-tested, escape-safe, BEM classes for HTML.
- src/work/useExportEvidence.ts wires engine + renderers + clipboard.
- EvidenceSidebar gains an Export popover per row + auto-dismissing toast.
- E2E test (tests/integration/citation-card-export-e2e.dom.test.tsx)
  walks PRD scenario steps 10-11 and asserts the clipboard payload.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 14:43:17 +02:00
8607c252c4 Implement CE-WP-0003 T01-T08: form binding + visual guide overlay
T01 EvidenceLink/EvidenceSet types
  - src/shared/evidence-link.ts: status (§2.4), relation (§2.5), target
  - src/shared/evidence-set.ts: ordered group + activeEvidenceItemId
  - enum-conformance test parses SharedContracts.md and asserts the
    runtime lists match exactly

T02 Binding service + in-memory link repo + active-state machine
  - src/binder/repos/in-memory-links.ts: Map-backed EvidenceLinkRepository
  - src/binder/services/bindings.ts: link/unlink/list/update/setActive
    emitting §4 EvidenceLinkCreated / EvidenceLinkUpdated /
    EvidenceItemActivated
  - src/binder/state/active.ts: (target, evidence, annotation) reducer
    + ActiveStateProvider + useActiveState hook
  - extended engine/events/types.ts with EvidenceLinkCreated,
    EvidenceLinkUpdated, FormFieldActivated payloads

T03 Rect registry (SharedContracts §7 — contract FROZEN)
  - src/binder/visual-guide/rect-registry.ts: register/getRect/subscribe
    + invalidate + getVersion for useSyncExternalStore
  - events.ts: scroll/resize/focus pumps via window + ResizeObserver +
    IntersectionObserver, rAF-throttled
  - react-hooks.ts: RectRegistryProvider, useRegisterRect(kind,id,ref),
    useRectRegistryVersion
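
  A rough sketch of a registry with the surface listed for T03
  (register/getRect/subscribe + invalidate + getVersion). Signatures,
  the `"highlight"` kind, and the measure-on-demand design are
  assumptions; the frozen contract in SharedContracts §7 is authoritative.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// "field" and "evidence-card" appear in this commit; "highlight" is assumed.
type RectKind = "field" | "evidence-card" | "highlight";

// Hypothetical registry: stores measure callbacks and bumps a version on every change.
function createRectRegistry() {
  const measurers = new Map<string, () => Rect | null>();
  const listeners = new Set<() => void>();
  let version = 0;

  const keyOf = (kind: RectKind, id: string): string => `${kind}:${id}`;

  // Signals that positions may have changed (scroll, resize, focus).
  const invalidate = (): void => {
    version += 1;
    listeners.forEach((notify) => notify());
  };

  return {
    // Registers a measure callback and returns an unregister function.
    register(kind: RectKind, id: string, measure: () => Rect | null): () => void {
      const key = keyOf(kind, id);
      measurers.set(key, measure);
      invalidate();
      return () => {
        if (measurers.get(key) === measure) {
          measurers.delete(key);
          invalidate();
        }
      };
    },
    // Measures lazily so callers always see the current position.
    getRect(kind: RectKind, id: string): Rect | null {
      const measure = measurers.get(keyOf(kind, id));
      return measure ? measure() : null;
    },
    subscribe(listener: () => void): () => void {
      listeners.add(listener);
      return () => {
        listeners.delete(listener);
      };
    },
    invalidate,
    getVersion: (): number => version,
  };
}
```

  With this shape, `useSyncExternalStore(registry.subscribe,
  registry.getVersion)` re-renders a consumer whenever the version
  changes, after which it can call `getRect` for fresh coordinates.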

T04 Form schema + renderer
  - src/app/forms/demo-schema.ts: text/textarea/date minimal schema
  - src/binder/FormRenderer.tsx: renders schema, each field registers
    as rect kind="field"; active field gets aria-current="true"
  - placed in binder/ (not work/) because work cannot import binder per
    DependencyMap.md §2 and the renderer needs the rect-registry hook;
    workplan T04 was amended in-place to document this

T05 Side-by-side Forms layout + click-to-link
  - src/app/forms/FormsApp.tsx + src/app/App.tsx top-bar router with
    hash route #/forms/demo
  - BinderProvider mounted at app root so links survive tab switching
  - stage-evidence-then-click-field linking interaction with banner
    + per-field link-count chip

T06 Active-evidence cycling
  - src/app/forms/ActiveEvidenceChips.tsx: chips per active target,
    Tab cycles natively, first chip auto-activates on field focus,
    each chip registers as rect kind="evidence-card"
  - ScrollBridge in FormsApp wires activeAnnotationId to viewer scroll
  - EvidenceSidebar + EvidenceStrip highlight the active item via the
    new useLastActivatedEvidence hook in work/EngineContext

T07 SVG visual-guide overlay
  - src/binder/visual-guide/Overlay.tsx: single fixed-positioned SVG,
    draws field→card and card→highlight bezier curves for the active
    triple, rAF-throttled via the registry
  - src/anchor exposes getHighlightClientRects(annotationId); the
    spike viewer wraps highlights in [data-highlight-id] so the helper
    can locate them
  - src/app/forms/HighlightRectBridge.tsx: registers the active
    annotation's rect via that helper

T08 End-to-end test (PRD scenario steps 5-9)
  - tests/integration/forms-overlay-e2e.dom.test.tsx: full path from
    Review-mode capture through Forms-mode link to active triple +
    aria-current assertions + 2 SVG paths in the overlay
  - additional integration coverage: forms-link-flow + forms-active-cycling

Gates: typecheck ✓ · lint ✓ · build ✓ · 152/152 tests across 21 files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 15:53:17 +02:00
d54daf2e61 Implement CE-WP-0002 T03-T09: ingest, anchor resolution, engine, UI, persistence, e2e
Completes the PDF review slice end-to-end. After this commit a user can
open a fixture, select text, save an evidence item with commentary, see
it in the sidebar, reload the page, click the item, and the viewer
scrolls to the passage.

- T03 src/source/pdf/{fingerprint,extract,ingest}.ts + 39 fixture tests
  - SHA-256 fingerprint over a fresh ArrayBuffer (TS BufferSource-safe)
  - PDF.js text extract; per-page normalize then join with "\n\n"
  - PageMap + OffsetMap (gap-free coverage); pageLength = end - start
  - Updated manifest's Betriebskosten quote to one PDF.js extracts cleanly
- T04 src/anchor/selectors/{create,resolve}.ts + 25 unit + 7 fixture tests
  - createSelectors emits the maximal redundant set (TextQuote +
    TextPosition + PdfRect + PdfPageText when available)
  - resolveSelectors implements the SharedContracts §7 ladder; confidence
    1.0 (pos+quote) → 0.7 (rect-only) → 0 (unresolved)
  - Cross-module integration test moved to tests/integration/ to honor
    the anchor↛source boundary lint rule
- T05 engine: sync event bus over the closed §4 vocabulary, Map-backed
  repos, services, createEngine() composition root, 12 tests
- T06 work + app: three-pane shell (CollectionList | ViewerShell |
  EvidenceSidebar) wired through EngineProvider; EngineContext lives in
  src/work/ to respect the work↛app boundary; SpikeApp deleted
- T07 AnnotationToolbar: pendingSelection in context; Save runs
  createSelectors → engine.annotations.create → engine.evidence.create
- T08 click-to-reopen + localStorage persistence
  - scrollToAnnotation state in context with a version counter so a
    second click on the same item re-fires the viewer scroll
  - captureSnapshot/restoreSnapshot/attachPersister/restoreFromStorage;
    restore bypasses services to avoid event-loops
  - active-document id persisted alongside the snapshot so reload lands
    on the same fixture; ADR-0005 written
  - 9 persistence tests
- T09 tests/integration/app-prd-scenario.dom.test.tsx
  - end-to-end happy-dom test of PRD scenario steps 1-8 through the real
    React tree; viewer + ingest mocked per ADR-0004's headless-Chromium
    limitation. Fixed memo-deps bug in EvidenceSidebar/ViewerShell where
    useEngineEventTick values were not included in the useMemo deps,
    leaving stale memoization across event-driven re-renders
- vitest.config.ts: happy-dom for *.dom.test.{ts,tsx} files
- noEmit added to tsconfig so tsc -b doesn't litter src/ with .js outputs

Gates: typecheck ✓ lint ✓ test 109/109 across 11 files ✓ build ✓

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 10:58:11 +02:00
2a7b05c190 Implement CE-WP-0002 T01-T02: engine types + PDF viewer adapter spike
T01: shared engine types (Document, Selector union, Annotation, EvidenceItem,
branded IDs with newId factory) per wiki/SharedContracts.md §1-§3.

T02: react-pdf-highlighter-plus v1.1.4 spike behind the §5
DocumentViewerAdapter contract in src/anchor/. Pure round-trip math
extracted to pdf-selector-math.ts with 11 unit tests proving lossless
capture → selectors → JSON → restored-rects. ADR-0004 accepted; full
user-flow Playwright verification deferred to T09.

Adds Vite app shell (index.html, src/app/SpikeApp.tsx) so the spike is
exercisable via pnpm dev. tsconfig --noEmit prevents tsc -b from
littering src/ with stray .js outputs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 02:21:31 +02:00
2f25f99cae Implement CE-WP-0001 Foundations: TS scaffold, lint boundaries, normalize v1, fixtures
T01 Toolchain — vite + pnpm 9.15 + React 18 + strict TS (ADR-0001).
T02 Folder layout — src/{shared,engine,anchor,source,binder,work,app}/
    mirroring the future subsystem split, with path aliases.
T03 Boundary lint — eslint-plugin-boundaries enforcing the dependency
    edges from wiki/DependencyMap.md §4; verified by a violating fixture.
T04 Canonical normalization v1 — src/shared/text/normalize.ts with
    NORMALIZE_VERSION=1; 10/10 vitest tests covering ligatures, CRLF, soft
    hyphens (including line-break reassembly), mixed whitespace.
T05 PDF fixture corpus — 7 user-supplied German PDFs in fixtures/pdfs/
    (gitignored binaries) plus a manifest with verbatim known-good
    quotes and page counts, ready for CE-WP-0002 selector tests.
T06 README upgrade — umbrella README points at wiki/docs/workplans
    and documents the dev workflow.
T07 ADR-0002..0006 stubs in docs/decisions/.
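
The cases that T04 lists (ligatures, CRLF, soft hyphens with line-break
reassembly, mixed whitespace) could be handled along these lines. This
is a sketch, not the contents of src/shared/text/normalize.ts:
`normalizeText` is a hypothetical name, and using Unicode NFKC to expand
ligatures is an assumption that also folds other compatibility
characters.

```typescript
// Version constant named in the commit message.
const NORMALIZE_VERSION = 1;

// Hypothetical normalizer covering the four cases listed under T04.
function normalizeText(input: string): string {
  return input
    // NFKC expands compatibility ligatures, e.g. U+FB01 becomes "fi".
    .normalize("NFKC")
    // CRLF and lone CR become LF.
    .replace(/\r\n?/g, "\n")
    // Drop soft hyphens; if one ends a line, rejoin the word across the break.
    .replace(/\u00AD\n?/g, "")
    // Collapse runs of non-newline whitespace into a single space.
    .replace(/[^\S\n]+/g, " ");
}
```

Storing `NORMALIZE_VERSION` next to any offsets derived from normalized
text would let a later version detect data produced under older rules.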

Toolchain end-to-end: pnpm install + lint + typecheck + test all green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 00:13:03 +02:00
707620adfb Wire workplans to state-hub: topic, 4 workstreams, 29 tasks (UUIDs in frontmatter + task blocks)
Created topic citation_evidence_mvp (96fa8e80-…) and one workstream per
workplan under the citation-evidence repo. Created one state-hub task per
task block. All UUIDs now recorded in workplan frontmatter
(state_hub_workstream_id) and per task block (state_hub_task_id) so ralph
can call update_task_status against them on first run.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:07:58 +02:00
d06a456c2a Establish shared-contracts home, dependency map, MVP workplans, and umbrella-first strategy
- INTENT.md: declare umbrella as the home for shared contracts; document
  umbrella-first MVP decision (code lives here until subsystems stabilize)
- wiki/SharedContracts.md: vocabulary, state enums, relation types,
  selector taxonomy, event vocabulary, viewer adapter contract,
  canonical text normalization, rect-registry contract
- wiki/DependencyMap.md: allowed dependency edges; folder layout +
  lint-rule strategy during umbrella-first phase
- history/2026-05-24-initial-assessment.md: alignment review, technical
  risks, and the umbrella-first pivot rationale
- workplans/CE-WP-0001..0004: four ralph-compatible workplans covering
  foundations, PDF review slice, form binding + visual guide, and
  citation card export — implementing PRD §20 end-to-end

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 16:42:25 +02:00
bc95737e6a Added intent, PRD, and architecture 2026-05-24 15:26:34 +02:00
5c1ffcb58c Update README.md 2026-05-24 13:21:09 +00:00
Coulomb Social
175b241f40 Initial commit 2026-05-24 13:20:45 +00:00