There is a strong likelihood that the "text layer is misplaced / body text not selectable" symptoms seen across multiple PDFs come from PDF.js falling back to substitute font metrics. Without the cmaps directory (CID character maps for non-Latin fonts) and the standard_fonts directory (Helvetica/Times/Courier metrics for unembedded standard fonts), the canvas glyphs use embedded font data while the text-layer span positions are computed from fallback metrics. The two diverge: text spans land in the wrong place, or text content can't be decoded at all, leaving the body unselectable.

Both directories are now copied into the served root by vite-plugin-static-copy and passed to pdfjs.getDocument() as `cMapUrl: "/cmaps/"`, `cMapPacked: true`, and `standardFontDataUrl: "/standard_fonts/"` via PdfLoader's `document` prop (which accepts a full DocumentInitParameters object).

If this is the right diagnosis, the textLayer overlay should now line up with the visible glyphs on the same PDFs that were producing fragmented captures. If the body text is still unselectable, the PDF genuinely lacks a text layer for those glyphs (image-only content), and OCR would be the only path forward.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
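As a rough illustration of the loader options named in the note above, the following sketch builds the parameter object in one place. It is not the project's actual code: `buildDocumentParams`, its `assetRoot` parameter, the local `PdfFontAssetParams` interface, and the example PDF path are all hypothetical names introduced here. The only details taken from the note are the three option names and their values. Both URLs keep their trailing slash, which pdf.js expects because it appends file names directly to them.

```typescript
// Local mirror of the pdf.js DocumentInitParameters fields mentioned above
// (plus `url`). Declared here so the sketch compiles without pdfjs-dist;
// in real code the type would come from the library.
interface PdfFontAssetParams {
  url: string;
  cMapUrl: string;
  cMapPacked: boolean;
  standardFontDataUrl: string;
}

// ASSUMPTION: `assetRoot` is a hypothetical parameter naming the URL prefix
// under which the copied cmaps/ and standard_fonts/ directories are served.
function buildDocumentParams(url: string, assetRoot: string = "/"): PdfFontAssetParams {
  // Normalise the prefix so it always ends in exactly one slash.
  const root = assetRoot.replace(/\/+$/, "") + "/";
  return {
    url,
    cMapUrl: `${root}cmaps/`,
    cMapPacked: true,
    standardFontDataUrl: `${root}standard_fonts/`,
  };
}

// Hypothetical PDF path, for illustration only.
const params = buildDocumentParams("/papers/example.pdf");
console.log(params.cMapUrl, params.standardFontDataUrl);
```

An object of this shape is what would be handed to pdfjs.getDocument(), or to PdfLoader's `document` prop as the note describes.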
citation-evidence
A document-centered evidence workspace for capturing, managing, presenting,
and re-opening citations. The umbrella over the six-package design described
in INTENT.md and wiki/ArchitectureOverview.md.
During the MVP all code lives here under src/ (see "Repository layout"
below). Sister repos hold INTENT only — code migrates outward when each
subsystem stabilises.
Documentation
| Where | What |
|---|---|
| INTENT.md | Project intent, scope, the umbrella-first decision |
| wiki/ | PRD, Architecture, SharedContracts, DependencyMap |
| docs/decisions/ | ADRs (architecturally significant decisions) |
| workplans/ | Ralph-driven workplans that implement the MVP slice |
| history/ | Time-stamped assessments and post-mortems |
The canonical contracts are in wiki/SharedContracts.md;
the partition boundaries are in wiki/DependencyMap.md.
Both are referenced from every workplan and from each sister repo's INTENT.md.
Repository layout
src/
  shared/   # vocabulary, types, pure helpers → becomes part of citation-engine
  engine/   # services, repositories, event bus → becomes part of citation-engine
  anchor/   # selector creation/resolution, viewer adapter contract → becomes evidence-anchor
  source/   # ingest, fingerprint, extraction, recovery → becomes evidence-source
  binder/   # evidence-to-target binding, visual guide → becomes evidence-binder
  work/     # review UI (sidebar, viewer shell) → becomes citation-work
  app/      # the reference workspace shell → stays in citation-evidence
The dependency-edge rules between partitions are enforced by ESLint via
eslint-plugin-boundaries (see eslint.config.js). Extraction to a sister
repo is intended to be a git mv plus a package.json cut — nothing more.
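To make the idea of a dependency-edge rule concrete, here is a minimal sketch of the kind of check such a rule performs. It does not use eslint-plugin-boundaries and does not reproduce the project's real rules, which live in eslint.config.js and wiki/DependencyMap.md. The partition names mirror the src/ folders listed above, but the `allowedImports` table and the helper names `partitionOf` and `isAllowedImport` are assumptions made up for illustration.

```typescript
type Partition = "shared" | "engine" | "anchor" | "source" | "binder" | "work" | "app";

// ASSUMPTION: these edges are invented for illustration only; the real
// allow-list is defined in eslint.config.js and wiki/DependencyMap.md.
const allowedImports: Record<Partition, readonly Partition[]> = {
  shared: [],
  engine: ["shared"],
  anchor: ["shared"],
  source: ["shared"],
  binder: ["shared", "anchor"],
  work: ["shared", "engine"],
  app: ["shared", "engine", "anchor", "source", "binder", "work"],
};

// Map a repo-relative path such as "src/engine/bus.ts" to its partition.
function partitionOf(path: string): Partition | undefined {
  const match = /^src\/([^/]+)\//.exec(path);
  const name = match ? match[1] : undefined;
  if (name === undefined) return undefined;
  // Own-property check so names like "toString" are not mistaken for partitions.
  return Object.prototype.hasOwnProperty.call(allowedImports, name)
    ? (name as Partition)
    : undefined;
}

// An import is allowed within a partition, or along a listed edge.
function isAllowedImport(fromFile: string, toFile: string): boolean {
  const from = partitionOf(fromFile);
  const to = partitionOf(toFile);
  if (from === undefined || to === undefined) return false;
  return from === to || allowedImports[from].indexOf(to) !== -1;
}

// Hypothetical file names, for illustration only.
console.log(isAllowedImport("src/engine/bus.ts", "src/shared/types.ts"));
console.log(isAllowedImport("src/shared/types.ts", "src/engine/bus.ts"));
```

Keeping the edges in a single table is what would make extraction cheap: a partition whose row lists only partitions that are already published packages can be moved out without touching its imports.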
Sister repos
Peers under ~/; each holds INTENT.md only during MVP:
- ~/citation-engine: shared model + engine services
- ~/evidence-anchor: selectors + adapter contract
- ~/evidence-source: ingest, representation, recovery
- ~/evidence-binder: binding, visual guide, rect registry
- ~/citation-work: review UI surfaces
Dev workflow
Requirements: Node 20 LTS (see .nvmrc) and pnpm 9.
pnpm install
pnpm dev # vite dev server (once src/app/ has a real entry)
pnpm test # vitest one-shot
pnpm test:watch
pnpm lint # eslint with boundary rules
pnpm typecheck # tsc --noEmit
pnpm build # production bundle
Workplans (Ralph)
Workplans drive incremental implementation through the ralph loop. The harness
lives in ~/ralph-workplan/; see workplans/README.md for the active list
and ordering.
/ralph-workplan workplans/CE-WP-0001-foundations.md
The loop self-retires when every task in the file has status: done and the
workplan's frontmatter also reads status: done.