---
id: CE-WP-0001
type: workplan
title: "Foundations — TS scaffold, folder layout, lint boundaries, normalization, fixtures"
domain: citation_evidence
repo: citation-evidence
repo_id: a677c189-b4e2-4f2a-9e48-faa482c277e6
topic_slug: citation_evidence_mvp
topic_id: 96fa8e80-9f74-40f2-84cd-644e9747b9ec
state_hub_workstream_id: 1737bf6e-a3cb-413e-81b8-932f6f85791c
status: done
owner: Bernd
created: 2026-05-24
updated: 2026-05-24
spec_refs:
  - wiki/ProductRequirementsDocument.md
  - wiki/ArchitectureOverview.md
  - wiki/SharedContracts.md
  - wiki/DependencyMap.md
---

# CE-WP-0001 — Foundations

Establish the skeleton of the umbrella-first MVP: a TypeScript project with a folder layout that mirrors the future subsystem split (so that extracting to sister repos later is a `git mv` plus a `package.json` cut), lint rules that enforce the dependency map at the folder level, the versioned canonical-text normalization function, and a small but representative PDF fixtures corpus.

No product features yet. This workplan exists so that everything from `CE-WP-0002` onward has somewhere to land.

## Decisions captured here

Each task below corresponds to a Phase-0 ADR. The ADR lives at `docs/decisions/ADR-NNNN-.md`. If a task involves a choice that wasn't already decided, the agent stops and asks Bernd before writing code.

## Dependency Order

```
T01 (toolchain decision + package.json)
  └─ T02 (folder layout per DependencyMap §4)
       └─ T03 (lint rules enforcing dep edges)
       └─ T04 (canonical text normalization v1, versioned)
  └─ T05 (fixtures: 5+ representative PDFs + a manifest)
  └─ T06 (README upgrade + dev workflow doc)
  └─ T07 (write the six pending ADRs as stubs)
```

---

## T01 — Toolchain + package.json + tsconfig

```task
id: CE-WP-0001-T01
state_hub_task_id: 4de816d0-34de-4bdf-a802-da1b0feefc19
priority: critical
status: done
```

Decide the TS toolchain (vite vs tsc-only vs Next.js) and write a single `package.json` at the repo root.

Decisions to lock in this task as an ADR (`docs/decisions/ADR-0001-toolchain.md`):

- Bundler: vite (recommended, fastest dev loop for a React MVP)
- Package manager: pnpm (recommended, plays well with future workspace split)
- React 18+
- Strict TS

Deliverables:

- `package.json` with `dev`, `build`, `test`, `lint`, `typecheck` scripts
- `tsconfig.json` with strict mode, paths for the `src/` partitions
- `.nvmrc` pinning Node version
- `docs/decisions/ADR-0001-toolchain.md` written and committed

Do not install application dependencies yet — just the toolchain.

---

## T02 — Folder layout matching DependencyMap §4

```task
id: CE-WP-0001-T02
state_hub_task_id: 448d2d93-9517-4649-8aac-e00907a12a0a
priority: critical
status: done
depends_on: [T01]
```

Create the source folder layout:

```
src/
  shared/   # will become @citation-evidence/engine (types + contracts)
  engine/   # will become @citation-evidence/engine (services)
  anchor/   # will become @citation-evidence/anchor
  source/   # will become @citation-evidence/source
  work/     # will become @citation-evidence/work (UI)
  binder/   # will become @citation-evidence/binder
  app/      # the reference workspace shell
```

Each folder gets:

- A one-line `README.md` stating its future home
- An `index.ts` that re-exports its public API (empty for now)

Add path aliases in `tsconfig.json`: `@shared/*`, `@engine/*`, etc.
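
The alias step in T02 can be sketched as a small generator for the `compilerOptions.paths` entry. This is an illustration under assumptions, not the committed config: the helper name `buildPathAliases`, the `src/<name>/*` target shape, the `baseUrl` of `.`, and the choice to alias all seven partitions (the task only says "`@shared/*`, `@engine/*`, etc.") are all hypothetical.

```typescript
// Partition names are taken from the T02 folder layout.
// Assumption: every partition gets an alias, including `app`.
const PARTITIONS = ["shared", "engine", "anchor", "source", "work", "binder", "app"] as const;

type PathsMap = Record<string, string[]>;

// Hypothetical helper: builds the tsconfig `paths` object,
// mapping "@<name>/*" to ["src/<name>/*"] for each partition.
export function buildPathAliases(partitions: readonly string[] = PARTITIONS): PathsMap {
  const paths: PathsMap = {};
  for (const name of partitions) {
    paths[`@${name}/*`] = [`src/${name}/*`];
  }
  return paths;
}

// Print a fragment that could be merged into tsconfig.json by hand.
// Assumption: baseUrl is the repo root.
console.log(
  JSON.stringify({ compilerOptions: { baseUrl: ".", paths: buildPathAliases() } }, null, 2),
);
```

TypeScript uses `paths` for type resolution only. The bundler chosen in T01 may need a matching alias configuration or a plugin before these imports resolve at build time.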

---

## T03 — Lint rules enforcing dependency edges

```task
id: CE-WP-0001-T03
state_hub_task_id: abd08afb-78e5-4b41-b956-53e5605c1113
priority: high
status: done
depends_on: [T02]
```

Install `eslint-plugin-boundaries` (or equivalent) and configure rules per `wiki/DependencyMap.md` §4:

| Folder       | May import from                                  |
|--------------|--------------------------------------------------|
| `shared/`    | (nothing internal)                               |
| `engine/`    | `shared/`                                        |
| `anchor/`    | `shared/`, `engine/`                             |
| `source/`    | `shared/`, `engine/`                             |
| `binder/`    | `shared/`, `engine/`, `anchor/`                  |
| `work/`      | `shared/`, `engine/`, `anchor/`, `source/`       |
| `app/`       | any                                              |

Add a failing test fixture in `binder/` that imports from `source/` and confirm lint catches it; remove the fixture afterward. `pnpm lint` must pass on a clean tree.

---

## T04 — Canonical text normalization v1

```task
id: CE-WP-0001-T04
state_hub_task_id: 0ca4f848-20c5-425e-8996-a73569c9be16
priority: critical
status: done
depends_on: [T02]
```

Implement `src/shared/text/normalize.ts` per `wiki/SharedContracts.md` §6:

1. Unicode NFC
2. Normalize line endings to `\n`
3. Collapse horizontal whitespace runs to a single space
4. Strip soft hyphens (U+00AD)
5. Preserve paragraph boundaries (`\n\n`)

Public API:

```ts
export const NORMALIZE_VERSION = 1;
export function normalize(input: string): { text: string; version: number };
```

Include unit tests covering: ligatures, CRLF input, soft-hyphenated German, mixed whitespace, paragraph preservation.

Stored selectors will record this version number so that future normalization changes can be detected as a migration concern.

---

## T05 — PDF fixtures corpus + manifest

```task
id: CE-WP-0001-T05
state_hub_task_id: 0b686530-ef89-4172-b5c8-de97fa7b7ef0
priority: high
status: done
depends_on: [T01]
```

Assemble `fixtures/pdfs/` with at least 5 representative PDFs:

- A simple single-column text PDF
- A two-column academic PDF (e.g. ACM-style)
- A German PDF with umlauts and soft hyphens
- A form PDF (e.g. a public-sector application form)
- A PDF with a heading hierarchy

Write `fixtures/pdfs/manifest.json` recording for each:

- filename
- short description
- expected page count
- one short "known-good quote" with the page number it appears on (used by CE-WP-0002 selector tests)

Keep each PDF small (< 1 MB) and check sources/licenses into `fixtures/pdfs/SOURCES.md`. Public-domain or Bernd-authored only.

---

## T06 — README upgrade + dev workflow doc

```task
id: CE-WP-0001-T06
state_hub_task_id: b0a5b5a4-81f0-4359-a6e1-67bc6c77e52b
priority: medium
status: done
depends_on: [T01, T02]
```

Replace the one-line `README.md` with a real one:

- What citation-evidence is (one paragraph from INTENT)
- Repository layout (point at `src/` partitions and what each becomes)
- Where to find docs (`wiki/`, `docs/decisions/`, `history/`, `workplans/`)
- Dev workflow: `pnpm install`, `pnpm dev`, `pnpm test`, `pnpm lint`
- Pointer to `~/ralph-workplan/` for how workplans are driven

Add a one-paragraph `README.md` in each of the five sister repos pointing back at this umbrella + reminding readers that code lives upstream during the MVP phase.

---

## T07 — Stub the six pending ADRs

```task
id: CE-WP-0001-T07
state_hub_task_id: 15456374-73e0-403e-b805-2e259247e615
priority: medium
status: done
depends_on: [T01]
```

Create stub files in `docs/decisions/` for each ADR mentioned in `wiki/SharedContracts.md` §10:

- `ADR-0001-toolchain.md` (filled in by T01)
- `ADR-0002-monorepo-vs-polyrepo.md`
- `ADR-0003-w3c-mapping-scope.md`
- `ADR-0004-pdf-viewer-library.md`
- `ADR-0005-persistence.md`
- `ADR-0006-selector-ownership-split.md`

Each stub: title, status (`proposed` for 2-6), context (one paragraph explaining what the decision is about and why it matters), options (bullet list with pros/cons), decision (blank), consequences (blank).

These are not decisions yet — they are *the questions that must be answered before the relevant code lands*. The MVP can proceed without 2-6 being resolved because no extraction or persistence happens until later workplans.
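
As a companion to T04, the five normalization steps could be sketched as below. This is one possible reading of the list, not the implementation in `src/shared/text/normalize.ts`, and `wiki/SharedContracts.md` §6 remains the authority. Two choices here are assumptions: "horizontal whitespace" is taken to mean spaces and tabs only, and soft hyphens are stripped before whitespace is collapsed so that a soft hyphen sitting between two spaces does not leave a double space behind.

```typescript
export const NORMALIZE_VERSION = 1;

// Sketch of the T04 steps. Step order and the whitespace set are assumptions.
export function normalize(input: string): { text: string; version: number } {
  const text = input
    .normalize("NFC")            // 1. Unicode NFC (canonical composition)
    .replace(/\r\n?/g, "\n")     // 2. CRLF and lone CR become LF
    .replace(/\u00AD/g, "")      // 4. strip soft hyphens (done early, see lead-in)
    .replace(/[ \t]+/g, " ");    // 3. collapse runs of spaces/tabs to one space
  // 5. Newlines are never touched after step 2, so "\n\n" paragraph
  //    boundaries pass through unchanged.
  return { text, version: NORMALIZE_VERSION };
}
```

One detail worth keeping in mind for the ligature test case: NFC applies canonical composition only, so a compatibility ligature such as U+FB01 (`ﬁ`) passes through unchanged. Folding it to `fi` would require NFKC, which is a different contract from the one listed in T04.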