generated from coulomb/repo-seed
Establish shared-contracts home, dependency map, MVP workplans, and umbrella-first strategy
- INTENT.md: declare umbrella as the home for shared contracts; document umbrella-first MVP decision (code lives here until subsystems stabilize)
- wiki/SharedContracts.md: vocabulary, state enums, relation types, selector taxonomy, event vocabulary, viewer adapter contract, canonical text normalization, rect-registry contract
- wiki/DependencyMap.md: allowed dependency edges; folder layout + lint-rule strategy during umbrella-first phase
- history/2026-05-24-initial-assessment.md: alignment review, technical risks, and the umbrella-first pivot rationale
- workplans/CE-WP-0001..0004: four ralph-compatible workplans covering foundations, PDF review slice, form binding + visual guide, and citation card export — implementing PRD §20 end-to-end

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
53
INTENT.md
@@ -189,6 +189,59 @@ This repository should be:
---
## Home for Shared Contracts

This repository is the **single home for everything the sister repos must
agree on**. The canonical documents live in `wiki/` and `docs/decisions/`:

* `wiki/ProductRequirementsDocument.md` — what the product does
* `wiki/ArchitectureOverview.md` — how the subsystems compose
* `wiki/SharedContracts.md` — vocabulary, state enums, relation types, selector taxonomy, event types, viewer adapter contract, canonical text normalization
* `wiki/DependencyMap.md` — which subsystem may depend on which
* `docs/decisions/` — ADRs that resolve ambiguities and bind the contract

Sister repos (`citation-engine`, `evidence-anchor`, `evidence-source`,
`citation-work`, `evidence-binder`) defer to these documents. When their
own `INTENT.md` files mention "shared contracts", they mean the documents
listed above.

Changes to shared contracts happen here, not in the sister repos.

---
## MVP Strategy — Umbrella-First (decided 2026-05-24)

**The MVP lives entirely in this repository before being segmented into the
sister repos.** This is a deliberate trade-off: fewer interface decisions up
front, more refactoring later when extraction happens.

The reasoning:

1. The architectural boundaries documented in the sister INTENT files are
   hypotheses. We do not yet know which ones will hold up under real product
   pressure.
2. Coordinating six repos with no working code is expensive. Coordinating one
   repo with working code is cheap.
3. Interfaces designed in advance of implementation tend to be wrong.
4. Extracting working code into a new repo is a known, bounded refactor.
   Reshaping a premature interface while implementing against it is not.

Concretely:

* All MVP source code lives under `citation-evidence/src/`, partitioned by
  future-repo names (`shared/`, `engine/`, `anchor/`, `source/`, `work/`,
  `binder/`, `app/`).
* The `DependencyMap.md` rules are enforced by lint rules on these folders.
* The five sister repos remain INTENT-only during MVP — they document the
  intended boundary, not current code.
* When a subsystem's interface stabilizes (typically after the MVP scenario
  has run end-to-end at least once), its `src/<repo-name>/` slice extracts
  to the sister repo.

This INTENT will be updated when extraction happens.

---
## Success Criteria

The repository is successful when it allows a developer or agent to understand, run, and extend the citation-evidence system as an integrated product.
113
history/2026-05-24-initial-assessment.md
Normal file
@@ -0,0 +1,113 @@
# Initial Assessment — citation-evidence ecosystem

**Date:** 2026-05-24
**Author:** Claude (Opus 4.7), commissioned by Bernd
**Scope:** Review of `citation-evidence` umbrella PRD and Architecture overview, plus all five sister-repo `INTENT.md` files, for alignment, risk, and recommended approach.

---

## 1. Overall alignment across the six INTENT.md files

The vocabulary is impressively coherent: every repo speaks of
`Document → DocumentRepresentation → Annotation → Selector → EvidenceItem → EvidenceLink → CitationCard`.
Each `INTENT.md` follows the same Purpose / Scope / Out-of-Scope / Architectural Position / First-Useful-Version / Success Criteria shape.
Out-of-scope sections show the authors deliberately *pushing* responsibilities into other repos — a healthy signal.

The PRD and Architecture overview in `citation-evidence/wiki/` are also internally consistent: the PRD's functional requirements map cleanly to the architecture's data flows and to subsystem scopes.

But the documents were authored in quick succession (all on 2026-05-24, within ~30 minutes of each other based on file timestamps) and **never reconciled against each other**, which created the issues below.
## 2. What should be improved

### 2.1 Concrete ownership ambiguities to resolve in short ADRs

| Concept | Conflict |
|---|---|
| **`Selector` types** | `citation-engine` claims it as a "key concept owned"; `evidence-anchor`'s scope lists "selector type definitions". Likely fix: *interfaces* in engine, *creation/resolution/algorithms* in anchor. |
| **`EvidenceLink` / `EvidenceSet`** | Engine claims both as owned domain types; `evidence-binder` lists "evidence-to-target binding model" and "evidence sets" in scope. Same engine-defines-type / binder-owns-behavior split needed. |
| **Status enums** | Architecture's `EvidenceItem.status` is `candidate\|confirmed\|rejected\|needs-check`. `citation-work` adds `strong-support\|weak-support\|contradicts`. `evidence-binder` adds *target-specific* states (`conflicting-evidence`, `insufficient-evidence`, `verified`) plus extra relations (`context-for`, `derived-from`, `needs-check`). Three repos inventing overlapping enums. |
| **Viewer adapters** | Architecture diagram shows them as a separate box, no owner. Adapter methods (`load`, `createSelectorsFromSelection`, `resolveSelectors`, `scrollToResolvedTarget`, `renderHighlight`) straddle `evidence-source` and `evidence-anchor`. Pick one home (likely `evidence-anchor`, with `evidence-source` providing the representation). |
| **`CitationRecoveryAttempt`** | Type in engine, behavior in `evidence-source` — semantic ownership split that will rot. |
| **Document review status (FR-006)** | No repo claims it; `citation-work` hints "may later be moved into a shared model". |

### 2.2 Repository scaffolding gaps

- The umbrella architecture (§3.1) promises `apps/workspace-demo/`, `docs/decisions/`, `integration-tests/`, `docker-compose.yml` — none of these exist yet.
- All six READMEs are essentially empty (1 line). New contributors and agents won't know where to start.
- `citation-evidence` is **not registered in the state-hub**. For a project that splits across six repos, you lose central memory of decisions/dependencies/progress without it.

### 2.3 Architectural decisions still pending

ADR-001 through ADR-005 in the architecture doc are framed as "recommendations" rather than commitments. Each blocks code:

- React-first vs web-component-first (drives repo packaging)
- Local-first vs server-first storage (drives persistence interface shape)
- W3C internal model vs mapping (drives every type definition)
- `react-pdf-highlighter-plus` vs PDF.js direct (drives MVP timeline by weeks)
- Recovery scope local-only vs external

### 2.4 Missing cross-repo contract artefacts

There is no central dependency map. Each repo says "I expect to depend on X" but nothing names which repo *publishes* the shared types package(s). Pick monorepo (pnpm workspace) vs polyrepo with published `@citation-evidence/engine` npm packages before the first commit of code lands — switching later is painful.
## 3. Technical risks to inspect first

In rough order of "if this is broken, the architecture doesn't work":

1. **PDF canonical-text stability** — the entire selector/anchor model assumes a given PDF + extraction pipeline produces *the same* canonical text each time. PDF.js text extraction has known issues with multi-column layouts, custom-glyph fonts, ligatures, soft hyphens, and reading order. Build a corpus of 15-20 representative PDFs (governmental forms, two-column papers, scanned-then-OCR'd, German umlauts) and confirm round-trip selector resolution before committing to the model.

2. **`react-pdf-highlighter-plus` abstraction leakage** — this library is opinionated; wrapping it cleanly while keeping the engine viewer-independent is the central architectural test. Do a focused spike: load PDF → select → store selectors as JSON → reload page → resolve from JSON → highlight. If this leaks PDF.js types into the engine API, the boundary fails on day one.

3. **Canonical-text normalization is a silent migration** — every stored annotation's `TextQuoteSelector` / `TextPositionSelector` depends on the *exact* normalization rules used at creation time. Treat normalization as a versioned, deterministic function from day one. If you change Unicode normalization or whitespace handling later, every stored annotation breaks silently.

4. **Visual guide overlay coupling** — `evidence-binder` owns the visual-guide *model*, but rendering needs DOM rects from three sources: the form (binder's UI?), the evidence sidebar (`citation-work`), and the document highlight (viewer adapter). Three subsystems contributing rects to one overlay is the highest-coupling part of the system. Define an explicit *rect registry* contract before any of them ships UI.

5. **CSS Custom Highlight API support** — architecture mentions it for HTML/Markdown with fallback. Browser support is uneven; the fallback (usually DOM range-based span wrapping) is what will actually run on most users' machines. Verify the fallback path is acceptable, not the optimistic primary.

6. **W3C Web Annotation mapping is not free** — JSON-LD selectors can express things your internal model can't (and vice versa). Round-tripping is a research task, not a one-day mapping. Decide whether mapping is "lossy but useful" or "MUST round-trip" before stabilizing types.

7. **Multi-repo dependency cycle risk** — engine ↔ anchor (`Selector` ownership), engine ↔ source (`RecoveryAttempt`), engine ↔ binder (`Link`/`Set`) all currently look bidirectional in the INTENT files. Without a strict "types-only flow downward, behavior flows upward" rule, you will hit `npm install` cycles.
## 4. Rough approach (original phased plan)

**Phase 0 — Foundations (1-2 weeks, no production code)**
- Register `citation-evidence` as a state-hub domain + register all six repos
- Write 5-7 micro-ADRs in `citation-evidence/docs/decisions/` resolving the ownership ambiguities above
- Pick monorepo-vs-polyrepo and pin Node/TS toolchain
- Assemble a 15-20 PDF test corpus and check it into a fixtures location
- Write a real README for each repo pointing at INTENT + architecture

**Phase 1 — Vertical slice on the easiest format (4-6 weeks)**
- Engine: TS types + in-memory repos only
- Anchor: text-quote + text-position selectors, fuzzy match deferred
- Source: PDF text extraction + fingerprint only
- Work: one-document UI, sidebar, create annotation, click-to-reopen
- Umbrella: wire it into a reference app
- Goal: prove viewer-independence on PDFs end-to-end. No forms, no recovery, no Markdown.

**Phase 2 — Evidence binding & form mode (4 weeks)**
- Binder + visual-guide rect registry
- One form-schema example with side-by-side viewer
- This is where the active-state coordination claim gets stress-tested

**Phase 3 — Format expansion (4 weeks)**
- HTML adapter (sanitization + DOM range selectors)
- Markdown adapter
- Confirms the format-neutral claim

**Phase 4 — Local citation recovery (4 weeks)**
- Local-library search, exact + fuzzy quote match, confirmation UI
- Defer external source lookup until local pipeline is reliable
## 5. Pivot — umbrella-first MVP (decided 2026-05-24)

The user has chosen to **build the MVP entirely inside `citation-evidence`** before segmenting code into the sister repos. The reasoning: get the product working end-to-end with minimal coordination cost, then extract subsystems once the contracts have been validated by actual use.

This means:

- All MVP source code lives under `citation-evidence/` (likely `src/` partitioned by future-repo names: `engine/`, `anchor/`, `source/`, `work/`, `binder/`).
- The five sister repos remain as INTENT-only placeholders during MVP — they document the intended boundaries, but code will move in only when a subsystem's contract has stabilized.
- Interface design is explicitly deferred. Phase-0 ADRs become Phase-N extractions, informed by real friction points.
- Shared contracts live in `citation-evidence/wiki/SharedContracts.md` and `citation-evidence/wiki/DependencyMap.md`.

This trade-off accepts more rework later (when subsystems extract) in exchange for faster MVP velocity now and better-informed boundaries when extraction happens.
155
wiki/DependencyMap.md
Normal file
@@ -0,0 +1,155 @@
# Dependency Map — citation-evidence

This document describes the **allowed dependency edges** between the
subsystems of the citation-evidence ecosystem. It is the cycle-prevention
contract.

It complements `SharedContracts.md` (which says *what* is shared) by saying
*who is allowed to depend on whom*.

---

## 1. The rule

> Types flow downward from `citation-engine`. Behavior flows upward into
> specialised repos. No subsystem may import another subsystem's behavior
> unless this map shows an edge.

The umbrella repo `citation-evidence` is allowed to depend on every
subsystem; nothing depends on the umbrella.

---
## 2. Allowed edges

```
                      ┌───────────────────────┐
                      │   citation-evidence   │  (umbrella)
                      └───────────┬───────────┘
                                  │ depends on
       ┌──────────────────────────┼────────────────────────────┐
       ▼                          ▼                            ▼
┌───────────────┐        ┌────────────────┐           ┌────────────────┐
│ citation-     │        │ evidence-      │           │ citation-      │
│ work          │        │ binder         │           │ engine         │
└──────┬────────┘        └────────┬───────┘           └────────┬───────┘
       │                          │                            │
       │ depends on               │ depends on                 │ depends on
       │                          │                            │ (nothing —
       ▼                          ▼                            │  leaf node)
┌────────────────┐       ┌────────────────┐                    │
│ evidence-      │       │ evidence-      │                    │
│ anchor         │       │ anchor         │                    │
└──────┬─────────┘       └────────┬───────┘                    │
       │                          │                            │
       │ depends on               │ depends on                 │
       ▼                          ▼                            ▼
┌────────────────┐       ┌────────────────┐            (citation-engine)
│ evidence-      │       │ citation-      │
│ source         │       │ engine         │
└──────┬─────────┘       └────────────────┘
       │
       │ depends on
       ▼
┌────────────────┐
│ citation-      │
│ engine         │
└────────────────┘
```
In tabular form (the table is authoritative: the left column of the diagram stacks the dependencies of `citation-work` and does not mean that `evidence-anchor` depends on `evidence-source`):

| Repo               | May depend on                                           | Must not depend on                                                         |
|--------------------|---------------------------------------------------------|----------------------------------------------------------------------------|
| `citation-engine`  | (nothing — it is the leaf)                              | every other subsystem                                                      |
| `evidence-anchor`  | `citation-engine`                                       | `evidence-source`, `citation-work`, `evidence-binder`, `citation-evidence` |
| `evidence-source`  | `citation-engine`                                       | `evidence-anchor`, `citation-work`, `evidence-binder`, `citation-evidence` |
| `evidence-binder`  | `citation-engine`, `evidence-anchor`                    | `evidence-source`, `citation-work`, `citation-evidence`                    |
| `citation-work`    | `citation-engine`, `evidence-anchor`, `evidence-source` | `evidence-binder`, `citation-evidence`                                     |
| `citation-evidence`| all five subsystems                                     | (nothing else in the ecosystem)                                            |

Notes:

- `evidence-source` does NOT depend on `evidence-anchor`. When an ingestion
  pipeline needs to know "could a selector resolve here?", the answer comes
  through events, not direct calls.
- `citation-work` does NOT depend on `evidence-binder`. Linking evidence to
  form fields is a separate workflow; the review workspace should function
  without it. A separate "evidence-backed form" application composes work +
  binder + engine.
- `evidence-binder` does NOT depend on `evidence-source`. When a binder needs
  source context, it asks `evidence-anchor` to resolve the annotation, which
  in turn knows nothing about how the document was ingested.

---
## 3. Communication channels

Direct imports are allowed only along the edges above. Where two subsystems
need to coordinate without being allowed to import each other, they use one
of these indirect channels:

| Channel                | Owner             | Notes                                                             |
|------------------------|-------------------|-------------------------------------------------------------------|
| Shared event bus       | `citation-engine` | Vocabulary frozen in `SharedContracts.md` §4                      |
| Shared types package   | `citation-engine` | Re-exported through `@citation-evidence/engine` (post-extraction) |
| Rect registry          | `evidence-binder` | Used by form UI, evidence sidebar, viewer adapter                 |
| Persistence interfaces | `citation-engine` | Concrete adapters in subsystems but interfaces in engine          |

---
## 4. During umbrella-first MVP

While all code lives in `citation-evidence/src/`, the rule is enforced by
**folder structure** and **lint rules**:

```
citation-evidence/src/
  shared/     ← what will become citation-engine (types + contracts)
  engine/     ← what will become citation-engine (services)
  anchor/     ← what will become evidence-anchor
  source/     ← what will become evidence-source
  work/       ← what will become citation-work (UI)
  binder/     ← what will become evidence-binder
  app/        ← the umbrella reference app
```

Lint rule (to be added in CE-WP-0001):

- `engine/` may import only from `shared/`.
- `anchor/` may import only from `shared/`, `engine/`.
- `source/` may import only from `shared/`, `engine/`.
- `binder/` may import only from `shared/`, `engine/`, `anchor/`.
- `work/` may import only from `shared/`, `engine/`, `anchor/`, `source/`.
- `app/` may import from any.

Violating these rules in MVP is a lint error, not a runtime error. When
subsystems extract into their own repos, the lint rule disappears and the
package boundary enforces the same constraint.
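
The import rules listed above can be written down as data, which is roughly what a lint rule would consult. The following is a minimal sketch and not the CE-WP-0001 lint rule itself: the names `Folder`, `ALLOWED_IMPORTS`, `isImportAllowed`, and `findViolations` are hypothetical, and it assumes that `shared/` imports from no other folder and that imports within the same folder are always allowed (the list does not say either way).

```typescript
// Folder names taken from the src/ layout above.
type Folder = "shared" | "engine" | "anchor" | "source" | "work" | "binder" | "app";

// Allowed import targets per folder, mirroring the lint-rule list.
// Assumption: `shared` imports from no other folder.
const ALLOWED_IMPORTS: Record<Folder, readonly Folder[]> = {
  shared: [],
  engine: ["shared"],
  anchor: ["shared", "engine"],
  source: ["shared", "engine"],
  binder: ["shared", "engine", "anchor"],
  work: ["shared", "engine", "anchor", "source"],
  app: ["shared", "engine", "anchor", "source", "work", "binder"],
};

// Returns true when code in `from` may import code in `to`.
// Assumption: imports within the same folder are always allowed.
function isImportAllowed(from: Folder, to: Folder): boolean {
  return from === to || ALLOWED_IMPORTS[from].indexOf(to) !== -1;
}

// Returns every forbidden edge in a list of observed [importer, imported] pairs.
function findViolations(
  edges: ReadonlyArray<[Folder, Folder]>,
): Array<[Folder, Folder]> {
  return edges.filter(([from, to]) => !isImportAllowed(from, to));
}
```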

---

## 5. Why these rules

1. **`citation-engine` as the leaf** prevents the most common monorepo pathology:
   the "core" repo accumulating UI/IO dependencies because it was easier than
   inverting a dependency.
2. **`citation-work` does not depend on `evidence-binder`**, which keeps the
   review workspace usable even when there is no form context (e.g. just
   collecting evidence for a report).
3. **`evidence-binder` does not depend on `evidence-source`**, which keeps
   binding logic from accidentally caring about ingestion details.
4. **No subsystem depends on `citation-evidence`** — the umbrella is a
   composition point, not a library.

---

## 6. Change process

Adding an edge to this map is a change to the contract.

- New edges require a short ADR in `docs/decisions/`.
- Removing an edge requires a refactoring plan (where do consumers go?).
- The MVP itself is an exception: edges that turn out to be wrong during
  umbrella-first development are recorded as "deferred reshape" items in the
  relevant workplan, not as ADRs.
296
wiki/SharedContracts.md
Normal file
@@ -0,0 +1,296 @@
# Shared Contracts — citation-evidence

This document is the **single source of truth** for everything that more than one
subsystem in the citation-evidence ecosystem must agree on:

- the **vocabulary** (entity names and what they mean),
- the **canonical state enums** for entities that flow across repo boundaries,
- the **relation type** vocabulary,
- the **selector type** taxonomy,
- the **event type** vocabulary,
- the **viewer adapter** contract,
- the **canonical text normalization** rules,
- the **visual guide rect registry** contract,
- the **ownership rules** for shared types versus shared behavior.

The five sister repos (`citation-engine`, `evidence-anchor`, `evidence-source`,
`citation-work`, `evidence-binder`) defer to this document. When their
`INTENT.md` files refer to "shared contracts", they mean this file.

During the umbrella-first MVP phase, the **TypeScript implementations** of
these contracts live in `citation-evidence/src/shared/` and are imported by
the per-subsystem code under `citation-evidence/src/{engine,anchor,source,work,binder}/`.
When a subsystem extracts to its own repo, it takes its slice of the shared
types with it — but this document remains the canonical vocabulary.

---
## 1. Vocabulary

These nine entities are the vocabulary every subsystem uses.

| Entity                    | One-line definition                                                                               | Owner (post-extraction)                                  |
|---------------------------|---------------------------------------------------------------------------------------------------|----------------------------------------------------------|
| `Document`                | An identified source object: PDF, Markdown, HTML, scan, etc.                                      | `citation-engine`                                        |
| `DocumentRepresentation`  | A normalized, addressable view of a document (canonical text, page map, structure).               | `citation-engine`                                        |
| `Selector`                | A technical locator for a passage inside a representation.                                        | `citation-engine` (types) / `evidence-anchor` (behavior) |
| `Annotation`              | A technical mark on a document range, expressed as one or more selectors plus quote text.         | `citation-engine`                                        |
| `EvidenceItem`            | A meaningful evidence object built from one or more annotations, with commentary and status.      | `citation-engine`                                        |
| `EvidenceSet`             | An ordered group of evidence items associated with a target or topic.                             | `citation-engine` (type) / `evidence-binder` (behavior)  |
| `EvidenceLink`            | A relation between an `EvidenceItem` and a structured target (form field, claim, requirement, …). | `citation-engine` (type) / `evidence-binder` (behavior)  |
| `CitationCard`            | A renderable, exportable presentation of an evidence item.                                        | `citation-engine`                                        |
| `CitationRecoveryAttempt` | A traceable attempt to locate a cited passage from an external clue.                              | `citation-engine` (type) / `evidence-source` (behavior)  |

**Ownership rule:** *types and interfaces flow downward from `citation-engine`;
behavior flows upward into the specialised repos*. Where the table shows a
split, the engine repo holds the data shape and the other repo holds the
algorithms and lifecycle.

---
## 2. Canonical state enums

These enums are the authoritative values. Subsystems must not invent local
variants without updating this document first.

### 2.1 `Annotation.resolutionStatus`

```
resolved     — selectors located the passage with high confidence
ambiguous    — multiple plausible candidates found
unresolved   — no plausible candidate found
stale        — representation has changed since selectors were stored
```

### 2.2 `EvidenceItem.status`

```
candidate    — captured but not yet vetted
confirmed    — verified by a user as useful evidence
rejected     — explicitly discarded
needs-check  — flagged for review
```

> **Note:** earlier subsystem drafts introduced `strong-support`, `weak-support`,
> and `contradicts` on the item. Those concepts now live on the **link**, not
> the item — see §2.4.

### 2.3 `Document.reviewStatus` (when used by `citation-work`)

```
unreviewed
in-review
relevant
rejected
needs-follow-up
cited
verified
```

`citation-work` may treat any of these as the active state; the canonical
storage lives on the Document record in `citation-engine`.

### 2.4 `EvidenceLink.status` (per target)

```
no-evidence
candidate
confirmed
conflicting
insufficient
verified
```

`no-evidence` is a *derived* state computed when a target has zero links;
it is not stored on a link itself.

### 2.5 `EvidenceLink.relation`

```
supports
contradicts
explains
qualifies
source-for
context-for
```

This is the closed vocabulary for the MVP. Adding a relation requires updating
this document and the `EvidenceLink` schema together.
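
A closed vocabulary like this can be kept in one place so that the type and the runtime check cannot drift apart. The sketch below is illustrative only: `EVIDENCE_LINK_RELATIONS`, `StoredLinkStatus`, `isEvidenceLinkRelation`, and `describeTarget` are hypothetical names, not part of the `EvidenceLink` schema. It also shows the §2.4 rule that `no-evidence` is derived from a target having zero links. How several link statuses combine into one target status is not specified here, so the non-empty case simply returns the stored statuses.

```typescript
// Closed MVP vocabulary from §2.5; `as const` keeps the literal types.
const EVIDENCE_LINK_RELATIONS = [
  "supports",
  "contradicts",
  "explains",
  "qualifies",
  "source-for",
  "context-for",
] as const;

type EvidenceLinkRelation = (typeof EVIDENCE_LINK_RELATIONS)[number];

// Statuses that may be stored on a link (§2.4). "no-evidence" is excluded
// because it is derived, never stored.
type StoredLinkStatus =
  | "candidate"
  | "confirmed"
  | "conflicting"
  | "insufficient"
  | "verified";

// Runtime guard for untrusted input, for example persisted JSON.
function isEvidenceLinkRelation(value: unknown): value is EvidenceLinkRelation {
  return (
    typeof value === "string" &&
    (EVIDENCE_LINK_RELATIONS as readonly string[]).indexOf(value) !== -1
  );
}

// A target with zero links is reported as "no-evidence"; otherwise the
// stored statuses are returned unchanged (aggregation is left open).
function describeTarget(
  links: ReadonlyArray<{ status: StoredLinkStatus }>,
): "no-evidence" | StoredLinkStatus[] {
  return links.length === 0 ? "no-evidence" : links.map((link) => link.status);
}
```

Deriving the union type from the array means that adding a relation is a one-line change, which still has to be made together with this document.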

### 2.6 `CitationRecoveryAttempt.state`

```
created
source-found-fulltext
source-found-preview-only
source-found-metadata-only
source-not-found
quote-found
quote-not-found
candidate-passages-found
manual-confirmation-needed
confirmed
annotation-created
failed
```

---
## 3. Selector taxonomy

A `Selector` is a discriminated union of:

```
TextQuoteSelector       exact quote + prefix/suffix context
TextPositionSelector    canonical text start/end offsets
PdfRectSelector         page number + normalized page rectangles
PdfPageTextSelector     page number + page-local text offsets
DomRangeSelector        DOM path + range offsets (HTML/Markdown)
StructuralSelector      heading/section/AST path
FragmentSelector        exported fragment / deep link (export-only)
```

**Selector redundancy rule:** when an annotation is created, the system stores
*all selector types that are available* for that document representation, not
just one. Resolution tries them in order of expected confidence and stops at
the first high-confidence match.
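
The resolution half of the redundancy rule can be sketched as a loop over the stored selectors. This is a minimal illustration under stated assumptions, not the `evidence-anchor` algorithm: the ordering in `RESOLUTION_ORDER`, the `0.9` threshold, the numeric confidence scale, and the names `Attempt`, `resolveFirstConfident`, and `tryResolve` are all hypothetical, and selectors are reduced to their discriminant for brevity.

```typescript
// Selector kinds from the taxonomy above, reduced to a discriminant.
type SelectorType =
  | "TextQuoteSelector"
  | "TextPositionSelector"
  | "PdfRectSelector"
  | "PdfPageTextSelector"
  | "DomRangeSelector"
  | "StructuralSelector"
  | "FragmentSelector";

interface Selector {
  type: SelectorType;
}

// Hypothetical result of trying one selector; confidence is in [0, 1].
interface Attempt {
  selector: Selector;
  confidence: number;
}

// Assumed ordering by expected confidence; the document does not fix it.
// FragmentSelector is export-only, so it is left out of resolution.
const RESOLUTION_ORDER: readonly SelectorType[] = [
  "TextQuoteSelector",
  "TextPositionSelector",
  "PdfPageTextSelector",
  "PdfRectSelector",
  "DomRangeSelector",
  "StructuralSelector",
];

// Tries selectors in RESOLUTION_ORDER and stops at the first attempt whose
// confidence reaches `threshold`. Returns null when none qualifies.
function resolveFirstConfident(
  selectors: readonly Selector[],
  tryResolve: (selector: Selector) => number, // hypothetical per-selector resolver
  threshold = 0.9, // assumed cut-off for "high confidence"
): Attempt | null {
  const ordered = selectors
    .filter((s) => RESOLUTION_ORDER.indexOf(s.type) !== -1)
    .sort((a, b) => RESOLUTION_ORDER.indexOf(a.type) - RESOLUTION_ORDER.indexOf(b.type));
  for (const selector of ordered) {
    const confidence = tryResolve(selector);
    if (confidence >= threshold) return { selector, confidence };
  }
  return null;
}
```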

W3C Web Annotation mapping uses these same concepts but as JSON-LD; the mapping
is documented separately (see ADR-0003 — pending).

---
## 4. Event vocabulary

Events are the primary integration mechanism between subsystems. The closed
event vocabulary for the MVP is:

```
DocumentImported
DocumentRepresentationGenerated
AnnotationCreated
AnnotationResolved
AnnotationResolutionFailed
EvidenceItemCreated
EvidenceItemUpdated
EvidenceLinkCreated
EvidenceLinkUpdated
EvidenceItemActivated
FormFieldActivated
CitationCardRendered
CitationRecoveryStarted
CitationRecoveryCandidateFound
CitationRecoveryConfirmed
```

Subsystems must emit these events through a shared event bus owned by
`citation-engine`. Subsystems may listen to any event but must not invent
event types without updating this document.
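
One way to make "must not invent event types" checkable is to type the bus against the closed vocabulary, so an unknown event name fails to compile. The sketch below is an assumption-laden illustration, not the `citation-engine` bus: `BusEvent`, `Listener`, and `EventBus` are hypothetical names, payload shapes are left as `unknown` because this document does not define them, and delivery is synchronous for simplicity.

```typescript
// The closed MVP vocabulary from the list above.
type EventType =
  | "DocumentImported"
  | "DocumentRepresentationGenerated"
  | "AnnotationCreated"
  | "AnnotationResolved"
  | "AnnotationResolutionFailed"
  | "EvidenceItemCreated"
  | "EvidenceItemUpdated"
  | "EvidenceLinkCreated"
  | "EvidenceLinkUpdated"
  | "EvidenceItemActivated"
  | "FormFieldActivated"
  | "CitationCardRendered"
  | "CitationRecoveryStarted"
  | "CitationRecoveryCandidateFound"
  | "CitationRecoveryConfirmed";

// Hypothetical envelope; payload shapes are not defined in this document.
interface BusEvent {
  type: EventType;
  payload: unknown;
}

type Listener = (event: BusEvent) => void;

class EventBus {
  private listeners = new Map<EventType, Set<Listener>>();

  // Subscribes to one event type and returns an unsubscribe function.
  on(type: EventType, listener: Listener): () => void {
    const set = this.listeners.get(type) ?? new Set<Listener>();
    set.add(listener);
    this.listeners.set(type, set);
    return () => {
      set.delete(listener);
    };
  }

  // Delivers the event synchronously to every listener of its type.
  emit(event: BusEvent): void {
    this.listeners.get(event.type)?.forEach((listener) => listener(event));
  }
}
```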

---

## 5. Viewer adapter contract

Viewer adapters are the bridge between a document format and the rest of the
system. They are **owned by `evidence-anchor`** as far as the contract goes;
concrete adapters may live in either `evidence-anchor` or `evidence-source`
depending on whether the heavy lifting is selector logic or document
representation logic.

```ts
interface DocumentViewerAdapter {
  mediaTypes: string[];
  load(document: Document, representation?: DocumentRepresentation): Promise<void>;
  getCurrentSelection(): Promise<SelectionCapture | null>;
  createSelectorsFromSelection(selection: SelectionCapture): Promise<Selector[]>;
  resolveSelectors(selectors: Selector[]): Promise<AnchorResolution>;
  scrollToResolvedTarget(target: ResolvedAnchorTarget, opts?: { center?: boolean; behavior?: "auto"|"smooth" }): Promise<void>;
  renderHighlight(target: ResolvedAnchorTarget, opts?: HighlightRenderOptions): Promise<void>;
  getHighlightClientRects(annotationId: string): Promise<DOMRect[]>;
}
```

MVP delivers a single `PDFViewerAdapter`. HTML and Markdown adapters are
deferred.

---
## 6. Canonical text normalization
|
||||||
|
|
||||||
|
All text-based selectors and quote matching depend on a deterministic
|
||||||
|
normalization function. The MVP normalization is:
|
||||||
|
|
||||||
|
1. Unicode NFC normalization.
|
||||||
|
2. Replace all line-ending sequences with `\n`.
|
||||||
|
3. Collapse runs of horizontal whitespace into a single space.
|
||||||
|
4. Strip soft hyphens (U+00AD).
|
||||||
|
5. Preserve paragraph boundaries (double `\n`).
|
||||||
|
|
||||||
|
**This function is versioned.** Stored selectors record the normalization
|
||||||
|
version they were created against. Changing the function later requires either
|
||||||
|
backwards-compatible behavior or a re-anchoring migration.
|
||||||
|
|
||||||
|
The reference implementation lives in `citation-evidence/src/shared/text/normalize.ts`.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 7. Visual guide rect registry
|
||||||
|
|
||||||
|
The visual-guide overlay (form field → evidence card → source highlight)
|
||||||
|
requires DOM rects from three independently-rendered subsystems. The contract
|
||||||
|
is a **rect registry** owned by `evidence-binder`:
|
||||||
|
|
||||||
|
```ts
interface RectRegistry {
  register(kind: "field" | "evidence-card" | "highlight", id: string, getRect: () => DOMRect | null): () => void;
  getRect(kind: "field" | "evidence-card" | "highlight", id: string): DOMRect | null;
  subscribe(listener: (event: RectRegistryEvent) => void): () => void;
}
```
|
||||||
|
|
||||||
|
Each renderer (form, evidence sidebar, viewer adapter) registers a
|
||||||
|
`getRect` callback. The overlay queries on-demand and re-renders on scroll,
|
||||||
|
resize, focus, and active-evidence change.
|
||||||
|
|
||||||
|
This contract MUST be defined and stable before any of the three renderers
|
||||||
|
hardens, or the overlay becomes the system's coupling bottleneck.
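A minimal in-memory sketch of the registry contract above may help make the lazy `getRect` callback idea concrete. It is an illustration only: `RectLike` stands in for `DOMRect` so the snippet runs outside a browser, and the shape of `RectRegistryEvent` (a `type` of `registered` or `unregistered`) is an assumption, since the contract does not define it here.

```ts
type RectKind = "field" | "evidence-card" | "highlight";

// Assumption: stand-in for DOMRect so the sketch has no DOM dependency.
interface RectLike { x: number; y: number; width: number; height: number }

// Assumption: the contract names RectRegistryEvent but does not spell out its fields.
interface RectRegistryEvent { type: "registered" | "unregistered"; kind: RectKind; id: string }

class InMemoryRectRegistry {
  private getters = new Map<string, () => RectLike | null>();
  private listeners = new Set<(event: RectRegistryEvent) => void>();

  private key(kind: RectKind, id: string): string {
    return `${kind}:${id}`;
  }

  // Stores the callback, not a rect, so every query sees the current layout.
  register(kind: RectKind, id: string, getRect: () => RectLike | null): () => void {
    const k = this.key(kind, id);
    this.getters.set(k, getRect);
    this.emit({ type: "registered", kind, id });
    return () => {
      // Only remove the entry if it has not been replaced by a later registration.
      if (this.getters.get(k) === getRect) {
        this.getters.delete(k);
        this.emit({ type: "unregistered", kind, id });
      }
    };
  }

  getRect(kind: RectKind, id: string): RectLike | null {
    const getter = this.getters.get(this.key(kind, id));
    return getter ? getter() : null;
  }

  subscribe(listener: (event: RectRegistryEvent) => void): () => void {
    this.listeners.add(listener);
    return () => { this.listeners.delete(listener); };
  }

  private emit(event: RectRegistryEvent): void {
    // Copy first so listeners may unsubscribe while being notified.
    Array.from(this.listeners).forEach((listener) => listener(event));
  }
}
```

Because renderers hand over a callback rather than a measured rect, the overlay can re-query on scroll or resize without the renderers having to push updates.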
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 8. Ownership rules (the short version)
|
||||||
|
|
||||||
|
1. **Types and interfaces** flow downward from `citation-engine`.
|
||||||
|
2. **Behavior and algorithms** live in the specialised repos.
|
||||||
|
3. Where a concept appears in both a type and a behavior context (e.g.
|
||||||
|
`Selector`, `EvidenceLink`, `EvidenceSet`, `CitationRecoveryAttempt`),
|
||||||
|
the engine owns the shape and the specialised repo owns the lifecycle.
|
||||||
|
4. **The shared event bus is engine-owned**; subsystems publish and subscribe
|
||||||
|
but do not extend the event vocabulary unilaterally.
|
||||||
|
5. **No new enum values, relation types, event types, or selector kinds**
|
||||||
|
land in code without first appearing in this document.
|
||||||
|
6. During umbrella-first MVP: rules 1-5 are aspirational. We will tolerate
|
||||||
|
small violations in `citation-evidence/src/` and reconcile during extraction.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 9. Change process
|
||||||
|
|
||||||
|
Changes to this document are changes to the contract.
|
||||||
|
|
||||||
|
- Small additions (a new enum value, a new event type) can be made in a single
|
||||||
|
PR that updates this doc + the type definitions + at least one consumer.
|
||||||
|
- Breaking changes (renaming an entity, removing a state, changing an
|
||||||
|
ownership split) require a short ADR in `docs/decisions/` and a heads-up
|
||||||
|
progress event on the state-hub.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 10. Pending ADRs that will affect this document
|
||||||
|
|
||||||
|
These are listed in `docs/decisions/` once written. Until then the document
|
||||||
|
reflects the current best understanding from the architecture overview.
|
||||||
|
|
||||||
|
- **ADR-0001** — Umbrella-first MVP strategy (decided 2026-05-24, this session).
|
||||||
|
- **ADR-0002** — Monorepo vs polyrepo packaging (pending).
|
||||||
|
- **ADR-0003** — W3C Web Annotation: lossy mapping vs round-trip guarantee (pending).
|
||||||
|
- **ADR-0004** — PDF viewer library choice: `react-pdf-highlighter-plus` vs PDF.js direct (pending).
|
||||||
|
- **ADR-0005** — Persistence: local-first SQLite vs Postgres from day one (pending).
|
||||||
|
- **ADR-0006** — Selector ownership split (types in engine, algorithms in anchor) (pending — implied here).
|
||||||
246
workplans/CE-WP-0001-foundations.md
Normal file
@@ -0,0 +1,246 @@
|
|||||||
|
---
|
||||||
|
id: CE-WP-0001
|
||||||
|
type: workplan
|
||||||
|
title: "Foundations — TS scaffold, folder layout, lint boundaries, normalization, fixtures"
|
||||||
|
domain: citation_evidence
|
||||||
|
repo: citation-evidence
|
||||||
|
repo_id: a677c189-b4e2-4f2a-9e48-faa482c277e6
|
||||||
|
status: todo
|
||||||
|
owner: Bernd
|
||||||
|
created: 2026-05-24
|
||||||
|
updated: 2026-05-24
|
||||||
|
spec_refs:
|
||||||
|
- wiki/ProductRequirementsDocument.md
|
||||||
|
- wiki/ArchitectureOverview.md
|
||||||
|
- wiki/SharedContracts.md
|
||||||
|
- wiki/DependencyMap.md
|
||||||
|
---
|
||||||
|
|
||||||
|
# CE-WP-0001 — Foundations
|
||||||
|
|
||||||
|
Establish the skeleton of the umbrella-first MVP: a TypeScript project with
|
||||||
|
a folder layout that mirrors the future subsystem split (so that extracting
|
||||||
|
to sister repos later is a `git mv` plus a `package.json` cut), lint rules
|
||||||
|
that enforce the dependency map at the folder level, the versioned
|
||||||
|
canonical-text normalization function, and a small but representative PDF
|
||||||
|
fixtures corpus.
|
||||||
|
|
||||||
|
No product features yet. This workplan exists so that everything from
|
||||||
|
`CE-WP-0002` onward has somewhere to land.
|
||||||
|
|
||||||
|
## Decisions captured here
|
||||||
|
|
||||||
|
Each task below corresponds to a Phase-0 ADR. The ADR lives at
|
||||||
|
`docs/decisions/ADR-NNNN-<slug>.md`. If a task involves a choice that wasn't
|
||||||
|
already decided, the agent stops and asks Bernd before writing code.
|
||||||
|
|
||||||
|
## Dependency Order
|
||||||
|
|
||||||
|
```
T01 (toolchain decision + package.json)
└─ T02 (folder layout per DependencyMap §4)
└─ T03 (lint rules enforcing dep edges)
└─ T04 (canonical text normalization v1, versioned)
└─ T05 (fixtures: 5+ representative PDFs + a manifest)
└─ T06 (README upgrade + dev workflow doc)
└─ T07 (write the six pending ADRs as stubs)
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T01 — Toolchain + package.json + tsconfig
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0001-T01
|
||||||
|
priority: critical
|
||||||
|
status: todo
|
||||||
|
```
|
||||||
|
|
||||||
|
Decide the TS toolchain (vite vs tsc-only vs Next.js) and write a single
|
||||||
|
`package.json` at the repo root. Decisions to lock in this task as an ADR
|
||||||
|
(`docs/decisions/ADR-0001-toolchain.md`):
|
||||||
|
|
||||||
|
- Bundler: vite (recommended, fastest dev loop for a React MVP)
|
||||||
|
- Package manager: pnpm (recommended, plays well with future workspace split)
|
||||||
|
- React 18+
|
||||||
|
- Strict TS
|
||||||
|
|
||||||
|
Deliverables:
|
||||||
|
- `package.json` with `dev`, `build`, `test`, `lint`, `typecheck` scripts
|
||||||
|
- `tsconfig.json` with strict mode, paths for the `src/` partitions
|
||||||
|
- `.nvmrc` pinning Node version
|
||||||
|
- `docs/decisions/ADR-0001-toolchain.md` written and committed
|
||||||
|
|
||||||
|
Do not install application dependencies yet — just the toolchain.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T02 — Folder layout matching DependencyMap §4
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0001-T02
|
||||||
|
priority: critical
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01]
|
||||||
|
```
|
||||||
|
|
||||||
|
Create the source folder layout:
|
||||||
|
|
||||||
|
```
src/
  shared/   # will become @citation-evidence/engine (types + contracts)
  engine/   # will become @citation-evidence/engine (services)
  anchor/   # will become @citation-evidence/anchor
  source/   # will become @citation-evidence/source
  work/     # will become @citation-evidence/work (UI)
  binder/   # will become @citation-evidence/binder
  app/      # the reference workspace shell
```
|
||||||
|
|
||||||
|
Each folder gets:
|
||||||
|
- A one-line `README.md` stating its future home
|
||||||
|
- An `index.ts` that re-exports its public API (empty for now)
|
||||||
|
|
||||||
|
Add path aliases in `tsconfig.json`: `@shared/*`, `@engine/*`, etc.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T03 — Lint rules enforcing dependency edges
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0001-T03
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T02]
|
||||||
|
```
|
||||||
|
|
||||||
|
Install `eslint-plugin-boundaries` (or equivalent) and configure rules per
|
||||||
|
`wiki/DependencyMap.md` §4:
|
||||||
|
|
||||||
|
| Folder       | May import from                                  |
|--------------|--------------------------------------------------|
| `shared/`    | (nothing internal)                               |
| `engine/`    | `shared/`                                        |
| `anchor/`    | `shared/`, `engine/`                             |
| `source/`    | `shared/`, `engine/`                             |
| `binder/`    | `shared/`, `engine/`, `anchor/`                  |
| `work/`      | `shared/`, `engine/`, `anchor/`, `source/`       |
| `app/`       | any                                              |
|
||||||
|
|
||||||
|
Add a failing test fixture that imports `source/` from `binder/` and confirm
|
||||||
|
lint catches it; remove the fixture afterward.
|
||||||
|
|
||||||
|
`pnpm lint` must pass on a clean tree.
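To illustrate what the lint rule has to encode, the allowed edges from the table can be written down as plain data. This is a sketch, not the `eslint-plugin-boundaries` configuration itself; `ALLOWED_IMPORTS` and `isImportAllowed` are hypothetical names, and treating imports within one partition as always allowed is an assumption.

```ts
// Partition names mirror the `src/` folders in T02.
type Partition = "shared" | "engine" | "anchor" | "source" | "binder" | "work" | "app";

// Copied row by row from the table; `app` may import from any partition.
const ALLOWED_IMPORTS: Record<Partition, readonly Partition[]> = {
  shared: [],
  engine: ["shared"],
  anchor: ["shared", "engine"],
  source: ["shared", "engine"],
  binder: ["shared", "engine", "anchor"],
  work: ["shared", "engine", "anchor", "source"],
  app: ["shared", "engine", "anchor", "source", "binder", "work"],
};

// Hypothetical helper: true when a file in `from` may import a file in `to`.
// Assumption: imports inside the same partition are always fine.
function isImportAllowed(from: Partition, to: Partition): boolean {
  return from === to || ALLOWED_IMPORTS[from].indexOf(to) !== -1;
}
```

The failing fixture described in this task corresponds to `isImportAllowed("binder", "source")`, which is `false` under this map.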
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T04 — Canonical text normalization v1
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0001-T04
|
||||||
|
priority: critical
|
||||||
|
status: todo
|
||||||
|
depends_on: [T02]
|
||||||
|
```
|
||||||
|
|
||||||
|
Implement `src/shared/text/normalize.ts` per `wiki/SharedContracts.md` §6:
|
||||||
|
|
||||||
|
1. Unicode NFC
|
||||||
|
2. Normalize line endings to `\n`
|
||||||
|
3. Collapse horizontal whitespace runs to a single space
|
||||||
|
4. Strip soft hyphens (U+00AD)
|
||||||
|
5. Preserve paragraph boundaries (`\n\n`)
|
||||||
|
|
||||||
|
Public API:
|
||||||
|
|
||||||
|
```ts
export const NORMALIZE_VERSION = 1;
export function normalize(input: string): { text: string; version: number };
```
|
||||||
|
|
||||||
|
Include unit tests covering: ligatures, CRLF input, soft-hyphenated German,
|
||||||
|
mixed whitespace, paragraph preservation.
|
||||||
|
|
||||||
|
Stored selectors will record this version number so that future normalization
|
||||||
|
changes can be detected as a migration concern.
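The five steps could be sketched roughly as follows. This is one possible reading, not the reference implementation: it assumes "line-ending sequences" means CRLF and lone CR, that "horizontal whitespace" means spaces and tabs only, and that soft hyphens are stripped before whitespace is collapsed. Each of those choices would need to be confirmed before the version number is frozen.

```ts
export const NORMALIZE_VERSION = 1;

export function normalize(input: string): { text: string; version: number } {
  const text = input
    // 1. Unicode NFC (composes e.g. "e" + U+0301 into U+00E9).
    .normalize("NFC")
    // 2. Line endings: CRLF and lone CR become "\n" (assumed set of sequences).
    .replace(/\r\n?/g, "\n")
    // 4. Strip soft hyphens first, so whitespace on both sides can merge in step 3.
    .replace(/\u00AD/g, "")
    // 3. Collapse runs of spaces and tabs (assumed definition of horizontal whitespace).
    .replace(/[ \t]+/g, " ");
  // 5. Paragraph boundaries ("\n\n") survive because newlines are never collapsed.
  return { text, version: NORMALIZE_VERSION };
}
```

One point worth checking against the planned ligature tests: NFC leaves compatibility ligatures such as U+FB01 untouched (only NFKC decomposes them), so under this sketch a ligature in the PDF text stays a ligature in the canonical text.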
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T05 — PDF fixtures corpus + manifest
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0001-T05
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01]
|
||||||
|
```
|
||||||
|
|
||||||
|
Assemble `fixtures/pdfs/` with at least 5 representative PDFs:
|
||||||
|
|
||||||
|
- A simple single-column text PDF
|
||||||
|
- A two-column academic PDF (e.g. ACM-style)
|
||||||
|
- A German PDF with umlauts and soft hyphens
|
||||||
|
- A form PDF (e.g. a public-sector application form)
|
||||||
|
- A PDF with a heading hierarchy
|
||||||
|
|
||||||
|
Write `fixtures/pdfs/manifest.json` recording for each:
|
||||||
|
- filename
|
||||||
|
- short description
|
||||||
|
- expected page count
|
||||||
|
- one short "known-good quote" with the page number it appears on (used by
|
||||||
|
CE-WP-0002 selector tests)
|
||||||
|
|
||||||
|
Keep each PDF small (< 1 MB) and check sources/licenses into
|
||||||
|
`fixtures/pdfs/SOURCES.md`. Public-domain or Bernd-authored only.
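One way to keep the manifest honest is a typed entry plus a small sanity check. The field names below (`expectedPageCount`, `knownGoodQuote`, and so on) are hypothetical, as the workplan lists what to record but not the JSON keys, and 1-based page numbers are an assumption.

```ts
// Hypothetical shape for one entry in fixtures/pdfs/manifest.json.
interface FixtureManifestEntry {
  filename: string;
  description: string;
  expectedPageCount: number;
  knownGoodQuote: { text: string; page: number }; // page assumed to be 1-based
}

// Returns a list of human-readable problems; an empty list means the manifest looks sane.
function validateManifest(entries: FixtureManifestEntry[]): string[] {
  const problems: string[] = [];
  if (entries.length < 5) {
    problems.push(`expected at least 5 fixtures, found ${entries.length}`);
  }
  for (const entry of entries) {
    if (!/\.pdf$/i.test(entry.filename)) {
      problems.push(`${entry.filename}: not a .pdf file`);
    }
    if (Math.floor(entry.expectedPageCount) !== entry.expectedPageCount || entry.expectedPageCount < 1) {
      problems.push(`${entry.filename}: expectedPageCount must be a positive integer`);
    }
    const { text, page } = entry.knownGoodQuote;
    if (text.trim() === "") {
      problems.push(`${entry.filename}: known-good quote is empty`);
    }
    if (page < 1 || page > entry.expectedPageCount) {
      problems.push(`${entry.filename}: quote page ${page} is outside 1..${entry.expectedPageCount}`);
    }
  }
  return problems;
}
```

A check like this could run as an ordinary unit test, so a malformed manifest fails fast instead of surfacing later as a confusing selector-test failure.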
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T06 — README upgrade + dev workflow doc
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0001-T06
|
||||||
|
priority: medium
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01, T02]
|
||||||
|
```
|
||||||
|
|
||||||
|
Replace the one-line `README.md` with a real one:
|
||||||
|
|
||||||
|
- What citation-evidence is (one paragraph from INTENT)
|
||||||
|
- Repository layout (point at `src/` partitions and what each becomes)
|
||||||
|
- Where to find docs (`wiki/`, `docs/decisions/`, `history/`, `workplans/`)
|
||||||
|
- Dev workflow: `pnpm install`, `pnpm dev`, `pnpm test`, `pnpm lint`
|
||||||
|
- Pointer to `~/ralph-workplan/` for how workplans are driven
|
||||||
|
|
||||||
|
Add a one-paragraph `README.md` in each of the five sister repos pointing
|
||||||
|
back at this umbrella + reminding readers that code lives upstream during
|
||||||
|
the MVP phase.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T07 — Stub the six pending ADRs
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0001-T07
|
||||||
|
priority: medium
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01]
|
||||||
|
```
|
||||||
|
|
||||||
|
Create stub files in `docs/decisions/` for each ADR mentioned in
|
||||||
|
`wiki/SharedContracts.md` §10:
|
||||||
|
|
||||||
|
- `ADR-0001-toolchain.md` (filled in by T01)
|
||||||
|
- `ADR-0002-monorepo-vs-polyrepo.md`
|
||||||
|
- `ADR-0003-w3c-mapping-scope.md`
|
||||||
|
- `ADR-0004-pdf-viewer-library.md`
|
||||||
|
- `ADR-0005-persistence.md`
|
||||||
|
- `ADR-0006-selector-ownership-split.md`
|
||||||
|
|
||||||
|
Each stub: title, status (`proposed` for 2-6), context (one paragraph
|
||||||
|
explaining what the decision is about and why it matters), options (bullet
|
||||||
|
list with pros/cons), decision (blank), consequences (blank).
|
||||||
|
|
||||||
|
These are not decisions yet — they are *the questions that must be answered
|
||||||
|
before the relevant code lands*. The MVP can proceed without 2-6 being
|
||||||
|
resolved because no extraction or persistence happens until later workplans.
|
||||||
283
workplans/CE-WP-0002-pdf-review-slice.md
Normal file
@@ -0,0 +1,283 @@
|
|||||||
|
---
|
||||||
|
id: CE-WP-0002
|
||||||
|
type: workplan
|
||||||
|
title: "PDF review slice — engine types, anchor, source, viewer, sidebar, click-to-reopen"
|
||||||
|
domain: citation_evidence
|
||||||
|
repo: citation-evidence
|
||||||
|
repo_id: a677c189-b4e2-4f2a-9e48-faa482c277e6
|
||||||
|
status: todo
|
||||||
|
owner: Bernd
|
||||||
|
created: 2026-05-24
|
||||||
|
updated: 2026-05-24
|
||||||
|
depends_on_workplan: CE-WP-0001
|
||||||
|
spec_refs:
|
||||||
|
- wiki/ProductRequirementsDocument.md
|
||||||
|
- wiki/ArchitectureOverview.md
|
||||||
|
- wiki/SharedContracts.md
|
||||||
|
---
|
||||||
|
|
||||||
|
# CE-WP-0002 — PDF Review Slice
|
||||||
|
|
||||||
|
The first vertical product slice. After this workplan, a user can:
|
||||||
|
|
||||||
|
1. Open the app, see a collection of fixture PDFs.
|
||||||
|
2. Open one PDF in a viewer.
|
||||||
|
3. Select text, add a one-line comment, save as an evidence item.
|
||||||
|
4. See the evidence item appear in a sidebar.
|
||||||
|
5. Click the evidence item and have the PDF jump to and highlight the
|
||||||
|
passage — even after a full page reload.
|
||||||
|
|
||||||
|
No forms, no Markdown/HTML, no recovery, no export. Those come later.
|
||||||
|
|
||||||
|
This workplan exercises the riskiest architectural assumption (PDF selector
|
||||||
|
round-trip with viewer independence) on the simplest possible feature set.
|
||||||
|
|
||||||
|
## Risk-driven order
|
||||||
|
|
||||||
|
T01 and T02 are the spike from the assessment: prove the
|
||||||
|
`react-pdf-highlighter-plus` integration can store and reload selectors
|
||||||
|
without leaking viewer types into engine code. If that breaks, the rest of
|
||||||
|
the workplan stops and a new ADR is required for ADR-0004 (PDF viewer choice).
|
||||||
|
|
||||||
|
## Dependency Order
|
||||||
|
|
||||||
|
```
T01 (engine types: Document, Representation, Annotation, Selector, EvidenceItem)
└─ T02 (PDF viewer adapter spike — store + reload selectors as JSON)
└─ T03 (evidence-source: PDF ingest, fingerprint, canonical text)
└─ T04 (evidence-anchor: TextQuote + TextPosition resolution against representation)
└─ T05 (in-memory repositories + engine services)
└─ T06 (citation-work UI: collection list + viewer shell + sidebar)
└─ T07 (annotation create flow)
└─ T08 (click-to-reopen flow)
└─ T09 (end-to-end test of PRD scenario steps 1-4)
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T01 — Engine types in `src/shared/`
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T01
|
||||||
|
priority: critical
|
||||||
|
status: todo
|
||||||
|
```
|
||||||
|
|
||||||
|
Translate the type definitions in `wiki/SharedContracts.md` §1 and §3 into
|
||||||
|
TypeScript under `src/shared/`:
|
||||||
|
|
||||||
|
- `src/shared/document.ts` — `Document`, `DocumentRepresentation`, `PageMap`,
|
||||||
|
`OffsetMap`
|
||||||
|
- `src/shared/selector.ts` — `Selector` discriminated union with at minimum
|
||||||
|
`TextQuoteSelector`, `TextPositionSelector`, `PdfRectSelector`,
|
||||||
|
`PdfPageTextSelector`. Other selector kinds defined as `never`-typed stubs
|
||||||
|
for now.
|
||||||
|
- `src/shared/annotation.ts` — `Annotation` with `selectors`, `quote`,
|
||||||
|
`note`, `normalizeVersion`
|
||||||
|
- `src/shared/evidence.ts` — `EvidenceItem`, `EvidenceItem.status` enum per
|
||||||
|
§2.2
|
||||||
|
- `src/shared/ids.ts` — branded ID types and a `newId(prefix)` helper
|
||||||
|
|
||||||
|
No services, no behavior. Pure data shapes + the ID helper.
|
||||||
|
|
||||||
|
Add JSDoc on each type pointing at the §-reference in
|
||||||
|
`wiki/SharedContracts.md` it implements.
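As an illustration of the branded-ID idea for `src/shared/ids.ts`, a sketch might look like the following. The `Brand` helper, the concrete ID type names, and the ID format (prefix, underscore, counter plus random suffix) are all assumptions; the workplan only asks for branded types and a `newId(prefix)` helper.

```ts
// A branded string: structurally a string, but not assignable across brands.
type Brand<B extends string> = string & { readonly __brand: B };

type DocumentId = Brand<"DocumentId">;
type AnnotationId = Brand<"AnnotationId">;

// Process-local counter so two IDs created in the same millisecond still differ.
let idCounter = 0;

// Hypothetical format: "<prefix>_<counter><random>". Not cryptographically strong.
function newId<B extends string>(prefix: string): Brand<B> {
  idCounter += 1;
  const random = Math.random().toString(36).slice(2, 10);
  return `${prefix}_${idCounter.toString(36)}${random}` as Brand<B>;
}

const documentId: DocumentId = newId<"DocumentId">("doc");
const annotationId: AnnotationId = newId<"AnnotationId">("ann");

// The next line would be a compile-time error, which is the point of branding:
// const wrong: DocumentId = annotationId;
```

Branding costs nothing at runtime (the values are plain strings and serialize as such), which matters for the T02 requirement that selectors and annotations round-trip through JSON.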
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T02 — PDF viewer adapter spike
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T02
|
||||||
|
priority: critical
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01]
|
||||||
|
```
|
||||||
|
|
||||||
|
**This is the architectural spike.** Build a throwaway
|
||||||
|
`src/anchor/pdf-viewer-adapter-spike.tsx` that:
|
||||||
|
|
||||||
|
1. Loads `fixtures/pdfs/simple.pdf` using `react-pdf-highlighter-plus`
|
||||||
|
(assumed; if a better library appears, document it in ADR-0004 before
|
||||||
|
committing).
|
||||||
|
2. Lets the user select text and produces selectors per `T01` shapes.
|
||||||
|
3. Serializes the selectors to a JSON blob in `localStorage`.
|
||||||
|
4. On reload, reads the blob, asks the adapter to resolve, scrolls to the
|
||||||
|
passage, and renders a highlight.
|
||||||
|
|
||||||
|
Success criteria:
|
||||||
|
- Reload-and-resolve works for all fixture PDFs.
|
||||||
|
- No PDF.js or `react-pdf-highlighter-plus` types appear in any file under
|
||||||
|
`src/shared/` or `src/engine/`.
|
||||||
|
- The adapter's public surface matches the contract in
|
||||||
|
`wiki/SharedContracts.md` §5.
|
||||||
|
|
||||||
|
If success criteria fail: stop. Write a short note in
|
||||||
|
`docs/decisions/ADR-0004-pdf-viewer-library.md` describing the failure mode
|
||||||
|
and proposed alternative. Do not proceed with T03+.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T03 — `src/source/`: PDF ingest, fingerprint, canonical text
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T03
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T02]
|
||||||
|
```
|
||||||
|
|
||||||
|
Implement under `src/source/pdf/`:
|
||||||
|
|
||||||
|
- `ingest.ts` — `ingestPdf(file: File | Buffer): Promise<{ document: Document; representation: DocumentRepresentation }>`
|
||||||
|
- `fingerprint.ts` — stable SHA-256 of bytes
|
||||||
|
- `extract.ts` — uses PDF.js to extract page text; runs `normalize()` from
|
||||||
|
T04 of WP-0001 over the canonical text; builds the `PageMap` and
|
||||||
|
`OffsetMap` per `Document.DocumentRepresentation`
|
||||||
|
|
||||||
|
Tests use the fixture corpus from `CE-WP-0001-T05`. For each fixture,
|
||||||
|
extracted canonical text must contain the manifest's known-good quote.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T04 — `src/anchor/`: TextQuote and TextPosition resolution
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T04
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01, T03]
|
||||||
|
```
|
||||||
|
|
||||||
|
Implement under `src/anchor/`:
|
||||||
|
|
||||||
|
- `selectors/create.ts` — given a `SelectionCapture` from the adapter, build
|
||||||
|
the maximal set of available selectors (always `TextQuoteSelector` with
|
||||||
|
prefix/suffix; `TextPositionSelector` when the representation provides
|
||||||
|
offsets; PDF rect/text selectors when on PDF)
|
||||||
|
- `selectors/resolve.ts` — implements the resolution strategy from
|
||||||
|
`wiki/ArchitectureOverview.md` §7 (try position, verify quote, fall back
|
||||||
|
through quote+prefix/suffix, return `AnchorResolution`)
|
||||||
|
- `selectors/types.ts` — `AnchorResolution`, `SelectionCapture`,
|
||||||
|
`ResolvedAnchorTarget`
|
||||||
|
|
||||||
|
Fuzzy matching is out of scope here — return `unresolved` if exact+prefix/suffix
|
||||||
|
fails. Fuzzy is a later workplan.
|
||||||
|
|
||||||
|
Unit tests using fixtures: for each fixture+known-quote pair, create
|
||||||
|
selectors then immediately resolve them; resolution must succeed with
|
||||||
|
confidence ≥ 0.9.
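The resolution order described for `selectors/resolve.ts` (try the stored position, verify the quote, fall back to quote plus prefix/suffix, otherwise give up) can be sketched against a plain string. The selector field names follow the W3C-style `exact`/`prefix`/`suffix` convention as an assumption, and the `Resolution` shape and the confidence values 1 and 0.95 are placeholders rather than anything the contract fixes.

```ts
interface TextPositionSelector { type: "TextPositionSelector"; start: number; end: number }
interface TextQuoteSelector { type: "TextQuoteSelector"; exact: string; prefix?: string; suffix?: string }

// Hypothetical stand-in for AnchorResolution.
type Resolution =
  | { status: "resolved"; start: number; end: number; confidence: number }
  | { status: "unresolved" };

function resolveInText(text: string, quote: TextQuoteSelector, position?: TextPositionSelector): Resolution {
  if (quote.exact === "") return { status: "unresolved" };

  // 1. Try the stored offsets and verify they still contain the quote.
  if (position && text.slice(position.start, position.end) === quote.exact) {
    return { status: "resolved", start: position.start, end: position.end, confidence: 1 };
  }

  // 2. Fall back to exact search, keeping only matches whose context agrees.
  const candidates: number[] = [];
  for (let i = text.indexOf(quote.exact); i !== -1; i = text.indexOf(quote.exact, i + 1)) {
    const end = i + quote.exact.length;
    const before = text.slice(0, i);
    const after = text.slice(end);
    const prefixOk = quote.prefix === undefined || before.slice(before.length - quote.prefix.length) === quote.prefix;
    const suffixOk = quote.suffix === undefined || after.slice(0, quote.suffix.length) === quote.suffix;
    if (prefixOk && suffixOk) candidates.push(i);
  }
  if (candidates.length === 1) {
    const start = candidates[0];
    return { status: "resolved", start, end: start + quote.exact.length, confidence: 0.95 };
  }

  // 3. No fuzzy matching in this workplan: ambiguous or missing quotes stay unresolved.
  return { status: "unresolved" };
}
```

Treating an ambiguous match (several candidates survive the context check) as `unresolved` is a deliberate choice in this sketch: highlighting the wrong passage is arguably worse than reporting a failure.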
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T05 — In-memory repositories + engine services
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T05
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01]
|
||||||
|
```
|
||||||
|
|
||||||
|
Under `src/engine/`:
|
||||||
|
|
||||||
|
- `repos/in-memory.ts` — `Map`-backed implementations of
|
||||||
|
`DocumentRepository`, `AnnotationRepository`, `EvidenceItemRepository`
|
||||||
|
- `services/documents.ts`, `services/annotations.ts`, `services/evidence.ts`
|
||||||
|
— thin orchestration layer that creates IDs, calls repos, and emits the
|
||||||
|
events from `wiki/SharedContracts.md` §4
|
||||||
|
- `events/bus.ts` — minimal pub/sub. Synchronous for MVP.
|
||||||
|
|
||||||
|
No persistence to disk yet. ADR-0005 (persistence) is still pending.
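For `events/bus.ts`, a synchronous pub/sub restricted to the documented event vocabulary could look roughly like this. The `DomainEvent` envelope with an untyped `payload` is an assumption; the contract lists event names only, not payload shapes.

```ts
// Event names copied from the MVP vocabulary in wiki/SharedContracts.md §4.
type EventType =
  | "DocumentImported" | "DocumentRepresentationGenerated"
  | "AnnotationCreated" | "AnnotationResolved" | "AnnotationResolutionFailed"
  | "EvidenceItemCreated" | "EvidenceItemUpdated"
  | "EvidenceLinkCreated" | "EvidenceLinkUpdated"
  | "EvidenceItemActivated" | "FormFieldActivated"
  | "CitationCardRendered"
  | "CitationRecoveryStarted" | "CitationRecoveryCandidateFound" | "CitationRecoveryConfirmed";

// Assumption: a minimal envelope; real payload types would be defined per event.
interface DomainEvent { type: EventType; payload: unknown }
type Listener = (event: DomainEvent) => void;

class EventBus {
  private listeners = new Map<EventType, Set<Listener>>();

  // Returns an unsubscribe function.
  subscribe(type: EventType, listener: Listener): () => void {
    let set = this.listeners.get(type);
    if (!set) {
      set = new Set<Listener>();
      this.listeners.set(type, set);
    }
    set.add(listener);
    const owned = set;
    return () => { owned.delete(listener); };
  }

  // Synchronous delivery: all listeners have run by the time publish returns.
  publish(event: DomainEvent): void {
    const set = this.listeners.get(event.type);
    if (!set) return;
    Array.from(set).forEach((listener) => listener(event));
  }
}
```

Because `EventType` is a closed union, publishing an event name that is not in the shared contract fails at compile time, which lines up with the rule that new event types must appear in the document first.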
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T06 — `src/work/`: collection list + viewer shell + sidebar
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T06
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T02, T05]
|
||||||
|
```
|
||||||
|
|
||||||
|
Under `src/work/` and `src/app/`:
|
||||||
|
|
||||||
|
- `src/app/App.tsx` — three-pane layout per Architecture §12.1: collection
|
||||||
|
list (left), viewer (centre), evidence sidebar (right)
|
||||||
|
- `src/work/CollectionList.tsx` — lists `fixtures/pdfs/manifest.json`
|
||||||
|
entries; click to load
|
||||||
|
- `src/work/ViewerShell.tsx` — hosts the viewer adapter from T02 wrapped
|
||||||
|
cleanly; viewer adapter API is the only surface `work/` uses
|
||||||
|
- `src/work/EvidenceSidebar.tsx` — lists evidence items for the current
|
||||||
|
document, shows quote + commentary + status
|
||||||
|
|
||||||
|
No styling beyond minimum legibility. CSS in Tailwind or vanilla — pick one,
|
||||||
|
note in ADR-0001 if it wasn't already.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T07 — Annotation create flow
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T07
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T04, T05, T06]
|
||||||
|
```
|
||||||
|
|
||||||
|
Wire selection → annotation → evidence item:
|
||||||
|
|
||||||
|
1. User selects text in the viewer.
|
||||||
|
2. A small toolbar appears with a comment input + Save button.
|
||||||
|
3. On Save: adapter produces `SelectionCapture` → anchor creates `Selector[]`
|
||||||
|
→ engine creates `Annotation` → engine creates `EvidenceItem` with the
|
||||||
|
commentary → sidebar updates.
|
||||||
|
|
||||||
|
Active state lives in a single React context for now; no Redux/Zustand.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T08 — Click-to-reopen flow
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T08
|
||||||
|
priority: critical
|
||||||
|
status: todo
|
||||||
|
depends_on: [T04, T06, T07]
|
||||||
|
```
|
||||||
|
|
||||||
|
Implement the round trip:
|
||||||
|
|
||||||
|
1. User clicks an evidence item in the sidebar.
|
||||||
|
2. Engine loads the annotation → anchor resolves selectors against the
|
||||||
|
current representation → adapter scrolls to and highlights the target.
|
||||||
|
|
||||||
|
Critically, this must also work **after a page reload**. Persistence to
|
||||||
|
`localStorage` is acceptable for MVP (decide explicitly in
|
||||||
|
`ADR-0005-persistence.md` that we are deferring real persistence).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T09 — End-to-end test of PRD scenario steps 1-4
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0002-T09
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T07, T08]
|
||||||
|
```
|
||||||
|
|
||||||
|
Write a Playwright (or similar) E2E test that:
|
||||||
|
|
||||||
|
1. Opens the app.
|
||||||
|
2. Picks `simple.pdf`.
|
||||||
|
3. Programmatically selects the known-good quote from the manifest.
|
||||||
|
4. Saves an evidence item with a comment.
|
||||||
|
5. Verifies the item appears in the sidebar.
|
||||||
|
6. Reloads the page.
|
||||||
|
7. Clicks the evidence item.
|
||||||
|
8. Verifies the highlight is rendered on the expected page.
|
||||||
|
|
||||||
|
This is the contract for "MVP slice 1 works". If it passes, CE-WP-0003 may
|
||||||
|
begin.
|
||||||
246
workplans/CE-WP-0003-form-binding-visual-guide.md
Normal file
@@ -0,0 +1,246 @@
|
|||||||
|
---
|
||||||
|
id: CE-WP-0003
|
||||||
|
type: workplan
|
||||||
|
title: "Form binding + visual guide — EvidenceLink, rect registry, SVG overlay"
|
||||||
|
domain: citation_evidence
|
||||||
|
repo: citation-evidence
|
||||||
|
repo_id: a677c189-b4e2-4f2a-9e48-faa482c277e6
|
||||||
|
status: todo
|
||||||
|
owner: Bernd
|
||||||
|
created: 2026-05-24
|
||||||
|
updated: 2026-05-24
|
||||||
|
depends_on_workplan: CE-WP-0002
|
||||||
|
spec_refs:
|
||||||
|
- wiki/ProductRequirementsDocument.md
|
||||||
|
- wiki/ArchitectureOverview.md
|
||||||
|
- wiki/SharedContracts.md
|
||||||
|
---
|
||||||
|
|
||||||
|
# CE-WP-0003 — Form Binding + Visual Guide
|
||||||
|
|
||||||
|
Build the evidence-backed form mode and the SVG visual guide overlay.
|
||||||
|
After this workplan, a user can:
|
||||||
|
|
||||||
|
1. Open a form next to the document viewer.
|
||||||
|
2. Drag (or click-to-link) an evidence item from the sidebar onto a form
|
||||||
|
field.
|
||||||
|
3. Click a form field → its linked evidence items appear → the active
|
||||||
|
evidence's source passage is scrolled into view and highlighted → an SVG
|
||||||
|
guide visually connects the field, the evidence card, and the highlight.
|
||||||
|
4. Cycle through multiple evidence items on the same field.
|
||||||
|
|
||||||
|
This is the workplan that stress-tests the rect-registry contract from
|
||||||
|
`wiki/SharedContracts.md` §7. The form, the evidence card, and the viewer's
|
||||||
|
highlight all need to publish rects to a single overlay that re-renders on
|
||||||
|
scroll/resize/focus.
|
||||||
|
|
||||||
|
## Dependency Order
|
||||||
|
|
||||||
|
```
T01 (EvidenceLink + EvidenceSet types + relation/status enums)
└─ T02 (binding service + in-memory link repo + active-state machine)
└─ T03 (rect registry — the contract from SharedContracts.md §7)
└─ T04 (form schema + simple field renderer)
└─ T05 (side-by-side layout + drag-or-click to link)
└─ T06 (active-evidence cycling on a field)
└─ T07 (SVG visual guide overlay)
└─ T08 (E2E test of PRD scenario steps 5-9)
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T01 — `EvidenceLink` + `EvidenceSet` types
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0003-T01
|
||||||
|
priority: critical
|
||||||
|
status: todo
|
||||||
|
```
|
||||||
|
|
||||||
|
Add under `src/shared/`:
|
||||||
|
|
||||||
|
- `src/shared/evidence-link.ts` — `EvidenceLink`, `EvidenceLink.status`
|
||||||
|
enum per SharedContracts §2.4, `EvidenceLink.relation` enum per §2.5,
|
||||||
|
`EvidenceTarget` generic shape
|
||||||
|
- `src/shared/evidence-set.ts` — `EvidenceSet` with `activeEvidenceItemId`
|
||||||
|
|
||||||
|
No services. Pure shapes.
|
||||||
|
|
||||||
|
Add a unit test asserting that the union of all enum values matches the
|
||||||
|
`SharedContracts.md` lists exactly — if someone adds a value without
|
||||||
|
updating the doc, the test fails.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T02 — Binding service + in-memory link repo + active-state machine
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0003-T02
|
||||||
|
priority: high
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01]
|
||||||
|
```
|
||||||
|
|
||||||
|
Under `src/binder/`:
|
||||||
|
|
||||||
|
- `repos/in-memory-links.ts` — Map-backed `EvidenceLinkRepository`
|
||||||
|
- `services/bindings.ts` — `linkEvidenceToTarget`, `unlinkEvidence`,
|
||||||
|
`listEvidenceForTarget`, `setActiveEvidence`
|
||||||
|
- `state/active.ts` — a small machine tracking
|
||||||
|
`(activeTarget, activeEvidenceItem, activeAnnotation)`. Exposed as a React
|
||||||
|
context.
|
||||||
|
|
||||||
|
Emit the events from SharedContracts §4 (`EvidenceLinkCreated`,
|
||||||
|
`EvidenceItemActivated`, `FormFieldActivated`).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T03 — Rect registry (the SharedContracts §7 contract)
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0003-T03
|
||||||
|
priority: critical
|
||||||
|
status: todo
|
||||||
|
depends_on: [T02]
|
||||||
|
```
|
||||||
|
|
||||||
|
Implement under `src/binder/visual-guide/`:
|
||||||
|
|
||||||
|
- `rect-registry.ts` — `RectRegistry` with `register`, `getRect`,
|
||||||
|
`subscribe` per SharedContracts §7
|
||||||
|
- `react-hooks.ts` — `useRegisterRect(kind, id, ref)` for components to
|
||||||
|
register a ref-derived rect
|
||||||
|
- `events.ts` — registry emits `rect-changed` events on
|
||||||
|
scroll/resize/focus/active-evidence-change (use ResizeObserver +
|
||||||
|
IntersectionObserver + window resize + window scroll listeners)
|
||||||
|
|
||||||
|
Unit tests: register a fake field, evidence card, and highlight; mutate
|
||||||
|
their bounding rects; assert subscribers fire with the new rects.
|
||||||
|
|
||||||
|
**This contract must not change after T03.** Three subsystems will depend on
|
||||||
|
it in T05/T06/T07.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## T04 — Form schema + simple field renderer
|
||||||
|
|
||||||
|
```task
|
||||||
|
id: CE-WP-0003-T04
|
||||||
|
priority: medium
|
||||||
|
status: todo
|
||||||
|
depends_on: [T01]
|
||||||
|
```
|
||||||
|
|
||||||
|
A deliberately minimal form schema lives in `src/app/forms/demo-schema.ts`:
|
||||||
|
|
||||||
|
```ts
type FormFieldSchema =
  | { type: "text"; id: string; label: string }
  | { type: "textarea"; id: string; label: string }
  | { type: "date"; id: string; label: string };
```
|
||||||
|
|
||||||
|
JSON Schema is **not** used yet — defer that to a later ADR. The MVP form
|
||||||
|
just needs to render 3-4 fields and accept evidence links.
|
||||||
|
|
||||||
|
- `src/work/FormRenderer.tsx` renders the schema as a basic form
|
||||||
|
- Each field registers itself with the rect registry as kind `"field"` with
|
||||||
|
the field's `id`
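
As an illustration of how a renderer might consume this union, the sketch below maps each field type to a control description with an exhaustive `switch`. The demo field ids and labels, the `ControlSpec` shape, and the `controlFor` helper are hypothetical; only the `"summary"` id appears elsewhere in this workplan (T08), and its field type here is a guess.

```typescript
type FormFieldSchema =
  | { type: "text"; id: string; label: string }
  | { type: "textarea"; id: string; label: string }
  | { type: "date"; id: string; label: string };

// Hypothetical demo schema with three fields.
const demoSchema: FormFieldSchema[] = [
  { type: "text", id: "title", label: "Title" },
  { type: "textarea", id: "summary", label: "Summary" },
  { type: "date", id: "reviewed-on", label: "Reviewed on" },
];

// Hypothetical description of the control a renderer would emit.
interface ControlSpec {
  tag: "input" | "textarea";
  inputType?: "text" | "date";
}

function controlFor(field: FormFieldSchema): ControlSpec {
  switch (field.type) {
    case "text":
      return { tag: "input", inputType: "text" };
    case "textarea":
      return { tag: "textarea" };
    case "date":
      return { tag: "input", inputType: "date" };
    default: {
      // Compile-time exhaustiveness check: adding a new field type
      // without a case above makes this assignment fail to type-check.
      const unreachable: never = field;
      throw new Error(`Unhandled field type: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

The `never` check is the main reason to keep the schema a discriminated union for now: extending it later forces every consumer to handle the new case.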

---

## T05 — Side-by-side layout + link evidence to field

```task
id: CE-WP-0003-T05
priority: high
status: todo
depends_on: [T02, T04]
```

A new app route `/forms/demo` shows the side-by-side layout from Architecture
§12.2:

- Left: `FormRenderer` with a demo schema (3 fields)
- Right: viewer (reusing `ViewerShell` from CE-WP-0002)
- Bottom strip or popover: evidence list

Linking interaction: click an evidence item, then click a field → link
created. (Drag-and-drop is a polish item, not MVP.) Linked fields show a
visual indication (e.g. a chip showing the count of linked evidence items).
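
The two-click linking interaction can be modelled as a small state holder that is independent of the UI. The sketch below is an assumption-laden illustration: `LinkingSession` and `EvidenceLinkDraft` are hypothetical names, the real `EvidenceLink` type comes from earlier tasks in this workplan, and ignoring duplicate links is a choice made here, not something the workplan specifies.

```typescript
// Hypothetical stand-in for the real EvidenceLink type.
interface EvidenceLinkDraft {
  evidenceId: string;
  fieldId: string;
}

class LinkingSession {
  private pendingEvidenceId: string | null = null;
  private links: EvidenceLinkDraft[] = [];

  /** First click: remember which evidence item is about to be linked. */
  selectEvidence(evidenceId: string): void {
    this.pendingEvidenceId = evidenceId;
  }

  /** Second click: create the link if an evidence item is pending. */
  clickField(fieldId: string): EvidenceLinkDraft | null {
    if (this.pendingEvidenceId === null) return null;
    const evidenceId = this.pendingEvidenceId;
    this.pendingEvidenceId = null;
    const existing = this.links.find(
      (l) => l.evidenceId === evidenceId && l.fieldId === fieldId,
    );
    if (existing) return existing; // assumption: duplicates are ignored
    const link: EvidenceLinkDraft = { evidenceId, fieldId };
    this.links.push(link);
    return link;
  }

  /** Number shown on the field's chip. */
  linkCount(fieldId: string): number {
    return this.links.filter((l) => l.fieldId === fieldId).length;
  }
}
```

A field's chip would render `linkCount(field.id)`, and a plain field click with nothing pending returns `null`, so ordinary focus behaviour is unaffected.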

---

## T06 — Active-evidence cycling on a field

```task
id: CE-WP-0003-T06
priority: high
status: todo
depends_on: [T05]
```

When a field is focused:

1. Binder loads the field's evidence set.
2. The first evidence item becomes active.
3. The viewer scrolls to and highlights its annotation.
4. Keyboard `Tab`/`Shift-Tab` within the field's evidence chips cycles
   active evidence; viewer scrolls accordingly.
5. The evidence sidebar highlights the active evidence card.

Each evidence card registers itself with the rect registry as
`"evidence-card"`.
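
The cycling in step 4 reduces to index arithmetic over the field's evidence set. The helper below is a sketch under two assumptions: that cycling wraps around at both ends, and that `-1` signals "no active evidence" for an empty set. The name `nextActiveIndex` is hypothetical.

```typescript
/**
 * Return the index of the next active evidence item.
 * `direction` is 1 for Tab and -1 for Shift-Tab.
 * Assumes wrap-around cycling; returns -1 when the set is empty.
 */
function nextActiveIndex(current: number, count: number, direction: 1 | -1): number {
  if (count <= 0) return -1;
  // Adding `count` before the modulo keeps the result non-negative
  // when stepping backwards from index 0.
  return (current + direction + count) % count;
}

// Example: a field with three linked evidence items, first one active on focus.
const evidenceIds = ["ev-1", "ev-2", "ev-3"];
let active = 0;
active = nextActiveIndex(active, evidenceIds.length, -1); // Shift-Tab from the first item
console.log(evidenceIds[active]); // prints "ev-3"
```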

---

## T07 — SVG visual guide overlay

```task
id: CE-WP-0003-T07
priority: high
status: todo
depends_on: [T03, T06]
```

Implement `src/binder/visual-guide/Overlay.tsx`:

- Single absolutely-positioned SVG covering the viewport
- Subscribes to the rect registry
- On every change, redraws two curves: `field → evidence-card` and
  `evidence-card → highlight`
- Active-only — only the currently active triple gets drawn
- Throttled to animation frames

Acceptance: scroll the viewer, resize the window, change active evidence —
the guide tracks every change without visible lag.

The viewer adapter from CE-WP-0002 must expose
`getHighlightClientRects(annotationId)` so the highlight's rect can be
registered.
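
Two pieces of the overlay are pure logic and can be sketched without React: computing the SVG path between two rects, and coalescing redraws into one per frame. Everything below is illustrative. The `Rect` shape, the choice of a cubic Bezier from the right edge of one rect to the left edge of the other, and the helper names `connectorPath` and `frameThrottle` are assumptions, not part of the workplan.

```typescript
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

/**
 * Build the `d` attribute for a curve from the right-edge midpoint of
 * `from` to the left-edge midpoint of `to`, with horizontal control handles.
 */
function connectorPath(from: Rect, to: Rect): string {
  const x1 = from.x + from.width;
  const y1 = from.y + from.height / 2;
  const x2 = to.x;
  const y2 = to.y + to.height / 2;
  const dx = (x2 - x1) / 2;
  return `M ${x1} ${y1} C ${x1 + dx} ${y1}, ${x2 - dx} ${y2}, ${x2} ${y2}`;
}

/**
 * Coalesce many redraw requests into a single call per scheduled frame.
 * `schedule` is injected so the logic can be tested outside a browser.
 */
function frameThrottle(
  draw: () => void,
  schedule: (callback: () => void) => void,
): () => void {
  let queued = false;
  return () => {
    if (queued) return;
    queued = true;
    schedule(() => {
      queued = false;
      draw();
    });
  };
}
```

In the browser the scheduler would be `(cb) => requestAnimationFrame(() => cb())`, and the throttled function would be the rect registry subscriber, so a burst of scroll events results in one redraw of the two paths.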

---

## T08 — E2E test of PRD scenario steps 5-9

```task
id: CE-WP-0003-T08
priority: high
status: todo
depends_on: [T05, T07]
```

Extend the Playwright E2E from CE-WP-0002-T09:

5. Navigate to `/forms/demo`.
6. Link the previously-created evidence item to the "summary" field.
7. Click the "summary" field.
8. Assert the field, the evidence card, and the highlight all have an
   `aria-current="true"` (or equivalent active marker).
9. Assert the SVG overlay contains exactly two `<path>` elements (one
   field→card, one card→highlight).
10. Scroll the viewer; assert the SVG paths' endpoints update within the
    next animation frame.

If this passes, the form-binding slice is complete and CE-WP-0004 may run
in parallel with any deferred polish work.
164
workplans/CE-WP-0004-citation-card-export.md
Normal file
@@ -0,0 +1,164 @@
---
id: CE-WP-0004
type: workplan
title: "Citation card export — Markdown and HTML renderers, sidebar export"
domain: citation_evidence
repo: citation-evidence
repo_id: a677c189-b4e2-4f2a-9e48-faa482c277e6
status: todo
owner: Bernd
created: 2026-05-24
updated: 2026-05-24
depends_on_workplan: CE-WP-0002
spec_refs:
  - wiki/ProductRequirementsDocument.md
  - wiki/ArchitectureOverview.md
  - wiki/SharedContracts.md
---

# CE-WP-0004 — Citation Card Export

The final step of the MVP scenario: turn an evidence item into a portable
Markdown or HTML citation card.

After this workplan, a user can:

1. Click "Export" on an evidence item in the sidebar.
2. Choose Markdown or HTML.
3. Get a clipboard-ready citation card with quote, source label,
   commentary, and a link back to source context.

This workplan can run in parallel with CE-WP-0003 once CE-WP-0002 is done —
it touches different code paths.

## Dependency Order

```
T01 (CitationCard type + open-context URL convention)
  └─ T02 (Markdown renderer)
  └─ T03 (HTML renderer)
       └─ T04 (sidebar Export button + copy-to-clipboard)
            └─ T05 (E2E test of PRD scenario step 10)
```

---

## T01 — `CitationCard` type + open-context URL convention

```task
id: CE-WP-0004-T01
priority: high
status: todo
```

Under `src/shared/`:

- `src/shared/citation-card.ts` — `CitationCard` per Architecture §4.7
- `src/shared/open-context-url.ts` — function `openContextUrl(annotationId)`
  returning a URL of the form
  `/viewer?document=<docId>&annotation=<annId>` (per Architecture §14.3)

The URL is the deep link that an exported card uses to reopen the source
context in this MVP. When persistence becomes real (post-MVP), the URL
scheme stays the same.
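
A minimal sketch of the URL helper follows. One assumption is flagged up front: the task names the signature `openContextUrl(annotationId)`, but the URL also carries a document id, so this sketch takes both ids explicitly. The real function may instead look the document up from the annotation.

```typescript
/**
 * Build the deep link `/viewer?document=<docId>&annotation=<annId>`.
 * Taking `documentId` as a parameter is an assumption of this sketch.
 */
function openContextUrl(documentId: string, annotationId: string): string {
  // URLSearchParams percent-encodes reserved characters in the ids.
  const params = new URLSearchParams({
    document: documentId,
    annotation: annotationId,
  });
  return `/viewer?${params.toString()}`;
}

console.log(openContextUrl("doc-1", "ann-7"));
// prints "/viewer?document=doc-1&annotation=ann-7"
```

Building the query with `URLSearchParams` rather than string concatenation keeps ids containing `&` or `=` from corrupting the link.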

---

## T02 — Markdown citation card renderer

```task
id: CE-WP-0004-T02
priority: high
status: todo
depends_on: [T01]
```

Under `src/engine/rendering/`:

- `markdown.ts` — `renderCitationCardMarkdown(evidenceItem, document, annotation): string`

Output format (lock this in `docs/decisions/ADR-0007-citation-card-format.md`):

```markdown
> {quote}

— *{sourceLabel}* · [Open source]({openContextUrl})

{commentary}
```

Where `sourceLabel` is `document.title` if present, else the filename, else
the document URI.

Unit tests: snapshot a few rendered cards against fixtures.
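
The renderer could be sketched as below. The three input interfaces are hypothetical stand-ins for the engine types from CE-WP-0002 (in particular, where the quote text lives is a guess), and treating an empty title or filename as absent is an assumption. The dash and middle dot from the format above are written as Unicode escapes so the source stays ASCII.

```typescript
// Hypothetical minimal shapes; the real types come from the engine.
interface SourceDocument {
  id: string;
  title?: string;
  filename?: string;
  uri: string;
}
interface Annotation {
  id: string;
  quote: string;
}
interface EvidenceItem {
  commentary: string;
}

/** Title if present, else filename, else URI (empty strings count as absent). */
function sourceLabel(doc: SourceDocument): string {
  return doc.title || doc.filename || doc.uri;
}

function renderCitationCardMarkdown(
  item: EvidenceItem,
  doc: SourceDocument,
  annotation: Annotation,
): string {
  // Prefix every line so multi-line quotes stay inside the blockquote.
  const quote = annotation.quote
    .split("\n")
    .map((line) => `> ${line}`)
    .join("\n");
  const url =
    `/viewer?document=${encodeURIComponent(doc.id)}` +
    `&annotation=${encodeURIComponent(annotation.id)}`;
  // \u2014 is the dash and \u00b7 the middle dot from the card format.
  const attribution = `\u2014 *${sourceLabel(doc)}* \u00b7 [Open source](${url})`;
  return [quote, attribution, item.commentary].join("\n\n");
}
```

Because the function is pure and returns a string, the snapshot tests mentioned above only need fixture objects and expected text.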

---

## T03 — HTML citation card renderer

```task
id: CE-WP-0004-T03
priority: high
status: todo
depends_on: [T01]
```

Under `src/engine/rendering/`:

- `html.ts` — `renderCitationCardHtml(evidenceItem, document, annotation): string`

Output: a single `<aside class="citation-card">` element with `<blockquote>`,
`<cite>`, `<a>` (open context), and `<div class="commentary">`. Avoid inline
styles; the host page provides the CSS. Treat commentary as plain text and
escape it (no raw HTML pass-through).

Web component `<citation-card>` from Architecture §14.2 is *not* in scope
here — it ships in a later workplan.
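
To make the plain-text rule for commentary concrete, here is a sketch of the HTML renderer with explicit escaping. It simplifies the signature named above: instead of `(evidenceItem, document, annotation)` it takes a hypothetical `CardFields` object with the already-resolved strings. The exact markup layout and the "Open source" link text are assumptions borrowed from the Markdown format.

```typescript
/** Escape the five characters that are significant in HTML text and attributes. */
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical pre-resolved inputs; the real renderer derives these
// from the evidence item, document, and annotation.
interface CardFields {
  quote: string;
  sourceLabel: string;
  openContextUrl: string;
  commentary: string;
}

function renderCitationCardHtml(card: CardFields): string {
  return [
    '<aside class="citation-card">',
    `  <blockquote>${escapeHtml(card.quote)}</blockquote>`,
    `  <cite>${escapeHtml(card.sourceLabel)}</cite>`,
    `  <a href="${escapeHtml(card.openContextUrl)}">Open source</a>`,
    `  <div class="commentary">${escapeHtml(card.commentary)}</div>`,
    "</aside>",
  ].join("\n");
}
```

Escaping every interpolated value, not only the commentary, means a quote or title containing `<` or `&` cannot break the markup either.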

---

## T04 — Sidebar Export button + copy-to-clipboard

```task
id: CE-WP-0004-T04
priority: medium
status: todo
depends_on: [T02, T03]
```

Add to `src/work/EvidenceSidebar.tsx`:

- Per evidence item: an "Export" affordance (icon button or menu)
- On clicking Export: small popover with two buttons, "Copy as Markdown" and
  "Copy as HTML"
- On clicking either button: render via T02/T03 and write to clipboard with
  the standard `navigator.clipboard` API; show a transient confirmation toast

Keyboard shortcut `Cmd/Ctrl+Shift+C` exports the active evidence item as
Markdown (the most common action).
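
The shortcut check and the clipboard write can be sketched independently of the sidebar component. In the snippet below, `KeyLike` and `ClipboardLike` are hypothetical structural types that a `KeyboardEvent` and `navigator.clipboard` both satisfy, and `isExportShortcut` and `copyCard` are illustrative helper names; how the toast reports success or failure is left out.

```typescript
// Structural subset of KeyboardEvent, so the check can be tested without a DOM.
interface KeyLike {
  key: string;
  metaKey: boolean;
  ctrlKey: boolean;
  shiftKey: boolean;
}

/** True for Cmd+Shift+C or Ctrl+Shift+C. */
function isExportShortcut(event: KeyLike): boolean {
  return (event.metaKey || event.ctrlKey) && event.shiftKey && event.key.toLowerCase() === "c";
}

// Structural subset of the Clipboard API; `navigator.clipboard` fits this shape.
interface ClipboardLike {
  writeText(text: string): Promise<void>;
}

/** Render the card and write it to the clipboard; resolves to false on failure. */
async function copyCard(render: () => string, clipboard: ClipboardLike): Promise<boolean> {
  try {
    await clipboard.writeText(render());
    return true;
  } catch {
    // Clipboard writes can be rejected, for example when permission is denied.
    return false;
  }
}
```

In the sidebar, a `keydown` handler would call `isExportShortcut(event)` and then `copyCard(() => markdownForActiveItem, navigator.clipboard)`, using the boolean result to decide which toast to show.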

---

## T05 — E2E test of PRD scenario step 10

```task
id: CE-WP-0004-T05
priority: medium
status: todo
depends_on: [T04]
```

Extend the Playwright E2E:

10. After the earlier steps, click Export → Copy as Markdown on the saved
    evidence item.
11. Read the clipboard; assert it contains the quote text, the document
    title, the commentary, and a URL matching the
    `/viewer?document=...&annotation=...` shape.

If this passes, MVP scenario steps 1-10 are all green and the
umbrella-first MVP is *done* for the first reference scenario from PRD §20.

The next workplan (post-MVP) would be `CE-WP-0005` to either extract the
first stable subsystem (likely `citation-engine`) into its own repo or to
add Markdown/HTML document support.
41
workplans/README.md
Normal file
@@ -0,0 +1,41 @@
# MVP Workplans

These four workplans implement the **first reference scenario** from
`wiki/ProductRequirementsDocument.md` §20 — end-to-end PDF evidence
capture → form binding → citation card export — entirely inside the
`citation-evidence` repository.

| Workplan | Title | Status |
|----------|----------------------------------------|--------|
| `CE-WP-0001` | Foundations — scaffold, folders, lint rules, normalize, fixtures | todo |
| `CE-WP-0002` | PDF review slice — engine types, anchor, source, viewer, sidebar | todo |
| `CE-WP-0003` | Form binding + visual guide — EvidenceLink, rect registry, overlay | todo |
| `CE-WP-0004` | Citation card export — Markdown + HTML renderers, sidebar export | todo |

## Order

Sequential through `CE-WP-0002`, which depends on the folder/lint scaffolding
from `CE-WP-0001`. `CE-WP-0003` and `CE-WP-0004` both depend on the engine
types, viewer adapter, and sidebar from `CE-WP-0002`; once `CE-WP-0002` is
done they touch different code paths and can run in parallel.

## How to run a workplan

```
/ralph-workplan workplans/CE-WP-0001-foundations.md
```

Ralph drives the loop and retires automatically when all tasks in the
workplan are marked `done`. See `~/.claude/plugins/ralph-workplan/ralph-workplan.md`.

## Acceptance for MVP

The first reference scenario from PRD §20 runs end-to-end:

1. Create a collection
2. Upload a PDF
3. Select a passage, add commentary, create an evidence item
4. Open a side-by-side form
5. Link the evidence item to a form field
6. Focus the field → field, evidence card, and PDF passage all highlighted
7. SVG guide visible between field → card → highlight
8. Export evidence as a Markdown citation card