citation-engine/INTENT.md
tegwick 6ba8f23b1f Add MVP Coordination section: code lives in citation-evidence umbrella during MVP
Documents the umbrella-first MVP decision (2026-05-24). This repo remains
INTENT-only until the engine's interfaces stabilize through real product
use. Points at the umbrella's wiki/SharedContracts.md, wiki/DependencyMap.md,
and docs/decisions/ as the canonical homes for cross-repo agreements.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 16:50:59 +02:00

INTENT

Purpose

This repository exists to provide the core domain engine for the citation-evidence ecosystem.

citation-engine defines the stable conceptual model, service contracts, API boundaries, persistence interfaces, and citation rendering logic needed to manage documents, annotations, evidence items, evidence links, and reusable citation presentations.

It is the domain center of the system.


Primary Utility

The repository provides the shared engine that allows the other citation-evidence subsystem repositories to work against a common model.

It should define and coordinate the core concepts:

  • Document
  • DocumentRepresentation
  • Annotation
  • Selector
  • EvidenceItem
  • EvidenceSet
  • EvidenceLink
  • CitationCard
  • CitationRecoveryAttempt

The engine should make it possible to create, store, retrieve, relate, render, and export evidence-backed citations without depending on one specific viewer, frontend, storage backend, or document format.


Intended Users

Primary users of this repository are developers and agents building the citation-evidence system.

They include:

  • developers implementing the review workspace
  • developers implementing evidence-backed form workflows
  • developers implementing document ingestion and citation recovery
  • developers implementing anchoring and re-anchoring logic
  • developers integrating citation-evidence into other applications
  • coding agents that need a stable model and API boundary for implementation work

End users should usually experience this repository indirectly through applications built on top of it.


Strategic Role

The strategic role of citation-engine is to prevent the citation-evidence ecosystem from becoming a loose collection of viewer-specific annotation tools.

It provides the shared domain language that keeps the system coherent.

The repository should ensure that:

  • annotations are not tied to one viewer implementation,
  • evidence is treated as a first-class object,
  • source passages can be reused across forms, claims, requirements, reports, and webpages,
  • citation presentation is portable,
  • storage and rendering implementations remain replaceable,
  • subsystem repositories can evolve without breaking the core conceptual model.

Core Concept

The engine models the central flow of evidence-backed citation work:

Document
  → DocumentRepresentation
  → Annotation
  → EvidenceItem
  → EvidenceLink
  → CitationCard

A Document is the known source.

A DocumentRepresentation is a normalized, searchable, addressable representation of that source.

An Annotation is a technical mark on a source range.

An EvidenceItem is the meaningful evidence object created from one or more annotations.

An EvidenceLink connects evidence to a structured target such as a form field, claim, requirement, decision, or document section.

A CitationCard is a portable rendering of evidence for use in webpages, Markdown, reports, or other documents.
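
The flow above can be sketched as a set of TypeScript types. This is a minimal illustration, not the engine's actual model: the entity names come from this document, but every field name, the selector shape, and the sample values are assumptions. Document is written as SourceDocument here only so the sketch does not collide with the DOM's global Document type.

```typescript
// Sketch only: entity names come from this document; every field is an assumption.
type Id = string;

interface SourceDocument { id: Id; title: string }

interface DocumentRepresentation {
  id: Id;
  documentId: Id;    // the Document this representation normalizes
  mediaType: string; // e.g. "text/plain"
  text: string;      // normalized, searchable text
}

// Hypothetical selector shape: a quoted passage with optional context.
interface Selector { kind: "text-quote"; exact: string; prefix?: string; suffix?: string }

interface Annotation { id: Id; representationId: Id; selectors: Selector[] }

interface EvidenceItem { id: Id; annotationIds: Id[]; commentary?: string }

type TargetKind = "form-field" | "claim" | "requirement" | "decision" | "document-section";

interface EvidenceLink { id: Id; evidenceItemId: Id; target: { kind: TargetKind; ref: string } }

interface CitationCard { evidenceItemId: Id; quote: string; sourceTitle: string; commentary?: string }

// One pass through the flow: each object points back at the previous one.
const doc: SourceDocument = { id: "doc-1", title: "Safety Manual" };
const rep: DocumentRepresentation = {
  id: "rep-1", documentId: doc.id, mediaType: "text/plain",
  text: "Wear gloves when handling solvents.",
};
const annotation: Annotation = {
  id: "ann-1", representationId: rep.id,
  selectors: [{ kind: "text-quote", exact: "Wear gloves" }],
};
const evidence: EvidenceItem = {
  id: "ev-1", annotationIds: [annotation.id], commentary: "Supports the PPE requirement.",
};
const link: EvidenceLink = {
  id: "link-1", evidenceItemId: evidence.id, target: { kind: "requirement", ref: "REQ-7" },
};
const card: CitationCard = {
  evidenceItemId: evidence.id, quote: annotation.selectors[0].exact,
  sourceTitle: doc.title, commentary: evidence.commentary,
};
```

Keeping references as plain ids rather than nested objects is one possible choice; it keeps each entity independently storable.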


Scope

This repository should own:

  • the core domain model
  • TypeScript interfaces and schemas for citation-evidence entities
  • service contracts for documents, annotations, evidence, bindings, and citation rendering
  • persistence interfaces
  • event definitions
  • citation card rendering contracts
  • import/export contracts
  • W3C Web Annotation mapping where practical
  • orchestration-level use cases that combine domain objects

It should provide the stable contracts consumed by:

  • evidence-anchor
  • evidence-source
  • citation-work
  • evidence-binder
  • citation-evidence

Out of Scope

This repository should not own implementation details that belong to more focused subsystems.

Specifically, it should not own:

  • PDF rendering internals
  • viewer-specific selection capture
  • low-level selector resolution algorithms
  • fuzzy text matching implementations
  • document parsing and ingestion pipelines
  • OCR processing
  • external source discovery implementations
  • full review workspace UI
  • form UI rendering
  • visual guide overlay rendering
  • application shell and deployment configuration

Those responsibilities belong to the appropriate subsystem repositories.


Architectural Position

The repository sits between the product shell and the specialized subsystems.

citation-evidence
  integrated product shell

citation-engine
  core model, APIs, persistence contracts, citation rendering

evidence-anchor
  selector creation, resolution, re-anchoring, highlight contracts

evidence-source
  ingestion, extraction, metadata, source discovery, recovery

evidence-binder
  evidence-to-field / claim / requirement links

citation-work
  review workspace and annotation UX

The engine should define the contract, not dictate every implementation.


Design Principles

Viewer Independence

The engine must not depend on one PDF viewer, Markdown renderer, HTML renderer, or frontend framework.

Viewer-specific logic should be hidden behind adapters owned by other subsystems.
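
One way to picture this boundary is a small adapter interface that engine-side code depends on. The sketch below is hypothetical: ViewerAdapter, its two methods, and annotateSelection are illustrative names, and the canonical viewer adapter contract is the one maintained in the umbrella's shared contracts.

```typescript
// Hypothetical adapter contract; all names here are assumptions for illustration.
interface CapturedSelection { representationId: string; exact: string }

interface ViewerAdapter {
  getSelection(): CapturedSelection | null; // null when nothing is selected
  highlight(annotationId: string): void;
}

interface AnnotationDraft { id: string; representationId: string; exact: string }

// Engine-side code sees only the adapter interface, never a concrete viewer.
function annotateSelection(viewer: ViewerAdapter, id: string): AnnotationDraft | null {
  const selection = viewer.getSelection();
  if (selection === null) return null;
  viewer.highlight(id);
  return { id, representationId: selection.representationId, exact: selection.exact };
}

// A fake viewer for tests; a PDF or HTML adapter would live in another subsystem.
class FakeViewer implements ViewerAdapter {
  highlighted: string[] = [];
  private selection: CapturedSelection | null;

  constructor(selection: CapturedSelection | null) {
    this.selection = selection;
  }

  getSelection(): CapturedSelection | null {
    return this.selection;
  }

  highlight(annotationId: string): void {
    this.highlighted.push(annotationId);
  }
}

const fakeViewer = new FakeViewer({ representationId: "rep-1", exact: "Wear gloves" });
const draft = annotateSelection(fakeViewer, "ann-1");
```

Swapping FakeViewer for a real viewer adapter would not require any change to annotateSelection.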

Evidence as First-Class Object

Evidence must not be reduced to a highlight.

An evidence item may include commentary, confidence, status, tags, and links to structured targets.

Selector Neutrality

The engine should understand selectors as domain objects but should not own all selector resolution logic.

Selector creation and resolution belong primarily to evidence-anchor.

Storage Replaceability

The engine should define persistence interfaces that can be implemented by local files, SQLite, PostgreSQL, browser storage, or other storage backends.
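
A minimal sketch of such a persistence interface, with an in-memory implementation, might look like the following. Repository, InMemoryRepository, and EvidenceRecord are hypothetical names. The methods are synchronous only for brevity; a SQLite- or PostgreSQL-backed implementation would more likely return Promises.

```typescript
// Hypothetical persistence contract; names and method set are assumptions.
interface Repository<T extends { id: string }> {
  save(entity: T): void;
  get(id: string): T | undefined;
  list(): T[];
  remove(id: string): boolean; // true if something was removed
}

class InMemoryRepository<T extends { id: string }> implements Repository<T> {
  private items = new Map<string, T>();

  save(entity: T): void {
    this.items.set(entity.id, entity); // saving the same id again overwrites
  }

  get(id: string): T | undefined {
    return this.items.get(id);
  }

  list(): T[] {
    return Array.from(this.items.values());
  }

  remove(id: string): boolean {
    return this.items.delete(id);
  }
}

interface EvidenceRecord { id: string; commentary: string }

// Callers depend on Repository<T>, so the backend can be swapped without touching them.
function countEvidence(repo: Repository<EvidenceRecord>): number {
  return repo.list().length;
}

const evidenceRepo: Repository<EvidenceRecord> = new InMemoryRepository<EvidenceRecord>();
evidenceRepo.save({ id: "ev-1", commentary: "Supports the glove requirement." });
evidenceRepo.save({ id: "ev-2", commentary: "Contradicts the earlier draft." });
```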

Portable Presentation

Citation rendering should support multiple output targets, especially:

  • internal web UI
  • web components
  • Markdown
  • HTML
  • report and document exports (later)
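
As a rough illustration of two of these targets, the sketch below renders one card to Markdown and to HTML. The CitationCard fields, the function names, and the chosen markup are assumptions, not a defined rendering contract. The Markdown renderer does not escape Markdown syntax inside the quote, which a real renderer would need to consider.

```typescript
// Hypothetical card shape and renderers; only the output targets come from this document.
interface CitationCard { quote: string; sourceTitle: string; locator?: string }

function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function sourceLine(card: CitationCard): string {
  return card.locator ? `${card.sourceTitle}, ${card.locator}` : card.sourceTitle;
}

// Markdown target: a blockquote followed by an attribution line.
function renderMarkdown(card: CitationCard): string {
  const quoted = card.quote.split("\n").map((line) => `> ${line}`).join("\n");
  return `${quoted}\n\nSource: ${sourceLine(card)}`;
}

// HTML target: figure/blockquote/figcaption, with all card text escaped.
function renderHtml(card: CitationCard): string {
  return (
    `<figure class="citation-card"><blockquote>${escapeHtml(card.quote)}</blockquote>` +
    `<figcaption>${escapeHtml(sourceLine(card))}</figcaption></figure>`
  );
}

const sampleCard: CitationCard = {
  quote: "Wear gloves when handling solvents.",
  sourceTitle: "Safety Manual",
  locator: "p. 12",
};
const asMarkdown = renderMarkdown(sampleCard);
const asHtml = renderHtml(sampleCard);
```

Both renderers take the same card object, which is what makes the presentation portable: adding a new target means adding a function, not changing the card.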

Standards Compatibility

The engine should support mapping to W3C Web Annotation concepts where practical, but it does not need to use JSON-LD as the only internal representation.
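
A mapping function is one way to keep the internal representation separate from the exchange format. In the sketch below the internal shape (InternalAnnotation) and the function name are assumptions; the output keys (@context, type, body, target, TextualBody, TextQuoteSelector) follow the W3C Web Annotation Data Model.

```typescript
// The internal shape is hypothetical; output keys follow the W3C Web Annotation Data Model.
interface InternalAnnotation {
  id: string;        // should be an IRI for the result to be a valid Web Annotation
  sourceUri: string; // IRI of the annotated document
  exact: string;
  prefix?: string;
  suffix?: string;
  commentary?: string;
}

interface WebAnnotation {
  "@context": "http://www.w3.org/ns/anno.jsonld";
  id: string;
  type: "Annotation";
  body?: { type: "TextualBody"; value: string; purpose: "commenting" };
  target: {
    source: string;
    selector: { type: "TextQuoteSelector"; exact: string; prefix?: string; suffix?: string };
  };
}

function toWebAnnotation(a: InternalAnnotation): WebAnnotation {
  const selector: WebAnnotation["target"]["selector"] = {
    type: "TextQuoteSelector",
    exact: a.exact,
  };
  // Optional context is copied only when present, so the output has no undefined keys.
  if (a.prefix !== undefined) selector.prefix = a.prefix;
  if (a.suffix !== undefined) selector.suffix = a.suffix;

  const result: WebAnnotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    id: a.id,
    type: "Annotation",
    target: { source: a.sourceUri, selector },
  };
  if (a.commentary !== undefined) {
    result.body = { type: "TextualBody", value: a.commentary, purpose: "commenting" };
  }
  return result;
}

const exported = toWebAnnotation({
  id: "https://example.org/annotations/1",
  sourceUri: "https://example.org/docs/safety-manual.pdf",
  exact: "Wear gloves",
  commentary: "Supports the PPE requirement.",
});
```

Engine-specific fields with no Web Annotation counterpart (status, confidence, evidence links) are left out of this sketch; how to carry them is an open design question.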

Agent Readiness

The model and APIs should be explicit, machine-readable, and suitable for agentic workflows.


Initial Domain Services

The first implementation should likely define service contracts for:

DocumentService
  create, get, update, list, attach representation

AnnotationService
  create, get, list by document, resolve status, update

EvidenceService
  create evidence item, attach annotation, update commentary, set status

EvidenceBindingService
  link evidence to target, list evidence for target, switch active evidence

CitationRenderingService
  render citation card as HTML, Markdown, or structured object

ImportExportService
  import/export internal model and W3C-compatible annotation data

These services may initially be interfaces and in-memory implementations only.
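
As an example of "interface plus in-memory implementation", here is a sketch of EvidenceBindingService. The method names and types are hypothetical. In particular, the rule that each target has at most one active link is an assumption about what "switch active evidence" means, not something this document specifies.

```typescript
// Hypothetical contract; "one active link per target" is an assumed rule.
interface TargetRef { kind: string; ref: string }
interface EvidenceLink { id: string; evidenceItemId: string; target: TargetRef; active: boolean }

interface EvidenceBindingService {
  linkEvidence(evidenceItemId: string, target: TargetRef): EvidenceLink;
  listForTarget(target: TargetRef): EvidenceLink[];
  switchActive(target: TargetRef, evidenceItemId: string): void;
}

class InMemoryBindingService implements EvidenceBindingService {
  private links: EvidenceLink[] = [];

  linkEvidence(evidenceItemId: string, target: TargetRef): EvidenceLink {
    // The first link for a target becomes active; later ones start inactive.
    const active = this.listForTarget(target).length === 0;
    const link: EvidenceLink = {
      id: `link-${this.links.length + 1}`,
      evidenceItemId,
      target,
      active,
    };
    this.links.push(link);
    return link;
  }

  listForTarget(target: TargetRef): EvidenceLink[] {
    return this.links.filter(
      (l) => l.target.kind === target.kind && l.target.ref === target.ref,
    );
  }

  switchActive(target: TargetRef, evidenceItemId: string): void {
    const candidates = this.listForTarget(target);
    if (!candidates.some((l) => l.evidenceItemId === evidenceItemId)) {
      throw new Error(`No link from ${evidenceItemId} to ${target.kind}:${target.ref}`);
    }
    // Exactly one candidate ends up active.
    for (const l of candidates) l.active = l.evidenceItemId === evidenceItemId;
  }
}

const bindings = new InMemoryBindingService();
const field: TargetRef = { kind: "form-field", ref: "ppe-required" };
bindings.linkEvidence("ev-1", field);
bindings.linkEvidence("ev-2", field);
bindings.switchActive(field, "ev-2");
```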


Initial Entity Set

The first model version should include:

Document
DocumentRepresentation
Annotation
Selector
EvidenceItem
EvidenceSet
EvidenceLink
CitationCard
CitationTarget
CitationRecoveryAttempt

The first implementation does not need to be complete, but the naming should stabilize early to guide the other repositories.


Integration Expectations

This repository should be easy to consume from other subsystem projects.

Expected consumers:

  • evidence-anchor uses the selector and annotation types.
  • evidence-source creates documents and document representations.
  • citation-work creates annotations and evidence items through engine services.
  • evidence-binder creates and manages evidence links.
  • citation-evidence integrates the services into the reference workspace.

The engine should avoid circular dependencies with these repositories.


First Useful Version

A first useful version of citation-engine should provide:

  • core TypeScript types for the main domain objects,
  • in-memory repositories for development and tests,
  • basic services for creating documents, annotations, evidence items, and evidence links,
  • a simple citation card renderer for Markdown and HTML,
  • basic event types,
  • initial W3C Web Annotation mapping notes or stubs,
  • examples showing how the other subsystem repos should interact with the engine.
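
The "basic event types" item could start as a discriminated union, which keeps events machine-readable for both developers and coding agents. The event names and payload fields below are hypothetical; the canonical event types are part of the shared contracts.

```typescript
// Hypothetical event names and payloads, for illustration only.
type EngineEvent =
  | { type: "document.created"; documentId: string }
  | { type: "annotation.created"; annotationId: string; documentId: string }
  | { type: "evidence.linked"; evidenceItemId: string; targetRef: string };

type Listener = (e: EngineEvent) => void;

class EventBus {
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(e: EngineEvent): void {
    for (const l of this.listeners) l(e);
  }
}

// The `type` field narrows the union, so each branch sees only its own payload.
function describeEvent(e: EngineEvent): string {
  switch (e.type) {
    case "document.created":
      return `document ${e.documentId} created`;
    case "annotation.created":
      return `annotation ${e.annotationId} created on ${e.documentId}`;
    case "evidence.linked":
      return `evidence ${e.evidenceItemId} linked to ${e.targetRef}`;
  }
}

const bus = new EventBus();
const eventLog: string[] = [];
bus.subscribe((e) => eventLog.push(describeEvent(e)));
bus.publish({ type: "document.created", documentId: "doc-1" });
bus.publish({ type: "evidence.linked", evidenceItemId: "ev-1", targetRef: "REQ-7" });
```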

Success Criteria

The repository is successful when another developer or coding agent can use it to understand and implement the core citation-evidence domain without guessing the central concepts.

A first successful implementation should make it possible to:

  1. create a document,
  2. attach a document representation,
  3. create an annotation with selectors,
  4. create an evidence item from the annotation,
  5. link the evidence item to a target,
  6. render the evidence as a citation card,
  7. leave viewer-specific and ingestion-specific work to other subsystems.
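
Steps 1 to 6 can be sketched as calls against a single in-memory facade. InMemoryEngine and all of its method names are hypothetical, and a plain quoted string stands in for a full selector set. In line with step 7, the sketch contains no viewer code, no ingestion code, and no selector resolution.

```typescript
// Hypothetical facade; method and field names are assumptions for illustration.
interface RenderedCard { quote: string; sourceTitle: string; targetRef: string | undefined }

class InMemoryEngine {
  private seq = 0;
  private titles = new Map<string, string>();
  private representations = new Map<string, { documentId: string; text: string }>();
  private annotations = new Map<string, { representationId: string; exact: string }>();
  private evidence = new Map<string, { annotationId: string; targetRef?: string }>();

  private nextId(prefix: string): string {
    this.seq += 1;
    return `${prefix}-${this.seq}`;
  }

  // Fails loudly on dangling references instead of returning undefined.
  private must<T>(value: T | undefined, what: string): T {
    if (value === undefined) throw new Error(`Unknown ${what}`);
    return value;
  }

  createDocument(title: string): string {
    const id = this.nextId("doc");
    this.titles.set(id, title);
    return id;
  }

  attachRepresentation(documentId: string, text: string): string {
    this.must(this.titles.get(documentId), "document");
    const id = this.nextId("rep");
    this.representations.set(id, { documentId, text });
    return id;
  }

  createAnnotation(representationId: string, exact: string): string {
    this.must(this.representations.get(representationId), "representation");
    const id = this.nextId("ann");
    this.annotations.set(id, { representationId, exact });
    return id;
  }

  createEvidence(annotationId: string): string {
    this.must(this.annotations.get(annotationId), "annotation");
    const id = this.nextId("ev");
    this.evidence.set(id, { annotationId });
    return id;
  }

  linkEvidence(evidenceId: string, targetRef: string): void {
    this.must(this.evidence.get(evidenceId), "evidence item").targetRef = targetRef;
  }

  renderCard(evidenceId: string): RenderedCard {
    const ev = this.must(this.evidence.get(evidenceId), "evidence item");
    const ann = this.must(this.annotations.get(ev.annotationId), "annotation");
    const rep = this.must(this.representations.get(ann.representationId), "representation");
    const sourceTitle = this.must(this.titles.get(rep.documentId), "document");
    return { quote: ann.exact, sourceTitle, targetRef: ev.targetRef };
  }
}

const engine = new InMemoryEngine();
const docId = engine.createDocument("Safety Manual");                                     // step 1
const repId = engine.attachRepresentation(docId, "Wear gloves when handling solvents."); // step 2
const annId = engine.createAnnotation(repId, "Wear gloves");                              // step 3
const evId = engine.createEvidence(annId);                                                // step 4
engine.linkEvidence(evId, "requirement:REQ-7");                                           // step 5
const renderedCard = engine.renderCard(evId);                                             // step 6
```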

Repository Character

This repository should be:

  • domain-centered,
  • stable but evolvable,
  • implementation-light at first,
  • strongly typed,
  • explicit about boundaries,
  • friendly to both humans and coding agents,
  • suitable as the conceptual backbone of the citation-evidence ecosystem.

MVP Coordination — Code Lives Upstream

During the umbrella-first MVP phase (decided 2026-05-24), the source code for this subsystem does not live in this repository yet. It lives in the umbrella repo at citation-evidence/src/engine/ and citation-evidence/src/shared/.

This INTENT.md documents the intended responsibilities and boundaries. Once the engine's interfaces have stabilized through actual MVP use, the corresponding code will be extracted into this repository.

Shared contracts (vocabulary, state enums, relation types, selector taxonomy, event types, viewer adapter, canonical text normalization, allowed dependency edges) are maintained in the umbrella repo:

  • citation-evidence/wiki/SharedContracts.md
  • citation-evidence/wiki/DependencyMap.md
  • citation-evidence/docs/decisions/ (ADRs)

This subsystem's eventual code must not contradict those documents. Changes to shared contracts happen in the umbrella, not here.

Under the dependency map, citation-engine is the leaf node: it depends on none of the other subsystems, and every other subsystem depends on it.
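
The leaf-node rule is simple enough to check mechanically. The sketch below is an illustration only: the edge list restates just what this document says (each other subsystem depends on citation-engine), the authoritative edges live in DependencyMap.md, and DependencyEdge and outgoingEdges are hypothetical names.

```typescript
// Each edge reads "from depends on to". This list is illustrative, not the canonical map.
interface DependencyEdge { from: string; to: string }

const allowedEdges: DependencyEdge[] = [
  { from: "evidence-anchor", to: "citation-engine" },
  { from: "evidence-source", to: "citation-engine" },
  { from: "citation-work", to: "citation-engine" },
  { from: "evidence-binder", to: "citation-engine" },
  { from: "citation-evidence", to: "citation-engine" },
];

// Returns every edge that would make `leaf` depend on another subsystem.
function outgoingEdges(edges: DependencyEdge[], leaf: string): DependencyEdge[] {
  return edges.filter((e) => e.from === leaf);
}

const violations = outgoingEdges(allowedEdges, "citation-engine");
```

An empty violations list means the engine is still a leaf; a check like this could run wherever the dependency map is maintained.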


Guiding Statement

citation-engine exists to make evidence-backed citations a stable, reusable, and portable domain model.