INTENT
Purpose
This repository exists to provide the umbrella product, integration shell, and reference implementation for citation-evidence.
citation-evidence is a document-centered evidence workspace for capturing, managing, presenting, and reopening citations with contextual commentary across PDFs, Markdown, HTML, and other document formats.
The project enables users to turn source passages into reusable evidence objects that can support form fields, claims, requirements, decisions, reports, and web publications.
A citation should not be a dead reference. It should be an actionable bridge back to the source context.
Primary Utility
The repository provides the integrated workspace and coordination layer for the citation-evidence system.
It brings together the subsystem projects:
- citation-engine — core domain model, APIs, persistence contracts, citation rendering, and orchestration
- evidence-anchor — durable selectors, anchoring, re-anchoring, and highlight resolution
- evidence-source — document ingestion, text extraction, source metadata, and citation recovery
- citation-work — document collection review, annotation workflow, and evidence sidebar UX
- evidence-binder — linking evidence to form fields, claims, requirements, decisions, and other structured targets
The umbrella repository exists to demonstrate and validate how these subsystems work together as one coherent product.
Intended Users
Primary users are people and systems that need evidence-backed information work:
- researchers and analysts reviewing document collections
- form workers and case processors who need source-backed field entries
- consultants and knowledge workers producing evidence-backed reports
- compliance, audit, procurement, and legal-adjacent workers who need traceable justification
- product and requirements workers linking source material to structured decisions
- developers integrating citation-evidence capabilities into other applications
- agentic assistants helping users search, extract, bind, and present evidence
Strategic Role
The strategic role of citation-evidence is to establish a reusable infrastructure layer for evidence-backed information spaces.
It connects three activities that are often handled separately:
- reading and annotating documents,
- extracting reusable citations and commentary,
- binding evidence to structured outputs such as forms, claims, requirements, reports, and web pages.
The project should become a foundation for workflows where information must remain traceable to its source context.
Core Concept
The central flow of the system is:
Source Document
→ Document Representation
→ Durable Annotation Anchor
→ Evidence Item with Commentary
→ Evidence Link to Field / Claim / Requirement
→ Portable Citation Card
→ Reopenable Source Context
The system treats an evidence item as more than a highlight.
An evidence item is a reusable object that can:
- quote a source passage,
- preserve commentary,
- reopen the source context,
- support or contradict a structured target,
- be exported into another document or webpage,
- be reused by humans and software agents.
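As an illustration only, the capabilities listed above could map onto a small data model like the one below. Every name here (Anchor, EvidenceLink, EvidenceItem, link_to, source_context) is a hypothetical placeholder, not the actual citation-engine model, and character offsets stand in for whatever durable selectors evidence-anchor provides.

```python
from dataclasses import dataclass, field
from typing import Literal

# Hypothetical sketch: names and fields are assumptions, not the real domain model.
Stance = Literal["supports", "contradicts"]


@dataclass(frozen=True)
class Anchor:
    document_id: str
    quote: str   # exact quoted source passage
    start: int   # character offset into the document representation
    end: int


@dataclass(frozen=True)
class EvidenceLink:
    target_id: str   # e.g. a form field, claim, or requirement identifier
    stance: Stance


@dataclass
class EvidenceItem:
    id: str
    anchor: Anchor
    commentary: str = ""
    links: list[EvidenceLink] = field(default_factory=list)

    def link_to(self, target_id: str, stance: Stance = "supports") -> EvidenceLink:
        """Bind this evidence item to a structured target."""
        link = EvidenceLink(target_id, stance)
        self.links.append(link)
        return link

    def source_context(self, document_text: str, margin: int = 40) -> str:
        """Return the quoted passage plus surrounding text for reopening the source."""
        lo = max(0, self.anchor.start - margin)
        hi = min(len(document_text), self.anchor.end + margin)
        return document_text[lo:hi]
```

Keeping the anchor, the commentary, and the links on one object is what would let the same item be quoted, reopened, bound to a target, and exported without duplicating the source passage.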
Scope
This repository owns the integrated product scope.
It should contain:
- product documentation
- architecture documentation
- integration scenarios
- reference workspace application
- cross-subsystem examples
- demo data and test workflows
- deployment sketches
- system-level acceptance tests
- onboarding material for developers and agents
It should coordinate the subsystem repositories without absorbing their responsibilities.
Out of Scope
This repository should not become the implementation home for all subsystem internals.
Specifically, it should not own:
- low-level selector and re-anchoring algorithms
- full document ingestion and extraction pipelines
- the complete persistence implementation
- all viewer-specific internals
- all form-binding logic
- all citation rendering logic
Those responsibilities belong in the focused subsystem repositories.
The umbrella repository should integrate, validate, and demonstrate them.
Initial Product Modes
The integrated product should support three primary modes.
1. Document Review
Users add documents to a collection, review them, highlight relevant passages, add commentary, and create reusable evidence items.
2. Evidence-Backed Forms
Users display source documents next to structured forms. Form fields can be linked to evidence items. Activating a field focuses the corresponding source citation and visually connects field, evidence item, and document highlight.
3. Citation Recovery
Users provide a citation, quote, or source clue. The system searches local sources and, eventually, configured external sources. It locates candidate passages and lets the user confirm a match and turn it into a navigable annotation.
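A minimal sketch of the Citation Recovery mode, assuming local documents are available as plain text and using fuzzy matching from Python's standard difflib module. The names Candidate, find_candidates, and confirm are hypothetical, and splitting passages on periods is a deliberate simplification rather than the evidence-source extraction approach.

```python
from dataclasses import dataclass, replace
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Candidate:
    document_id: str
    passage: str
    score: float
    confirmed: bool = False  # stays False until a human confirms the match


def find_candidates(
    clue: str,
    documents: dict[str, str],
    min_score: float = 0.6,
    limit: int = 5,
) -> list[Candidate]:
    """Rank passages from local documents by similarity to the clue."""
    results = []
    for doc_id, text in documents.items():
        # Naive passage segmentation (assumption): split on periods.
        for passage in (part.strip() for part in text.split(".")):
            if not passage:
                continue
            score = SequenceMatcher(None, clue.lower(), passage.lower()).ratio()
            if score >= min_score:
                results.append(Candidate(doc_id, passage, score))
    return sorted(results, key=lambda c: c.score, reverse=True)[:limit]


def confirm(candidate: Candidate) -> Candidate:
    """Record the user's confirmation; only confirmed candidates become annotations."""
    return replace(candidate, confirmed=True)
```

The search only proposes candidates. Turning one into an annotation is a separate, explicit step, which keeps uncertain matches behind human confirmation.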
Architectural Direction
The project should be built around a headless, format-neutral evidence model with viewer-specific adapters.
Key principles:
- citations must not depend on one specific viewer implementation
- multiple selector types should be stored for durable re-anchoring
- evidence items should be first-class domain objects
- PDFs, Markdown, HTML, and future formats should share the same evidence model
- uncertain source recovery should require human confirmation
- citation cards should be portable across web and Markdown outputs and, later, report outputs
- APIs and data structures should be suitable for agentic workflows
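To illustrate why storing more than one selector type helps re-anchoring, the sketch below keeps both a position (start, end) and a quote with surrounding context. It is loosely modeled on the text position and text quote selectors of the W3C Web Annotation Data Model; whether evidence-anchor adopts that model is an assumption, and Selectors, make_selectors, and reanchor are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Selectors:
    start: int         # position selector: character offsets
    end: int
    exact: str         # quote selector: the exact passage
    prefix: str = ""   # text immediately before the passage
    suffix: str = ""   # text immediately after the passage


def make_selectors(text: str, start: int, end: int, context: int = 20) -> Selectors:
    """Capture both selector types for a selected range."""
    return Selectors(
        start, end, text[start:end],
        prefix=text[max(0, start - context):start],
        suffix=text[end:end + context],
    )


def reanchor(text: str, sel: Selectors) -> Optional[tuple[int, int]]:
    """Resolve the selectors against a possibly edited text."""
    # Fast path: the stored offsets still point at the quoted passage.
    if text[sel.start:sel.end] == sel.exact:
        return sel.start, sel.end
    # Fallback: search for the quote and prefer the occurrence
    # whose surrounding context matches the stored prefix and suffix.
    matches = []
    pos = text.find(sel.exact)
    while pos != -1:
        matches.append(pos)
        pos = text.find(sel.exact, pos + 1)
    if not matches:
        return None  # could not re-anchor; leave the decision to a human

    def context_score(p: int) -> int:
        before = text[max(0, p - len(sel.prefix)):p]
        after = text[p + len(sel.exact):p + len(sel.exact) + len(sel.suffix)]
        return int(before == sel.prefix) + int(after == sel.suffix)

    best = max(matches, key=context_score)
    return best, best + len(sel.exact)
```

If several occurrences tie on context, this sketch simply takes the first one. In line with the confirmation principle above, a real implementation would more likely surface such ambiguous matches to the user.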
Initial Reference Scenario
The first end-to-end scenario should be:
- A user creates a document collection.
- The user adds a PDF.
- The user selects a passage and adds commentary.
- The system creates an annotation and an evidence item.
- The user opens a form next to the document.
- The user links the evidence item to a form field.
- The user focuses the field.
- The system highlights the field, evidence item, and source passage.
- The system draws a visual guide between them.
- The user exports the evidence as a Markdown or HTML citation card.
This scenario validates the core product promise without requiring advanced collaboration or external source discovery.
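The scenario above could be exercised by a system-level test along the following lines. This is a sketch under stated assumptions: Evidence, Workspace, and render_card_markdown are hypothetical names, the workspace is in-memory, and the highlighting and visual guide steps are UI concerns that are not modeled here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    id: str
    document_title: str
    page: int
    quote: str
    commentary: str


@dataclass
class Workspace:
    evidence: dict[str, Evidence] = field(default_factory=dict)
    field_links: dict[str, str] = field(default_factory=dict)  # field name -> evidence id

    def create_evidence(self, item: Evidence) -> Evidence:
        self.evidence[item.id] = item
        return item

    def link(self, field_name: str, evidence_id: str) -> None:
        """Link an existing evidence item to a form field."""
        if evidence_id not in self.evidence:
            raise KeyError(f"unknown evidence item: {evidence_id}")
        self.field_links[field_name] = evidence_id

    def focus(self, field_name: str) -> Evidence:
        """Return the evidence item a UI would highlight for this field."""
        return self.evidence[self.field_links[field_name]]


def render_card_markdown(item: Evidence) -> str:
    """Render a single-line quote as a portable Markdown citation card."""
    lines = [
        f"> {item.quote}",
        ">",
        f"> Source: {item.document_title}, p. {item.page}",
        "",
        item.commentary,
    ]
    return "\n".join(lines) + "\n"
```

A usage example with made-up demo data:

```python
ws = Workspace()
ev = ws.create_evidence(
    Evidence("ev-1", "Grant Rules.pdf", 4, "Applications close on 1 May.", "Sets the deadline.")
)
ws.link("deadline", ev.id)
print(render_card_markdown(ws.focus("deadline")))
```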
Repository Character
This repository should be:
- integrative rather than monolithic
- product-oriented rather than library-only
- documentation-rich
- testable through reference scenarios
- friendly to human developers and coding agents
- explicit about subsystem boundaries
- suitable as the entry point for the overall citation-evidence ecosystem
Success Criteria
The repository is successful when it allows a developer or agent to understand, run, and extend the citation-evidence system as an integrated product.
A first useful version should make it possible to:
- load a document collection,
- review a PDF,
- create an evidence item from selected text,
- link that evidence item to a structured form field,
- reopen the cited source context from the field,
- render the evidence as a citation card,
- understand which subsystem owns which part of the implementation.
Guiding Statement
citation-evidence exists to make source-backed information work navigable, reusable, and trustworthy.