# INTENT

## Purpose

This repository exists to provide the umbrella product, integration shell, and reference implementation for **citation-evidence**.

**citation-evidence** is a document-centered evidence workspace for capturing, managing, presenting, and reopening citations with contextual commentary across PDFs, Markdown, HTML, and other document formats.

The project enables users to turn source passages into reusable evidence objects that can support form fields, claims, requirements, decisions, reports, and web publications.

A citation should not be a dead reference. It should be an actionable bridge back to the source context.

---

## Primary Utility

The repository provides the integrated workspace and coordination layer for the citation-evidence system.

It brings together the subsystem projects:

- **citation-engine**: core domain model, APIs, persistence contracts, citation rendering, and orchestration
- **evidence-anchor**: durable selectors, anchoring, re-anchoring, and highlight resolution
- **evidence-source**: document ingestion, text extraction, source metadata, and citation recovery
- **citation-work**: document collection review, annotation workflow, and evidence sidebar UX
- **evidence-binder**: linking evidence to form fields, claims, requirements, decisions, and other structured targets

The umbrella repository exists to demonstrate and validate how these subsystems work together as one coherent product.

---

## Intended Users

Primary users are people and systems that need evidence-backed information work:

- researchers and analysts reviewing document collections
- form workers and case processors who need source-backed field entries
- consultants and knowledge workers producing evidence-backed reports
- compliance, audit, procurement, and legal-adjacent workers who need traceable justification
- product and requirements workers linking source material to structured decisions
- developers integrating citation-evidence capabilities into other applications
- agentic assistants helping users search, extract, bind, and present evidence

---

## Strategic Role

The strategic role of **citation-evidence** is to establish a reusable infrastructure layer for **evidence-backed information spaces**.

It connects three activities that are often handled separately:

1. reading and annotating documents,
2. extracting reusable citations and commentary,
3. binding evidence to structured outputs such as forms, claims, requirements, reports, and web pages.

The project should become a foundation for workflows where information must remain traceable to its source context.

---

## Core Concept

The central flow of the system is:

```text
Source Document
→ Document Representation
→ Durable Annotation Anchor
→ Evidence Item with Commentary
→ Evidence Link to Field / Claim / Requirement
→ Portable Citation Card
→ Reopenable Source Context
```

The system treats an evidence item as more than a highlight.

An evidence item is a reusable object that can:

* quote a source passage,
* preserve commentary,
* reopen the source context,
* support or contradict a structured target,
* be exported into another document or webpage,
* be reused by humans and software agents.
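The capabilities listed above could be modeled roughly as follows. This is a minimal Python sketch, not the citation-engine domain model: the names `EvidenceItem`, `SourceRef`, `Stance`, and `to_markdown_card`, the `locator` format, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stance(Enum):
    """How an evidence item relates to a structured target."""
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"


@dataclass(frozen=True)
class SourceRef:
    """Enough information to reopen the source context (hypothetical shape)."""
    document_id: str
    locator: str  # e.g. a page or heading path; the format is an assumption


@dataclass
class EvidenceItem:
    quote: str                 # the quoted source passage
    source: SourceRef          # where to reopen the passage
    commentary: str = ""       # preserved user commentary
    # target id -> stance; target ids are made-up strings in this sketch
    links: dict[str, Stance] = field(default_factory=dict)

    def link(self, target_id: str, stance: Stance) -> None:
        """Record that this item supports or contradicts a target."""
        self.links[target_id] = stance

    def to_markdown_card(self) -> str:
        """Export the item as a small Markdown block."""
        lines = [f"> {self.quote}", ""]
        if self.commentary:
            lines += [self.commentary, ""]
        lines.append(f"Source: {self.source.document_id} ({self.source.locator})")
        return "\n".join(lines)


# Illustrative data only.
item = EvidenceItem(
    quote="Applicants must submit proof of income.",
    source=SourceRef(document_id="guidelines.pdf", locator="p. 4"),
    commentary="Defines the income requirement.",
)
item.link("form:applicant.income", Stance.SUPPORTS)
print(item.to_markdown_card())
```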

---

## Scope

This repository owns the integrated product scope.

It should contain:

* product documentation
* architecture documentation
* integration scenarios
* reference workspace application
* cross-subsystem examples
* demo data and test workflows
* deployment sketches
* system-level acceptance tests
* onboarding material for developers and agents

It should coordinate the subsystem repositories without absorbing their responsibilities.

---

## Out of Scope

This repository should not become the implementation home for all subsystem internals.

Specifically, it should not own:

* low-level selector and re-anchoring algorithms
* full document ingestion and extraction pipelines
* the complete persistence implementation
* all viewer-specific internals
* all form-binding logic
* all citation rendering logic

Those responsibilities belong in the focused subsystem repositories.

The umbrella repository should integrate, validate, and demonstrate them.

---

## Initial Product Modes

The integrated product should support three primary modes.

### 1. Document Review

Users add documents to a collection, review them, highlight relevant passages, add commentary, and create reusable evidence items.

### 2. Evidence-Backed Forms

Users display source documents next to structured forms. Form fields can be linked to evidence items. Activating a field focuses the corresponding source citation and visually connects field, evidence item, and document highlight.
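The field-activation behavior can be pictured as a lookup from a field to its linked evidence items and their document anchors. The following Python sketch assumes hypothetical types (`EvidenceLink`, `FocusTarget`) and made-up identifiers; it is not the evidence-binder API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceLink:
    """Connects one form field to one evidence item (assumed shape)."""
    field_id: str
    evidence_id: str


@dataclass(frozen=True)
class FocusTarget:
    """The three things a UI would highlight and connect for one link."""
    field_id: str
    evidence_id: str
    anchor_id: str


def focus_targets(field_id, links, anchor_by_evidence):
    """Return everything to highlight when `field_id` is activated."""
    return [
        FocusTarget(link.field_id, link.evidence_id,
                    anchor_by_evidence[link.evidence_id])
        for link in links
        # Skip links whose evidence has no resolvable anchor.
        if link.field_id == field_id and link.evidence_id in anchor_by_evidence
    ]


# Illustrative data only.
links = [
    EvidenceLink("applicant.income", "ev-1"),
    EvidenceLink("applicant.name", "ev-2"),
]
anchors = {"ev-1": "anchor-17", "ev-2": "anchor-3"}

targets = focus_targets("applicant.income", links, anchors)
```

Returning a list leaves room for a field backed by several evidence items; a field with no links simply yields an empty list.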

### 3. Citation Recovery

Users provide a citation, quote, or source clue. The system searches local sources and, eventually, configured external sources, locates candidate passages, and lets the user confirm a match and turn it into a navigable annotation.
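One way to picture the search-then-confirm flow is fuzzy matching of the supplied quote against known passages, with every hit left unconfirmed until a user accepts it. This Python sketch uses the standard library's `difflib.SequenceMatcher` as a stand-in for whatever matching evidence-source actually uses; `Candidate`, `find_candidates`, the `min_score` threshold, and the sample corpus are assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Candidate:
    document_id: str
    passage: str
    score: float
    confirmed: bool = False  # stays False until a user accepts the match


def find_candidates(quote, passages_by_document, min_score=0.6):
    """Rank passages by fuzzy similarity to `quote`, best match first."""
    found = []
    for document_id, passages in passages_by_document.items():
        for passage in passages:
            score = SequenceMatcher(None, quote.lower(), passage.lower()).ratio()
            if score >= min_score:
                found.append(Candidate(document_id, passage, score))
    return sorted(found, key=lambda c: c.score, reverse=True)


def confirm(candidate):
    """Record the human decision; only confirmed candidates become annotations."""
    candidate.confirmed = True
    return candidate


# Illustrative local corpus; a real system would search extracted documents.
corpus = {
    "guidelines.pdf": [
        "Applicants must submit proof of income.",
        "Applications are reviewed within thirty days.",
    ],
}
candidates = find_candidates("applicants must submit a proof of income", corpus)
```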

---

## Architectural Direction

The project should be built around a headless, format-neutral evidence model with viewer-specific adapters.

Key principles:

* citations must not depend on one specific viewer implementation
* multiple selector types should be stored for durable re-anchoring
* evidence items should be first-class domain objects
* PDFs, Markdown, HTML, and future formats should share the same evidence model
* uncertain source recovery should require human confirmation
* citation cards should be portable across web, Markdown, and, later, report outputs
* APIs and data structures should be suitable for agentic workflows
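The principle of storing multiple selector types can be illustrated with a fallback chain: try the cheapest selector first, then progressively more tolerant ones, and give up rather than guess. The Python sketch below is an assumption-laden illustration, not the evidence-anchor algorithm. The selector shapes loosely follow the text position and text quote selectors of the W3C Web Annotation Data Model; whether the project adopts that model is not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PositionSelector:
    start: int
    end: int


@dataclass(frozen=True)
class QuoteSelector:
    exact: str
    prefix: str = ""
    suffix: str = ""


def reanchor(text: str, position: PositionSelector,
             quote: QuoteSelector) -> Optional[tuple[int, int]]:
    """Resolve an anchor in `text`, trying the cheapest selector first."""
    # 1. The stored offsets are still valid if they still contain the quote.
    if text[position.start:position.end] == quote.exact:
        return (position.start, position.end)
    # 2. Otherwise search for the quote together with its surrounding context.
    needle = quote.prefix + quote.exact + quote.suffix
    hit = text.find(needle)
    if hit != -1:
        start = hit + len(quote.prefix)
        return (start, start + len(quote.exact))
    # 3. Last resort: the bare quote, accepted only if it is unambiguous.
    if text.count(quote.exact) == 1:
        start = text.find(quote.exact)
        return (start, start + len(quote.exact))
    return None  # unresolved; uncertain cases should go to a human


original = "Proof of income is required."
edited = "Note: proof of address is optional. Proof of income is required."
position = PositionSelector(0, 15)
quote = QuoteSelector(exact="Proof of income", suffix=" is required")

print(reanchor(original, position, quote))  # (0, 15)
print(reanchor(edited, position, quote))    # (36, 51)
```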

---

## Initial Reference Scenario

The first end-to-end scenario should be:

1. A user creates a document collection.
2. The user adds a PDF.
3. The user selects a passage and adds commentary.
4. The system creates an annotation and evidence item.
5. The user opens a form next to the document.
6. The user links the evidence item to a form field.
7. The user focuses the field.
8. The system highlights the field, evidence item, and source passage.
9. The system draws a visual guide between them.
10. The user exports the evidence as a Markdown or HTML citation card.
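The steps above could be expressed as a system-level acceptance test. The sketch below runs them against an in-memory stand-in; the `Workspace` class, its method names, and the sample data are hypothetical, and the purely visual parts (opening the form in step 5, drawing the guide in step 9) are not modeled.

```python
class Workspace:
    """In-memory stand-in for the integrated product (hypothetical API)."""

    def __init__(self):
        self.collections = {}  # collection name -> list of document ids
        self.evidence = {}     # evidence id -> quote, commentary, document
        self.links = {}        # field id -> evidence id

    def create_collection(self, name):
        self.collections[name] = []

    def add_document(self, collection, document_id):
        self.collections[collection].append(document_id)

    def annotate(self, document_id, quote, commentary):
        """Create an annotation and its evidence item; return the evidence id."""
        evidence_id = f"ev-{len(self.evidence) + 1}"
        self.evidence[evidence_id] = {
            "document": document_id, "quote": quote, "commentary": commentary,
        }
        return evidence_id

    def link(self, field_id, evidence_id):
        self.links[field_id] = evidence_id

    def focus(self, field_id):
        """Return the three things the UI would highlight together."""
        evidence_id = self.links[field_id]
        item = self.evidence[evidence_id]
        return {"field": field_id, "evidence": evidence_id,
                "passage": item["quote"]}

    def export_markdown(self, evidence_id):
        item = self.evidence[evidence_id]
        return f"> {item['quote']}\n\n{item['commentary']} ({item['document']})"


def run_reference_scenario():
    ws = Workspace()
    ws.create_collection("grant-review")                # step 1
    ws.add_document("grant-review", "guidelines.pdf")   # step 2
    ev = ws.annotate(                                   # steps 3-4
        "guidelines.pdf",
        "Applicants must submit proof of income.",
        "Defines the income requirement.",
    )
    ws.link("applicant.income", ev)                     # step 6
    highlighted = ws.focus("applicant.income")          # steps 7-8
    card = ws.export_markdown(ev)                       # step 10
    return highlighted, card


highlighted, card = run_reference_scenario()
```

A real acceptance test would drive the subsystem repositories through their actual interfaces; the value of the sketch is only in showing that each numbered step maps to one observable call or result.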

This scenario validates the core product promise without requiring advanced collaboration or external source discovery.

---

## Repository Character

This repository should be:

* integrative rather than monolithic
* product-oriented rather than library-only
* documentation-rich
* testable through reference scenarios
* friendly to human developers and coding agents
* explicit about subsystem boundaries
* suitable as the entry point for the overall citation-evidence ecosystem

---

## Success Criteria

The repository is successful when it allows a developer or agent to understand, run, and extend the citation-evidence system as an integrated product.

A first useful version should make it possible to:

* load a document collection,
* review a PDF,
* create an evidence item from selected text,
* link that evidence item to a structured form field,
* reopen the cited source context from the field,
* render the evidence as a citation card,
* understand which subsystem owns which part of the implementation.

---

## Guiding Statement

**citation-evidence exists to make source-backed information work navigable, reusable, and trustworthy.**