generated from coulomb/repo-seed
- INTENT.md: declare umbrella as the home for shared contracts; document umbrella-first MVP decision (code lives here until subsystems stabilize)
- wiki/SharedContracts.md: vocabulary, state enums, relation types, selector taxonomy, event vocabulary, viewer adapter contract, canonical text normalization, rect-registry contract
- wiki/DependencyMap.md: allowed dependency edges; folder layout + lint-rule strategy during umbrella-first phase
- history/2026-05-24-initial-assessment.md: alignment review, technical risks, and the umbrella-first pivot rationale
- workplans/CE-WP-0001..0004: four ralph-compatible workplans covering foundations, PDF review slice, form binding + visual guide, and citation card export, implementing PRD §20 end-to-end

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# INTENT

## Purpose

This repository exists to provide the umbrella product, integration shell, and reference implementation for **citation-evidence**.

**citation-evidence** is a document-centered evidence workspace for capturing, managing, presenting, and reopening citations with contextual commentary across PDFs, Markdown, HTML, and other document formats.

The project enables users to turn source passages into reusable evidence objects that can support form fields, claims, requirements, decisions, reports, and web publications.

A citation should not be a dead reference. It should be an actionable bridge back to the source context.

---

## Primary Utility

The repository provides the integrated workspace and coordination layer for the citation-evidence system.

It brings together the subsystem projects:

- **citation-engine**: core domain model, APIs, persistence contracts, citation rendering, and orchestration
- **evidence-anchor**: durable selectors, anchoring, re-anchoring, and highlight resolution
- **evidence-source**: document ingestion, text extraction, source metadata, and citation recovery
- **citation-work**: document collection review, annotation workflow, and evidence sidebar UX
- **evidence-binder**: linking evidence to form fields, claims, requirements, decisions, and other structured targets

The umbrella repository exists to demonstrate and validate how these subsystems work together as one coherent product.

---

## Intended Users

Primary users are people and systems that need evidence-backed information work:

- researchers and analysts reviewing document collections
- form workers and case processors who need source-backed field entries
- consultants and knowledge workers producing evidence-backed reports
- compliance, audit, procurement, and legal-adjacent workers who need traceable justification
- product and requirements workers linking source material to structured decisions
- developers integrating citation-evidence capabilities into other applications
- agentic assistants helping users search, extract, bind, and present evidence

---

## Strategic Role

The strategic role of **citation-evidence** is to establish a reusable infrastructure layer for **evidence-backed information spaces**.

It connects three activities that are often handled separately:

1. reading and annotating documents,
2. extracting reusable citations and commentary,
3. binding evidence to structured outputs such as forms, claims, requirements, reports, and web pages.

The project should become a foundation for workflows where information must remain traceable to its source context.

---

## Core Concept

The central flow of the system is:

```text
Source Document
→ Document Representation
→ Durable Annotation Anchor
→ Evidence Item with Commentary
→ Evidence Link to Field / Claim / Requirement
→ Portable Citation Card
→ Reopenable Source Context
```

The system treats an evidence item as more than a highlight.

An evidence item is a reusable object that can:

* quote a source passage,
* preserve commentary,
* reopen the source context,
* support or contradict a structured target,
* be exported into another document or webpage,
* be reused by humans and software agents.
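
As a minimal sketch of how such an evidence item might be modelled, the Python below keeps the quote, the commentary, a source locator, and the links to structured targets in one object. Every name in it (`EvidenceItem`, `EvidenceLink`, `Relation`, the `doc://` URI scheme) is a hypothetical chosen for illustration; none of them is taken from `wiki/SharedContracts.md`.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Relation(Enum):
    """Hypothetical relation types; the real set belongs to the shared contracts."""
    SUPPORTS = "supports"
    CONTRADICTS = "contradicts"


@dataclass(frozen=True)
class EvidenceLink:
    """Connects an evidence item to a structured target such as a form field."""
    target_id: str
    relation: Relation


@dataclass
class EvidenceItem:
    """A reusable evidence object: quote, commentary, and a way back to the source."""
    item_id: str
    source_uri: str  # where the source context can be reopened (assumed format)
    quote: str
    commentary: str = ""
    links: List[EvidenceLink] = field(default_factory=list)

    def link_to(self, target_id: str, relation: Relation) -> EvidenceLink:
        """Record that this item supports or contradicts a target."""
        link = EvidenceLink(target_id, relation)
        self.links.append(link)
        return link

    def targets(self, relation: Relation) -> List[str]:
        """Return the ids of all targets linked with the given relation."""
        return [link.target_id for link in self.links if link.relation is relation]


# Usage: one passage supports a form field and contradicts a claim.
item = EvidenceItem(
    item_id="ev-1",
    source_uri="doc://contract.pdf#page=3",
    quote="Payment is due within 30 days.",
    commentary="Defines the payment term.",
)
item.link_to("form.payment_term", Relation.SUPPORTS)
item.link_to("claim.net-60", Relation.CONTRADICTS)
```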

---

## Scope

This repository owns the integrated product scope.

It should contain:

* product documentation
* architecture documentation
* integration scenarios
* reference workspace application
* cross-subsystem examples
* demo data and test workflows
* deployment sketches
* system-level acceptance tests
* onboarding material for developers and agents

It should coordinate the subsystem repositories without absorbing their responsibilities.

---

## Out of Scope

This repository should not become the permanent implementation home for all subsystem internals.

Specifically, it should not own:

* low-level selector and re-anchoring algorithms
* full document ingestion and extraction pipelines
* the complete persistence implementation
* all viewer-specific internals
* all form-binding logic
* all citation rendering logic

Those responsibilities belong in the focused subsystem repositories.

The umbrella repository should integrate, validate, and demonstrate them.

---

## Initial Product Modes

The integrated product should support three primary modes.

### 1. Document Review

Users add documents to a collection, review them, highlight relevant passages, add commentary, and create reusable evidence items.

### 2. Evidence-Backed Forms

Users display source documents next to structured forms. Form fields can be linked to evidence items. Activating a field focuses the corresponding source citation and visually connects field, evidence item, and document highlight.
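
One way to picture the field-activation behaviour is as a lookup from a field to everything that should be emphasised together. The sketch below assumes hypothetical names (`Binding`, `FocusTarget`, `focus_targets`) and a flat in-memory list of bindings; it illustrates the idea and is not the binder's actual API.

```python
from typing import List, NamedTuple


class Binding(NamedTuple):
    """Hypothetical link between a form field, an evidence item, and a highlight."""
    field_id: str
    evidence_id: str
    highlight_id: str


class FocusTarget(NamedTuple):
    """Something the UI should emphasise: kind is 'field', 'evidence', or 'highlight'."""
    kind: str
    ref: str


def focus_targets(field_id: str, bindings: List[Binding]) -> List[FocusTarget]:
    """Return everything the UI should emphasise when a field is activated."""
    targets = [FocusTarget("field", field_id)]
    for binding in bindings:
        if binding.field_id == field_id:
            targets.append(FocusTarget("evidence", binding.evidence_id))
            targets.append(FocusTarget("highlight", binding.highlight_id))
    return targets


bindings = [
    Binding("payment_term", "ev-1", "hl-7"),
    Binding("supplier_name", "ev-2", "hl-9"),
]
result = focus_targets("payment_term", bindings)
```

A field without bindings yields only itself, so the UI has nothing to connect and can skip drawing a visual guide.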

### 3. Citation Recovery

Users provide a citation, quote, or source clue. The system searches local sources and, eventually, configured external sources, locates candidate passages, and lets the user confirm a candidate and turn the passage into a navigable annotation.

---

## Architectural Direction

The project should be built around a headless, format-neutral evidence model with viewer-specific adapters.

Key principles:

* citations must not depend on one specific viewer implementation
* multiple selector types should be stored for durable re-anchoring
* evidence items should be first-class domain objects
* PDFs, Markdown, HTML, and future formats should share the same evidence model
* uncertain source recovery should require human confirmation
* citation cards should be portable across web, Markdown, and later report outputs
* APIs and data structures should be suitable for agentic workflows
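
The principle of storing multiple selector types can be illustrated with a small fallback chain. The sketch below is an example under stated assumptions, not the `evidence-anchor` algorithm: the `Selectors` class, the `reanchor` function, and the two selector kinds (`position`, `quote`) are hypothetical. It tries the stored character offsets first, falls back to searching for the quoted text when the document has shifted, and returns `None` when nothing matches so that a human can decide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Selectors:
    """Two hypothetical selector types stored for the same passage."""
    start: int  # position selector: character offsets into the text
    end: int
    quote: str  # quote selector: the exact passage text


def reanchor(text: str, sel: Selectors) -> Optional[Tuple[int, int, str]]:
    """Locate the passage in `text`, returning (start, end, selector_used)."""
    # 1. Position selector: valid only if the offsets still cover the quote.
    if text[sel.start:sel.end] == sel.quote:
        return sel.start, sel.end, "position"
    # 2. Quote selector: search for the exact text anywhere in the document.
    found = text.find(sel.quote)
    if found != -1:
        return found, found + len(sel.quote), "quote"
    # 3. Nothing matched: leave the decision to a human reviewer.
    return None


original = "Payment is due within 30 days."
sel = Selectors(start=0, end=14, quote="Payment is due")

# After an edit inserts text at the top, the offsets are stale but the quote still matches.
edited = "Revised terms. " + original
unchanged_hit = reanchor(original, sel)
shifted_hit = reanchor(edited, sel)
missing_hit = reanchor("Unrelated text", sel)
```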

---

## Initial Reference Scenario

The first end-to-end scenario should be:

1. A user creates a document collection.
2. The user adds a PDF.
3. The user selects a passage and adds commentary.
4. The system creates an annotation and evidence item.
5. The user opens a form next to the document.
6. The user links the evidence item to a form field.
7. The user focuses the field.
8. The system highlights the field, evidence item, and source passage.
9. The system draws a visual guide between them.
10. The user exports the evidence as a Markdown or HTML citation card.
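
Step 10 can be sketched as a small rendering function. The card layout, the `render_citation_card` name, and its parameters are assumptions made for illustration; the actual citation card format is not specified in this document.

```python
def render_citation_card(quote: str, commentary: str, title: str, locator: str) -> str:
    """Render an evidence item as a Markdown blockquote with a source line."""
    # Prefix every line of the quote so multi-line passages stay inside the blockquote.
    quoted = "\n".join("> " + line for line in quote.splitlines())
    parts = [quoted, ">", "> Source: *" + title + "*, " + locator]
    if commentary:
        # Commentary follows the blockquote as a normal paragraph.
        parts.extend(["", commentary])
    return "\n".join(parts)


card = render_citation_card(
    quote="Payment is due within 30 days.",
    commentary="Defines the payment term.",
    title="Supply Contract",
    locator="p. 3",
)
print(card)
```

Escaping of Markdown characters in the title and quote is left out to keep the sketch short; a real exporter would need it.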

This scenario validates the core product promise without requiring advanced collaboration or external source discovery.

---

## Repository Character

This repository should be:

* integrative rather than monolithic
* product-oriented rather than library-only
* documentation-rich
* testable through reference scenarios
* friendly to human developers and coding agents
* explicit about subsystem boundaries
* suitable as the entry point for the overall citation-evidence ecosystem

---

## Home for Shared Contracts

This repository is the **single home for everything the sister repos must
agree on**. The canonical documents live in `wiki/` and `docs/decisions/`:

* `wiki/ProductRequirementsDocument.md`: what the product does
* `wiki/ArchitectureOverview.md`: how the subsystems compose
* `wiki/SharedContracts.md`: vocabulary, state enums, relation types, selector taxonomy, event types, viewer adapter contract, canonical text normalization
* `wiki/DependencyMap.md`: which subsystem may depend on which
* `docs/decisions/`: ADRs that resolve ambiguities and bind the contract

Sister repos (`citation-engine`, `evidence-anchor`, `evidence-source`,
`citation-work`, `evidence-binder`) defer to these documents. When their
own `INTENT.md` files mention "shared contracts", they mean the documents
listed above.

Changes to shared contracts happen here, not in the sister repos.

---

## MVP Strategy: Umbrella-First (decided 2026-05-24)

**The MVP lives entirely in this repository before being segmented into the
sister repos.** This is a deliberate trade-off: fewer interface decisions up
front, more refactoring later when extraction happens.

The reasoning:

1. The architectural boundaries documented in the sister INTENT files are
   hypotheses. We do not yet know which ones will hold up under real product
   pressure.
2. Coordinating six repos with no working code is expensive. Coordinating one
   repo with working code is cheap.
3. Interfaces designed in advance of implementation tend to be wrong.
4. Extracting working code into a new repo is a known, bounded refactor.
   Reshaping a premature interface while implementing against it is not.

Concretely:

* All MVP source code lives under `citation-evidence/src/`, partitioned by
  future-repo names (`shared/`, `engine/`, `anchor/`, `source/`, `work/`,
  `binder/`, `app/`).
* The `DependencyMap.md` rules are enforced by lint rules on these folders.
* The five sister repos remain INTENT-only during MVP: they document the
  intended boundary, not current code.
* When a subsystem's interface stabilizes (typically after the MVP scenario
  has run end-to-end at least once), its `src/<repo-name>/` slice is extracted
  to the sister repo.
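
A folder-level dependency check of this kind could look roughly like the sketch below. The edges in `ALLOWED` are invented for illustration and do not reproduce `wiki/DependencyMap.md`; the `folder_of` and `check_import` helpers and the `src/<folder>/...` path convention are likewise assumptions.

```python
from typing import Dict, Optional, Set

# Hypothetical allowed edges: folder -> folders it may import from.
ALLOWED: Dict[str, Set[str]] = {
    "shared": set(),
    "engine": {"shared"},
    "anchor": {"shared"},
    "source": {"shared"},
    "work": {"shared", "engine", "anchor", "source"},
    "binder": {"shared", "engine"},
    "app": {"shared", "engine", "anchor", "source", "work", "binder"},
}


def folder_of(path: str) -> Optional[str]:
    """Return the top-level folder under src/ for a path like 'src/engine/x.py'."""
    parts = path.split("/")
    if len(parts) >= 2 and parts[0] == "src" and parts[1] in ALLOWED:
        return parts[1]
    return None


def check_import(importer: str, imported: str) -> Optional[str]:
    """Return an error message if the import crosses a forbidden edge, else None."""
    src, dst = folder_of(importer), folder_of(imported)
    if src is None or dst is None or src == dst:
        return None  # outside the partition, or an import within one folder
    if dst not in ALLOWED[src]:
        return f"{src}/ may not depend on {dst}/ ({importer} -> {imported})"
    return None


ok = check_import("src/work/sidebar.py", "src/engine/model.py")
bad = check_import("src/anchor/resolve.py", "src/work/sidebar.py")
```

In practice the same rule would more likely be expressed as configuration for an existing linter; the function only shows what is being checked.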

This INTENT will be updated when extraction happens.

---

## Success Criteria

The repository is successful when it allows a developer or agent to understand, run, and extend the citation-evidence system as an integrated product.

A first useful version should make it possible to:

* load a document collection,
* review a PDF,
* create an evidence item from selected text,
* link that evidence item to a structured form field,
* reopen the cited source context from the field,
* render the evidence as a citation card,
* understand which subsystem owns which part of the implementation.

---

## Guiding Statement

**citation-evidence exists to make source-backed information work navigable, reusable, and trustworthy.**