# Ecosystem State Assessment: citation-evidence family

**Date:** 2026-06-07
**Author:** Grok (Cursor), commissioned by Bernd
**Scope:** Review of all six `INTENT.md` files in the citation-evidence family, plus the
umbrella repo's code, workplans, wiki contracts, and test coverage, to assess current
state and recommend next steps.

---

## 1. Family topology

The citation-evidence ecosystem comprises **one umbrella repo and five subsystem repos**:

```text
citation-evidence          (umbrella: all MVP code lives here)
├── citation-engine        (domain model, services, persistence, rendering)
├── evidence-anchor        (selectors, resolution, viewer adapter contract)
├── evidence-source        (ingest, extraction, citation recovery)
├── citation-work          (review workspace UX)
└── evidence-binder        (evidence-to-target binding, visual guide)
```

| Repo | Declared role | Actual state (2026-06-07) |
|------|---------------|---------------------------|
| **citation-evidence** | Umbrella product, contracts, reference app | **Active**: ~118 TS/TSX files, tests, workplans, wiki, ADRs |
| **citation-engine** | Domain model, services, persistence, rendering | **INTENT + README only**; code lives in umbrella `src/{shared,engine}/` |
| **evidence-anchor** | Selectors, resolution, viewer adapter | **INTENT + README only**; code lives in umbrella `src/anchor/` |
| **evidence-source** | Ingest, extraction, recovery | **INTENT + README only**; code lives in umbrella `src/source/` (PDF only) |
| **citation-work** | Review workspace UX | **INTENT + README only**; code lives in umbrella `src/work/` |
| **evidence-binder** | Evidence-to-target binding, visual guide | **INTENT + README only**; code lives in umbrella `src/binder/` |

The code-free sister repos are **intentional**, not neglect. On 2026-05-24 the family
adopted an **umbrella-first MVP** strategy (ADR-0002 context, `INTENT.md` §MVP Strategy):
prove the product in one repo, then extract subsystems once boundaries are validated by
real use.

---

## 2. INTENT.md quality: design maturity is high

All six `INTENT.md` files are coherent and mutually reinforcing. They share:

- The same core flow:
  `Document → DocumentRepresentation → Annotation → EvidenceItem → EvidenceLink → CitationCard`
- Explicit **in-scope / out-of-scope** boundaries (each repo pushes responsibilities outward)
- A consistent document shape (Purpose, Scope, Workflows, Success Criteria, Guiding Statement)
- A shared **"MVP Coordination — Code Lives Upstream"** section pointing at
  `citation-evidence/wiki/`

The umbrella `INTENT.md` is the strategic anchor: it owns shared contracts, integration,
and the reference scenario. Sister repos document *future* homes, not current code.

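
The core flow listed above can be read as a chain of typed records, each holding a reference to the one before it. The sketch below is illustrative only: the six type names come from the INTENT files, while every field name, the `Store` shape, and the `mustGet` and `buildCitationCard` helpers are assumptions, not the contracts in `wiki/SharedContracts.md`. `Document` is written as `SourceDocument` here to avoid a clash with the DOM type of the same name.

```typescript
// Hypothetical field names throughout; only the type names and their order
// are taken from the INTENT files.
interface SourceDocument { id: string; title: string }
interface DocumentRepresentation { id: string; documentId: string; format: "pdf" }
interface Annotation { id: string; representationId: string; quote: string }
interface EvidenceItem { id: string; annotationId: string }
interface EvidenceLink { id: string; evidenceItemId: string; targetId: string }
interface CitationCard { documentTitle: string; quote: string; targetId: string }

// Assumed in-memory lookup tables, keyed by record id.
interface Store {
  documents: Map<string, SourceDocument>;
  representations: Map<string, DocumentRepresentation>;
  annotations: Map<string, Annotation>;
  evidenceItems: Map<string, EvidenceItem>;
}

// Look up a record or fail loudly, so a broken chain is never silently ignored.
function mustGet<T>(map: Map<string, T>, id: string, kind: string): T {
  const value = map.get(id);
  if (value === undefined) throw new Error(`Missing ${kind}: ${id}`);
  return value;
}

// Walk the chain backwards from a link to assemble the data a card needs.
function buildCitationCard(link: EvidenceLink, store: Store): CitationCard {
  const item = mustGet(store.evidenceItems, link.evidenceItemId, "EvidenceItem");
  const annotation = mustGet(store.annotations, item.annotationId, "Annotation");
  const representation = mustGet(
    store.representations, annotation.representationId, "DocumentRepresentation");
  const doc = mustGet(store.documents, representation.documentId, "SourceDocument");
  return { documentTitle: doc.title, quote: annotation.quote, targetId: link.targetId };
}
```

Because every record points only at its predecessor, a card can always be rebuilt from a link plus the store, which is the property the export step relies on in this sketch.
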
### 2.1 Ambiguities from the original INTENTs: largely resolved

The initial assessment (`history/2026-05-24-initial-assessment.md`) flagged overlapping
ownership (selectors, evidence states, viewer adapters, recovery). Ownership has since
been codified in:

- `wiki/SharedContracts.md`: canonical enums, vocabulary, type/behavior split
- `wiki/DependencyMap.md`: allowed import edges, cycle prevention
- `docs/decisions/`: ADR-0004 (PDF viewer), ADR-0005 (persistence), ADR-0006 (selector
  ownership), ADR-0007 (citation card format), ADR-0008 (session archive), etc.

Notable reconciliations baked into sister INTENTs:

- `strong-support` / `weak-support` / `contradicts` moved from `EvidenceItem.status`
  to `EvidenceLink.relation`
- Selector **types** → engine; selector **algorithms** → anchor
- `citation-work` must not depend on `evidence-binder` (review works standalone;
  forms compose both)

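
To make the first reconciliation concrete, here is a minimal sketch of support strength living on the link rather than on the item. The three relation values come from the list above; the `EvidenceStatus` values, the field names, and the `isEvidenceRelation` guard are hypothetical and stand in for whatever `wiki/SharedContracts.md` actually defines. The point it illustrates is that one evidence item can support one target and contradict another.

```typescript
// The three relation values named in this assessment.
type EvidenceRelation = "strong-support" | "weak-support" | "contradicts";

// Hypothetical lifecycle values; the real enum is defined in the shared contracts.
type EvidenceStatus = "draft" | "confirmed";

interface EvidenceItem { id: string; status: EvidenceStatus }
interface EvidenceLink {
  id: string;
  evidenceItemId: string;
  targetId: string;
  relation: EvidenceRelation; // strength is a property of the item-target pair
}

const RELATIONS: readonly EvidenceRelation[] = ["strong-support", "weak-support", "contradicts"];

// Runtime guard for untrusted input such as an imported session file.
function isEvidenceRelation(value: string): value is EvidenceRelation {
  return RELATIONS.indexOf(value as EvidenceRelation) !== -1;
}

// One item, two targets, two different relations.
const item: EvidenceItem = { id: "ev-1", status: "confirmed" };
const links: EvidenceLink[] = [
  { id: "ln-1", evidenceItemId: item.id, targetId: "field-a", relation: "strong-support" },
  { id: "ln-2", evidenceItemId: item.id, targetId: "field-b", relation: "contradicts" },
];
```
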

---

## 3. Implementation state: MVP reference scenario is done

Workplans **CE-WP-0001 through CE-WP-0005** are all `status: done`:

| Workplan | Delivers |
|----------|----------|
| CE-WP-0001 | Scaffold, folder partitions, ESLint boundary rules, normalization, fixtures |
| CE-WP-0002 | PDF review slice: engine types, anchor, source ingest, viewer, sidebar |
| CE-WP-0003 | Form binding + visual guide (rect registry, SVG overlay) |
| CE-WP-0004 | Citation card export (Markdown + HTML) |
| CE-WP-0005 | Named sessions, arbitrary PDF upload, ZIP export/import |

The PRD §20 reference scenario is covered end-to-end for **PDF**:

1. Create collection/session
2. Upload PDF
3. Select passage → annotation → evidence item
4. Open side-by-side form
5. Link evidence to field
6. Focus field → coordinated highlight + visual guide
7. Export citation card

Test coverage includes 7 integration tests (PRD scenario, forms flows, overlay, citation
export, session ZIP round-trip, anchor/source round-trip) plus extensive unit tests per
subsystem folder. Recent git activity (June 2026) shows active polish on PDF text-layer
positioning and session UX.

Boundary enforcement is real: `eslint-plugin-boundaries` guards the
`src/{shared,engine,anchor,source,binder,work,app}/` dependency graph described in
`DependencyMap.md`.

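
As a rough illustration of what such a rule set encodes, the sketch below models the folder graph as an allow-list and checks it for cycles. It is plain TypeScript, not the project's `eslint-plugin-boundaries` configuration. The edges in `ALLOWED` are assumed for illustration, apart from `work` not importing `binder`, which section 2.1 states; the authoritative edges are in `DependencyMap.md`. The helper names are hypothetical.

```typescript
type Layer = "shared" | "engine" | "anchor" | "source" | "binder" | "work" | "app";

// Hypothetical allow-list: each folder maps to the folders it may import from.
const ALLOWED: Record<Layer, readonly Layer[]> = {
  shared: [],
  engine: ["shared"],
  anchor: ["shared", "engine"],
  source: ["shared", "engine", "anchor"],
  binder: ["shared", "engine", "anchor"],
  work: ["shared", "engine", "anchor", "source"], // deliberately no "binder"
  app: ["shared", "engine", "anchor", "source", "binder", "work"],
};

// An import is fine inside one folder or along an allowed edge.
function isAllowedImport(from: Layer, to: Layer): boolean {
  return from === to || ALLOWED[from].indexOf(to) !== -1;
}

// Depth-first search: reaching a node that is still being visited means a cycle.
function hasCycle(graph: Record<Layer, readonly Layer[]>): boolean {
  const state = new Map<Layer, "visiting" | "done">();
  const visit = (node: Layer): boolean => {
    const seen = state.get(node);
    if (seen === "visiting") return true;
    if (seen === "done") return false;
    state.set(node, "visiting");
    for (const next of graph[node]) {
      if (visit(next)) return true;
    }
    state.set(node, "done");
    return false;
  };
  return (Object.keys(graph) as Layer[]).some(visit);
}
```
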

---

## 4. Gap analysis: vision vs. current code

Against the full product vision in the PRD and subsystem INTENTs, significant pieces
remain **designed but not built**:

| Capability | PRD / INTENT status | Code status |
|------------|---------------------|-------------|
| **PDF review & evidence capture** | Primary MVP | **Implemented** |
| **Evidence-backed forms + visual guide** | Primary MVP | **Implemented** |
| **Citation card export** | Primary MVP | **Implemented** |
| **Session portability (ZIP)** | Demo enhancement | **Implemented** (CE-WP-0005) |
| **Markdown / HTML documents** | Primary goal (FR) | **Not started**: `src/source/` is PDF-only |
| **Citation recovery mode** | Third product mode | **Not started**: `CitationRecoveryAttempt` in contracts/ids only |
| **Document review status workflow** | `citation-work` INTENT | **Not wired**: `reviewStatus` enum in contracts, no UI usage |
| **External source discovery** | Future / privacy-sensitive | **Deferred** (correct per PRD non-goals) |
| **Sister repo extraction** | Post-MVP | **Not started**: all code still in umbrella |
| **Monorepo vs. polyrepo decision** | ADR-0002 | **Still blank**: blocks clean extraction |

**Housekeeping debt:** `workplans/README.md` is stale (still lists CE-WP-0001..0004 as
`todo`); the individual workplan files correctly show `done`.

---

## 5. Per-repo assessment

### 5.1 citation-evidence: healthy, past MVP baseline

**Strengths:** Working reference app, enforced architecture, rich documentation, completed
Ralph workplans, contracts that sister repos can defer to.

**Risks:** Umbrella carries all complexity; extraction strategy undecided; the PDF-only
implementation may hide places where the format-neutral claims do not hold until
HTML/Markdown adapters land; citation recovery is a large remaining vertical with no code
yet.

**Verdict:** The **center of gravity** of the family. This is where all meaningful
engineering lives today.

### 5.2 Sister repos (engine, anchor, source, work, binder): scaffolded placeholders

**Strengths:** Excellent `INTENT.md` + `README.md` that correctly point upstream; LICENSE
and git remotes in place; boundaries pre-negotiated via umbrella wiki.

**Gaps:** No `package.json`, no source, no CI, no published packages. They are **boundary
documents**, not runnable libraries.

**Verdict:** Ready as **extraction targets**, not as independent products. Extraction should
follow ADR-0002 resolution and a deliberate `git mv` + package cut per README.

---

## 6. Strategic read

The family is in a **deliberate transitional architecture**:

```text
Phase A (complete):  Design six-repo boundaries + build MVP in umbrella
Phase B (current):   Harden PDF path, demo UX, contracts via real use
Phase C (next):      Format expansion (MD/HTML) and/or citation recovery
Phase D (later):     Extract subsystems to sister repos
```

Compared to the original phased plan in `history/2026-05-24-initial-assessment.md`, the
project has **skipped ahead**: Phase 1 (PDF vertical slice) and Phase 2 (form binding)
are done, plus demo/session portability. Phase 3 (format expansion) and Phase 4 (local
citation recovery) have **not** started.

The INTENT documents describe a mature, agent-friendly architecture. The code validates the
**hardest integration path** (PDF selection → durable selectors → form binding → visual
guide → export). What remains is mostly **breadth** (more formats, recovery mode) and
**structural work** (extraction, packaging).

---

## 7. Recommended priorities

1. **Update `workplans/README.md`** to reflect CE-WP-0001..0005 as done; add CE-WP-0006
   for the next vertical (Markdown adapter or local citation recovery; pick one).
2. **Resolve ADR-0002** before any extraction: the choice between monorepo workspaces and
   published packages affects everything downstream.
3. **Either** expand formats (validates the "format-neutral" claim) **or** build citation
   recovery (validates the third product mode); doing both in parallel would split focus.
4. **Extract `citation-engine` first** when ready: it is the leaf node every other repo
   depends on, and `shared/` + `engine/` are the most stable slices.

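
The reasoning behind item 4 is a dependency ordering: a repo can be extracted once everything it depends on has already been extracted. A minimal sketch follows, assuming a hypothetical dependency list (only the fact that `citation-engine` is the leaf comes from this assessment) and a hypothetical `extractionOrder` helper.

```typescript
interface RepoNode { repo: string; dependsOn: readonly string[] }

// Assumed repo-level dependencies, for illustration only.
const REPOS: readonly RepoNode[] = [
  { repo: "citation-engine", dependsOn: [] },
  { repo: "evidence-anchor", dependsOn: ["citation-engine"] },
  { repo: "evidence-source", dependsOn: ["citation-engine", "evidence-anchor"] },
  { repo: "evidence-binder", dependsOn: ["citation-engine", "evidence-anchor"] },
  { repo: "citation-work", dependsOn: ["citation-engine", "evidence-anchor", "evidence-source"] },
];

// Round-based topological ordering: each round emits every repo whose
// dependencies were all emitted in earlier rounds.
function extractionOrder(nodes: readonly RepoNode[]): string[] {
  const order: string[] = [];
  let pending = nodes.slice();
  while (pending.length > 0) {
    const ready = pending.filter((n) =>
      n.dependsOn.every((dep) => order.indexOf(dep) !== -1));
    // No progress means the remaining repos depend on each other.
    if (ready.length === 0) throw new Error("Dependency cycle: nothing left to extract");
    ready.forEach((n) => order.push(n.repo));
    pending = pending.filter((n) => ready.indexOf(n) === -1);
  }
  return order;
}

const order = extractionOrder(REPOS);
console.log(order.join(" -> "));
```
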

---

## 8. Bottom line

The citation-evidence family is **well-architected on paper and materially implemented in
one place**. The six `INTENT.md` files form a consistent, boundary-aware design; the umbrella
repo has delivered a working PDF-centric MVP with tests and enforced dependency rules. The
five sister repos are **correctly empty of code** during the umbrella-first MVP: they are
extraction targets, not lagging implementations.

**Overall state:** design maturity high, implementation maturity solid for PDF MVP,
extraction maturity low, product breadth roughly half of the full PRD vision.

The main open question is what comes next: format expansion, citation recovery, or
subsystem extraction.