feat(infospace): systematic long-text processing — rich commit bodies, per-source eval/classify, chapters view

Three coordinated changes that let the pipeline produce a clean
chapter-by-chapter git history on long texts without archaeology after
the fact.

1. Richer commit messages. `SourcePipeline._git_commit` now diffs the
   staged changes, buckets added files by output subdirectory (entities,
   evaluations, classifications, mappings, analyses, metrics, logs), and
   includes counts in the commit body. So `git log` reads
   "entities: +23, evaluations: +23" per chapter instead of the same
   generic blurb on every commit. Zero behaviour change when no output
   changed; falls back to the original message if the diff query fails.

2. --eval-after-source / --classify-after-source on `infospace process`.
   After a source's stages succeed, the pipeline identifies which entity
   files are *new* (set diff of entity slugs before vs after), loads
   their EntityMeta, and runs per-entity evaluation and/or classification
   scoped to just those slugs before the per-source git commit lands.
   Result: each chapter's commit is self-contained, with extraction +
   evaluation + classification in one atomic unit. Gated behind explicit
   flags because the cost is real (LLM latency per chapter rather than
   amortised across one bulk batch).

3. `markitect infospace chapters` subcommand. Lists source files in
   canonical order with entity count, evaluated count, classified count,
   and mean per-entity score per source. Text or JSON output. Natural
   triage surface for long-text infospaces: spot chapters that
   under-extracted or evaluated poorly.

Also: `docs/advanced-usage.md` gets a new "Systematic processing of long
texts" section with the recommended flag combo and the tradeoff note on
cost.

11 new unit tests cover the chapters command (text/json/no-sources), the
process flag wiring (help + provider requirement), and the commit-body
bucket logic. Full infospace+llm unit suite (315 tests) green; 3
pre-existing infospace failures unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
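
Reviewer sketch (not part of the commit): the bucket counting from item 1 and the new-entity set diff from item 2 could look roughly like the following. This is a minimal illustration, not the code in `SourcePipeline._git_commit`. The function names, the `output/` root for every bucket, and the idea of feeding it paths from something like `git diff --cached --name-only --diff-filter=A` are all assumptions.

```python
from collections import Counter
from pathlib import PurePosixPath

# Bucket names come from the commit message; their order here is illustrative.
BUCKETS = (
    "entities", "evaluations", "classifications",
    "mappings", "analyses", "metrics", "logs",
)


def bucket_counts(added_paths, output_root="output"):
    """Count added files per output subdirectory (hypothetical helper).

    Paths are expected to look like '<output_root>/<bucket>/<file>';
    anything else is ignored.
    """
    counts = Counter()
    for path in added_paths:
        parts = PurePosixPath(path).parts
        if len(parts) >= 3 and parts[0] == output_root and parts[1] in BUCKETS:
            counts[parts[1]] += 1
    return counts


def format_commit_body(counts):
    """Render non-zero counts as 'entities: +23, evaluations: +23'."""
    return ", ".join(
        f"{name}: +{counts[name]}" for name in BUCKETS if counts[name]
    )


def new_slugs(before, after):
    """Entity slugs present after a source ran but not before it."""
    return sorted(set(after) - set(before))
```

An empty result from `format_commit_body` would correspond to the "no output changed" case, where the message says the original commit text is kept.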
2026-04-22 08:24:26 +02:00
parent 9e8d73fa7d
commit e3e5b8ecc1
4 changed files with 501 additions and 4 deletions
--- a/examples/infospace-with-history/docs/advanced-usage.md
+++ b/examples/infospace-with-history/docs/advanced-usage.md
@@ -171,6 +171,57 @@ you need to look at, rather than a bare ratio.

 ---

+## 5. Systematic processing of long texts
+
+For long source material (books, multi-chapter specifications, corpora), the
+pipeline can produce a clean chapter-by-chapter git history on its own if
+you let it. The pattern:
+
+```bash
+# Process all sources in canonical order, eval and classify per chapter,
+# snapshot metrics after each chapter.
+markitect infospace process --all \
+    --provider openrouter \
+    --eval-after-source \
+    --classify-after-source \
+    --check-after-each
+```
+
+What you get:
+
+- **One commit per source file**, not per batch run. The commit message body
+  lists counts by bucket (`entities: +23`, `evaluations: +23`,
+  `classifications: +23`) derived from the actual staged diff, so `git log`
+  reads like the story of the infospace growing.
+- **Chapter-atomic commits.** `--eval-after-source` and
+  `--classify-after-source` evaluate and classify *only the new entities*
+  from the just-processed source before the commit lands, so each commit is
+  a self-contained chapter snapshot.
+- **Metrics-per-chapter trail.** `--check-after-each` appends a snapshot to
+  `output/metrics/history.yaml` after every chapter, so `markitect infospace
+  history` later shows the metric trajectory rather than just start/end.
+
+**Cost tradeoff.** `--eval-after-source` pays LLM latency per chapter rather
+than amortising it across one bulk batch. It's worth it when you care about
+the git history or want early quality signal, not when you're bulk-backfilling
+a known-good corpus.
+
+**Triage during the run.** While processing, use `markitect infospace
+chapters` in another shell to see per-source entity/eval/classify counts and
+mean scores — handy for spotting chapters that under-extracted or evaluated
+poorly.
+
+```
+$ markitect infospace chapters
+source               entities  evaluated  classified  mean_score
+-------------------  --------  ---------  ----------  ----------
+book-1-chapter-01    96        96         79          4.22
+book-1-chapter-02    16        16         10          4.06
+…
+```
+
+---
+
 ## See also

 - `METRICS-METHODOLOGY.md` — how each metric is computed.