Three coordinated changes that let the pipeline produce a clean chapter-by-chapter git history on long texts without archaeology after the fact.

1. Richer commit messages. `SourcePipeline._git_commit` now diffs the staged changes, buckets added files by output subdirectory (entities, evaluations, classifications, mappings, analyses, metrics, logs), and includes counts in the commit body. So `git log` reads "entities: +23, evaluations: +23" per chapter instead of the same generic blurb on every commit. Zero behaviour change when no output changed; falls back to the original message if the diff query fails.

2. --eval-after-source / --classify-after-source on `infospace process`. After a source's stages succeed, the pipeline identifies which entity files are *new* (set diff of entity slugs before vs after), loads their EntityMeta, and runs per-entity evaluation and/or classification scoped to just those slugs before the per-source git commit lands. Result: each chapter's commit is self-contained: extraction + evaluation + classification in one atomic unit. Gated behind explicit flags because the cost is real (LLM latency per chapter rather than amortised across one bulk batch).

3. `markitect infospace chapters` subcommand. Lists source files in canonical order with entity count, evaluated count, classified count, and mean per-entity score per source. Text or JSON output. Natural triage surface for long-text infospaces: spot chapters that under-extracted or evaluated poorly.

Also: `docs/advanced-usage.md` gets a new "Systematic processing of long texts" section with the recommended flag combo and the tradeoff note on cost.

11 new unit tests cover the chapters command (text/json/no-sources), the process flag wiring (help + provider requirement), and the commit-body bucket logic. Full infospace+llm unit suite (315 tests) green; 3 pre-existing infospace failures unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
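The commit-body bucketing described in change 1 of the commit message above could look roughly like the following. This is a minimal sketch, not the actual `SourcePipeline._git_commit` code: the function names `bucket_added_files` and `commit_body`, the `output` root directory, and the example paths are all hypothetical; only the bucket names and the "entities: +23" style come from the commit message.

```python
from collections import Counter
from pathlib import PurePosixPath

# Bucket names as listed in the commit message; the order here drives the output order.
BUCKETS = ("entities", "evaluations", "classifications", "mappings",
           "analyses", "metrics", "logs")


def bucket_added_files(added_paths, output_root="output"):
    """Count added files per known output subdirectory.

    `output_root` is an assumed layout, not markitect's documented one.
    Paths outside the root or outside a known bucket are ignored.
    """
    root_parts = PurePosixPath(output_root).parts
    counts = Counter()
    for raw in added_paths:
        parts = PurePosixPath(raw).parts
        if parts[:len(root_parts)] != root_parts:
            continue  # not under the output root
        rest = parts[len(root_parts):]
        # Need at least "<bucket>/<file>" below the root.
        if len(rest) >= 2 and rest[0] in BUCKETS:
            counts[rest[0]] += 1
    return counts


def commit_body(counts):
    """Render non-empty buckets as 'entities: +2, evaluations: +1'."""
    return ", ".join(f"{b}: +{counts[b]}" for b in BUCKETS if counts[b])


if __name__ == "__main__":
    added = [
        "output/entities/division-of-labour.md",
        "output/entities/market-price.md",
        "output/evaluations/division-of-labour.json",
        "README.md",
    ]
    print(commit_body(bucket_added_files(added)))
```

One way to obtain the list of added paths is `git diff --cached --name-only --diff-filter=A`; whether the pipeline uses exactly that query is an assumption. An empty result yields an empty body, which matches the stated "zero behaviour change when no output changed".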
This example serves as a tutorial and reference experiment for setting up a viable infospace with history using markitect.
The task is to capture the knowledge in Adam Smith's The Wealth of Nations, which is available digitally in the public domain as a transcript of the original text, and to transform and extend it into a consistent and complete collection of concepts and entities from a systems-theoretic point of view, based on Stafford Beer's Viable System Model.
The tutorial should explain how to use schemas as scaffolding for structuring the necessary information entities, and how to define a set of prompts and instructions that use the prompt dependency resolution infrastructure to incrementally inject chapters of the book.
The information space should use the option of keeping changes as git history, and it should define metrics for completeness and consistency.
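The brief does not say how completeness and consistency are to be measured, so the following is only one possible reading, offered as a sketch. It assumes completeness means the share of chapters that yielded at least one entity, and consistency means the share of cross-references between entities that resolve to a known entity slug. Both definitions, the function names, and the input shapes are assumptions for illustration and are not part of markitect.

```python
def completeness(entities_per_chapter):
    """Share of chapters with at least one extracted entity (assumed definition).

    entities_per_chapter: dict mapping chapter name -> entity count.
    """
    if not entities_per_chapter:
        return 0.0
    covered = sum(1 for n in entities_per_chapter.values() if n > 0)
    return covered / len(entities_per_chapter)


def consistency(references):
    """Share of cross-references that point at a known entity (assumed definition).

    references: dict mapping entity slug -> list of slugs it refers to.
    An infospace with no references at all is treated as fully consistent.
    """
    known = set(references)
    total = sum(len(targets) for targets in references.values())
    if total == 0:
        return 1.0
    resolved = sum(
        1 for targets in references.values() for t in targets if t in known
    )
    return resolved / total


if __name__ == "__main__":
    print(completeness({"book1-ch01": 12, "book1-ch02": 0}))
    print(consistency({"labour": ["capital", "rent"], "capital": []}))
```

Tracking such values per commit would let the git history show whether each injected chapter moved the infospace toward or away from the stated goal.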
While the experiment is running, no changes may be made to the markitect infrastructure.
If a need for optimization or error fixes arises, a list of corresponding tasks should be generated. This list will be used to improve the markitect infrastructure, after which the experiment is rerun, so that tooling and infospace improve iteratively over time.
--worsch, 10th Feb. 2026