markitect-main/examples/infospace-with-history
tegwick 7c38f9b427 merge(reprocess-v2): complete pipeline rewrite and full corpus processing
Merges the reprocess-v2 branch into main, covering:

Infrastructure changes:
- markitect infospace process — new CLI command for batch source processing
- SourcePipeline — @{macro} substitution, skip-if-exists, git commit per source
- PipelineStage config extended with name, output_dir, output_macro,
  split_entities, macros, max_tokens fields
- Per-stage max_tokens (extract=8k, map-to-vsm=10k, synthesize=4k)
- LLM provenance comment in each new entity file
- output/processing-log.yaml with per-source token/cost/duration/retry stats
- Retry on all LLM errors (not just rate limits) with 5s back-off
- C2 coverage: add domain_densities, density_std, cross_cutting_ratio

Example (infospace-with-history):
- All 35 chapters processed: 1021 entities across Books 1–5
- Per-chapter git commits showing metric evolution from 0 → final state
- Final metrics: coverage=0.44, granularity=2.95, redundancy=0.006
- METRICS-METHODOLOGY.md C2 section corrected and expanded

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:11:39 +01:00

This example is a tutorial and reference experiment that shows how to set up a viable infospace with history using markitect.

The task is to capture the knowledge in Adam Smith's The Wealth of Nations, whose original text is available digitally as a public-domain transcript, and to transform and extend it into a consistent and complete collection of concepts and entities, viewed from a systems-theoretical perspective based on Stafford Beer's Viable System Model.

The tutorial should explain how to use schemas as scaffolding for structuring the necessary information entities, and how to define a set of prompts and instructions that use the prompt dependency resolution infrastructure to inject the chapters of the book incrementally.

The information space should keep its changes as git history and define metrics for completeness and consistency.
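
To make the idea of completeness and consistency metrics concrete, here is a minimal sketch. It is not markitect's implementation: the function names, the `S1`..`S5` labels and the toy data are hypothetical, and the two formulas are simple stand-ins chosen for illustration. The definitions actually used in the experiment are documented in METRICS-METHODOLOGY.md.

```python
from collections import Counter

# Hypothetical category labels for the five systems of Beer's Viable System Model.
VSM_SYSTEMS = ("S1", "S2", "S3", "S4", "S5")


def coverage(entity_systems):
    """Completeness proxy: fraction of VSM systems with at least one entity."""
    populated = {s for s in entity_systems if s in VSM_SYSTEMS}
    return len(populated) / len(VSM_SYSTEMS)


def redundancy(entity_names):
    """Consistency proxy: fraction of entities whose normalized name repeats an earlier one."""
    if not entity_names:
        return 0.0
    counts = Counter(name.strip().lower() for name in entity_names)
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(entity_names)


# Toy data, invented for illustration only.
systems = ["S1", "S1", "S3", "S4"]
names = ["Division of Labour", "division of labour", "Market Price", "Stock"]

print(coverage(systems))   # 3 of 5 systems populated
print(redundancy(names))   # 1 duplicate among 4 names
```

Recomputing such values after each chapter and committing the result is one way the git history could show how the metrics evolve as the corpus grows.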

While the experiment is running, no changes may be made to the markitect infrastructure.

If a need for optimization or error fixes arises, a list of corresponding tasks should be generated. That list will be used to improve the markitect infrastructure, after which the experiment is rerun, so that tooling and infospace are optimized iteratively over time.

--worsch, 10th Feb. 2026