# Building an Infospace with History: Tutorial

This tutorial walks through how to build a structured **infospace** from Adam Smith's *The Wealth of Nations*, mapping classical economic concepts to Stafford Beer's **Viable System Model** (VSM), using MarkiTect's infospace tooling.

By the end you will understand how to:

1. Declare an infospace with `infospace.yaml` and `markitect infospace init`
2. Design schemas that scaffold structured LLM output
3. Write prompt templates with dependency injection (`@{macro}` syntax)
4. Run an incremental, chapter-by-chapter pipeline
5. Evaluate entity quality and run collection-level checks
6. Review viability against declared thresholds
7. Track every change through git history
8. Use a completed infospace as a discipline for a new project

---

## 1. What Is an Infospace?

An **infospace** is a curated, self-describing collection of **entities** (concepts, mechanisms, observations) that together explain a **topic** through the lens of one or more **disciplines**.

| Term | This example |
|---|---|
| Topic | *The Wealth of Nations* (Smith, 1776) |
| Discipline | Viable System Model (Beer) |
| Entities | Economic concepts: division of labour, natural price, … |
| Viability | Does the entity set answer the competency questions? |

The challenge with a large source corpus is that it is too big for a single prompt. MarkiTect processes it **incrementally**, one chapter at a time, building up the entity set and tracking progress through git.

An infospace is **viable** when it meets threshold scores across defined metrics, meaning it is fit for purpose as an explanatory tool.

---

## 2. Project Layout

```
examples/infospace-with-history/
│
├── infospace.yaml              # Declarative infospace configuration (NEW)
├── README.md
├── TUTORIAL.md                 # This file
├── INFRA-TASKS.md              # Infrastructure issues found during the experiment
├── process_chapters.py         # Pipeline script (chapter processing)
├── infospace.db                # SQLite artifact database (generated, not in git)
│
├── schemas/                    # Output structure definitions
│   ├── economic-entity-schema-v1.0.md
│   ├── vsm-concept-schema-v1.0.md
│   ├── vsm-mapping-schema-v1.0.md
│   └── chapter-analysis-schema-v1.0.md
│
├── templates/                  # Prompt templates (with @{macro} placeholders)
│   ├── extract-entities.md
│   ├── map-to-vsm.md
│   ├── synthesize-analysis.md
│   └── assess-metrics.md
│
├── artifacts/                  # Input artifacts
│   ├── sources/                # Chapter text (35 files)
│   ├── guidelines/             # Extraction and mapping rules
│   └── vsm-reference/          # VSM framework definition
│
└── output/                     # Generated artifacts (LLM outputs)
    ├── entities/               # Flat canonical entity set + chapter views
    │   ├── division-of-labour.md          # Canonical entity file (PRIMARY)
    │   ├── exchange.md
    │   ├── book-1-chapter-01-entities.md  # Chapter view (transclusion)
    │   └── ...
    ├── mappings/               # Per-chapter VSM mappings
    ├── analyses/               # Per-chapter synthesised analyses
    └── metrics/                # Collection metrics + history
        ├── metrics.yaml        # Latest metric values
        └── history.yaml        # Timestamped snapshot log
```

**Entity organisation**: The infospace maintains a **flat canonical set** of entities, one markdown file per entity in `output/entities/`. Duplicate slugs across chapters are skipped (first occurrence wins). Per-chapter `*-entities.md` files are **secondary views** using transclusion directives (`{{ include "entity.md" }}`), so editing a canonical file updates every chapter view that references it.

---

## 3. Initialising an Infospace

### Starting fresh

Use `markitect infospace init` to create an `infospace.yaml`:

```bash
cd my-new-infospace/
markitect infospace init \
  --topic "The Wealth of Nations" \
  --domain "Classical Economics" \
  --sources artifacts/sources/ \
  --discipline "Viable System Model"
```

This creates a minimal `infospace.yaml`. Edit it to add schemas, competency questions, and viability thresholds:

```yaml
topic:
  name: "The Wealth of Nations"
  domain: "Classical Economics"
  sources: artifacts/sources/

disciplines:
  - name: "Viable System Model"
    path: artifacts/vsm-reference/

schemas:
  entity: schemas/economic-entity-schema-v1.0.md
  mapping: schemas/vsm-mapping-schema-v1.0.md
  analysis: schemas/chapter-analysis-schema-v1.0.md

competency_questions: |
  1. How does Smith's division of labour map to VSM System 1 operations?
  2. What mechanisms in WoN correspond to VSM coordination (System 2)?
  3. Where does Smith describe self-organising regulation (System 3)?
  4. What role does the "invisible hand" play as a System 4 mechanism?
  5. How do Smith's views on government map to System 5 policy?
  6. Is the WoN entity set viable as an explanatory framework?

viability:
  redundancy_ratio: { max: 0.10 }
  coverage_ratio: { min: 0.50 }
  coherence_components: { max: 3 }
  consistency_cycles: { max: 0 }
  granularity_entropy: { min: 1.0 }

pipeline:
  stages:
    - name: extract-entities
      template: templates/extract-entities.md
    - name: map-to-vsm
      template: templates/map-to-vsm.md
    - name: synthesize-analysis
      template: templates/synthesize-analysis.md
```

### Checking status

At any point, inspect the infospace:

```bash
markitect infospace status
# Infospace: The Wealth of Nations
# Domain: Classical Economics
# Entities: 109
# Domains: Production, Distribution, Exchange, Regulation
# Disciplines: Viable System Model
# Chapters: 9/35 processed

markitect infospace entities
# Lists all entities with domain, source chapter, word count
```

---

## 4. Designing Schemas

Before writing any prompts, define **schemas**: markdown documents that tell the LLM exactly what sections each output must contain. Schemas are not code; the LLM reads them as instructions.

### Economic Entity Schema (`schemas/economic-entity-schema-v1.0.md`)

Every extracted entity must have:

- **H1 heading** with the entity name (title case)
- **Definition** (20–150 words, precise and non-circular)
- **Source Chapter** citing Book and Chapter
- **Context**: where in Smith's argument the entity appears
- **Economic Domain** (Production, Distribution, Exchange, etc.)

Optional: Smith's Original Wording, Modern Interpretation.

### VSM Mapping Schema (`schemas/vsm-mapping-schema-v1.0.md`)

Every entity-to-VSM mapping must have:

- **H1 heading**: `Entity Name -> VSM Concept Name`
- **Economic Entity Reference** and **VSM Concept Reference**
- **Mapping Rationale** (minimum 30 words, grounded in Beer's definitions)
- **Mapping Strength**: Strong, Moderate, or Weak

### Chapter Analysis Schema (`schemas/chapter-analysis-schema-v1.0.md`)

The per-chapter synthesis includes:

- **Chapter Summary** (50–300 words)
- **Entities Extracted**: bulleted list
- **VSM Mappings**: entity, concept, strength
- **VSM Coverage**: explicit assessment of S1 through S5 and S3*
- **Gaps & Observations**

**Key insight**: Schemas are artifacts. They live in the repository and can be versioned, diffed, and refined just like code. Improving a schema and re-processing a chapter is visible as a git diff.

---

## 5. Writing Prompt Templates

Each template is a markdown file with `@{macro_name}` placeholders that MarkiTect's resolver fills with artifact content at compile time.

### Template 1: Extract Entities (`templates/extract-entities.md`)

```markdown
# Extract Economic Entities

You are an analytical economist specialising in classical economic theory.
Your task is to extract distinct economic entities from a chapter of
Adam Smith's *The Wealth of Nations*.

## Source Chapter
@{chapter_text}

## Extraction Guidelines
@{extraction_rules}

## VSM Framework Context
@{vsm_framework}

## Existing Entities
@{existing_entities}

## Output Format
Output each entity delimited by `--- ENTITY: ---` markers.
```

The `@{existing_entities}` macro is generated at runtime from canonical files already on disk, enabling incremental extraction without duplication.

### Template 2: Map to VSM (`templates/map-to-vsm.md`)

Inputs: `@{entities}`, `@{vsm_framework}`, `@{mapping_rules}`.

### Template 3: Synthesise Analysis (`templates/synthesize-analysis.md`)

Inputs: `@{chapter_text}`, `@{entities}`, `@{mappings}`, `@{vsm_framework}`.

### Template 4: Assess Metrics (`templates/assess-metrics.md`)

Inputs: `@{all_analyses}` (all chapter analyses concatenated), `@{vsm_framework}`. Runs across the entire infospace, not per-chapter.

**Dependency chain per chapter:**

```
chapter_text ──────┐
extraction_rules ──┤
vsm_framework ─────┤
                   ▼
            extract-entities
                   │
                   ▼  entities
               map-to-vsm
                   │
                   ▼  mappings
          synthesize-analysis
                   │
                   ▼  analysis
```

---

## 6. Populating Artifacts

### Source chapters (`artifacts/sources/`)

35 markdown files with the full public-domain text of each chapter. Named `book-1-chapter-01.md` through `book-5-chapter-03.md`.

### Guidelines (`artifacts/guidelines/`)

- **`extraction-rules.md`**: What constitutes an entity, granularity rules, naming conventions.
- **`mapping-rules.md`**: How to map entities to VSM systems, what constitutes Strong/Moderate/Weak strength.

### VSM reference (`artifacts/vsm-reference/`)

- **`vsm-framework.md`**: Complete description of Beer's VSM (S1–S5, S3*, recursion, variety, viability, algedonic signals, autonomy) with economic interpretations.

---

## 7. Processing Chapters

`process_chapters.py` orchestrates the three-stage pipeline. It initialises the artifact repository, loads static artifacts, runs entity extraction → VSM mapping → analysis synthesis, and commits each chapter to git.
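The three-stage flow, where each stage's output is bound as a macro for the next stage's template, can be sketched in a few lines of Python. This is an illustrative sketch only, not the code in `process_chapters.py` or MarkiTect's resolver: the function names `resolve` and `run_chapter`, the `llm` callable, and the regex-based substitution are all assumptions made for the example.

```python
import re
from typing import Callable

# Matches @{macro_name} placeholders as used in the prompt templates.
MACRO = re.compile(r"@\{(\w+)\}")


def resolve(template: str, artifacts: dict[str, str]) -> str:
    """Replace every @{name} placeholder with the matching artifact text."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in artifacts:
            raise KeyError(f"no artifact bound to macro @{{{name}}}")
        return artifacts[name]
    return MACRO.sub(substitute, template)


def run_chapter(
    templates: dict[str, str],
    artifacts: dict[str, str],
    llm: Callable[[str], str],
) -> dict[str, str]:
    """Run the three stages in order; each output becomes a macro for the next."""
    bound = dict(artifacts)  # copy so the caller's static artifacts stay untouched
    stages = [
        ("extract-entities", "entities"),
        ("map-to-vsm", "mappings"),
        ("synthesize-analysis", "analysis"),
    ]
    for stage, output_name in stages:
        prompt = resolve(templates[stage], bound)  # hypothetical compile step
        bound[output_name] = llm(prompt)           # hypothetical provider call
    return {name: bound[name] for _, name in stages}
```

Failing loudly on an unbound macro is a design choice for the sketch: a prompt sent with a literal `@{...}` left in it is harder to notice than an exception.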
### Single chapter

```bash
# Manual mode (writes prompts, awaits output files):
python process_chapters.py --chapter book-1-chapter-05 --no-commit

# Auto mode via OpenRouter (free models available):
python process_chapters.py --chapter book-1-chapter-05 --provider openrouter

# With a specific free model:
python process_chapters.py --chapter book-1-chapter-05 \
  --provider openrouter --model meta-llama/llama-4-maverick:free
```

### Whole book or all chapters

```bash
python process_chapters.py --book 1 --provider openrouter
python process_chapters.py --all --provider openrouter
```

### Check progress

```bash
python process_chapters.py --list
```

```
Available chapters (35):

  Chapter                        Entities     Mappings     Analysis
  ------------------------------ ------------ ------------ ------------
  book-1-chapter-01              done (13)    done         done
  book-1-chapter-02              done (7)     done         done
  ...

Canonical entity set: 109 unique entities
```

### Entity lifecycle

Entities in the canonical set are **never silently deleted**. Retire an entity by archiving it with a documented reason:

```bash
python process_chapters.py --archive-entity enlarged-monopoly \
  --reason "Subsumed by monopoly-price — same market distortion"
```

The archived file moves to `output/entities/archive/.md` with a dated header, preserving the intellectual history of every decision.

---

## 8. Evaluating Entity Quality

Once chapters are processed, evaluate the entity set using the infospace tooling commands.

### Per-entity evaluation

```bash
# Evaluate all entities (requires LLM provider):
markitect infospace evaluate --provider openrouter

# Evaluate entities from a specific chapter:
markitect infospace evaluate --chapter book-1-chapter-05 --provider openrouter

# Re-evaluate a single entity:
markitect infospace evaluate --entity division-of-labour --provider openrouter
```

This runs the `evaluate-entity` prompt template against each entity, scoring dimensions like definition precision, source grounding, and VSM relevance.
Results are written to `output/evaluations/`.

### Collection-level checks (C1–C5)

```bash
# Run all five collection checks:
markitect infospace check --provider openrouter

# Run individual checks:
markitect infospace check redundancy    # C1: Are any entities synonymous?
markitect infospace check coverage      # C2: Which domain × VSM cells are empty?
markitect infospace check coherence     # C3: Is the entity graph well-connected?
markitect infospace check consistency   # C4: Are there circular definitions?
markitect infospace check granularity   # C5: Is abstraction level balanced?
```

Each check uses the platform's embedding, graph analysis, and FCA infrastructure. Results are written to `output/metrics/` and a new snapshot is appended to `metrics-history.yaml`.

Sample output:

```
Running collection checks on 109 entities...

  C1 — redundancy
    redundancy_ratio: 0.0183
    high_similarity_pairs: 2
  C2 — coverage
    coverage_ratio: 0.4286
    empty_cells: [['Regulation', 'S3*'], ['Historical', 'S5']]
  C3 — coherence
    coherence_components: 1
    modularity: 0.412
  C4 — consistency
    consistency_cycles: 0
    grounding_ratio: 0.94
  C5 — granularity
    granularity_entropy: 2.69
```

---

## 9. Reviewing Viability

```bash
markitect infospace viability
```

Compares the latest metrics against the thresholds declared in `infospace.yaml`:

```
Metric                     Value        Threshold    Status
-----------------------------------------------------------
redundancy_ratio           0.0183       max=0.10     PASS
coverage_ratio             0.4286       min=0.50     FAIL
coherence_components       1            max=3        PASS
consistency_cycles         0            max=0        PASS
granularity_entropy        2.6900       min=1.0      PASS

Viable: NO (4/5 thresholds met)
```

Coverage is currently failing (0.4286, about 43%, against a 50% minimum) because only 9 of 35 chapters have been processed. Coverage should rise as more chapters are processed.
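The threshold comparison itself is a small piece of logic. The sketch below reproduces it with the values shown in this section; it is not MarkiTect's implementation, and the function name `check_viability` and the dictionary shapes are assumptions chosen to mirror the `viability:` block in `infospace.yaml`.

```python
def check_viability(metrics, thresholds):
    """Return (name, value, passed) for every declared threshold."""
    rows = []
    for name, bound in thresholds.items():
        value = metrics[name]
        passed = True
        if "min" in bound and value < bound["min"]:
            passed = False
        if "max" in bound and value > bound["max"]:
            passed = False
        rows.append((name, value, passed))
    return rows


# Thresholds copied from the viability block of infospace.yaml.
thresholds = {
    "redundancy_ratio": {"max": 0.10},
    "coverage_ratio": {"min": 0.50},
    "coherence_components": {"max": 3},
    "consistency_cycles": {"max": 0},
    "granularity_entropy": {"min": 1.0},
}

# Metric values copied from the sample check output.
metrics = {
    "redundancy_ratio": 0.0183,
    "coverage_ratio": 0.4286,
    "coherence_components": 1,
    "consistency_cycles": 0,
    "granularity_entropy": 2.69,
}

rows = check_viability(metrics, thresholds)
met = sum(passed for _, _, passed in rows)
verdict = "YES" if met == len(rows) else "NO"
print(f"Viable: {verdict} ({met}/{len(rows)} thresholds met)")
# prints: Viable: NO (4/5 thresholds met)
```

Bounds are inclusive in this sketch, which is why `consistency_cycles` at 0 passes a `max` of 0.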
### Metrics history

```bash
markitect infospace history
```

Shows how metrics evolved across runs:

```
Snapshot   Date         Entities   coverage   redundancy   entropy
-------------------------------------------------------------
6ba48eb2   2026-02-19   85         0.361      0.000        2.687
```

---

## 10. Tracking History with Git

Every processed chapter produces one git commit containing:

- Compiled prompts (`*-prompt.md`): audit what was sent to the LLM
- Canonical entity files (`output/entities/.md`): first occurrence wins
- Chapter entity views (`-entities.md`): transclusion references
- Generated outputs (`*-mappings.md`, `*-analysis.md`)

This means:

- `git log` shows the chronological order of processing
- `git diff` between commits shows what each chapter contributed
- You can `git bisect` to find where quality degraded
- You can revert a chapter and re-process with improved guidelines

The `clean-example-history` branch in this repository demonstrates the intended structure: each chapter is a single, self-contained commit. Use it as a reference for how the infospace grew step by step.

To commit manually after reviewing:

```bash
python process_chapters.py --chapter book-1-chapter-05 --provider openrouter --no-commit
# review output/entities/ and output/mappings/
git add examples/infospace-with-history/output/
git commit -m "infospace: process book-1-chapter-05"
```

---

## 11. Cost and Performance

| | OpenRouter (free) | OpenRouter (paid) | Gemini (free) |
|---|---|---|---|
| Time per chapter | ~5 min | ~2 min | ~45 sec |
| Cost per chapter | $0.00 | ~$0.07 | $0.00 |
| Default model | `arcee-ai/trinity-large-preview:free` | `anthropic/claude-sonnet-4` | `gemini-2.5-flash` |
| Rate limits | ~200 req/day | High | Per-minute |

**OpenRouter free tier**: Sign up at [openrouter.ai](https://openrouter.ai) (no credit card required). Store your key in `apikey-openrouter.txt` in the project root (git-ignored), or set `OPENROUTER_API_KEY`.
```bash
export OPENROUTER_API_KEY=$(cat apikey-openrouter.txt | tr -d '[:space:]')
```

Use `openrouter/free` to automatically select from whichever free model is available:

```bash
python process_chapters.py --chapter book-1-chapter-05 \
  --provider openrouter --model openrouter/free
```

**Gemini free tier**: Get a key at [aistudio.google.com/apikey](https://aistudio.google.com/apikey), store in `apikey-geminifree.txt`.

Note: The `claude-code` provider (Claude CLI subprocess) is not available when running inside a Claude Code session due to nested session restrictions.

---

## 12. Completing the Remaining Chapters

As of writing, 9 of 35 chapters are processed (Book I, Chapters 1–9).

**Process Book I remainder:**

```bash
export OPENROUTER_API_KEY=$(cat apikey-openrouter.txt | tr -d '[:space:]')
git checkout clean-example-history
python process_chapters.py --book 1 --provider openrouter
```

Already-processed chapters are skipped because their chapter view files exist. The `@{existing_entities}` macro ensures the LLM only extracts genuinely new entities.

**Process Books II–V:**

```bash
python process_chapters.py --book 2 --provider openrouter
python process_chapters.py --book 3 --provider openrouter
python process_chapters.py --book 4 --provider openrouter
python process_chapters.py --book 5 --provider openrouter
```

**Run collection checks after each book:**

```bash
markitect infospace check --provider openrouter
markitect infospace viability
```

**Expected progression:**

| After | Chapters | Expected coverage |
|-------|----------|-------------------|
| Book I (11 ch.) | 11/35 | S1, S2, S4 strong; S3 emerging |
| Books I–II (16 ch.) | 16/35 | S3 (capital control) covered |
| Books I–III (20 ch.) | 20/35 | Historical patterns add depth |
| Books I–IV (30 ch.) | 30/35 | S5 (policy, mercantilism) emerging |
| All (35 ch.) | 35/35 | Full coverage; S3* and algedonic signals from Book V |

---

## 13. Using the Infospace as a Discipline

A completed, viable infospace can itself become a **discipline**: a lens applied to a new topic. For example, the Wealth of Nations infospace could be applied to analyse a modern supply chain.

```bash
# In a new infospace directory:
markitect infospace init \
  --topic "Modern Supply Chain Management" \
  --domain "Operations Research" \
  --discipline "Wealth of Nations"

# Bind the WoN infospace as a discipline:
markitect infospace bind-discipline ../infospace-with-history

# List bound disciplines and their viability:
markitect infospace disciplines
# Viable System Model    PASS  (from vsm-reference/)
# Wealth of Nations      PASS  (from ../infospace-with-history)

# Check for stale mappings after discipline update:
markitect infospace stale-mappings
```

The discipline infospace must be viable (meeting its own thresholds) before it can be used as a lens. If the discipline's entities change, dependent mappings are flagged for re-evaluation.

---

## 14. Quality Improvement Loop

The infospace is designed to be **iteratively refined**:

1. **Process chapters**: run the pipeline
2. **Evaluate**: `markitect infospace evaluate --provider openrouter`
3. **Check**: `markitect infospace check --provider openrouter`
4. **Review viability**: `markitect infospace viability`
5. **Refine guidelines**: update `extraction-rules.md` or `mapping-rules.md` to address identified weaknesses
6. **Re-process**: delete output files for specific chapters and re-run
7. **Compare**: `git diff` shows how refined guidelines changed the output

Example: if checks show S3* (Audit) is consistently missing, add a paragraph to `extraction-rules.md` explicitly asking the LLM to look for audit, inspection, and oversight mechanisms.
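One way to spot a system that is missing everywhere, as in the S3* example above, is to lay the mappings out on the domain × VSM grid that the coverage check reports on. The sketch below is a hedged illustration, not the platform's C2 implementation: the function names, the sample data, and the definition of the ratio as filled cells divided by all cells are assumptions.

```python
from itertools import product


def coverage(mapped_cells, domains, systems):
    """Return (ratio of filled cells, sorted list of empty cells)."""
    grid = set(product(domains, systems))
    filled = grid & set(mapped_cells)
    empty = sorted(grid - filled)
    return len(filled) / len(grid), empty


def missing_systems(empty_cells, domains):
    """VSM systems that have no mapping in any domain."""
    counts = {}
    for _, system in empty_cells:
        counts[system] = counts.get(system, 0) + 1
    return sorted(s for s, n in counts.items() if n == len(domains))


# Hypothetical data: each tuple is (economic domain, VSM system) of one mapping.
domains = ["Production", "Exchange"]
systems = ["S1", "S2", "S3*"]
mapped = [("Production", "S1"), ("Exchange", "S1"), ("Exchange", "S2")]

ratio, empty = coverage(mapped, domains, systems)
print(ratio)                            # 0.5
print(missing_systems(empty, domains))  # ['S3*']
```

A system that shows up in `missing_systems` is a candidate for a targeted addition to `extraction-rules.md`, whereas a single empty cell may only reflect chapters not yet processed.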

To re-process a specific chapter:

```bash
rm -f output/entities/book-1-chapter-03-entities.md
rm -f output/mappings/book-1-chapter-03-mappings.md
rm -f output/analyses/book-1-chapter-03-analysis.md
python process_chapters.py --chapter book-1-chapter-03 --provider openrouter
```

Never silently delete canonical entity files. Archive them instead:

```bash
python process_chapters.py --archive-entity extent-of-the-market \
  --reason "Subsumed by market-price and effectual-demand"
python process_chapters.py --chapter book-1-chapter-03 --provider openrouter
```

---

## 15. The Artifact Database (`infospace.db`)

The pipeline stores all artifacts and dependency edges in a local SQLite database, `infospace.db`. This file is **not committed to git** because it is fully derived from the markdown files that are tracked.

To regenerate it after a fresh clone (no LLM calls needed):

```bash
python process_chapters.py --all --no-commit
```

---

## 16. Adapting This Pattern to Your Own Project

To build your own infospace:

1. `markitect infospace init --topic "..." --domain "..." --discipline "..."`
2. Write schemas defining required sections for each output type
3. Write extraction guidelines that tell the LLM what to look for
4. Create prompt templates using `@{macro}` syntax
5. Populate `artifacts/sources/` with your source corpus
6. Run `process_chapters.py` (or your equivalent pipeline script)
7. Evaluate with `markitect infospace evaluate` and `check`
8. Review `markitect infospace viability` against your thresholds
9. Iterate: refine guidelines, re-process, re-evaluate
10. Once viable, use as a discipline for a new infospace

The key insight is that **schemas and guidelines are artifacts**: they live in the repository and can be versioned and diffed just like code. Every refinement decision is traceable through git history.
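If you write your own pipeline script for step 6, the part most worth reproducing is the canonical-set rule from the project layout: first occurrence wins, and chapter views only reference canonical files. The following is a minimal sketch under stated assumptions, not the logic in `process_chapters.py`: the function name, the `extracted` mapping of slug to markdown body, and the choice to list skipped duplicates in the chapter view are all hypothetical.

```python
from pathlib import Path


def write_chapter_entities(
    entities_dir: Path, chapter: str, extracted: dict[str, str]
) -> list[str]:
    """Write new canonical files and a chapter view; return skipped slugs."""
    entities_dir.mkdir(parents=True, exist_ok=True)
    skipped = []
    for slug, body in extracted.items():
        canonical = entities_dir / f"{slug}.md"
        if canonical.exists():
            skipped.append(slug)  # first occurrence wins: never overwrite
        else:
            canonical.write_text(body, encoding="utf-8")
    # The chapter view holds only transclusion directives, one per entity,
    # so edits to a canonical file show up in every view that includes it.
    view = "\n".join(f'{{{{ include "{slug}.md" }}}}' for slug in extracted)
    view_path = entities_dir / f"{chapter}-entities.md"
    view_path.write_text(view + "\n", encoding="utf-8")
    return skipped
```

Returning the skipped slugs gives the caller something to log or review, which fits the tutorial's preference for traceable decisions over silent ones.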