
# Building an Infospace with History — Tutorial

This tutorial walks through how to build a structured **infospace** from
Adam Smith's *The Wealth of Nations*, mapping classical economic concepts
to Stafford Beer's **Viable System Model** (VSM), using MarkiTect's
infospace tooling.

By the end you will understand how to:

1. Declare an infospace with `infospace.yaml` and `markitect infospace init`
2. Design schemas that scaffold structured LLM output
3. Write prompt templates with dependency injection (`@{macro}` syntax)
4. Run an incremental, chapter-by-chapter pipeline
5. Evaluate entity quality and run collection-level checks
6. Review viability against declared thresholds
7. Track every change through git history
8. Use a completed infospace as a discipline for a new project

---

## 1. What Is an Infospace?

An **infospace** is a curated, self-describing collection of **entities**
(concepts, mechanisms, observations) that together explain a **topic**
through the lens of one or more **disciplines**.

| Term | This example |
|---|---|
| Topic | *The Wealth of Nations* (Smith, 1776) |
| Discipline | Viable System Model (Beer) |
| Entities | Economic concepts: division of labour, natural price, … |
| Viability | Does the entity set answer the competency questions? |

The challenge with a large source corpus is that it is too big for a single
prompt. MarkiTect processes it **incrementally**, one chapter at a time,
building up the entity set and tracking progress through git.

An infospace is **viable** when it meets threshold scores across defined
metrics — it is fit for purpose as an explanatory tool.

---

## 2. Project Layout

```
examples/infospace-with-history/
│
├── infospace.yaml              # Declarative infospace configuration
├── README.md
├── TUTORIAL.md                 # This file
├── INFRA-TASKS.md              # Infrastructure issues found during the experiment
├── LAYERED-DEVELOPMENT.md      # Concept for L2–L4 entity classification and modelling
├── infospace.db                # SQLite artifact database (generated, not in git)
│
├── schemas/                    # Output structure definitions
│   ├── economic-entity-schema-v1.0.md
│   ├── vsm-concept-schema-v1.0.md
│   ├── vsm-mapping-schema-v1.0.md
│   └── chapter-analysis-schema-v1.0.md
│
├── templates/                  # Prompt templates (with @{macro} placeholders)
│   ├── extract-entities.md
│   ├── map-to-vsm.md
│   ├── synthesize-analysis.md
│   └── assess-metrics.md
│
├── artifacts/                  # Input artifacts
│   ├── sources/                # Chapter text (35 files)
│   ├── guidelines/             # Extraction and mapping rules
│   └── vsm-reference/          # VSM framework definition
│
└── output/                     # Generated artifacts (LLM outputs)
    ├── entities/               # Flat canonical entity set + chapter views
    │   ├── division-of-labour.md        # Canonical entity file (PRIMARY)
    │   ├── exchange.md
    │   ├── book-1-chapter-01-entities.md  # Chapter view (transclusion)
    │   └── ...
    ├── mappings/               # Per-chapter VSM mappings
    ├── analyses/               # Per-chapter synthesised analyses
    └── metrics/                # Collection metrics + history
        ├── metrics.yaml        # Latest metric values
        └── history.yaml        # Timestamped snapshot log
```

**Entity organisation**: The infospace maintains a **flat canonical set**
of entities — one markdown file per entity in `output/entities/`. Duplicate
slugs across chapters are skipped (first occurrence wins). Per-chapter
`*-entities.md` files are **secondary views** using transclusion directives
(`{{ include "entity.md" }}`), so editing a canonical file updates every
chapter view that references it.
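
How a chapter view resolves to canonical content can be sketched in a few
lines of Python. This is an illustration only, not MarkiTect's
implementation: the function name `expand_includes` and the in-memory
`files` mapping are assumptions, and only the `{{ include "..." }}`
directive form is taken from this tutorial.

```python
import re

# Matches the directive form shown in this tutorial: {{ include "name.md" }}
INCLUDE = re.compile(r'\{\{\s*include\s+"([^"]+)"\s*\}\}')


def expand_includes(view_text: str, files: dict[str, str]) -> str:
    """Replace each include directive with the named file's content.

    `files` is a hypothetical stand-in for reading output/entities/ from disk.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in files:
            raise KeyError(f"missing canonical entity file: {name}")
        return files[name].rstrip("\n")

    return INCLUDE.sub(substitute, view_text)


canonical = {"exchange.md": "# Exchange\n\nDefinition text.\n"}
view = '# Book 1, Chapter 2\n\n{{ include "exchange.md" }}\n'
expanded = expand_includes(view, canonical)
```

Because the view stores only the directive, editing `exchange.md` changes
what every referencing view expands to, with no copy to keep in sync.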

---

## 3. Initialising an Infospace

### Starting fresh

Use `markitect infospace init` to create an `infospace.yaml`:

```bash
cd my-new-infospace/
markitect infospace init \
  --topic "The Wealth of Nations" \
  --domain "Classical Economics" \
  --sources artifacts/sources/ \
  --discipline "Viable System Model"
```

This creates a minimal `infospace.yaml`. Edit it to add schemas,
competency questions, and viability thresholds:

```yaml
topic:
  name: "The Wealth of Nations"
  domain: "Classical Economics"
  sources: artifacts/sources/

disciplines:
  - name: "Viable System Model"
    path: artifacts/vsm-reference/

schemas:
  entity: schemas/economic-entity-schema-v1.0.md
  mapping: schemas/vsm-mapping-schema-v1.0.md
  analysis: schemas/chapter-analysis-schema-v1.0.md

competency_questions: |
  1. How does Smith's division of labour map to VSM System 1 operations?
  2. What mechanisms in WoN correspond to VSM coordination (System 2)?
  3. Where does Smith describe self-organising regulation (System 3)?
  4. What role does the "invisible hand" play as a System 4 mechanism?
  5. How do Smith's views on government map to System 5 policy?
  6. Is the WoN entity set viable as an explanatory framework?

viability:
  redundancy_ratio: { max: 0.10 }
  coverage_ratio: { min: 0.40 }
  coherence_components: { max: 3 }
  consistency_cycles: { max: 0 }
  granularity_entropy: { min: 1.0 }

pipeline:
  stages:
    - name: extract-entities
      template: templates/extract-entities.md
    - name: map-to-vsm
      template: templates/map-to-vsm.md
    - name: synthesize-analysis
      template: templates/synthesize-analysis.md
```

### Checking status

At any point, inspect the infospace:

```bash
markitect infospace status
# Infospace: The Wealth of Nations
# Domain:    Classical Economics
# Entities:  109
# Domains:   Production, Distribution, Exchange, Regulation
# Disciplines: Viable System Model
# Chapters:  9/35 processed

markitect infospace entities
# Lists all entities with domain, source chapter, word count
```

---

## 4. Designing Schemas

Before writing any prompts, define **schemas** — markdown documents that
tell the LLM exactly what sections each output must contain. Schemas are
not code; the LLM reads them as instructions.

### Economic Entity Schema (`schemas/economic-entity-schema-v1.0.md`)

Every extracted entity must have:

- **H1 heading** with the entity name (title case)
- **Definition** (20–150 words, precise and non-circular)
- **Source Chapter** citing Book and Chapter
- **Context** — where in Smith's argument the entity appears
- **Economic Domain** (Production, Distribution, Exchange, etc.)

Optional: Smith's Original Wording, Modern Interpretation.

### VSM Mapping Schema (`schemas/vsm-mapping-schema-v1.0.md`)

Every entity-to-VSM mapping must have:

- **H1 heading**: `Entity Name -> VSM Concept Name`
- **Economic Entity Reference** and **VSM Concept Reference**
- **Mapping Rationale** (minimum 30 words, grounded in Beer's definitions)
- **Mapping Strength**: Strong, Moderate, or Weak

### Chapter Analysis Schema (`schemas/chapter-analysis-schema-v1.0.md`)

The per-chapter synthesis includes:

- **Chapter Summary** (50–300 words)
- **Entities Extracted** — bulleted list
- **VSM Mappings** — entity, concept, strength
- **VSM Coverage** — explicit assessment of S1 through S5 and S3*
- **Gaps & Observations**

**Key insight**: Schemas are artifacts — they live in the repository and
can be versioned, diffed, and refined just like code. Improving a schema
and re-processing a chapter is visible as a git diff.
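
Although the schemas are read by the LLM rather than executed, the same
rules can also be checked mechanically after generation. The sketch below
is a hypothetical validator, not part of MarkiTect: it assumes each
required section is an H2 heading with exactly the names listed above, and
the sample entity text is made up for illustration.

```python
import re

REQUIRED_SECTIONS = ["Definition", "Source Chapter", "Context", "Economic Domain"]


def check_entity(markdown: str) -> list[str]:
    """Return a list of problems; an empty list means the entity conforms."""
    problems = []
    if not re.search(r"^# \S", markdown, flags=re.MULTILINE):
        problems.append("missing H1 heading")
    # Splitting on H2 headings yields [preamble, title1, body1, title2, body2, ...]
    parts = re.split(r"^## +(.+?)[ \t]*$", markdown, flags=re.MULTILINE)
    sections = dict(zip(parts[1::2], parts[2::2]))
    for name in REQUIRED_SECTIONS:
        if name not in sections:
            problems.append(f"missing section: {name}")
    if "Definition" in sections:
        words = len(sections["Definition"].split())
        if not 20 <= words <= 150:
            problems.append(f"definition has {words} words, expected 20-150")
    return problems


sample = """# Division of Labour

## Definition
The separation of a production process into distinct tasks, each performed
by a specialised worker, which raises output per worker through dexterity,
time saved, and the invention of machines.

## Source Chapter
Book I, Chapter 1

## Context
Opens Smith's argument about the causes of improved productivity.

## Economic Domain
Production
"""
problems = check_entity(sample)
```

A check like this could run after each chapter so that schema drift shows
up before the output is committed.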

---

## 5. Writing Prompt Templates

Each template is a markdown file with `@{macro_name}` placeholders that
MarkiTect's resolver fills with artifact content at compile time.
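
As a rough mental model, macro resolution is string substitution keyed by
macro name. The sketch below is an assumption about the general technique,
not MarkiTect's resolver: the function `resolve`, the rule that unknown
macros raise an error, and the sample artifact strings are all
hypothetical.

```python
import re

# A macro is @{name}, where name is an identifier such as chapter_text.
MACRO = re.compile(r"@\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve(template: str, artifacts: dict[str, str]) -> str:
    """Substitute every @{name} placeholder; fail loudly on unknown names."""
    missing = sorted({m for m in MACRO.findall(template) if m not in artifacts})
    if missing:
        raise KeyError(f"unresolved macros: {', '.join(missing)}")
    # A function replacement avoids backslash processing in artifact text.
    return MACRO.sub(lambda m: artifacts[m.group(1)], template)


template = (
    "## Source Chapter\n@{chapter_text}\n\n"
    "## Extraction Guidelines\n@{extraction_rules}\n"
)
prompt = resolve(template, {
    "chapter_text": "(chapter text)",
    "extraction_rules": "(extraction rules)",
})
```

Failing on an unresolved macro, rather than leaving the placeholder in the
prompt, keeps a misnamed artifact from silently reaching the LLM.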

### Template 1: Extract Entities (`templates/extract-entities.md`)

```markdown
# Extract Economic Entities

You are an analytical economist specialising in classical economic theory.
Your task is to extract distinct economic entities from a chapter of
Adam Smith's *The Wealth of Nations*.

## Source Chapter
@{chapter_text}

## Extraction Guidelines
@{extraction_rules}

## VSM Framework Context
@{vsm_framework}

## Existing Entities
@{existing_entities}

## Output Format
Output each entity delimited by `--- ENTITY: <entity-name> ---` markers.
```

The `@{existing_entities}` macro is generated at runtime from canonical
files already on disk, enabling incremental extraction without duplication.
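
The delimiter format and the first-occurrence-wins rule can be illustrated
with a short sketch. The helper names (`slugify`, `split_entities`,
`merge_first_wins`) and the slug rule are assumptions chosen to match file
names such as `division-of-labour.md`; the actual pipeline may differ.

```python
import re

MARKER = re.compile(r"^--- ENTITY: (.+?) ---[ \t]*$", re.MULTILINE)


def slugify(name: str) -> str:
    """Lower-case the name and join alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", name.lower()))


def split_entities(llm_output: str) -> dict[str, str]:
    """Map slug -> entity body for each delimited block in the output."""
    parts = MARKER.split(llm_output)  # [preamble, name1, body1, name2, body2, ...]
    entities: dict[str, str] = {}
    for name, body in zip(parts[1::2], parts[2::2]):
        entities.setdefault(slugify(name), body.strip())  # first one wins
    return entities


def merge_first_wins(canonical: dict[str, str], new: dict[str, str]) -> list[str]:
    """Add unseen slugs to `canonical`; return the slugs that were skipped."""
    skipped = []
    for slug, body in new.items():
        if slug in canonical:
            skipped.append(slug)
        else:
            canonical[slug] = body
    return skipped


output = (
    "--- ENTITY: Division of Labour ---\n# Division of Labour\n...\n"
    "--- ENTITY: Exchange ---\n# Exchange\n...\n"
)
canonical = {"exchange": "# Exchange\n(existing definition)"}
skipped = merge_first_wins(canonical, split_entities(output))
```

Here `exchange` is skipped because a canonical file already exists, while
`division-of-labour` is added as a new entity.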

### Template 2: Map to VSM (`templates/map-to-vsm.md`)

Inputs: `@{entities}`, `@{vsm_framework}`, `@{mapping_rules}`.

### Template 3: Synthesise Analysis (`templates/synthesize-analysis.md`)

Inputs: `@{chapter_text}`, `@{entities}`, `@{mappings}`, `@{vsm_framework}`.

### Template 4: Assess Metrics (`templates/assess-metrics.md`)

Inputs: `@{all_analyses}` (all chapter analyses concatenated), `@{vsm_framework}`.
Runs across the entire infospace, not per-chapter.

**Dependency chain per chapter:**

```
chapter_text ──────┐
extraction_rules ──┤
vsm_framework ─────┤
                   ▼
           extract-entities
                   │
                   ▼ entities
           map-to-vsm
                   │
                   ▼ mappings
           synthesize-analysis
                   │
                   ▼ analysis
```
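
The same chain can be expressed as a dependency graph and ordered
mechanically. The sketch below uses Python's standard `graphlib` module
(Python 3.9+) with the template inputs listed above; the dictionary layout
is an assumption for illustration, not MarkiTect's internal representation.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of inputs it needs. Stage names double as the
# names of the outputs that later stages consume.
dependencies = {
    "extract-entities": {
        "chapter_text", "extraction_rules", "vsm_framework", "existing_entities",
    },
    "map-to-vsm": {"extract-entities", "vsm_framework", "mapping_rules"},
    "synthesize-analysis": {
        "chapter_text", "extract-entities", "map-to-vsm", "vsm_framework",
    },
}
stages = set(dependencies)

# static_order() yields every node with its prerequisites first;
# keep only the stages and drop the plain input artifacts.
order = [n for n in TopologicalSorter(dependencies).static_order() if n in stages]
```

A topological sort also rejects circular dependencies: `TopologicalSorter`
raises `CycleError` if two stages depend on each other.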

---

## 6. Populating Artifacts

### Source chapters (`artifacts/sources/`)

This directory holds 35 markdown files containing the full public-domain
text of each chapter, named `book-1-chapter-01.md` through
`book-5-chapter-03.md`.

### Guidelines (`artifacts/guidelines/`)

- **`extraction-rules.md`** — What constitutes an entity, granularity
  rules, naming conventions.
- **`mapping-rules.md`** — How to map entities to VSM systems, what
  constitutes Strong/Moderate/Weak strength.

### VSM reference (`artifacts/vsm-reference/`)

- **`vsm-framework.md`** — Complete description of Beer's VSM (S1–S5,
  S3*, recursion, variety, viability, algedonic signals, autonomy) with
  economic interpretations.

---

## 7. Processing Chapters

`markitect infospace process` orchestrates the three-stage pipeline declared
in `infospace.yaml`. It runs entity extraction → VSM mapping → analysis
synthesis for each source file, and commits each chapter to git.

### Single chapter

```bash
# Dry run — loads existing outputs only, no LLM calls:
markitect infospace process "book-1-chapter-05.md"

# Process via OpenRouter (free models available):
markitect infospace process "book-1-chapter-05.md" --provider openrouter

# With a specific free model:
markitect infospace process "book-1-chapter-05.md" \
  --provider openrouter --model meta-llama/llama-4-maverick:free

# Skip git commit after processing:
markitect infospace process "book-1-chapter-05.md" \
  --provider openrouter --no-commit
```

The `GLOB_PATTERN` argument is matched against files in the `sources`
directory declared in `infospace.yaml`. Already-processed chapters are
skipped automatically because their output files already exist on disk.
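
The skip rule can be pictured as a file-existence check. The sketch below
is a guess at the logic, not the tool's code: it assumes a chapter counts
as processed only when all three stage outputs exist, and it uses the
output file names that appear elsewhere in this tutorial. The demo runs in
a temporary directory.

```python
import tempfile
from pathlib import Path

# (output subdirectory, file suffix) per stage, following names such as
# output/analyses/book-1-chapter-03-analysis.md used in this tutorial.
STAGE_OUTPUTS = [
    ("entities", "entities"),
    ("mappings", "mappings"),
    ("analyses", "analysis"),
]


def is_processed(root: Path, stem: str) -> bool:
    """True when every stage output for the chapter exists on disk."""
    return all(
        (root / "output" / subdir / f"{stem}-{suffix}.md").exists()
        for subdir, suffix in STAGE_OUTPUTS
    )


def pending_chapters(root: Path, pattern: str) -> list[str]:
    """Source files matching `pattern` that still need processing."""
    sources = sorted((root / "artifacts" / "sources").glob(pattern))
    return [s.name for s in sources if not is_processed(root, s.stem)]


# Demo: chapter 01 of Book 1 is complete, chapter 02 is not.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "artifacts" / "sources").mkdir(parents=True)
    for name in ("book-1-chapter-01.md", "book-1-chapter-02.md",
                 "book-2-chapter-01.md"):
        (root / "artifacts" / "sources" / name).write_text("chapter text")
    for subdir, suffix in STAGE_OUTPUTS:
        (root / "output" / subdir).mkdir(parents=True)
        (root / "output" / subdir / f"book-1-chapter-01-{suffix}.md").write_text("done")
    pending = pending_chapters(root, "book-1-*.md")
```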

### Whole book or all chapters

```bash
# Process all chapters of Book 1:
markitect infospace process "book-1-*.md" --provider openrouter

# Process all 35 source files:
markitect infospace process --all --provider openrouter

# Process all chapters and run quality checks after each one:
markitect infospace process --all --provider openrouter --check-after-each
```

### Check progress

```bash
markitect infospace status
```

```
Infospace: The Wealth of Nations
Domain:    Classical Economics
Entities:  988
Domains:   Accumulation, Consumption, Distribution, Exchange,
           General Theory, Production, Regulation
Disciplines: Viable System Model
Last evaluated: 2026-02-19T21:54:44
```

```bash
markitect infospace entities
```

Lists all canonical entities with domain, source chapter, and word count.

### Entity lifecycle

Entities in the canonical set are **never silently deleted**. To retire
an entity, move it to `output/entities/archive/<slug>.md` and add a
dated archive header:

```markdown
<!-- archived: 2026-02-22 reason="Subsumed by monopoly-price — same market distortion" -->
```

Then commit the removal so the intellectual history of every decision
is preserved in git.

---

## 8. Evaluating Entity Quality

Once chapters are processed, evaluate the entity set using the infospace
tooling commands.

### Per-entity evaluation

```bash
# Evaluate all entities (requires LLM provider):
markitect infospace evaluate --provider openrouter

# Evaluate entities from a specific chapter:
markitect infospace evaluate --chapter book-1-chapter-05 --provider openrouter

# Re-evaluate a single entity:
markitect infospace evaluate --entity division-of-labour --provider openrouter
```

This runs the `evaluate-entity` prompt template against each entity,
scoring dimensions like definition precision, source grounding, and
VSM relevance. Results are written to `output/evaluations/`.

### Collection-level checks (C1–C5)

```bash
# Run all five collection checks:
markitect infospace check

# Run individual checks:
markitect infospace check --concern redundancy   # C1: Are any entities synonymous?
markitect infospace check --concern coverage     # C2: Which domain × chapter cells are empty?
markitect infospace check --concern coherence    # C3: Is the entity graph well-connected?
markitect infospace check --concern consistency  # C4: Are there circular definitions?
markitect infospace check --concern granularity  # C5: Is abstraction level balanced?
```

Collection checks are deterministic: they use the platform's embedding,
graph analysis, and FCA infrastructure and require no LLM provider.
Results are written to `output/metrics/`, and a new snapshot is appended
to the history log there (`history.yaml` in the project layout above).

Sample output (full corpus, 988 entities):

```
Collection checks — 988 entities

  C1 — redundancy
    redundancy_ratio: 0.0061
    similar_pairs: 3 candidates (word-overlap > 0.85)

  C2 — coverage
    coverage_ratio: 0.619
    domain_densities: Exchange 0.85, Regulation 0.85, General Theory 0.73 …
    density_std: 0.211  cross_cutting_ratio: 0.714

  C3 — coherence
    connected_components: 0   (no cross-reference graph built yet)
    modularity: 0.0

  C4 — consistency
    cycle_count: 0

  C5 — granularity
    granularity_entropy: 2.953
```
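
To make the C1 output concrete, here is one way a word-overlap redundancy
check could be computed. Everything in this sketch is an assumption: it
uses Jaccard similarity over lower-cased words, and it defines the ratio
as the share of entities that appear in at least one similar pair. The
tool's actual similarity measure and ratio definition may differ, and the
sample definitions are invented.

```python
import re
from itertools import combinations


def words(text: str) -> set[str]:
    """Lower-cased alphabetic words of a definition."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def redundancy(definitions: dict[str, str], threshold: float = 0.85):
    """Return (similar pairs, share of entities in at least one pair)."""
    bags = {slug: words(text) for slug, text in definitions.items()}
    pairs = [
        (x, y)
        for x, y in combinations(sorted(bags), 2)
        if jaccard(bags[x], bags[y]) > threshold
    ]
    flagged = {slug for pair in pairs for slug in pair}
    ratio = len(flagged) / len(bags) if bags else 0.0
    return pairs, ratio


definitions = {
    "market-price": "the actual price at which a commodity is sold",
    "actual-price": "the actual price at which a commodity is sold",
    "rent": "the price paid for the use of land",
    "wages": "the reward paid to labour",
}
pairs, ratio = redundancy(definitions)
```

Pairwise comparison is quadratic in the number of entities, which is still
cheap at roughly a thousand entities.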

---

## 9. Reviewing Viability

```bash
markitect infospace viability
```

Compares the latest metrics against the thresholds declared in
`infospace.yaml`:

```
Metric                            Value       Threshold   Status
---------------------------------------------------------------
redundancy_ratio                 0.0059         max=0.1     PASS
coverage_ratio                   0.6190         min=0.4     PASS
coherence_components             0.0000           max=3     PASS
consistency_cycles               0.0000           max=0     PASS
granularity_entropy              2.9533         min=1.0     PASS

Viable: YES (5/5 thresholds met)
```

Coverage is not monotonic during processing: it can dip when new
chapters introduce domains that are sparse elsewhere, then recover as
the domain × chapter matrix fills in. The threshold of 0.40 reflects
realistic expectations for a multi-book corpus where some domains are
naturally sparse in certain chapters.
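
The comparison itself is simple enough to sketch. The function below is
an illustration rather than the tool's code; the name `review` and the
dictionary shapes are assumptions, while the values and thresholds are the
ones printed in the table above.

```python
def review(metrics: dict[str, float], thresholds: dict[str, dict[str, float]]):
    """Return (name, value, passed) for every declared threshold."""
    rows = []
    for name, bounds in thresholds.items():
        value = metrics[name]
        low = bounds.get("min", float("-inf"))
        high = bounds.get("max", float("inf"))
        rows.append((name, value, low <= value <= high))
    return rows


thresholds = {
    "redundancy_ratio": {"max": 0.1},
    "coverage_ratio": {"min": 0.4},
    "coherence_components": {"max": 3},
    "consistency_cycles": {"max": 0},
    "granularity_entropy": {"min": 1.0},
}
metrics = {
    "redundancy_ratio": 0.0059,
    "coverage_ratio": 0.6190,
    "coherence_components": 0.0,
    "consistency_cycles": 0.0,
    "granularity_entropy": 2.9533,
}
rows = review(metrics, thresholds)
viable = all(passed for _, _, passed in rows)
```

Each threshold declares only the bound that matters (`min` or `max`), so a
missing bound is treated as unbounded on that side.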

### Metrics history

```bash
markitect infospace history
```

Shows how metrics evolved across runs:

```
History: 36 snapshot(s)

#    Date                 Entities  Metrics
------------------------------------------
1    2026-02-19T13:07:13        18        6
2    2026-02-19T13:16:36        43        6
...
36   2026-02-19T21:54:44      1021        6
```

```bash
# Show trend for a specific metric:
markitect infospace history --metric coverage_ratio
```

---

## 10. Tracking History with Git

Every processed chapter produces one git commit containing:

- Compiled prompts (`*-prompt.md`) — audit what was sent to the LLM
- Canonical entity files (`output/entities/<slug>.md`) — first occurrence wins
- Chapter entity views (`<chapter>-entities.md`) — transclusion references
- Generated outputs (`*-mappings.md`, `*-analysis.md`)

This means:

- `git log` shows the chronological order of processing
- `git diff` between commits shows what each chapter contributed
- You can `git bisect` to find where quality degraded
- You can revert a chapter and re-process with improved guidelines

To review before committing:

```bash
markitect infospace process "book-1-chapter-05.md" \
  --provider openrouter --no-commit
# review output/entities/ and output/mappings/
git add output/
git commit -m "infospace: process book-1-chapter-05"
```

---

## 11. Cost and Performance

| | OpenRouter (free) | OpenRouter (paid) | Gemini (free) |
|---|---|---|---|
| Time per chapter | ~5 min | ~2 min | ~45 sec |
| Cost per chapter | $0.00 | ~$0.07 | $0.00 |
| Default model | `arcee-ai/trinity-large-preview:free` | `anthropic/claude-sonnet-4` | `gemini-2.5-flash` |
| Rate limits | ~200 req/day | High | Per-minute |

**OpenRouter free tier**: Sign up at [openrouter.ai](https://openrouter.ai)
(no credit card required). Store your key in `apikey-openrouter.txt` in the
project root (git-ignored), or set `OPENROUTER_API_KEY`.

```bash
export OPENROUTER_API_KEY=$(cat apikey-openrouter.txt | tr -d '[:space:]')
```

Use `openrouter/free` to select automatically whichever free model is
available:

```bash
markitect infospace process "book-1-chapter-05.md" \
  --provider openrouter --model openrouter/free
```

**Gemini free tier**: Get a key at [aistudio.google.com/apikey](https://aistudio.google.com/apikey)
and store it in `apikey-geminifree.txt`.

Note: The `claude-code` provider (Claude CLI subprocess) is not available
when running inside a Claude Code session due to nested session restrictions.

---

## 12. Processing the Full Corpus

All 35 chapters have been processed in this example. The commands below
show how the full run was executed — use them as a template for your own
corpus.

**Process one book at a time:**

```bash
export OPENROUTER_API_KEY=$(cat apikey-openrouter.txt | tr -d '[:space:]')

markitect infospace process "book-1-*.md" --provider openrouter
markitect infospace process "book-2-*.md" --provider openrouter
markitect infospace process "book-3-*.md" --provider openrouter
markitect infospace process "book-4-*.md" --provider openrouter
markitect infospace process "book-5-*.md" --provider openrouter
```

Already-processed chapters are skipped automatically — their output files
exist on disk. The `@{existing_entities}` macro ensures the LLM only
extracts genuinely new entities.

**Or process everything at once:**

```bash
markitect infospace process --all --provider openrouter
```

**Run collection checks after each book:**

```bash
markitect infospace check
markitect infospace viability
```

**Observed metric progression (actual results):**

| After | Entities | coverage_ratio | granularity_entropy |
|-------|----------|----------------|---------|
| Book I (11 ch.) | ~236 | 0.51 | 2.77 |
| Books I–II (16 ch.) | ~348 | 0.56 | 2.82 |
| Books I–III (20 ch.) | ~456 | 0.59 | 2.97 |
| Books I–IV (30 ch.) | ~930 | 0.51 | 2.94 |
| All (35 ch.) | 988 | **0.62** | 2.95 |

Coverage dips after Book IV as policy-heavy chapters introduce domains
that are sparse in earlier books, then recovers in Book V as the matrix
fills in.

---

## 13. Using the Infospace as a Discipline

A completed, viable infospace can itself become a **discipline** — a lens
applied to a new topic. The working example is in
`examples/supply-chain-vsm/`: it binds this WoN infospace as a discipline
and applies Smith's framework to modern supply chain management.

### What the composition demo contains

The demo contains **8 entities** extracted from three source documents
covering coordination mechanisms, capital and inventory, and market
structure. Each entity maps to a specific WoN concept with a rationale
and a conceptual-continuity rating (Strong / Moderate / Weak):

| Supply Chain Entity | WoN Concept | Strength | VSM |
|---|---|---|---|
| Demand Signal | Effectual Demand | Strong | S2 |
| Vendor-Managed Inventory | Division of Labour | Strong | S1/S2 |
| Just-in-Time Inventory | Circulating Capital | Strong | S1/S3 |
| Bullwhip Effect | Natural Price as Central Price | Moderate | S2 |
| Safety Stock | Accumulation of Stock | Moderate | S3 |
| Platform Intermediary | Merchant Capital | Strong | S2/S4 |
| Monopsony Power | Combination of Masters | Strong | S3* |
| Single-Source Dependency | Monopoly in Trade | Moderate | S4/S5 |

Because WoN entities are already mapped to VSM systems, supply chain
entities **inherit VSM positions by transitivity** — the supply chain
infospace gets VSM coverage without needing its own VSM reference.
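
Inheritance by transitivity amounts to composing two lookups. The sketch
below uses three rows from the table above; the dictionary names are
hypothetical, and it assumes the table's VSM column reflects the VSM
assignment of the corresponding WoN concept.

```python
# Supply chain entity -> WoN concept (rows copied from the table above).
SUPPLY_TO_WON = {
    "Demand Signal": "Effectual Demand",
    "Safety Stock": "Accumulation of Stock",
    "Monopsony Power": "Combination of Masters",
}

# WoN concept -> VSM systems (assumed to match the table's VSM column).
WON_TO_VSM = {
    "Effectual Demand": ["S2"],
    "Accumulation of Stock": ["S3"],
    "Combination of Masters": ["S3*"],
}


def inherited_vsm(entity: str) -> list[str]:
    """Follow entity -> WoN concept -> VSM systems."""
    return WON_TO_VSM[SUPPLY_TO_WON[entity]]


# VSM systems reached by at least one supply chain entity.
covered = sorted({s for e in SUPPLY_TO_WON for s in inherited_vsm(e)})
```

The supply chain infospace never stores a VSM assignment of its own; if a
WoN entity is remapped, the inherited position changes with it.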

### Running the composition demo

```bash
cd examples/supply-chain-vsm

# Check bound disciplines and their viability:
markitect infospace disciplines
```

```
Name                           Entities   Viable Path
----------------------------------------------------------------------
Wealth of Nations                   988      YES ../infospace-with-history
```

```bash
# Show infospace status:
markitect infospace status
```

```
Infospace: Modern Supply Chain Management
Domain:    Operations Management
Entities:  8
Disciplines: Wealth of Nations
```

```bash
# Run checks and review viability:
markitect infospace check
markitect infospace viability
```

```
Metric                            Value       Threshold   Status
---------------------------------------------------------------
redundancy_ratio                 0.0000         max=0.1     PASS
coverage_ratio                   1.0000         min=0.5     PASS
coherence_components             0.0000           max=2     PASS
consistency_cycles               0.0000           max=0     PASS
granularity_entropy              1.9056         min=0.8     PASS

Viable: YES (5/5 thresholds met)
```

### Setting up your own composed infospace

```bash
mkdir my-new-topic/ && cd my-new-topic/

markitect infospace init \
  --topic "My Topic" \
  --domain "My Domain"

# Bind the WoN infospace as a discipline:
markitect infospace bind-discipline --name "Wealth of Nations" \
  ../infospace-with-history

# Confirm it is viable before using:
markitect infospace disciplines
```

The discipline infospace must be viable (meeting its own thresholds)
before it can be used as a lens. If the discipline's entities change,
use `markitect infospace stale-mappings` to identify mappings that need
re-evaluation.

### The WoN core entity reference

Rather than injecting all 988 WoN entities into every prompt (which
would overflow context), the supply chain demo uses a curated reference
file at `artifacts/won-reference/core-entities.md` — 12 key WoN entities
selected for their relevance to operations and market structure. The
pipeline stage macro `@{won_core_entities}` injects this file.

For a different topic, create an equivalent curated reference of the
WoN entities most relevant to your domain.

---

## 14. Quality Improvement Loop

The infospace is designed to be **iteratively refined**:

1. **Process chapters** — `markitect infospace process "book-1-*.md" --provider openrouter`
2. **Evaluate** — `markitect infospace evaluate --provider openrouter`
3. **Check** — `markitect infospace check`
4. **Review viability** — `markitect infospace viability`
5. **Refine guidelines** — update `extraction-rules.md` or
   `mapping-rules.md` to address identified weaknesses
6. **Re-process** — delete output files for specific chapters and re-run
7. **Compare** — `git diff` shows how refined guidelines changed the output

Example: if checks show S3* (Audit) is consistently missing, add a
paragraph to `extraction-rules.md` explicitly asking the LLM to look for
audit, inspection, and oversight mechanisms.

To re-process a specific chapter:

```bash
# Delete stage outputs for that chapter (not canonical entity files):
rm -f output/entities/book-1-chapter-03-entities.md
rm -f output/mappings/book-1-chapter-03-mappings.md
rm -f output/analyses/book-1-chapter-03-analysis.md

# Re-run:
markitect infospace process "book-1-chapter-03.md" --provider openrouter
```

Never silently delete canonical entity files. Archive them instead by
moving to `output/entities/archive/` with a dated comment header, then
re-process the chapter so the pipeline can extract a replacement:

```bash
# Archive the entity manually:
mkdir -p output/entities/archive
mv output/entities/extent-of-the-market.md output/entities/archive/
# Add header to the archived file explaining why
echo '<!-- archived: 2026-02-22 reason="Subsumed by market-price and effectual-demand" -->' \
  | cat - output/entities/archive/extent-of-the-market.md > /tmp/tmp.md \
  && mv /tmp/tmp.md output/entities/archive/extent-of-the-market.md

# Delete the chapter entity view so the chapter re-runs:
rm -f output/entities/book-1-chapter-03-entities.md
markitect infospace process "book-1-chapter-03.md" --provider openrouter
```

---

## 15. The Artifact Database (`infospace.db`)

The pipeline stores all artifacts and dependency edges in a local SQLite
database — `infospace.db`. This file is **not committed to git** because
it is fully derived from the markdown files that are tracked.

To regenerate it after a fresh clone (no LLM calls needed):

```bash
markitect infospace process --all
```

Without `--provider`, the command runs in dry-run mode: it loads existing
output files from disk into the database without making any LLM calls.

---

## 16. Adapting This Pattern to Your Own Project

To build your own infospace:

1. `markitect infospace init --topic "..." --domain "..." --discipline "..."`
2. Write schemas defining required sections for each output type
3. Write extraction guidelines that tell the LLM what to look for
4. Create prompt templates using `@{macro}` syntax
5. Populate `artifacts/sources/` with your source corpus
6. `markitect infospace process --all --provider openrouter`
7. `markitect infospace check` and `markitect infospace evaluate --provider openrouter`
8. `markitect infospace viability` — review against your thresholds
9. Iterate: refine guidelines, re-process, re-evaluate
10. Once viable, use as a discipline for a new infospace

The key insight is that **schemas and guidelines are artifacts** — they
live in the repository and can be versioned and diffed just like code.
Every refinement decision is traceable through git history.

---

## 17. Observing Entity Heterogeneity

After processing all 35 chapters you will notice that the entity collection
is not homogeneous. Reviewing the files, some entities describe **things
that exist** (stocks, agents, institutions) while others describe **how
things connect** (mechanisms, signals, causal dependencies):

| Entity | Character |
|---|---|
| *Capital Stock* | A persistent resource — an element |
| *Division of Labour* | An ongoing activity — a process |
| *Natural Price* | A structural dependency — a relation |
| *Opportunity Cost* | An abstract invariant — a principle |
| *Banking System* | A socially constructed rule — an institution |

This heterogeneity is not a flaw in the extraction. It reflects the actual
structure of Smith's argument. But treating all entities identically, as
untyped nodes in a flat collection, hides structural information that is
necessary for building a systemic model.

The VSM mapping compounds this: System 2 ends up containing both a *price
signal* (a relation) and a *market* (an element that hosts those signals).
Both are genuinely in S2, but conflating them makes it harder to answer the
competency questions precisely.

**The solution is layered development**: moving from the flat entity set
toward a typed, structured, minimal systemic model. The full concept and
rationale are documented in [`LAYERED-DEVELOPMENT.md`](LAYERED-DEVELOPMENT.md).

---

## 18. The Four Layers

Infospace development proceeds through four layers, each with its own
pipeline, schema, and viability check:

```
L0  Source text (35 chapters)
     │  extract-entities
     ▼
L1  Raw entities (~988)          ← current state after full processing
     │  classify-entities
     ▼
L2  Typed entities               Each entity has: type × VSM coordinate
     │  extract-relations
     ▼
L3  Relation graph               Explicit triplets: Element → Relation → Element
     │  distil-core
     ▼
L4  Systemic model               Minimal viable set (~30 elements + 20 relations)
```

Each layer is a **proper infospace** that uses the previous layer as its
topic (and sometimes as its discipline). The composition model already
built into MarkiTect makes this explicit and auditable.

---

## 19. Layer 2 — Classifying Entities

The goal of Layer 2 is to assign every entity a **type** and confirm or
refine its **VSM system assignment**, giving each entity a coordinate in a
structured space.

### Entity types

| Type | What it is | Examples |
|---|---|---|
| **Element** | A stock, agent, or artifact that persists | Capital Stock, Corn, Colony |
| **Process** | A flow or transformation (has duration) | Division of Labour, Trade, Credit Extension |
| **Relation** | A structural dependency between elements | Natural Price, Wages of Labour |
| **Principle** | An abstract law holding across contexts | Comparative Advantage, Opportunity Cost |
| **Institution** | A socially constructed rule system | Banking System, Guild, Taille |

### New schema: `schemas/typed-entity-schema-v1.0.md`

Extend the economic entity schema with two new required sections:

```markdown
## Entity Type

[Element | Process | Relation | Principle | Institution]

## VSM System

[S1 | S2 | S3 | S3* | S4 | S5]
```

And two supporting rationale fields (one sentence each):

```markdown
## Type Rationale

This is a Relation because it expresses a structural dependency between
Wages and Capital Stock rather than being an entity that exists independently.

## VSM Rationale

Assigned to S2 because Natural Price functions as the coordination signal
that prevents market price oscillation — Beer's primary definition of S2.
```

### New pipeline stage: `classify-entities`

Add the stage to `infospace.yaml` after the existing pipeline:

```yaml
pipeline:
  stages:
    - name: extract-entities
      template: templates/extract-entities.md
      output_dir: output/entities
      ...
    - name: map-to-vsm
      ...
    - name: synthesize-analysis
      ...
    - name: classify-entities
      template: templates/classify-entities.md
      output_dir: output/typed-entities
      output_macro: typed_entity
      max_tokens: 1200
      macros:
        vsm_framework: artifacts/vsm-reference/vsm-framework.md
        type_taxonomy: artifacts/guidelines/entity-type-taxonomy.md
```

This stage runs **once per entity** (not per chapter), taking the canonical
entity file as input and producing an enriched version in `output/typed-entities/`.

### New coverage metric — type × VSM matrix

At Layer 2, the coverage metric gains a new interpretation. The matrix is
no longer domain × chapter but **type × VSM system** — a 5 × 6 grid:

```
             S1     S2     S3    S3*    S4     S5
Element      ████   ████   ██    ·      ██     ██
Process      ████   ████   ██    ·      ████   ·
Relation     ████   ████   ████  ██     ██     ██
Principle    ██     ██     ·     ·      ████   ████
Institution  ·      ·      ████  ████   ·      ████
```

An empty cell in this matrix means the VSM system has no entities of that
structural type — a genuine explanatory gap.
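
Finding the empty cells is a counting exercise once entities carry a type
and a VSM system. The sketch below is illustrative only: the function
names are assumptions, and defining the coverage ratio as the share of
non-empty cells is one plausible reading. The three sample
classifications are the ones used as examples in this tutorial.

```python
from collections import Counter

TYPES = ["Element", "Process", "Relation", "Principle", "Institution"]
SYSTEMS = ["S1", "S2", "S3", "S3*", "S4", "S5"]


def empty_cells(classified: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Cells of the 5 x 6 grid that contain no entity."""
    counts = Counter(classified)
    return [(t, s) for t in TYPES for s in SYSTEMS if counts[(t, s)] == 0]


def coverage_ratio(classified: list[tuple[str, str]]) -> float:
    """Share of grid cells that contain at least one entity."""
    total = len(TYPES) * len(SYSTEMS)
    return (total - len(empty_cells(classified))) / total


classified = [
    ("Process", "S1"),   # Division of Labour
    ("Relation", "S2"),  # Natural Price
    ("Element", "S2"),   # Market Extent
]
gaps = empty_cells(classified)
ratio = coverage_ratio(classified)
```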

---

## 20. Layer 3 — Extracting the Relation Graph

The goal of Layer 3 is to make the **connections between entities explicit**.
Rather than inferring connectivity from embedding similarity or co-occurrence,
Layer 3 extracts directed, typed **triplets** from entity definitions and
source chapters.

### Triplet structure

Each triplet is a directed edge in the relation graph:

```
Subject             Predicate          Object
──────────────────  ─────────────────  ──────────────────
Division of Labour  limited by         Market Extent
Capital Stock       enables            Division of Labour
Natural Price       centres on         Market Price
Wages of Labour     regulated by       Profit of Stock
```

The predicate is drawn from a **controlled vocabulary** of eight relation
classes, each mapped to a VSM channel:

| Predicate class | VSM channel |
|---|---|
| enables / constrains | S1 structural dependency |
| regulates / is regulated by | S3→S1 control |
| coordinates | S2 anti-oscillation |
| produces / consumes | S1 operational flow |
| monitors / audits | S3* audit loop |
| adapts to / anticipates | S4 intelligence |
| defines / is defined by | S5 policy authority |
| contradicts / tensions with | cross-level conflict |
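
A controlled vocabulary is only useful if extracted predicates are checked against it. Here is one way that check could look, as a sketch: the dictionary is copied from the table above, while `channel_for` and the strict-rejection behaviour are assumptions rather than MarkiTect features.

```python
# Predicate classes and channels copied from the table above.
PREDICATE_CHANNELS = {
    ("enables", "constrains"): "S1 structural dependency",
    ("regulates", "is regulated by"): "S3→S1 control",
    ("coordinates",): "S2 anti-oscillation",
    ("produces", "consumes"): "S1 operational flow",
    ("monitors", "audits"): "S3* audit loop",
    ("adapts to", "anticipates"): "S4 intelligence",
    ("defines", "is defined by"): "S5 policy authority",
    ("contradicts", "tensions with"): "cross-level conflict",
}

# Flatten the classes into a predicate -> channel lookup.
CHANNEL_BY_PREDICATE = {
    predicate: channel
    for predicates, channel in PREDICATE_CHANNELS.items()
    for predicate in predicates
}


def channel_for(predicate):
    """Return the VSM channel for a predicate, or raise if it is not in the vocabulary."""
    key = predicate.strip().lower()
    if key not in CHANNEL_BY_PREDICATE:
        raise ValueError(f"predicate not in controlled vocabulary: {predicate!r}")
    return CHANNEL_BY_PREDICATE[key]


print(channel_for("enables"))
```

A strict lookup like this would reject `limited by` and `centres on`, both of which appear in the example triplets, so a real implementation would presumably also need a mapping from surface phrasings to vocabulary entries.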

### New output directory: `output/relations/`

One file per triplet (or per named relation cluster):

```markdown
# Division of Labour — limited by — Market Extent

## Subject
Division of Labour (Process / S1)

## Predicate
limited by

## Object
Market Extent (Element / S2)

## VSM Channel
S1 operational capacity constrained by S2 coordination reach

## Evidence
Book I, Chapter 3: "The division of labour is limited by the extent
of the market."

## Feedback Role
Entry point of the Market Expansion loop.
```
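
Because every relation file follows the same section layout, it can be generated from a small record type. The sketch below assumes a hypothetical `Relation` dataclass and `render_relation` helper; only the headings and the sample values come from the example above.

```python
from dataclasses import dataclass

# U+2014 separator, matching the title line of the example file above.
TITLE_SEP = " \u2014 "


@dataclass
class Relation:
    subject: str
    subject_class: str  # e.g. "Process / S1"
    predicate: str
    obj: str
    obj_class: str      # e.g. "Element / S2"
    vsm_channel: str
    evidence: str
    feedback_role: str


def render_relation(rel):
    """Render one relation as Markdown using the section headings shown above."""
    sections = [
        ("Subject", f"{rel.subject} ({rel.subject_class})"),
        ("Predicate", rel.predicate),
        ("Object", f"{rel.obj} ({rel.obj_class})"),
        ("VSM Channel", rel.vsm_channel),
        ("Evidence", rel.evidence),
        ("Feedback Role", rel.feedback_role),
    ]
    title = TITLE_SEP.join([rel.subject, rel.predicate, rel.obj])
    body = "\n\n".join(f"## {heading}\n{text}" for heading, text in sections)
    return f"# {title}\n\n{body}\n"


example = Relation(
    subject="Division of Labour",
    subject_class="Process / S1",
    predicate="limited by",
    obj="Market Extent",
    obj_class="Element / S2",
    vsm_channel="S1 operational capacity constrained by S2 coordination reach",
    evidence='Book I, Chapter 3: "The division of labour is limited by the extent of the market."',
    feedback_role="Entry point of the Market Expansion loop.",
)
print(render_relation(example))
```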

### Feedback loops

The relation graph will reveal feedback loops — cycles in the directed
graph. These are the most structurally important outputs of Layer 3 because
they are the mechanisms Smith describes throughout the WoN:

```
Capital Accumulation → Division of Labour → Productivity
  → Profit Margin → Capital Accumulation    [positive reinforcement]

Market Price above Natural Price → Capital Inflow → Supply
  → Market Price falls to Natural Price     [balancing loop, S2]

Wages rise → Consumer Demand → Employment
  → Wages rise                              [positive reinforcement, S1]
```

Finding and naming these loops is the primary intellectual payoff of
Layer 3. Each loop can be documented as a named pattern:

```bash
# Future command:
markitect infospace loops
# Detected feedback loops (3):
#   Capital Accumulation Loop (positive, S1→S3→S1)
#   Price Equilibration Loop  (balancing, S2)
#   Labour Market Loop        (positive, S1→S2→S1)
```
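
Since a feedback loop is a cycle in the directed relation graph, loop detection reduces to cycle enumeration. The following sketch uses a plain exhaustive depth-first search; it is not the planned `markitect infospace loops` implementation, and `find_cycles` is a hypothetical name. It does not classify loops as positive or balancing, which would need a sign on each edge.

```python
def find_cycles(edges):
    """Enumerate simple cycles in a directed graph given as (source, target) pairs.

    Each cycle is reported once, starting from its alphabetically smallest node.
    Exhaustive search: fine for small graphs, exponential in the worst case.
    """
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)
        graph.setdefault(target, [])

    cycles = []

    def walk(start, node, path):
        for nxt in graph[node]:
            if nxt == start:
                cycles.append(path[:])
            elif nxt > start and nxt not in path:
                # Only visit nodes that sort after the start node, so a cycle
                # is never rediscovered from one of its later members.
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return cycles


# Edges of the first loop above, plus one edge that is not part of any cycle.
edges = [
    ("Capital Accumulation", "Division of Labour"),
    ("Division of Labour", "Productivity"),
    ("Productivity", "Profit Margin"),
    ("Profit Margin", "Capital Accumulation"),
    ("Market Extent", "Division of Labour"),
]
loops = find_cycles(edges)
for loop in loops:
    print(" → ".join(loop + [loop[0]]))
```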

---

## 21. Layer 4 — The Minimal Systemic Model

Layer 4 answers the ultimate question: **what is the smallest set of
elements and relations that can generate Smith's argument from first
principles?**

The hypothesis is that the 988-entity collection can be reduced to a core
of roughly 30–40 elements, 15–25 relations, and 8–12 principles. Everything
else is a refinement, an illustration, or historical context.

### How the core is identified

Three methods work together:

**Graph centrality**: Entities with the highest combined in-degree and
out-degree in the Layer 3 relation graph are candidates. An entity that
many other entities connect to or depend on is structurally load-bearing.

**VSM completeness**: The core must have at least one entity at each VSM
level, and each level must have at least one Element, one Process, and one
Relation. This is the stopping condition — the minimum viable set is the
smallest set that is VSM-complete.

**FCA concept density**: The concept lattice already computed for Layer 1
by Formal Concept Analysis (FCA) identifies which entities co-occur across
the most attributes (domains and chapters). High-density concepts are
likely core entities.
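
The centrality and completeness criteria above could be combined roughly as follows. This is a greedy sketch under stated assumptions: `select_core` and its inputs are hypothetical, completeness is expressed as a set of required `(type, level)` cells, and greedy selection is not guaranteed to find the smallest complete set.

```python
from collections import Counter


def degree_centrality(edges):
    """Combined in-degree and out-degree per entity for (source, target) edges."""
    degree = Counter()
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    return degree


def select_core(entities, edges, required_cells):
    """Add entities in descending degree order until every required cell is covered.

    `entities` maps name -> (entity_type, vsm_level).
    `required_cells` is the set of (entity_type, vsm_level) pairs that define
    completeness. Returns the selected names, or None if the collection
    cannot cover the required cells at all.
    """
    degree = degree_centrality(edges)
    ranked = sorted(entities, key=lambda name: (-degree[name], name))
    core, missing = [], set(required_cells)
    for name in ranked:
        if not missing:
            break  # stopping condition: the selection is complete
        core.append(name)
        missing.discard(entities[name])
    return core if not missing else None


# Toy data (illustrative only).
entities = {
    "Division of Labour": ("Process", "S1"),
    "Labour": ("Element", "S1"),
    "Market": ("Element", "S2"),
    "Pin Factory": ("Element", "S1"),
}
edges = [
    ("Labour", "Division of Labour"),
    ("Market", "Division of Labour"),
    ("Pin Factory", "Division of Labour"),
    ("Labour", "Market"),
]
required = {("Process", "S1"), ("Element", "S1"), ("Element", "S2")}
core = select_core(entities, edges, required)
print(core)
```

In this toy run the low-degree illustration (`Pin Factory`) is left out because the required cells are already covered by better-connected entities.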

### Output: `output/core-model.md`

The final artifact documents the core model with explicit VSM assignment,
named feedback loops, and competency question coverage. The example below
is abridged and its figures are illustrative:

```markdown
# Core Systemic Model — The Wealth of Nations (L4)

## Core Elements
### S1 — Operations
- Labour
- Capital Stock
- Land
- Commodity

### S2 — Coordination
- Market

### S3 — Management
- Banking System

## Core Processes
- Division of Labour (S1)
- Agricultural Production (S1)
- Trade (S1)
- Capital Allocation (S3)

## Core Relations
- Natural Price — centres — Market Price (S2)
- Wages of Labour — allocates — Labour (S2)
- Profit of Stock — allocates — Capital (S2)

## Core Principles
- Invisible Hand (S4)
- Absolute Advantage (S4)
- System of Natural Liberty (S5)

## Feedback Loops
1. Capital Accumulation (positive, S1–S3)
2. Price Equilibration (balancing, S2)
3. Labour Market (positive, S1–S2)

## Viability
VSM coverage: S1 ✓  S2 ✓  S3 ✓  S4 ✓  S5 ✓
Competency questions answered: 6/6
Entities in core: 28 / 988 (3%)
```

### What the core enables

With a validated core model, the infospace becomes far more useful as a
discipline:

- **Composability**: Another infospace can import the WoN core as its
  discipline, knowing that only the load-bearing core entities will be
  injected as context, not all 988.
- **Gap analysis**: New source material can be evaluated against the core:
  does this modern supply chain text engage with Smith's three core
  relations? If not, the analysis is incomplete.
- **Theory comparison**: Two economic theories (Smith and Ricardo, say)
  can be compared at the core level — do they share elements? Where do
  their feedback loops diverge?

---

## 22. Running Layers 2–4 (Planned)

The following commands are planned for a future implementation phase.
They are documented here to describe the intended workflow.

### Layer 2: classify all entities

```bash
# Classify entity types and confirm VSM assignments:
markitect infospace classify --provider openrouter

# Classify a single entity:
markitect infospace classify --entity division-of-labour --provider openrouter

# Review type × VSM coverage matrix:
markitect infospace check type-coverage
```

Expected output:

```
Classifying 988 entities...
  [████████████████████] 988/988

Type distribution:
  Element:     312 (32%)
  Process:     248 (25%)
  Relation:    201 (20%)
  Principle:   142 (14%)
  Institution:  85 (9%)

Type × VSM coverage: 27/30 cells populated
  Missing: Institution/S1, Principle/S3*, Process/S5
```

### Layer 3: extract relations

```bash
# Extract relation triplets (per entity pair or per chapter):
markitect infospace extract-relations --provider openrouter

# View the relation graph:
markitect infospace graph --output output/relations/graph.dot

# Detect feedback loops:
markitect infospace loops
```

### Layer 4: distil the core

```bash
# Identify the minimal viable entity set:
markitect infospace distil --provider openrouter

# Review the core model:
cat output/core-model.md

# Check VSM completeness of the core:
markitect infospace viability --layer 4
```

---

## 23. Layers 2–4 as Composed Infospaces

The cleanest way to implement Layers 2–4 is as **separate infospaces**,
each using the previous layer as its topic and discipline. This is already
supported by the MarkiTect composition model.

```bash
# Layer 2 infospace — using L1 entities as the topic:
mkdir ../won-typed/
cd ../won-typed/
markitect infospace init \
  --topic "WoN Typed Entities" \
  --domain "Ontological Classification" \
  --discipline "Viable System Model"

# Bind the L1 infospace as the source topic:
markitect infospace bind-discipline ../infospace-with-history

# Layer 3 infospace — using L2 typed entities as the topic:
mkdir ../won-relations/
cd ../won-relations/
markitect infospace init \
  --topic "WoN Relation Graph" \
  --domain "Systemic Modelling"

markitect infospace bind-discipline ../won-typed

# Layer 4 infospace — the core model:
mkdir ../won-core/
cd ../won-core/
markitect infospace init \
  --topic "WoN Core Model" \
  --domain "Systemic Modelling"

markitect infospace bind-discipline ../won-relations
```

This structure makes every distillation decision auditable through git
history. A reclassification in L2 (an entity's type changes from Process
to Relation) propagates as a flag on dependent L3 triplets, which in turn
flags the L4 core model for re-evaluation.
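
How this propagation would be implemented is not specified here. As a rough sketch, assuming triplets carry hypothetical ids and the core model records which triplets it relies on, the flagging could look like this:

```python
def propagate_flags(changed_entities, triplets, core_triplet_ids):
    """Flag L3 triplets that mention a reclassified L2 entity, then decide
    whether the L4 core model needs re-evaluation.

    `triplets` maps a triplet id to its (subject, object) entity names.
    `core_triplet_ids` is the set of triplet ids the core model relies on.
    """
    changed = set(changed_entities)
    flagged = {
        triplet_id
        for triplet_id, (subject, obj) in triplets.items()
        if subject in changed or obj in changed
    }
    # Assumption: the core is only flagged when one of its own triplets is affected.
    core_needs_review = bool(flagged & set(core_triplet_ids))
    return sorted(flagged), core_needs_review


# Toy data (illustrative ids and membership).
triplets = {
    "t1": ("Division of Labour", "Market Extent"),
    "t2": ("Capital Stock", "Division of Labour"),
    "t3": ("Natural Price", "Market Price"),
}
flagged, review = propagate_flags(["Division of Labour"], triplets, {"t1", "t3"})
print(flagged, review)
```

Committing the flagged ids together with the reclassification would keep the cause and its downstream effects in the same git history.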

The intellectual history of how a theory was extracted from a text, typed,
connected, and distilled to its minimal core is fully preserved — as a set
of git commits, each with a human-readable rationale.