# Building an Infospace with History — Tutorial

This tutorial walks through how to build a structured **infospace** from
Adam Smith's *The Wealth of Nations*, mapping classical economic concepts
to Stafford Beer's **Viable System Model** (VSM), using MarkiTect's
infospace tooling.

By the end you will understand how to:

1. Declare an infospace with `infospace.yaml` and `markitect infospace init`
2. Design schemas that scaffold structured LLM output
3. Write prompt templates with dependency injection (`@{macro}` syntax)
4. Run an incremental, chapter-by-chapter pipeline
5. Evaluate entity quality and run collection-level checks
6. Review viability against declared thresholds
7. Track every change through git history
8. Use a completed infospace as a discipline for a new project

---

## 1. What Is an Infospace?

An **infospace** is a curated, self-describing collection of **entities**
(concepts, mechanisms, observations) that together explain a **topic**
through the lens of one or more **disciplines**.

| Term | This example |
|---|---|
| Topic | *The Wealth of Nations* (Smith, 1776) |
| Discipline | Viable System Model (Beer) |
| Entities | Economic concepts: division of labour, natural price, … |
| Viability | Does the entity set answer the competency questions? |

The challenge with a large source corpus is that it is too big for a single
prompt. MarkiTect processes it **incrementally**, one chapter at a time,
building up the entity set and tracking progress through git.

An infospace is **viable** when it meets threshold scores across defined
metrics — it is fit for purpose as an explanatory tool.

---

## 2. Project Layout

```
examples/infospace-with-history/
│
├── infospace.yaml                # Declarative infospace configuration (NEW)
├── README.md
├── TUTORIAL.md                   # This file
├── INFRA-TASKS.md                # Infrastructure issues found during the experiment
├── process_chapters.py           # Pipeline script (chapter processing)
├── infospace.db                  # SQLite artifact database (generated, not in git)
│
├── schemas/                      # Output structure definitions
│   ├── economic-entity-schema-v1.0.md
│   ├── vsm-concept-schema-v1.0.md
│   ├── vsm-mapping-schema-v1.0.md
│   └── chapter-analysis-schema-v1.0.md
│
├── templates/                    # Prompt templates (with @{macro} placeholders)
│   ├── extract-entities.md
│   ├── map-to-vsm.md
│   ├── synthesize-analysis.md
│   └── assess-metrics.md
│
├── artifacts/                    # Input artifacts
│   ├── sources/                  # Chapter text (35 files)
│   ├── guidelines/               # Extraction and mapping rules
│   └── vsm-reference/            # VSM framework definition
│
└── output/                       # Generated artifacts (LLM outputs)
    ├── entities/                 # Flat canonical entity set + chapter views
    │   ├── division-of-labour.md          # Canonical entity file (PRIMARY)
    │   ├── exchange.md
    │   ├── book-1-chapter-01-entities.md  # Chapter view (transclusion)
    │   └── ...
    ├── mappings/                 # Per-chapter VSM mappings
    ├── analyses/                 # Per-chapter synthesised analyses
    └── metrics/                  # Collection metrics + history
        ├── metrics.yaml          # Latest metric values
        └── history.yaml          # Timestamped snapshot log
```

**Entity organisation**: The infospace maintains a **flat canonical set**
of entities — one markdown file per entity in `output/entities/`. Duplicate
slugs across chapters are skipped (first occurrence wins). Per-chapter
`*-entities.md` files are **secondary views** using transclusion directives
(`{{ include "entity.md" }}`), so editing a canonical file updates every
chapter view that references it.

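To make the transclusion idea concrete, here is a minimal sketch of how a chapter view could be expanded into full text. It assumes the directive has exactly the `{{ include "file.md" }}` form quoted above; the function name `expand_view` and the in-memory dictionary of canonical files are illustrative only and are not MarkiTect's actual resolver.

```python
import re

# Assumed directive form, taken from the tutorial text: {{ include "file.md" }}
INCLUDE_RE = re.compile(r'\{\{\s*include\s+"([^"]+)"\s*\}\}')


def expand_view(view_text: str, canonical: dict[str, str]) -> str:
    """Replace each include directive with the body of the named canonical file."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in canonical:
            raise KeyError(f"no canonical entity file named {name!r}")
        return canonical[name].rstrip("\n")
    return INCLUDE_RE.sub(substitute, view_text)


# Placeholder canonical files (hypothetical content).
canonical_files = {
    "division-of-labour.md": "# Division of Labour\n\nPlaceholder definition.\n",
    "exchange.md": "# Exchange\n\nPlaceholder definition.\n",
}
chapter_view = '{{ include "division-of-labour.md" }}\n\n{{ include "exchange.md" }}\n'
expanded = expand_view(chapter_view, canonical_files)
print(expanded)
```

Because the view only stores references, editing `division-of-labour.md` changes what every view expands to without touching the view files themselves.
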

---

## 3. Initialising an Infospace

### Starting fresh

Use `markitect infospace init` to create an `infospace.yaml`:

```bash
cd my-new-infospace/
markitect infospace init \
  --topic "The Wealth of Nations" \
  --domain "Classical Economics" \
  --sources artifacts/sources/ \
  --discipline "Viable System Model"
```

This creates a minimal `infospace.yaml`. Edit it to add schemas,
competency questions, and viability thresholds:

```yaml
topic:
  name: "The Wealth of Nations"
  domain: "Classical Economics"
  sources: artifacts/sources/

disciplines:
  - name: "Viable System Model"
    path: artifacts/vsm-reference/

schemas:
  entity: schemas/economic-entity-schema-v1.0.md
  mapping: schemas/vsm-mapping-schema-v1.0.md
  analysis: schemas/chapter-analysis-schema-v1.0.md

competency_questions: |
  1. How does Smith's division of labour map to VSM System 1 operations?
  2. What mechanisms in WoN correspond to VSM coordination (System 2)?
  3. Where does Smith describe self-organising regulation (System 3)?
  4. What role does the "invisible hand" play as a System 4 mechanism?
  5. How do Smith's views on government map to System 5 policy?
  6. Is the WoN entity set viable as an explanatory framework?

viability:
  redundancy_ratio: { max: 0.10 }
  coverage_ratio: { min: 0.50 }
  coherence_components: { max: 3 }
  consistency_cycles: { max: 0 }
  granularity_entropy: { min: 1.0 }

pipeline:
  stages:
    - name: extract-entities
      template: templates/extract-entities.md
    - name: map-to-vsm
      template: templates/map-to-vsm.md
    - name: synthesize-analysis
      template: templates/synthesize-analysis.md
```

### Checking status

At any point, inspect the infospace:

```bash
markitect infospace status
# Infospace: The Wealth of Nations
# Domain: Classical Economics
# Entities: 109
# Domains: Production, Distribution, Exchange, Regulation
# Disciplines: Viable System Model
# Chapters: 9/35 processed

markitect infospace entities
# Lists all entities with domain, source chapter, word count
```

---

## 4. Designing Schemas

Before writing any prompts, define **schemas** — markdown documents that
tell the LLM exactly what sections each output must contain. Schemas are
not code; the LLM reads them as instructions.

### Economic Entity Schema (`schemas/economic-entity-schema-v1.0.md`)

Every extracted entity must have:

- **H1 heading** with the entity name (title case)
- **Definition** (20–150 words, precise and non-circular)
- **Source Chapter** citing Book and Chapter
- **Context** — where in Smith's argument the entity appears
- **Economic Domain** (Production, Distribution, Exchange, etc.)

Optional: Smith's Original Wording, Modern Interpretation.

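Although the schema itself is read by the LLM rather than executed, the requirements above are easy to spot-check after generation. The following is a rough sketch of such a check. It assumes each required section appears as an H2 heading with exactly the name listed above, which the tutorial does not specify; `validate_entity` and `split_sections` are hypothetical helpers, not part of MarkiTect.

```python
# Required sections from the tutorial's entity schema; representing them as
# H2 headings is an assumption of this sketch.
REQUIRED_SECTIONS = ["Definition", "Source Chapter", "Context", "Economic Domain"]


def split_sections(markdown: str) -> dict[str, str]:
    """Map each H2 heading to the text that follows it."""
    sections: dict[str, str] = {}
    current = None
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return sections


def validate_entity(markdown: str) -> list[str]:
    """Return a list of schema problems; an empty list means the file passes."""
    problems = []
    lines = markdown.strip().splitlines()
    if not lines or not lines[0].startswith("# "):
        problems.append("missing H1 heading with the entity name")
    sections = split_sections(markdown)
    for name in REQUIRED_SECTIONS:
        if not sections.get(name, "").strip():
            problems.append(f"missing or empty section: {name}")
    definition = sections.get("Definition", "").strip()
    words = len(definition.split())
    if definition and not 20 <= words <= 150:
        problems.append(f"definition has {words} words, expected 20-150")
    return problems


# Placeholder entity with a 25-word definition (hypothetical content).
sample = (
    "# Division Of Labour\n\n"
    "## Definition\n" + " ".join(["word"] * 25) + "\n\n"
    "## Source Chapter\nBook I, Chapter 1\n\n"
    "## Context\nOpening argument of Book I.\n\n"
    "## Economic Domain\nProduction\n"
)
print(validate_entity(sample))
```

A check like this only catches structural omissions; whether a definition is precise and non-circular still needs the evaluation step described later.
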
### VSM Mapping Schema (`schemas/vsm-mapping-schema-v1.0.md`)

Every entity-to-VSM mapping must have:

- **H1 heading**: `Entity Name -> VSM Concept Name`
- **Economic Entity Reference** and **VSM Concept Reference**
- **Mapping Rationale** (minimum 30 words, grounded in Beer's definitions)
- **Mapping Strength**: Strong, Moderate, or Weak

### Chapter Analysis Schema (`schemas/chapter-analysis-schema-v1.0.md`)

The per-chapter synthesis includes:

- **Chapter Summary** (50–300 words)
- **Entities Extracted** — bulleted list
- **VSM Mappings** — entity, concept, strength
- **VSM Coverage** — explicit assessment of S1 through S5 and S3*
- **Gaps & Observations**

**Key insight**: Schemas are artifacts — they live in the repository and
can be versioned, diffed, and refined just like code. Improving a schema
and re-processing a chapter is visible as a git diff.

---

## 5. Writing Prompt Templates

Each template is a markdown file with `@{macro_name}` placeholders that
MarkiTect's resolver fills with artifact content at compile time.

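As an illustration of what "filling placeholders" means, the sketch below substitutes `@{macro_name}` tokens from a dictionary. It is not MarkiTect's resolver: the set of characters allowed in a macro name and the choice to fail on unknown names are assumptions made for this example.

```python
import re

# Matches @{macro_name}; the allowed name characters are an assumption.
MACRO_RE = re.compile(r"@\{([A-Za-z_][A-Za-z0-9_]*)\}")


def compile_prompt(template: str, artifacts: dict[str, str]) -> str:
    """Fill every @{name} placeholder, failing loudly on unknown names."""
    missing = sorted({m for m in MACRO_RE.findall(template) if m not in artifacts})
    if missing:
        raise KeyError(f"unresolved macros: {', '.join(missing)}")
    # A function replacement avoids backslash-escape surprises in artifact text.
    return MACRO_RE.sub(lambda match: artifacts[match.group(1)], template)


template = (
    "## Source Chapter\n@{chapter_text}\n\n"
    "## Extraction Guidelines\n@{extraction_rules}\n"
)
prompt = compile_prompt(
    template,
    {"chapter_text": "Placeholder chapter text.", "extraction_rules": "Placeholder rules."},
)
print(prompt)
```

Failing on an unresolved macro, rather than leaving the token in place, keeps a half-compiled prompt from being sent to the LLM unnoticed.
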
### Template 1: Extract Entities (`templates/extract-entities.md`)

```markdown
# Extract Economic Entities

You are an analytical economist specialising in classical economic theory.
Your task is to extract distinct economic entities from a chapter of
Adam Smith's *The Wealth of Nations*.

## Source Chapter
@{chapter_text}

## Extraction Guidelines
@{extraction_rules}

## VSM Framework Context
@{vsm_framework}

## Existing Entities
@{existing_entities}

## Output Format
Output each entity delimited by `--- ENTITY: <entity-name> ---` markers.
```

The `@{existing_entities}` macro is generated at runtime from canonical
files already on disk, enabling incremental extraction without duplication.

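The delimiter format in the template makes the LLM response straightforward to split into per-entity files. The sketch below shows one way to do that and to apply the first-occurrence-wins rule against slugs already on disk. The slug rule in `slugify` is a guess (lowercase, hyphen-separated), and none of these helper names come from `process_chapters.py`.

```python
import re

# Marker format from the template's Output Format section.
ENTITY_RE = re.compile(r"^--- ENTITY: (.+?) ---[ \t]*$", re.MULTILINE)


def slugify(name: str) -> str:
    """Hypothetical slug rule: lowercase, non-alphanumeric runs become hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def split_entities(llm_output: str) -> dict[str, str]:
    """Return {slug: body} for each delimited entity, in order of appearance."""
    # re.split with one capture group yields [preamble, name1, body1, name2, body2, ...]
    parts = ENTITY_RE.split(llm_output)
    entities: dict[str, str] = {}
    for name, body in zip(parts[1::2], parts[2::2]):
        # setdefault keeps the first body when a slug repeats within one response.
        entities.setdefault(slugify(name), body.strip())
    return entities


def new_entities(llm_output: str, existing_slugs: set[str]) -> dict[str, str]:
    """Drop entities whose slug is already canonical (first occurrence wins)."""
    return {s: b for s, b in split_entities(llm_output).items() if s not in existing_slugs}


# Placeholder LLM response (hypothetical content).
sample = (
    "--- ENTITY: Division of Labour ---\n# Division Of Labour\nBody A\n"
    "--- ENTITY: Natural Price ---\n# Natural Price\nBody B\n"
)
fresh = new_entities(sample, existing_slugs={"division-of-labour"})
print(list(fresh))  # ['natural-price']
```
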
### Template 2: Map to VSM (`templates/map-to-vsm.md`)

Inputs: `@{entities}`, `@{vsm_framework}`, `@{mapping_rules}`.

### Template 3: Synthesise Analysis (`templates/synthesize-analysis.md`)

Inputs: `@{chapter_text}`, `@{entities}`, `@{mappings}`, `@{vsm_framework}`.

### Template 4: Assess Metrics (`templates/assess-metrics.md`)

Inputs: `@{all_analyses}` (all chapter analyses concatenated), `@{vsm_framework}`.
Runs across the entire infospace, not per-chapter.

**Dependency chain per chapter:**

```
chapter_text ──────┐
extraction_rules ──┤
vsm_framework ─────┤
                   ▼
            extract-entities
                   │
                   ▼ entities
               map-to-vsm
                   │
                   ▼ mappings
          synthesize-analysis
                   │
                   ▼ analysis
```

---

## 6. Populating Artifacts

### Source chapters (`artifacts/sources/`)

35 markdown files with the full public-domain text of each chapter.
The files are named `book-1-chapter-01.md` through `book-5-chapter-03.md`.

### Guidelines (`artifacts/guidelines/`)

- **`extraction-rules.md`** — What constitutes an entity, granularity
  rules, naming conventions.
- **`mapping-rules.md`** — How to map entities to VSM systems, what
  constitutes Strong/Moderate/Weak strength.

### VSM reference (`artifacts/vsm-reference/`)

- **`vsm-framework.md`** — Complete description of Beer's VSM (S1–S5,
  S3*, recursion, variety, viability, algedonic signals, autonomy) with
  economic interpretations.

---

## 7. Processing Chapters

`process_chapters.py` orchestrates the three-stage pipeline. It initialises
the artifact repository, loads static artifacts, runs entity extraction →
VSM mapping → analysis synthesis, and commits each chapter to git.

### Single chapter

```bash
# Manual mode (writes prompts, awaits output files):
python process_chapters.py --chapter book-1-chapter-05 --no-commit

# Auto mode via OpenRouter (free models available):
python process_chapters.py --chapter book-1-chapter-05 --provider openrouter

# With a specific free model:
python process_chapters.py --chapter book-1-chapter-05 \
  --provider openrouter --model meta-llama/llama-4-maverick:free
```

### Whole book or all chapters

```bash
python process_chapters.py --book 1 --provider openrouter
python process_chapters.py --all --provider openrouter
```

### Check progress

```bash
python process_chapters.py --list
```

```
Available chapters (35):

Chapter                        Entities     Mappings     Analysis
------------------------------ ------------ ------------ ------------
book-1-chapter-01              done (13)    done         done
book-1-chapter-02              done (7)     done         done
...

Canonical entity set: 109 unique entities
```

### Entity lifecycle

Entities in the canonical set are **never silently deleted**. Retire
an entity by archiving it with a documented reason:

```bash
python process_chapters.py --archive-entity enlarged-monopoly \
  --reason "Subsumed by monopoly-price — same market distortion"
```

The archived file moves to `output/entities/archive/<slug>.md` with a
dated header, preserving the intellectual history of every decision.

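For readers adapting the pattern to their own pipeline script, the archive step can be sketched in a few lines. The header layout below (an HTML comment with the date and reason) is invented for illustration; the tutorial only states that the archived file carries a dated header, and `archive_entity` is not the function used by `process_chapters.py`.

```python
import datetime
import pathlib
import tempfile


def archive_entity(entities_dir: pathlib.Path, slug: str, reason: str) -> pathlib.Path:
    """Move <slug>.md into archive/ and prepend a dated header with the reason."""
    source = entities_dir / f"{slug}.md"
    if not source.exists():
        raise FileNotFoundError(source)
    archive_dir = entities_dir / "archive"
    archive_dir.mkdir(exist_ok=True)
    target = archive_dir / f"{slug}.md"
    # Header layout is hypothetical; the tutorial only says it is dated.
    header = f"<!-- archived {datetime.date.today().isoformat()}: {reason} -->\n\n"
    target.write_text(header + source.read_text(encoding="utf-8"), encoding="utf-8")
    source.unlink()
    return target


# Demonstration in a throwaway directory with placeholder content.
with tempfile.TemporaryDirectory() as tmp:
    entities = pathlib.Path(tmp)
    (entities / "enlarged-monopoly.md").write_text("# Enlarged Monopoly\n", encoding="utf-8")
    archived = archive_entity(entities, "enlarged-monopoly", "Subsumed by monopoly-price")
    archived_text = archived.read_text(encoding="utf-8")
    still_canonical = (entities / "enlarged-monopoly.md").exists()
print(archived_text)
```

Writing the archived copy before removing the original means an interrupted run leaves a duplicate rather than a missing entity.
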

---

## 8. Evaluating Entity Quality

Once chapters are processed, evaluate the entity set using the infospace
tooling commands.

### Per-entity evaluation

```bash
# Evaluate all entities (requires LLM provider):
markitect infospace evaluate --provider openrouter

# Evaluate entities from a specific chapter:
markitect infospace evaluate --chapter book-1-chapter-05 --provider openrouter

# Re-evaluate a single entity:
markitect infospace evaluate --entity division-of-labour --provider openrouter
```

This runs the `evaluate-entity` prompt template against each entity,
scoring dimensions like definition precision, source grounding, and
VSM relevance. Results are written to `output/evaluations/`.

### Collection-level checks (C1–C5)

```bash
# Run all five collection checks:
markitect infospace check --provider openrouter

# Run individual checks:
markitect infospace check redundancy    # C1: Are any entities synonymous?
markitect infospace check coverage      # C2: Which domain × VSM cells are empty?
markitect infospace check coherence     # C3: Is the entity graph well-connected?
markitect infospace check consistency   # C4: Are there circular definitions?
markitect infospace check granularity   # C5: Is abstraction level balanced?
```

Each check uses the platform's embedding, graph analysis, and FCA
infrastructure. Results are written to `output/metrics/` and a new
snapshot is appended to `output/metrics/history.yaml`.

Sample output:

```
Running collection checks on 109 entities...

C1 — redundancy
  redundancy_ratio: 0.0183
  high_similarity_pairs: 2

C2 — coverage
  coverage_ratio: 0.4286
  empty_cells: [['Regulation', 'S3*'], ['Historical', 'S5']]

C3 — coherence
  coherence_components: 1
  modularity: 0.412

C4 — consistency
  consistency_cycles: 0
  grounding_ratio: 0.94

C5 — granularity
  granularity_entropy: 2.69
```

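The tutorial does not define the metrics formally, so the following is only a sketch of plausible definitions for two of them: coverage as the fraction of non-empty domain × VSM cells, and granularity as the Shannon entropy of a label distribution. The VSM column list, the tagging of entities with `(domain, system)` pairs, and both function names are assumptions. As a side observation, 2 high-similarity pairs over 109 entities gives 2/109 ≈ 0.0183, which is consistent with the reported `redundancy_ratio`, though the source does not confirm that this is how the tool computes it.

```python
import math
from collections import Counter

# Hypothetical grid columns: the VSM systems named in the tutorial.
VSM_SYSTEMS = ["S1", "S2", "S3", "S3*", "S4", "S5"]


def coverage_ratio(tags: list[tuple[str, str]], domains: list[str]) -> float:
    """Fraction of domain x VSM cells holding at least one entity."""
    filled = {(d, s) for d, s in tags if d in domains and s in VSM_SYSTEMS}
    return len(filled) / (len(domains) * len(VSM_SYSTEMS))


def shannon_entropy(labels: list[str]) -> float:
    """Entropy in bits of a label distribution (e.g. abstraction levels)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Placeholder data: three entities, two distinct cells, two domains.
example_tags = [("Production", "S1"), ("Production", "S1"), ("Exchange", "S2")]
example_coverage = coverage_ratio(example_tags, ["Production", "Exchange"])
example_entropy = shannon_entropy(["concept", "concept", "mechanism", "mechanism"])
print(round(example_coverage, 4), example_entropy)
```

Under these definitions, coverage can only grow as chapters add entities to new cells, while entropy rewards an even spread across abstraction levels rather than sheer entity count.
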

---

## 9. Reviewing Viability

```bash
markitect infospace viability
```

Compares the latest metrics against the thresholds declared in
`infospace.yaml`:

```
Metric                    Value     Threshold    Status
-----------------------------------------------------------
redundancy_ratio          0.0183    max=0.10     PASS
coverage_ratio            0.4286    min=0.50     FAIL
coherence_components      1         max=3        PASS
consistency_cycles        0         max=0        PASS
granularity_entropy       2.6900    min=1.0      PASS

Viable: NO (4/5 thresholds met)
```

Coverage is currently failing (0.4286, about 43%, against the 50%
threshold) because only 9 of 35 chapters have been processed. Coverage
should rise as the remaining chapters are processed.

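The comparison itself is simple enough to reproduce by hand. The sketch below applies the thresholds from the `viability:` block of `infospace.yaml` to the metric values from the sample check output. Treating the bounds as inclusive is an assumption, chosen because `consistency_cycles` at 0 passes against `max: 0` in the dashboard; the function names are illustrative.

```python
# Thresholds as declared under `viability:` in infospace.yaml.
THRESHOLDS = {
    "redundancy_ratio": {"max": 0.10},
    "coverage_ratio": {"min": 0.50},
    "coherence_components": {"max": 3},
    "consistency_cycles": {"max": 0},
    "granularity_entropy": {"min": 1.0},
}

# Metric values from the sample check output in this tutorial.
METRICS = {
    "redundancy_ratio": 0.0183,
    "coverage_ratio": 0.4286,
    "coherence_components": 1,
    "consistency_cycles": 0,
    "granularity_entropy": 2.69,
}


def passes(value: float, bound: dict[str, float]) -> bool:
    """A metric passes when it respects every declared min/max bound (inclusive)."""
    if "min" in bound and value < bound["min"]:
        return False
    if "max" in bound and value > bound["max"]:
        return False
    return True


def viability(metrics: dict[str, float], thresholds: dict[str, dict[str, float]]):
    """Return per-metric pass/fail results and the overall verdict."""
    results = {name: passes(metrics[name], bound) for name, bound in thresholds.items()}
    return results, all(results.values())


results, viable = viability(METRICS, THRESHOLDS)
for name, ok in results.items():
    print(f"{name:<24}{'PASS' if ok else 'FAIL'}")
print(f"Viable: {'YES' if viable else 'NO'} ({sum(results.values())}/{len(results)} thresholds met)")
```

The infospace counts as viable only when every threshold is met, so a single failing metric such as coverage is enough to produce `Viable: NO`.
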
### Metrics history

```bash
markitect infospace history
```

Shows how metrics evolved across runs:

```
Snapshot    Date          Entities   coverage   redundancy   entropy
-------------------------------------------------------------
6ba48eb2    2026-02-19    85         0.361      0.000        2.687
```

---

## 10. Tracking History with Git

Every processed chapter produces one git commit containing:

- Compiled prompts (`*-prompt.md`) — audit what was sent to the LLM
- Canonical entity files (`output/entities/<slug>.md`) — first occurrence wins
- Chapter entity views (`<chapter>-entities.md`) — transclusion references
- Generated outputs (`*-mappings.md`, `*-analysis.md`)

This means:

- `git log` shows the chronological order of processing
- `git diff` between commits shows what each chapter contributed
- You can `git bisect` to find where quality degraded
- You can revert a chapter and re-process with improved guidelines

The `clean-example-history` branch in this repository demonstrates the
intended structure: each chapter is a single, self-contained commit.
Use it as a reference for how the infospace grew step by step.

To commit manually after reviewing:

```bash
python process_chapters.py --chapter book-1-chapter-05 --provider openrouter --no-commit
# review output/entities/ and output/mappings/
git add examples/infospace-with-history/output/
git commit -m "infospace: process book-1-chapter-05"
```

---

## 11. Cost and Performance

| | OpenRouter (free) | OpenRouter (paid) | Gemini (free) |
|---|---|---|---|
| Time per chapter | ~5 min | ~2 min | ~45 sec |
| Cost per chapter | $0.00 | ~$0.07 | $0.00 |
| Default model | `arcee-ai/trinity-large-preview:free` | `anthropic/claude-sonnet-4` | `gemini-2.5-flash` |
| Rate limits | ~200 req/day | High | Per-minute |

**OpenRouter free tier**: Sign up at [openrouter.ai](https://openrouter.ai)
(no credit card required). Store your key in `apikey-openrouter.txt` in the
project root (git-ignored), or set `OPENROUTER_API_KEY`.

```bash
export OPENROUTER_API_KEY="$(tr -d '[:space:]' < apikey-openrouter.txt)"
```

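A pipeline script can apply the same "environment variable or key file" rule itself. The sketch below is one way to do so; `load_api_key` is a hypothetical helper, and the precedence (environment first, file second) is an assumption rather than documented behaviour of `process_chapters.py`.

```python
import os
import pathlib
import tempfile


def load_api_key(env_var: str, key_file: pathlib.Path) -> str:
    """Prefer the environment variable, then fall back to the key file."""
    key = os.environ.get(env_var, "").strip()
    if not key and key_file.exists():
        # Drop all whitespace, mirroring `tr -d '[:space:]'` in the shell example.
        key = "".join(key_file.read_text(encoding="utf-8").split())
    if not key:
        raise RuntimeError(f"set {env_var} or create {key_file}")
    return key


# Demonstration with a throwaway key file and a placeholder key.
with tempfile.TemporaryDirectory() as tmp:
    key_path = pathlib.Path(tmp) / "apikey-openrouter.txt"
    key_path.write_text("  sk-placeholder\n", encoding="utf-8")
    # EXAMPLE_UNSET_KEY is a made-up variable name so the demo reads the file.
    api_key = load_api_key("EXAMPLE_UNSET_KEY", key_path)
print(api_key)  # sk-placeholder
```
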
Use `openrouter/free` to automatically select whichever free model is
currently available:

```bash
python process_chapters.py --chapter book-1-chapter-05 \
  --provider openrouter --model openrouter/free
```

**Gemini free tier**: Get a key at [aistudio.google.com/apikey](https://aistudio.google.com/apikey)
and store it in `apikey-geminifree.txt`.

Note: The `claude-code` provider (Claude CLI subprocess) is not available
when running inside a Claude Code session due to nested session restrictions.

---

## 12. Completing the Remaining Chapters

At the time of writing, 9 of 35 chapters are processed (Book I, Chapters 1–9).

**Process the remainder of Book I:**

```bash
export OPENROUTER_API_KEY="$(tr -d '[:space:]' < apikey-openrouter.txt)"
git checkout clean-example-history
python process_chapters.py --book 1 --provider openrouter
```

Already-processed chapters are skipped because their chapter view files
already exist. The `@{existing_entities}` macro ensures the LLM only
extracts genuinely new entities.

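The skip rule described above amounts to a file-existence check. Here is a minimal sketch of it, assuming the naming shown in the project layout (`<chapter>.md` sources and `<chapter>-entities.md` views); `pending_chapters` is an illustrative name and not a function from `process_chapters.py`.

```python
import pathlib
import tempfile


def pending_chapters(sources_dir: pathlib.Path, entities_dir: pathlib.Path) -> list[str]:
    """Chapters with a source file but no `<chapter>-entities.md` view yet."""
    chapters = sorted(p.stem for p in sources_dir.glob("book-*-chapter-*.md"))
    return [c for c in chapters if not (entities_dir / f"{c}-entities.md").exists()]


# Demonstration in a throwaway directory: chapter 01 is done, chapter 02 is not.
with tempfile.TemporaryDirectory() as tmp:
    sources = pathlib.Path(tmp) / "sources"
    entities = pathlib.Path(tmp) / "entities"
    sources.mkdir()
    entities.mkdir()
    for name in ("book-1-chapter-01.md", "book-1-chapter-02.md"):
        (sources / name).write_text("placeholder chapter text\n", encoding="utf-8")
    (entities / "book-1-chapter-01-entities.md").write_text("placeholder view\n", encoding="utf-8")
    todo = pending_chapters(sources, entities)
print(todo)  # ['book-1-chapter-02']
```

This is also why deleting a chapter's view file is enough to make the pipeline process that chapter again.
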
**Process Books II–V:**

```bash
python process_chapters.py --book 2 --provider openrouter
python process_chapters.py --book 3 --provider openrouter
python process_chapters.py --book 4 --provider openrouter
python process_chapters.py --book 5 --provider openrouter
```

**Run collection checks after each book:**

```bash
markitect infospace check --provider openrouter
markitect infospace viability
```

**Expected progression:**

| After | Chapters | Expected coverage |
|-------|----------|-------------------|
| Book I (11 ch.) | 11/35 | S1, S2, S4 strong; S3 emerging |
| Books I–II (16 ch.) | 16/35 | S3 (capital control) covered |
| Books I–III (20 ch.) | 20/35 | Historical patterns add depth |
| Books I–IV (30 ch.) | 30/35 | S5 (policy, mercantilism) emerging |
| All (35 ch.) | 35/35 | Full coverage; S3* and algedonic signals from Book V |

---

## 13. Using the Infospace as a Discipline

A completed, viable infospace can itself become a **discipline** — a lens
applied to a new topic. For example, the Wealth of Nations infospace could
be applied to analyse a modern supply chain.

```bash
# In a new infospace directory:
markitect infospace init \
  --topic "Modern Supply Chain Management" \
  --domain "Operations Research" \
  --discipline "Wealth of Nations"

# Bind the WoN infospace as a discipline:
markitect infospace bind-discipline ../infospace-with-history

# List bound disciplines and their viability:
markitect infospace disciplines
# Viable System Model PASS (from vsm-reference/)
# Wealth of Nations PASS (from ../infospace-with-history)

# Check for stale mappings after discipline update:
markitect infospace stale-mappings
```

The discipline infospace must be viable (meeting its own thresholds)
before it can be used as a lens. If the discipline's entities change,
dependent mappings are flagged for re-evaluation.

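The tutorial does not say how staleness is detected. One common approach, sketched here purely as an illustration, is to record a content hash of the discipline entity when a mapping is created and to flag the mapping when the current hash differs. The data shapes and the names `fingerprint` and `stale_mappings` are hypothetical and should not be read as the behaviour of `markitect infospace stale-mappings`.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Stable content hash of an entity file."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def stale_mappings(
    mappings: dict[str, tuple[str, str]], discipline: dict[str, str]
) -> list[str]:
    """Return mapping ids whose discipline entity changed or disappeared.

    mappings: mapping id -> (discipline entity slug, fingerprint recorded at mapping time)
    discipline: entity slug -> current entity text
    """
    stale = []
    for mapping_id, (slug, recorded) in mappings.items():
        current = discipline.get(slug)
        if current is None or fingerprint(current) != recorded:
            stale.append(mapping_id)
    return stale


# Placeholder data: the natural-price entity was edited after mapping m2 was made.
old_text = {"division-of-labour": "Definition v1", "natural-price": "Definition v1"}
recorded = {
    "m1": ("division-of-labour", fingerprint(old_text["division-of-labour"])),
    "m2": ("natural-price", fingerprint(old_text["natural-price"])),
}
new_text = {"division-of-labour": "Definition v1", "natural-price": "Definition v2"}
stale = stale_mappings(recorded, new_text)
print(stale)  # ['m2']
```
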

---

## 14. Quality Improvement Loop

The infospace is designed to be **iteratively refined**:

1. **Process chapters** — run the pipeline
2. **Evaluate** — `markitect infospace evaluate --provider openrouter`
3. **Check** — `markitect infospace check --provider openrouter`
4. **Review viability** — `markitect infospace viability`
5. **Refine guidelines** — update `extraction-rules.md` or
   `mapping-rules.md` to address identified weaknesses
6. **Re-process** — delete output files for specific chapters and re-run
7. **Compare** — `git diff` shows how refined guidelines changed the output

Example: if checks show S3* (Audit) is consistently missing, add a
paragraph to `extraction-rules.md` explicitly asking the LLM to look for
audit, inspection, and oversight mechanisms.

To re-process a specific chapter:

```bash
rm -f output/entities/book-1-chapter-03-entities.md
rm -f output/mappings/book-1-chapter-03-mappings.md
rm -f output/analyses/book-1-chapter-03-analysis.md
python process_chapters.py --chapter book-1-chapter-03 --provider openrouter
```

Never silently delete canonical entity files. Archive them instead:

```bash
python process_chapters.py --archive-entity extent-of-the-market \
  --reason "Subsumed by market-price and effectual-demand"
python process_chapters.py --chapter book-1-chapter-03 --provider openrouter
```

---

## 15. The Artifact Database (`infospace.db`)

The pipeline stores all artifacts and dependency edges in a local SQLite
database — `infospace.db`. This file is **not committed to git** because
it is fully derived from the markdown files that are tracked.

To regenerate it after a fresh clone (no LLM calls needed):

```bash
python process_chapters.py --all --no-commit
```

---

## 16. Adapting This Pattern to Your Own Project

To build your own infospace:

1. `markitect infospace init --topic "..." --domain "..." --discipline "..."`
2. Write schemas defining required sections for each output type
3. Write extraction guidelines that tell the LLM what to look for
4. Create prompt templates using `@{macro}` syntax
5. Populate `artifacts/sources/` with your source corpus
6. Run `process_chapters.py` (or your equivalent pipeline script)
7. Evaluate with `markitect infospace evaluate` and `markitect infospace check`
8. Review `markitect infospace viability` against your thresholds
9. Iterate: refine guidelines, re-process, re-evaluate
10. Once viable, use as a discipline for a new infospace

The key insight is that **schemas and guidelines are artifacts** — they
live in the repository and can be versioned and diffed just like code.
Every refinement decision is traceable through git history.