
tegwick d1c6e53754 docs: add infospace primitives reference (S2.7)

Reference document covering all infospace tooling primitives: config,
entity metadata, schema validation, per-entity evaluation, collection
checks, metrics history, viability, composition, and CLI commands.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-19 02:05:09 +01:00


Infospace Primitives Reference

This document describes the primitives provided by the markitect/infospace/ package for creating, evaluating, maintaining, and composing infospaces.

Core Concepts

An infospace is a structured, evaluable, composable collection of entities that explains a topic through the lens of one or more disciplines.

| Term | Meaning |
|------|---------|
| Topic | The subject matter being explained |
| Discipline | A reusable framework of concepts applied as an analytical lens |
| Entity | The atomic unit of knowledge — slug, definition, provenance, domain |
| Evaluation | Per-entity or collection-level quality assessment |
| Viability | Whether an infospace meets its threshold scores |

Configuration (`infospace.yaml`)

Every infospace is declared via an infospace.yaml file. The configuration model is defined in markitect/infospace/config.py.

Minimal example

topic:
  name: "The Wealth of Nations"
  domain: "Classical Economics"
  sources: artifacts/sources/

disciplines:
  - name: "Viable System Model"
    path: artifacts/vsm-reference/

schemas:
  entity: schemas/economic-entity-schema-v1.0.md

viability:
  coverage_ratio: { min: 0.60 }
  redundancy_ratio: { max: 0.05 }
  per_entity_mean: { min: 3.5 }

Key models

TopicConfig — name, domain, sources
DisciplineBinding — name, path (to another infospace directory)
SchemaRegistry — entity, mapping, analysis schema paths
ViabilityThreshold — metric, min, max bounds
PipelineConfig — Ordered list of PipelineStage entries
InfospaceConfig — Top-level config combining all of the above

Default directories

| Setting | Default |
|---------|---------|
| `entities_dir` | `output/entities` |
| `evaluations_dir` | `output/evaluations` |
| `metrics_dir` | `output/metrics` |

Entity Metadata

Entities are parsed from markdown files by markitect/infospace/entity_parser.py.

EntityMeta fields: slug, title, definition, domain, source_chapter, context, original_wording, modern_interpretation, definition_word_count, total_word_count, section_slugs.

from pathlib import Path

from markitect.infospace import parse_entity_directory

# Parse every entity markdown file in the default entities directory
entities = parse_entity_directory(Path("output/entities"))

Schema Validation

Entity files are validated deterministically against structural schemas.

from markitect.infospace import validate_entity, ECONOMIC_ENTITY_SCHEMA

# entity_meta is assumed to be a single parsed EntityMeta (see Entity Metadata)
result = validate_entity(entity_meta, schema=ECONOMIC_ENTITY_SCHEMA)
print(result.summary())

The checks cover section presence, word count ranges, heading format, and enum values.
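As an illustration of what deterministic structural checks can look like, the sketch below implements two of them (section presence and a word count range) with the standard library only. It does not use the markitect API: `SimpleSchema`, `check_entity`, the section names, and the 10 to 120 word range are all hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class SimpleSchema:
    """Hypothetical stand-in for a structural schema (not the markitect class)."""
    required_sections: list[str]
    # Inclusive (min, max) word count range for the definition section (assumed values).
    definition_words: tuple[int, int] = (10, 120)


def check_entity(sections: dict[str, str], schema: SimpleSchema) -> list[str]:
    """Return human-readable violations; an empty list means the entity is valid."""
    errors = []
    # Section presence: every required section must exist and be non-empty.
    for name in schema.required_sections:
        if not sections.get(name, "").strip():
            errors.append(f"missing section: {name}")
    # Word count range: only checked when a definition is present.
    definition = sections.get("definition", "")
    if definition.strip():
        count = len(definition.split())
        low, high = schema.definition_words
        if not low <= count <= high:
            errors.append(f"definition has {count} words, expected {low}-{high}")
    return errors


schema = SimpleSchema(required_sections=["definition", "context"])
entity = {"definition": "The separation of work into specialised tasks."}
violations = check_entity(entity, schema)
print(violations)
```

Because the checks are pure functions of the file contents, running them twice on the same entity always yields the same result, which is what distinguishes this step from the LLM-based evaluation.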

Per-entity Evaluation

Individual entities are assessed for quality by an LLM. The pipeline is defined in markitect/infospace/evaluate.py.

# Evaluate all entities
markitect infospace evaluate --provider openrouter

# Single entity
markitect infospace evaluate --entity division-of-labour --provider openrouter

Pipeline functions

build_evaluation_prompt(entity, topic, dimensions) — build the LLM prompt
parse_evaluation_response(text, dimensions) — parse LLM output to ScoreEntry list
run_entity_evaluation(config, entities, adapter, ...) — full batch pipeline

Results are written to output/evaluations/ as markdown files with YAML frontmatter.

Collection-level Checks

Five concerns are assessed at the collection level, each with a dedicated module in markitect/infospace/checks/.

| Concern | Module | Key metric |
|---------|--------|------------|
| C1 — Redundancy | `redundancy.py` | `redundancy_ratio` |
| C2 — Coverage | `coverage.py` | `coverage_ratio` |
| C3 — Coherence | `coherence.py` | `coherence_components`, `modularity` |
| C4 — Consistency | `consistency.py` | `consistency_cycles` |
| C5 — Granularity | `granularity.py` | `granularity_entropy` |

Orchestrator

from markitect.infospace.checks import run_all_checks

# emb and g are the precomputed entity embeddings and entity graph
report = run_all_checks(entities, embeddings=emb, graph=g)
metrics = report.metrics()  # Dict[str, float]

CLI

# Run all checks
markitect infospace check

# Run specific concerns
markitect infospace check --concern redundancy --concern coverage

# JSON output
markitect infospace check --json

After each check run, metrics are automatically recorded to history.

Metrics History

Timestamped snapshots track metrics over time. Defined in markitect/infospace/history.py.

# Show history
markitect infospace history

# Trend for a single metric
markitect infospace history --metric coverage_ratio

# Compare two snapshots
markitect infospace history-diff 2026-02-01 2026-03-01

Key functions

snapshot_from_checks(report, entity_count) — create snapshot from check results
record_check_results(report, config, root, entity_count) — save metrics + append to history
get_history(config, root) — read full history
metric_trend(history, metric_name) — extract single metric across time
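To show what extracting a single metric across time amounts to, here is a minimal sketch on plain dictionaries. The snapshot shape (`created_at` plus a flat `metrics` mapping), the function name `sketch_metric_trend`, and the sample values are assumptions made for illustration; the real `metric_trend` in history.py may use different types and return a different structure.

```python
from datetime import datetime, timezone

# Hypothetical snapshot shape: a timestamp plus a flat metric dictionary.
history = [
    {"created_at": datetime(2026, 3, 1, tzinfo=timezone.utc),
     "metrics": {"coverage_ratio": 0.75, "redundancy_ratio": 0.04}},
    {"created_at": datetime(2026, 2, 1, tzinfo=timezone.utc),
     "metrics": {"coverage_ratio": 0.62, "redundancy_ratio": 0.07}},
]


def sketch_metric_trend(history, metric_name):
    """Return (timestamp, value) pairs, oldest first.

    Snapshots that do not contain the metric are skipped.
    """
    ordered = sorted(history, key=lambda snap: snap["created_at"])
    return [
        (snap["created_at"], snap["metrics"][metric_name])
        for snap in ordered
        if metric_name in snap["metrics"]
    ]


trend = sketch_metric_trend(history, "coverage_ratio")
# Change between the first and the last recorded value.
delta = trend[-1][1] - trend[0][1]
for when, value in trend:
    print(when.date(), value)
```

Sorting by timestamp before extraction keeps the trend meaningful even if snapshots were appended out of order.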

Viability

Viability is assessed by comparing current metrics to thresholds declared in infospace.yaml.

markitect infospace viability

Threshold model

viability:
  coverage_ratio: { min: 0.60 }       # must be >= 0.60
  redundancy_ratio: { max: 0.05 }     # must be <= 0.05
  consistency_cycles: { max: 0 }      # must be <= 0, i.e. exactly 0 for a count

Each threshold has a min bound, a max bound, or both. A metric passes when its value lies within the declared bounds, which are inclusive. An infospace is viable only when every threshold passes.
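The threshold rule can be sketched in a few lines. This is not the markitect implementation: the `Threshold` dataclass and `is_viable` helper are hypothetical, and treating a metric that is missing from the results as a failure is an assumption of this sketch rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    """Hypothetical threshold record mirroring the YAML shape above."""
    metric: str
    min: Optional[float] = None
    max: Optional[float] = None

    def passes(self, value: float) -> bool:
        # Bounds are inclusive: min means >=, max means <=.
        if self.min is not None and value < self.min:
            return False
        if self.max is not None and value > self.max:
            return False
        return True


def is_viable(thresholds, metrics):
    """True only if every threshold's metric is present and within bounds.

    Assumption: a metric absent from `metrics` counts as a failed threshold.
    """
    return all(
        t.metric in metrics and t.passes(metrics[t.metric]) for t in thresholds
    )


thresholds = [
    Threshold("coverage_ratio", min=0.60),
    Threshold("redundancy_ratio", max=0.05),
    Threshold("consistency_cycles", max=0),
]
metrics = {"coverage_ratio": 0.75, "redundancy_ratio": 0.04, "consistency_cycles": 0}
viable = is_viable(thresholds, metrics)
print(viable)
```

A single failing threshold is enough to make the whole infospace non-viable, so raising `redundancy_ratio` to 0.07 in the sample metrics would flip the result.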

Composition

One infospace can use another as a discipline. The composition model is defined in markitect/infospace/composition.py.

Binding a discipline

markitect infospace bind-discipline ./path/to/vsm-infospace --name "Viable System Model"

This adds a DisciplineBinding to infospace.yaml and validates that the discipline path exists and contains its own infospace.yaml.

Checking discipline status

markitect infospace disciplines

For each bound discipline, the output shows its name, entity count, viability status, and path.

Viability requirement

A discipline must meet its own viability thresholds to be considered reliable. The check_discipline_status() function loads the discipline's metrics and runs its own threshold checks.

Stale mapping detection

markitect infospace stale-mappings

Compares local mapping references against the discipline's current entity set. If a referenced discipline entity has been removed, the mapping is flagged as stale.
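At its core this comparison is a set-membership test. The sketch below assumes mapping references are available as a dictionary from local entity slug to referenced discipline slug; that shape, the function name `sketch_find_stale`, and the sample slugs are hypothetical and are not taken from composition.py.

```python
def sketch_find_stale(mapping_references, discipline_slugs):
    """Return the mappings whose target slug is no longer in the discipline.

    mapping_references: dict of local entity slug -> referenced discipline slug.
    discipline_slugs: iterable of slugs currently present in the discipline.
    """
    current = set(discipline_slugs)
    return {
        local: target
        for local, target in mapping_references.items()
        if target not in current
    }


# Illustrative data only; these slugs are made up.
references = {
    "division-of-labour": "system-1-operations",
    "invisible-hand": "system-2-coordination",
}
discipline_entities = ["system-1-operations", "system-3-control"]
stale = sketch_find_stale(references, discipline_entities)
print(stale)
```

Only removals are detected this way; a discipline entity whose definition changed but whose slug stayed the same would not be flagged.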

Key functions

resolve_discipline_path(binding, root) — resolve to absolute path
load_discipline_config(binding, root) — load discipline's infospace.yaml
check_discipline_status(binding, root) — full status with viability
get_discipline_entities(binding, root) — entity list from discipline
find_stale_mappings(config, root, mapping_references) — detect stale refs
bind_discipline(config, name, path, root) — add binding to config

Evaluation Output Format

Evaluation results use YAML frontmatter + markdown body. Defined in markitect/infospace/evaluation.py and evaluation_io.py.

Per-entity evaluation file

---
entity_slug: division-of-labour
evaluator: openrouter/default
evaluated_at: '2026-02-19T10:30:00'
overall_score: 4.1667
scores:
- name: definition_precision
  value: 4.5
  max_value: 5.0
...
---

# Evaluation: Division Of Labour

## definition_precision — 4.5 / 5.0

The definition clearly captures the core concept...
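A file in this shape can be separated into its frontmatter and markdown body with the standard library alone. The helper below is a sketch, not the reader in evaluation_io.py; `split_frontmatter` is a hypothetical name, and it only splits the text. Parsing the frontmatter into a dictionary would be left to a YAML library.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a document into (frontmatter, body) on the leading '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text  # no frontmatter present
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            # Drop the blank line(s) between the closing '---' and the body.
            body = "\n".join(lines[index + 1:]).lstrip("\n")
            return frontmatter, body
    raise ValueError("frontmatter opened with '---' but never closed")


sample = """---
entity_slug: division-of-labour
overall_score: 4.1667
---

# Evaluation: Division Of Labour
"""
frontmatter, body = split_frontmatter(sample)
print(frontmatter)
print(body)
```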

Snapshot

snapshot_id: abc12345
created_at: '2026-02-19T10:30:00+00:00'
schema_name: default
entity_count: 85
entity_evaluations: [...]
collection_metrics:
  - name: coverage_ratio
    value: 0.75
    concern: C2

State

Runtime state is computed from entities, evaluations, and metrics. Defined in markitect/infospace/state.py.

from markitect.infospace import build_state
state = build_state(config, entities=entities, metrics=metrics)
state.is_viable          # True if all thresholds pass
state.viability_results  # List[ViabilityResult]
state.summary()          # Dict for display

CLI Command Summary

All commands are under markitect infospace:

| Command | Purpose |
|---------|---------|
| `init` | Create a new `infospace.yaml` |
| `status` | Show entity count, domains, evaluation state |
| `entities` | List entities with metadata |
| `evaluate` | Run per-entity LLM evaluation |
| `check` | Run collection-level quality checks (C1-C5) |
| `viability` | Show viability dashboard |
| `history` | Show metrics history |
| `history-diff` | Compare two snapshots by date |
| `bind-discipline` | Bind an external infospace as a discipline |
| `disciplines` | List bound disciplines and viability |
| `stale-mappings` | Detect stale cross-infospace references |

Platform Dependencies

The infospace tooling builds on these platform modules:

| Module | Used for |
|--------|----------|
| `markitect/llm/` | Embedding adapters, LLM evaluation |
| `markitect/analysis/graph.py` | Graph analysis (networkx wrapper) |
| `markitect/analysis/fca.py` | Formal Concept Analysis |
| `markitect/prompts/execution/batch.py` | Batch LLM evaluation |
| `markitect/prompts/dependencies/models.py` | DependencyGraph |
