From d1c6e53754df83b541e1db7707ac0153fb3e6bc5 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Thu, 19 Feb 2026 02:05:09 +0100
Subject: [PATCH] docs: add infospace primitives reference (S2.7)

Reference document covering all infospace tooling primitives: config,
entity metadata, schema validation, per-entity evaluation, collection
checks, metrics history, viability, composition, and CLI commands.

Co-Authored-By: Claude Opus 4.6
---
 docs/infospace-primitives.md | 344 +++++++++++++++++++++++++++++++++++
 1 file changed, 344 insertions(+)
 create mode 100644 docs/infospace-primitives.md

diff --git a/docs/infospace-primitives.md b/docs/infospace-primitives.md
new file mode 100644
index 00000000..ed0cc598
--- /dev/null
+++ b/docs/infospace-primitives.md
@@ -0,0 +1,344 @@
+# Infospace Primitives Reference
+
+This document describes the primitives provided by the `markitect/infospace/`
+package for creating, evaluating, maintaining, and composing infospaces.
+
+---
+
+## Core Concepts
+
+An **infospace** is a structured, evaluable, composable collection of
+entities that explains a **topic** through the lens of one or more
+**disciplines**.
+
+| Term | Meaning |
+|------|---------|
+| **Topic** | The subject matter being explained |
+| **Discipline** | A reusable framework of concepts applied as an analytical lens |
+| **Entity** | The atomic unit of knowledge — slug, definition, provenance, domain |
+| **Evaluation** | Per-entity or collection-level quality assessment |
+| **Viability** | Whether an infospace meets its threshold scores |
+
+---
+
+## Configuration (`infospace.yaml`)
+
+Every infospace is declared via an `infospace.yaml` file. The configuration
+model is defined in `markitect/infospace/config.py`.
+
+### Minimal example
+
+```yaml
+topic:
+  name: "The Wealth of Nations"
+  domain: "Classical Economics"
+  sources: artifacts/sources/
+
+disciplines:
+  - name: "Viable System Model"
+    path: artifacts/vsm-reference/
+
+schemas:
+  entity: schemas/economic-entity-schema-v1.0.md
+
+viability:
+  coverage_ratio: { min: 0.60 }
+  redundancy_ratio: { max: 0.05 }
+  per_entity_mean: { min: 3.5 }
+```
+
+### Key models
+
+- **`TopicConfig`** — `name`, `domain`, `sources`
+- **`DisciplineBinding`** — `name`, `path` (to another infospace directory)
+- **`SchemaRegistry`** — `entity`, `mapping`, `analysis` schema paths
+- **`ViabilityThreshold`** — `metric`, `min`, `max` bounds
+- **`PipelineConfig`** — Ordered list of `PipelineStage` entries
+- **`InfospaceConfig`** — Top-level config combining all of the above
+
+### Default directories
+
+| Setting | Default |
+|---------|---------|
+| `entities_dir` | `output/entities` |
+| `evaluations_dir` | `output/evaluations` |
+| `metrics_dir` | `output/metrics` |
+
+---
+
+## Entity Metadata
+
+Entities are parsed from markdown files by `markitect/infospace/entity_parser.py`.
+
+**`EntityMeta`** fields: `slug`, `title`, `definition`, `domain`,
+`source_chapter`, `context`, `original_wording`, `modern_interpretation`,
+`definition_word_count`, `total_word_count`, `section_slugs`.
+
+```python
+from markitect.infospace import parse_entity_directory
+entities = parse_entity_directory(Path("output/entities"))
+```
+
+---
+
+## Schema Validation
+
+Deterministic validation of entity files against structural schemas.
+
+```python
+from markitect.infospace import validate_entity, ECONOMIC_ENTITY_SCHEMA
+result = validate_entity(entity_meta, schema=ECONOMIC_ENTITY_SCHEMA)
+print(result.summary())
+```
+
+Checks: section presence, word count ranges, heading format, enum values.
+
+---
+
+## Per-entity Evaluation
+
+LLM-based quality assessment of individual entities. Defined in
+`markitect/infospace/evaluate.py`.
+
+```bash
+# Evaluate all entities
+markitect infospace evaluate --provider openrouter
+
+# Single entity
+markitect infospace evaluate --entity division-of-labour --provider openrouter
+```
+
+### Pipeline functions
+
+- `build_evaluation_prompt(entity, topic, dimensions)` — build the LLM prompt
+- `parse_evaluation_response(text, dimensions)` — parse LLM output to `ScoreEntry` list
+- `run_entity_evaluation(config, entities, adapter, ...)` — full batch pipeline
+
+Results are written to `output/evaluations/` as YAML frontmatter + markdown.
+
+---
+
+## Collection-level Checks
+
+Five concerns assessed at the collection level. Each has a dedicated
+module in `markitect/infospace/checks/`.
+
+| Concern | Module | Key metric |
+|---------|--------|------------|
+| **C1 — Redundancy** | `redundancy.py` | `redundancy_ratio` |
+| **C2 — Coverage** | `coverage.py` | `coverage_ratio` |
+| **C3 — Coherence** | `coherence.py` | `coherence_components`, `modularity` |
+| **C4 — Consistency** | `consistency.py` | `consistency_cycles` |
+| **C5 — Granularity** | `granularity.py` | `granularity_entropy` |
+
+### Orchestrator
+
+```python
+from markitect.infospace.checks import run_all_checks
+report = run_all_checks(entities, embeddings=emb, graph=g)
+metrics = report.metrics()  # Dict[str, float]
+```
+
+### CLI
+
+```bash
+# Run all checks
+markitect infospace check
+
+# Run specific concerns
+markitect infospace check --concern redundancy --concern coverage
+
+# JSON output
+markitect infospace check --json
+```
+
+After each check run, metrics are automatically recorded to history.
+
+---
+
+## Metrics History
+
+Timestamped snapshots track metrics over time. Defined in
+`markitect/infospace/history.py`.
+
+```bash
+# Show history
+markitect infospace history
+
+# Trend for a single metric
+markitect infospace history --metric coverage_ratio
+
+# Compare two snapshots
+markitect infospace history-diff 2026-02-01 2026-03-01
+```
+
+### Key functions
+
+- `snapshot_from_checks(report, entity_count)` — create snapshot from check results
+- `record_check_results(report, config, root, entity_count)` — save metrics + append to history
+- `get_history(config, root)` — read full history
+- `metric_trend(history, metric_name)` — extract single metric across time
+
+---
+
+## Viability
+
+Viability is assessed by comparing current metrics to thresholds declared
+in `infospace.yaml`.
+
+```bash
+markitect infospace viability
+```
+
+### Threshold model
+
+```yaml
+viability:
+  coverage_ratio: { min: 0.60 }  # must be >= 0.60
+  redundancy_ratio: { max: 0.05 }  # must be <= 0.05
+  consistency_cycles: { max: 0 }  # must be exactly 0
+```
+
+Each threshold has `min` and/or `max` bounds. A metric passes if it falls
+within bounds. An infospace is viable when all thresholds pass.
+
+---
+
+## Composition
+
+One infospace can use another as a discipline. The composition model is
+defined in `markitect/infospace/composition.py`.
+
+### Binding a discipline
+
+```bash
+markitect infospace bind-discipline ./path/to/vsm-infospace --name "Viable System Model"
+```
+
+This adds a `DisciplineBinding` to `infospace.yaml` and validates the
+discipline exists and has an `infospace.yaml`.
+
+### Checking discipline status
+
+```bash
+markitect infospace disciplines
+```
+
+Shows: name, entity count, viability status, path.
+
+### Viability requirement
+
+A discipline must meet its own viability thresholds to be considered
+reliable. The `check_discipline_status()` function loads the discipline's
+metrics and runs its own threshold checks.
+
+### Stale mapping detection
+
+```bash
+markitect infospace stale-mappings
+```
+
+Compares local mapping references against the discipline's current entity
+set. If a referenced discipline entity has been removed, the mapping is
+flagged as stale.
+
+### Key functions
+
+- `resolve_discipline_path(binding, root)` — resolve to absolute path
+- `load_discipline_config(binding, root)` — load discipline's `infospace.yaml`
+- `check_discipline_status(binding, root)` — full status with viability
+- `get_discipline_entities(binding, root)` — entity list from discipline
+- `find_stale_mappings(config, root, mapping_references)` — detect stale refs
+- `bind_discipline(config, name, path, root)` — add binding to config
+
+---
+
+## Evaluation Output Format
+
+Evaluation results use YAML frontmatter + markdown body. Defined in
+`markitect/infospace/evaluation.py` and `evaluation_io.py`.
+
+### Per-entity evaluation file
+
+```markdown
+---
+entity_slug: division-of-labour
+evaluator: openrouter/default
+evaluated_at: '2026-02-19T10:30:00'
+overall_score: 4.1667
+scores:
+- name: definition_precision
+  value: 4.5
+  max_value: 5.0
+...
+---
+
+# Evaluation: Division Of Labour
+
+## definition_precision — 4.5 / 5.0
+
+The definition clearly captures the core concept...
+```
+
+### Snapshot
+
+```yaml
+snapshot_id: abc12345
+created_at: '2026-02-19T10:30:00+00:00'
+schema_name: default
+entity_count: 85
+entity_evaluations: [...]
+collection_metrics:
+  - name: coverage_ratio
+    value: 0.75
+    concern: C2
+```
+
+---
+
+## State
+
+Runtime state is computed from entities, evaluations, and metrics.
+Defined in `markitect/infospace/state.py`.
+
+```python
+from markitect.infospace import build_state
+state = build_state(config, entities=entities, metrics=metrics)
+state.is_viable  # True if all thresholds pass
+state.viability_results  # List[ViabilityResult]
+state.summary()  # Dict for display
+```
+
+---
+
+## CLI Command Summary
+
+All commands are under `markitect infospace`:
+
+| Command | Purpose |
+|---------|---------|
+| `init` | Create a new `infospace.yaml` |
+| `status` | Show entity count, domains, evaluation state |
+| `entities` | List entities with metadata |
+| `evaluate` | Run per-entity LLM evaluation |
+| `check` | Run collection-level quality checks (C1-C5) |
+| `viability` | Show viability dashboard |
+| `history` | Show metrics history |
+| `history-diff` | Compare two snapshots by date |
+| `bind-discipline` | Bind an external infospace as a discipline |
+| `disciplines` | List bound disciplines and viability |
+| `stale-mappings` | Detect stale cross-infospace references |
+
+---
+
+## Platform Dependencies
+
+The infospace tooling builds on these platform modules:
+
+| Module | Used for |
+|--------|----------|
+| `markitect/llm/` | Embedding adapters, LLM evaluation |
+| `markitect/analysis/graph.py` | Graph analysis (networkx wrapper) |
+| `markitect/analysis/fca.py` | Formal Concept Analysis |
+| `markitect/prompts/execution/batch.py` | Batch LLM evaluation |
+| `markitect/prompts/dependencies/models.py` | DependencyGraph |
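
The threshold rule from the Viability section (a metric passes when it falls within its `min` and/or `max` bounds, and an infospace is viable when all thresholds pass) can be sketched in plain Python. This is a minimal, self-contained illustration, not the package's implementation: the `Threshold` class and the `failing_thresholds` helper are hypothetical names, and treating a missing metric as a failure is an assumption the document does not state.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Threshold:
    """Hypothetical stand-in for a viability threshold (not the package's class)."""

    metric: str
    min: Optional[float] = None  # inclusive lower bound, if any
    max: Optional[float] = None  # inclusive upper bound, if any

    def passes(self, value: float) -> bool:
        """Return True when value lies within every bound that is set."""
        if self.min is not None and value < self.min:
            return False
        if self.max is not None and value > self.max:
            return False
        return True


def failing_thresholds(
    thresholds: List[Threshold], metrics: Dict[str, float]
) -> List[str]:
    """Return the names of metrics that are missing or out of bounds.

    Assumption: a metric absent from `metrics` counts as a failure.
    """
    failures = []
    for threshold in thresholds:
        value = metrics.get(threshold.metric)
        if value is None or not threshold.passes(value):
            failures.append(threshold.metric)
    return failures


# Thresholds mirroring the YAML example in the Viability section.
thresholds = [
    Threshold("coverage_ratio", min=0.60),
    Threshold("redundancy_ratio", max=0.05),
    Threshold("consistency_cycles", max=0),
]
# Illustrative metric values, invented for this example.
metrics = {"coverage_ratio": 0.75, "redundancy_ratio": 0.08, "consistency_cycles": 0.0}

failures = failing_thresholds(thresholds, metrics)
is_viable = not failures
print(failures)  # ['redundancy_ratio']
```

Returning the failing metric names rather than a bare boolean keeps the sketch usable for a dashboard-style report; viability is then simply the case where the list is empty.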