diff --git a/examples/infospace-with-history/METRICS-METHODOLOGY.md b/examples/infospace-with-history/METRICS-METHODOLOGY.md
index 1b89c27f..03c0e391 100644
--- a/examples/infospace-with-history/METRICS-METHODOLOGY.md
+++ b/examples/infospace-with-history/METRICS-METHODOLOGY.md
@@ -290,31 +290,101 @@ pair list, scores, and merge/retire recommendations.
 
 ### C2: Coverage Completeness
 
-**Goal:** Identify domain areas and VSM systems that lack adequate
-representation in the entity set.
+**Goal:** Identify domain areas that are structurally sparse or isolated
+within the corpus — and separately, assess whether the entity set can answer
+the infospace's declared competency questions.
+
+**What the deterministic check actually computes**
+
+The current implementation builds a binary *domain × chapter* cross-table:
+one row per economic domain, one column per source chapter. A cell is
+populated if at least one entity has that (domain, chapter) combination.
+
+    coverage_ratio = populated_cells / (n_domains × n_chapters)
+
+**This is not the same as VSM coverage.** The domain × VSM matrix described
+in earlier versions of this document requires VSM system mappings to be
+supplied as `extra_attributes` to `check_coverage()`. The pipeline does not
+currently do this, so `coverage_ratio` reflects *cross-chapter domain
+distribution*, not *VSM system coverage*.
+
+**Important: interpret the distribution, not just the ratio**
+
+The aggregate ratio conflates two structurally different situations:
+
+| Situation | coverage_ratio | What it means |
+|---|---|---|
+| Healthy topic separation | Low | Domains are locally dense within their book/section — expected for a multi-topic corpus |
+| Fragmented extraction | Low | Domains appear sporadically everywhere, never anchored |
+
+Both produce the same ratio. Use the per-domain density distribution to
+distinguish them:
+
+| Metric | Meaning |
+|--------|---------|
+| `domain_densities` | Per-domain fraction of chapters containing ≥1 entity with that domain |
+| `density_std` | Standard deviation of densities. High std → healthy topic separation (bimodal: some domains cross-cutting, others local). Low std → uniform but thin. |
+| `cross_cutting_ratio` | Fraction of domains appearing in >50 % of chapters — the foundational, cross-cutting concepts. |
+
+Example interpretation for the WoN/VSM infospace (1021 entities, 35 chapters):
+
+```
+Exchange        0.848  ████████████████  cross-cutting
+Regulation      0.848  ████████████████  cross-cutting
+General Theory  0.727  ██████████████    cross-cutting
+Production      0.636  ████████████      cross-cutting
+Distribution    0.576  ███████████       borderline
+Accumulation    0.364  ███████           book-specific
+Consumption     0.333  ██████            book-specific
+
+density_std = 0.33 (high → healthy topic separation)
+cross_cutting_ratio = 0.50
+coverage_ratio = 0.44 (below 0.50 threshold, but for correct reasons)
+```
+
+**What coverage does NOT capture**
+
+- **Entity-to-entity connections** — whether concepts reference each other,
+  form explanatory chains, or cluster coherently. That is C3 (Structural
+  Coherence).
+- **VSM competency question answerability** — whether current entities
+  collectively support answering the declared competency questions. That
+  requires LLM-Eval and is a planned metric (see below).
+- **Whether absent (domain, chapter) cells are meaningful gaps or expected
+  absences** — the ratio treats them identically.
+
+**Threshold guidance**
+
+- `min: 0.50` is appropriate for a focused, single-topic corpus where all
+  chapters address the same set of domains.
+- For heterogeneous multi-book corpora, domains introduced late create empty
+  cells for all earlier chapters. A threshold of `0.30–0.40` is more
+  realistic.
+- Prefer `cross_cutting_ratio` and `density_std` as the primary diagnostic
+  signals; use `coverage_ratio` only for trend tracking across snapshots.
 
 **Metrics:**
 
-| Metric | Type | Computation |
-|--------|------|-------------|
-| `domain_vsm_matrix` | Deterministic | Count entities per {economic_domain, VSM_system} cell |
-| `coverage_ratio` | Deterministic | `populated_cells / expected_cells` |
-| `vsm_balance_entropy` | Deterministic | Shannon entropy of entity distribution across VSM systems (higher = more balanced) |
-| `empty_cells` | Deterministic | List of {domain, VSM_system} pairs with zero entities |
-| `competency_coverage` | LLM-Eval | For each competency question, can it be answered with current entities? |
-| `fca_gap_concepts` | Deterministic | Attribute combinations in the FCA lattice with no corresponding entity |
+| Metric | Type | Computation | Status |
+|--------|------|-------------|--------|
+| `coverage_ratio` | Deterministic | `populated_cells / (n_domains × n_chapters)` | ✅ Implemented |
+| `domain_densities` | Deterministic | Per-domain fraction of chapters with ≥1 entity | ✅ Implemented |
+| `density_std` | Deterministic | Std dev of domain densities | ✅ Implemented |
+| `cross_cutting_ratio` | Deterministic | Fraction of domains with density > 0.5 | ✅ Implemented |
+| `empty_cells` | Deterministic | List of unpopulated (domain, chapter) pairs | ✅ Implemented |
+| `fca_gap_concepts` | Deterministic | Attribute combos in FCA lattice with no entity | ✅ Implemented |
+| `domain_vsm_matrix` | Deterministic | Entities per {domain, VSM_system} cell — requires VSM mappings in `extra_attributes` | ⬜ Not yet wired |
+| `competency_coverage` | LLM-Eval | For each competency question, can it be answered? | ⬜ Not yet implemented |
 
-**Pipeline:**
-1. Parse entity metadata (domain, VSM mapping) from files on disk
-2. Build domain × VSM matrix; identify empty cells
-3. Build FCA formal context; compute lattice; extract gap concepts
-4. Define competency questions (initially hand-written, later LLM-generated
-   from the source material)
-5. LLM-evaluate answerability of each question
-6. Aggregate into coverage ratio, entropy, and gap list
+**Pipeline (current):**
+1. Parse entity metadata (domain, source chapter) from entity files
+2. Build domain × chapter binary matrix; identify empty cells
+3. Compute per-domain densities, std dev, cross-cutting ratio
+4. Build FCA formal context; extract gap concepts
+5. Aggregate into `CoverageReport`
 
-**Output:** `output/metrics/coverage-report.md` + YAML with matrix, gaps,
-and competency question results.
+**Output:** Snapshot recorded in `output/metrics/history.yaml`. A
+`coverage-report.md` per chapter is planned but not yet generated.
 
 ### C3: Structural Coherence
 
diff --git a/markitect/infospace/checks/coverage.py b/markitect/infospace/checks/coverage.py
index 887ffcbb..32848b96 100644
--- a/markitect/infospace/checks/coverage.py
+++ b/markitect/infospace/checks/coverage.py
@@ -1,12 +1,51 @@
 """
 C2 — Coverage completeness.
 
-Uses FCA and cross-tabulation to detect structural coverage gaps:
-attribute combinations (domain × VSM system) with no entities.
+**What this measures**
+
+Builds a binary *domain × chapter* cross-table: rows are economic domains
+found across all entities, columns are source chapters. A cell is marked
+populated when at least one entity has that (domain, chapter) combination.
+
+    coverage_ratio = populated_cells / (n_domains × n_chapters)
+
+This is a measure of how *uniformly* economic domains are distributed across
+source chapters, not of how richly entities connect to each other (that is
+C3 — Structural Coherence) and not of VSM competency-question answerability
+(that requires supplying ``extra_attributes`` with VSM system mappings, which
+the pipeline does not currently do).
+
+**Interpreting the ratio alone is misleading.** A single ratio cannot
+distinguish two structurally different situations:
+
+- *Healthy topic separation* — domains are locally dense within their
+  book/section, sparse elsewhere. The matrix has clean block structure;
+  low cross-chapter density per domain is *expected*.
+- *Fragmented extraction* — domains appear sporadically in all chapters,
+  never strongly anchored anywhere. The matrix is uniformly thin everywhere.
+
+Both can produce the same ratio. Use the *per-domain density distribution*
+(``domain_densities``, ``density_std``, ``cross_cutting_ratio``) to
+distinguish them:
+
+- High ``density_std`` + bimodal distribution → healthy topic separation.
+- Low ``density_std`` + uniform distribution → potential fragmentation.
+- ``cross_cutting_ratio`` measures what fraction of domains span more than
+  half the chapters — these are the foundational cross-cutting concepts.
+
+**Threshold note**
+
+A 0.50 threshold is appropriate for a focused single-topic corpus. For a
+heterogeneous multi-book corpus (e.g. all five books of The Wealth of
+Nations), domains introduced in later books create empty cells for all
+earlier chapters, causing the ratio to fall below 0.50 even for structurally
+healthy corpora. Consider 0.30–0.40 for large, multi-topic corpora.
 """
 
 from __future__ import annotations
 
+import math
+import statistics
 from dataclasses import dataclass, field
 from typing import Any, Dict, List, Optional
 
@@ -16,9 +55,30 @@ from markitect.analysis.fca import FormalContext, find_empty_cells, find_gap_con
 
 @dataclass
 class CoverageReport:
-    """Results from coverage analysis."""
+    """Results from coverage analysis.
+
+    Attributes:
+        coverage_ratio: Fraction of (domain, chapter) cells that are
+            populated. See module docstring for interpretation notes.
+        domain_densities: Per-domain fraction of chapters that contain
+            at least one entity with that domain. Keys are domain names.
+        density_std: Standard deviation of ``domain_densities`` values.
+            High std suggests healthy topic separation; low std suggests
+            uniform but thin coverage.
+        cross_cutting_ratio: Fraction of domains that appear in more than
+            50 % of source chapters. These are the foundational concepts.
+        empty_cells: List of ``{dimension_a, dimension_b}`` dicts for each
+            unpopulated (domain, chapter) cell.
+        gap_concepts: FCA gap concepts — attribute combinations present in
+            the lattice but with no entity.
+        domain_counts: Total entity count per domain.
+        entity_count: Total number of entities analysed.
+    """
 
     coverage_ratio: float = 0.0
+    domain_densities: Dict[str, float] = field(default_factory=dict)
+    density_std: float = 0.0
+    cross_cutting_ratio: float = 0.0
     empty_cells: List[dict] = field(default_factory=list)
     gap_concepts: List[dict] = field(default_factory=list)
     domain_counts: Dict[str, int] = field(default_factory=dict)
@@ -28,6 +88,9 @@ class CoverageReport:
         return {
             "concern": "C2",
             "coverage_ratio": round(self.coverage_ratio, 4),
+            "domain_densities": {k: round(v, 4) for k, v in self.domain_densities.items()},
+            "density_std": round(self.density_std, 4),
+            "cross_cutting_ratio": round(self.cross_cutting_ratio, 4),
             "empty_cells": self.empty_cells,
             "gap_concepts_count": len(self.gap_concepts),
             "domain_counts": self.domain_counts,
@@ -102,8 +165,29 @@ def check_coverage(
     populated = total_cells - len(empty)
     ratio = populated / total_cells if total_cells > 0 else 0.0
 
+    # Per-domain density: fraction of chapters that contain this domain
+    n_chapters = len(chapters)
+    domain_densities: Dict[str, float] = {}
+    if n_chapters > 0:
+        empty_pairs = {(e["dimension_a"], e["dimension_b"]) for e in empty}
+        for d in domains:
+            populated_for_domain = sum(
+                1 for c in chapters if (d, c) not in empty_pairs
+            )
+            domain_densities[d.removeprefix("domain:")] = populated_for_domain / n_chapters
+
+    density_values = list(domain_densities.values())
+    density_std = statistics.stdev(density_values) if len(density_values) >= 2 else 0.0
+    cross_cutting_ratio = (
+        sum(1 for v in density_values if v > 0.5) / len(density_values)
+        if density_values else 0.0
+    )
+
     return CoverageReport(
         coverage_ratio=ratio,
+        domain_densities=domain_densities,
+        density_std=round(density_std, 6),
+        cross_cutting_ratio=round(cross_cutting_ratio, 4),
         empty_cells=empty,
         gap_concepts=gap_dicts,
         domain_counts=domain_counts,