docs(metrics): clarify C2 coverage — domain×chapter matrix, not domain×VSM

- coverage.py: rewrite module docstring to explain what the metric
  actually computes (domain × chapter cross-tabulation, not VSM system
  coverage), what it does not capture (entity connectivity → C3), and
  when the threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
  for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
  document the distribution-based interpretation, add implementation
  status table distinguishing what is wired vs planned

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:08:46 +01:00
parent 0f54f094e4
commit dfe56a4f9b
2 changed files with 177 additions and 23 deletions
--- a/examples/infospace-with-history/METRICS-METHODOLOGY.md
+++ b/examples/infospace-with-history/METRICS-METHODOLOGY.md
@@ -290,31 +290,101 @@ pair list, scores, and merge/retire recommendations.
 ### C2: Coverage Completeness
-**Goal:** Identify domain areas and VSM systems that lack adequate
+**Goal:** Identify domain areas that are structurally sparse or isolated
-representation in the entity set.
+within the corpus — and separately, assess whether the entity set can answer
 the infospace's declared competency questions.
 **What the deterministic check actually computes**
 The current implementation builds a binary *domain × chapter* cross-table:
 one row per economic domain, one column per source chapter.  A cell is
 populated if at least one entity has that (domain, chapter) combination.
    coverage_ratio = populated_cells / (n_domains × n_chapters)
 **This is not the same as VSM coverage.**  The domain × VSM matrix described
 in earlier versions of this document requires VSM system mappings to be
 supplied as `extra_attributes` to `check_coverage()`.  The pipeline does not
 currently do this, so `coverage_ratio` reflects *cross-chapter domain
 distribution*, not *VSM system coverage*.
 **Important: interpret the distribution, not just the ratio**
 The aggregate ratio conflates two structurally different situations:
 | Situation | coverage_ratio | What it means |
 |---|---|---|
 | Healthy topic separation | Low | Domains are locally dense within their book/section — expected for a multi-topic corpus |
 | Fragmented extraction | Low | Domains appear sporadically everywhere, never anchored |
 Both produce the same ratio.  Use the per-domain density distribution to
 distinguish them:
 | Metric | Meaning |
 |--------|---------|
 | `domain_densities` | Per-domain fraction of chapters containing ≥1 entity with that domain |
 | `density_std` | Standard deviation of densities.  High std → healthy topic separation (bimodal: some domains cross-cutting, others local).  Low std → domains are spread uniformly; if the densities are also low, coverage is thin everywhere. |
 | `cross_cutting_ratio` | Fraction of domains appearing in >50 % of chapters — the foundational, cross-cutting concepts. |
 Example interpretation for the WoN/VSM infospace (1021 entities, 35 chapters):
 ```
 Exchange        0.848  ████████████████   cross-cutting
 Regulation      0.848  ████████████████   cross-cutting
 General Theory  0.727  ██████████████     cross-cutting
 Production      0.636  ████████████       cross-cutting
 Distribution    0.576  ███████████        borderline
 Accumulation    0.364  ███████            book-specific
 Consumption     0.333  ██████             book-specific
 density_std = 0.33   (high → healthy topic separation)
 cross_cutting_ratio = 0.50
 coverage_ratio = 0.44  (below 0.50 threshold, but for correct reasons)
 ```
 **What coverage does NOT capture**
 - **Entity-to-entity connections** — whether concepts reference each other,
  form explanatory chains, or cluster coherently.  That is C3 (Structural
  Coherence).
 - **VSM competency question answerability** — whether current entities
  collectively support answering the declared competency questions.  That
  requires LLM-Eval and is a planned metric (see below).
 - **Whether absent (domain, chapter) cells are meaningful gaps or expected
  absences** — the ratio treats them identically.
 **Threshold guidance**
 - `min: 0.50` is appropriate for a focused, single-topic corpus where all
  chapters address the same set of domains.
 - For heterogeneous multi-book corpora, domains introduced late create empty
  cells for all earlier chapters.  A threshold of `0.30–0.40` is more
  realistic.
 - Prefer `cross_cutting_ratio` and `density_std` as the primary diagnostic
  signals; use `coverage_ratio` only for trend tracking across snapshots.
 **Metrics:**
-| Metric | Type | Computation |
+| Metric | Type | Computation | Status |
-|--------|------|-------------|
+|--------|------|-------------|--------|
-| `domain_vsm_matrix` | Deterministic | Count entities per {economic_domain, VSM_system} cell |
+| `coverage_ratio` | Deterministic | `populated_cells / (n_domains × n_chapters)` | ✅ Implemented |
-| `coverage_ratio` | Deterministic | `populated_cells / expected_cells` |
+| `domain_densities` | Deterministic | Per-domain fraction of chapters with ≥1 entity | ✅ Implemented |
-| `vsm_balance_entropy` | Deterministic | Shannon entropy of entity distribution across VSM systems (higher = more balanced) |
+| `density_std` | Deterministic | Std dev of domain densities | ✅ Implemented |
-| `empty_cells` | Deterministic | List of {domain, VSM_system} pairs with zero entities |
+| `cross_cutting_ratio` | Deterministic | Fraction of domains with density > 0.5 | ✅ Implemented |
-| `competency_coverage` | LLM-Eval | For each competency question, can it be answered with current entities? |
+| `empty_cells` | Deterministic | List of unpopulated (domain, chapter) pairs | ✅ Implemented |
-| `fca_gap_concepts` | Deterministic | Attribute combinations in the FCA lattice with no corresponding entity |
+| `fca_gap_concepts` | Deterministic | Attribute combos in FCA lattice with no entity | ✅ Implemented |
 | `domain_vsm_matrix` | Deterministic | Entities per {domain, VSM_system} cell — requires VSM mappings in `extra_attributes` | ⬜ Not yet wired |
 | `competency_coverage` | LLM-Eval | For each competency question, can it be answered? | ⬜ Not yet implemented |
-**Pipeline:**
+**Pipeline (current):**
-1. Parse entity metadata (domain, VSM mapping) from files on disk
+1. Parse entity metadata (domain, source chapter) from entity files
-2. Build domain × VSM matrix; identify empty cells
+2. Build domain × chapter binary matrix; identify empty cells
-3. Build FCA formal context; compute lattice; extract gap concepts
+3. Compute per-domain densities, std dev, cross-cutting ratio
-4. Define competency questions (initially hand-written, later LLM-generated
+4. Build FCA formal context; extract gap concepts
-   from the source material)
+5. Aggregate into `CoverageReport`
-5. LLM-evaluate answerability of each question
-6. Aggregate into coverage ratio, entropy, and gap list
-**Output:** `output/metrics/coverage-report.md` + YAML with matrix, gaps,
+**Output:** Snapshot recorded in `output/metrics/history.yaml`.  A
-and competency question results.
+`coverage-report.md` per chapter is planned but not yet generated.
 ### C3: Structural Coherence
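
A minimal sketch, not part of this commit, of why the C2 section asks readers to look at the density distribution rather than the ratio alone. The helper `density_stats` and both toy corpora are hypothetical; only the formulas (populated cells over `n_domains × n_chapters`, per-domain chapter fraction, sample standard deviation, share of domains above 0.5) follow the text above.

```python
import statistics
from typing import Dict, List, Set, Tuple


def density_stats(
    populated: Dict[str, Set[str]], chapters: List[str]
) -> Tuple[float, Dict[str, float], float, float]:
    """Return (coverage_ratio, domain_densities, density_std, cross_cutting_ratio).

    `populated` maps each domain to the chapters holding >= 1 entity of it.
    This input shape is an assumption made for the illustration.
    """
    n_chapters = len(chapters)
    if n_chapters == 0 or not populated:
        return 0.0, {}, 0.0, 0.0
    chapter_set = set(chapters)
    # Per-domain density: fraction of chapters in which the domain appears.
    densities = {
        domain: len(found & chapter_set) / n_chapters
        for domain, found in populated.items()
    }
    values = list(densities.values())
    populated_cells = sum(len(found & chapter_set) for found in populated.values())
    coverage_ratio = populated_cells / (len(populated) * n_chapters)
    # Sample standard deviation; needs at least two domains.
    density_std = statistics.stdev(values) if len(values) >= 2 else 0.0
    cross_cutting_ratio = sum(1 for v in values if v > 0.5) / len(values)
    return coverage_ratio, densities, density_std, cross_cutting_ratio


chapters = [f"ch{i}" for i in range(1, 9)]

# One cross-cutting domain plus three domains local to their own section.
healthy = {
    "Exchange": set(chapters),
    "Accumulation": {"ch1", "ch2"},
    "Consumption": {"ch3", "ch4"},
    "Distribution": {"ch5", "ch6"},
}
# Same number of populated cells (14 of 32), spread thinly and evenly.
fragmented = {
    "Exchange": {"ch1", "ch3", "ch5", "ch7"},
    "Accumulation": {"ch2", "ch4", "ch6", "ch8"},
    "Consumption": {"ch1", "ch4", "ch7"},
    "Distribution": {"ch2", "ch5", "ch8"},
}

healthy_stats = density_stats(healthy, chapters)
fragmented_stats = density_stats(fragmented, chapters)
print("healthy:   ", healthy_stats[0], round(healthy_stats[2], 4), healthy_stats[3])
print("fragmented:", fragmented_stats[0], round(fragmented_stats[2], 4), fragmented_stats[3])
```

Both toy corpora have a `coverage_ratio` of 0.4375, below the 0.50 threshold, yet the first has a much larger `density_std` and a non-zero `cross_cutting_ratio`. Note that the spread only separates the two cases when at least one domain is cross-cutting; a corpus made purely of equally sized local blocks would also show a low standard deviation.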
--- a/markitect/infospace/checks/coverage.py
+++ b/markitect/infospace/checks/coverage.py
@@ -1,12 +1,51 @@
 """
 C2 — Coverage completeness.
-Uses FCA and cross-tabulation to detect structural coverage gaps:
+**What this measures**
-attribute combinations (domain × VSM system) with no entities.
+
 Builds a binary *domain × chapter* cross-table: rows are economic domains
 found across all entities, columns are source chapters.  A cell is marked
 populated when at least one entity has that (domain, chapter) combination.
    coverage_ratio = populated_cells / (n_domains × n_chapters)
 This measures how *widely* economic domains are distributed across source
 chapters.  It does not measure how richly entities connect to each other
 (that is C3, Structural Coherence), VSM system coverage (that requires
 supplying ``extra_attributes`` with VSM system mappings, which the pipeline
 does not currently do), or competency-question answerability (a planned
 LLM-Eval metric).
 **Interpreting the ratio alone is misleading.**  A single ratio cannot
 distinguish two structurally different situations:
 - *Healthy topic separation* — domains are locally dense within their
  book/section, sparse elsewhere.  The matrix has clean block structure;
  low cross-chapter density per domain is *expected*.
 - *Fragmented extraction* — domains appear sporadically in all chapters,
  never strongly anchored anywhere.  The matrix is uniformly thin everywhere.
 Both can produce the same ratio.  Use the *per-domain density distribution*
 (``domain_densities``, ``density_std``, ``cross_cutting_ratio``) to
 distinguish them:
 - High ``density_std`` + bimodal distribution → healthy topic separation.
 - Low ``density_std`` + uniform distribution → potential fragmentation.
 - ``cross_cutting_ratio`` measures what fraction of domains span more than
  half the chapters — these are the foundational cross-cutting concepts.
 **Threshold note**
 A 0.50 threshold is appropriate for a focused single-topic corpus.  For a
 heterogeneous multi-book corpus (e.g. all five books of The Wealth of
 Nations), domains introduced in later books create empty cells for all
 earlier chapters, causing the ratio to fall below 0.50 even for structurally
 healthy corpora.  Consider 0.30–0.40 for large, multi-topic corpora.
 """
 from __future__ import annotations
 import math
 import statistics
 from dataclasses import dataclass, field
 from typing import Any, Dict, List, Optional
@@ -16,9 +55,30 @@ from markitect.analysis.fca import FormalContext, find_empty_cells, find_gap_con
 @dataclass
 class CoverageReport:
-    """Results from coverage analysis."""
+    """Results from coverage analysis.
    Attributes:
        coverage_ratio: Fraction of (domain, chapter) cells that are
            populated.  See module docstring for interpretation notes.
        domain_densities: Per-domain fraction of chapters that contain
            at least one entity with that domain.  Keys are domain names.
        density_std: Standard deviation of ``domain_densities`` values.
            High std suggests healthy topic separation; low std suggests
            uniform but thin coverage.
        cross_cutting_ratio: Fraction of domains that appear in more than
            50 % of source chapters.  These are the foundational concepts.
        empty_cells: List of ``{dimension_a, dimension_b}`` dicts for each
            unpopulated (domain, chapter) cell.
        gap_concepts: FCA gap concepts — attribute combinations present in
            the lattice but with no entity.
        domain_counts: Total entity count per domain.
        entity_count: Total number of entities analysed.
    """
    coverage_ratio: float = 0.0
    domain_densities: Dict[str, float] = field(default_factory=dict)
    density_std: float = 0.0
    cross_cutting_ratio: float = 0.0
    empty_cells: List[dict] = field(default_factory=list)
    gap_concepts: List[dict] = field(default_factory=list)
    domain_counts: Dict[str, int] = field(default_factory=dict)
@@ -28,6 +88,9 @@ class CoverageReport:
        return {
            "concern": "C2",
            "coverage_ratio": round(self.coverage_ratio, 4),
            "domain_densities": {k: round(v, 4) for k, v in self.domain_densities.items()},
            "density_std": round(self.density_std, 4),
            "cross_cutting_ratio": round(self.cross_cutting_ratio, 4),
            "empty_cells": self.empty_cells,
            "gap_concepts_count": len(self.gap_concepts),
            "domain_counts": self.domain_counts,
@@ -102,8 +165,29 @@ def check_coverage(
    populated = total_cells - len(empty)
    ratio = populated / total_cells if total_cells > 0 else 0.0
    # Per-domain density: fraction of chapters that contain this domain
    n_chapters = len(chapters)
    domain_densities: Dict[str, float] = {}
    if n_chapters > 0:
        empty_pairs = {(e["dimension_a"], e["dimension_b"]) for e in empty}
        for d in domains:
            populated_for_domain = sum(
                1 for c in chapters if (d, c) not in empty_pairs
            )
            domain_densities[d.removeprefix("domain:")] = populated_for_domain / n_chapters
    density_values = list(domain_densities.values())
    density_std = statistics.stdev(density_values) if len(density_values) >= 2 else 0.0
    cross_cutting_ratio = (
        sum(1 for v in density_values if v > 0.5) / len(density_values)
        if density_values else 0.0
    )
    return CoverageReport(
        coverage_ratio=ratio,
        domain_densities=domain_densities,
        density_std=round(density_std, 6),
        cross_cutting_ratio=round(cross_cutting_ratio, 4),
        empty_cells=empty,
        gap_concepts=gap_dicts,
        domain_counts=domain_counts,
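
A minimal sketch, not part of this commit, of the density step in the hunk above run on toy data. The function name `densities_from_empty_cells` and the `chapter:N` label format are assumptions for the illustration; the `dimension_a`/`dimension_b` keys and the `domain:` prefix are taken from the diff. It also shows that `statistics.stdev` is the sample standard deviation, which differs noticeably from the population value when there are few domains.

```python
import statistics
from typing import Dict, List


def densities_from_empty_cells(
    domains: List[str], chapters: List[str], empty: List[dict]
) -> Dict[str, float]:
    """Per-domain fraction of chapters that are NOT listed as empty cells."""
    n_chapters = len(chapters)
    if n_chapters == 0:
        return {}
    empty_pairs = {(e["dimension_a"], e["dimension_b"]) for e in empty}
    return {
        # str.removeprefix requires Python 3.9 or later.
        d.removeprefix("domain:"): sum(
            1 for c in chapters if (d, c) not in empty_pairs
        ) / n_chapters
        for d in domains
    }


domains = ["domain:Exchange", "domain:Consumption"]
chapters = ["chapter:1", "chapter:2", "chapter:3", "chapter:4"]  # assumed labels
empty = [
    {"dimension_a": "domain:Consumption", "dimension_b": "chapter:1"},
    {"dimension_a": "domain:Consumption", "dimension_b": "chapter:2"},
    {"dimension_a": "domain:Consumption", "dimension_b": "chapter:3"},
]

densities = densities_from_empty_cells(domains, chapters, empty)
values = list(densities.values())
sample_std = statistics.stdev(values)       # divides by n - 1, as in the diff
population_std = statistics.pstdev(values)  # divides by n
print(densities, round(sample_std, 4), round(population_std, 4))
```

With only two domains the sample value is about 0.53 and the population value is 0.375, so any `density_std` cut-off should be compared across snapshots computed the same way.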