docs(metrics): clarify C2 coverage — domain×chapter matrix, not domain×VSM

- coverage.py: rewrite module docstring to explain what the metric actually
  computes (domain × chapter cross-tabulation, not VSM system coverage),
  what it does not capture (entity connectivity → C3), and when the
  threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
  for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
  document the distribution-based interpretation, add implementation status
  table distinguishing what is wired vs planned

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:08:46 +01:00
parent 0f54f094e4
commit dfe56a4f9b
2 changed files with 177 additions and 23 deletions


@@ -290,31 +290,101 @@ pair list, scores, and merge/retire recommendations.
### C2: Coverage Completeness
**Goal:** Identify domain areas and VSM systems that lack adequate
representation in the entity set.
**Goal:** Identify domain areas that are structurally sparse or isolated
within the corpus — and separately, assess whether the entity set can answer
the infospace's declared competency questions.
**What the deterministic check actually computes**
The current implementation builds a binary *domain × chapter* cross-table:
one row per economic domain, one column per source chapter. A cell is
populated if at least one entity has that (domain, chapter) combination.
coverage_ratio = populated_cells / (n_domains × n_chapters)
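A minimal sketch of this computation, assuming entities have already been reduced to `(domain, chapter)` pairs. The helper name `coverage_ratio_from_pairs` and its input shape are illustrative assumptions, not the signature of `check_coverage()`. The sketch also derives the grid from the observed pairs only, so a chapter with no entities at all would not appear as a column.

```python
from typing import Iterable, Tuple


def coverage_ratio_from_pairs(pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (domain, chapter) cells holding at least one entity."""
    populated = set(pairs)  # duplicates collapse: the matrix is binary
    domains = {d for d, _ in populated}
    chapters = {c for _, c in populated}
    total_cells = len(domains) * len(chapters)
    return len(populated) / total_cells if total_cells else 0.0


# Hypothetical toy input: two domains, two chapters, one duplicate pair.
pairs = [
    ("Exchange", "ch1"), ("Exchange", "ch2"), ("Exchange", "ch1"),
    ("Production", "ch1"),
]
ratio = coverage_ratio_from_pairs(pairs)
```

Here `ratio` is 0.75: three distinct cells are populated in a 2 × 2 grid, and the repeated `("Exchange", "ch1")` pair counts once.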
**This is not the same as VSM coverage.** The domain × VSM matrix described
in earlier versions of this document requires VSM system mappings to be
supplied as `extra_attributes` to `check_coverage()`. The pipeline does not
currently do this, so `coverage_ratio` reflects *cross-chapter domain
distribution*, not *VSM system coverage*.
**Important: interpret the distribution, not just the ratio**
The aggregate ratio conflates two structurally different situations:
| Situation | coverage_ratio | What it means |
|---|---|---|
| Healthy topic separation | Low | Domains are locally dense within their book/section — expected for a multi-topic corpus |
| Fragmented extraction | Low | Domains appear sporadically everywhere, never anchored |
Both produce the same ratio. Use the per-domain density distribution to
distinguish them:
| Metric | Meaning |
|--------|---------|
| `domain_densities` | Per-domain fraction of chapters containing ≥1 entity with that domain |
| `density_std` | Standard deviation of densities. High std → healthy topic separation (bimodal: some domains cross-cutting, others local). Low std → uniform but thin. |
| `cross_cutting_ratio` | Fraction of domains appearing in >50 % of chapters — the foundational, cross-cutting concepts. |
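The following sketch illustrates why the distribution matters, using two invented four-chapter corpora that share the same aggregate ratio. The helper `density_stats` and both toy inputs are assumptions made for illustration; the statistics mirror the definitions in the table above (sample standard deviation, strict `> 0.5` cut-off).

```python
import statistics
from typing import Dict, Set, Tuple


def density_stats(
    chapters_by_domain: Dict[str, Set[str]], n_chapters: int
) -> Tuple[Dict[str, float], float, float, float]:
    """Return (densities, density_std, cross_cutting_ratio, coverage_ratio)."""
    densities = {d: len(chs) / n_chapters for d, chs in chapters_by_domain.items()}
    values = list(densities.values())
    std = statistics.stdev(values) if len(values) >= 2 else 0.0
    cross_cutting = sum(1 for v in values if v > 0.5) / len(values) if values else 0.0
    # With a fixed chapter count, the aggregate ratio is the mean density.
    ratio = sum(values) / len(values) if values else 0.0
    return densities, std, cross_cutting, ratio


# One cross-cutting domain plus three local ones (hypothetical data).
separated = {
    "A": {"c1", "c2", "c3", "c4"},
    "B": {"c1"},
    "C": {"c2"},
    "D": {"c3", "c4"},
}
# Every domain present in exactly half the chapters (hypothetical data).
thin = {d: {"c1", "c2"} for d in ("A", "B", "C", "D")}

_, std_sep, cc_sep, ratio_sep = density_stats(separated, 4)
_, std_thin, cc_thin, ratio_thin = density_stats(thin, 4)
```

Both corpora have a `coverage_ratio` of 0.5, but only the first has a non-zero `density_std` and a non-zero `cross_cutting_ratio`, which is the signal the aggregate ratio hides.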
Example interpretation for the WoN/VSM infospace (1021 entities, 35 chapters):
```
Exchange 0.848 ████████████████ cross-cutting
Regulation 0.848 ████████████████ cross-cutting
General Theory 0.727 ██████████████ cross-cutting
Production 0.636 ████████████ cross-cutting
Distribution 0.576 ███████████ borderline
Accumulation 0.364 ███████ book-specific
Consumption 0.333 ██████ book-specific
density_std = 0.33 (high → healthy topic separation)
cross_cutting_ratio = 0.50
coverage_ratio = 0.44 (below 0.50 threshold, but for correct reasons)
```
**What coverage does NOT capture**
- **Entity-to-entity connections** — whether concepts reference each other,
form explanatory chains, or cluster coherently. That is C3 (Structural
Coherence).
- **VSM competency question answerability** — whether current entities
collectively support answering the declared competency questions. That
requires LLM-Eval and is a planned metric (see below).
- **Whether absent (domain, chapter) cells are meaningful gaps or expected
absences** — the ratio treats them identically.
**Threshold guidance**
- `min: 0.50` is appropriate for a focused, single-topic corpus where all
  chapters address the same set of domains.
- For heterogeneous multi-book corpora, domains introduced late create empty
  cells for all earlier chapters. A threshold of `0.30–0.40` is more
  realistic.
- Prefer `cross_cutting_ratio` and `density_std` as the primary diagnostic
  signals; use `coverage_ratio` only for trend tracking across snapshots.
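As a rough sketch of how this guidance could be applied, the helper below picks a threshold by corpus type. The function names are hypothetical, and choosing `0.30` (the lower end of the suggested range) for multi-book corpora is an assumption made for the example, not a value fixed by the pipeline.

```python
def coverage_threshold(multi_book: bool) -> float:
    """Pick a minimum coverage_ratio for the corpus type (illustrative values)."""
    return 0.30 if multi_book else 0.50


def ratio_passes(coverage_ratio: float, multi_book: bool) -> bool:
    """True when the aggregate ratio meets the threshold for this corpus type."""
    return coverage_ratio >= coverage_threshold(multi_book)


# The example ratio of 0.44 quoted above, checked against both thresholds.
single_topic_ok = ratio_passes(0.44, multi_book=False)
multi_book_ok = ratio_passes(0.44, multi_book=True)
```

Under these assumed values, a ratio of 0.44 fails the single-topic threshold but passes the multi-book one, which is why the ratio should be read together with `density_std` and `cross_cutting_ratio`.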
**Metrics:**
| Metric | Type | Computation |
|--------|------|-------------|
| `domain_vsm_matrix` | Deterministic | Count entities per {economic_domain, VSM_system} cell |
| `coverage_ratio` | Deterministic | `populated_cells / expected_cells` |
| `vsm_balance_entropy` | Deterministic | Shannon entropy of entity distribution across VSM systems (higher = more balanced) |
| `empty_cells` | Deterministic | List of {domain, VSM_system} pairs with zero entities |
| `competency_coverage` | LLM-Eval | For each competency question, can it be answered with current entities? |
| `fca_gap_concepts` | Deterministic | Attribute combinations in the FCA lattice with no corresponding entity |
| Metric | Type | Computation | Status |
|--------|------|-------------|--------|
| `coverage_ratio` | Deterministic | `populated_cells / (n_domains × n_chapters)` | ✅ Implemented |
| `domain_densities` | Deterministic | Per-domain fraction of chapters with ≥1 entity | ✅ Implemented |
| `density_std` | Deterministic | Std dev of domain densities | ✅ Implemented |
| `cross_cutting_ratio` | Deterministic | Fraction of domains with density > 0.5 | ✅ Implemented |
| `empty_cells` | Deterministic | List of unpopulated (domain, chapter) pairs | ✅ Implemented |
| `fca_gap_concepts` | Deterministic | Attribute combos in FCA lattice with no entity | ✅ Implemented |
| `domain_vsm_matrix` | Deterministic | Entities per {domain, VSM_system} cell — requires VSM mappings in `extra_attributes` | ⬜ Not yet wired |
| `competency_coverage` | LLM-Eval | For each competency question, can it be answered? | ⬜ Not yet implemented |
**Pipeline:**
1. Parse entity metadata (domain, VSM mapping) from files on disk
2. Build domain × VSM matrix; identify empty cells
3. Build FCA formal context; compute lattice; extract gap concepts
4. Define competency questions (initially hand-written, later LLM-generated
from the source material)
5. LLM-evaluate answerability of each question
6. Aggregate into coverage ratio, entropy, and gap list
**Pipeline (current):**
1. Parse entity metadata (domain, source chapter) from entity files
2. Build domain × chapter binary matrix; identify empty cells
3. Compute per-domain densities, std dev, cross-cutting ratio
4. Build FCA formal context; extract gap concepts
5. Aggregate into `CoverageReport`
**Output:** `output/metrics/coverage-report.md` + YAML with matrix, gaps,
and competency question results.
**Output:** Snapshot recorded in `output/metrics/history.yaml`. A
`coverage-report.md` per chapter is planned but not yet generated.
### C3: Structural Coherence


@@ -1,12 +1,51 @@
"""
C2 — Coverage completeness.
Uses FCA and cross-tabulation to detect structural coverage gaps:
attribute combinations (domain × VSM system) with no entities.
**What this measures**
Builds a binary *domain × chapter* cross-table: rows are economic domains
found across all entities, columns are source chapters. A cell is marked
populated when at least one entity has that (domain, chapter) combination.
coverage_ratio = populated_cells / (n_domains × n_chapters)
This measures how broadly economic domains are distributed across source
chapters. It does not measure how richly entities connect to each other
(that is C3, Structural Coherence), nor VSM system coverage (that requires
supplying ``extra_attributes`` with VSM system mappings, which the pipeline
does not currently do), nor competency-question answerability (a planned
LLM-Eval metric).
**Interpreting the ratio alone is misleading.** A single ratio cannot
distinguish two structurally different situations:
- *Healthy topic separation* — domains are locally dense within their
book/section, sparse elsewhere. The matrix has clean block structure;
low cross-chapter density per domain is *expected*.
- *Fragmented extraction* — domains appear sporadically in all chapters,
never strongly anchored anywhere. The matrix is uniformly thin everywhere.
Both can produce the same ratio. Use the *per-domain density distribution*
(``domain_densities``, ``density_std``, ``cross_cutting_ratio``) to
distinguish them:
- High ``density_std`` + bimodal distribution → healthy topic separation.
- Low ``density_std`` + uniform distribution → potential fragmentation.
- ``cross_cutting_ratio`` measures what fraction of domains span more than
half the chapters — these are the foundational cross-cutting concepts.
**Threshold note**
A 0.50 threshold is appropriate for a focused single-topic corpus. For a
heterogeneous multi-book corpus (e.g. all five books of The Wealth of
Nations), domains introduced in later books create empty cells for all
earlier chapters, causing the ratio to fall below 0.50 even for structurally
healthy corpora. Consider 0.30–0.40 for large, multi-topic corpora.
"""
from __future__ import annotations
import math
import statistics
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
@@ -16,9 +55,30 @@ from markitect.analysis.fca import FormalContext, find_empty_cells, find_gap_con
@dataclass
class CoverageReport:
    """Results from coverage analysis."""
    """Results from coverage analysis.
    Attributes:
        coverage_ratio: Fraction of (domain, chapter) cells that are
            populated. See module docstring for interpretation notes.
        domain_densities: Per-domain fraction of chapters that contain
            at least one entity with that domain. Keys are domain names.
        density_std: Standard deviation of ``domain_densities`` values.
            High std suggests healthy topic separation; low std suggests
            uniform but thin coverage.
        cross_cutting_ratio: Fraction of domains that appear in more than
            50 % of source chapters. These are the foundational concepts.
        empty_cells: List of ``{dimension_a, dimension_b}`` dicts for each
            unpopulated (domain, chapter) cell.
        gap_concepts: FCA gap concepts — attribute combinations present in
            the lattice but with no entity.
        domain_counts: Total entity count per domain.
        entity_count: Total number of entities analysed.
    """
    coverage_ratio: float = 0.0
    domain_densities: Dict[str, float] = field(default_factory=dict)
    density_std: float = 0.0
    cross_cutting_ratio: float = 0.0
    empty_cells: List[dict] = field(default_factory=list)
    gap_concepts: List[dict] = field(default_factory=list)
    domain_counts: Dict[str, int] = field(default_factory=dict)
@@ -28,6 +88,9 @@ class CoverageReport:
        return {
            "concern": "C2",
            "coverage_ratio": round(self.coverage_ratio, 4),
            "domain_densities": {k: round(v, 4) for k, v in self.domain_densities.items()},
            "density_std": round(self.density_std, 4),
            "cross_cutting_ratio": round(self.cross_cutting_ratio, 4),
            "empty_cells": self.empty_cells,
            "gap_concepts_count": len(self.gap_concepts),
            "domain_counts": self.domain_counts,
@@ -102,8 +165,29 @@ def check_coverage(
    populated = total_cells - len(empty)
    ratio = populated / total_cells if total_cells > 0 else 0.0
    # Per-domain density: fraction of chapters that contain this domain
    n_chapters = len(chapters)
    domain_densities: Dict[str, float] = {}
    if n_chapters > 0:
        empty_pairs = {(e["dimension_a"], e["dimension_b"]) for e in empty}
        for d in domains:
            populated_for_domain = sum(
                1 for c in chapters if (d, c) not in empty_pairs
            )
            domain_densities[d.removeprefix("domain:")] = populated_for_domain / n_chapters
    density_values = list(domain_densities.values())
    density_std = statistics.stdev(density_values) if len(density_values) >= 2 else 0.0
    cross_cutting_ratio = (
        sum(1 for v in density_values if v > 0.5) / len(density_values)
        if density_values else 0.0
    )
    return CoverageReport(
        coverage_ratio=ratio,
        domain_densities=domain_densities,
        density_std=round(density_std, 6),
        cross_cutting_ratio=round(cross_cutting_ratio, 4),
        empty_cells=empty,
        gap_concepts=gap_dicts,
        domain_counts=domain_counts,