docs(metrics): clarify C2 coverage — domain×chapter matrix, not domain×VSM

- coverage.py: rewrite module docstring to explain what the metric actually
  computes (domain × chapter cross-tabulation, not VSM system coverage),
  what it does not capture (entity connectivity → C3), and when the
  threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
  for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
  document the distribution-based interpretation, add implementation
  status table distinguishing what is wired vs planned

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@@ -290,31 +290,101 @@ pair list, scores, and merge/retire recommendations.
 
 ### C2: Coverage Completeness
 
-**Goal:** Identify domain areas and VSM systems that lack adequate
-representation in the entity set.
+**Goal:** Identify domain areas that are structurally sparse or isolated
+within the corpus — and separately, assess whether the entity set can answer
+the infospace's declared competency questions.
+
+**What the deterministic check actually computes**
+
+The current implementation builds a binary *domain × chapter* cross-table:
+one row per economic domain, one column per source chapter. A cell is
+populated if at least one entity has that (domain, chapter) combination.
+
+    coverage_ratio = populated_cells / (n_domains × n_chapters)
+
+**This is not the same as VSM coverage.** The domain × VSM matrix described
+in earlier versions of this document requires VSM system mappings to be
+supplied as `extra_attributes` to `check_coverage()`. The pipeline does not
+currently do this, so `coverage_ratio` reflects *cross-chapter domain
+distribution*, not *VSM system coverage*.
+
+**Important: interpret the distribution, not just the ratio**
+
+The aggregate ratio conflates two structurally different situations:
+
+| Situation | coverage_ratio | What it means |
+|---|---|---|
+| Healthy topic separation | Low | Domains are locally dense within their book/section — expected for a multi-topic corpus |
+| Fragmented extraction | Low | Domains appear sporadically everywhere, never anchored |
+
+Both produce the same ratio. Use the per-domain density distribution to
+distinguish them:
+
+| Metric | Meaning |
+|--------|---------|
+| `domain_densities` | Per-domain fraction of chapters containing ≥1 entity with that domain |
+| `density_std` | Standard deviation of densities. High std → healthy topic separation (bimodal: some domains cross-cutting, others local). Low std → uniform but thin. |
+| `cross_cutting_ratio` | Fraction of domains appearing in >50 % of chapters — the foundational, cross-cutting concepts. |
+
+Example interpretation for the WoN/VSM infospace (1021 entities, 35 chapters):
+
+```
+Exchange        0.848  ████████████████  cross-cutting
+Regulation      0.848  ████████████████  cross-cutting
+General Theory  0.727  ██████████████    cross-cutting
+Production      0.636  ████████████      cross-cutting
+Distribution    0.576  ███████████       borderline
+Accumulation    0.364  ███████           book-specific
+Consumption     0.333  ██████            book-specific
+
+density_std         = 0.33  (high → healthy topic separation)
+cross_cutting_ratio = 0.50
+coverage_ratio      = 0.44  (below 0.50 threshold, but for correct reasons)
+```
+
+**What coverage does NOT capture**
+
+- **Entity-to-entity connections** — whether concepts reference each other,
+  form explanatory chains, or cluster coherently. That is C3 (Structural
+  Coherence).
+- **VSM competency question answerability** — whether current entities
+  collectively support answering the declared competency questions. That
+  requires LLM-Eval and is a planned metric (see below).
+- **Whether absent (domain, chapter) cells are meaningful gaps or expected
+  absences** — the ratio treats them identically.
+
+**Threshold guidance**
+
+- `min: 0.50` is appropriate for a focused, single-topic corpus where all
+  chapters address the same set of domains.
+- For heterogeneous multi-book corpora, domains introduced late create empty
+  cells for all earlier chapters. A threshold of `0.30–0.40` is more
+  realistic.
+- Prefer `cross_cutting_ratio` and `density_std` as the primary diagnostic
+  signals; use `coverage_ratio` only for trend tracking across snapshots.
 
 **Metrics:**
 
-| Metric | Type | Computation |
-|--------|------|-------------|
-| `domain_vsm_matrix` | Deterministic | Count entities per {economic_domain, VSM_system} cell |
-| `coverage_ratio` | Deterministic | `populated_cells / expected_cells` |
-| `vsm_balance_entropy` | Deterministic | Shannon entropy of entity distribution across VSM systems (higher = more balanced) |
-| `empty_cells` | Deterministic | List of {domain, VSM_system} pairs with zero entities |
-| `competency_coverage` | LLM-Eval | For each competency question, can it be answered with current entities? |
-| `fca_gap_concepts` | Deterministic | Attribute combinations in the FCA lattice with no corresponding entity |
+| Metric | Type | Computation | Status |
+|--------|------|-------------|--------|
+| `coverage_ratio` | Deterministic | `populated_cells / (n_domains × n_chapters)` | ✅ Implemented |
+| `domain_densities` | Deterministic | Per-domain fraction of chapters with ≥1 entity | ✅ Implemented |
+| `density_std` | Deterministic | Std dev of domain densities | ✅ Implemented |
+| `cross_cutting_ratio` | Deterministic | Fraction of domains with density > 0.5 | ✅ Implemented |
+| `empty_cells` | Deterministic | List of unpopulated (domain, chapter) pairs | ✅ Implemented |
+| `fca_gap_concepts` | Deterministic | Attribute combos in FCA lattice with no entity | ✅ Implemented |
+| `domain_vsm_matrix` | Deterministic | Entities per {domain, VSM_system} cell — requires VSM mappings in `extra_attributes` | ⬜ Not yet wired |
+| `competency_coverage` | LLM-Eval | For each competency question, can it be answered? | ⬜ Not yet implemented |
 
-**Pipeline:**
-1. Parse entity metadata (domain, VSM mapping) from files on disk
-2. Build domain × VSM matrix; identify empty cells
-3. Build FCA formal context; compute lattice; extract gap concepts
-4. Define competency questions (initially hand-written, later LLM-generated
-   from the source material)
-5. LLM-evaluate answerability of each question
-6. Aggregate into coverage ratio, entropy, and gap list
+**Pipeline (current):**
+1. Parse entity metadata (domain, source chapter) from entity files
+2. Build domain × chapter binary matrix; identify empty cells
+3. Compute per-domain densities, std dev, cross-cutting ratio
+4. Build FCA formal context; extract gap concepts
+5. Aggregate into `CoverageReport`
 
-**Output:** `output/metrics/coverage-report.md` + YAML with matrix, gaps,
-and competency question results.
+**Output:** Snapshot recorded in `output/metrics/history.yaml`. A
+`coverage-report.md` per chapter is planned but not yet generated.
 
 ### C3: Structural Coherence
 
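The cross-table described in this hunk can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code in `coverage.py`: the helper name `coverage_from_pairs` and the toy input are hypothetical, and each entity is reduced to a single (domain, chapter) pair.

```python
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]


def coverage_from_pairs(pairs: Iterable[Pair]) -> Tuple[float, List[Pair]]:
    """Return (coverage_ratio, empty_cells) for a binary domain x chapter table.

    Hypothetical helper: `pairs` holds one (domain, chapter) tuple per entity.
    """
    populated: Set[Pair] = set(pairs)  # duplicates collapse: the table is binary
    domains = sorted({d for d, _ in populated})
    chapters = sorted({c for _, c in populated})
    total_cells = len(domains) * len(chapters)
    empty = [(d, c) for d in domains for c in chapters if (d, c) not in populated]
    ratio = len(populated) / total_cells if total_cells > 0 else 0.0
    return ratio, empty


# Toy input (made up): three entities, two of them in the same cell.
entities = [
    ("Exchange", "ch1"),
    ("Exchange", "ch1"),
    ("Production", "ch2"),
]
ratio, empty = coverage_from_pairs(entities)
print(ratio)  # 2 populated cells out of 2 x 2
print(empty)
```

In this sketch the rows and columns come only from the pairs that were observed, so a chapter with no entities at all never becomes a column. Whether the real pipeline behaves the same way is not visible in this diff.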
@@ -1,12 +1,51 @@
 """
 C2 — Coverage completeness.
 
-Uses FCA and cross-tabulation to detect structural coverage gaps:
-attribute combinations (domain × VSM system) with no entities.
+**What this measures**
+
+Builds a binary *domain × chapter* cross-table: rows are economic domains
+found across all entities, columns are source chapters. A cell is marked
+populated when at least one entity has that (domain, chapter) combination.
+
+    coverage_ratio = populated_cells / (n_domains × n_chapters)
+
+This is a measure of how *uniformly* economic domains are distributed across
+source chapters, not of how richly entities connect to each other (that is
+C3 — Structural Coherence) and not of VSM competency-question answerability
+(that requires supplying ``extra_attributes`` with VSM system mappings, which
+the pipeline does not currently do).
+
+**Interpreting the ratio alone is misleading.** A single ratio cannot
+distinguish two structurally different situations:
+
+- *Healthy topic separation* — domains are locally dense within their
+  book/section, sparse elsewhere. The matrix has clean block structure;
+  low cross-chapter density per domain is *expected*.
+- *Fragmented extraction* — domains appear sporadically in all chapters,
+  never strongly anchored anywhere. The matrix is uniformly thin everywhere.
+
+Both can produce the same ratio. Use the *per-domain density distribution*
+(``domain_densities``, ``density_std``, ``cross_cutting_ratio``) to
+distinguish them:
+
+- High ``density_std`` + bimodal distribution → healthy topic separation.
+- Low ``density_std`` + uniform distribution → potential fragmentation.
+- ``cross_cutting_ratio`` measures what fraction of domains span more than
+  half the chapters — these are the foundational cross-cutting concepts.
+
+**Threshold note**
+
+A 0.50 threshold is appropriate for a focused single-topic corpus. For a
+heterogeneous multi-book corpus (e.g. all five books of The Wealth of
+Nations), domains introduced in later books create empty cells for all
+earlier chapters, causing the ratio to fall below 0.50 even for structurally
+healthy corpora. Consider 0.30–0.40 for large, multi-topic corpora.
 """
 
 from __future__ import annotations
 
+import math
+import statistics
 from dataclasses import dataclass, field
 from typing import Any, Dict, List, Optional
 
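The docstring's claim that one ratio can hide two different shapes is easy to check on toy numbers. The sketch below is an illustration only: `summarize` is a hypothetical helper and the densities are made up. It relies on the fact that every domain spans the same number of chapters, so the aggregate ratio equals the mean of the per-domain densities.

```python
import statistics
from typing import Dict, Tuple


def summarize(densities: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (coverage_ratio, density_std, cross_cutting_ratio).

    Hypothetical helper; the ratio is computed as the mean density.
    """
    values = list(densities.values())
    if not values:
        return 0.0, 0.0, 0.0
    ratio = sum(values) / len(values)
    std = statistics.stdev(values) if len(values) >= 2 else 0.0
    cross_cutting = sum(1 for v in values if v > 0.5) / len(values)
    return ratio, std, cross_cutting


# Made-up densities over 8 chapters; both corpora populate 16 of 32 cells.
separated = {"A": 8 / 8, "B": 6 / 8, "C": 1 / 8, "D": 1 / 8}
fragmented = {"A": 4 / 8, "B": 4 / 8, "C": 4 / 8, "D": 4 / 8}

for name, densities in (("separated", separated), ("fragmented", fragmented)):
    ratio, std, cross_cutting = summarize(densities)
    print(f"{name}: ratio={ratio:.2f} std={std:.2f} cross_cutting={cross_cutting:.2f}")
```

Both toy corpora report a ratio of 0.50, but only the first has a non-zero `density_std` and a non-zero `cross_cutting_ratio`, which is the distinction the docstring asks readers to look for.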
@@ -16,9 +55,30 @@ from markitect.analysis.fca import FormalContext, find_empty_cells, find_gap_con
 
 @dataclass
 class CoverageReport:
-    """Results from coverage analysis."""
+    """Results from coverage analysis.
+
+    Attributes:
+        coverage_ratio: Fraction of (domain, chapter) cells that are
+            populated. See module docstring for interpretation notes.
+        domain_densities: Per-domain fraction of chapters that contain
+            at least one entity with that domain. Keys are domain names.
+        density_std: Standard deviation of ``domain_densities`` values.
+            High std suggests healthy topic separation; low std suggests
+            uniform but thin coverage.
+        cross_cutting_ratio: Fraction of domains that appear in more than
+            50 % of source chapters. These are the foundational concepts.
+        empty_cells: List of ``{dimension_a, dimension_b}`` dicts for each
+            unpopulated (domain, chapter) cell.
+        gap_concepts: FCA gap concepts — attribute combinations present in
+            the lattice but with no entity.
+        domain_counts: Total entity count per domain.
+        entity_count: Total number of entities analysed.
+    """
 
     coverage_ratio: float = 0.0
+    domain_densities: Dict[str, float] = field(default_factory=dict)
+    density_std: float = 0.0
+    cross_cutting_ratio: float = 0.0
     empty_cells: List[dict] = field(default_factory=list)
     gap_concepts: List[dict] = field(default_factory=list)
     domain_counts: Dict[str, int] = field(default_factory=dict)
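The `density_std` attribute is documented only as a "standard deviation". The `check_coverage` hunk in this commit computes it with `statistics.stdev`, which is the sample form (divides by n - 1). With only a handful of domains this differs noticeably from the population form `statistics.pstdev`. A small sketch with made-up densities:

```python
import statistics

# Made-up per-domain densities for a four-domain corpus.
densities = [0.9, 0.8, 0.3, 0.2]

sample_std = statistics.stdev(densities)       # divides by n - 1
population_std = statistics.pstdev(densities)  # divides by n

print(round(sample_std, 4), round(population_std, 4))
```

Any threshold placed on `density_std` should therefore state which form it assumes, since the gap between the two grows as the number of domains shrinks.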
@@ -28,6 +88,9 @@ class CoverageReport:
         return {
             "concern": "C2",
             "coverage_ratio": round(self.coverage_ratio, 4),
+            "domain_densities": {k: round(v, 4) for k, v in self.domain_densities.items()},
+            "density_std": round(self.density_std, 4),
+            "cross_cutting_ratio": round(self.cross_cutting_ratio, 4),
             "empty_cells": self.empty_cells,
             "gap_concepts_count": len(self.gap_concepts),
             "domain_counts": self.domain_counts,
@@ -102,8 +165,29 @@ def check_coverage(
     populated = total_cells - len(empty)
     ratio = populated / total_cells if total_cells > 0 else 0.0
 
+    # Per-domain density: fraction of chapters that contain this domain
+    n_chapters = len(chapters)
+    domain_densities: Dict[str, float] = {}
+    if n_chapters > 0:
+        empty_pairs = {(e["dimension_a"], e["dimension_b"]) for e in empty}
+        for d in domains:
+            populated_for_domain = sum(
+                1 for c in chapters if (d, c) not in empty_pairs
+            )
+            domain_densities[d.removeprefix("domain:")] = populated_for_domain / n_chapters
+
+    density_values = list(domain_densities.values())
+    density_std = statistics.stdev(density_values) if len(density_values) >= 2 else 0.0
+    cross_cutting_ratio = (
+        sum(1 for v in density_values if v > 0.5) / len(density_values)
+        if density_values else 0.0
+    )
+
     return CoverageReport(
         coverage_ratio=ratio,
+        domain_densities=domain_densities,
+        density_std=round(density_std, 6),
+        cross_cutting_ratio=round(cross_cutting_ratio, 4),
         empty_cells=empty,
         gap_concepts=gap_dicts,
         domain_counts=domain_counts,
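The density step in the hunk above can be exercised in isolation. The sketch below is a standalone approximation, not the function from `coverage.py`: the helper name `densities_from_empty_cells` and the `chapter:` labels are hypothetical, and the shape of the empty-cell dicts (`dimension_a` for the domain, `dimension_b` for the chapter) is assumed from the hunk. `str.removeprefix` requires Python 3.9 or later.

```python
from typing import Dict, List


def densities_from_empty_cells(
    domains: List[str], chapters: List[str], empty: List[dict]
) -> Dict[str, float]:
    """Per-domain fraction of chapters that are populated.

    Assumes each entry of `empty` has the keys "dimension_a" (domain)
    and "dimension_b" (chapter).
    """
    if not chapters:
        return {}
    empty_pairs = {(e["dimension_a"], e["dimension_b"]) for e in empty}
    return {
        # Count the chapters whose (domain, chapter) cell is not empty.
        d.removeprefix("domain:"): sum((d, c) not in empty_pairs for c in chapters)
        / len(chapters)
        for d in domains
    }


# Toy input (made up): Consumption is missing from three of four chapters.
domains = ["domain:Exchange", "domain:Consumption"]
chapters = ["chapter:1", "chapter:2", "chapter:3", "chapter:4"]
empty = [
    {"dimension_a": "domain:Consumption", "dimension_b": "chapter:1"},
    {"dimension_a": "domain:Consumption", "dimension_b": "chapter:2"},
    {"dimension_a": "domain:Consumption", "dimension_b": "chapter:3"},
]
result = densities_from_empty_cells(domains, chapters, empty)
print(result)
```

Deriving densities from the complement of the empty-cell list, as the commit does, avoids a second pass over the entities, at the cost of depending on the exact key names of the empty-cell dicts.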