docs(metrics): clarify C2 coverage — domain×chapter matrix, not domain×VSM

- coverage.py: rewrite module docstring to explain what the metric actually
  computes (domain × chapter cross-tabulation, not VSM system coverage),
  what it does not capture (entity connectivity → C3), and when the
  threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
  for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
  document the distribution-based interpretation, add implementation status
  table distinguishing what is wired vs planned

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-20 00:08:46 +01:00
parent 0f54f094e4
commit dfe56a4f9b
2 changed files with 177 additions and 23 deletions


@@ -290,31 +290,101 @@ pair list, scores, and merge/retire recommendations.
### C2: Coverage Completeness
**Goal:** Identify domain areas and VSM systems that lack adequate
representation in the entity set.
**Goal:** Identify domain areas that are structurally sparse or isolated
within the corpus — and separately, assess whether the entity set can answer
the infospace's declared competency questions.
**What the deterministic check actually computes**
The current implementation builds a binary *domain × chapter* cross-table:
one row per economic domain, one column per source chapter. A cell is
populated if at least one entity has that (domain, chapter) combination.
coverage_ratio = populated_cells / (n_domains × n_chapters)
**This is not the same as VSM coverage.** The domain × VSM matrix described
in earlier versions of this document requires VSM system mappings to be
supplied as `extra_attributes` to `check_coverage()`. The pipeline does not
currently do this, so `coverage_ratio` reflects *cross-chapter domain
distribution*, not *VSM system coverage*.
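As a minimal sketch of the cross-table described above, the ratio can be computed from a flat list of (domain, chapter) pairs. The function name matches the metric, but the signature and the input shape are assumptions made for illustration; they are not taken from `coverage.py`.

```python
def coverage_ratio(pairs, domains, chapters):
    """Fraction of (domain, chapter) cells holding at least one entity."""
    domain_set, chapter_set = set(domains), set(chapters)
    # The matrix is binary: several entities in one cell still count once.
    populated = {
        (d, c) for d, c in pairs if d in domain_set and c in chapter_set
    }
    total_cells = len(domain_set) * len(chapter_set)
    return len(populated) / total_cells if total_cells else 0.0


# Hypothetical sample data: one (domain, chapter) pair per entity.
pairs = [("Exchange", 1), ("Exchange", 1), ("Exchange", 2), ("Production", 2)]
ratio = coverage_ratio(pairs, ["Exchange", "Production"], [1, 2, 3])
```

Here three of the six cells are populated, so `ratio` is 0.5. The duplicate `("Exchange", 1)` entry does not raise the count.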
**Important: interpret the distribution, not just the ratio**
The aggregate ratio conflates two structurally different situations:
| Situation | coverage_ratio | What it means |
|---|---|---|
| Healthy topic separation | Low | Domains are locally dense within their book/section — expected for a multi-topic corpus |
| Fragmented extraction | Low | Domains appear sporadically everywhere, never anchored |
Both produce the same ratio. Use the per-domain density distribution to
distinguish them:
| Metric | Meaning |
|--------|---------|
| `domain_densities` | Per-domain fraction of chapters containing ≥1 entity with that domain |
| `density_std` | Standard deviation of densities. High std → healthy topic separation (bimodal: some domains cross-cutting, others local). Low std → uniform but thin. |
| `cross_cutting_ratio` | Fraction of domains appearing in >50 % of chapters — the foundational, cross-cutting concepts. |
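The three distribution metrics in the table can be sketched as follows. This is an illustration under stated assumptions, not the code in `check_coverage()`: the helper name `density_metrics`, its signature, and the choice of population standard deviation (`statistics.pstdev`) are all hypothetical, since the document does not say whether the population or sample form is used.

```python
from statistics import pstdev


def density_metrics(pairs, domains, chapters, cutoff=0.5):
    """Return (domain_densities, density_std, cross_cutting_ratio)."""
    chapter_set = set(chapters)
    seen = {d: set() for d in domains}
    for domain, chapter in pairs:
        if domain in seen and chapter in chapter_set:
            seen[domain].add(chapter)

    n_chapters = len(chapter_set)
    # Density: share of chapters in which the domain appears at least once.
    densities = {
        d: (len(chs) / n_chapters if n_chapters else 0.0)
        for d, chs in seen.items()
    }
    values = list(densities.values())
    # Assumption: population standard deviation.
    std = pstdev(values) if values else 0.0
    # Strictly greater than the cutoff, matching ">50 % of chapters".
    cross = sum(v > cutoff for v in values) / len(values) if values else 0.0
    return densities, std, cross


# Hypothetical sample: domain "A" spans every chapter, "B" only one.
pairs = [("A", 1), ("A", 2), ("A", 3), ("A", 4), ("B", 1)]
densities, std, cross = density_metrics(pairs, ["A", "B"], [1, 2, 3, 4])
```

In this sample `densities` is `{"A": 1.0, "B": 0.25}`, `std` is 0.375 and `cross` is 0.5: one cross-cutting domain and one local domain, which is the high-spread pattern the table associates with healthy topic separation.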
Example interpretation for the WoN/VSM infospace (1021 entities, 35 chapters):
```
Exchange 0.848 ████████████████ cross-cutting
Regulation 0.848 ████████████████ cross-cutting
General Theory 0.727 ██████████████ cross-cutting
Production 0.636 ████████████ cross-cutting
Distribution 0.576 ███████████ borderline
Accumulation 0.364 ███████ book-specific
Consumption 0.333 ██████ book-specific
density_std = 0.33 (high → healthy topic separation)
cross_cutting_ratio = 0.50
coverage_ratio = 0.44 (below 0.50 threshold, but for correct reasons)
```
**What coverage does NOT capture**
- **Entity-to-entity connections** — whether concepts reference each other,
form explanatory chains, or cluster coherently. That is C3 (Structural
Coherence).
- **VSM competency question answerability** — whether current entities
collectively support answering the declared competency questions. That
requires LLM-Eval and is a planned metric (see below).
- **Whether absent (domain, chapter) cells are meaningful gaps or expected
absences** — the ratio treats them identically.
**Threshold guidance**
- `min: 0.50` is appropriate for a focused, single-topic corpus where all
chapters address the same set of domains.
- For heterogeneous multi-book corpora, domains introduced late create empty
cells for all earlier chapters. A threshold of `0.30–0.40` is more
realistic.
- Prefer `cross_cutting_ratio` and `density_std` as the primary diagnostic
signals; use `coverage_ratio` only for trend tracking across snapshots.
**Metrics:**
| Metric | Type | Computation |
|--------|------|-------------|
| `domain_vsm_matrix` | Deterministic | Count entities per {economic_domain, VSM_system} cell |
| `coverage_ratio` | Deterministic | `populated_cells / expected_cells` |
| `vsm_balance_entropy` | Deterministic | Shannon entropy of entity distribution across VSM systems (higher = more balanced) |
| `empty_cells` | Deterministic | List of {domain, VSM_system} pairs with zero entities |
| `competency_coverage` | LLM-Eval | For each competency question, can it be answered with current entities? |
| `fca_gap_concepts` | Deterministic | Attribute combinations in the FCA lattice with no corresponding entity |
| Metric | Type | Computation | Status |
|--------|------|-------------|--------|
| `coverage_ratio` | Deterministic | `populated_cells / (n_domains × n_chapters)` | ✅ Implemented |
| `domain_densities` | Deterministic | Per-domain fraction of chapters with ≥1 entity | ✅ Implemented |
| `density_std` | Deterministic | Std dev of domain densities | ✅ Implemented |
| `cross_cutting_ratio` | Deterministic | Fraction of domains with density > 0.5 | ✅ Implemented |
| `empty_cells` | Deterministic | List of unpopulated (domain, chapter) pairs | ✅ Implemented |
| `fca_gap_concepts` | Deterministic | Attribute combos in FCA lattice with no entity | ✅ Implemented |
| `domain_vsm_matrix` | Deterministic | Entities per {domain, VSM_system} cell — requires VSM mappings in `extra_attributes` | ⬜ Not yet wired |
| `competency_coverage` | LLM-Eval | For each competency question, can it be answered? | ⬜ Not yet implemented |
**Pipeline:**
1. Parse entity metadata (domain, VSM mapping) from files on disk
2. Build domain × VSM matrix; identify empty cells
3. Build FCA formal context; compute lattice; extract gap concepts
4. Define competency questions (initially hand-written, later LLM-generated
from the source material)
5. LLM-evaluate answerability of each question
6. Aggregate into coverage ratio, entropy, and gap list
**Pipeline (current):**
1. Parse entity metadata (domain, source chapter) from entity files
2. Build domain × chapter binary matrix; identify empty cells
3. Compute per-domain densities, std dev, cross-cutting ratio
4. Build FCA formal context; extract gap concepts
5. Aggregate into `CoverageReport`
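Steps 1 to 3 and 5 of the current pipeline can be sketched end to end as below. Everything here is illustrative: `CoverageReportSketch` is a hypothetical stand-in for the real `CoverageReport` (only the field names come from this document), the entity dictionaries with `domain` and `chapter` keys are an assumed input shape, and the FCA step is left out.

```python
from dataclasses import dataclass, field
from statistics import pstdev


@dataclass
class CoverageReportSketch:
    """Hypothetical stand-in for CoverageReport; not the real class."""
    coverage_ratio: float
    domain_densities: dict
    density_std: float
    cross_cutting_ratio: float
    empty_cells: list = field(default_factory=list)


def build_report(entities, domains, chapters):
    # Steps 1-2: binary domain x chapter matrix, kept as a set of cells.
    populated = {(e["domain"], e["chapter"]) for e in entities}
    cells = [(d, c) for d in domains for c in chapters]
    empty = [cell for cell in cells if cell not in populated]
    ratio = (len(cells) - len(empty)) / len(cells) if cells else 0.0

    # Step 3: per-domain density, spread, and cross-cutting share.
    densities = {
        d: (sum((d, c) in populated for c in chapters) / len(chapters)
            if chapters else 0.0)
        for d in domains
    }
    values = list(densities.values())
    std = pstdev(values) if values else 0.0  # assumption: population std dev
    cross = sum(v > 0.5 for v in values) / len(values) if values else 0.0

    # Step 5: aggregate (step 4, FCA gap concepts, is omitted here).
    return CoverageReportSketch(ratio, densities, std, cross, empty)


# Hypothetical entities, reduced to the two fields the check needs.
entities = [
    {"domain": "Exchange", "chapter": "ch1"},
    {"domain": "Exchange", "chapter": "ch2"},
    {"domain": "Accumulation", "chapter": "ch2"},
]
report = build_report(entities, ["Exchange", "Accumulation"], ["ch1", "ch2"])
```

For this sample, `report.coverage_ratio` is 0.75 and `report.empty_cells` is `[("Accumulation", "ch1")]`. Keeping the empty cells in the report lets a reader judge whether each one is a real gap or an expected absence, which the ratio alone cannot show.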
**Output:** `output/metrics/coverage-report.md` + YAML with matrix, gaps,
and competency question results.
**Output:** Snapshot recorded in `output/metrics/history.yaml`. A
`coverage-report.md` per chapter is planned but not yet generated.
### C3: Structural Coherence