feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)

Batch evaluation of all 988 entities via OpenRouter. 985 succeeded on
first pass; 3 failed (network errors). Ran eval-summary --update-metrics,
which recorded per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio    0.0061  (max 0.10)
  coverage_ratio      0.6190  (min 0.40)
  coherence_comps     0.0000  (max 3)
  consistency_cycles  0.0000  (max 0)
  granularity_entropy 2.6748  (min 1.0)
  per_entity_mean     3.9556  (min 3.5)
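
As a rough illustration of how the 6/6 PASS line can be reproduced from the values above, here is a minimal sketch. The dictionary layout and the `check_viability` helper are hypothetical and not taken from the project's tooling. Bounds are treated as inclusive: that is implied for `max` by consistency_cycles passing at 0 with a max of 0, and assumed for `min`.

```python
# Hypothetical structures; values transcribed from the dashboard above.
THRESHOLDS = {
    "redundancy_ratio": ("max", 0.10),
    "coverage_ratio": ("min", 0.40),
    "coherence_comps": ("max", 3),
    "consistency_cycles": ("max", 0),
    "granularity_entropy": ("min", 1.0),
    "per_entity_mean": ("min", 3.5),
}

METRICS = {
    "redundancy_ratio": 0.0061,
    "coverage_ratio": 0.6190,
    "coherence_comps": 0.0,
    "consistency_cycles": 0.0,
    "granularity_entropy": 2.6748,
    "per_entity_mean": 3.9556,
}


def check_viability(metrics, thresholds):
    """Return {metric_name: bool}; bounds are assumed to be inclusive."""
    results = {}
    for name, (kind, bound) in thresholds.items():
        value = metrics[name]
        if kind == "max":
            results[name] = value <= bound
        else:
            results[name] = value >= bound
    return results


results = check_viability(METRICS, THRESHOLDS)
passed = sum(results.values())
print(f"Viability dashboard: {passed}/{len(results)} PASS")
```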

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94
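
A quick sanity check on these numbers, assuming each entity's overall_score is the unweighted mean of its five dimension scores (consistent with the example file in this commit, where 1, 1, 1, 2, 1 gives 1.2) and that per_entity_mean averages those overall scores. Under that assumption the mean of the five dimension means should land close to per_entity_mean, up to rounding of the two-decimal figures. The variable names are illustrative only.

```python
from statistics import mean

# Dimension means transcribed from the commit message (rounded to 2 decimals).
dimension_means = {
    "definition_precision": 3.62,
    "source_grounding": 4.36,
    "domain_placement": 4.56,
    "vsm_relevance": 3.31,
    "explanatory_value": 3.94,
}

reported_per_entity_mean = 3.9556

# Mean of dimension means; equals the mean of per-entity means when every
# entity has all five scores (an assumption, not confirmed by the commit).
approx_mean = mean(dimension_means.values())
difference = abs(approx_mean - reported_per_entity_mean)
print(f"approx={approx_mean:.4f} reported={reported_per_entity_mean} diff={difference:.4f}")

# Single-entity check against the warehouse_export_system scores.
entity_scores = [1.0, 1.0, 1.0, 2.0, 1.0]
entity_overall = mean(entity_scores)
print(f"overall_score={entity_overall:.1f}")
```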

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
parent 81a4c8796a
commit a9ca0adfcf
986 changed files with 63216 additions and 1 deletions


@@ -0,0 +1,61 @@
---
entity_slug: warehouse_export_system
evaluator: null
evaluated_at: '2026-02-23T06:38:10.957085'
overall_score: 1.2
scores:
- name: definition_precision
value: 1.0
max_value: 5.0
rationale: There is no definition provided whatsoever, making it impossible to assess
what this entity represents or its conceptual boundaries. Without any definitional
content, this fails completely on precision.
- name: source_grounding
value: 1.0
max_value: 5.0
rationale: With no source chapter specified and no definition or context provided,
there is no evidence this entity is grounded in Smith's actual text. The term
"warehouse export system" sounds anachronistically modern and technical for 18th-century
economic discourse.
- name: domain_placement
value: 1.0
max_value: 5.0
rationale: The domain is listed as "unspecified," providing no information about
where this entity fits within economic or thematic categories. Without context
or definition, proper domain placement cannot be assessed.
- name: vsm_relevance
value: 2.0
max_value: 5.0
rationale: While "warehouse export system" could theoretically relate to S1 (operational
activities) in a supply chain context, the complete lack of definition makes VSM
mapping speculative at best. The systemic nature of the term suggests potential
VSM relevance, but this cannot be confirmed without substantive content.
- name: explanatory_value
value: 1.0
max_value: 5.0
rationale: An entity with no definition, context, or source grounding provides zero
explanatory value for understanding Smith's economic theory. It contributes nothing
to illuminating mechanisms or structural relations in "The Wealth of Nations."
---
# Evaluation: Warehouse Export System
## definition_precision — 1.0 / 5.0
There is no definition provided whatsoever, making it impossible to assess what this entity represents or its conceptual boundaries. Without any definitional content, this fails completely on precision.
## source_grounding — 1.0 / 5.0
With no source chapter specified and no definition or context provided, there is no evidence this entity is grounded in Smith's actual text. The term "warehouse export system" sounds anachronistically modern and technical for 18th-century economic discourse.
## domain_placement — 1.0 / 5.0
The domain is listed as "unspecified," providing no information about where this entity fits within economic or thematic categories. Without context or definition, proper domain placement cannot be assessed.
## vsm_relevance — 2.0 / 5.0
While "warehouse export system" could theoretically relate to S1 (operational activities) in a supply chain context, the complete lack of definition makes VSM mapping speculative at best. The systemic nature of the term suggests potential VSM relevance, but this cannot be confirmed without substantive content.
## explanatory_value — 1.0 / 5.0
An entity with no definition, context, or source grounding provides zero explanatory value for understanding Smith's economic theory. It contributes nothing to illuminating mechanisms or structural relations in "The Wealth of Nations."