feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)

Batch evaluation of all 988 entities via OpenRouter. 985 succeeded on
first pass; 3 failed (network errors). Ran eval-summary --update-metrics,
which recorded per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061  (max 0.10)
  coverage_ratio       0.6190  (min 0.40)
  coherence_comps      0.0000  (max 3)
  consistency_cycles   0.0000  (max 0)
  granularity_entropy  2.6748  (min 1.0)
  per_entity_mean      3.9556  (min 3.5)
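
The PASS count above can be reproduced with a minimal sketch like the one below. It only restates the six values and bounds from the dashboard; the `METRICS` layout and the `passes` helper are hypothetical names for illustration, not the project's actual code. It also assumes the bounds are inclusive, which the dashboard implies (consistency_cycles passes at exactly its max of 0).

```python
# Values and bounds copied from the viability dashboard in this commit message.
# The structure and helper below are illustrative assumptions, not project code.
METRICS = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def passes(value, kind, bound):
    """Return True if value respects an inclusive 'max' or 'min' bound."""
    return value <= bound if kind == "max" else value >= bound


results = {name: passes(*spec) for name, spec in METRICS.items()}
passed = sum(results.values())
print(f"{passed}/{len(results)} PASS")
```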

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94
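
As a sanity check, the five dimension means above can be averaged and compared with per_entity_mean. The sketch assumes each entity's overall score is the unweighted mean of its five dimension scores (which matches the one evaluation file shown in this commit, but is not confirmed for the whole pipeline); the variable names are illustrative.

```python
from statistics import mean

# Dimension means as reported in this commit message (rounded to two decimals).
DIMENSION_MEANS = {
    "definition_precision": 3.62,
    "source_grounding": 4.36,
    "domain_placement": 4.56,
    "vsm_relevance": 3.31,
    "explanatory_value": 3.94,
}

# Assumption: overall score per entity is the unweighted mean of the five dimensions.
mean_of_dimensions = mean(DIMENSION_MEANS.values())
reported_per_entity_mean = 3.9556

print(f"mean of dimension means: {mean_of_dimensions:.4f}")
print(f"difference from reported: {abs(mean_of_dimensions - reported_per_entity_mean):.4f}")
```

Under that assumption the two figures agree to within the rounding of the two-decimal dimension means.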

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
parent 81a4c8796a
commit a9ca0adfcf
986 changed files with 63216 additions and 1 deletions


@@ -0,0 +1,65 @@
---
entity_slug: economic_system_function
evaluator: null
evaluated_at: '2026-02-23T05:16:23.011806'
overall_score: 2.0
scores:
- name: definition_precision
  value: 2.0
  max_value: 5.0
  rationale: The definition is overly broad and circular, essentially defining "economic
    system function" as "how economic systems function." It uses vague terms like
    "operational role" and "practical effects" without establishing clear boundaries
    that distinguish this concept from related ones like economic systems themselves
    or their outcomes.
- name: source_grounding
  value: 2.0
  max_value: 5.0
  rationale: While Smith does discuss different economic systems in Book IV, the entity
    appears to be a modern analytical abstraction rather than a concept Smith explicitly
    develops. The reference to "Book IV, Chapter 0" is also problematic since Book
    IV doesn't have a Chapter 0, suggesting weak source grounding.
- name: domain_placement
  value: 3.0
  max_value: 5.0
  rationale: '"General Theory" is appropriate given the broad, abstract nature of
    the concept, though the entity is so general it could arguably fit in multiple
    domains. The placement isn''t wrong but reflects the entity''s lack of specificity
    rather than clear conceptual boundaries.'
- name: vsm_relevance
  value: 1.0
  max_value: 5.0
  rationale: This entity is too abstract and meta-level to map meaningfully to any
    specific VSM system. It describes the general notion of "how systems work" rather
    than identifying particular functional components that would correspond to S1-S5
    operations.
- name: explanatory_value
  value: 2.0
  max_value: 5.0
  rationale: The entity provides minimal explanatory power because it merely labels
    the general idea that economic systems have functions without identifying specific
    mechanisms, causal relationships, or structural elements. It names a surface phenomenon
    rather than illuminating underlying processes that Smith actually analyzes.
---
# Evaluation: Economic System Function
## definition_precision — 2.0 / 5.0
The definition is overly broad and circular, essentially defining "economic system function" as "how economic systems function." It uses vague terms like "operational role" and "practical effects" without establishing clear boundaries that distinguish this concept from related ones like economic systems themselves or their outcomes.
## source_grounding — 2.0 / 5.0
While Smith does discuss different economic systems in Book IV, the entity appears to be a modern analytical abstraction rather than a concept Smith explicitly develops. The reference to "Book IV, Chapter 0" is also problematic since Book IV doesn't have a Chapter 0, suggesting weak source grounding.
## domain_placement — 3.0 / 5.0
"General Theory" is appropriate given the broad, abstract nature of the concept, though the entity is so general it could arguably fit in multiple domains. The placement isn't wrong but reflects the entity's lack of specificity rather than clear conceptual boundaries.
## vsm_relevance — 1.0 / 5.0
This entity is too abstract and meta-level to map meaningfully to any specific VSM system. It describes the general notion of "how systems work" rather than identifying particular functional components that would correspond to S1-S5 operations.
## explanatory_value — 2.0 / 5.0
The entity provides minimal explanatory power because it merely labels the general idea that economic systems have functions without identifying specific mechanisms, causal relationships, or structural elements. It names a surface phenomenon rather than illuminating underlying processes that Smith actually analyzes.