feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)

Batch evaluation of all 988 entities via OpenRouter. 985 succeeded on
first pass; 3 failed (network errors). eval-summary --update-metrics
wrote per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061  (max 0.10)
  coverage_ratio       0.6190  (min 0.40)
  coherence_comps      0.0000  (max 3)
  consistency_cycles   0.0000  (max 0)
  granularity_entropy  2.6748  (min 1.0)
  per_entity_mean      3.9556  (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
parent 81a4c8796a
commit a9ca0adfcf
986 changed files with 63216 additions and 1 deletions


@@ -0,0 +1,62 @@
---
entity_slug: proportion_between_metals
evaluator: null
evaluated_at: '2026-02-23T06:11:39.695782'
overall_score: 4.2
scores:
- name: definition_precision
  value: 4.0
  max_value: 5.0
  rationale: The definition clearly identifies a specific concept - the exchange ratio
    between precious metals - and distinguishes between official and market-determined
    ratios. It avoids circularity and captures a measurable economic relationship.
- name: source_grounding
  value: 5.0
  max_value: 5.0
  rationale: Smith explicitly discusses the proportion between gold and silver in
    Book I, Chapter 5, examining how these ratios affect monetary systems and the
    relative values of different metals as money. This is directly grounded in the
    source text.
- name: domain_placement
  value: 5.0
  max_value: 5.0
  rationale: '"Regulation" is the appropriate domain since Smith discusses both market-determined
    ratios and official government attempts to establish stable proportions between
    metals. This sits squarely within monetary regulation and policy.'
- name: vsm_relevance
  value: 3.0
  max_value: 5.0
  rationale: This entity has some VSM relevance as it relates to S2 (coordination
    between different monetary standards) and S3 (regulatory mechanisms for maintaining
    stable ratios), but the mapping is not particularly strong or illuminating for
    organizational analysis.
- name: explanatory_value
  value: 4.0
  max_value: 5.0
  rationale: This entity reveals an important structural mechanism in monetary systems
    - how the relative values of different metals create coordination problems and
    regulatory challenges. It illuminates the underlying dynamics of bimetallic monetary
    systems rather than just naming a surface phenomenon.
---
# Evaluation: Proportion Between Metals
## definition_precision — 4.0 / 5.0
The definition clearly identifies a specific concept - the exchange ratio between precious metals - and distinguishes between official and market-determined ratios. It avoids circularity and captures a measurable economic relationship.
## source_grounding — 5.0 / 5.0
Smith explicitly discusses the proportion between gold and silver in Book I, Chapter 5, examining how these ratios affect monetary systems and the relative values of different metals as money. This is directly grounded in the source text.
## domain_placement — 5.0 / 5.0
"Regulation" is the appropriate domain since Smith discusses both market-determined ratios and official government attempts to establish stable proportions between metals. This sits squarely within monetary regulation and policy.
## vsm_relevance — 3.0 / 5.0
This entity has some VSM relevance as it relates to S2 (coordination between different monetary standards) and S3 (regulatory mechanisms for maintaining stable ratios), but the mapping is not particularly strong or illuminating for organizational analysis.
## explanatory_value — 4.0 / 5.0
This entity reveals an important structural mechanism in monetary systems - how the relative values of different metals create coordination problems and regulatory challenges. It illuminates the underlying dynamics of bimetallic monetary systems rather than just naming a surface phenomenon.
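
The five dimension scores above and the `overall_score` of 4.2 in the front matter are consistent with an unweighted mean. The commit does not say how `overall_score` is actually computed, so the sketch below is an assumption, and the `overall_score` function is a hypothetical helper, not part of the project's tooling.

```python
from statistics import mean

# Dimension values copied from this file's front matter.
scores = {
    "definition_precision": 4.0,
    "source_grounding": 5.0,
    "domain_placement": 5.0,
    "vsm_relevance": 3.0,
    "explanatory_value": 4.0,
}


def overall_score(dimension_scores: dict[str, float]) -> float:
    """Unweighted mean of the dimension values, rounded to one decimal.

    Assumed aggregation rule; the source only shows the resulting number.
    """
    return round(mean(dimension_scores.values()), 1)


result = overall_score(scores)
print(result)
```

For this entity the assumed rule reproduces the recorded value: (4 + 5 + 5 + 3 + 4) / 5 = 4.2.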