Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on first pass; 3 failed (network errors). eval-summary --update-metrics written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061 (max 0.10)
  coverage_ratio       0.6190 (min 0.40)
  coherence_comps      0.0000 (max 3)
  consistency_cycles   0.0000 (max 0)
  granularity_entropy  2.6748 (min 1.0)
  per_entity_mean      3.9556 (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
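The 6/6 PASS line can be reproduced from the dashboard figures alone. The sketch below is illustrative and is not the project's actual eval-summary code: the `METRICS` layout and the `within_bound` helper are hypothetical names, and it assumes that "max" and "min" bounds are inclusive (which the `consistency_cycles` value of 0 against a max of 0 suggests).

```python
# Values and bounds copied from the viability dashboard in the commit message.
# The tuple layout (value, bound_kind, bound) is an assumption for illustration.
METRICS = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0000, "max", 3),
    "consistency_cycles": (0.0000, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def within_bound(value, kind, bound):
    """Return True if value satisfies an inclusive max or min bound."""
    if kind == "max":
        return value <= bound
    if kind == "min":
        return value >= bound
    raise ValueError(f"unknown bound kind: {kind}")


# Evaluate every metric, then summarize in the dashboard's "N/M PASS" form.
results = {name: within_bound(*spec) for name, spec in METRICS.items()}
passed = sum(results.values())
summary = f"{passed}/{len(results)} PASS"
print(summary)
```

With inclusive bounds, every metric listed above satisfies its threshold.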
| entity_slug | evaluator | evaluated_at | overall_score | scores |
|---|---|---|---|---|
| tale_versus_weight_measurement | null | 2026-02-23T06:29:06.116095 | 4.0 | see the evaluation below |
Evaluation: Tale Versus Weight Measurement
definition_precision — 4.0 / 5.0
The definition clearly distinguishes between two specific methods of coin measurement (counting vs. weighing) and identifies the trade-off between accuracy and convenience. The concept is well-bounded and avoids circularity, though it could be slightly more precise about what constitutes "tale" measurement.
source_grounding — 4.0 / 5.0
This appears well-grounded in Smith's actual discussion of monetary mechanics in Book IV, Chapter 6, where he examines practical aspects of currency use including the weighing of gold coins. The entity accurately reflects Smith's analysis of how measurement methods affect monetary policy effectiveness.
domain_placement — 5.0 / 5.0
The "Regulation" domain assignment is highly appropriate, as this concept deals with monetary policy mechanisms and the regulatory implications of different currency measurement standards. This is fundamentally about how monetary systems are regulated and standardized.
vsm_relevance — 3.0 / 5.0
This entity has moderate VSM relevance, primarily mapping to S3 (internal regulation/audit) as it concerns measurement standards and control mechanisms within the monetary system. It also touches on S2 (coordination) regarding standardization, but the mapping is not as natural as more operational concepts.
explanatory_value — 4.0 / 5.0
This entity provides genuine explanatory power by illuminating a specific mechanism that affects currency stability and seignorage effectiveness. It reveals how seemingly technical measurement choices have broader economic implications for monetary policy and system stability.
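The five dimension scores above are consistent with the row's overall_score of 4.0 if the overall score is taken to be their unweighted mean. That aggregation rule is an assumption, since the source does not state how overall_score is computed. A minimal sketch:

```python
from statistics import mean

# Dimension scores for tale_versus_weight_measurement, copied from the evaluation.
scores = {
    "definition_precision": 4.0,
    "source_grounding": 4.0,
    "domain_placement": 5.0,
    "vsm_relevance": 3.0,
    "explanatory_value": 4.0,
}

# Assumption: overall_score is the unweighted mean of the five dimensions.
overall_score = round(mean(scores.values()), 1)
print(overall_score)
```

Under the same assumption, averaging the five dimension means in the commit message (3.62, 4.36, 4.56, 3.31, 3.94) gives about 3.958, close to the reported per_entity_mean of 3.9556. The small gap would be expected from rounding of the per-dimension figures.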