markitect-main/examples/infospace-with-history/output/evaluations/tale_versus_weight_measurement.md
tegwick a9ca0adfcf feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 985 succeeded on
first pass; 3 failed (network errors). eval-summary --update-metrics
wrote per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061  (max 0.10)
  coverage_ratio       0.6190  (min 0.40)
  coherence_comps      0.0000  (max 3)
  consistency_cycles   0.0000  (max 0)
  granularity_entropy  2.6748  (min 1.0)
  per_entity_mean      3.9556  (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
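
The viability dashboard in the commit message pairs each metric with a min or max bound. The following is a minimal sketch of how such a tally might be computed. It assumes the bounds are inclusive, which the commit message does not state but which is needed for consistency_cycles (0 against max 0) to pass. The names `DASHBOARD` and `passes` are hypothetical and are not taken from markitect.

```python
# Values and bounds copied from the commit message's viability dashboard.
# Assumption: bounds are inclusive. Names here are illustrative only.
DASHBOARD = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def passes(value, kind, bound):
    """Return True if value respects an inclusive min or max bound."""
    return value >= bound if kind == "min" else value <= bound


results = {name: passes(*spec) for name, spec in DASHBOARD.items()}
passed = sum(results.values())
print(f"Viability dashboard: {passed}/{len(results)} PASS")
```

With a strict comparison instead of an inclusive one, consistency_cycles would fail and the tally would read 5/6.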


entity_slug    tale_versus_weight_measurement
evaluator      null
evaluated_at   2026-02-23T06:29:06.116095
overall_score  4.0
scores
  name                  value  max_value
  definition_precision  4.0    5.0
  source_grounding      4.0    5.0
  domain_placement      5.0    5.0
  vsm_relevance         3.0    5.0
  explanatory_value     4.0    5.0

Evaluation: Tale Versus Weight Measurement

definition_precision — 4.0 / 5.0

The definition clearly distinguishes between two specific methods of coin measurement (counting vs. weighing) and identifies the trade-off between accuracy and convenience. The concept is well-bounded and avoids circularity, though it could be slightly more precise about what constitutes "tale" measurement.

source_grounding — 4.0 / 5.0

This appears well-grounded in Smith's actual discussion of monetary mechanics in Book IV, Chapter 6, where he examines practical aspects of currency use including the weighing of gold coins. The entity accurately reflects Smith's analysis of how measurement methods affect monetary policy effectiveness.

domain_placement — 5.0 / 5.0

The "Regulation" domain assignment is highly appropriate, as this concept deals with monetary policy mechanisms and the regulatory implications of different currency measurement standards. This is fundamentally about how monetary systems are regulated and standardized.

vsm_relevance — 3.0 / 5.0

This entity has moderate VSM relevance, primarily mapping to S3 (internal regulation/audit) as it concerns measurement standards and control mechanisms within the monetary system. It also touches on S2 (coordination) regarding standardization, but the mapping is not as natural as more operational concepts.

explanatory_value — 4.0 / 5.0

This entity provides genuine explanatory power by illuminating a specific mechanism that affects currency stability and seignorage effectiveness. It reveals how seemingly technical measurement choices have broader economic implications for monetary policy and system stability.
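
The five dimension scores in this evaluation average to the reported overall_score of 4.0. The sketch below reproduces that arithmetic, assuming overall_score is the unweighted mean of the dimensions. The file does not state the aggregation rule, so this is a consistency check, not a description of how the evaluator computes it. Variable names are illustrative.

```python
from statistics import mean

# Dimension scores copied from this evaluation (each out of 5.0).
scores = {
    "definition_precision": 4.0,
    "source_grounding": 4.0,
    "domain_placement": 5.0,
    "vsm_relevance": 3.0,
    "explanatory_value": 4.0,
}

# Assumption: overall_score is the unweighted mean of the five dimensions.
overall_score = mean(list(scores.values()))
print(f"overall_score = {overall_score:.1f}")
```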