markitect-main/examples/infospace-with-history/output/evaluations/coin_degradation_measurement.md
tegwick a9ca0adfcf feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 985 succeeded on
first pass; 3 failed (network errors). Metrics written by
eval-summary --update-metrics, with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061  (max 0.10)
  coverage_ratio       0.6190  (min 0.40)
  coherence_comps      0.0000  (max 3)
  consistency_cycles   0.0000  (max 0)
  granularity_entropy  2.6748  (min 1.0)
  per_entity_mean      3.9556  (min 3.5)
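
As a rough illustration of how the dashboard verdict could be reproduced, the sketch below checks each metric against its bound. The values and limits are copied from the dashboard; the `DASHBOARD` layout, the `passes` helper, and the inclusive comparisons are assumptions made for this example, not the project's actual implementation.

```python
# Values and limits copied from the viability dashboard.
# The (value, kind, limit) layout is hypothetical, chosen for illustration.
DASHBOARD = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def passes(value, kind, limit):
    """Return True if value respects the bound (assumed inclusive)."""
    return value <= limit if kind == "max" else value >= limit


results = {name: passes(*spec) for name, spec in DASHBOARD.items()}
passed = sum(results.values())
print(f"{passed}/{len(results)} PASS")  # 6/6 PASS
```

The inclusive comparison matters for consistency_cycles, where the value equals its maximum of 0.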

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94
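
A quick plausibility check on these figures, assuming that each entity's overall score is the unweighted mean of its five dimension scores and that per_entity_mean averages those overall scores (neither is confirmed by the commit message): under those assumptions the mean of the five dimension means should match per_entity_mean up to rounding.

```python
from statistics import mean

# Dimension means as reported above (rounded to two decimals).
dimension_means = {
    "definition_precision": 3.62,
    "source_grounding": 4.36,
    "domain_placement": 4.56,
    "vsm_relevance": 3.31,
    "explanatory_value": 3.94,
}
reported_per_entity_mean = 3.9556

# Assumption: equal weights and every entity scored on all five dimensions.
approx = mean(dimension_means.values())
gap = abs(approx - reported_per_entity_mean)

# Each input is rounded to 0.01, so the average can be off by up to 0.005.
consistent = gap < 0.005
print(round(approx, 4), round(gap, 4), consistent)
```

The small gap is what rounding the dimension means to two decimals would produce, so the figures are at least consistent with that reading.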

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00

3.1 KiB

| entity_slug | evaluator | evaluated_at | overall_score |
|---|---|---|---|
| coin_degradation_measurement | null | 2026-02-23T04:43:54.844272 | 4.4 |

scores

| name | value | max_value | rationale |
|---|---|---|---|
| definition_precision | 4.0 | 5.0 | The definition is precise and specific, providing exact quantitative measures (gold >2%, silver >8% below standard weight) and clearly identifying the causes of degradation. It captures a distinct measurable phenomenon rather than a vague concept. |
| source_grounding | 5.0 | 5.0 | This entity is directly grounded in Smith's text, which provides the specific degradation percentages for English coins before recoinage. The figures and context are explicitly stated in Book IV, Chapter 6. |
| domain_placement | 5.0 | 5.0 | The "Regulation" domain is perfectly appropriate, as coin degradation measurement is fundamentally about monetary regulation and the need for government oversight of currency standards. This fits squarely within regulatory economic policy. |
| vsm_relevance | 4.0 | 5.0 | This entity maps well to S3 (internal regulation/audit) as it represents the measurement and monitoring function necessary for maintaining monetary system integrity. It also connects to S2 (coordination) by providing data needed to prevent monetary oscillations. |
| explanatory_value | 4.0 | 5.0 | The entity provides genuine explanatory power by quantifying the extent of monetary degradation and demonstrating why recoinage was necessary. It illuminates the mechanism by which poor monetary regulation leads to currency debasement and economic instability. |

Evaluation: Coin Degradation Measurement

definition_precision — 4.0 / 5.0

The definition is precise and specific, providing exact quantitative measures (gold >2%, silver >8% below standard weight) and clearly identifying the causes of degradation. It captures a distinct measurable phenomenon rather than a vague concept.

source_grounding — 5.0 / 5.0

This entity is directly grounded in Smith's text, which provides the specific degradation percentages for English coins before recoinage. The figures and context are explicitly stated in Book IV, Chapter 6.

domain_placement — 5.0 / 5.0

The "Regulation" domain is perfectly appropriate, as coin degradation measurement is fundamentally about monetary regulation and the need for government oversight of currency standards. This fits squarely within regulatory economic policy.

vsm_relevance — 4.0 / 5.0

This entity maps well to S3 (internal regulation/audit) as it represents the measurement and monitoring function necessary for maintaining monetary system integrity. It also connects to S2 (coordination) by providing data needed to prevent monetary oscillations.

explanatory_value — 4.0 / 5.0

The entity provides genuine explanatory power by quantifying the extent of monetary degradation and demonstrating why recoinage was necessary. It illuminates the mechanism by which poor monetary regulation leads to currency debasement and economic instability.
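
The overall_score of 4.4 recorded for this entity is consistent with an unweighted mean of the five dimension scores above. The sketch below shows that arithmetic; the equal weighting and the rounding to one decimal are assumptions, since the evaluation file does not state how the overall score is derived.

```python
from statistics import mean

# Dimension scores for coin_degradation_measurement, as listed above.
scores = {
    "definition_precision": 4.0,
    "source_grounding": 5.0,
    "domain_placement": 5.0,
    "vsm_relevance": 4.0,
    "explanatory_value": 4.0,
}

# Assumption: overall_score is the plain average, rounded to one decimal.
overall = round(mean(scores.values()), 1)
print(overall)  # 4.4
```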