markitect-main/examples/infospace-with-history/output/evaluations/aulnagers.md
tegwick a9ca0adfcf feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on
first pass; 3 failed (network errors). eval-summary --update-metrics
written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061  (max 0.10)
  coverage_ratio       0.6190  (min 0.40)
  coherence_comps      0.0000  (max 3)
  consistency_cycles   0.0000  (max 0)
  granularity_entropy  2.6748  (min 1.0)
  per_entity_mean      3.9556  (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
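
The "6/6 PASS" line in the commit message can be re-derived from the six values and bounds it lists. The sketch below is only an illustration, not the project's own implementation: the `METRICS` table and the `passes` helper are hypothetical names, and the bounds are assumed to be inclusive because `consistency_cycles` at 0 passes against "max 0".

```python
# Hypothetical re-check of the dashboard figures quoted in the commit message.
# Each entry maps a metric name to (value, bound kind, bound).
METRICS = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def passes(value, kind, bound):
    """Return True if value respects an inclusive max or min bound (assumed)."""
    return value <= bound if kind == "max" else value >= bound


# Evaluate every metric and count how many stay within their bound.
results = {name: passes(*spec) for name, spec in METRICS.items()}
passed = sum(results.values())
print(f"Viability dashboard: {passed}/{len(results)} PASS")
```

Under these assumptions all six checks hold, which matches the dashboard line in the commit message.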


| entity_slug | evaluator | evaluated_at | overall_score |
| --- | --- | --- | --- |
| aulnagers | null | 2026-02-23T00:36:23.671148 | 4.4 |

scores

| name | value | max_value | rationale |
| --- | --- | --- | --- |
| definition_precision | 4.0 | 5.0 | The definition is precise and captures a distinct historical role - public officials who certified woollen cloth quality and dimensions. The analogy to mint officials effectively clarifies their function without circularity. |
| source_grounding | 5.0 | 5.0 | This entity is directly grounded in Smith's text from Book I, Chapter 4, where he explicitly draws the parallel between aulnagers and mint officials as examples of public quality certification systems. |
| domain_placement | 5.0 | 5.0 | The "Regulation" domain assignment is perfectly appropriate, as aulnagers represent a clear example of government regulatory oversight in commercial markets through quality standardization. |
| vsm_relevance | 4.0 | 5.0 | Aulnagers map naturally to S3 (internal regulation/audit) as they perform quality control and standardization functions within the broader economic system. They could also relate to S2 (coordination) through their standardization role. |
| explanatory_value | 4.0 | 5.0 | This entity illuminates an important mechanism - how public institutions create trust and reduce transaction costs through quality certification - demonstrating Smith's broader point about the institutional foundations of market exchange. |

Evaluation: Aulnagers

definition_precision — 4.0 / 5.0

The definition is precise and captures a distinct historical role - public officials who certified woollen cloth quality and dimensions. The analogy to mint officials effectively clarifies their function without circularity.

source_grounding — 5.0 / 5.0

This entity is directly grounded in Smith's text from Book I, Chapter 4, where he explicitly draws the parallel between aulnagers and mint officials as examples of public quality certification systems.

domain_placement — 5.0 / 5.0

The "Regulation" domain assignment is perfectly appropriate, as aulnagers represent a clear example of government regulatory oversight in commercial markets through quality standardization.

vsm_relevance — 4.0 / 5.0

Aulnagers map naturally to S3 (internal regulation/audit) as they perform quality control and standardization functions within the broader economic system. They could also relate to S2 (coordination) through their standardization role.

explanatory_value — 4.0 / 5.0

This entity illuminates an important mechanism - how public institutions create trust and reduce transaction costs through quality certification - demonstrating Smith's broader point about the institutional foundations of market exchange.
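
The five dimension scores above are consistent with the reported `overall_score` of 4.4 if that figure is an unweighted mean. The file does not state how the overall score is computed, so the aggregation rule in this sketch is an assumption, and the variable names are illustrative only.

```python
from statistics import mean

# Dimension scores copied from the aulnagers evaluation (each out of 5.0).
scores = {
    "definition_precision": 4.0,
    "source_grounding": 5.0,
    "domain_placement": 5.0,
    "vsm_relevance": 4.0,
    "explanatory_value": 4.0,
}

# Assumption: overall_score is the unweighted mean of the dimension values,
# rounded to one decimal place.
overall = round(mean(scores.values()), 1)
print(overall)  # 4.4
```

If the evaluator weighted the dimensions instead, the result would generally differ from the plain mean shown here.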