markitect-main/examples/infospace-with-history/output/evaluations/bank_economic_development_metrics.md
tegwick a9ca0adfcf feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on
first pass; 3 failed (network errors). eval-summary --update-metrics
written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio   0.0061  (max 0.10)
  coverage_ratio     0.6190  (min 0.40)
  coherence_comps    0.0000  (max 3)
  consistency_cycles 0.0000  (max 0)
  granularity_entropy 2.6748 (min 1.0)
  per_entity_mean    3.9556  (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
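
The viability dashboard in the commit message can be read as six threshold checks, each with either an upper or a lower bound. The sketch below reproduces the "6/6 PASS" tally from the values listed there. It is not the project's actual implementation: the `DASHBOARD` structure and the `passes` helper are hypothetical names, and the inclusive comparison is an assumption inferred from `consistency_cycles` passing at 0 with a max of 0.

```python
# Values and bounds copied from the dashboard in the commit message.
# Layout: metric name -> (observed value, bound kind, bound).
DASHBOARD = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def passes(value, kind, bound):
    """Return True if the value respects its bound (assumed inclusive)."""
    if kind == "max":
        return value <= bound
    return value >= bound


results = {name: passes(*spec) for name, spec in DASHBOARD.items()}
summary = f"{sum(results.values())}/{len(results)} PASS"
print(summary)
```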

| entity_slug | evaluator | evaluated_at | overall_score |
| --- | --- | --- | --- |
| bank_economic_development_metrics | null | 2026-02-23T00:38:42.345058 | 2.6 |

scores

| name | value | max_value | rationale |
| --- | --- | --- | --- |
| definition_precision | 2.0 | 5.0 | The definition is vague and umbrella-like, listing broad categories ("capital allocation efficiency, transaction cost reduction, financial innovation impact") without clearly defining what constitutes these metrics or how they're measured. It reads more like a modern economic framework than a precise concept from Smith's work. |
| source_grounding | 2.0 | 5.0 | While Smith discusses banking's role in economic development in Book II, Chapter 2, he doesn't present a systematic framework of "metrics" for evaluating banking performance in the modern sense implied here. This appears to impose contemporary economic measurement concepts onto Smith's more descriptive analysis. |
| domain_placement | 4.0 | 5.0 | The "Accumulation" domain is appropriate since Book II, Chapter 2 deals with how banking facilitates capital accumulation and economic growth. The entity correctly identifies this as part of the broader accumulation process Smith describes. |
| vsm_relevance | 3.0 | 5.0 | This entity could map to S3 (internal regulation/audit) as it involves measurement and evaluation systems, but the metrics themselves are too abstract and don't clearly represent operational elements of a viable system. It's more of a meta-analytical framework than a system component. |
| explanatory_value | 2.0 | 5.0 | The entity doesn't illuminate specific mechanisms Smith describes but rather creates a modern analytical overlay that obscures his actual insights about banking's role. It names categories without explaining the underlying economic relationships Smith actually discusses. |

Evaluation: Bank Economic Development Metrics

definition_precision — 2.0 / 5.0

The definition is vague and umbrella-like, listing broad categories ("capital allocation efficiency, transaction cost reduction, financial innovation impact") without clearly defining what constitutes these metrics or how they're measured. It reads more like a modern economic framework than a precise concept from Smith's work.

source_grounding — 2.0 / 5.0

While Smith discusses banking's role in economic development in Book II, Chapter 2, he doesn't present a systematic framework of "metrics" for evaluating banking performance in the modern sense implied here. This appears to impose contemporary economic measurement concepts onto Smith's more descriptive analysis.

domain_placement — 4.0 / 5.0

The "Accumulation" domain is appropriate since Book II, Chapter 2 deals with how banking facilitates capital accumulation and economic growth. The entity correctly identifies this as part of the broader accumulation process Smith describes.

vsm_relevance — 3.0 / 5.0

This entity could map to S3 (internal regulation/audit) as it involves measurement and evaluation systems, but the metrics themselves are too abstract and don't clearly represent operational elements of a viable system. It's more of a meta-analytical framework than a system component.

explanatory_value — 2.0 / 5.0

The entity doesn't illuminate specific mechanisms Smith describes but rather creates a modern analytical overlay that obscures his actual insights about banking's role. It names categories without explaining the underlying economic relationships Smith actually discusses.
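
The `overall_score` of 2.6 recorded for this entity matches the unweighted mean of the five dimension values above. The sketch below checks that arithmetic. The equal weighting and the rounding to one decimal are assumptions inferred from the numbers, not something the file states, and the `scores` dictionary is a hypothetical container for the values.

```python
from statistics import mean

# Dimension values copied from the evaluation above.
scores = {
    "definition_precision": 2.0,
    "source_grounding": 2.0,
    "domain_placement": 4.0,
    "vsm_relevance": 3.0,
    "explanatory_value": 2.0,
}

# Assumption: overall_score is the plain (unweighted) mean of the dimensions.
overall = round(mean(scores.values()), 1)
print(overall)
```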