Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on first pass; 3 failed (network errors).
eval-summary --update-metrics written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
redundancy_ratio 0.0061 (max 0.10)
coverage_ratio 0.6190 (min 0.40)
coherence_comps 0.0000 (max 3)
consistency_cycles 0.0000 (max 0)
granularity_entropy 2.6748 (min 1.0)
per_entity_mean 3.9556 (min 3.5)

Dimension breakdown (mean across 985 entities):
definition_precision 3.62
source_grounding 4.36
domain_placement 4.56
vsm_relevance 3.31
explanatory_value 3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
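The dashboard verdict can be re-derived from the figures quoted in the commit message. The sketch below is a hypothetical re-check, not the project's own tooling: the names `DASHBOARD`, `passes`, and `summarize` are made up for illustration, and the bounds are assumed to be inclusive, which is what the reported pass for consistency_cycles (0 against max 0) implies.

```python
# Hypothetical re-check of the viability dashboard; names are illustrative,
# values and bounds are copied from the commit message.
DASHBOARD = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def passes(value, kind, bound):
    """Assumption: 'max' is an inclusive ceiling, 'min' an inclusive floor."""
    return value <= bound if kind == "max" else value >= bound


def summarize(dashboard):
    """Return per-metric pass flags and a 'passed/total PASS' verdict string."""
    results = {name: passes(*spec) for name, spec in dashboard.items()}
    return results, f"{sum(results.values())}/{len(results)} PASS"


results, verdict = summarize(DASHBOARD)
print(verdict)  # 6/6 PASS
```

With a strict comparison instead of an inclusive one, consistency_cycles would fail and the verdict would read 5/6, so the inclusive reading is the one consistent with the reported result.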
| entity_slug | evaluator | evaluated_at | overall_score | scores |
|---|---|---|---|---|
| bank_economic_development_metrics | null | 2026-02-23T00:38:42.345058 | 2.6 | (evaluation text below) |
Evaluation: Bank Economic Development Metrics
definition_precision — 2.0 / 5.0
The definition is vague and umbrella-like, listing broad categories ("capital allocation efficiency, transaction cost reduction, financial innovation impact") without clearly defining what constitutes these metrics or how they're measured. It reads more like a modern economic framework than a precise concept from Smith's work.
source_grounding — 2.0 / 5.0
While Smith discusses banking's role in economic development in Book II, Chapter 2, he doesn't present a systematic framework of "metrics" for evaluating banking performance in the modern sense implied here. This appears to impose contemporary economic measurement concepts onto Smith's more descriptive analysis.
domain_placement — 4.0 / 5.0
The "Accumulation" domain is appropriate since Book II, Chapter 2 deals with how banking facilitates capital accumulation and economic growth. The entity correctly identifies this as part of the broader accumulation process Smith describes.
vsm_relevance — 3.0 / 5.0
This entity could map to S3 (internal regulation/audit) as it involves measurement and evaluation systems, but the metrics themselves are too abstract and don't clearly represent operational elements of a viable system. It's more of a meta-analytical framework than a system component.
explanatory_value — 2.0 / 5.0
The entity doesn't illuminate specific mechanisms Smith describes but rather creates a modern analytical overlay that obscures his actual insights about banking's role. It names categories without explaining the underlying economic relationships Smith actually discusses.
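The overall_score of 2.6 recorded for this entity matches the unweighted mean of the five dimension scores above. The sketch below checks that arithmetic. It assumes equal weighting and rounding to one decimal place, neither of which is stated in the record, and the variable names are illustrative.

```python
from statistics import mean

# Dimension scores copied from the evaluation above.
scores = {
    "definition_precision": 2.0,
    "source_grounding": 2.0,
    "domain_placement": 4.0,
    "vsm_relevance": 3.0,
    "explanatory_value": 2.0,
}

# Assumption: overall_score is the plain (unweighted) mean, rounded to 1 decimal.
overall = round(mean(list(scores.values())), 1)
print(overall)  # 2.6
```

The sum is 13.0 over five dimensions, giving 2.6, so the equal-weight reading is at least consistent with this row.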