Batch evaluation of all 988 entities via OpenRouter.
984 succeeded on first pass; 3 failed (network errors).
eval-summary --update-metrics written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061 (max 0.10)
  coverage_ratio       0.6190 (min 0.40)
  coherence_comps      0.0000 (max 3)
  consistency_cycles   0.0000 (max 0)
  granularity_entropy  2.6748 (min 1.0)
  per_entity_mean      3.9556 (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
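The pass/fail logic behind the viability dashboard can be sketched as follows. This is a minimal illustration, not the eval-summary tool's actual code: the `METRICS` table, the `within_bound` helper, and the inclusive comparisons are all assumptions. An inclusive bound is assumed because consistency_cycles passes at exactly its maximum of 0.

```python
# Hypothetical sketch: the structure and names below are assumptions,
# not taken from the eval-summary implementation.
# Each entry: metric name -> (observed value, bound kind, bound).
METRICS = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0000, "max", 3),
    "consistency_cycles": (0.0000, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def within_bound(value, kind, bound):
    """Return True if value satisfies an inclusive max or min bound."""
    if kind == "max":
        return value <= bound
    return value >= bound


results = {name: within_bound(*spec) for name, spec in METRICS.items()}
passed = sum(results.values())
summary = f"{passed}/{len(results)} PASS"
print(summary)  # prints "6/6 PASS"
```

With a strict comparison instead, consistency_cycles would fail, so the inclusive reading is the one consistent with the reported 6/6.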
| entity_slug | evaluator | evaluated_at | overall_score | scores |
|---|---|---|---|---|
| bank_financial_stability_metrics | null | 2026-02-23T00:41:51.431624 | 2.4 | (per-dimension evaluation below) |

Evaluation: Bank Financial Stability Metrics
definition_precision — 2.0 / 5.0
The definition lists modern banking metrics (capital adequacy ratios, liquidity coverage ratios, stress tests) that are anachronistic for Smith's era and creates a vague umbrella term rather than capturing a distinct concept. The circular phrasing "measures used to assess banking stability" that "help evaluate banking stability" lacks precision.
source_grounding — 1.0 / 5.0
This entity introduces modern regulatory banking concepts that did not exist in Smith's time and are not discussed in Book II, Chapter 2, which treats money as a branch of the general stock of society within Book II's broader account of the nature and accumulation of stock. Smith's banking analysis there lacks the sophisticated quantitative metrics described here.
domain_placement — 3.0 / 5.0
While "Regulation" is a reasonable domain for banking oversight concepts, the entity would be better placed in a "Banking" or "Financial Systems" domain since it focuses on internal bank assessment rather than external regulatory frameworks. The domain assignment is defensible but not optimal.
vsm_relevance — 4.0 / 5.0
This entity maps well to VSM System 3 (internal regulation/audit) as these metrics represent monitoring and control mechanisms for assessing organizational health. The concept of systematic measurement for stability aligns naturally with the regulatory/audit function.
explanatory_value — 2.0 / 5.0
While banking stability assessment is conceptually important, this entity merely names modern measurement categories without illuminating underlying mechanisms or structural relationships that Smith actually explored. It adds little explanatory power to understanding Smith's economic framework.
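
The overall_score of 2.4 recorded for this entity is consistent with an unweighted mean of the five dimension scores above. That aggregation rule is an assumption inferred from the numbers, not something the evaluator output states, and the variable names below are illustrative only.

```python
from statistics import mean

# Dimension scores copied from the evaluation of
# bank_financial_stability_metrics above (each out of 5.0).
dimension_scores = {
    "definition_precision": 2.0,
    "source_grounding": 1.0,
    "domain_placement": 3.0,
    "vsm_relevance": 4.0,
    "explanatory_value": 2.0,
}

# Assumption: overall_score is the plain (unweighted) mean of the dimensions.
overall_score = round(mean(dimension_scores.values()), 2)
print(overall_score)  # prints 2.4
```

The low overall score is driven mainly by source_grounding (1.0), with vsm_relevance (4.0) as the only dimension above the midpoint.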