markitect-main/examples/infospace-with-history/output/evaluations/economic_system_improvement.md
tegwick a9ca0adfcf feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on
first pass; 3 failed (network errors). eval-summary --update-metrics
written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio   0.0061  (max 0.10)
  coverage_ratio     0.6190  (min 0.40)
  coherence_comps    0.0000  (max 3)
  consistency_cycles 0.0000  (max 0)
  granularity_entropy 2.6748 (min 1.0)
  per_entity_mean    3.9556  (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
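The "6/6 PASS" line in the commit message can be reproduced from the six metric values and their bounds. The following is a minimal sketch, not the project's actual `eval-summary` logic: the `DASHBOARD` table and the `passes` helper are hypothetical names, and the bounds are assumed to be inclusive (which is what `consistency_cycles` at 0 against a max of 0 implies).

```python
# Hypothetical threshold table: metric -> (bound kind, limit, observed value).
# The numbers are copied from the commit message; the structure is an assumption.
DASHBOARD = {
    "redundancy_ratio": ("max", 0.10, 0.0061),
    "coverage_ratio": ("min", 0.40, 0.6190),
    "coherence_comps": ("max", 3, 0.0),
    "consistency_cycles": ("max", 0, 0.0),
    "granularity_entropy": ("min", 1.0, 2.6748),
    "per_entity_mean": ("min", 3.5, 3.9556),
}


def passes(kind, limit, value):
    """Assumed inclusive bounds: a value equal to its limit passes."""
    return value <= limit if kind == "max" else value >= limit


# Evaluate every metric and summarise in the same "n/m PASS" form.
results = {name: passes(*spec) for name, spec in DASHBOARD.items()}
summary = f"{sum(results.values())}/{len(results)} PASS"
print(summary)
```

With a strict comparison instead of an inclusive one, `consistency_cycles` would fail, so the inclusive reading is the only one consistent with the reported result.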


| entity_slug | evaluator | evaluated_at | overall_score |
| --- | --- | --- | --- |
| economic_system_improvement | null | 2026-02-23T05:16:56.220025 | 2.6 |

scores:

| name | value | max_value | rationale |
| --- | --- | --- | --- |
| definition_precision | 2.0 | 5.0 | The definition is quite vague and circular, essentially defining improvement as "enhancing and refining...to increase effectiveness and efficiency." It lacks specificity about what constitutes improvement or how it differs from general policy change. |
| source_grounding | 2.0 | 5.0 | While Smith does compare different economic systems and discusses their relative merits, he doesn't explicitly theorize about "economic system improvement" as a distinct process or concept. This appears to be an interpretive extrapolation rather than a clearly stated idea in the text. |
| domain_placement | 3.0 | 5.0 | "General Theory" is appropriate given the broad, meta-level nature of this concept, though the assignment to "Book IV, Chapter 0" (which doesn't exist) suggests weak textual grounding. The domain placement itself is reasonable for such an abstract concept. |
| vsm_relevance | 4.0 | 5.0 | This entity maps well to S4 (intelligence/environmental adaptation) and S5 (identity/policy) functions, as it involves learning from experience, adapting systems based on understanding, and refining organizational identity and policies. It has clear VSM relevance for system evolution. |
| explanatory_value | 2.0 | 5.0 | The entity names a general phenomenon but doesn't illuminate specific mechanisms or structural relations that Smith discusses. It's too abstract to provide genuine explanatory power about how economic systems actually function or change. |

Evaluation: Economic System Improvement

definition_precision — 2.0 / 5.0

The definition is quite vague and circular, essentially defining improvement as "enhancing and refining...to increase effectiveness and efficiency." It lacks specificity about what constitutes improvement or how it differs from general policy change.

source_grounding — 2.0 / 5.0

While Smith does compare different economic systems and discusses their relative merits, he doesn't explicitly theorize about "economic system improvement" as a distinct process or concept. This appears to be an interpretive extrapolation rather than a clearly stated idea in the text.

domain_placement — 3.0 / 5.0

"General Theory" is appropriate given the broad, meta-level nature of this concept, though the assignment to "Book IV, Chapter 0" (which doesn't exist) suggests weak textual grounding. The domain placement itself is reasonable for such an abstract concept.

vsm_relevance — 4.0 / 5.0

This entity maps well to S4 (intelligence/environmental adaptation) and S5 (identity/policy) functions, as it involves learning from experience, adapting systems based on understanding, and refining organizational identity and policies. It has clear VSM relevance for system evolution.

explanatory_value — 2.0 / 5.0

The entity names a general phenomenon but doesn't illuminate specific mechanisms or structural relations that Smith discusses. It's too abstract to provide genuine explanatory power about how economic systems actually function or change.
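The `overall_score` of 2.6 recorded for this entity is consistent with an unweighted mean of the five dimension scores above. The sketch below checks that arithmetic. The equal weighting and the rounding to one decimal place are assumptions, since the file does not state how the evaluator aggregates dimensions.

```python
from statistics import mean

# Dimension scores copied from this evaluation (each out of 5.0).
scores = {
    "definition_precision": 2.0,
    "source_grounding": 2.0,
    "domain_placement": 3.0,
    "vsm_relevance": 4.0,
    "explanatory_value": 2.0,
}

# Assumption: overall_score is the unweighted mean, rounded to one decimal.
overall = round(mean(scores.values()), 1)
print(overall)  # 2.6
```

A weighted scheme could produce the same figure by coincidence, so this confirms only that the equal-weight reading matches the recorded value.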