feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)

Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on first
pass; 3 failed (network errors). eval-summary --update-metrics written with
per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061 (max 0.10)
  coverage_ratio       0.6190 (min 0.40)
  coherence_comps      0.0000 (max 3)
  consistency_cycles   0.0000 (max 0)
  granularity_entropy  2.6748 (min 1.0)
  per_entity_mean      3.9556 (min 3.5)

Dimension breakdown (mean across 985 entities):
  definition_precision 3.62
  source_grounding     4.36
  domain_placement     4.56
  vsm_relevance        3.31
  explanatory_value    3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
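
The 6/6 PASS result in the dashboard can be reproduced from the reported values and bounds. The following is a minimal sketch, not the project's actual checker: the `DASHBOARD` layout and the `check_dashboard` name are hypothetical, and the bounds are assumed to be inclusive (which `consistency_cycles` at 0.0000 against max 0 requires in order to pass).

```python
# Values and bounds are copied from the commit message.
# The tuple layout (value, kind, bound) and function name are illustrative assumptions.
DASHBOARD = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def check_dashboard(metrics):
    """Return {metric_name: bool}, treating every bound as inclusive."""
    results = {}
    for name, (value, kind, bound) in metrics.items():
        if kind == "max":
            results[name] = value <= bound
        else:
            results[name] = value >= bound
    return results


results = check_dashboard(DASHBOARD)
passed = sum(results.values())
print(f"{passed}/{len(results)} PASS")
```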
@@ -0,0 +1,61 @@
---
entity_slug: natural_development_sequence
evaluator: null
evaluated_at: '2026-02-23T05:58:38.077362'
overall_score: 1.2
scores:
- name: definition_precision
  value: 1.0
  max_value: 5.0
  rationale: There is no definition provided at all, making this entity completely
    imprecise. Without any definitional content, it's impossible to determine what
    specific concept this entity is meant to capture.
- name: source_grounding
  value: 1.0
  max_value: 5.0
  rationale: With no definition, context, or source chapter specified, there's no
    evidence this entity is grounded in Smith's actual text. The term "natural development
    sequence" could refer to many different concepts that Smith discusses, but without
    specificity, it appears to be a vague abstraction.
- name: domain_placement
  value: 1.0
  max_value: 5.0
  rationale: The domain is listed as "unspecified," indicating no clear conceptual
    categorization has been established. Without knowing what specific sequence or
    development process this refers to, proper domain placement is impossible.
- name: vsm_relevance
  value: 2.0
  max_value: 5.0
  rationale: While developmental sequences could potentially map to VSM systems (particularly
    S4's environmental adaptation or S1's operational evolution), the complete lack
    of definition makes any VSM placement purely speculative. The concept is too undefined
    to assess its systemic relevance.
- name: explanatory_value
  value: 1.0
  max_value: 5.0
  rationale: An entity with no definition, context, or source grounding provides zero
    explanatory power. It names nothing specific and illuminates no particular mechanism
    or structural relation from Smith's work.
---

# Evaluation: Natural Development Sequence

## definition_precision — 1.0 / 5.0

There is no definition provided at all, making this entity completely imprecise. Without any definitional content, it's impossible to determine what specific concept this entity is meant to capture.

## source_grounding — 1.0 / 5.0

With no definition, context, or source chapter specified, there's no evidence this entity is grounded in Smith's actual text. The term "natural development sequence" could refer to many different concepts that Smith discusses, but without specificity, it appears to be a vague abstraction.

## domain_placement — 1.0 / 5.0

The domain is listed as "unspecified," indicating no clear conceptual categorization has been established. Without knowing what specific sequence or development process this refers to, proper domain placement is impossible.

## vsm_relevance — 2.0 / 5.0

While developmental sequences could potentially map to VSM systems (particularly S4's environmental adaptation or S1's operational evolution), the complete lack of definition makes any VSM placement purely speculative. The concept is too undefined to assess its systemic relevance.

## explanatory_value — 1.0 / 5.0

An entity with no definition, context, or source grounding provides zero explanatory power. It names nothing specific and illuminates no particular mechanism or structural relation from Smith's work.
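
The `overall_score` of 1.2 in this file matches the unweighted mean of its five dimension values. The sketch below shows that arithmetic. It assumes, without confirmation from the commit, that the evaluator aggregates by a plain mean; the `scores` dictionary is a hand-copied stand-in for the parsed front matter.

```python
from statistics import mean

# Dimension values copied from the front matter of this evaluation file.
scores = {
    "definition_precision": 1.0,
    "source_grounding": 1.0,
    "domain_placement": 1.0,
    "vsm_relevance": 2.0,
    "explanatory_value": 1.0,
}

# Assumption: overall_score is the unweighted mean of the dimension values.
overall = round(mean(scores.values()), 2)
print(overall)  # 1.2
```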