feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)

Batch evaluation of all 988 entities via OpenRouter. 985 succeeded on
first pass; 3 failed (network errors). Ran eval-summary --update-metrics,
which wrote per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061  (max 0.10)
  coverage_ratio       0.6190  (min 0.40)
  coherence_comps      0.0000  (max 3)
  consistency_cycles   0.0000  (max 0)
  granularity_entropy  2.6748  (min 1.0)
  per_entity_mean      3.9556  (min 3.5)
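
A minimal sketch of how the PASS count above could be reproduced from the
listed values and bounds. The dictionary layout and the `within_bound`
helper are hypothetical, and the bounds are assumed to be inclusive (which
the consistency_cycles row, 0 against max 0, suggests); this is not the
project's actual dashboard code.

```python
# Values and bounds copied from the dashboard; the structure is illustrative only.
DASHBOARD = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def within_bound(value, kind, bound):
    """Return True if value satisfies an inclusive max or min bound (assumed)."""
    return value <= bound if kind == "max" else value >= bound


results = {name: within_bound(*row) for name, row in DASHBOARD.items()}
passed = sum(results.values())
print(f"{passed}/{len(results)} PASS")
```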

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00
parent 81a4c8796a
commit a9ca0adfcf
986 changed files with 63216 additions and 1 deletions

@@ -0,0 +1,64 @@
---
entity_slug: judgment_in_labour_application
evaluator: null
evaluated_at: '2026-02-23T05:39:05.008065'
overall_score: 4.4
scores:
- name: definition_precision
  value: 4.0
  max_value: 5.0
  rationale: The definition clearly distinguishes judgment as decision-making capacity
    about labor direction, separate from skill and dexterity. It captures a specific
    cognitive capability that develops through specialization rather than being a
    vague umbrella term.
- name: source_grounding
  value: 5.0
  max_value: 5.0
  rationale: This entity is directly grounded in Smith's explicit discussion in Book
    I, Chapter 1, where he identifies judgment as one of three specific improvements
    in labor resulting from division of labor. The concept is clearly stated in the
    source text, not inferred or extrapolated.
- name: domain_placement
  value: 5.0
  max_value: 5.0
  rationale: The "Production" domain assignment is correct, as this concept directly
    relates to how labor is organized and applied in productive processes. It fits
    naturally within the operational aspects of economic production rather than exchange,
    distribution, or policy domains.
- name: vsm_relevance
  value: 4.0
  max_value: 5.0
  rationale: This entity maps well to S1 (primary operations) as it concerns how workers
    make decisions within their specific operational roles. It also has some relevance
    to S2 (coordination) as improved judgment helps workers coordinate their activities
    more effectively within the production process.
- name: explanatory_value
  value: 4.0
  max_value: 5.0
  rationale: The entity illuminates an important mechanism by which division of labor
    improves productivity - not just through mechanical skill development, but through
    enhanced decision-making capabilities. It explains how specialization creates
    cognitive as well as physical improvements in work performance.
---
# Evaluation: Judgment In Labour Application
## definition_precision — 4.0 / 5.0
The definition clearly distinguishes judgment as decision-making capacity about labor direction, separate from skill and dexterity. It captures a specific cognitive capability that develops through specialization rather than being a vague umbrella term.
## source_grounding — 5.0 / 5.0
This entity is directly grounded in Smith's explicit discussion in Book I, Chapter 1, where he identifies judgment as one of three specific improvements in labor resulting from division of labor. The concept is clearly stated in the source text, not inferred or extrapolated.
## domain_placement — 5.0 / 5.0
The "Production" domain assignment is correct, as this concept directly relates to how labor is organized and applied in productive processes. It fits naturally within the operational aspects of economic production rather than exchange, distribution, or policy domains.
## vsm_relevance — 4.0 / 5.0
This entity maps well to S1 (primary operations) as it concerns how workers make decisions within their specific operational roles. It also has some relevance to S2 (coordination) as improved judgment helps workers coordinate their activities more effectively within the production process.
## explanatory_value — 4.0 / 5.0
The entity illuminates an important mechanism by which division of labor improves productivity - not just through mechanical skill development, but through enhanced decision-making capabilities. It explains how specialization creates cognitive as well as physical improvements in work performance.
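
The `overall_score` of 4.4 in the front matter matches the unweighted mean of the five dimension scores. The sketch below checks that arithmetic; treating `overall_score` as a simple mean rounded to one decimal is an assumption inferred from the numbers, not a documented rule of the evaluator.

```python
from statistics import mean

# Dimension scores copied from the front matter of this evaluation.
scores = {
    "definition_precision": 4.0,
    "source_grounding": 5.0,
    "domain_placement": 5.0,
    "vsm_relevance": 4.0,
    "explanatory_value": 4.0,
}

# Assumption: overall_score is the unweighted mean, rounded to one decimal.
overall = round(mean(scores.values()), 1)
print(overall)
```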