tegwick
a9ca0adfcf
feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 985 succeeded on
first pass; 3 failed (network errors). Metrics written via
eval-summary --update-metrics, with per_entity_mean=3.9556.
Viability dashboard: 6/6 PASS
  redundancy_ratio      0.0061  (max 0.10)
  coverage_ratio        0.6190  (min 0.40)
  coherence_comps       0.0000  (max 3)
  consistency_cycles    0.0000  (max 0)
  granularity_entropy   2.6748  (min 1.0)
  per_entity_mean       3.9556  (min 3.5)
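
The pass/fail logic behind the dashboard can be sketched as below. This is an illustration only, not the project's actual implementation: the GATES layout and the check_gates name are hypothetical. The values and bounds are copied from the dashboard, and bounds are assumed inclusive because consistency_cycles passes at 0 with max 0.

```python
# Metric values and thresholds copied from the dashboard above.
# GATES and check_gates are hypothetical names used only for this sketch.
GATES = {
    "redundancy_ratio": (0.0061, "max", 0.10),
    "coverage_ratio": (0.6190, "min", 0.40),
    "coherence_comps": (0.0, "max", 3),
    "consistency_cycles": (0.0, "max", 0),
    "granularity_entropy": (2.6748, "min", 1.0),
    "per_entity_mean": (3.9556, "min", 3.5),
}


def check_gates(gates):
    """Return {metric: passed} using inclusive max/min bounds (an assumption)."""
    results = {}
    for name, (value, kind, bound) in gates.items():
        if kind == "max":
            results[name] = value <= bound
        else:
            results[name] = value >= bound
    return results


results = check_gates(GATES)
passed = sum(results.values())
print(f"{passed}/{len(results)} PASS")
```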
Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94
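
As a sanity check, the dimension means can be compared with per_entity_mean. The sketch below assumes per_entity_mean is the unweighted mean of the five dimension scores, averaged over entities that all have every dimension scored. The commit does not state this definition, so treat it as an assumption.

```python
from statistics import mean

# Dimension means copied from the breakdown above (rounded to two decimals there).
DIMENSION_MEANS = {
    "definition_precision": 3.62,
    "source_grounding": 4.36,
    "domain_placement": 4.56,
    "vsm_relevance": 3.31,
    "explanatory_value": 3.94,
}

REPORTED_PER_ENTITY_MEAN = 3.9556  # from the viability dashboard

# Unweighted mean of the five dimension means (assumed definition, see above).
overall = mean(DIMENSION_MEANS.values())
gap = abs(overall - REPORTED_PER_ENTITY_MEAN)
print(f"mean of dimension means: {overall:.4f}, gap to reported: {gap:.4f}")
```

Under that assumption the two figures agree to within the 0.005 rounding error of the two-decimal dimension means.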
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00