markitect-main/examples/infospace-with-history/output/evaluations/assaying.md
tegwick a9ca0adfcf feat(example): add per-entity LLM evaluations for 985 WoN entities (S3.3)
Batch evaluation of all 988 entities via OpenRouter. 984 succeeded on
first pass; 3 failed (network errors). eval-summary --update-metrics
written with per_entity_mean=3.9556.

Viability dashboard: 6/6 PASS
  redundancy_ratio     0.0061  (max 0.10)
  coverage_ratio       0.6190  (min 0.40)
  coherence_comps      0.0000  (max 3)
  consistency_cycles   0.0000  (max 0)
  granularity_entropy  2.6748  (min 1.0)
  per_entity_mean      3.9556  (min 3.5)
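
The pass/fail logic behind the dashboard can be sketched as a simple bound check. This is a minimal illustration, not the project's actual implementation: the `THRESHOLDS` encoding and the `check_viability` helper are hypothetical names, and the sketch assumes `max` bounds are inclusive upper limits and `min` bounds are inclusive lower limits (consistent with `consistency_cycles` passing at 0 against a max of 0).

```python
# Hypothetical sketch: values and bounds are copied from the dashboard above;
# the data layout and function name are assumptions, not the tool's real API.
THRESHOLDS = {
    "redundancy_ratio": ("max", 0.10),
    "coverage_ratio": ("min", 0.40),
    "coherence_comps": ("max", 3),
    "consistency_cycles": ("max", 0),
    "granularity_entropy": ("min", 1.0),
    "per_entity_mean": ("min", 3.5),
}

METRICS = {
    "redundancy_ratio": 0.0061,
    "coverage_ratio": 0.6190,
    "coherence_comps": 0.0,
    "consistency_cycles": 0.0,
    "granularity_entropy": 2.6748,
    "per_entity_mean": 3.9556,
}


def check_viability(metrics, thresholds):
    """Return a dict mapping each metric name to True (PASS) or False (FAIL)."""
    results = {}
    for name, (kind, bound) in thresholds.items():
        value = metrics[name]
        # Assumed semantics: both kinds of bound are inclusive.
        results[name] = value <= bound if kind == "max" else value >= bound
    return results


results = check_viability(METRICS, THRESHOLDS)
passed = sum(results.values())
print(f"Viability dashboard: {passed}/{len(results)} PASS")
```

With the values listed in the dashboard, every metric falls within its bound, which matches the reported 6/6.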

Dimension breakdown (mean across 985 entities):
  definition_precision  3.62
  source_grounding      4.36
  domain_placement      4.56
  vsm_relevance         3.31
  explanatory_value     3.94

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-23 09:36:46 +01:00


---
entity_slug: assaying
evaluator: null
evaluated_at: '2026-02-23T00:35:54.499834'
overall_score: 4.4
scores:
- name: definition_precision
  value: 4.0
  max_value: 5.0
  rationale: The definition is precise and captures a distinct technical process -
    testing metal purity to verify quality in exchange. It clearly distinguishes assaying
    from other metal-related processes and specifies its role in addressing uncertainty
    about metal quality.
- name: source_grounding
  value: 5.0
  max_value: 5.0
  rationale: This entity is directly grounded in Smith's text where he explicitly
    discusses the inconveniences of using unstamped metals, including the difficulty
    and uncertainty of determining their purity. The concept emerges naturally from
    Smith's analysis of early exchange mechanisms.
- name: domain_placement
  value: 5.0
  max_value: 5.0
  rationale: The "Exchange" domain placement is correct, as assaying is fundamentally
    about verifying the quality of exchange media (metals) to facilitate trade. This
    fits perfectly within Smith's discussion of the evolution of exchange mechanisms.
- name: vsm_relevance
  value: 4.0
  max_value: 5.0
  rationale: Assaying maps well to S3 (internal regulation/audit) as it represents
    a quality control and verification function within exchange systems. It could
    also relate to S2 (coordination) by reducing uncertainty and enabling smoother
    transactions between parties.
- name: explanatory_value
  value: 4.0
  max_value: 5.0
  rationale: This entity illuminates a crucial mechanism in the evolution of money
    - how societies addressed the problem of verifying exchange media quality. It
    helps explain why stamped coinage emerged as a solution to the assaying problem,
    showing structural relations in monetary development.
---
# Evaluation: Assaying
## definition_precision — 4.0 / 5.0
The definition is precise and captures a distinct technical process - testing metal purity to verify quality in exchange. It clearly distinguishes assaying from other metal-related processes and specifies its role in addressing uncertainty about metal quality.
## source_grounding — 5.0 / 5.0
This entity is directly grounded in Smith's text where he explicitly discusses the inconveniences of using unstamped metals, including the difficulty and uncertainty of determining their purity. The concept emerges naturally from Smith's analysis of early exchange mechanisms.
## domain_placement — 5.0 / 5.0
The "Exchange" domain placement is correct, as assaying is fundamentally about verifying the quality of exchange media (metals) to facilitate trade. This fits perfectly within Smith's discussion of the evolution of exchange mechanisms.
## vsm_relevance — 4.0 / 5.0
Assaying maps well to S3 (internal regulation/audit) as it represents a quality control and verification function within exchange systems. It could also relate to S2 (coordination) by reducing uncertainty and enabling smoother transactions between parties.
## explanatory_value — 4.0 / 5.0
This entity illuminates a crucial mechanism in the evolution of money - how societies addressed the problem of verifying exchange media quality. It helps explain why stamped coinage emerged as a solution to the assaying problem, showing structural relations in monetary development.
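
The `overall_score` of 4.4 in the front matter equals the unweighted mean of the five dimension scores. The sketch below reproduces that arithmetic; it assumes equal weighting, which matches the numbers in this file but is not confirmed as the evaluator's actual aggregation rule, and `overall_score` as a function name is hypothetical.

```python
from statistics import mean

# Dimension scores copied from this file's front matter.
SCORES = {
    "definition_precision": 4.0,
    "source_grounding": 5.0,
    "domain_placement": 5.0,
    "vsm_relevance": 4.0,
    "explanatory_value": 4.0,
}


def overall_score(scores):
    """Unweighted mean of the dimension scores (assumed aggregation rule)."""
    return round(mean(scores.values()), 2)


print(overall_score(SCORES))  # (4 + 5 + 5 + 4 + 4) / 5 = 4.4
```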