From 8095a1da4c21dcf2bff50e27968e54edefdd32c0 Mon Sep 17 00:00:00 2001
From: tegwick
Date: Thu, 19 Feb 2026 13:01:09 +0100
Subject: [PATCH] fix(example): standardise domain enum and source chapter format in schema/rules

Two root causes of metric fragmentation observed in collection checks:

1. Schema's Economic Domain used free-form examples ("labour economics,
   trade theory") which overrode the enum in extraction-rules.md, causing
   the LLM to produce multi-domain strings and non-canonical values.
   Fix: schema now specifies the exact 7-value enum with descriptions.

2. Source Chapter had no format constraint, producing 9 different formats
   for 7 chapters (full titles, mixed Roman/Arabic numerals, asterisks).
   Fix: extraction-rules now mandate "Book [Roman], Chapter [n]" exactly.

These fixes are prerequisites for clean reprocessing (S3.2 continuation).

Co-Authored-By: Claude Sonnet 4.6
---
 .../artifacts/guidelines/extraction-rules.md       |  9 +++++++--
 .../schemas/economic-entity-schema-v1.0.md         | 14 ++++++++++++--
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/examples/infospace-with-history/artifacts/guidelines/extraction-rules.md b/examples/infospace-with-history/artifacts/guidelines/extraction-rules.md
index dd5ac0c4..f2523128 100644
--- a/examples/infospace-with-history/artifacts/guidelines/extraction-rules.md
+++ b/examples/infospace-with-history/artifacts/guidelines/extraction-rules.md
@@ -57,5 +57,10 @@ entities at the level of specificity where they carry independent meaning.
 - Each entity must have a definition that would be comprehensible without
   reading the source chapter.
 - Each entity must cite the specific book and chapter of first appearance.
-- Economic Domain must be one of: Production, Distribution, Exchange,
-  Consumption, Accumulation, Regulation, or General Theory.
+- **Economic Domain** must be EXACTLY ONE of: Production, Distribution,
+  Exchange, Consumption, Accumulation, Regulation, or General Theory.
+  Do not combine multiple domains. Do not use any other value.
+- **Source Chapter format**: Use `Book [Roman numeral], Chapter [number]`
+  — for example `Book I, Chapter 3`. Do not include the chapter title,
+  quotation marks, markdown formatting, or asterisks. Use Roman numerals
+  for the book (I, II, III, IV, V).
diff --git a/examples/infospace-with-history/schemas/economic-entity-schema-v1.0.md b/examples/infospace-with-history/schemas/economic-entity-schema-v1.0.md
index b3e2c31f..96d3df71 100644
--- a/examples/infospace-with-history/schemas/economic-entity-schema-v1.0.md
+++ b/examples/infospace-with-history/schemas/economic-entity-schema-v1.0.md
@@ -16,8 +16,18 @@ The broader context in which this entity appears within the source text.
 Describe the argument or passage where the entity is discussed.
 
 ### Economic Domain
-The area of economics this entity belongs to (e.g., labour economics,
-trade theory, market theory, institutional economics).
+The area of economics this entity belongs to. Use **exactly one** value
+from this list:
+
+- **Production** — labour, manufacturing, technology, productivity
+- **Distribution** — wages, profit, rent, income shares
+- **Exchange** — markets, prices, trade, money, barter
+- **Consumption** — demand, utility, wants, expenditure
+- **Accumulation** — capital, savings, stock, investment
+- **Regulation** — policy, law, institutions, monopoly, government
+- **General Theory** — foundational principles spanning multiple domains
+
+Do not combine multiple values. Do not use any other domain name.
 
 ## Optional Sections
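
Review note: the two constraints this patch adds to the prompt-side rules could also be checked mechanically after extraction. The following is a minimal sketch of such a check, not part of this patch or the repository. The names `ECONOMIC_DOMAINS`, `SOURCE_CHAPTER_RE`, `is_valid_domain`, and `is_valid_source_chapter` are hypothetical, and the pattern assumes chapter numbers are positive Arabic integers without leading zeros, as in the `Book I, Chapter 3` example.

```python
import re

# The seven canonical Economic Domain values listed in the schema.
ECONOMIC_DOMAINS = frozenset({
    "Production",
    "Distribution",
    "Exchange",
    "Consumption",
    "Accumulation",
    "Regulation",
    "General Theory",
})

# "Book [Roman numeral], Chapter [number]" with books I to V.
# Assumption: the chapter is a positive Arabic integer with no leading zero.
SOURCE_CHAPTER_RE = re.compile(r"Book (I|II|III|IV|V), Chapter ([1-9][0-9]*)")


def is_valid_domain(value: str) -> bool:
    """Return True only for exactly one canonical domain value."""
    return value in ECONOMIC_DOMAINS


def is_valid_source_chapter(value: str) -> bool:
    """Return True only if the whole string matches the mandated format."""
    return SOURCE_CHAPTER_RE.fullmatch(value) is not None
```

Using `fullmatch` rather than `search` means that chapter titles, surrounding asterisks, or quotation marks cause rejection instead of being silently tolerated, which matches the "exactly" wording in the rules.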