fix(example): standardise domain enum and source chapter format in schema/rules

Two root causes of metric fragmentation observed in collection checks:

1. The schema's Economic Domain section gave free-form examples ("labour
   economics, trade theory"), which overrode the enum in extraction-rules.md
   and led the LLM to produce multi-domain strings and non-canonical values.
   Fix: the schema now specifies the exact 7-value enum with descriptions.

2. Source Chapter had no format constraint, producing 9 different formats
   for 7 chapters (full titles, mixed Roman/Arabic numerals, asterisks).
   Fix: extraction-rules now mandate "Book [Roman], Chapter [n]" exactly.

These fixes are prerequisites for clean reprocessing (S3.2 continuation).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 13:01:09 +01:00
parent 715ef19d1c
commit 77dd3fee6d
2 changed files with 19 additions and 4 deletions


@@ -57,5 +57,10 @@ entities at the level of specificity where they carry independent meaning.
 - Each entity must have a definition that would be comprehensible without
 reading the source chapter.
 - Each entity must cite the specific book and chapter of first appearance.
-- Economic Domain must be one of: Production, Distribution, Exchange,
-Consumption, Accumulation, Regulation, or General Theory.
+- **Economic Domain** must be EXACTLY ONE of: Production, Distribution,
+Exchange, Consumption, Accumulation, Regulation, or General Theory.
+Do not combine multiple domains. Do not use any other value.
+- **Source Chapter format**: Use `Book [Roman numeral], Chapter [number]`
+— for example `Book I, Chapter 3`. Do not include the chapter title,
+quotation marks, markdown formatting, or asterisks. Use Roman numerals
+for the book (I, II, III, IV, V).
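
The Source Chapter rule above could also be checked mechanically after extraction. The following is a minimal sketch, not part of this commit: the names `SOURCE_CHAPTER_RE` and `is_valid_source_chapter` are hypothetical, and the pattern assumes the chapter is a positive Arabic number with no leading zero.

```python
import re

# Hypothetical pattern for the mandated format "Book [Roman], Chapter [n]".
# Assumption: books are limited to I-V (as listed in the rule) and the
# chapter is an Arabic number without leading zeros.
SOURCE_CHAPTER_RE = re.compile(r"Book (I|II|III|IV|V), Chapter ([1-9][0-9]*)")


def is_valid_source_chapter(value: str) -> bool:
    """Return True only if the whole string matches the mandated format."""
    # fullmatch rejects any extra text such as titles, quotes, or asterisks.
    return SOURCE_CHAPTER_RE.fullmatch(value) is not None
```

Under these assumptions, `Book I, Chapter 3` passes, while values with Arabic book numbers, Roman chapter numbers, appended titles, or surrounding asterisks are rejected.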


@@ -16,8 +16,18 @@ The broader context in which this entity appears within the source text.
 Describe the argument or passage where the entity is discussed.
 ### Economic Domain
-The area of economics this entity belongs to (e.g., labour economics,
-trade theory, market theory, institutional economics).
+The area of economics this entity belongs to. Use **exactly one** value
+from this list:
+- **Production** — labour, manufacturing, technology, productivity
+- **Distribution** — wages, profit, rent, income shares
+- **Exchange** — markets, prices, trade, money, barter
+- **Consumption** — demand, utility, wants, expenditure
+- **Accumulation** — capital, savings, stock, investment
+- **Regulation** — policy, law, institutions, monopoly, government
+- **General Theory** — foundational principles spanning multiple domains
+Do not combine multiple values. Do not use any other domain name.
 ## Optional Sections
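
The seven-value enum could likewise be enforced when checking extracted entities. This is a sketch under stated assumptions rather than code from the repository: `ECONOMIC_DOMAINS` and `is_valid_domain` are hypothetical names, and matching is assumed to be exact and case-sensitive.

```python
# The seven canonical values named in the schema and extraction rules.
ECONOMIC_DOMAINS = frozenset({
    "Production",
    "Distribution",
    "Exchange",
    "Consumption",
    "Accumulation",
    "Regulation",
    "General Theory",
})


def is_valid_domain(value: str) -> bool:
    """Return True only for a single canonical domain value."""
    # Exact membership rejects combined strings such as
    # "Production, Exchange" and free-form names.
    return value in ECONOMIC_DOMAINS
```

Exact membership is a deliberate choice in this sketch: any normalisation (trimming, case folding) would hide the non-canonical values that the commit message identifies as a source of fragmentation.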