Two root causes of metric fragmentation observed in collection checks:
1. Schema's Economic Domain used free-form examples ("labour economics,
trade theory") which overrode the enum in extraction-rules.md, causing
the LLM to produce multi-domain strings and non-canonical values.
Fix: schema now specifies the exact 7-value enum with descriptions.
2. Source Chapter had no format constraint, producing 9 different formats
for 7 chapters (full titles, mixed Roman/Arabic numerals, asterisks).
Fix: extraction-rules now mandate "Book [Roman], Chapter [n]" exactly.
These fixes are prerequisites for clean reprocessing (S3.2 continuation).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
48 lines
1.7 KiB
Markdown
# Economic Entity Schema v1.0

Schema definition for economic entities extracted from source texts.

## Required Sections

### Definition
A clear, analytical definition of the economic entity (20-150 words).

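The 20-150 word bound on the Definition can be checked mechanically. The snippet below is a minimal sketch, assuming that a "word" is any whitespace-separated token and that both bounds are inclusive; the function name `definition_word_count_ok` is hypothetical and not part of this schema.

```python
def definition_word_count_ok(definition: str, low: int = 20, high: int = 150) -> bool:
    """Return True when the whitespace-separated word count lies in [low, high].

    Assumption: words are whitespace-delimited tokens and both bounds are inclusive.
    """
    word_count = len(definition.split())
    return low <= word_count <= high
```

A different tokenisation (for example, treating hyphenated terms as two words) would shift borderline cases, so the counting rule is worth fixing before relying on this check.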
### Source Chapter
The specific chapter from which this entity was extracted,
including book and chapter number.

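The commit message above states that extraction-rules now mandate the form "Book [Roman], Chapter [n]" exactly. A minimal sketch of such a check follows, assuming that `[Roman]` means an uppercase Roman numeral, that `[n]` means an Arabic numeral without leading zeros, and that surrounding whitespace may be trimmed first. The names `SOURCE_CHAPTER_RE` and `is_canonical_source_chapter` are hypothetical, and the pattern does not verify that the Roman numeral is well formed.

```python
import re

# Assumed canonical form: "Book <Roman numeral>, Chapter <Arabic number>".
# The character class accepts any run of Roman-numeral letters; it does not
# reject malformed numerals such as "IIII".
SOURCE_CHAPTER_RE = re.compile(r"Book [IVXLCDM]+, Chapter [1-9][0-9]*")


def is_canonical_source_chapter(value: str) -> bool:
    """Return True when the trimmed value matches the assumed canonical form exactly."""
    return SOURCE_CHAPTER_RE.fullmatch(value.strip()) is not None
```

Under these assumptions, `"Book I, Chapter 1"` passes, while full titles, asterisk-wrapped values, and mixed numeral styles such as `"Book 1, Chapter I"` are rejected.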
### Context
The broader context in which this entity appears within the source text.
Describe the argument or passage where the entity is discussed.

### Economic Domain
The area of economics this entity belongs to. Use **exactly one** value
from this list:

- **Production** — labour, manufacturing, technology, productivity
- **Distribution** — wages, profit, rent, income shares
- **Exchange** — markets, prices, trade, money, barter
- **Consumption** — demand, utility, wants, expenditure
- **Accumulation** — capital, savings, stock, investment
- **Regulation** — policy, law, institutions, monopoly, government
- **General Theory** — foundational principles spanning multiple domains

Do not combine multiple values. Do not use any other domain name.

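The single-value constraint on Economic Domain amounts to a set-membership test. The following is a minimal sketch, assuming that matching is case-sensitive and that surrounding whitespace is ignored; `ECONOMIC_DOMAINS` and `is_valid_domain` are hypothetical names, not part of this schema.

```python
# The seven canonical values listed in the Economic Domain section.
ECONOMIC_DOMAINS = frozenset({
    "Production",
    "Distribution",
    "Exchange",
    "Consumption",
    "Accumulation",
    "Regulation",
    "General Theory",
})


def is_valid_domain(value: str) -> bool:
    """Return True when the trimmed value is exactly one canonical domain.

    Assumption: matching is case-sensitive. Combined values such as
    "Production, Exchange" fail because the whole string must be a member.
    """
    return value.strip() in ECONOMIC_DOMAINS
```

Because the whole string has to be a member of the set, multi-domain strings and free-form labels such as "labour economics" are both rejected by the same test.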
## Optional Sections

### Smith's Original Wording
A direct quotation from Adam Smith's text that defines or describes
this entity. Must be enclosed in quotation marks with chapter reference.

### Modern Interpretation
How this entity is understood in modern economic theory, including
any evolution in meaning since Smith's time.

## Validation Rules

1. The document MUST contain an H1 heading with the entity name.
2. The document MUST contain all four required sections: Definition, Source Chapter, Context, Economic Domain.
3. The Definition section MUST be between 20 and 150 words.
4. The Source Chapter section MUST cite a specific chapter (e.g., "Book I, Chapter 1").