- Add `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` in entity_parser.py to
prevent per-chapter raw LLM output files from being parsed as entities.
This eliminates 33 malformed domain values where delimiter text was
bleeding into the Economic Domain field.
- Lower the `coverage_ratio` threshold from 0.50 to 0.40 in infospace.yaml
to reflect realistic multi-book corpus expectations (rationale
documented in METRICS-METHODOLOGY.md).
Post-fix metrics: 988 entities, 0 malformed, coverage_ratio=0.619 (pass).
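
A minimal sketch of how an exclude pattern like the one above might filter
file names. Only the `.*-raw\.md$` pattern comes from this commit; the
`EXCLUDE_PATTERNS` list, the `is_excluded` helper, and the sample file names
are hypothetical, and the real matching logic in entity_parser.py may differ.

```python
import re

# Hypothetical stand-in for the exclude list in entity_parser.py.
# Only the `.*-raw\.md$` pattern is taken from the commit message.
EXCLUDE_PATTERNS = [re.compile(r".*-raw\.md$")]


def is_excluded(filename: str) -> bool:
    """Return True if the file name matches any exclude pattern."""
    return any(pattern.match(filename) for pattern in EXCLUDE_PATTERNS)


# Illustrative file names (not taken from the real corpus).
candidates = ["chapter-03-raw.md", "labor-market.md", "raw-materials.md"]

# Keep only files that should be parsed as entities.
kept = [name for name in candidates if not is_excluded(name)]
print(kept)
```

Because the pattern is anchored at the end with `-raw\.md$`, a name such as
`raw-materials.md` is still parsed; only names ending in `-raw.md` are
skipped.
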

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Extract section-tree algorithm from SchemaGenerator into standalone
core/section_tree.py and build markitect/infospace/ package with
EntityMeta dataclass and parse_entity_file/parse_entity_directory.
Foundation for schema compliance, coverage, and granularity metrics.
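
For orientation, a section tree over markdown headings could be built
roughly as follows. This is a sketch under stated assumptions, not the code
in core/section_tree.py: the `Section` dataclass, the `build_section_tree`
name, and the choice of ATX (`#`) headings as section markers are all
hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical node type; the real module may use different names and fields.
@dataclass
class Section:
    title: str
    level: int
    children: list["Section"] = field(default_factory=list)


# Assumes ATX-style markdown headings ("# Title" through "###### Title").
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def build_section_tree(text: str) -> Section:
    """Nest headings by level under a synthetic level-0 root."""
    root = Section(title="", level=0)
    stack = [root]  # path from the root to the most recent heading
    for line in text.splitlines():
        match = HEADING.match(line)
        if not match:
            continue  # body text is ignored in this sketch
        node = Section(title=match.group(2), level=len(match.group(1)))
        # Pop back to the nearest heading with a strictly smaller level.
        while stack[-1].level >= node.level:
            stack.pop()
        stack[-1].children.append(node)
        stack.append(node)
    return root


doc = "# Entity\n## Economic Domain\ntext\n## Sources\n### Primary\n# Other"
tree = build_section_tree(doc)
print([child.title for child in tree.children])
```

The sketch does not skip fenced code blocks, so a `#` comment inside a fence
would be misread as a heading; a production version would need to track
fence state.
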

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>