markitect-main

Author	SHA1	Message	Date
tegwick	7f1eecbdb2	feat(infospace): add eval-summary command and improve evaluate pipeline (S3.3) - Fix evaluate dimensions to match template file: definition_precision, source_grounding, domain_placement, vsm_relevance, explanatory_value (was domain_relevance, discipline_alignment, conceptual_clarity) - Add VSM background context to evaluation prompt so LLM can score vsm_relevance without macro injection - Fix model_name bug: was sending literal "default" to API (HTTP 400) - Refactor run_entity_evaluation to write files incrementally via callback rather than all at once after the batch — long runs are now resumable if interrupted - Add incremental skip in CLI: entities with existing eval files are skipped automatically on re-run (acts as resume) - Add eval-summary command: reads all eval files, shows per-dimension means, optionally writes per_entity_mean to metrics.yaml - Fix record_check_results to merge rather than overwrite metrics.yaml so per_entity_mean survives subsequent check runs - Add per_entity_mean viability threshold (min: 3.5) to infospace.yaml Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 01:26:45 +01:00
tegwick	9c32ad1837	fix(infospace): exclude raw LLM output from entity parsing; lower coverage threshold - Add `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` in entity_parser.py to prevent per-chapter raw LLM output files from being parsed as entities. This eliminates 33 malformed domain values where delimiter text was bleeding into the Economic Domain field. - Lower coverage_ratio threshold from 0.50 → 0.40 in infospace.yaml to reflect realistic multi-book corpus expectations (documented rationale in METRICS-METHODOLOGY.md). Post-fix metrics: 988 entities, 0 malformed, coverage_ratio=0.619 (pass). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 09:28:20 +01:00
tegwick	679f482e49	config(example): increase extract-entities max_tokens to 8000 Chapters with many pre-existing entities were still truncating at 6000 tokens because the LLM needs space to output the full list of candidates even when most are skipped. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 18:48:33 +01:00
tegwick	90ca14dd85	config(example): increase max_tokens for map-to-vsm (10k) and synthesize (4k) map-to-vsm was consistently truncating at 6000 tokens; synthesize-analysis sometimes truncated at 3000 for chapters with many entities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 15:21:04 +01:00
tegwick	df1fdf1842	feat(pipeline): per-stage max_tokens, LLM provenance, processing log - PipelineStage now supports max_tokens to override the 4096 default - SourcePipeline records provider/model on each entity file as HTML comment - output/processing-log.yaml tracks tokens, cost, duration, retries, errors - _call_llm returns (content, metadata) for downstream traceability - _http.py wraps JSON parse errors with body preview for debugging - infospace.yaml stages: extract/map=6000 tokens, synthesize=3000 tokens Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 14:50:49 +01:00
tegwick	72d9904485	feat(infospace): add process command for batch source file processing - Extend PipelineStage with name, output_dir, output_macro, split_entities, and macros fields for declarative pipeline config - Add SourcePipeline class (pipeline.py) using simple @{macro} substitution — no SQLite dependency, skip-if-exists per stage, LLM retry on rate limits, git commit per source - Add `markitect infospace process [GLOB_PATTERN]` CLI command with --all, --provider, --model, --check-after-each, --no-commit flags - Update infospace.yaml with output_dir, output_macro, split_entities, and macros for each pipeline stage in the WoN example Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 13:29:50 +01:00
tegwick	94cb2063af	feat(example): migrate to infospace config with tooling integration (S3.1) Add infospace.yaml declaring topic, disciplines, schemas, viability thresholds. Integrate infospace tooling into process_chapters.py with --infospace-status, --infospace-check, and --infospace-viability flags. Initial check: 85 entities, 4/5 viable (coverage 0.36 < 0.50 — only 7/35 chapters processed so far). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-19 02:29:53 +01:00

7 Commits
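
Two of the fixes logged above are small enough to sketch. Commit 9c32ad1837 adds `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` so raw LLM output files are not parsed as entities, and commit 7f1eecbdb2 changes `record_check_results` to merge into metrics.yaml rather than overwrite it. The following is a minimal illustration of both ideas, not the project's code: only the pattern string and the names `_DEFAULT_EXCLUDE_PATTERNS`, `per_entity_mean`, and `coverage_ratio` come from the log. The helpers `is_excluded` and `merge_metrics`, the file names, and the `4.1` value are hypothetical, and plain dicts stand in for the YAML file.

```python
import re

# The pattern string is quoted from commit 9c32ad1837. The real list in
# entity_parser.py may hold other patterns; this one-element list is an assumption.
_DEFAULT_EXCLUDE_PATTERNS = [r".*-raw\.md$"]


def is_excluded(filename, patterns=_DEFAULT_EXCLUDE_PATTERNS):
    """Hypothetical helper: True if the file name matches any exclude pattern."""
    return any(re.search(pattern, filename) for pattern in patterns)


def merge_metrics(existing, check_results):
    """Hypothetical helper: update existing metrics instead of replacing them.

    Keys present only in `existing` (such as per_entity_mean) are kept;
    keys present in both take the newer value from `check_results`.
    """
    merged = dict(existing)        # copy, so the caller's dict is not mutated
    merged.update(check_results)   # newer check values win on shared keys
    return merged


if __name__ == "__main__":
    # File names here are illustrative only.
    print(is_excluded("chapter-03-raw.md"))
    print(is_excluded("division-of-labour.md"))

    # 4.1 is an illustrative value; 0.619 is the coverage_ratio reported in the log.
    before = {"per_entity_mean": 4.1, "coverage_ratio": 0.36}
    print(merge_metrics(before, {"coverage_ratio": 0.619}))
```

A shallow merge like this is enough when metrics are flat key/value pairs. If metrics.yaml nests values under sections, a recursive merge would be needed instead.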