feat(infospace): add process command for batch source file processing

- Extend PipelineStage with name, output_dir, output_macro,
  split_entities, and macros fields for declarative pipeline config
- Add SourcePipeline class (pipeline.py) using simple @{macro}
  substitution — no SQLite dependency, skip-if-exists per stage,
  LLM retry on rate limits, git commit per source
- Add `markitect infospace process [GLOB_PATTERN]` CLI command with
  --all, --provider, --model, --check-after-each, --no-commit flags
- Update infospace.yaml with output_dir, output_macro, split_entities,
  and macros for each pipeline stage in the WoN example

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-19 13:29:32 +01:00
parent 4e0b27b075
commit c594bc3a38
4 changed files with 654 additions and 1 deletions
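
The per-stage behaviour described in the commit message (plain @{macro} substitution and skip-if-exists) can be sketched roughly as follows. This is a minimal illustration, not the contents of pipeline.py. The field names come from the commit message, but the `substitute` and `run_stage` helpers, the `llm` callable, the `<source>.md` output naming, and the choice to leave unknown macros untouched are all assumptions made for the example. LLM retry and the per-source git commit are left out.

```python
import re
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class PipelineStage:
    # Hypothetical stand-in; field names follow the commit message.
    name: str
    template: str
    output_dir: str = ""
    output_macro: str = ""
    split_entities: bool = False
    macros: dict = field(default_factory=dict)


# Matches @{name} where name is letters, digits, or underscores.
MACRO_RE = re.compile(r"@\{(\w+)\}")


def substitute(template_text, values):
    """Replace each @{name} with values[name]; unknown macros stay as-is (assumption)."""
    return MACRO_RE.sub(lambda m: values.get(m.group(1), m.group(0)), template_text)


def run_stage(stage, source_name, template_text, values, root, llm):
    """Run one stage for one source, reusing existing output if present."""
    # Assumed naming scheme: one Markdown file per source under output_dir.
    out_path = Path(root) / stage.output_dir / f"{source_name}.md"
    if out_path.exists():
        # Skip-if-exists: do not call the LLM again for finished work.
        return out_path.read_text(encoding="utf-8")
    result = llm(substitute(template_text, values))
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(result, encoding="utf-8")
    return result
```

In a sketch like this, the text returned by one stage would be stored under its `output_macro` name (for example `entities`) so that the next stage's template can reference it as `@{entities}`. Whether the real implementation chains stages this way is an assumption based on the field name.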

@@ -42,10 +42,25 @@ pipeline:
   stages:
     - name: extract-entities
       template: templates/extract-entities.md
+      output_dir: output/entities
+      output_macro: entities
+      split_entities: true
+      macros:
+        extraction_rules: artifacts/guidelines/extraction-rules.md
+        vsm_framework: artifacts/vsm-reference/vsm-framework.md
     - name: map-to-vsm
       template: templates/map-to-vsm.md
+      output_dir: output/mappings
+      output_macro: mappings
+      macros:
+        mapping_rules: artifacts/guidelines/mapping-rules.md
+        vsm_framework: artifacts/vsm-reference/vsm-framework.md
     - name: synthesize-analysis
       template: templates/synthesize-analysis.md
+      output_dir: output/analyses
+      output_macro: analysis
+      macros:
+        vsm_framework: artifacts/vsm-reference/vsm-framework.md
   post_batch:
     - name: assess-metrics
       template: templates/assess-metrics.md