- Fix evaluate dimensions to match template file:
definition_precision, source_grounding, domain_placement,
vsm_relevance, explanatory_value (was domain_relevance,
discipline_alignment, conceptual_clarity)
- Add VSM background context to evaluation prompt so LLM can
score vsm_relevance without macro injection
- Fix model_name bug: was sending literal "default" to API (HTTP 400)
- Refactor run_entity_evaluation to write files incrementally via
callback rather than all at once after the batch, so long runs are
now resumable if interrupted
- Add incremental skip in CLI: entities with existing eval files
are skipped automatically on re-run (acts as resume)
- Add eval-summary command: reads all eval files, shows per-dimension
means, optionally writes per_entity_mean to metrics.yaml
- Fix record_check_results to merge rather than overwrite metrics.yaml
so per_entity_mean survives subsequent check runs
- Add per_entity_mean viability threshold (min: 3.5) to infospace.yaml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
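
The merge-rather-than-overwrite fix to record_check_results can be illustrated roughly as follows. This is a minimal sketch on plain dicts, not the project's code: `merge_metrics` is a hypothetical helper, the YAML load/dump around it is omitted, and every metric name other than per_entity_mean is invented for the example.

```python
from typing import Any, Dict


def merge_metrics(existing: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `existing` updated with `new`, recursing into nested dicts.

    Keys present only in `existing` are preserved, which is what lets a value
    written by another command survive a later check run.
    """
    merged = dict(existing)
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_metrics(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical contents of metrics.yaml before a check run.
on_disk = {"per_entity_mean": 3.8, "entity_count": 40}
# Hypothetical results produced by a later check run.
check_results = {"entity_count": 42, "broken_links": 0}

merged = merge_metrics(on_disk, check_results)
```

With a plain overwrite, per_entity_mean would be lost as soon as the check results were written; with the merge it is carried over while the keys the check owns are refreshed.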
- PipelineStage now supports max_tokens to override the 4096 default
- SourcePipeline records provider/model on each entity file as HTML comment
- output/processing-log.yaml tracks tokens, cost, duration, retries, errors
- _call_llm returns (content, metadata) for downstream traceability
- _http.py wraps JSON parse errors with body preview for debugging
- infospace.yaml stages set max_tokens: extract/map=6000, synthesize=3000
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
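
One way to wrap JSON parse errors with a body preview, as the _http.py change describes, is sketched below. The names `ResponseParseError` and `parse_json_body` and the 200-character preview length are assumptions made for illustration; the actual wrapper may differ.

```python
import json
from typing import Any


class ResponseParseError(ValueError):
    """Hypothetical error type raised when a response body is not valid JSON."""


def parse_json_body(body: str, preview_chars: int = 200) -> Any:
    """Parse `body` as JSON; on failure, re-raise with the start of the body attached."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        preview = body[:preview_chars]
        raise ResponseParseError(
            f"Invalid JSON in response ({exc.msg} at char {exc.pos}); "
            f"body starts with: {preview!r}"
        ) from exc
```

Chaining with `from exc` keeps the original decoder traceback available while the message itself shows what the server actually returned, for example an HTML error page.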
- Extend PipelineStage with name, output_dir, output_macro,
split_entities, and macros fields for declarative pipeline config
- Add SourcePipeline class (pipeline.py) using simple @{macro}
substitution: no SQLite dependency, skip-if-exists per stage,
LLM retry on rate limits, git commit per source
- Add `markitect infospace process [GLOB_PATTERN]` CLI command with
--all, --provider, --model, --check-after-each, --no-commit flags
- Update infospace.yaml with output_dir, output_macro, split_entities,
and macros for each pipeline stage in the WoN example
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
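
The @{macro} substitution used by SourcePipeline might look roughly like this. It is a sketch under assumptions: macro names are taken to be word characters inside `@{...}`, unknown macros are left untouched, and `substitute_macros` plus the sample macro names are hypothetical.

```python
import re
from typing import Mapping

# Assumed syntax: "@{name}" where name consists of letters, digits, or underscores.
MACRO_PATTERN = re.compile(r"@\{(\w+)\}")


def substitute_macros(template: str, values: Mapping[str, str]) -> str:
    """Replace each @{name} with values[name]; leave unknown macros as written."""

    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        return values.get(name, match.group(0))

    return MACRO_PATTERN.sub(replace, template)


# Hypothetical prompt template and macro values.
prompt = substitute_macros(
    "Extract entities from @{source_text} for @{domain}.",
    {"source_text": "chapter-01.md"},
)
```

Leaving unresolved macros in place is a design choice made here for debuggability; raising an error on an unknown name would be an equally reasonable alternative.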
History module with snapshot creation from check results, metrics file
I/O, auto-append to history after checks, date-based snapshot lookup,
and metric trend extraction. CLI commands: history, history-diff.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
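
Date-based snapshot lookup and metric trend extraction could be sketched as below. The snapshot shape (a dict with "date" and "metrics" keys), both function names, and the sample data are assumptions for illustration; the history module's real structures are not shown in this log.

```python
from bisect import bisect_right
from datetime import date
from typing import Any, Dict, List, Optional, Tuple

Snapshot = Dict[str, Any]  # assumed shape: {"date": date, "metrics": {name: value}}


def snapshot_on_or_before(history: List[Snapshot], day: date) -> Optional[Snapshot]:
    """Return the latest snapshot dated on or before `day`.

    `history` is assumed to be sorted by date in ascending order.
    """
    dates = [snap["date"] for snap in history]
    idx = bisect_right(dates, day)
    return history[idx - 1] if idx else None


def metric_trend(history: List[Snapshot], name: str) -> List[Tuple[date, float]]:
    """Collect (date, value) pairs for one metric, skipping snapshots that lack it."""
    return [
        (snap["date"], snap["metrics"][name])
        for snap in history
        if name in snap["metrics"]
    ]


# Hypothetical history, as it might look after three check runs.
history = [
    {"date": date(2025, 1, 1), "metrics": {"per_entity_mean": 3.2}},
    {"date": date(2025, 1, 8), "metrics": {"per_entity_mean": 3.6}},
    {"date": date(2025, 1, 15), "metrics": {"entity_count": 42}},
]
trend = metric_trend(history, "per_entity_mean")
```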
Evaluation pipeline builds prompts from entity metadata, delegates
to BatchEvaluator, parses structured LLM responses into ScoreEntry
objects, and writes evaluation files. CLI: 'markitect infospace evaluate'
with --provider, --entity, --chapter filters.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
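
Parsing a structured LLM response into ScoreEntry objects might work along these lines. Everything about the format is assumed: the response is taken to contain one `dimension: score - rationale` line per dimension, and the `ScoreEntry` dataclass here is a stand-in with hypothetical fields, not the project's class.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class ScoreEntry:
    """Stand-in for the real ScoreEntry; the field names are assumptions."""

    dimension: str
    score: int
    rationale: str = ""


# Assumed line format: "<dimension>: <integer score>" with an optional "- rationale".
SCORE_LINE = re.compile(r"^\s*(\w+)\s*:\s*(\d+)\s*(?:-\s*(.*))?$")


def parse_scores(response: str) -> List[ScoreEntry]:
    """Turn matching lines into ScoreEntry objects and ignore everything else."""
    entries = []
    for line in response.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            dimension, score, rationale = match.groups()
            entries.append(ScoreEntry(dimension, int(score), (rationale or "").strip()))
    return entries


# Hypothetical model output.
entries = parse_scores(
    "definition_precision: 4 - scope is clearly bounded\n"
    "vsm_relevance: 5\n"
    "Overall the entity reads well."
)
```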
Adds 'markitect infospace' command group with init (create config),
status (entity count/domains/disciplines), entities (list with sort),
and viability (threshold dashboard with pass/fail).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
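
The pass/fail logic behind a threshold dashboard like the viability command can be sketched as follows. This is an illustration only: `check_viability`, the "max" bound, and the sample metrics are assumptions, and a metric with no recorded value is treated as failing.

```python
from typing import Dict, List, Optional, Tuple

Thresholds = Dict[str, Dict[str, float]]  # assumed shape: {metric: {"min": x, "max": y}}


def check_viability(
    metrics: Dict[str, float], thresholds: Thresholds
) -> List[Tuple[str, Optional[float], bool]]:
    """Return (metric, value, passed) for each configured threshold."""
    results = []
    for name, bounds in thresholds.items():
        value = metrics.get(name)
        passed = value is not None  # a missing metric counts as a failure here
        if passed and "min" in bounds:
            passed = value >= bounds["min"]
        if passed and "max" in bounds:
            passed = value <= bounds["max"]
        results.append((name, value, passed))
    return results


# Hypothetical metrics and thresholds.
report = check_viability(
    {"per_entity_mean": 3.4, "entity_count": 42},
    {"per_entity_mean": {"min": 3.5}, "entity_count": {"min": 10}},
)
for name, value, passed in report:
    print(f"{name:20} {value!s:>8}  {'PASS' if passed else 'FAIL'}")
```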