markitect-main

tegwick 7c38f9b427 merge(reprocess-v2): complete pipeline rewrite and full corpus processing

Merges the reprocess-v2 branch into main, covering:

Infrastructure changes:
- markitect infospace process — new CLI command for batch source processing
- SourcePipeline — @{macro} substitution, skip-if-exists, git commit per source
- PipelineStage config extended with name, output_dir, output_macro,
  split_entities, macros, max_tokens fields
- Per-stage max_tokens (extract=8k, map-to-vsm=10k, synthesize=4k)
- LLM provenance comment in each new entity file
- output/processing-log.yaml with per-source token/cost/duration/retry stats
- Retry on all LLM errors (not just rate limits) with 5s back-off
- C2 coverage: add domain_densities, density_std, cross_cutting_ratio
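
The retry behaviour listed above (retry on every LLM error, with a fixed 5s back-off) could look roughly like the sketch below. This is illustrative only, not the code from this merge: `call_with_retry`, `max_attempts`, and the injectable `sleep` parameter are hypothetical names, and the attempt limit of 3 is an assumption, since the commit message does not state one.

```python
import time


def call_with_retry(fn, max_attempts=3, backoff_seconds=5.0, sleep=time.sleep):
    """Call fn(); on any exception, wait a fixed back-off and try again.

    Hypothetical helper: max_attempts=3 is an assumed limit. Only the
    5-second back-off and "retry on all errors" come from the commit message.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:  # retry on every error, not only rate limits
            last_error = exc
            if attempt < max_attempts:
                sleep(backoff_seconds)  # fixed delay, no exponential growth
    # All attempts failed: surface the last error to the caller.
    raise last_error
```

Passing `sleep` as a parameter keeps the helper testable without real delays; in normal use it defaults to `time.sleep`.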

Example (infospace-with-history):
- All 35 chapters processed: 1021 entities across Books 1–5
- Per-chapter git commits showing metric evolution from 0 → final state
- Final metrics: coverage=0.44, granularity=2.95, redundancy=0.006
- METRICS-METHODOLOGY.md C2 section corrected and expanded

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-02-20 00:11:39 +01:00

asset-management

refactor: reorganize examples directory with topic-based subdirectories

2025-10-29 22:31:52 +01:00

content-generator

feat(examples): add content-generator example demonstrating Prompt Dependency Resolution

2026-02-09 23:50:07 +01:00

design-patterns

docs: Add design pattern examples and update submodule

2025-12-16 17:00:31 +01:00

essays

refactor: reorganize examples directory with topic-based subdirectories