markitect-main

Author	SHA1	Message	Date
tegwick	dfab3d598b	feat(cli): add 'helper' alias for markitect helper command markitect helper <QUESTION> now works as a short alias for markitect llm-helper, per the original plan specification. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 05:40:11 +01:00
tegwick	34ed7a6fab	docs(tutorial): update §8-9 for eval-summary command and 6/6 viability - Add eval-summary command documentation with dimension descriptions - Document resumable evaluate (incremental skip on re-run) - Fix --entity slug example to use underscores (not hyphens) - Update viability output to show per_entity_mean as 6th threshold - Add workflow note: check → eval-summary --update-metrics → viability Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 05:33:11 +01:00
tegwick	7f1eecbdb2	feat(infospace): add eval-summary command and improve evaluate pipeline (S3.3) - Fix evaluate dimensions to match template file: definition_precision, source_grounding, domain_placement, vsm_relevance, explanatory_value (was domain_relevance, discipline_alignment, conceptual_clarity) - Add VSM background context to evaluation prompt so LLM can score vsm_relevance without macro injection - Fix model_name bug: was sending literal "default" to API (HTTP 400) - Refactor run_entity_evaluation to write files incrementally via callback rather than all at once after the batch — long runs are now resumable if interrupted - Add incremental skip in CLI: entities with existing eval files are skipped automatically on re-run (acts as resume) - Add eval-summary command: reads all eval files, shows per-dimension means, optionally writes per_entity_mean to metrics.yaml - Fix record_check_results to merge rather than overwrite metrics.yaml so per_entity_mean survives subsequent check runs - Add per_entity_mean viability threshold (min: 3.5) to infospace.yaml Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 01:26:45 +01:00
tegwick	574bb11db6	feat(example): add supply-chain-vsm composition demo (S3.5) Demonstrates infospace composition: the Wealth of Nations infospace is used as a discipline, applying Smith's economic framework as a lens to analyse modern supply chain management concepts. New example: examples/supply-chain-vsm/ - infospace.yaml binding WoN as discipline (../infospace-with-history) - 3 source documents: coordination mechanisms, capital & inventory, market structure (~400 words each, original content) - supply-chain-entity-schema-v1.0.md with WoN Concept required section - won-mapping-schema-v1.0.md with Conceptual Continuity rating - artifacts/won-reference/core-entities.md — 12 curated WoN entities for injection as discipline context - 8 hand-crafted entity files demonstrating LLM output format - 3 mapping files with full rationale and VSM inheritance chains - Viable: YES (5/5 thresholds) Key mappings demonstrated: Demand Signal → Effectual Demand (Strong, S2) Vendor-Managed Inventory → Division of Labour (Strong, S1/S2) Just-in-Time Inventory → Circulating Capital (Strong, S1/S3) Bullwhip Effect → Natural Price (Moderate, S2) Platform Intermediary → Merchant Capital (Strong, S2/S4) Monopsony Power → Combination of Masters (Strong, S3*) Platform fix: entity_parser.py now recognises ## Supply Chain Domain as a domain alias for ## Economic Domain, enabling composed infospaces to use their own domain section name. Tutorial §13 rewritten with real commands, real output, and the full mapping table from the demo. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 00:08:51 +01:00
tegwick	8f00fa2018	docs(tutorial): update all commands to use markitect infospace CLI (S3.4) Replace all process_chapters.py references throughout the tutorial with the correct markitect infospace subcommands: - §2 Project layout: remove process_chapters.py, add LAYERED-DEVELOPMENT.md - §7 Processing: --chapter → process "glob", --book N → "book-N-*.md", --list → status/entities, --archive-entity → documented manual step - §8 Check: remove incorrect --provider flag; note checks are deterministic - §9 Viability: real output from full 988-entity corpus (Viable: YES) - §10 History: real snapshot table; add --metric flag example - §10 Git tracking: remove process_chapters.py from commit example - §11 Cost: update openrouter/free example command - §12 Completion: rewrite with actual observed metric progression table - §14 Quality loop: update all commands; add archive-entity manual procedure - §15 Artifact DB: --all without --provider = dry-run (no LLM calls) - §16 Adapting: update step 6 and 7 to new CLI Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-22 23:31:38 +01:00
tegwick	c861520ccd	docs(example): add layered development concept and extend tutorial Adds LAYERED-DEVELOPMENT.md documenting the concept for evolving a flat entity collection into a structured systemic model through four layers: L0 Source text → L1 Raw entities (current) → L2 Typed entities → L3 Relation graph → L4 Minimal systemic model Covers: the element/relation/principle/institution type taxonomy, VSM as a structural coordinate system, the type × VSM coverage matrix, triplet extraction with a controlled predicate vocabulary, feedback loop detection, and the distillation hypothesis for finding the generative core of a corpus. Extends TUTORIAL.md with sections 17–23: 17. Observing entity heterogeneity 18. The four-layer model overview 19. Layer 2 — classifying entities (schema, pipeline stage, metrics) 20. Layer 3 — extracting the relation graph (triplets, feedback loops) 21. Layer 4 — the minimal systemic model (core-model.md output) 22. Planned CLI commands for layers 2–4 23. Layers 2–4 as composed infospaces Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 10:43:32 +01:00
tegwick	9c32ad1837	fix(infospace): exclude raw LLM output from entity parsing; lower coverage threshold - Add `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` in entity_parser.py to prevent per-chapter raw LLM output files from being parsed as entities. This eliminates 33 malformed domain values where delimiter text was bleeding into the Economic Domain field. - Lower coverage_ratio threshold from 0.50 → 0.40 in infospace.yaml to reflect realistic multi-book corpus expectations (documented rationale in METRICS-METHODOLOGY.md). Post-fix metrics: 988 entities, 0 malformed, coverage_ratio=0.619 (pass). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 09:28:20 +01:00
tegwick	7c38f9b427	merge(reprocess-v2): complete pipeline rewrite and full corpus processing Merges the reprocess-v2 branch into main, covering: Infrastructure changes: - markitect infospace process — new CLI command for batch source processing - SourcePipeline — @{macro} substitution, skip-if-exists, git commit per source - PipelineStage config extended with name, output_dir, output_macro, split_entities, macros, max_tokens fields - Per-stage max_tokens (extract=8k, map-to-vsm=10k, synthesize=4k) - LLM provenance comment in each new entity file - output/processing-log.yaml with per-source token/cost/duration/retry stats - Retry on all LLM errors (not just rate limits) with 5s back-off - C2 coverage: add domain_densities, density_std, cross_cutting_ratio Example (infospace-with-history): - All 35 chapters processed: 1021 entities across Books 1–5 - Per-chapter git commits showing metric evolution from 0 → final state - Final metrics: coverage=0.44, granularity=2.95, redundancy=0.006 - METRICS-METHODOLOGY.md C2 section corrected and expanded Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 00:11:39 +01:00
tegwick	dfe56a4f9b	docs(metrics): clarify C2 coverage — domain×chapter matrix, not domain×VSM - coverage.py: rewrite module docstring to explain what the metric actually computes (domain × chapter cross-tabulation, not VSM system coverage), what it does not capture (entity connectivity → C3), and when the threshold is appropriate - CoverageReport: add domain_densities, density_std, cross_cutting_ratio for distribution-level insight beyond the aggregate ratio - check_coverage: compute per-domain density and cross-cutting ratio - METRICS-METHODOLOGY.md: correct C2 section to match implementation, document the distribution-based interpretation, add implementation status table distinguishing what is wired vs planned Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-20 00:08:46 +01:00
tegwick	0f54f094e4	chore(example): final metrics snapshot — all 35 chapters processed 1021 entities extracted across all Books 1-5 of The Wealth of Nations. Final metrics: coverage=0.4424, granularity=2.9533, redundancy=0.0059. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 22:54:54 +01:00
tegwick	4a15a50337	infospace: process book-5-chapter-03 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 22:54:40 +01:00
tegwick	92dfe367c7	infospace: process book-5-chapter-02 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 22:46:32 +01:00
tegwick	23c397e46a	infospace: process book-5-chapter-01 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 22:36:06 +01:00
tegwick	e695ddfbbd	infospace: process book-4-chapter-09 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 22:32:07 +01:00
tegwick	5245dbbfc8	infospace: process book-4-chapter-08 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 22:25:52 +01:00
tegwick	4319d2a32b	infospace: process book-4-chapter-07 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 22:14:18 +01:00
tegwick	efdaa884c8	infospace: process book-4-chapter-06 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 22:01:44 +01:00
tegwick	2804de3d24	infospace: process book-4-chapter-05 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 21:47:52 +01:00
tegwick	3e96ac7b8d	infospace: process book-4-chapter-04 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 21:36:17 +01:00
tegwick	a687e508f3	infospace: process book-4-chapter-03 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 21:31:40 +01:00
tegwick	da9c5fce80	infospace: process book-4-chapter-02 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 21:19:39 +01:00
tegwick	cd87ebfdc0	infospace: process book-4-chapter-01 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 21:13:08 +01:00
tegwick	666f78d1ba	infospace: process book-4-introduction Extract entities, map to VSM, and synthesize analysis.	2026-02-19 21:02:00 +01:00
tegwick	579e02989b	infospace: process book-3-chapter-04 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 20:46:20 +01:00
tegwick	8401c69ff2	infospace: process book-3-chapter-03 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 20:40:35 +01:00
tegwick	1b9a31665c	fix(pipeline): retry on all LLM errors (not just rate limits) Free-tier APIs intermittently return invalid JSON or empty responses. Now any exception in _call_llm retries up to 3 times with a 5s back-off, rather than failing immediately on non-rate-limit errors. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 20:32:23 +01:00
tegwick	06e904ccf5	infospace: process book-3-chapter-02 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 20:30:22 +01:00
tegwick	59d42b1665	infospace: process book-3-chapter-01 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 20:18:15 +01:00
tegwick	8c11e13fef	infospace: process book-2-chapter-05 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 20:03:11 +01:00
tegwick	ac4e508aff	infospace: process book-2-chapter-04 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 19:57:59 +01:00
tegwick	8e1943afdb	infospace: process book-2-chapter-03 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 19:50:53 +01:00
tegwick	05711e541d	infospace: process book-2-chapter-02 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 19:43:19 +01:00
tegwick	8cb9ee6f6e	infospace: process book-2-chapter-01 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 19:26:57 +01:00
tegwick	db129fde6b	infospace: process book-1-chapter-11 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 19:19:20 +01:00
tegwick	6d9ec4e34b	infospace: process book-1-chapter-10 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 18:59:36 +01:00
tegwick	679f482e49	config(example): increase extract-entities max_tokens to 8000 Chapters with many pre-existing entities were still truncating at 6000 tokens because the LLM needs space to output the full list of candidates even when most are skipped. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 18:48:33 +01:00
tegwick	368571905a	infospace: process book-1-chapter-09 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 15:58:08 +01:00
tegwick	9c95912d68	infospace: process book-1-chapter-08 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 15:47:12 +01:00
tegwick	0828581269	infospace: process book-1-chapter-07 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 15:40:24 +01:00
tegwick	283abac378	infospace: process book-1-chapter-06 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 15:29:59 +01:00
tegwick	90ca14dd85	config(example): increase max_tokens for map-to-vsm (10k) and synthesize (4k) map-to-vsm was consistently truncating at 6000 tokens; synthesize-analysis sometimes truncated at 3000 for chapters with many entities. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 15:21:04 +01:00
tegwick	098b781f92	infospace: process book-1-chapter-05 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 15:20:35 +01:00
tegwick	eea397a380	infospace: process book-1-chapter-04 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 15:12:54 +01:00
tegwick	7615beb139	chore(example): update metrics after chapter-03 collection check Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 15:06:03 +01:00
tegwick	c2e06c15d7	infospace: process book-1-chapter-03 Extract entities, map to VSM, and synthesize analysis.	2026-02-19 15:04:57 +01:00
tegwick	df1fdf1842	feat(pipeline): per-stage max_tokens, LLM provenance, processing log - PipelineStage now supports max_tokens to override the 4096 default - SourcePipeline records provider/model on each entity file as HTML comment - output/processing-log.yaml tracks tokens, cost, duration, retries, errors - _call_llm returns (content, metadata) for downstream traceability - _http.py wraps JSON parse errors with body preview for debugging - infospace.yaml stages: extract/map=6000 tokens, synthesize=3000 tokens Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 14:50:49 +01:00
tegwick	5ede1de4b8	fix(pipeline): retry on 0-entity response, save raw debug, improve template - SourcePipeline: retry split_entities stage once when 0 entity delimiters are found (free-tier models intermittently return short non-formatted responses); save raw LLM response to <stage>-raw.md alongside prompts - Return None (pause pipeline) rather than writing empty view file when no entities found after max retries - _http.py: wrap json.JSONDecodeError in LLMAPIError with body preview - extract-entities.md: add explicit H2-heading format example to Output Format section to prevent models from using inline "Section:" format Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 14:26:28 +01:00
tegwick	72d9904485	feat(infospace): add process command for batch source file processing - Extend PipelineStage with name, output_dir, output_macro, split_entities, and macros fields for declarative pipeline config - Add SourcePipeline class (pipeline.py) using simple @{macro} substitution — no SQLite dependency, skip-if-exists per stage, LLM retry on rate limits, git commit per source - Add `markitect infospace process [GLOB_PATTERN]` CLI command with --all, --provider, --model, --check-after-each, --no-commit flags - Update infospace.yaml with output_dir, output_macro, split_entities, and macros for each pipeline stage in the WoN example Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 13:29:50 +01:00
tegwick	c594bc3a38	feat(infospace): add process command for batch source file processing - Extend PipelineStage with name, output_dir, output_macro, split_entities, and macros fields for declarative pipeline config - Add SourcePipeline class (pipeline.py) using simple @{macro} substitution — no SQLite dependency, skip-if-exists per stage, LLM retry on rate limits, git commit per source - Add `markitect infospace process [GLOB_PATTERN]` CLI command with --all, --provider, --model, --check-after-each, --no-commit flags - Update infospace.yaml with output_dir, output_macro, split_entities, and macros for each pipeline stage in the WoN example Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 13:29:32 +01:00
tegwick	77dd3fee6d	fix(example): standardise domain enum and source chapter format in schema/rules Two root causes of metric fragmentation observed in collection checks: 1. Schema's Economic Domain used free-form examples ("labour economics, trade theory") which overrode the enum in extraction-rules.md, causing the LLM to produce multi-domain strings and non-canonical values. Fix: schema now specifies the exact 7-value enum with descriptions. 2. Source Chapter had no format constraint, producing 9 different formats for 7 chapters (full titles, mixed Roman/Arabic numerals, asterisks). Fix: extraction-rules now mandate "Book [Roman], Chapter [n]" exactly. These fixes are prerequisites for clean reprocessing (S3.2 continuation). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-19 13:02:05 +01:00

614 Commits
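
Commit 1b9a31665c describes retrying on any exception from `_call_llm` up to 3 times with a 5s back-off. The sketch below illustrates that pattern only; it is not the project's code. The name `call_with_retry`, its parameters, and the injectable `sleep` argument are hypothetical, and reading "up to 3 times" as one initial call plus three retries is an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 3,          # assumption: 3 retries after the first attempt
    backoff_seconds: float = 5.0,  # fixed back-off, as in the commit message
    sleep: Callable[[float], None] = time.sleep,  # injectable so tests need not wait
) -> T:
    """Call fn(), retrying on any exception with a fixed back-off."""
    last_error: Exception | None = None
    for attempt in range(1 + max_retries):
        try:
            return fn()
        except Exception as exc:  # any error, not only rate limits
            last_error = exc
            if attempt < max_retries:
                sleep(backoff_seconds)
    # All attempts failed: surface the last error to the caller.
    assert last_error is not None
    raise last_error
```

Catching every exception is deliberately broad here, matching the commit's stated motivation that free-tier APIs intermittently return invalid JSON or empty responses; a stricter variant could re-raise errors that are known not to be transient.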