Three coordinated changes that let the pipeline produce a clean
chapter-by-chapter git history on long texts without archaeology after
the fact.
1. Richer commit messages. `SourcePipeline._git_commit` now diffs the
staged changes, buckets added files by output subdirectory (entities,
evaluations, classifications, mappings, analyses, metrics, logs), and
includes counts in the commit body. So `git log` reads "entities:
+23, evaluations: +23" per chapter instead of the same generic blurb
on every commit. Zero behaviour change when no output changed; falls
back to the original message if the diff query fails.
2. --eval-after-source / --classify-after-source on `infospace process`.
After a source's stages succeed, the pipeline identifies which entity
files are *new* (set diff of entity slugs before vs after), loads
their EntityMeta, and runs per-entity evaluation and/or
classification scoped to just those slugs before the per-source git
commit lands. Result: each chapter's commit is self-contained —
extraction + evaluation + classification in one atomic unit. Gated
behind explicit flags because the cost is real (LLM latency per
chapter rather than amortised across one bulk batch).
3. `markitect infospace chapters` subcommand. Lists source files in
canonical order with entity count, evaluated count, classified
count, and mean per-entity score per source. Text or JSON output.
Natural triage surface for long-text infospaces — spot chapters that
under-extracted or evaluated poorly.
Also: `docs/advanced-usage.md` gets a new "Systematic processing of
long texts" section with the recommended flag combo and the tradeoff
note on cost.
11 new unit tests cover the chapters command (text/json/no-sources),
the process flag wiring (help + provider requirement), and the
commit-body bucket logic. Full infospace+llm unit suite (315 tests)
green; 3 pre-existing infospace failures unchanged.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- `markitect infospace entity <name>`: single-entity lookup tolerating
hyphens/underscores/case, with substring matching, ambiguity listing,
and near-match hints. Prints slug, source path, domain, chapter, word
count, VSM system, overall score, evaluator, and evaluation file path.
- `markitect infospace evaluate --model-fallback <model>`: if any
entities fail with a rate-limit error, retry just those with a fresh
adapter on the fallback model (different free-tier models have
separate quota buckets).
- `markitect llm-check`: advisory when `OPENROUTER_API_KEY` is set but
not used by the resolved provider; targeted hint when OpenRouter
returns 401 (almost always a stale env key).
- `build_state`: raises `TypeError` with actionable message if passed a
path instead of an `InfospaceConfig` — prior failure mode was a
confusing `AttributeError` deep in the stack.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Five improvements that eliminate most of the agent-in-the-loop friction
observed while closing out the 988-entity WoN evaluation (C.1):
1. Gemini adapter now retries on 429 + 5xx with exponential backoff
(same pattern already used by OpenRouter/OpenAI). Removes the need
for shell-level retry wrappers when hitting free-tier rate limits.
2. evaluate CLI prints the underlying error ("ERROR — HTTP 503 …")
instead of a bare "ERROR", so agents don't have to drop into Python
to diagnose transient failures.
3. --entity/--chapter now respect existing evaluation files by default
(previously only the full-collection pass did). New --force flag
opts into re-evaluation. Stops silently burning free-tier quota on
re-runs of the same slug.
4. --entity accepts hyphenated slugs (matching entity filenames) and
normalizes them to the underscore form used on disk. On a miss the
CLI suggests near matches instead of a bare "not found".
5. eval-summary --update-metrics is no longer destructive:
read_metrics_file/write_metrics_file preserve structured values
(type_distribution) and don't flatten ints to floats. Fixes a
silent data loss observed on every run.
Bonus: the evaluator field in written evaluation frontmatter now
falls back from run_config.model_name to the adapter's resolved model
(or the model echoed back in the API response), so rows no longer
show `evaluator: null` when --model is omitted.
Tests: new tests/unit/llm/test_gemini.py covers retry behavior;
tests/unit/infospace/test_history.py gains a round-trip test that
pins the type_distribution / int-preservation invariants.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Stage 1 — Decouple:
- Move RunConfig + LLMResponse to markitect/llm/models.py (canonical)
- Move LLMAdapter + Mock/ErrorLLMAdapter to markitect/llm/adapter.py
- markitect/prompts/execution/models.py and llm_adapter.py become re-export shims
- All 4 adapters + factory.py updated to import from markitect.llm.*
- Parameterize app_name in toml_config.py (resolve_llm, get_default_layers,
get_preference_layers): paths and env var now derived from app_name arg
- Add tests/test_llm_isolation.py: 7 isolation + backward-compat tests
Stage 2 — Extract:
- Standalone llm-connect package created at ~/llm-connect/
- All 18 llm files copied; markitect.* imports replaced with llm_connect.*
- LLMError base inlined in llm_connect/exceptions.py (no markitect dep)
- llm-connect installed into markitect-venv; declared in pyproject.toml
Smoke test: markitect llm-check succeeds (live Gemini API call).
Backward compat: markitect.prompts.execution.{models,llm_adapter} still work.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New graph_export.py module supporting the `markitect infospace graph`
command added in the previous commit.
- build_entity_graph(): constructs node/edge graph from L2 classifications
and L3 relation triplets, with feedback loop detection via networkx
- apply_filters(): subgraph filters by entity type, VSM system, ego
neighbourhood, feedback-loops-only, and classified-only
- to_mermaid(): Mermaid flowchart export
- Uses "-- label -->" syntax for all edges (robust with parentheses);
"== label ==>" thick arrows for feedback loop edges
- markdown_fence=True wraps output in ```mermaid block (VS Code / GitHub)
- color_by="type" or "vsm" with distinct palettes for each
- to_dot(): Graphviz DOT export with fillcolor per type/VSM system
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
markitect helper <QUESTION> now works as a short alias for
markitect llm-helper, per the original plan specification.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix evaluate dimensions to match template file:
definition_precision, source_grounding, domain_placement,
vsm_relevance, explanatory_value (was domain_relevance,
discipline_alignment, conceptual_clarity)
- Add VSM background context to evaluation prompt so LLM can
score vsm_relevance without macro injection
- Fix model_name bug: was sending literal "default" to API (HTTP 400)
- Refactor run_entity_evaluation to write files incrementally via
callback rather than all at once after the batch — long runs are
now resumable if interrupted
- Add incremental skip in CLI: entities with existing eval files
are skipped automatically on re-run (acts as resume)
- Add eval-summary command: reads all eval files, shows per-dimension
means, optionally writes per_entity_mean to metrics.yaml
- Fix record_check_results to merge rather than overwrite metrics.yaml
so per_entity_mean survives subsequent check runs
- Add per_entity_mean viability threshold (min: 3.5) to infospace.yaml
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Demonstrates infospace composition: the Wealth of Nations infospace is
used as a discipline, applying Smith's economic framework as a lens to
analyse modern supply chain management concepts.
New example: examples/supply-chain-vsm/
- infospace.yaml binding WoN as discipline (../infospace-with-history)
- 3 source documents: coordination mechanisms, capital & inventory,
market structure (~400 words each, original content)
- supply-chain-entity-schema-v1.0.md with WoN Concept required section
- won-mapping-schema-v1.0.md with Conceptual Continuity rating
- artifacts/won-reference/core-entities.md — 12 curated WoN entities
for injection as discipline context
- 8 hand-crafted entity files demonstrating LLM output format
- 3 mapping files with full rationale and VSM inheritance chains
- Viable: YES (5/5 thresholds)
Key mappings demonstrated:
Demand Signal → Effectual Demand (Strong, S2)
Vendor-Managed Inventory → Division of Labour (Strong, S1/S2)
Just-in-Time Inventory → Circulating Capital (Strong, S1/S3)
Bullwhip Effect → Natural Price (Moderate, S2)
Platform Intermediary → Merchant Capital (Strong, S2/S4)
Monopsony Power → Combination of Masters (Strong, S3*)
Platform fix: entity_parser.py now recognises ## Supply Chain Domain
as a domain alias for ## Economic Domain, enabling composed infospaces
to use their own domain section name.
Tutorial §13 rewritten with real commands, real output, and the full
mapping table from the demo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` in entity_parser.py to
prevent per-chapter raw LLM output files from being parsed as entities.
This eliminates 33 malformed domain values where delimiter text was
bleeding into the Economic Domain field.
- Lower coverage_ratio threshold from 0.50 → 0.40 in infospace.yaml to
reflect realistic multi-book corpus expectations (documented rationale
in METRICS-METHODOLOGY.md).
Post-fix metrics: 988 entities, 0 malformed, coverage_ratio=0.619 (pass).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- coverage.py: rewrite module docstring to explain what the metric actually
computes (domain × chapter cross-tabulation, not VSM system coverage),
what it does not capture (entity connectivity → C3), and when the
threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
document the distribution-based interpretation, add implementation status
table distinguishing what is wired vs planned
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Free-tier APIs intermittently return invalid JSON or empty responses.
Now any exception in _call_llm retries up to 3 times with a 5s back-off,
rather than failing immediately on non-rate-limit errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- PipelineStage now supports max_tokens to override the 4096 default
- SourcePipeline records provider/model on each entity file as HTML comment
- output/processing-log.yaml tracks tokens, cost, duration, retries, errors
- _call_llm returns (content, metadata) for downstream traceability
- _http.py wraps JSON parse errors with body preview for debugging
- infospace.yaml stages: extract/map=6000 tokens, synthesize=3000 tokens
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- SourcePipeline: retry split_entities stage once when 0 entity delimiters
are found (free-tier models intermittently return short non-formatted
responses); save raw LLM response to <stage>-raw.md alongside prompts
- Return None (pause pipeline) rather than writing empty view file when
no entities found after max retries
- _http.py: wrap json.JSONDecodeError in LLMAPIError with body preview
- extract-entities.md: add explicit H2-heading format example to Output
Format section to prevent models from using inline "Section:" format
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extend PipelineStage with name, output_dir, output_macro,
split_entities, and macros fields for declarative pipeline config
- Add SourcePipeline class (pipeline.py) using simple @{macro}
substitution — no SQLite dependency, skip-if-exists per stage,
LLM retry on rate limits, git commit per source
- Add `markitect infospace process [GLOB_PATTERN]` CLI command with
--all, --provider, --model, --check-after-each, --no-commit flags
- Update infospace.yaml with output_dir, output_macro, split_entities,
and macros for each pipeline stage in the WoN example
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
History module with snapshot creation from check results, metrics file
I/O, auto-append to history after checks, date-based snapshot lookup,
and metric trend extraction. CLI commands: history, history-diff.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Evaluation pipeline builds prompts from entity metadata, delegates
to BatchEvaluator, parses structured LLM responses into ScoreEntry
objects, and writes evaluation files. CLI: 'markitect infospace evaluate'
with --provider, --entity, --chapter filters.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds 'markitect infospace' command group with init (create config),
status (entity count/domains/disciplines), entities (list with sort),
and viability (threshold dashboard with pass/fail).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
InfospaceConfig (topic, disciplines, schemas, competency questions,
viability thresholds, pipeline) with YAML load/save and directory
discovery. InfospaceState aggregates entities, evaluations, and
viability checks for status reporting.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BatchEvaluator runs evaluation prompts across item batches with
incremental evaluation (skip unchanged via content digest), per-item
error isolation, progress callbacks, and aggregate token usage tracking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pure-Python FCA implementation: FormalContext (entity × attribute
binary relation with extent/intent/closure), ConceptLattice via
NextClosure algorithm, find_gap_concepts() for structural coverage
gaps, and find_empty_cells() for cross-tabulation analysis.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add data models (ScoreEntry, EntityEvaluation, EvaluationSnapshot,
SnapshotDiff) and I/O utilities for YAML frontmatter evaluation files,
snapshot persistence, history append, and snapshot diffing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAI-compatible embedding support (works with both OpenAI and
OpenRouter), file-based embedding cache with content-digest invalidation,
and pure-Python cosine similarity utilities for downstream redundancy
detection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deterministic validation of EntityMeta against declarative schemas:
section presence/word counts, heading format, domain enum values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract section-tree algorithm from SchemaGenerator into standalone
core/section_tree.py and build markitect/infospace/ package with
EntityMeta dataclass and parse_entity_file/parse_entity_directory.
Foundation for schema compliance, coverage, and granularity metrics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
schema-generate now builds content-aware schemas from the document's
section hierarchy instead of counting markdown syntax elements. Detects
key-value tables, data tables, link lists, and mixed content patterns
to produce schemas that reflect the actual document outline.
Old behavior preserved via --mode syntactic. Validator and visualization
tools pinned to syntactic mode for compatibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When markitdown is installed but a format-specific sub-dependency is
missing (e.g. pdfminer-six for PDF), translate the raw traceback into
a DependencyMissingError with the correct install command.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Always register MarkitdownExtractor so it overrides specialized extractors
for all its extensions. When markitdown-no-magika is not installed, users
now see the correct install hint instead of the old pymupdf4llm message.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Uses markitdown-no-magika (lighter fork without magika/onnxruntime) to
handle PDF, HTML, DOCX, PPTX, XLSX, XLS, CSV, JSON, and XML files.
Specialized extractors (pymupdf4llm, markdownify) remain as fallbacks
when markitdown is not installed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces a new `markitect/proxy/` module with pluggable extractors that
convert non-markdown sources (PDF, HTML) into tracked markdown proxy files.
Proxy files preserve origin metadata (path, checksum, timestamp) so they
can be kept in sync when the original changes.
CLI commands: `proxy create`, `proxy update`, `proxy status`, `proxy extractors`.
Built-in extractors: PDF (pymupdf4llm), HTML (markdownify), Markdown (built-in).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`markitect version` now prints a clean version string (Unix style),
with -v for commit/branch/dirty. `markitect release` shows detailed
development status: commits since tag, local changes, upstream
divergence. No overlap between the two commands.
Replaces get_version_info()/get_release_info() with get_version()
and get_release_status(). Drops yaml output format from release
(json + text sufficient).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When running from a git repo, use setuptools-scm at runtime to derive
the version from tags. Falls back to the static _version.py only when
not in a git repo (e.g. installed from wheel). This ensures
`markitect version` stays correct without requiring `pip install -e .`
after every tag.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _normalize_release_info() to ensure get_release_info() returns
keys expected by the CLI release command regardless of whether the
release-management capability is available.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add TOML-based config resolution with 7-level priority chain:
CLI flags > env var > user preference > directory preference >
directory default > user default > hardcoded fallback.
New commands: llm-default (view/set/clear defaults), llm-preference
(view/set/clear preferences). Each shows only its own scope. llm-check
now displays source attribution for resolved provider/model.
Existing commands (llm-helper, llm-check) refactored to use
resolve_llm() instead of manual resolution. Hardcoded fallback
changed from openrouter/aurora-alpha to gemini/gemini-2.5-flash
due to persistent OpenRouter 502 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register qwen/qwen3-coder-next under the openrouter provider and extend
llm-catalog with a "Known Models" column so all cataloged models are
discoverable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consistent llm-* naming scheme for all LLM CLI commands. llm-catalog shows
provider metadata and key status; llm-check sends a minimal prompt to verify
connectivity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `markitect helper <QUESTION>` CLI command that answers questions
about markitect using its own documentation as LLM context. Uses
OpenRouter with openrouter/aurora-alpha by default; model is
configurable via --model flag or MARKITECT_HELPER_MODEL env var.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAIAdapter for the OpenAI chat completions API (apikey-chatgpt.txt
or OPENAI_API_KEY). Set default model to arcee-ai/trinity-large-preview:free
for the infospace pipeline and increase max_tokens from 4096 to 8192.
Reprocess chapter 05 with Trinity Large (was Gemini: 1 truncated entity,
now 19 complete entities). Process chapters 06 (Aurora Alpha, 10 entities)
and 07 (Trinity Large, 15 entities including regenerated violent-policy.md).
Canonical set now at 85 unique entities.
Add entity archive policy: entities are never silently deleted. Retired
entities move to output/entities/archive/ with a dated reason header.
New CLI option: --archive-entity <slug> --reason "...". The --list
output shows the archive count alongside the canonical set.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add GeminiAdapter calling Google's Generative Language REST API
(default model: gemini-2.5-flash). Register "gemini" as third
provider in the factory and CLI. Add rate-limit retry with
exponential backoff to the pipeline's _call_llm helper. Increase
default max_tokens from 2000 to 4096.
Process book-1-chapter-05 via Gemini free tier — 1 new entity
extracted (necessaries-conveniencies-and-amusements-of-life),
41 existing entities correctly skipped by dedup. Canonical set
now at 42 unique entities.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ContentMacro: add __post_init__ to auto-derive raw_text when built
programmatically, preventing str.replace("", X) corruption
- MacroParser: add @{target} shorthand syntax support mapped to REQUIRED kind,
updating parse, has_macros, count_macros, and find_macro_positions
- Artifact: store content in model and SQLite DB, replace resolver placeholder
with actual artifact content, add migration for existing databases
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements markitect/llm/ package with concrete LLMAdapter implementations:
- OpenRouterAdapter: HTTP via urllib with retry/backoff on 429/5xx
- ClaudeCodeAdapter: subprocess-based Claude CLI with stdin piping
- Factory pattern: create_adapter("openrouter") or create_adapter("claude-code")
- API key resolution chain: constructor > env var > project-root key file
- 42 unit tests, 2 integration tests (gated on API key / CLI availability)
Also adds the infospace-with-history example with Wealth of Nations VSM
analysis pipeline, templates, schemas, source chapters, and processed
output for chapters 1-2. process_chapters.py now supports --provider
and --model flags for automatic LLM-driven processing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add quality gate framework with schema validation (JSON Schema via
jsonschema library), pattern validation (regex-based), multi-gate
QualityValidator with SQLite persistence, HaltingPolicyEngine with
budget/iteration/improvement checks, and RefinementLoop for iterative
execute-validate-halt cycles.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add directed dependency graph with cycle detection, topological sort,
and query service for finding dependents/dependencies transitively.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>