Demonstrates infospace composition: the Wealth of Nations infospace is
used as a discipline, applying Smith's economic framework as a lens to
analyse modern supply chain management concepts.
New example: examples/supply-chain-vsm/
- infospace.yaml binding WoN as discipline (../infospace-with-history)
- 3 source documents: coordination mechanisms, capital & inventory,
market structure (~400 words each, original content)
- supply-chain-entity-schema-v1.0.md with WoN Concept required section
- won-mapping-schema-v1.0.md with Conceptual Continuity rating
- artifacts/won-reference/core-entities.md — 12 curated WoN entities
for injection as discipline context
- 8 hand-crafted entity files demonstrating LLM output format
- 3 mapping files with full rationale and VSM inheritance chains
- Viable: YES (5/5 thresholds)
Key mappings demonstrated:
Demand Signal → Effectual Demand (Strong, S2)
Vendor-Managed Inventory → Division of Labour (Strong, S1/S2)
Just-in-Time Inventory → Circulating Capital (Strong, S1/S3)
Bullwhip Effect → Natural Price (Moderate, S2)
Platform Intermediary → Merchant Capital (Strong, S2/S4)
Monopsony Power → Combination of Masters (Strong, S3*)
Platform fix: entity_parser.py now recognises ## Supply Chain Domain
as a domain alias for ## Economic Domain, enabling composed infospaces
to use their own domain section name.
Tutorial §13 rewritten with real commands, real output, and the full
mapping table from the demo.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add `.*-raw\.md$` to `_DEFAULT_EXCLUDE_PATTERNS` in entity_parser.py to
prevent per-chapter raw LLM output files from being parsed as entities.
This eliminates 33 malformed domain values where delimiter text was
bleeding into the Economic Domain field.
- Lower coverage_ratio threshold from 0.50 → 0.40 in infospace.yaml to
reflect realistic multi-book corpus expectations (documented rationale
in METRICS-METHODOLOGY.md).
Post-fix metrics: 988 entities, 0 malformed, coverage_ratio=0.619 (pass).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- coverage.py: rewrite module docstring to explain what the metric actually
computes (domain × chapter cross-tabulation, not VSM system coverage),
what it does not capture (entity connectivity → C3), and when the
threshold is appropriate
- CoverageReport: add domain_densities, density_std, cross_cutting_ratio
for distribution-level insight beyond the aggregate ratio
- check_coverage: compute per-domain density and cross-cutting ratio
- METRICS-METHODOLOGY.md: correct C2 section to match implementation,
document the distribution-based interpretation, add implementation status
table distinguishing what is wired vs planned
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Free-tier APIs intermittently return invalid JSON or empty responses.
Now any exception in _call_llm retries up to 3 times with a 5s back-off,
rather than failing immediately on non-rate-limit errors.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- PipelineStage now supports max_tokens to override the 4096 default
- SourcePipeline records provider/model on each entity file as HTML comment
- output/processing-log.yaml tracks tokens, cost, duration, retries, errors
- _call_llm returns (content, metadata) for downstream traceability
- _http.py wraps JSON parse errors with body preview for debugging
- infospace.yaml stages: extract/map=6000 tokens, synthesize=3000 tokens
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- SourcePipeline: retry split_entities stage once when 0 entity delimiters
are found (free-tier models intermittently return short non-formatted
responses); save raw LLM response to <stage>-raw.md alongside prompts
- Return None (pause pipeline) rather than writing empty view file when
no entities found after max retries
- _http.py: wrap json.JSONDecodeError in LLMAPIError with body preview
- extract-entities.md: add explicit H2-heading format example to Output
Format section to prevent models from using inline "Section:" format
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Extend PipelineStage with name, output_dir, output_macro,
split_entities, and macros fields for declarative pipeline config
- Add SourcePipeline class (pipeline.py) using simple @{macro}
substitution — no SQLite dependency, skip-if-exists per stage,
LLM retry on rate limits, git commit per source
- Add `markitect infospace process [GLOB_PATTERN]` CLI command with
--all, --provider, --model, --check-after-each, --no-commit flags
- Update infospace.yaml with output_dir, output_macro, split_entities,
and macros for each pipeline stage in the WoN example
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
History module with snapshot creation from check results, metrics file
I/O, auto-append to history after checks, date-based snapshot lookup,
and metric trend extraction. CLI commands: history, history-diff.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Evaluation pipeline builds prompts from entity metadata, delegates
to BatchEvaluator, parses structured LLM responses into ScoreEntry
objects, and writes evaluation files. CLI: 'markitect infospace evaluate'
with --provider, --entity, --chapter filters.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds 'markitect infospace' command group with init (create config),
status (entity count/domains/disciplines), entities (list with sort),
and viability (threshold dashboard with pass/fail).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
InfospaceConfig (topic, disciplines, schemas, competency questions,
viability thresholds, pipeline) with YAML load/save and directory
discovery. InfospaceState aggregates entities, evaluations, and
viability checks for status reporting.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BatchEvaluator runs evaluation prompts across item batches with
incremental evaluation (skip unchanged via content digest), per-item
error isolation, progress callbacks, and aggregate token usage tracking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pure-Python FCA implementation: FormalContext (entity × attribute
binary relation with extent/intent/closure), ConceptLattice via
NextClosure algorithm, find_gap_concepts() for structural coverage
gaps, and find_empty_cells() for cross-tabulation analysis.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add data models (ScoreEntry, EntityEvaluation, EvaluationSnapshot,
SnapshotDiff) and I/O utilities for YAML frontmatter evaluation files,
snapshot persistence, history append, and snapshot diffing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAI-compatible embedding support (works with both OpenAI and
OpenRouter), file-based embedding cache with content-digest invalidation,
and pure-Python cosine similarity utilities for downstream redundancy
detection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deterministic validation of EntityMeta against declarative schemas:
section presence/word counts, heading format, domain enum values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract section-tree algorithm from SchemaGenerator into standalone
core/section_tree.py and build markitect/infospace/ package with
EntityMeta dataclass and parse_entity_file/parse_entity_directory.
Foundation for schema compliance, coverage, and granularity metrics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
schema-generate now builds content-aware schemas from the document's
section hierarchy instead of counting markdown syntax elements. Detects
key-value tables, data tables, link lists, and mixed content patterns
to produce schemas that reflect the actual document outline.
Old behavior preserved via --mode syntactic. Validator and visualization
tools pinned to syntactic mode for compatibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When markitdown is installed but a format-specific sub-dependency is
missing (e.g. pdfminer-six for PDF), translate the raw traceback into
a DependencyMissingError with the correct install command.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Always register MarkitdownExtractor so it overrides specialized extractors
for all its extensions. When markitdown-no-magika is not installed, users
now see the correct install hint instead of the old pymupdf4llm message.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Uses markitdown-no-magika (lighter fork without magika/onnxruntime) to
handle PDF, HTML, DOCX, PPTX, XLSX, XLS, CSV, JSON, and XML files.
Specialized extractors (pymupdf4llm, markdownify) remain as fallbacks
when markitdown is not installed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces a new `markitect/proxy/` module with pluggable extractors that
convert non-markdown sources (PDF, HTML) into tracked markdown proxy files.
Proxy files preserve origin metadata (path, checksum, timestamp) so they
can be kept in sync when the original changes.
CLI commands: `proxy create`, `proxy update`, `proxy status`, `proxy extractors`.
Built-in extractors: PDF (pymupdf4llm), HTML (markdownify), Markdown (built-in).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`markitect version` now prints a clean version string (Unix style),
with -v for commit/branch/dirty. `markitect release` shows detailed
development status: commits since tag, local changes, upstream
divergence. No overlap between the two commands.
Replaces get_version_info()/get_release_info() with get_version()
and get_release_status(). Drops yaml output format from release
(json + text sufficient).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When running from a git repo, use setuptools-scm at runtime to derive
the version from tags. Falls back to the static _version.py only when
not in a git repo (e.g. installed from wheel). This ensures
`markitect version` stays correct without requiring `pip install -e .`
after every tag.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _normalize_release_info() to ensure get_release_info() returns
keys expected by the CLI release command regardless of whether the
release-management capability is available.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add TOML-based config resolution with 7-level priority chain:
CLI flags > env var > user preference > directory preference >
directory default > user default > hardcoded fallback.
New commands: llm-default (view/set/clear defaults), llm-preference
(view/set/clear preferences). Each shows only its own scope. llm-check
now displays source attribution for resolved provider/model.
Existing commands (llm-helper, llm-check) refactored to use
resolve_llm() instead of manual resolution. Hardcoded fallback
changed from openrouter/aurora-alpha to gemini/gemini-2.5-flash
due to persistent OpenRouter 502 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register qwen/qwen3-coder-next under the openrouter provider and extend
llm-catalog with a "Known Models" column so all cataloged models are
discoverable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consistent llm-* naming scheme for all LLM CLI commands. llm-catalog shows
provider metadata and key status; llm-check sends a minimal prompt to verify
connectivity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `markitect helper <QUESTION>` CLI command that answers questions
about markitect using its own documentation as LLM context. Uses
OpenRouter with openrouter/aurora-alpha by default; model is
configurable via --model flag or MARKITECT_HELPER_MODEL env var.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAIAdapter for the OpenAI chat completions API (apikey-chatgpt.txt
or OPENAI_API_KEY). Set default model to arcee-ai/trinity-large-preview:free
for the infospace pipeline and increase max_tokens from 4096 to 8192.
Reprocess chapter 05 with Trinity Large (was Gemini: 1 truncated entity,
now 19 complete entities). Process chapters 06 (Aurora Alpha, 10 entities)
and 07 (Trinity Large, 15 entities including regenerated violent-policy.md).
Canonical set now at 85 unique entities.
Add entity archive policy: entities are never silently deleted. Retired
entities move to output/entities/archive/ with a dated reason header.
New CLI option: --archive-entity <slug> --reason "...". The --list
output shows the archive count alongside the canonical set.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add GeminiAdapter calling Google's Generative Language REST API
(default model: gemini-2.5-flash). Register "gemini" as third
provider in the factory and CLI. Add rate-limit retry with
exponential backoff to the pipeline's _call_llm helper. Increase
default max_tokens from 2000 to 4096.
Process book-1-chapter-05 via Gemini free tier — 1 new entity
extracted (necessaries-conveniencies-and-amusements-of-life),
41 existing entities correctly skipped by dedup. Canonical set
now at 42 unique entities.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ContentMacro: add __post_init__ to auto-derive raw_text when built
programmatically, preventing str.replace("", X) corruption
- MacroParser: add @{target} shorthand syntax support mapped to REQUIRED kind,
updating parse, has_macros, count_macros, and find_macro_positions
- Artifact: store content in model and SQLite DB, replace resolver placeholder
with actual artifact content, add migration for existing databases
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements markitect/llm/ package with concrete LLMAdapter implementations:
- OpenRouterAdapter: HTTP via urllib with retry/backoff on 429/5xx
- ClaudeCodeAdapter: subprocess-based Claude CLI with stdin piping
- Factory pattern: create_adapter("openrouter") or create_adapter("claude-code")
- API key resolution chain: constructor > env var > project-root key file
- 42 unit tests, 2 integration tests (gated on API key / CLI availability)
Also adds the infospace-with-history example with Wealth of Nations VSM
analysis pipeline, templates, schemas, source chapters, and processed
output for chapters 1-2. process_chapters.py now supports --provider
and --model flags for automatic LLM-driven processing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add quality gate framework with schema validation (JSON Schema via
jsonschema library), pattern validation (regex-based), multi-gate
QualityValidator with SQLite persistence, HaltingPolicyEngine with
budget/iteration/improvement checks, and RefinementLoop for iterative
execute-validate-halt cycles.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add directed dependency graph with cycle detection, topological sort,
and query service for finding dependents/dependencies transitively.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements optional git-based version control for information spaces:
- HistoryConfig model for configuring history tracking
- Commit, Branch, HistoryEntry, DiffResult models
- IHistoryBackend and IHistoryQuery interfaces
- GitHistoryBackend using git CLI for version control
- GitHistoryEventHandler for event-driven auto-commits
- HistoryEventCoordinator for managing space history
- HistoryQueryService for high-level history queries
- Automatic commits on DOCUMENT_ADDED/REMOVED/CONTENT_CHANGED events
- Support for:
* Commit log with pagination and filtering
* Diff between versions
* File content at specific versions
* Branch creation and switching
* Version restoration
* Uncommitted changes detection
- 43 comprehensive unit tests with git availability checks
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements space composition and inheritance features:
- SpaceReference model for space-to-space references (includes, extends, links_to, composed_of)
- Variable inheritance through parent chain with local override
- Config inheritance with source tracking
- Access control models (SpacePermission, SpaceRole, AccessLevel)
- InheritanceResolver for walking parent chains
- AccessControlService for permission management
- ComposableSpaceService integrating all composability features
- Circular reference detection for EXTENDS references
- SQLite repositories for references and permissions
- 57 comprehensive unit tests
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements API layer for Information Spaces:
- GraphQL schema types for spaces, documents, variables
- GraphQL queries and mutations for space operations
- CLI command group with all space management commands
- Resolver functions connecting GraphQL to SpaceService
- 38 unit tests for API components
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements HTML rendering system for Information Spaces:
- SpaceRenderer: Abstract base class for renderers
- RenderConfig: Configuration for format, theme, TOC, etc.
- RenderResult: Immutable result with content hash and metadata
- ThemeConfig: Layered theme system with customization
- CompositeRenderer: Multi-format renderer delegation
- MarkdownToHTMLRenderer: Full markdown-to-HTML conversion
- Theme support (github, dark, minimal, academic)
- Code block handling
- Link target="_blank" for external links
- Table of contents generation
- Heading ID generation for navigation
- HTMLRendererFactory: Factory for common renderer configurations
- SpaceRenderingService: Orchestration layer
- Transclusion variable substitution
- Render caching with automatic invalidation
- Event emission (RENDER_STARTED, RENDER_COMPLETED, RENDER_FAILED)
- Batch rendering support
- Statistics tracking
- SpaceRenderingServiceBuilder: Fluent builder pattern
60 unit tests covering all components.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>