When markitdown is installed but a format-specific sub-dependency is
missing (e.g. pdfminer-six for PDF), translate the raw traceback into
a DependencyMissingError with the correct install command.
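A minimal sketch of the translation step, under stated assumptions: the class body, the `INSTALL_HINTS` mapping, the `translate_import_error` and `convert` helpers, and the exact hint string are illustrative, not the shipped implementation.

```python
class DependencyMissingError(RuntimeError):
    """Raised when an optional converter dependency is not installed."""

    def __init__(self, package: str, install_hint: str) -> None:
        super().__init__(
            f"Missing dependency '{package}'. Install with: {install_hint}"
        )
        self.package = package
        self.install_hint = install_hint


# Hypothetical mapping from importable module name to an install command.
INSTALL_HINTS = {
    "pdfminer": "pip install pdfminer-six",
}


def translate_import_error(exc: ImportError) -> DependencyMissingError:
    """Turn a raw ImportError into an actionable DependencyMissingError."""
    module = (exc.name or "unknown").split(".")[0]
    hint = INSTALL_HINTS.get(module, f"pip install {module}")
    return DependencyMissingError(module, hint)


def convert(run_conversion):
    """Run a conversion callable, translating missing-module errors."""
    try:
        return run_conversion()
    except ImportError as exc:
        # Keep the original traceback attached as the cause.
        raise translate_import_error(exc) from exc
```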
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Always register MarkitdownExtractor so it overrides specialized extractors
for all its extensions. When markitdown-no-magika is not installed, users
now see the correct install hint instead of the old pymupdf4llm message.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Uses markitdown-no-magika (lighter fork without magika/onnxruntime) to
handle PDF, HTML, DOCX, PPTX, XLSX, XLS, CSV, JSON, and XML files.
Specialized extractors (pymupdf4llm, markdownify) remain as fallbacks
when markitdown is not installed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces a new `markitect/proxy/` module with pluggable extractors that
convert non-markdown sources (PDF, HTML) into tracked markdown proxy files.
Proxy files preserve origin metadata (path, checksum, timestamp) so they
can be kept in sync when the original changes.
CLI commands: `proxy create`, `proxy update`, `proxy status`, `proxy extractors`.
Built-in extractors: PDF (pymupdf4llm), HTML (markdownify), Markdown (built-in).
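A rough sketch of how origin metadata could drive the sync check. `OriginMetadata`, `capture_origin`, and `is_stale` are hypothetical names, and SHA-256 is an assumed checksum algorithm; the module's real data model may differ.

```python
import hashlib
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class OriginMetadata:
    """Hypothetical record of where a proxy file came from."""
    path: str
    checksum: str
    timestamp: float


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def capture_origin(path: Path) -> OriginMetadata:
    """Record path, checksum, and modification time of the original."""
    return OriginMetadata(str(path), file_checksum(path), path.stat().st_mtime)


def is_stale(meta: OriginMetadata) -> bool:
    """A proxy is stale when the original's content no longer matches."""
    return file_checksum(Path(meta.path)) != meta.checksum
```

Comparing checksums rather than timestamps avoids false positives when a file is touched without its content changing.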
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`markitect version` now prints a clean version string (Unix style),
with -v for commit/branch/dirty. `markitect release` shows detailed
development status: commits since tag, local changes, upstream
divergence. No overlap between the two commands.
Replaces get_version_info()/get_release_info() with get_version()
and get_release_status(). Drops yaml output format from release
(json + text sufficient).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When running from a git repo, use setuptools-scm at runtime to derive
the version from tags. Falls back to the static _version.py only when
not in a git repo (e.g. installed from wheel). This ensures
`markitect version` stays correct without requiring `pip install -e .`
after every tag.
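The fallback order could look roughly like the sketch below. `resolve_version` and `STATIC_VERSION` are illustrative stand-ins (the latter for the value stored in _version.py); only `setuptools_scm.get_version` is a real library call.

```python
from pathlib import Path

# Stand-in for the version string written to _version.py at build time.
STATIC_VERSION = "0.0.0"


def resolve_version(root: Path) -> str:
    """Prefer a tag-derived version inside a git checkout, else the static one."""
    if not (root / ".git").exists():
        # Not a git checkout (e.g. installed from a wheel).
        return STATIC_VERSION
    try:
        from setuptools_scm import get_version as scm_get_version

        return scm_get_version(root=str(root))
    except (ImportError, LookupError):
        # setuptools-scm is not installed, or it could not find a version.
        return STATIC_VERSION
```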
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _normalize_release_info() to ensure get_release_info() returns
keys expected by the CLI release command regardless of whether the
release-management capability is available.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add TOML-based config resolution with 7-level priority chain:
CLI flags > env var > user preference > directory preference >
directory default > user default > hardcoded fallback.
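The chain amounts to "first non-empty source wins, and remember which one it was". A minimal sketch, assuming the sources have already been loaded into a dict; the key names and the `resolve` function are hypothetical and do not mirror resolve_llm()'s real signature.

```python
from typing import Optional

# Highest priority first; the hardcoded fallback is the implicit 7th level.
PRIORITY = [
    "cli_flag",
    "env_var",
    "user_preference",
    "directory_preference",
    "directory_default",
    "user_default",
]
HARDCODED_FALLBACK = "gemini/gemini-2.5-flash"


def resolve(sources: dict[str, Optional[str]]) -> tuple[str, str]:
    """Return (value, source_name) for the highest-priority non-empty source."""
    for name in PRIORITY:
        value = sources.get(name)
        if value:
            return value, name
    return HARDCODED_FALLBACK, "hardcoded_fallback"
```

Returning the source name alongside the value is what makes source attribution possible in a command such as llm-check.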
New commands: llm-default (view/set/clear defaults), llm-preference
(view/set/clear preferences). Each shows only its own scope. llm-check
now displays source attribution for resolved provider/model.
Existing commands (llm-helper, llm-check) refactored to use
resolve_llm() instead of manual resolution. Hardcoded fallback
changed from openrouter/aurora-alpha to gemini/gemini-2.5-flash
due to persistent OpenRouter 502 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register qwen/qwen3-coder-next under the openrouter provider and extend
llm-catalog with a "Known Models" column so all cataloged models are
discoverable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consistent llm-* naming scheme for all LLM CLI commands. llm-catalog shows
provider metadata and key status; llm-check sends a minimal prompt to verify
connectivity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `markitect helper <QUESTION>` CLI command that answers questions
about markitect using its own documentation as LLM context. Uses
OpenRouter with openrouter/aurora-alpha by default; model is
configurable via --model flag or MARKITECT_HELPER_MODEL env var.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAIAdapter for the OpenAI chat completions API (apikey-chatgpt.txt
or OPENAI_API_KEY). Set default model to arcee-ai/trinity-large-preview:free
for the infospace pipeline and increase max_tokens from 4096 to 8192.
Reprocess chapter 05 with Trinity Large (was Gemini: 1 truncated entity,
now 19 complete entities). Process chapters 06 (Aurora Alpha, 10 entities)
and 07 (Trinity Large, 15 entities including regenerated violent-policy.md).
Canonical set now at 85 unique entities.
Add entity archive policy: entities are never silently deleted. Retired
entities move to output/entities/archive/ with a dated reason header.
New CLI option: --archive-entity <slug> --reason "...". The --list
output shows the archive count alongside the canonical set.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add GeminiAdapter calling Google's Generative Language REST API
(default model: gemini-2.5-flash). Register "gemini" as third
provider in the factory and CLI. Add rate-limit retry with
exponential backoff to the pipeline's _call_llm helper. Increase
default max_tokens from 2000 to 4096.
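A sketch of rate-limit retry with exponential backoff. `RateLimitError` and `call_with_backoff` are hypothetical names, and the retry count and base delay are assumed defaults, not the values used by _call_llm.

```python
import time


class RateLimitError(Exception):
    """Hypothetical error an adapter raises on a rate-limit response."""


def call_with_backoff(call, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Retry `call` on RateLimitError, doubling the delay after each failure."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # Out of retries: surface the error to the caller.
            sleep(base_delay * (2 ** attempt))
```

Injecting `sleep` keeps the helper testable without real waiting.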
Process book-1-chapter-05 via Gemini free tier: 1 new entity
extracted (necessaries-conveniencies-and-amusements-of-life),
41 existing entities correctly skipped by dedup. Canonical set
now at 42 unique entities.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restructure entity storage from per-chapter subdirectories to a flat
canonical set in output/entities/. Each entity exists as a single file;
duplicates across chapters are detected by slug collision and skipped
(first occurrence wins). Chapter views use {{ include }} transclusion
to reference shared entity files.
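Slug-collision dedup with "first occurrence wins" can be sketched as follows. The `slugify` rule and the in-memory dict are assumptions; the pipeline itself works on files in output/entities/.

```python
import re


def slugify(name: str) -> str:
    """Lowercase the name and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def add_entities(canonical: dict[str, str], names: list[str]) -> list[str]:
    """Add new entities to the canonical set; return slugs skipped as duplicates."""
    skipped = []
    for name in names:
        slug = slugify(name)
        if slug in canonical:
            skipped.append(slug)  # First occurrence wins; later ones are dropped.
        else:
            canonical[slug] = name
    return skipped
```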
Add @{existing_entities} macro to extract-entities template so the LLM
knows which entities already exist and focuses on genuinely new ones.
Extract _call_llm() from _execute_llm() for callers that handle their
own file I/O. Result: 41 unique entities from 4 chapters (2 duplicates removed).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ContentMacro: add __post_init__ to auto-derive raw_text when built
  programmatically, preventing str.replace("", X) corruption
- MacroParser: add @{target} shorthand syntax support mapped to REQUIRED kind,
  updating parse, has_macros, count_macros, and find_macro_positions
- Artifact: store content in model and SQLite DB, replace resolver placeholder
  with actual artifact content, add migration for existing databases
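To illustrate the ContentMacro fix: replacing an empty string inserts the replacement between every character, so a macro built without raw_text would corrupt the template. The two-field dataclass and the `substitute` helper below are simplified stand-ins, not the real model.

```python
from dataclasses import dataclass


@dataclass
class ContentMacro:
    """Simplified stand-in with only the fields needed for the example."""
    target: str
    raw_text: str = ""

    def __post_init__(self) -> None:
        # Derive the literal macro text when the parser did not supply it.
        if not self.raw_text:
            self.raw_text = f"@{{{self.target}}}"


def substitute(template: str, macro: ContentMacro, value: str) -> str:
    """Replace the macro's literal text with the resolved value."""
    return template.replace(macro.raw_text, value)


# The failure mode being prevented: an empty search string matches everywhere.
corrupted = "ab".replace("", "X")  # "XaXbX"
```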
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive walkthrough covering schema design, prompt templates,
artifact population, pipeline usage, LLM integration, git history
tracking, metrics, and how to complete the remaining 31 chapters.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All 3 stages (entities, mappings, analysis) auto-generated.
1m53s wall time, 9,478 tokens (real), ~$0.07 est. cost.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auto-generated mappings and analysis via Claude Code CLI adapter.
Entities were already present from a previous session.
Stats: 5m04s wall time, ~51K estimated tokens, ~$0.35 estimated cost.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements markitect/llm/ package with concrete LLMAdapter implementations:
- OpenRouterAdapter: HTTP via urllib with retry/backoff on 429/5xx
- ClaudeCodeAdapter: subprocess-based Claude CLI with stdin piping
- Factory pattern: create_adapter("openrouter") or create_adapter("claude-code")
- API key resolution chain: constructor > env var > project-root key file
- 42 unit tests, 2 integration tests (gated on API key / CLI availability)
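The key resolution chain could be sketched like this. `resolve_api_key` and its parameters are hypothetical; the environment variable name and key file path are passed in rather than assumed.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_api_key(
    explicit: Optional[str], env_var: str, key_file: Path
) -> Optional[str]:
    """Resolve a key in priority order: constructor > env var > key file."""
    if explicit:
        return explicit
    from_env = os.environ.get(env_var)
    if from_env:
        return from_env
    if key_file.is_file():
        # Strip trailing newlines that editors tend to add to key files.
        return key_file.read_text(encoding="utf-8").strip() or None
    return None
```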
Also adds the infospace-with-history example with Wealth of Nations VSM
analysis pipeline, templates, schemas, source chapters, and processed
output for chapters 1-2. process_chapters.py now supports --provider
and --model flags for automatic LLM-driven processing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This example demonstrates the full workflow of generating InfoTech primers
using MarkiTect's Prompt Dependency Resolution infrastructure.
Features demonstrated:
- Artifact creation and storage with content-based addressing
- PromptTemplate with @{macro} resolution across multiple spaces
- Automatic dependency tracking and graph construction
- Provenance tracing from outputs back to inputs
- Visualization export (Mermaid format)
- Incremental execution with change detection
Files added:
- generate_primers.py: Complete working example
- README.md: Quick start guide and architecture overview
- TUTORIAL.md: Comprehensive 500+ line tutorial
- templates/generate-primer.md: Template with macros
- artifacts/topics/: ETL and Microservices topic definitions
- artifacts/guidelines/: Authoring rules and research protocol
- prepdr/: Original manual system (preserved for reference)
Example output:
- Generates 2 primers (ETL, Microservices)
- Creates 8 artifacts across 4 information spaces
- Records 8 dependency edges in SQLite database
- Exports dependency graph visualization
Run with: cd examples/content-generator && python generate_primers.py
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add quality gate framework with schema validation (JSON Schema via
jsonschema library), pattern validation (regex-based), multi-gate
QualityValidator with SQLite persistence, HaltingPolicyEngine with
budget/iteration/improvement checks, and RefinementLoop for iterative
execute-validate-halt cycles.
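A simplified sketch of one execute-validate-halt cycle. It covers only the iteration and improvement checks (the budget check is omitted), and the function signature, callbacks, and halt reasons are assumptions rather than the RefinementLoop API.

```python
def refinement_loop(execute, validate, max_iterations=5, min_improvement=0.01):
    """Run execute/validate until gates pass or a halting rule fires.

    `execute(iteration)` returns an output; `validate(output)` returns
    (score, passed). Returns (last_output, halt_reason).
    """
    previous_score = None
    output = None
    for iteration in range(1, max_iterations + 1):
        output = execute(iteration)
        score, passed = validate(output)
        if passed:
            return output, "passed"
        # Halt early when the score stops improving between iterations.
        if previous_score is not None and score - previous_score < min_improvement:
            return output, "no_improvement"
        previous_score = score
    return output, "max_iterations"
```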
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add directed dependency graph with cycle detection, topological sort,
and query service for finding dependents/dependencies transitively.
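Cycle detection and topological sort can be combined in one pass with Kahn's algorithm. This is a generic sketch over a plain dict, not the graph class added in this commit; `topological_sort` and its input shape are assumptions.

```python
from collections import deque


def topological_sort(dependencies: dict[str, set[str]]) -> list[str]:
    """Order nodes so each appears after everything it depends on.

    `dependencies` maps a node to the set of nodes it depends on.
    Raises ValueError if the graph contains a cycle.
    """
    nodes = set(dependencies)
    for deps in dependencies.values():
        nodes.update(deps)

    remaining = {n: len(dependencies.get(n, ())) for n in nodes}
    dependents: dict[str, list[str]] = {n: [] for n in nodes}
    for node, deps in dependencies.items():
        for dep in deps:
            dependents[dep].append(node)

    # Sorting keeps the output deterministic for equal-priority nodes.
    ready = deque(sorted(n for n, count in remaining.items() if count == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for dependent in sorted(dependents[node]):
            remaining[dependent] -= 1
            if remaining[dependent] == 0:
                ready.append(dependent)

    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order
```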
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create detailed 26-week workplan for Prompt Dependency Resolution system
implementing all 11 functional requirements across 8 phases:
- Phase 1-2: Foundation (artifacts, templates, macros)
- Phase 3-4: Resolution and execution engine with idempotent runs
- Phase 5-6: Dependency tracking and incremental recomputation
- Phase 7-8: Quality validation and observability/traceability
Includes database schemas, verification strategies, risk management,
and complete file structure for ~60 new modules.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Provides high-level overview of MarkiTect from value and functional perspective:
- What MarkiTect is and why it matters
- Core capabilities (Information Spaces, Schema Management, etc.)
- Practical use cases across different domains
- Key benefits for different user types
- Getting started guidance
- Philosophy and design principles
Focuses on user value and functionality without implementation details.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements optional git-based version control for information spaces:
- HistoryConfig model for configuring history tracking
- Commit, Branch, HistoryEntry, DiffResult models
- IHistoryBackend and IHistoryQuery interfaces
- GitHistoryBackend using git CLI for version control
- GitHistoryEventHandler for event-driven auto-commits
- HistoryEventCoordinator for managing space history
- HistoryQueryService for high-level history queries
- Automatic commits on DOCUMENT_ADDED/REMOVED/CONTENT_CHANGED events
- Support for:
  * Commit log with pagination and filtering
  * Diff between versions
  * File content at specific versions
  * Branch creation and switching
  * Version restoration
  * Uncommitted changes detection
- 43 comprehensive unit tests with git availability checks
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements space composition and inheritance features:
- SpaceReference model for space-to-space references (includes, extends, links_to, composed_of)
- Variable inheritance through parent chain with local override
- Config inheritance with source tracking
- Access control models (SpacePermission, SpaceRole, AccessLevel)
- InheritanceResolver for walking parent chains
- AccessControlService for permission management
- ComposableSpaceService integrating all composability features
- Circular reference detection for EXTENDS references
- SQLite repositories for references and permissions
- 57 comprehensive unit tests
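Variable inheritance with local override and circular-reference detection might look like the sketch below. The dict layout (`variables`, `extends`) and the `resolve_variable` function are hypothetical and do not reflect InheritanceResolver's actual interface.

```python
def resolve_variable(space: str, name: str, spaces: dict[str, dict]) -> tuple[str, str]:
    """Walk the EXTENDS chain and return (value, defining_space).

    The nearest definition wins, so a local value overrides the parent's.
    """
    seen = set()
    current = space
    while current is not None:
        if current in seen:
            raise ValueError(f"circular EXTENDS reference at '{current}'")
        seen.add(current)
        node = spaces[current]
        if name in node.get("variables", {}):
            return node["variables"][name], current
        current = node.get("extends")
    raise KeyError(name)
```

Returning the defining space alongside the value is one way to support source tracking.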
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements API layer for Information Spaces:
- GraphQL schema types for spaces, documents, variables
- GraphQL queries and mutations for space operations
- CLI command group with all space management commands
- Resolver functions connecting GraphQL to SpaceService
- 38 unit tests for API components
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements HTML rendering system for Information Spaces:
- SpaceRenderer: Abstract base class for renderers
- RenderConfig: Configuration for format, theme, TOC, etc.
- RenderResult: Immutable result with content hash and metadata
- ThemeConfig: Layered theme system with customization
- CompositeRenderer: Multi-format renderer delegation
- MarkdownToHTMLRenderer: Full markdown-to-HTML conversion
  - Theme support (github, dark, minimal, academic)
  - Code block handling
  - Link target="_blank" for external links
  - Table of contents generation
  - Heading ID generation for navigation
- HTMLRendererFactory: Factory for common renderer configurations
- SpaceRenderingService: Orchestration layer
  - Transclusion variable substitution
  - Render caching with automatic invalidation
  - Event emission (RENDER_STARTED, RENDER_COMPLETED, RENDER_FAILED)
  - Batch rendering support
  - Statistics tracking
- SpaceRenderingServiceBuilder: Fluent builder pattern
60 unit tests covering all components.
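One way to get render caching with automatic invalidation is to key each cache entry on a hash of the source. The sketch below is illustrative only: this `RenderResult` has just two fields, and `RenderCache` is a hypothetical helper, not part of SpaceRenderingService.

```python
import hashlib
from dataclasses import dataclass


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class RenderResult:
    """Immutable render output with a hash of its content."""
    content: str
    content_hash: str


class RenderCache:
    """Re-render a document only when its source text has changed."""

    def __init__(self, render):
        self._render = render
        self._entries: dict[str, tuple[str, RenderResult]] = {}

    def render(self, doc_id: str, source: str) -> RenderResult:
        source_hash = _sha256(source)
        cached = self._entries.get(doc_id)
        if cached is not None and cached[0] == source_hash:
            return cached[1]  # Source unchanged: reuse the cached result.
        html = self._render(source)
        result = RenderResult(html, _sha256(html))
        self._entries[doc_id] = (source_hash, result)
        return result
```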
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move topic from roadmap/ to history/
- Add DONE.md with comprehensive completion summary
- Topic fully complete with all 9 optimizations implemented
- Exceeded original scope (Stages 1-2 + all of Stage 3)
- Ready for archive
Add release notes extraction from CHANGELOG for publishing:
- Create ChangelogParser class to extract version sections from CHANGELOG
- Support multiple output formats: markdown, plain text, HTML
- Add 'release notes VERSION' CLI command to extract notes
- Auto-detect latest version if not specified
- Support piping to gh/gitea release commands
- Save to file with --output option
- Plain text format removes markdown formatting
- HTML format converts markdown to HTML
This streamlines creating release notes for GitHub/Gitea releases
by extracting CHANGELOG content automatically.
Usage:
  release notes 0.10.0                  # Extract markdown notes
  release notes                         # Latest version
  release notes 0.10.0 --format plain   # Plain text
  release notes 0.10.0 -o notes.md      # Save to file
  release notes 0.10.0 | gh release create v0.10.0 -F -
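Extracting one version section can be sketched with a regular expression. This assumes level-2 headings such as `## [0.10.0]` and that the newest version appears first in the file; `extract_release_notes` is a hypothetical function, not ChangelogParser's API.

```python
import re
from typing import Optional

# Matches headings like "## [0.10.0]", "## 0.10.0", or "## v0.10.0 - date".
VERSION_HEADING = re.compile(r"^## \[?v?(\d+\.\d+\.\d+)\]?.*$", re.MULTILINE)


def extract_release_notes(changelog: str, version: Optional[str] = None) -> str:
    """Return the body of one version section (the first one if version is None)."""
    matches = list(VERSION_HEADING.finditer(changelog))
    if not matches:
        raise ValueError("no version sections found")
    for index, match in enumerate(matches):
        if version is None or match.group(1) == version:
            # The section runs until the next version heading or end of file.
            if index + 1 < len(matches):
                end = matches[index + 1].start()
            else:
                end = len(changelog)
            return changelog[match.end():end].strip()
    raise KeyError(version)
```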
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add automated schema ingestion from markitect/schemas/ directory:
- Create auto_ingest_schemas() function in schema_loader module
- Automatically detect and ingest .md schema files from schemas/
- Skip schemas that are already ingested in database
- Return detailed results with ingested/skipped/failed lists
- Add 'markitect schema-auto-ingest' CLI command
- Support verbose mode for detailed progress reporting
- Useful for post-install setup and development workflows
This eliminates the manual step of running schema-ingest for each
bundled schema file, streamlining schema management.
Usage:
  markitect schema-auto-ingest             # Ingest all new schemas
  markitect schema-auto-ingest --verbose   # Show detailed progress
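The detect, skip, and report flow could be sketched as follows. `auto_ingest_sketch`, the `already_ingested` set, and the `ingest` callback are stand-ins for the database lookup and the real ingestion call; auto_ingest_schemas() itself may be shaped differently.

```python
from pathlib import Path
from typing import Callable


def auto_ingest_sketch(
    schema_dir: Path,
    already_ingested: set[str],
    ingest: Callable[[Path], None],
) -> dict[str, list[str]]:
    """Ingest every new .md schema and report ingested/skipped/failed names."""
    results: dict[str, list[str]] = {"ingested": [], "skipped": [], "failed": []}
    for path in sorted(schema_dir.glob("*.md")):
        name = path.stem
        if name in already_ingested:
            results["skipped"].append(name)
            continue
        try:
            ingest(path)
        except Exception:
            # One bad schema should not abort the rest of the batch.
            results["failed"].append(name)
        else:
            results["ingested"].append(name)
    return results
```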
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>