InfospaceConfig (topic, disciplines, schemas, competency questions,
viability thresholds, pipeline) with YAML load/save and directory
discovery. InfospaceState aggregates entities, evaluations, and
viability checks for status reporting.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BatchEvaluator runs evaluation prompts across item batches with
incremental evaluation (skip unchanged via content digest), per-item
error isolation, progress callbacks, and aggregate token usage tracking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pure-Python FCA implementation: FormalContext (entity × attribute
binary relation with extent/intent/closure), ConceptLattice via
NextClosure algorithm, find_gap_concepts() for structural coverage
gaps, and find_empty_cells() for cross-tabulation analysis.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add data models (ScoreEntry, EntityEvaluation, EvaluationSnapshot,
SnapshotDiff) and I/O utilities for YAML frontmatter evaluation files,
snapshot persistence, history append, and snapshot diffing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAI-compatible embedding support (works with both OpenAI and
OpenRouter), file-based embedding cache with content-digest invalidation,
and pure-Python cosine similarity utilities for downstream redundancy
detection.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Deterministic validation of EntityMeta against declarative schemas:
section presence/word counts, heading format, domain enum values.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract section-tree algorithm from SchemaGenerator into standalone
core/section_tree.py and build markitect/infospace/ package with
EntityMeta dataclass and parse_entity_file/parse_entity_directory.
Foundation for schema compliance, coverage, and granularity metrics.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Conceptual overview of infospaces as structured, evaluable, composable
knowledge collections. Establishes the vocabulary (topic, discipline,
entity, viability), the build cycle (extract, map, evaluate, refine),
the five collection quality concerns, and the composition model
(hierarchical, networked, swarm).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The SQLite artifact database is a derived cache regenerable from
committed files — no LLM calls needed. Added tutorial section
explaining why it is excluded and how to rebuild it after a fresh clone.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
schema-generate now builds content-aware schemas from the document's
section hierarchy instead of counting markdown syntax elements. Detects
key-value tables, data tables, link lists, and mixed content patterns
to produce schemas that reflect the actual document outline.
Old behavior preserved via --mode syntactic. Validator and visualization
tools pinned to syntactic mode for compatibility.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When markitdown is installed but a format-specific sub-dependency is
missing (e.g. pdfminer-six for PDF), translate the raw traceback into
a DependencyMissingError with the correct install command.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Always register MarkitdownExtractor so it overrides specialized extractors
for all its extensions. When markitdown-no-magika is not installed, users
now see the correct install hint instead of the old pymupdf4llm message.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Uses markitdown-no-magika (lighter fork without magika/onnxruntime) to
handle PDF, HTML, DOCX, PPTX, XLSX, XLS, CSV, JSON, and XML files.
Specialized extractors (pymupdf4llm, markdownify) remain as fallbacks
when markitdown is not installed.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduces a new `markitect/proxy/` module with pluggable extractors that
convert non-markdown sources (PDF, HTML) into tracked markdown proxy files.
Proxy files preserve origin metadata (path, checksum, timestamp) so they
can be kept in sync when the original changes.
CLI commands: `proxy create`, `proxy update`, `proxy status`, `proxy extractors`.
Built-in extractors: PDF (pymupdf4llm), HTML (markdownify), Markdown (built-in).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`markitect version` now prints a clean version string (Unix style),
with -v for commit/branch/dirty. `markitect release` shows detailed
development status: commits since tag, local changes, upstream
divergence. No overlap between the two commands.
Replaces get_version_info()/get_release_info() with get_version()
and get_release_status(). Drops yaml output format from release
(json + text sufficient).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When running from a git repo, use setuptools-scm at runtime to derive
the version from tags. Falls back to the static _version.py only when
not in a git repo (e.g. installed from wheel). This ensures
`markitect version` stays correct without requiring `pip install -e .`
after every tag.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add _normalize_release_info() to ensure get_release_info() returns
keys expected by the CLI release command regardless of whether the
release-management capability is available.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add TOML-based config resolution with 7-level priority chain:
CLI flags > env var > user preference > directory preference >
directory default > user default > hardcoded fallback.
New commands: llm-default (view/set/clear defaults), llm-preference
(view/set/clear preferences). Each shows only its own scope. llm-check
now displays source attribution for resolved provider/model.
Existing commands (llm-helper, llm-check) refactored to use
resolve_llm() instead of manual resolution. Hardcoded fallback
changed from openrouter/aurora-alpha to gemini/gemini-2.5-flash
due to persistent OpenRouter 502 errors.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Register qwen/qwen3-coder-next under the openrouter provider and extend
llm-catalog with a "Known Models" column so all cataloged models are
discoverable.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Consistent llm-* naming scheme for all LLM CLI commands. llm-catalog shows
provider metadata and key status; llm-check sends a minimal prompt to verify
connectivity.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `markitect helper <QUESTION>` CLI command that answers questions
about markitect using its own documentation as LLM context. Uses
OpenRouter with openrouter/aurora-alpha by default; model is
configurable via --model flag or MARKITECT_HELPER_MODEL env var.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add OpenAIAdapter for the OpenAI chat completions API (apikey-chatgpt.txt
or OPENAI_API_KEY). Set default model to arcee-ai/trinity-large-preview:free
for the infospace pipeline and increase max_tokens from 4096 to 8192.
Reprocess chapter 05 with Trinity Large (was Gemini: 1 truncated entity,
now 19 complete entities). Process chapters 06 (Aurora Alpha, 10 entities)
and 07 (Trinity Large, 15 entities including regenerated violent-policy.md).
Canonical set now at 85 unique entities.
Add entity archive policy: entities are never silently deleted. Retired
entities move to output/entities/archive/ with a dated reason header.
New CLI option: --archive-entity <slug> --reason "...". The --list
output shows the archive count alongside the canonical set.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add GeminiAdapter calling Google's Generative Language REST API
(default model: gemini-2.5-flash). Register "gemini" as third
provider in the factory and CLI. Add rate-limit retry with
exponential backoff to the pipeline's _call_llm helper. Increase
default max_tokens from 2000 to 4096.
Process book-1-chapter-05 via Gemini free tier — 1 new entity
extracted (necessaries-conveniencies-and-amusements-of-life),
41 existing entities correctly skipped by dedup. Canonical set
now at 42 unique entities.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Restructure entity storage from per-chapter subdirectories to a flat
canonical set in output/entities/. Each entity exists as a single file;
duplicates across chapters are detected by slug collision and skipped
(first occurrence wins). Chapter views use {{ include }} transclusion
to reference shared entity files.
Add @{existing_entities} macro to extract-entities template so the LLM
knows which entities already exist and focuses on genuinely new ones.
Refactor _call_llm() from _execute_llm() for callers that handle their
own file I/O. 41 unique entities from 4 chapters (2 duplicates removed).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ContentMacro: add __post_init__ to auto-derive raw_text when built
programmatically, preventing str.replace("", X) corruption
- MacroParser: add @{target} shorthand syntax support mapped to REQUIRED kind,
updating parse, has_macros, count_macros, and find_macro_positions
- Artifact: store content in model and SQLite DB, replace resolver placeholder
with actual artifact content, add migration for existing databases
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive walkthrough covering schema design, prompt templates,
artifact population, pipeline usage, LLM integration, git history
tracking, metrics, and how to complete the remaining 31 chapters.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All 3 stages (entities, mappings, analysis) auto-generated.
1m53s wall time, 9,478 tokens (real), ~$0.07 est. cost.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Auto-generated mappings and analysis via Claude Code CLI adapter.
Entities were already present from a previous session.
Stats: 5m04s wall time, ~51K estimated tokens, ~$0.35 estimated cost.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Implements markitect/llm/ package with concrete LLMAdapter implementations:
- OpenRouterAdapter: HTTP via urllib with retry/backoff on 429/5xx
- ClaudeCodeAdapter: subprocess-based Claude CLI with stdin piping
- Factory pattern: create_adapter("openrouter") or create_adapter("claude-code")
- API key resolution chain: constructor > env var > project-root key file
- 42 unit tests, 2 integration tests (gated on API key / CLI availability)
Also adds the infospace-with-history example with Wealth of Nations VSM
analysis pipeline, templates, schemas, source chapters, and processed
output for chapters 1-2. process_chapters.py now supports --provider
and --model flags for automatic LLM-driven processing.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This example demonstrates the full workflow of generating InfoTech primers
using MarkiTect's Prompt Dependency Resolution infrastructure.
Features demonstrated:
- Artifact creation and storage with content-based addressing
- PromptTemplate with @{macro} resolution across multiple spaces
- Automatic dependency tracking and graph construction
- Provenance tracing from outputs back to inputs
- Visualization export (Mermaid format)
- Incremental execution with change detection
Files added:
- generate_primers.py: Complete working example
- README.md: Quick start guide and architecture overview
- TUTORIAL.md: Comprehensive 500+ line tutorial
- templates/generate-primer.md: Template with macros
- artifacts/topics/: ETL and Microservices topic definitions
- artifacts/guidelines/: Authoring rules and research protocol
- prepdr/: Original manual system (preserved for reference)
Example output:
- Generates 2 primers (ETL, Microservices)
- Creates 8 artifacts across 4 information spaces
- Records 8 dependency edges in SQLite database
- Exports dependency graph visualization
Run with: cd examples/content-generator && python generate_primers.py
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add quality gate framework with schema validation (JSON Schema via
jsonschema library), pattern validation (regex-based), multi-gate
QualityValidator with SQLite persistence, HaltingPolicyEngine with
budget/iteration/improvement checks, and RefinementLoop for iterative
execute-validate-halt cycles.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add directed dependency graph with cycle detection, topological sort,
and query service for finding dependents/dependencies transitively.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Create detailed 26-week workplan for Prompt Dependency Resolution system
implementing all 11 functional requirements across 8 phases:
- Phase 1-2: Foundation (artifacts, templates, macros)
- Phase 3-4: Resolution and execution engine with idempotent runs
- Phase 5-6: Dependency tracking and incremental recomputation
- Phase 7-8: Quality validation and observability/traceability
Includes database schemas, verification strategies, risk management,
and complete file structure for ~60 new modules.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Provides high-level overview of MarkiTect from value and functional perspective:
- What MarkiTect is and why it matters
- Core capabilities (Information Spaces, Schema Management, etc.)
- Practical use cases across different domains
- Key benefits for different user types
- Getting started guidance
- Philosophy and design principles
Focuses on user value and functionality without implementation details.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implements optional git-based version control for information spaces:
- HistoryConfig model for configuring history tracking
- Commit, Branch, HistoryEntry, DiffResult models
- IHistoryBackend and IHistoryQuery interfaces
- GitHistoryBackend using git CLI for version control
- GitHistoryEventHandler for event-driven auto-commits
- HistoryEventCoordinator for managing space history
- HistoryQueryService for high-level history queries
- Automatic commits on DOCUMENT_ADDED/REMOVED/CONTENT_CHANGED events
- Support for:
* Commit log with pagination and filtering
* Diff between versions
* File content at specific versions
* Branch creation and switching
* Version restoration
* Uncommitted changes detection
- 43 comprehensive unit tests with git availability checks
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements space composition and inheritance features:
- SpaceReference model for space-to-space references (includes, extends, links_to, composed_of)
- Variable inheritance through parent chain with local override
- Config inheritance with source tracking
- Access control models (SpacePermission, SpaceRole, AccessLevel)
- InheritanceResolver for walking parent chains
- AccessControlService for permission management
- ComposableSpaceService integrating all composability features
- Circular reference detection for EXTENDS references
- SQLite repositories for references and permissions
- 57 comprehensive unit tests
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Implements API layer for Information Spaces:
- GraphQL schema types for spaces, documents, variables
- GraphQL queries and mutations for space operations
- CLI command group with all space management commands
- Resolver functions connecting GraphQL to SpaceService
- 38 unit tests for API components
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>