MarkiTect Project Structure
This document describes the current project layout, architectural decisions, and the reorganization plan for the Information Space Service evolution.
Overview
MarkiTect is a markdown processing toolkit with transclusion, schema validation, asset management, and multi-format output capabilities. The project follows a hybrid layout that is being incrementally consolidated.
Current Directory Structure
markitect-main/
├── markitect/ # Main package
│ ├── [34 root-level .py files] # Core functionality (see below)
│ ├── assets/ # Asset discovery, management, caching (21 files)
│ ├── finance/ # Cost tracking, work time management (9 files)
│ ├── plugins/ # Plugin system with base classes (7 files)
│ ├── packaging/ # Asset packaging, MDZ variants (7 files)
│ ├── production/ # Deployment validation, benchmarks (6 files)
│ ├── legacy/ # Legacy compatibility layer (8 files)
│ ├── explode_variants/ # Document expansion, variants (9 files)
│ ├── query_paradigms/ # Query paradigm implementations (4 files)
│ ├── validators/ # Content/link/section validation (4 files)
│ ├── matter_frontmatter/ # Front matter parsing (4 files)
│ ├── matter_contentmatter/ # Content matter parsing (4 files)
│ ├── matter_tailmatter/ # Tail matter parsing (4 files)
│ ├── profile/ # User profile management (4 files)
│ ├── graphql/ # GraphQL query implementation (4 files)
│ ├── template/ # Template management (3 files)
│ ├── themes/ # Theme system with subdirectories (1 file)
│ └── schemas/ # Built-in schema definitions (9 files)
├── application/ # Application layer services
├── domain/ # Domain models
├── infrastructure/ # Infrastructure implementations
├── tests/ # Test suite (90+ test files)
│ ├── unit/ # Unit tests
│ ├── integration/ # Integration tests
│ ├── e2e/ # End-to-end tests
│ └── fixtures/ # Test data
├── docs/ # Documentation (12+ subdirectories)
├── src/ # JavaScript/frontend components
└── roadmap/ # Project roadmap
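The per-directory file counts in the tree above tend to drift as modules are added or moved. A minimal sketch for re-deriving them is shown below. It assumes the counts refer to `.py` files directly inside each directory (not nested ones), which is a guess based on the `themes/` entry; the helper name `count_py_files` is hypothetical and not part of MarkiTect.

```python
from pathlib import Path
from typing import Dict


def count_py_files(package_root: Path) -> Dict[str, int]:
    """Count .py files at the package root and in each immediate subdirectory.

    Assumption: only files directly inside a directory are counted,
    so nested subdirectories do not contribute to their parent's total.
    """
    counts = {".": sum(1 for _ in package_root.glob("*.py"))}
    for sub in sorted(p for p in package_root.iterdir() if p.is_dir()):
        if sub.name.startswith("__"):  # skip __pycache__ and similar
            continue
        counts[sub.name] = sum(1 for _ in sub.glob("*.py"))
    return counts


if __name__ == "__main__":
    # Run from the repository root; "markitect" is the main package directory.
    for name, total in count_py_files(Path("markitect")).items():
        print(f"{name}: {total}")
```

Comparing this output against the tree is a quick way to spot stale numbers before updating the document.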
Root-Level Modules (/markitect/)
The 34 root-level Python files are organized by function:
Core Infrastructure
| File | Lines | Purpose |
|---|---|---|
| `parser.py` | ~50 | Markdown AST parsing using markdown-it |
| `serializer.py` | ~360 | AST serialization back to Markdown |
| `document_manager.py` | ~100 | Wrapper around CleanDocumentManager |
| `clean_document_manager.py` | ~2000 | Clean document management implementation |
| `workspace.py` | ~200 | Workspace management |
| `database.py` | ~400 | SQLite database management |
Schema Management (6 files, 99KB total)
| File | Lines | Purpose |
|---|---|---|
| `schema_generator.py` | ~600 | JSON schema generation from markdown AST |
| `schema_analyzer.py` | ~450 | Schema rigidity analysis with phase classification |
| `schema_loader.py` | ~600 | Schema loading from markdown with frontmatter |
| `schema_refiner.py` | ~600 | Automatic schema refinement using loosening rules |
| `schema_validator.py` | ~900 | Comprehensive schema validation |
| `schema_naming.py` | ~300 | Schema naming convention enforcement |
Configuration & Services
| File | Purpose |
|---|---|
| `config_manager.py` | Configuration file management |
| `frontmatter.py` | YAML frontmatter parsing |
| `exceptions.py` | Custom exception classes |
| `ast_service.py` | AST service layer |
| `cache_service.py` | Caching functionality |
| `ast_cache.py` | AST caching implementation |
| `performance_tracker.py` | Performance metrics |
Validation & Analysis
| File | Purpose |
|---|---|
| `semantic_validator.py` | Semantic validation layer |
| `validation_error.py` | Validation error handling |
| `metaschema.py` | Metaschema validation for custom extensions |
CLI & Commands
| File | Purpose |
|---|---|
| `cli.py` | Main CLI interface (274KB, comprehensive) |
| `cli_utils.py` | CLI utilities |
| `asset_commands.py` | Asset-related CLI commands |
| `draft_generator.py` | Draft generation functionality |
draft_generator.py |
Draft generation functionality |
Utilities
| File | Purpose |
|---|---|
| `batch_processor.py` | Batch processing operations |
| `associated_files.py` | Associated file tracking |
| `legacy_compat.py` | Legacy compatibility layer |
| `legacy_integration_example.py` | Integration examples |
| `_version.py`, `__version__.py` | Version management |
Subpackages
assets/ (21 files)
Complete asset management system including discovery, analytics, caching, deduplication, and packaging. Key files:
- `repository.py` - Asset repository pattern
- `discovery.py` - Asset discovery algorithms
- `cache.py` - Asset caching layer
- `analytics.py` - Asset usage analytics
finance/ (9 files)
Cost tracking and work time management:
- `models.py` - Financial data models
- `cost_tracker.py` - Cost tracking implementation
- `period_tracker.py` - Period-based tracking
- `report_generator.py` - Financial reports
plugins/ (7 files)
Extensible plugin system:
- `base.py` - Plugin base classes and types
- `registry.py` - Plugin registry
- `builtin/` - Built-in plugin implementations
packaging/ (7 files)
Asset packaging and MDZ format support:
- `mdz_packager.py` - MDZ package creation
- `transclusion.py` - Transclusion handling
- `variant_factory.py` - Variant generation
production/ (6 files)
Deployment and production validation:
- `deployment_validator.py` - Deployment checks
- `performance_benchmark.py` - Performance testing
- `cross_platform_validator.py` - Platform compatibility
legacy/ (8 files)
Backward compatibility layer:
- `compatibility.py` - Compatibility wrappers
- `deprecation.py` - Deprecation warnings
- `git_tracker.py` - Git integration (useful for Phase 8)
Test Structure
tests/
├── conftest.py # Shared pytest configuration
├── fixtures/ # Test data files
│ ├── content_test_files/
│ ├── contentmatter_test_files/
│ ├── frontmatter_test_files/
│ └── tailmatter_test_files/
├── unit/ # Unit tests by domain
│ ├── application/
│ └── infrastructure/
├── integration/ # Integration tests
│ └── repositories/
└── e2e/ # End-to-end tests
  ├── cli/
  └── performance/
Planned Reorganization
Motivation
The current layout has grown organically, resulting in:
- 34 files at root level - Too many modules at package root
- No clear grouping - Schema tools, core infrastructure, and utilities mixed
- Hybrid architecture - Mix of root packages and monolithic /markitect/
Target Structure
After reorganization, the /markitect/ package will have a clearer structure:
markitect/
├── core/ # Core infrastructure (NEW)
│ ├── __init__.py
│ ├── parser.py # (from markitect/)
│ ├── serializer.py # (from markitect/)
│ ├── document_manager.py # (from markitect/)
│ └── workspace.py # (from markitect/)
├── schema/ # Schema management (NEW)
│ ├── __init__.py
│ ├── validator.py # (from schema_validator.py)
│ ├── generator.py # (from schema_generator.py)
│ ├── loader.py # (from schema_loader.py)
│ ├── analyzer.py # (from schema_analyzer.py)
│ ├── refiner.py # (from schema_refiner.py)
│ └── naming.py # (from schema_naming.py)
├── storage/ # Storage concerns (NEW)
│ ├── __init__.py
│ ├── database.py # (from markitect/)
│ └── cache.py # (consolidated)
├── spaces/ # Information spaces (Phase 1+)
│ ├── models.py
│ ├── events/
│ ├── repositories/
│ ├── transclusion/
│ ├── rendering/
│ ├── sync/
│ └── services/
└── [existing subpackages] # assets/, plugins/, etc.
Backward Compatibility
Original import paths will continue to work through re-exports:
# Old import (still works)
from markitect.parser import parse_markdown
# New import (preferred)
from markitect.core.parser import parse_markdown
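One way the old import path could keep working while steering callers to the new one is a thin shim module that forwards attribute lookups and emits a `DeprecationWarning`. The sketch below demonstrates the mechanism with two in-memory stand-in modules so it runs on its own. The module names `core_parser_demo` and `legacy_parser_demo` and the placeholder `parse_markdown` body are illustrative assumptions, not MarkiTect's actual code; in the real package the forwarding function would live in `markitect/parser.py`.

```python
import sys
import types
import warnings

NEW_NAME = "core_parser_demo"    # stands in for markitect.core.parser
OLD_NAME = "legacy_parser_demo"  # stands in for markitect.parser


def parse_markdown(text):
    """Placeholder parser (assumption): returns the non-blank lines."""
    return [line for line in text.splitlines() if line.strip()]


# Register the "new" module that owns the implementation.
new_module = types.ModuleType(NEW_NAME)
new_module.parse_markdown = parse_markdown
sys.modules[NEW_NAME] = new_module


def _deprecated_getattr(name):
    """Module-level __getattr__ (PEP 562): forward lookups to the new module."""
    target = sys.modules[NEW_NAME]
    if not hasattr(target, name):
        raise AttributeError(f"module {OLD_NAME!r} has no attribute {name!r}")
    warnings.warn(
        f"{OLD_NAME}.{name} is deprecated; import it from {NEW_NAME} instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return getattr(target, name)


# Register the "old" module as an empty shim that only forwards.
old_module = types.ModuleType(OLD_NAME)
old_module.__getattr__ = _deprecated_getattr
sys.modules[OLD_NAME] = old_module

# Old-style import still resolves, but records a deprecation warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    from legacy_parser_demo import parse_markdown as old_parse

print(old_parse is parse_markdown)
print([str(w.message) for w in caught])
```

Forwarding through `__getattr__` rather than a plain `from ... import *` means the warning fires only when an old path is actually used, which keeps the new import path silent.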
Migration Strategy
- Create new subpackages with copied content
- Update internal imports to new paths
- Add deprecation warnings to old paths
- Re-export from original locations for compatibility
- Verify all tests pass
- Update documentation
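The "update internal imports" step above could be partly automated. The following is a minimal sketch, assuming the old-to-new module mapping implied by the target structure; the helper `rewrite_imports` and the `MODULE_MOVES` table are hypothetical names, and the regex only handles `import x.y` and `from x.y import z` at the start of a line (it does not catch `from markitect import parser`).

```python
import re

# Old module path -> new module path, taken from the target structure.
MODULE_MOVES = {
    "markitect.parser": "markitect.core.parser",
    "markitect.serializer": "markitect.core.serializer",
    "markitect.document_manager": "markitect.core.document_manager",
    "markitect.workspace": "markitect.core.workspace",
    "markitect.schema_validator": "markitect.schema.validator",
    "markitect.schema_generator": "markitect.schema.generator",
    "markitect.schema_loader": "markitect.schema.loader",
    "markitect.schema_analyzer": "markitect.schema.analyzer",
    "markitect.schema_refiner": "markitect.schema.refiner",
    "markitect.schema_naming": "markitect.schema.naming",
    "markitect.database": "markitect.storage.database",
}

# Matches the module path right after a leading "from" or "import".
_IMPORT_RE = re.compile(r"^([ \t]*(?:from|import)[ \t]+)([\w.]+)", re.MULTILINE)


def rewrite_imports(source: str) -> str:
    """Return source with moved module paths replaced by their new locations."""
    def _swap(match):
        prefix, module = match.group(1), match.group(2)
        # Unknown modules fall through unchanged.
        return prefix + MODULE_MOVES.get(module, module)

    return _IMPORT_RE.sub(_swap, source)


if __name__ == "__main__":
    sample = (
        "from markitect.parser import parse_markdown\n"
        "import markitect.database as db\n"
        "from markitect.cli import main\n"
    )
    print(rewrite_imports(sample))
```

A text-level rewrite like this is only a first pass; running the test suite afterwards (the "verify all tests pass" step) remains the real check.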
Information Space Service Architecture
The reorganization prepares for the Information Space Service evolution, which adds:
Phase 1-3: Foundation
- `InformationSpace` entity with lifecycle management
- `SpaceRepository` for persistence
- Event system for change tracking
- Persistent transclusion context
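To make the foundation phases more concrete, here is a rough sketch of how an entity with a lifecycle, a repository, and change events might fit together. Everything beyond the names `InformationSpace` and `SpaceRepository` is an assumption for illustration: the states, the `SpaceEvent` type, the method names, and the in-memory storage are hypothetical and do not describe the planned implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class SpaceState(Enum):
    """Hypothetical lifecycle states for a space."""
    DRAFT = "draft"
    ACTIVE = "active"
    ARCHIVED = "archived"


@dataclass
class SpaceEvent:
    """Hypothetical change-tracking record."""
    space_id: str
    kind: str


@dataclass
class InformationSpace:
    space_id: str
    name: str
    state: SpaceState = SpaceState.DRAFT
    events: List[SpaceEvent] = field(default_factory=list)

    def activate(self) -> None:
        """Move DRAFT -> ACTIVE and record the change as an event."""
        if self.state is not SpaceState.DRAFT:
            raise ValueError(f"cannot activate a space in state {self.state.value}")
        self.state = SpaceState.ACTIVE
        self.events.append(SpaceEvent(self.space_id, "activated"))

    def archive(self) -> None:
        """Move ACTIVE -> ARCHIVED and record the change as an event."""
        if self.state is not SpaceState.ACTIVE:
            raise ValueError(f"cannot archive a space in state {self.state.value}")
        self.state = SpaceState.ARCHIVED
        self.events.append(SpaceEvent(self.space_id, "archived"))


class InMemorySpaceRepository:
    """Stand-in for SpaceRepository; a real one would persist to storage."""

    def __init__(self) -> None:
        self._spaces: Dict[str, InformationSpace] = {}

    def save(self, space: InformationSpace) -> None:
        self._spaces[space.space_id] = space

    def get(self, space_id: str) -> Optional[InformationSpace]:
        return self._spaces.get(space_id)


if __name__ == "__main__":
    repo = InMemorySpaceRepository()
    space = InformationSpace(space_id="docs", name="Documentation")
    space.activate()
    repo.save(space)
    print(repo.get("docs"))
```

Keeping events on the entity and persistence behind a repository interface is one common arrangement; it lets the storage backend change without touching lifecycle logic.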
Phase 4-5: Modes
- HTML rendering mode with caching
- Directory mode with bidirectional sync
Phase 6-7: API & Composability
- GraphQL schema extensions
- CLI commands for space management
- Space references and inheritance
Phase 8: Git History (Optional)
- Git-based version control for spaces
- Event-driven commits
- Version navigation
See docs/roadmap/information-space-service/ for the complete workplan.
Key Dependencies
From pyproject.toml:
- Python >=3.8 (tested on 3.12)
- markdown-it-py - Markdown parsing
- PyYAML - YAML/frontmatter handling
- click - CLI framework
- tabulate - Table formatting
- jsonpath-ng - JSON path queries
- aiohttp - Async HTTP
Version Information
- Current version is managed in `_version.py` and `__version__.py`
- Follows semantic versioning
- CHANGELOG.md tracks all changes
Related Documentation
- CLI Tutorial - CLI usage guide
- Plugin System - Plugin architecture
- Schema Management Guide - Schema workflows
- Asset Management Guide - Asset system
- Error Handling Strategy - Error patterns