Phase 0 - Project Organization:
- Create docs/PROJECT_STRUCTURE.md documenting codebase layout
- Create markitect/core/ with parser, serializer, document_manager, workspace
- Create markitect/schema/ consolidating 6 schema_*.py modules
- Create markitect/storage/ with database module
- Maintain backward compatibility via re-exports from original locations
- Add docs/roadmap/information-space-service/ with README and WORKPLAN

Phase 1 - Foundation (Weeks 1-3):
- Week 1: Core domain models (InformationSpace, SpaceDocument, SpaceConfig, SpaceMetadata, SpaceVariable, TransclusionReference, SpaceStatus)
- Week 2: Repository layer with interfaces (ISpaceRepository, IDocumentAssociationRepository, IVariableRepository, IReferenceRepository) and SQLite implementations with foreign key cascade deletes
- Week 3: SpaceService orchestration layer with full CRUD, document, variable, and reference tracking operations

Test coverage: 124 tests (25 model + 63 repository + 36 integration)

Capabilities delivered:
- CAP-001: InformationSpace entity with lifecycle management
- CAP-002: SpaceRepository CRUD with SQLite backing
- CAP-003: Document-Space associations with path-based organization
- CAP-004: Space metadata and configuration schemas
- CAP-005: Database schema with migrations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

# MarkiTect Project Structure

This document describes the current project layout, architectural decisions, and the reorganization plan for the Information Space Service evolution.

## Overview

MarkiTect is a markdown processing toolkit with transclusion, schema validation, asset management, and multi-format output capabilities. The project follows a hybrid layout that is being incrementally consolidated.

## Current Directory Structure

```
markitect_project/
├── markitect/                      # Main package
│   ├── [34 root-level .py files]   # Core functionality (see below)
│   ├── assets/                     # Asset discovery, management, caching (21 files)
│   ├── finance/                    # Cost tracking, work time management (9 files)
│   ├── plugins/                    # Plugin system with base classes (7 files)
│   ├── packaging/                  # Asset packaging, MDZ variants (7 files)
│   ├── production/                 # Deployment validation, benchmarks (6 files)
│   ├── legacy/                     # Legacy compatibility layer (8 files)
│   ├── explode_variants/           # Document expansion, variants (9 files)
│   ├── query_paradigms/            # Query paradigm implementations (4 files)
│   ├── validators/                 # Content/link/section validation (4 files)
│   ├── matter_frontmatter/         # Front matter parsing (4 files)
│   ├── matter_contentmatter/       # Content matter parsing (4 files)
│   ├── matter_tailmatter/          # Tail matter parsing (4 files)
│   ├── profile/                    # User profile management (4 files)
│   ├── graphql/                    # GraphQL query implementation (4 files)
│   ├── template/                   # Template management (3 files)
│   ├── themes/                     # Theme system with subdirectories (1 file)
│   └── schemas/                    # Built-in schema definitions (9 files)
├── application/                    # Application layer services
├── domain/                         # Domain models
├── infrastructure/                 # Infrastructure implementations
├── tests/                          # Test suite (90+ test files)
│   ├── unit/                       # Unit tests
│   ├── integration/                # Integration tests
│   ├── e2e/                        # End-to-end tests
│   └── fixtures/                   # Test data
├── docs/                           # Documentation (12+ subdirectories)
├── src/                            # JavaScript/frontend components
└── roadmap/                        # Project roadmap
```

## Root-Level Modules (/markitect/)

The 34 root-level Python files are organized by function:

### Core Infrastructure
| File | Lines | Purpose |
|------|-------|---------|
| `parser.py` | ~50 | Markdown AST parsing using markdown-it |
| `serializer.py` | ~360 | AST serialization back to Markdown |
| `document_manager.py` | ~100 | Wrapper around CleanDocumentManager |
| `clean_document_manager.py` | ~2000 | Clean document management implementation |
| `workspace.py` | ~200 | Workspace management |
| `database.py` | ~400 | SQLite database management |

### Schema Management (6 files, 99 KB total)
| File | Lines | Purpose |
|------|-------|---------|
| `schema_generator.py` | ~600 | JSON schema generation from markdown AST |
| `schema_analyzer.py` | ~450 | Schema rigidity analysis with phase classification |
| `schema_loader.py` | ~600 | Schema loading from markdown with frontmatter |
| `schema_refiner.py` | ~600 | Automatic schema refinement using loosening rules |
| `schema_validator.py` | ~900 | Comprehensive schema validation |
| `schema_naming.py` | ~300 | Schema naming convention enforcement |

### Configuration & Services
| File | Purpose |
|------|---------|
| `config_manager.py` | Configuration file management |
| `frontmatter.py` | YAML frontmatter parsing |
| `exceptions.py` | Custom exception classes |
| `ast_service.py` | AST service layer |
| `cache_service.py` | Caching functionality |
| `ast_cache.py` | AST caching implementation |
| `performance_tracker.py` | Performance metrics |

### Validation & Analysis
| File | Purpose |
|------|---------|
| `semantic_validator.py` | Semantic validation layer |
| `validation_error.py` | Validation error handling |
| `metaschema.py` | Metaschema validation for custom extensions |

### CLI & Commands
| File | Purpose |
|------|---------|
| `cli.py` | Main CLI interface (274 KB, comprehensive) |
| `cli_utils.py` | CLI utilities |
| `asset_commands.py` | Asset-related CLI commands |
| `draft_generator.py` | Draft generation functionality |

### Utilities
| File | Purpose |
|------|---------|
| `batch_processor.py` | Batch processing operations |
| `associated_files.py` | Associated file tracking |
| `legacy_compat.py` | Legacy compatibility layer |
| `legacy_integration_example.py` | Integration examples |
| `_version.py`, `__version__.py` | Version management |

## Subpackages

### assets/ (21 files)
Complete asset management system including discovery, analytics, caching, deduplication, and packaging. Key files:
- `repository.py` - Asset repository pattern
- `discovery.py` - Asset discovery algorithms
- `cache.py` - Asset caching layer
- `analytics.py` - Asset usage analytics

### finance/ (9 files)
Cost tracking and work time management:
- `models.py` - Financial data models
- `cost_tracker.py` - Cost tracking implementation
- `period_tracker.py` - Period-based tracking
- `report_generator.py` - Financial reports

### plugins/ (7 files)
Extensible plugin system:
- `base.py` - Plugin base classes and types
- `registry.py` - Plugin registry
- `builtin/` - Built-in plugin implementations

### packaging/ (7 files)
Asset packaging and MDZ format support:
- `mdz_packager.py` - MDZ package creation
- `transclusion.py` - Transclusion handling
- `variant_factory.py` - Variant generation

### production/ (6 files)
Deployment and production validation:
- `deployment_validator.py` - Deployment checks
- `performance_benchmark.py` - Performance testing
- `cross_platform_validator.py` - Platform compatibility

### legacy/ (8 files)
Backward compatibility layer:
- `compatibility.py` - Compatibility wrappers
- `deprecation.py` - Deprecation warnings
- `git_tracker.py` - Git integration (useful for Phase 8)

## Test Structure

```
tests/
├── conftest.py                 # Shared pytest configuration
├── fixtures/                   # Test data files
│   ├── content_test_files/
│   ├── contentmatter_test_files/
│   ├── frontmatter_test_files/
│   └── tailmatter_test_files/
├── unit/                       # Unit tests by domain
│   ├── application/
│   └── infrastructure/
├── integration/                # Integration tests
│   └── repositories/
└── e2e/                        # End-to-end tests
    ├── cli/
    └── performance/
```

---

## Planned Reorganization

### Motivation

The current layout has grown organically, resulting in:
1. **34 files at root level** - Too many modules at package root
2. **No clear grouping** - Schema tools, core infrastructure, and utilities mixed
3. **Hybrid architecture** - Mix of root packages and monolithic /markitect/

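The root-level count can be re-checked as files move out of the package root. The helper below is a minimal sketch rather than part of MarkiTect: the function names `root_modules` and `group_by_prefix` are hypothetical, and the sketch excludes `__init__.py` because this document does not say whether it counts toward the 34.

```python
from collections import defaultdict
from pathlib import Path


def root_modules(package_dir):
    """Return the sorted names of .py files directly inside package_dir."""
    return sorted(
        p.name
        for p in Path(package_dir).iterdir()
        if p.is_file() and p.suffix == ".py" and p.name != "__init__.py"
    )


def group_by_prefix(module_names):
    """Group file names by the text before the first underscore.

    Leading underscores are ignored, so _version.py and __version__.py
    both land in the "version" group.
    """
    groups = defaultdict(list)
    for name in module_names:
        stem = name[:-3]  # drop the ".py" suffix
        prefix = stem.lstrip("_").split("_", 1)[0]
        groups[prefix].append(name)
    return dict(groups)
```

Calling `group_by_prefix(root_modules("markitect"))` from the project root would put the six `schema_*.py` files in one `schema` group, which is the cluster the target structure moves into `schema/`.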
### Target Structure

After reorganization, the /markitect/ package will have clearer structure:

```
markitect/
├── core/                       # Core infrastructure (NEW)
│   ├── __init__.py
│   ├── parser.py               # (from markitect/)
│   ├── serializer.py           # (from markitect/)
│   ├── document_manager.py     # (from markitect/)
│   └── workspace.py            # (from markitect/)
├── schema/                     # Schema management (NEW)
│   ├── __init__.py
│   ├── validator.py            # (from schema_validator.py)
│   ├── generator.py            # (from schema_generator.py)
│   ├── loader.py               # (from schema_loader.py)
│   ├── analyzer.py             # (from schema_analyzer.py)
│   ├── refiner.py              # (from schema_refiner.py)
│   └── naming.py               # (from schema_naming.py)
├── storage/                    # Storage concerns (NEW)
│   ├── __init__.py
│   ├── database.py             # (from markitect/)
│   └── cache.py                # (consolidated)
├── spaces/                     # Information spaces (Phase 1+)
│   ├── models.py
│   ├── events/
│   ├── repositories/
│   ├── transclusion/
│   ├── rendering/
│   ├── sync/
│   └── services/
└── [existing subpackages]      # assets/, plugins/, etc.
```

### Backward Compatibility

Original import paths will continue to work through re-exports:

```python
# Old import (still works)
from markitect.parser import parse_markdown

# New import (preferred)
from markitect.core.parser import parse_markdown
```

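One way to keep an old path working while also warning about it is a module-level `__getattr__` (PEP 562, available on every Python version this project supports). The sketch below is an assumption about how such a shim could look, not the project's actual implementation. It uses stand-in modules named `markitect_demo.*` so it runs without MarkiTect installed, and its `parse_markdown` body is a placeholder.

```python
import sys
import types
import warnings

# Stand-in for the relocated module (markitect/core/parser.py in the real
# package). The module name and the function body are illustrative only.
new_module = types.ModuleType("markitect_demo.core.parser")


def parse_markdown(text):
    """Placeholder parser: return the non-empty lines of the input."""
    return [line for line in text.splitlines() if line.strip()]


new_module.parse_markdown = parse_markdown

# Stand-in for the old location (markitect/parser.py), reduced to a shim.
old_module = types.ModuleType("markitect_demo.parser")


def _shim_getattr(name):
    """Module-level __getattr__: forward the lookup and emit a warning."""
    try:
        value = getattr(new_module, name)
    except AttributeError:
        raise AttributeError(
            f"module {old_module.__name__!r} has no attribute {name!r}"
        ) from None
    warnings.warn(
        f"{old_module.__name__}.{name} has moved to {new_module.__name__}.{name}",
        DeprecationWarning,
        stacklevel=2,
    )
    return value


old_module.__getattr__ = _shim_getattr

# Register the stand-ins so ordinary import statements can find them.
sys.modules["markitect_demo"] = types.ModuleType("markitect_demo")
sys.modules[new_module.__name__] = new_module
sys.modules[old_module.__name__] = old_module

# The old path still resolves, and returns the very same function object.
from markitect_demo.parser import parse_markdown as old_parse_markdown
```

In a real shim file the same effect comes from defining `def __getattr__(name): ...` at the top level of the old module. Because the lookup is forwarded lazily, callers that never touch the old path see no warning.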
### Migration Strategy

1. Create new subpackages with copied content
2. Update internal imports to new paths
3. Add deprecation warnings to old paths
4. Re-export from original locations for compatibility
5. Verify all tests pass
6. Update documentation

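Step 5 can be supplemented with a direct check that each old path exposes the same objects as its new home. The sketch below is hypothetical tooling, not something the project ships. The `RELOCATED_MODULES` mapping is read off the target structure tree, and the helper names `public_names` and `reexport_problems` are made up for illustration. A module that imports other public names at its top level would have those flagged too, so treat the output as a list of things to review. Inside the real package, running `reexport_problems(old, new)` over `RELOCATED_MODULES.items()` should return an empty list for every pair once the re-exports are complete.

```python
import importlib

# Old -> new module paths, taken from the target structure tree.
# These names only resolve inside the real markitect package.
RELOCATED_MODULES = {
    "markitect.parser": "markitect.core.parser",
    "markitect.serializer": "markitect.core.serializer",
    "markitect.document_manager": "markitect.core.document_manager",
    "markitect.workspace": "markitect.core.workspace",
    "markitect.schema_validator": "markitect.schema.validator",
    "markitect.schema_generator": "markitect.schema.generator",
    "markitect.schema_loader": "markitect.schema.loader",
    "markitect.schema_analyzer": "markitect.schema.analyzer",
    "markitect.schema_refiner": "markitect.schema.refiner",
    "markitect.schema_naming": "markitect.schema.naming",
    "markitect.database": "markitect.storage.database",
}


def public_names(module):
    """Names a module exports: __all__ if defined, else non-underscore names."""
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in vars(module) if not n.startswith("_")]
    return set(names)


def reexport_problems(old_path, new_path):
    """List public names of new_path that old_path lacks or binds differently."""
    old_mod = importlib.import_module(old_path)
    new_mod = importlib.import_module(new_path)
    problems = []
    for name in sorted(public_names(new_mod)):
        if not hasattr(old_mod, name):
            problems.append(f"{old_path} is missing {name}")
        elif getattr(old_mod, name) is not getattr(new_mod, name):
            problems.append(f"{old_path}.{name} is a different object")
    return problems
```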
---

## Information Space Service Architecture

The reorganization prepares for the Information Space Service evolution, which adds:

### Phase 1-3: Foundation
- `InformationSpace` entity with lifecycle management
- `SpaceRepository` for persistence
- Event system for change tracking
- Persistent transclusion context

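To make the entity and repository pairing concrete, here is a minimal sketch of an `InformationSpace` model persisted through a SQLite-backed repository. Everything beyond the names `InformationSpace` and `SpaceStatus` is an assumption made for illustration: the field names, the status values, the table layout, and the class name `SqliteSpaceRepository` are not taken from the project.

```python
import sqlite3
from dataclasses import dataclass
from enum import Enum


class SpaceStatus(Enum):
    # Illustrative values; the real enum may define different states.
    ACTIVE = "active"
    ARCHIVED = "archived"


@dataclass
class InformationSpace:
    # Field names are assumptions made for this sketch.
    space_id: str
    name: str
    status: SpaceStatus = SpaceStatus.ACTIVE


# Hypothetical schema: documents belong to a space and are removed with it.
SCHEMA = """
CREATE TABLE spaces (
    space_id TEXT PRIMARY KEY,
    name     TEXT NOT NULL,
    status   TEXT NOT NULL
);
CREATE TABLE space_documents (
    space_id TEXT NOT NULL REFERENCES spaces(space_id) ON DELETE CASCADE,
    path     TEXT NOT NULL,
    PRIMARY KEY (space_id, path)
);
"""


class SqliteSpaceRepository:
    """Minimal repository: add, load, and delete spaces and their documents."""

    def __init__(self, conn):
        self.conn = conn
        # SQLite enforces foreign keys (and therefore cascades) only when
        # this pragma is switched on for the connection.
        self.conn.execute("PRAGMA foreign_keys = ON")
        self.conn.executescript(SCHEMA)

    def add(self, space):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT INTO spaces VALUES (?, ?, ?)",
                (space.space_id, space.name, space.status.value),
            )

    def add_document(self, space_id, path):
        with self.conn:
            self.conn.execute(
                "INSERT INTO space_documents VALUES (?, ?)", (space_id, path)
            )

    def get(self, space_id):
        row = self.conn.execute(
            "SELECT space_id, name, status FROM spaces WHERE space_id = ?",
            (space_id,),
        ).fetchone()
        if row is None:
            return None
        return InformationSpace(row[0], row[1], SpaceStatus(row[2]))

    def document_paths(self, space_id):
        rows = self.conn.execute(
            "SELECT path FROM space_documents WHERE space_id = ? ORDER BY path",
            (space_id,),
        )
        return [r[0] for r in rows]

    def delete(self, space_id):
        with self.conn:
            self.conn.execute("DELETE FROM spaces WHERE space_id = ?", (space_id,))
```

With `repo = SqliteSpaceRepository(sqlite3.connect(":memory:"))`, deleting a space also removes its document rows. The sketch uses a plain `INSERT` in `add` on purpose: `INSERT OR REPLACE` deletes the existing row before inserting, which would trigger the cascade and drop the space's documents.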
### Phase 4-5: Modes
- HTML rendering mode with caching
- Directory mode with bidirectional sync

### Phase 6-7: API & Composability
- GraphQL schema extensions
- CLI commands for space management
- Space references and inheritance

### Phase 8: Git History (Optional)
- Git-based version control for spaces
- Event-driven commits
- Version navigation

See [docs/roadmap/information-space-service/](./roadmap/information-space-service/) for the complete workplan.

---

## Key Dependencies

From `pyproject.toml`:
- Python >=3.8 (tested on 3.12)
- markdown-it-py - Markdown parsing
- PyYAML - YAML/frontmatter handling
- click - CLI framework
- tabulate - Table formatting
- jsonpath-ng - JSON path queries
- aiohttp - Async HTTP

## Version Information

- Current version is managed in `_version.py` and `__version__.py`
- Follows semantic versioning
- CHANGELOG.md tracks all changes

---

## Related Documentation

- [CLI Tutorial](CLI_TUTORIAL.md) - CLI usage guide
- [Plugin System](PLUGIN_SYSTEM.md) - Plugin architecture
- [Schema Management Guide](SCHEMA_MANAGEMENT_GUIDE.md) - Schema workflows
- [Asset Management Guide](ASSET_MANAGEMENT_USER_GUIDE.md) - Asset system
- [Error Handling Strategy](ERROR_HANDLING_STRATEGY.md) - Error patterns