# Explode-Implode API Documentation
**Technical reference for MarkiTect's explode-implode variant system**
## Table of Contents
1. [Core Classes](#core-classes)
2. [Variant Types](#variant-types)
3. [Detection System](#detection-system)
4. [Packaging Integration](#packaging-integration)
5. [Error Handling](#error-handling)
6. [Advanced Usage](#advanced-usage)
---
## Core Classes
### ExplodeVariant
Base abstract class for all variant implementations.
```python
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict

# Import path for consumers:
#   from markitect.explode_implode.variants.base import ExplodeVariant
# Abridged class definition:
class ExplodeVariant(ABC):
    """Base class for document explosion variants."""

    @abstractmethod
    def explode(self, content: str, output_dir: Path,
                create_manifest: bool = True) -> Dict[str, Any]:
        """Explode document content into organized structure."""

    @abstractmethod
    def implode(self, input_dir: Path) -> str:
        """Reassemble exploded structure into single document."""

    @abstractmethod
    def detect_variant(self, directory: Path) -> bool:
        """Detect if directory follows this variant's structure."""
```
#### Methods
**`explode(content, output_dir, create_manifest=True)`**

- **Parameters:**
  - `content` (str): Source markdown content
  - `output_dir` (Path): Target directory for exploded files
  - `create_manifest` (bool): Generate manifest.md for reversibility
- **Returns:** Dict with explosion statistics and metadata
- **Raises:** `ExplodeError` on processing failures

**`implode(input_dir)`**

- **Parameters:**
  - `input_dir` (Path): Directory containing exploded structure
- **Returns:** Reassembled markdown content as string
- **Raises:** `ImplodeError` on assembly failures

**`detect_variant(directory)`**

- **Parameters:**
  - `directory` (Path): Directory to analyze
- **Returns:** `True` if the directory matches this variant's structure, otherwise `False`
- **Used by:** Auto-detection system during implode operations
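
The three-method contract above can be illustrated with a toy variant. This is a minimal sketch, not MarkiTect's implementation: `ExplodeVariantSketch` is a local stand-in for the real base class so the example runs without the package, and `HeadingPerFileVariant`, its `part_NNN.md` file naming, its manifest line, and its `section_count` key are all hypothetical.

```python
import re
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict


class ExplodeVariantSketch(ABC):
    """Local stand-in mirroring the three abstract methods documented above."""

    @abstractmethod
    def explode(self, content: str, output_dir: Path,
                create_manifest: bool = True) -> Dict[str, Any]: ...

    @abstractmethod
    def implode(self, input_dir: Path) -> str: ...

    @abstractmethod
    def detect_variant(self, directory: Path) -> bool: ...


class HeadingPerFileVariant(ExplodeVariantSketch):
    """Hypothetical variant: one numbered file per level-2 heading."""

    def explode(self, content, output_dir, create_manifest=True):
        output_dir.mkdir(parents=True, exist_ok=True)
        # Split in front of every line starting with "## "; any text before
        # the first such heading becomes the first part.
        parts = [p for p in re.split(r"(?m)^(?=## )", content) if p]
        for index, part in enumerate(parts):
            (output_dir / f"part_{index:03d}.md").write_text(part, encoding="utf-8")
        if create_manifest:
            (output_dir / "manifest.md").write_text(
                "explosion_type: heading-per-file\n", encoding="utf-8")
        return {"section_count": len(parts)}

    def implode(self, input_dir):
        # Zero-padded names sort lexicographically in document order.
        files = sorted(input_dir.glob("part_*.md"))
        return "".join(f.read_text(encoding="utf-8") for f in files)

    def detect_variant(self, directory):
        return any(directory.glob("part_*.md"))
```

Because the parts are written unmodified and concatenated in order, `implode()` returns exactly the string that was passed to `explode()`, which is the reversibility property the manifest is meant to support.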
### VariantDetector
Coordinates variant detection across all registered variants.
```python
from pathlib import Path

from markitect.explode_implode.detection import VariantDetector

detector = VariantDetector()
variant_type = detector.detect_variant(Path("exploded_dir/"))
```
#### Methods
**`detect_variant(directory)`**

- **Parameters:**
  - `directory` (Path): Directory to analyze
- **Returns:** String variant name ('flat', 'hierarchical', 'semantic', or the name of a registered custom variant)
- **Raises:** `VariantDetectionError` if no variant matches

**`register_variant(name, variant_class)`**

- **Parameters:**
  - `name` (str): Variant identifier
  - `variant_class` (subclass of `ExplodeVariant`): Variant implementation class
- **Purpose:** Register custom variants with detection system
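
To make the registry idea concrete, here is a minimal sketch of how registration and lookup could fit together. It is an assumption-laden illustration rather than the real `VariantDetector`: `DetectorSketch` is hypothetical, it stores plain detection callables instead of variant classes, and the locally defined `VariantDetectionError` only stands in for the documented exception.

```python
from pathlib import Path
from typing import Callable, Dict


class VariantDetectionError(Exception):
    """Stand-in for the documented exception of the same name."""


class DetectorSketch:
    """Hypothetical registry mapping a variant name to a detection callable."""

    def __init__(self) -> None:
        self._checks: Dict[str, Callable[[Path], bool]] = {}

    def register_variant(self, name: str, check: Callable[[Path], bool]) -> None:
        self._checks[name] = check

    def detect_variant(self, directory: Path) -> str:
        # Dicts preserve insertion order, so earlier registrations win ties.
        for name, check in self._checks.items():
            if check(directory):
                return name
        raise VariantDetectionError(f"no registered variant matches {directory}")
```

One consequence of this design is that registration order acts as detection priority, so more specific checks should be registered before permissive ones.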
## Variant Types
### FlatVariant
Organizes all sections as peer files in a single directory.
```python
from markitect.explode_implode.variants.flat import FlatVariant
variant = FlatVariant()
result = variant.explode(content, Path("output/"), create_manifest=True)
```
**Structure Pattern:**
```
document.mdd/
├── manifest.md
├── section_1.md
├── section_2.md
└── section_3.md
```
**Detection Logic:**

- Manifest indicates `explosion_type: flat`
- OR majority of files are in root directory
- OR no numbered directory patterns detected

**Configuration Options:**

- `max_filename_length`: Maximum characters in generated filenames (default: 50)
- `sanitize_filenames`: Clean special characters from filenames (default: True)
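
How the two options might interact when a section title becomes a filename can be sketched as follows. The helper `sanitize_filename`, its replacement rules, and the `"section"` placeholder are assumptions for illustration; only the option names and defaults come from the list above.

```python
import re


def sanitize_filename(title: str, max_filename_length: int = 50,
                      sanitize_filenames: bool = True) -> str:
    """Hypothetical helper: turn a section title into a file stem."""
    stem = title.strip()
    if sanitize_filenames:
        # Replace each run of characters other than word characters and "-"
        # with a single underscore, then trim underscores and lowercase.
        stem = re.sub(r"[^\w-]+", "_", stem).strip("_").lower()
    # Truncate, and fall back to a placeholder if nothing usable remains.
    return stem[:max_filename_length] or "section"
```

For example, `sanitize_filename("Getting Started: Part 1!")` yields `getting_started_part_1`. Truncation can make two long titles collide, so a real implementation would also need a uniqueness step.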
### HierarchicalVariant
Creates nested directory structure reflecting document hierarchy.
```python
from markitect.explode_implode.variants.hierarchical import HierarchicalVariant
variant = HierarchicalVariant(max_depth=3)
result = variant.explode(content, Path("output/"), create_manifest=True)
```
**Structure Pattern:**
```
document.mdd/
├── manifest.md
├── 01_introduction/
│   └── index.md
├── 02_getting_started/
│   ├── index.md
│   ├── 01_installation.md
│   └── 02_configuration.md
```
**Detection Logic:**

- Manifest indicates `explosion_type: hierarchical`
- OR numbered directory patterns (01_, 02_, etc.)
- OR nested directory structure with index.md files

**Configuration Options:**

- `max_depth`: Maximum nesting levels (default: unlimited)
- `numbering_format`: Directory numbering pattern (default: "{:02d}_")
- `index_filename`: Name for section index files (default: "index.md")
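
The default `numbering_format` is an ordinary `str.format` template, which also makes the "numbered directory patterns" check easy to express as a regular expression. The sketch below assumes that reading; `numbered_name` and `looks_numbered` are hypothetical helper names.

```python
import re
from typing import Iterable

# Matches names produced by the default "{:02d}_" prefix (two or more digits).
NUMBERED_PREFIX = re.compile(r"^\d{2,}_")


def numbered_name(position: int, slug: str, numbering_format: str = "{:02d}_") -> str:
    """Build a directory or file name such as "01_introduction"."""
    return numbering_format.format(position) + slug


def looks_numbered(names: Iterable[str]) -> bool:
    """Return True if any entry name carries a numeric prefix."""
    return any(NUMBERED_PREFIX.match(name) for name in names)
```

`numbered_name(1, "introduction")` gives `01_introduction`, matching the structure pattern shown earlier. A custom `numbering_format` would need a matching detection pattern.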
### SemanticVariant
Uses meaningful directory names based on content analysis.
```python
from markitect.explode_implode.variants.semantic import SemanticVariant
variant = SemanticVariant()
result = variant.explode(content, Path("output/"), create_manifest=True)
```
**Structure Pattern:**
```
document.mdd/
├── manifest.md
├── introduction/
├── tutorials/
├── reference/
└── appendices/
```
**Detection Logic:**

- Manifest indicates `explosion_type: semantic`
- OR semantic directory names detected
- OR content-based organization patterns

**Content Analysis:**

- Header text analysis for semantic meaning
- Keyword extraction for directory naming
- Content type classification (intro, tutorial, reference, etc.)
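
The source does not describe how content types are classified. As one plausible reading, the following sketch maps header text to the directory names from the structure pattern using keyword lookup. The keyword table, the `classify_header` name, and the `"misc"` default are all invented for illustration.

```python
# Hypothetical keyword table; directory names follow the structure pattern above.
CONTENT_TYPE_KEYWORDS = {
    "introduction": ("introduction", "overview", "getting started"),
    "tutorials": ("tutorial", "how to", "walkthrough"),
    "reference": ("reference", "api", "configuration"),
    "appendices": ("appendix", "glossary", "changelog"),
}


def classify_header(header: str, default: str = "misc") -> str:
    """Pick a target directory for a markdown header by keyword lookup."""
    text = header.lstrip("#").strip().lower()
    # The first directory whose keyword occurs in the header text wins.
    for directory, keywords in CONTENT_TYPE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return directory
    return default
```

Substring matching is deliberately crude: it is order-dependent and can misfire on words that merely contain a keyword, so a real classifier would likely tokenize the header first.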
## Detection System
### Auto-Detection Algorithm
The detection system uses a multi-pass approach:
```python
def detect_variant(directory: Path) -> str:
    """
    Detection priority order:
    1. Manifest-based detection (highest confidence)
    2. Pattern recognition (medium confidence)
    3. Structure analysis (lower confidence)
    4. Fallback to 'flat' (lowest confidence)
    """
    # Pass 1: Manifest detection
    manifest_path = directory / "manifest.md"
    if manifest_path.exists():
        return parse_manifest_variant(manifest_path)

    # Passes 2-3: pattern recognition and structure analysis, both delegated
    # to each registered variant's detect_variant(). It is an instance method,
    # so the class is instantiated first.
    for variant_name, variant_class in registered_variants.items():
        if variant_class().detect_variant(directory):
            return variant_name

    # Pass 4: Fallback
    return 'flat'
```
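
`parse_manifest_variant` is referenced above but not defined in this document. A minimal sketch follows, assuming the manifest records the variant on a line of the form `explosion_type: <name>` (the key name comes from the detection rules above; the exact manifest layout is an assumption).

```python
import re
from pathlib import Path
from typing import Optional

# One "explosion_type: <name>" line anywhere in the manifest (assumed layout).
EXPLOSION_TYPE = re.compile(
    r"^[ \t]*explosion_type:[ \t]*([\w-]+)[ \t]*$", re.MULTILINE)


def parse_manifest_variant(manifest_path: Path) -> Optional[str]:
    """Return the variant name recorded in the manifest, or None if absent."""
    text = manifest_path.read_text(encoding="utf-8")
    match = EXPLOSION_TYPE.search(text)
    return match.group(1) if match else None
```

Returning `None` instead of raising lets a caller continue to the pattern-recognition pass when a manifest exists but does not name a variant.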
### Detection Confidence Levels
**High Confidence (90-100%)**

- Manifest file explicitly specifies variant
- Clear structural patterns match variant expectations

**Medium Confidence (70-89%)**

- Directory naming patterns suggest specific variant
- File organization follows variant conventions

**Low Confidence (50-69%)**

- Some indicators present but ambiguous
- Structure could match multiple variants

**Fallback (<50%)**

- No clear patterns detected
- Default to flat variant for safety
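
The bands above translate directly into a threshold lookup. The sketch below assumes a numeric score between 0 and 100 is available. The documented `detect_variant()` methods return a bool or a name, so how such a score would be produced is not specified here, and `confidence_level` is a hypothetical helper.

```python
def confidence_level(score: float) -> str:
    """Map a 0-100 detection score to the confidence bands listed above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be within 0-100, got {score}")
    if score >= 90:
        return "high"
    if score >= 70:
        return "medium"
    if score >= 50:
        return "low"
    return "fallback"  # below 50: default to the flat variant
```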
### Custom Detection Logic
Register custom variants with the detection system:
```python
from pathlib import Path

from markitect.explode_implode.detection import VariantDetector
from markitect.explode_implode.variants.base import ExplodeVariant


class CustomVariant(ExplodeVariant):
    def detect_variant(self, directory: Path) -> bool:
        # Custom detection logic
        return self._check_custom_patterns(directory)

    # ... implement other abstract methods


# Register variant
detector = VariantDetector()
detector.register_variant('custom', CustomVariant)
```
## Packaging Integration
### MDZ Package Integration
An exploded directory can be passed directly to the MDZ packager as its content path:
```python
from pathlib import Path

from markitect.packaging.variants import MdzVariant
from markitect.explode_implode.variants.hierarchical import HierarchicalVariant

# Create exploded structure
explode_variant = HierarchicalVariant()
explode_variant.explode(content, Path("temp_exploded/"))

# Package exploded structure
mdz_variant = MdzVariant()
package_path = mdz_variant.create_package(
    content_path=Path("temp_exploded/"),
    output_path=Path("document.mdz")
)
```
### Package Metadata Integration
Explosion metadata is preserved in package manifests:
```json
{
  "format": "mdz",
  "version": "1.0",
  "explosion_metadata": {
    "variant_type": "hierarchical",
    "max_depth": 3,
    "section_count": 15,
    "created": "2025-10-14T10:00:00Z"
  },
  "assets": [...],
  "dependencies": [...]
}
```
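
Reading the explosion metadata back out of such a manifest needs only the standard library. This sketch assumes the manifest is valid JSON with the keys shown above; `read_explosion_metadata` is a hypothetical helper, and how the manifest text is obtained from the package is left out.

```python
import json
from datetime import datetime
from typing import Any, Dict


def read_explosion_metadata(manifest_text: str) -> Dict[str, Any]:
    """Return the "explosion_metadata" object, or {} if the key is absent."""
    manifest = json.loads(manifest_text)
    metadata = dict(manifest.get("explosion_metadata", {}))
    created = metadata.get("created")
    if isinstance(created, str):
        # fromisoformat() accepts a trailing "Z" only on Python 3.11+,
        # so normalize it to an explicit UTC offset first.
        metadata["created"] = datetime.fromisoformat(created.replace("Z", "+00:00"))
    return metadata
```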
## Error Handling
### Exception Hierarchy
```python
class ExplodeImplodeError(Exception):
    """Base exception for explode-implode operations."""

class ExplodeError(ExplodeImplodeError):
    """Errors during document explosion."""

class ImplodeError(ExplodeImplodeError):
    """Errors during document reassembly."""

class VariantDetectionError(ExplodeImplodeError):
    """Errors in variant detection process."""

class ManifestError(ExplodeImplodeError):
    """Errors in manifest processing."""
```
### Common Error Scenarios
**Explosion Failures:**
```python
try:
    variant.explode(content, output_dir)
except ExplodeError as e:
    if "insufficient disk space" in str(e):
        ...  # Handle disk space issues
    elif "invalid markdown structure" in str(e):
        ...  # Handle malformed content
    else:
        raise  # Unrecognized failure: let it propagate
```
**Implosion Failures:**
```python
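try:
    content = variant.implode(input_dir)
except ImplodeError as e:
    if "missing manifest" in str(e):
        # Fall back to auto-detection; detect_variant() returns a variant
        # name (str), not a variant instance
        variant_name = detector.detect_variant(input_dir)
    elif "corrupted files" in str(e):
        ...  # Handle file corruption
    else:
        raise  # Unrecognized failure: let it propagate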
```
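
Matching on message text is fragile if the wording ever changes. Where the library raises distinct exception types, dispatching on type is sturdier. The sketch below assumes manifest problems surface as `ManifestError`, which this document does not confirm. The exception classes are local stand-ins for the hierarchy above, and `implode_with_fallback` is a hypothetical helper.

```python
from typing import Callable


class ExplodeImplodeError(Exception):
    """Stand-in for the documented base exception."""


class ImplodeError(ExplodeImplodeError):
    """Stand-in for reassembly errors."""


class ManifestError(ExplodeImplodeError):
    """Stand-in for manifest processing errors."""


def implode_with_fallback(implode: Callable[[str], str], input_dir: str,
                          fallback: Callable[[str], str]) -> str:
    """Run implode(); on a manifest problem, retry through fallback()."""
    try:
        return implode(input_dir)
    except ManifestError:
        # Only manifest problems are treated as recoverable here; every other
        # ExplodeImplodeError propagates to the caller unchanged.
        return fallback(input_dir)
```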
### Error Recovery Strategies
**Missing Manifest Recovery:**
```python
def recover_missing_manifest(directory: Path) -> str:
    """Attempt recovery when manifest.md is missing."""
    try:
        # Try auto-detection
        return detector.detect_variant(directory)
    except VariantDetectionError:
        # Fallback to flat variant
        return 'flat'
```
**Partial File Recovery:**
```python
def recover_partial_explosion(directory: Path) -> Dict[str, Any]:
    """Recover from incomplete explosion operations."""
    valid_files = []
    corrupted_files = []
    for file_path in directory.rglob("*.md"):
        try:
            validate_markdown_file(file_path)
            valid_files.append(file_path)
        except ValidationError:
            corrupted_files.append(file_path)
    return {
        'valid_files': valid_files,
        'corrupted_files': corrupted_files,
        'recovery_possible': len(valid_files) > 0
    }
```
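
`validate_markdown_file` and `ValidationError` are used above without being defined in this document. One minimal interpretation, offered as an assumption, treats a file as valid when it decodes as UTF-8 and contains non-whitespace text:

```python
from pathlib import Path


class ValidationError(Exception):
    """Hypothetical error raised for files that fail validation."""


def validate_markdown_file(file_path: Path) -> None:
    """Raise ValidationError if the file is undecodable or effectively empty."""
    try:
        text = file_path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        raise ValidationError(f"{file_path} is not valid UTF-8") from exc
    if not text.strip():
        raise ValidationError(f"{file_path} is empty")
```

A stricter validator could also check that each file starts with a heading or matches an entry in the manifest.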
## Advanced Usage
### Custom Variant Development
Create specialized variants for specific use cases:
```python
class GitBookVariant(ExplodeVariant):
    """Variant optimized for GitBook-style documentation."""

    def __init__(self, chapters_per_directory: int = 5):
        self.chapters_per_directory = chapters_per_directory

    def explode(self, content: str, output_dir: Path,
                create_manifest: bool = True) -> Dict[str, Any]:
        # Custom explosion logic for GitBook structure
        sections = self._parse_gitbook_structure(content)
        return self._create_gitbook_directories(sections, output_dir)

    def detect_variant(self, directory: Path) -> bool:
        # Look for SUMMARY.md and GitBook conventions
        summary_path = directory / "SUMMARY.md"
        return summary_path.exists() and self._validate_gitbook_structure(directory)

    # implode() is also abstract on ExplodeVariant and must be implemented
    # before this class can be instantiated; the helper methods are omitted.
```
### Performance Optimization
**Parallel Processing:**
```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Dict

from markitect.explode_implode.variants.hierarchical import HierarchicalVariant


class OptimizedHierarchicalVariant(HierarchicalVariant):
    async def explode_async(self, content: str, output_dir: Path) -> Dict[str, Any]:
        """Asynchronous explosion for large documents."""
        sections = self._parse_sections(content)
        # get_running_loop() is the supported call inside a coroutine
        loop = asyncio.get_running_loop()
        with ThreadPoolExecutor(max_workers=4) as executor:
            tasks = [
                loop.run_in_executor(executor, self._process_section, section, output_dir)
                for section in sections
            ]
            results = await asyncio.gather(*tasks)
        return self._consolidate_results(results)
```
**Memory-Efficient Processing:**
```python
class StreamingVariant(ExplodeVariant):
    """Process large documents without loading entirely into memory."""

    def explode_streaming(self, input_file: Path, output_dir: Path) -> Dict[str, Any]:
        """Stream-process large markdown files."""
        section_buffer = []
        current_section = None
        sections_written = 0
        with open(input_file, 'r', encoding='utf-8') as f:
            for line in f:
                if self._is_section_header(line):
                    if current_section:
                        self._write_section(current_section, section_buffer, output_dir)
                        sections_written += 1
                    current_section = self._parse_section_header(line)
                    section_buffer = []
                section_buffer.append(line)
        # Write final section
        if current_section:
            self._write_section(current_section, section_buffer, output_dir)
            sections_written += 1
        # Note: lines before the first header are buffered but never written.
        # The statistics key below is illustrative, not a documented contract.
        return {'section_count': sections_written}
```
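
The streaming loop relies on `_is_section_header` and `_parse_section_header`, which are not shown in this document. A minimal sketch of both as module-level functions, assuming sections are delimited by ATX headings (`#` through `######`), could look like this. The function names and the `(level, title)` return shape are assumptions.

```python
import re
from typing import Optional, Tuple

# ATX heading: 1-6 "#" characters, whitespace, title, optional closing "#"s.
ATX_HEADER = re.compile(r"^(#{1,6})[ \t]+(.*?)(?:[ \t]+#+)?[ \t]*$")


def parse_section_header(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, title) for an ATX heading line, else None."""
    match = ATX_HEADER.match(line.rstrip("\n"))
    if match is None:
        return None
    return len(match.group(1)), match.group(2)


def is_section_header(line: str) -> bool:
    return parse_section_header(line) is not None
```

This sketch does not track fenced code blocks, so a `# comment` line inside a fence would be misread as a heading. A production splitter needs to carry fence state across lines.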
### Integration with Build Systems
**Makefile Integration:**
```makefile
.PHONY: explode implode package

# Explode source document for editing
explode:
	markitect md-explode source/document.md --variant hierarchical

# Reassemble for production
implode:
	markitect md-implode source/document.mdd --output dist/document.md

# Package for distribution
package: implode
	markitect md-package create dist/document.md --format mdz --output dist/document.mdz
```
**GitHub Actions Integration:**
```yaml
name: Document Processing
on: [push, pull_request]
jobs:
  process-docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Install MarkiTect
        run: pip install markitect
      - name: Validate exploded structure
        run: markitect md-implode docs/source/ --dry-run --verbose
      - name: Generate final document
        run: markitect md-implode docs/source/ --output dist/complete-guide.md
      - name: Create distribution package
        run: markitect md-package create dist/complete-guide.md --format mdz
```
---
## API Reference Summary
| Class/Function | Purpose | Key Methods |
|---------------|---------|-------------|
| `ExplodeVariant` | Base variant class | `explode()`, `implode()`, `detect_variant()` |
| `FlatVariant` | Flat file organization | Inherits base methods |
| `HierarchicalVariant` | Nested directory structure | Inherits base methods + `max_depth` |
| `SemanticVariant` | Content-based organization | Inherits base methods + semantic analysis |
| `VariantDetector` | Auto-detection system | `detect_variant()`, `register_variant()` |
| `ExplodeError` | Explosion operation errors | Standard exception interface |
| `ImplodeError` | Reassembly operation errors | Standard exception interface |
**Version:** 1.0.0
**Last Updated:** 2025-10-14
**Compatibility:** MarkiTect 1.0+