markitect-main/docs/api/explode-variants.md

Explode-Implode API Documentation

Technical reference for MarkiTect's explode-implode variant system

Table of Contents

  1. Core Classes
  2. Variant Types
  3. Detection System
  4. Packaging Integration
  5. Error Handling
  6. Advanced Usage

Core Classes

ExplodeVariant

Base abstract class for all variant implementations.

# Defined in markitect.explode_implode.variants.base
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict

class ExplodeVariant(ABC):
    """Base class for document explosion variants."""

    @abstractmethod
    def explode(self, content: str, output_dir: Path,
                create_manifest: bool = True) -> Dict[str, Any]:
        """Explode document content into organized structure."""

    @abstractmethod
    def implode(self, input_dir: Path) -> str:
        """Reassemble exploded structure into single document."""

    @abstractmethod
    def detect_variant(self, directory: Path) -> bool:
        """Detect if directory follows this variant's structure."""

Methods

explode(content, output_dir, create_manifest=True)

  • Parameters:
    • content (str): Source markdown content
    • output_dir (Path): Target directory for exploded files
    • create_manifest (bool): Generate manifest.md for reversibility
  • Returns: Dict with explosion statistics and metadata
  • Raises: ExplodeError on processing failures

implode(input_dir)

  • Parameters:
    • input_dir (Path): Directory containing exploded structure
  • Returns: Reassembled markdown content as string
  • Raises: ImplodeError on assembly failures

detect_variant(directory)

  • Parameters:
    • directory (Path): Directory to analyze
  • Returns: True if the directory matches this variant's structure, otherwise False
  • Used by: Auto-detection system during implode operations
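
To make the three-method contract concrete, here is a minimal sketch of a variant that round-trips a document. It does not import MarkiTect: the ExplodeVariant class below is a local stand-in that mirrors the documented interface, and ParagraphVariant, its block_NNN.md naming, and its manifest line are hypothetical choices made only for illustration.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Any, Dict


class ExplodeVariant(ABC):
    """Local stand-in mirroring the abstract interface documented above."""

    @abstractmethod
    def explode(self, content: str, output_dir: Path,
                create_manifest: bool = True) -> Dict[str, Any]: ...

    @abstractmethod
    def implode(self, input_dir: Path) -> str: ...

    @abstractmethod
    def detect_variant(self, directory: Path) -> bool: ...


class ParagraphVariant(ExplodeVariant):
    """Hypothetical variant: one numbered file per blank-line-separated block."""

    def explode(self, content, output_dir, create_manifest=True):
        output_dir.mkdir(parents=True, exist_ok=True)
        blocks = content.split("\n\n")
        for i, block in enumerate(blocks, start=1):
            (output_dir / f"block_{i:03d}.md").write_text(block, encoding="utf-8")
        if create_manifest:
            # Assumed manifest content; the real manifest format is not shown here
            (output_dir / "manifest.md").write_text(
                "explosion_type: paragraph\n", encoding="utf-8")
        return {"section_count": len(blocks)}

    def implode(self, input_dir):
        # Zero-padded names sort in document order
        files = sorted(input_dir.glob("block_*.md"))
        return "\n\n".join(f.read_text(encoding="utf-8") for f in files)

    def detect_variant(self, directory):
        return any(directory.glob("block_*.md"))


def round_trip(content: str) -> str:
    """Explode into a temporary directory, then implode it again."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "document.mdd"
        variant = ParagraphVariant()
        variant.explode(content, out)
        return variant.implode(out)
```

A useful property to check for any variant is that implode(explode(x)) returns x unchanged, which is what round_trip exercises.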

VariantDetector

Coordinates variant detection across all registered variants.

from pathlib import Path
from markitect.explode_implode.detection import VariantDetector

detector = VariantDetector()
variant_type = detector.detect_variant(Path("exploded_dir/"))

Methods

detect_variant(directory)

  • Parameters:
    • directory (Path): Directory to analyze
  • Returns: String variant name ('flat', 'hierarchical', 'semantic')
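  • Raises: VariantDetectionError if detection cannot complete (note that the algorithm shown under Detection System falls back to 'flat' rather than raising when no pattern matches)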

register_variant(name, variant_class)

  • Parameters:
    • name (str): Variant identifier
    • variant_class (Type[ExplodeVariant]): Variant implementation class (the class itself, not an instance)
  • Purpose: Register custom variants with detection system
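
The registration mechanism can be pictured as an ordered mapping from names to classes. The sketch below is an assumption about the general shape, not MarkiTect's implementation: MiniDetector and SummaryFileVariant are hypothetical names, and raising LookupError stands in for the documented VariantDetectionError.

```python
from pathlib import Path
from typing import Dict, Type


class MiniDetector:
    """Hypothetical stand-in for VariantDetector: an ordered name -> class registry."""

    def __init__(self) -> None:
        self._variants: Dict[str, Type] = {}

    def register_variant(self, name: str, variant_class: Type) -> None:
        self._variants[name] = variant_class

    def detect_variant(self, directory: Path) -> str:
        # Ask each registered class in registration order; the first match wins.
        for name, variant_class in self._variants.items():
            if variant_class().detect_variant(directory):
                return name
        raise LookupError(f"no registered variant matches {directory}")


class SummaryFileVariant:
    """Hypothetical variant that matches directories containing SUMMARY.md."""

    def detect_variant(self, directory: Path) -> bool:
        return (directory / "SUMMARY.md").is_file()
```

Because the first match wins in this sketch, registration order doubles as detection priority.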

Variant Types

FlatVariant

Organizes all sections as peer files in a single directory.

from pathlib import Path
from markitect.explode_implode.variants.flat import FlatVariant

variant = FlatVariant()
result = variant.explode(content, Path("output/"), create_manifest=True)

Structure Pattern:

document.mdd/
├── manifest.md
├── section_1.md
├── section_2.md
└── section_3.md

Detection Logic:

  • Manifest indicates explosion_type: flat
  • OR majority of files are in root directory
  • OR no numbered directory patterns detected
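
The "majority of files are in root directory" rule could be approximated as follows. This is a sketch under the assumption that only .md files are counted and that "majority" means strictly more than half; mostly_in_root is a hypothetical helper name.

```python
from pathlib import Path


def mostly_in_root(directory: Path) -> bool:
    """Return True when more than half of all .md files sit directly in `directory`."""
    all_md = list(directory.rglob("*.md"))
    if not all_md:
        return False
    in_root = sum(1 for p in all_md if p.parent == directory)
    return in_root > len(all_md) / 2
```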

Configuration Options:

  • max_filename_length: Maximum characters in generated filenames (default: 50)
  • sanitize_filenames: Clean special characters from filenames (default: True)
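
As an illustration of how these two options might interact, the sketch below derives a filename from a section title. The function name, the lowercase-and-underscore style, and the choice to apply the length limit before the .md extension are all assumptions, not documented behavior.

```python
import re


def section_filename(title: str, max_filename_length: int = 50,
                     sanitize_filenames: bool = True) -> str:
    """Build a filename from a section title (illustrative rules only)."""
    stem = title.strip()
    if sanitize_filenames:
        # Drop everything except word characters, whitespace, and hyphens
        stem = re.sub(r"[^\w\s-]", "", stem).strip().lower()
        # Collapse runs of whitespace or hyphens into a single underscore
        stem = re.sub(r"[\s-]+", "_", stem)
    # Truncate, then avoid a trailing underscore or an empty name
    stem = stem[:max_filename_length].rstrip("_") or "section"
    return f"{stem}.md"
```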

HierarchicalVariant

Creates nested directory structure reflecting document hierarchy.

from pathlib import Path
from markitect.explode_implode.variants.hierarchical import HierarchicalVariant

variant = HierarchicalVariant(max_depth=3)
result = variant.explode(content, Path("output/"), create_manifest=True)

Structure Pattern:

document.mdd/
├── manifest.md
├── 01_introduction/
│   └── index.md
└── 02_getting_started/
    ├── index.md
    ├── 01_installation.md
    └── 02_configuration.md

Detection Logic:

  • Manifest indicates explosion_type: hierarchical
  • OR numbered directory patterns (01_, 02_, etc.)
  • OR nested directory structure with index.md files
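
The two structural checks above (numbered directories, nested index.md files) might look like this in code. It is a sketch only: looks_hierarchical is a hypothetical helper, the two-digit prefix pattern is inferred from the 01_/02_ examples, and the manifest check is left out.

```python
import re
from pathlib import Path

# Matches directory names such as "01_introduction" or "02_getting_started"
NUMBERED_DIR = re.compile(r"^\d{2}_")


def looks_hierarchical(directory: Path) -> bool:
    """Heuristic: numbered subdirectories, or subdirectories holding index.md."""
    subdirs = [p for p in directory.iterdir() if p.is_dir()]
    has_numbered = any(NUMBERED_DIR.match(p.name) for p in subdirs)
    has_index = any((p / "index.md").is_file() for p in subdirs)
    return has_numbered or has_index
```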

Configuration Options:

  • max_depth: Maximum nesting levels (default: unlimited)
  • numbering_format: Directory numbering pattern (default: "{:02d}_")
  • index_filename: Name for section index files (default: "index.md")
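
The default numbering_format is an ordinary Python format string, so its effect is easy to preview. The helper below is hypothetical and only shows how such a pattern combines with a section slug.

```python
def numbered_dir_name(position: int, title_slug: str,
                      numbering_format: str = "{:02d}_") -> str:
    """Prefix a section slug with its formatted position."""
    return numbering_format.format(position) + title_slug


print(numbered_dir_name(1, "introduction"))   # 01_introduction
print(numbered_dir_name(12, "faq", "{:03d}-"))  # 012-faq
```

Zero-padded prefixes keep lexicographic directory order identical to document order, which matters when sections are reassembled by sorted name.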

SemanticVariant

Uses meaningful directory names based on content analysis.

from pathlib import Path
from markitect.explode_implode.variants.semantic import SemanticVariant

variant = SemanticVariant()
result = variant.explode(content, Path("output/"), create_manifest=True)

Structure Pattern:

document.mdd/
├── manifest.md
├── introduction/
├── tutorials/
├── reference/
└── appendices/

Detection Logic:

  • Manifest indicates explosion_type: semantic
  • OR semantic directory names detected
  • OR content-based organization patterns

Content Analysis:

  • Header text analysis for semantic meaning
  • Keyword extraction for directory naming
  • Content type classification (intro, tutorial, reference, etc.)
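
The simplest form of such content analysis is a keyword lookup on header text. The sketch below is an assumption about what that could look like; the keyword table, the "misc" default, and classify_header itself are hypothetical and not taken from SemanticVariant.

```python
# Hypothetical keyword table; directory names follow the structure pattern above
CATEGORY_KEYWORDS = {
    "introduction": ("introduction", "overview", "about"),
    "tutorials": ("tutorial", "getting started", "how to", "guide"),
    "reference": ("reference", "api", "configuration"),
    "appendices": ("appendix", "glossary", "changelog"),
}


def classify_header(header: str, default: str = "misc") -> str:
    """Map a markdown header line to a directory name via keyword matching."""
    text = header.lstrip("#").strip().lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return default
```

Plain substring matching is crude (for example, "api" also matches "capital"), so a real classifier would likely tokenize the header first.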

Detection System

Auto-Detection Algorithm

The detection system uses a multi-pass approach:

def detect_variant(directory: Path) -> str:
    """
    Detection priority order:
    1. Manifest-based detection (highest confidence)
    2. Pattern recognition (medium confidence)
    3. Structure analysis (lower confidence)
    4. Fallback to 'flat' (lowest confidence)
    """

    # Pass 1: Manifest detection
    manifest_path = directory / "manifest.md"
    if manifest_path.exists():
        return parse_manifest_variant(manifest_path)

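    # Passes 2 and 3: pattern recognition and structure analysis,
    # both delegated to each registered variant's detect_variant()
    for variant_name, variant_class in registered_variants.items():
        # detect_variant() is an instance method, so instantiate first
        if variant_class().detect_variant(directory):
            return variant_name

    # Pass 4: Fallback
    return 'flat'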
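
The manifest pass depends on reading the variant name out of manifest.md. The manifest format is not specified in this document beyond the explosion_type key mentioned under each variant's detection logic, so the sketch below assumes a plain "explosion_type: <name>" line; parse_manifest_variant_text is a hypothetical helper that works on the file's text.

```python
import re
from typing import Optional

# Assumed manifest line, e.g. "explosion_type: hierarchical"
EXPLOSION_TYPE = re.compile(r"^[ \t]*explosion_type:[ \t]*(\w+)[ \t]*$", re.MULTILINE)


def parse_manifest_variant_text(manifest_text: str) -> Optional[str]:
    """Return the declared variant name, or None when the key is absent."""
    match = EXPLOSION_TYPE.search(manifest_text)
    return match.group(1) if match else None
```

Returning None instead of raising lets a caller continue with the pattern-based passes when a manifest exists but does not declare a variant.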

Detection Confidence Levels

High Confidence (90-100%)

  • Manifest file explicitly specifies variant
  • Clear structural patterns match variant expectations

Medium Confidence (70-89%)

  • Directory naming patterns suggest specific variant
  • File organization follows variant conventions

Low Confidence (50-69%)

  • Some indicators present but ambiguous
  • Structure could match multiple variants

Fallback (<50%)

  • No clear patterns detected
  • Default to flat variant for safety
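
The documented detect_variant methods return a bool or a variant name, not a score, so the percentages above are best read as descriptive bands. If a numeric score were available, mapping it to a band could look like this hypothetical helper:

```python
def confidence_level(score: float) -> str:
    """Map a 0-100 detection score to the bands listed above."""
    if not 0 <= score <= 100:
        raise ValueError(f"score must be within 0-100, got {score}")
    if score >= 90:
        return "high"
    if score >= 70:
        return "medium"
    if score >= 50:
        return "low"
    return "fallback"  # below 50: default to the flat variant
```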

Custom Detection Logic

Register custom variants with the detection system:

from pathlib import Path
from markitect.explode_implode.detection import VariantDetector
from markitect.explode_implode.variants.base import ExplodeVariant

class CustomVariant(ExplodeVariant):
    def detect_variant(self, directory: Path) -> bool:
        # Custom detection logic
        return self._check_custom_patterns(directory)

    # ... implement other abstract methods

# Register variant
detector = VariantDetector()
detector.register_variant('custom', CustomVariant)

Packaging Integration

MDZ Package Integration

An exploded directory can be packaged directly as an MDZ archive: explode the document, then pass the resulting directory to MdzVariant.create_package():

from pathlib import Path
from markitect.packaging.variants import MdzVariant
from markitect.explode_implode.variants.hierarchical import HierarchicalVariant

# Create exploded structure
explode_variant = HierarchicalVariant()
explode_variant.explode(content, Path("temp_exploded/"))

# Package exploded structure
mdz_variant = MdzVariant()
package_path = mdz_variant.create_package(
    content_path=Path("temp_exploded/"),
    output_path=Path("document.mdz")
)

Package Metadata Integration

Explosion metadata is preserved in package manifests:

{
  "format": "mdz",
  "version": "1.0",
  "explosion_metadata": {
    "variant_type": "hierarchical",
    "max_depth": 3,
    "section_count": 15,
    "created": "2025-10-14T10:00:00Z"
  },
  "assets": [...],
  "dependencies": [...]
}
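
A consumer of the package can read the variant back from this metadata with the standard json module. The sketch below assumes the manifest is available as a JSON string with the key layout shown above; explosion_variant is a hypothetical helper, and defaulting to 'flat' mirrors the detection fallback.

```python
import json


def explosion_variant(manifest_json: str, default: str = "flat") -> str:
    """Read explosion_metadata.variant_type from a package manifest."""
    manifest = json.loads(manifest_json)
    return manifest.get("explosion_metadata", {}).get("variant_type", default)


sample = '{"format": "mdz", "explosion_metadata": {"variant_type": "hierarchical"}}'
print(explosion_variant(sample))  # hierarchical
```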

Error Handling

Exception Hierarchy

class ExplodeImplodeError(Exception):
    """Base exception for explode-implode operations."""

class ExplodeError(ExplodeImplodeError):
    """Errors during document explosion."""

class ImplodeError(ExplodeImplodeError):
    """Errors during document reassembly."""

class VariantDetectionError(ExplodeImplodeError):
    """Errors in variant detection process."""

class ManifestError(ExplodeImplodeError):
    """Errors in manifest processing."""

Common Error Scenarios

Explosion Failures:

try:
    variant.explode(content, output_dir)
except ExplodeError as e:
    if "insufficient disk space" in str(e):
        ...  # Handle disk space issues
    elif "invalid markdown structure" in str(e):
        ...  # Handle malformed content
    else:
        raise  # Unrecognized failure: let it propagate

Implosion Failures:

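try:
    content = variant.implode(input_dir)
except ImplodeError as e:
    if "missing manifest" in str(e):
        # Fall back to auto-detection; this returns the variant name (a str),
        # not a variant instance
        variant_name = detector.detect_variant(input_dir)
    elif "corrupted files" in str(e):
        ...  # Handle file corruption
    else:
        raise  # Unrecognized failure: let it propagate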
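
Because every exception in the hierarchy derives from ExplodeImplodeError, one handler on the base class covers explosion, implosion, detection, and manifest failures alike. The sketch below redefines the classes locally so it runs without MarkiTect; run_guarded and failing_implode are hypothetical helpers.

```python
class ExplodeImplodeError(Exception):
    """Local stand-in for the documented base exception."""

class ExplodeError(ExplodeImplodeError): ...
class ImplodeError(ExplodeImplodeError): ...
class VariantDetectionError(ExplodeImplodeError): ...
class ManifestError(ExplodeImplodeError): ...


def run_guarded(operation):
    """Run a callable; return (True, result) or (False, error class name)."""
    try:
        return True, operation()
    except ExplodeImplodeError as exc:
        # Any subclass lands here; unrelated exceptions still propagate
        return False, type(exc).__name__


def failing_implode():
    raise ImplodeError("missing manifest")
```

Branching on the exception type, as run_guarded reports it, is generally more robust than matching substrings of the error message.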

Error Recovery Strategies

Missing Manifest Recovery:

def recover_missing_manifest(directory: Path) -> str:
    """Attempt recovery when manifest.md is missing."""
    try:
        # Try auto-detection
        return detector.detect_variant(directory)
    except VariantDetectionError:
        # Fallback to flat variant
        return 'flat'

Partial File Recovery:

# validate_markdown_file and ValidationError are assumed to come from the
# caller's validation layer; they are not defined in this document.
def recover_partial_explosion(directory: Path) -> Dict[str, Any]:
    """Recover from incomplete explosion operations."""
    valid_files = []
    corrupted_files = []

    # Check every markdown file, including those in nested directories
    for file_path in directory.rglob("*.md"):
        try:
            validate_markdown_file(file_path)
            valid_files.append(file_path)
        except ValidationError:
            corrupted_files.append(file_path)

    return {
        'valid_files': valid_files,
        'corrupted_files': corrupted_files,
        'recovery_possible': bool(valid_files)
    }

Advanced Usage

Custom Variant Development

Create specialized variants for specific use cases:

class GitBookVariant(ExplodeVariant):
    """Variant optimized for GitBook-style documentation."""

    def __init__(self, chapters_per_directory: int = 5):
        self.chapters_per_directory = chapters_per_directory

    def explode(self, content: str, output_dir: Path,
               create_manifest: bool = True) -> Dict[str, Any]:
        # Custom explosion logic for GitBook structure
        sections = self._parse_gitbook_structure(content)
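        return self._create_gitbook_directories(sections, output_dir)

    def implode(self, input_dir: Path) -> str:
        # Required by the abstract base class; reassembly logic is omitted here
        raise NotImplementedError("GitBook reassembly is not shown in this example")

    def detect_variant(self, directory: Path) -> bool: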
        # Look for SUMMARY.md and GitBook conventions
        summary_path = directory / "SUMMARY.md"
        return summary_path.exists() and self._validate_gitbook_structure(directory)

Performance Optimization

Parallel Processing:

import asyncio
from concurrent.futures import ThreadPoolExecutor

class OptimizedHierarchicalVariant(HierarchicalVariant):
    async def explode_async(self, content: str, output_dir: Path) -> Dict[str, Any]:
        """Asynchronous explosion for large documents."""
        sections = self._parse_sections(content)
        # get_running_loop() is the supported way to reach the loop from a coroutine
        loop = asyncio.get_running_loop()

        with ThreadPoolExecutor(max_workers=4) as executor:
            # Run each section in a worker thread so file I/O can overlap
            tasks = [
                loop.run_in_executor(executor, self._process_section, section, output_dir)
                for section in sections
            ]
            results = await asyncio.gather(*tasks)

        return self._consolidate_results(results)

Memory-Efficient Processing:

class StreamingVariant(ExplodeVariant):
    """Process large documents without loading entirely into memory."""

    def explode_streaming(self, input_file: Path, output_dir: Path) -> Dict[str, Any]:
        """Stream-process large markdown files."""
        section_buffer = []
        current_section = None
        sections_written = 0

        with open(input_file, 'r', encoding='utf-8') as f:
            for line in f:
                if self._is_section_header(line):
                    # Flush the previous section before starting a new one
                    if current_section is not None:
                        self._write_section(current_section, section_buffer, output_dir)
                        sections_written += 1
                    # Note: lines before the first header are discarded here
                    current_section = self._parse_section_header(line)
                    section_buffer = []

                section_buffer.append(line)

        # Write final section
        if current_section is not None:
            self._write_section(current_section, section_buffer, output_dir)
            sections_written += 1

        # Illustrative statistics key; the real return shape is not specified
        return {'sections_written': sections_written}

Integration with Build Systems

Makefile Integration:

# These targets are commands, not files, so declare them phony
.PHONY: explode implode package

# Explode source document for editing
explode:
	markitect md-explode source/document.md --variant hierarchical

# Reassemble for production
implode:
	markitect md-implode source/document.mdd --output dist/document.md

# Package for distribution
package: implode
	markitect md-package create dist/document.md --format mdz --output dist/document.mdz

GitHub Actions Integration:

name: Document Processing
on: [push, pull_request]

jobs:
  process-docs:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v3
    - name: Install MarkiTect
      run: pip install markitect
    - name: Validate exploded structure
      run: markitect md-implode docs/source/ --dry-run --verbose
    - name: Generate final document
      run: markitect md-implode docs/source/ --output dist/complete-guide.md
    - name: Create distribution package
      run: markitect md-package create dist/complete-guide.md --format mdz

API Reference Summary

| Class/Function      | Purpose                     | Key Methods                               |
|---------------------|-----------------------------|-------------------------------------------|
| ExplodeVariant      | Base variant class          | explode(), implode(), detect_variant()    |
| FlatVariant         | Flat file organization      | Inherits base methods                     |
| HierarchicalVariant | Nested directory structure  | Inherits base methods + max_depth         |
| SemanticVariant     | Content-based organization  | Inherits base methods + semantic analysis |
| VariantDetector     | Auto-detection system       | detect_variant(), register_variant()      |
| ExplodeError        | Explosion operation errors  | Standard exception interface              |
| ImplodeError        | Reassembly operation errors | Standard exception interface              |

Version: 1.0.0
Last Updated: 2025-10-14
Compatibility: MarkiTect 1.0+